Building Reliable AI Infrastructure: What We Learned Scaling AI Visibility

Releasing AI Visibility exposed some reliability gaps. Early report failures ultimately led to a more stable product.
Product

Feb 4, 2026

Leo Jiang

Head of Engineering, AI Products, Amplitude

Amplitude AI Visibility measures what LLMs say about your brand. Every week, it generates fresh reports so you can track how your marketing efforts influence AI answers over time. Thousands of customers have already generated and used these reports to understand their brand's presence in AI-driven search. The most innovative brands are hungry for data about LLM performance, and AI Visibility is exactly what they need to effectively reach modern customers.

Running AI Visibility at scale has taught us a lot about building reliable AI infrastructure, and our users have given us plenty of helpful feedback along the way. Here is a summary of what we have learned in just a few months and how those lessons have helped us improve the product.

Generating reports is incredibly complex

Creating AI Visibility reports involves coordination between multiple internal and external services. For each brand that uses AI Visibility, we send thousands of prompts to multiple AI models, extract brand mentions and citations, determine competitors, collect sentiment, and analyze cited URLs.

This process has many moving parts, and it works well most of the time. However, it depends on multiple third-party services, each with its own reliability characteristics. Over the past few months, we saw more failures than we found acceptable: reports didn't update on schedule, data went stale, and some reports came back incomplete.

We heard your feedback, took it seriously, and investigated our infrastructure to find improvements.

What we learned about orchestrating AI workflows

The core lesson of our analysis is that in a large-scale system with many external dependencies, failure modes compound in ways that don't show up during testing.

For example, when one LLM provider experienced an outage, our system retried the failed requests, which is normally the right behavior. But during a sustained outage, those retries increased load. Other reports started timing out, and they retried too. The retries consumed our usage limits with providers. Once those limits were hit, new failures appeared, and they persisted even after the original outage was resolved.
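
To make the amplification concrete, here is a rough sketch of the arithmetic, not the production code. The function names and all numbers are hypothetical. It shows how many calls a batch sends when every call fails and each request is retried a fixed number of times, plus a capped exponential backoff schedule, which is one common way to spread retries out.

```python
from typing import List


def attempts_during_outage(num_requests: int, max_retries: int) -> int:
    """Total calls sent when every call fails and each request is retried max_retries times."""
    return num_requests * (1 + max_retries)


def backoff_delays(max_retries: int, base_seconds: float = 1.0,
                   cap_seconds: float = 60.0) -> List[float]:
    """Exponential backoff schedule without jitter, capped at cap_seconds."""
    return [min(cap_seconds, base_seconds * 2 ** attempt)
            for attempt in range(max_retries)]


# Hypothetical numbers: 1,000 prompts with 5 retries each during a full outage.
print(attempts_during_outage(1000, 5))  # 6000 calls instead of 1000
print(backoff_delays(4))                # [1.0, 2.0, 4.0, 8.0]
```

Backoff only delays the extra calls. It does not reduce their number, which is why a sustained outage can still exhaust usage limits.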

One problem became three. Three became ten. By the time we noticed the symptoms, the root cause was buried.

Finding the root cause

After months of applying band-aid fixes to the symptoms, we traced the root cause to how we had implemented rate limiting.

AI Visibility reports run on Temporal, a workflow engine for scheduling complex tasks. We added rate limiters to workflows, expecting them to be shared across all reports. But they weren't. Because Temporal sometimes executes code in isolated environments, each report created its own instance of the limiter. When hundreds of reports ran simultaneously, the effective limit was hundreds of times higher than intended.
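
The effect can be reproduced with a toy model. This is a minimal sketch in plain Python that does not use the Temporal SDK. `WindowLimiter` and `calls_allowed` are hypothetical names, and the limiter ignores window resets for brevity. It compares one limiter per report with a single limiter shared by all reports.

```python
class WindowLimiter:
    """Allows at most `limit` calls in one window (window resets omitted for brevity)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def try_acquire(self) -> bool:
        if self.used < self.limit:
            self.used += 1
            return True
        return False


def calls_allowed(num_reports: int, calls_per_report: int,
                  limit: int, shared: bool) -> int:
    """Count calls that pass the limiter across all reports in one window."""
    shared_limiter = WindowLimiter(limit)
    allowed = 0
    for _ in range(num_reports):
        # Bug pattern when shared=False: every report builds its own limiter.
        limiter = shared_limiter if shared else WindowLimiter(limit)
        for _ in range(calls_per_report):
            if limiter.try_acquire():
                allowed += 1
    return allowed


# Hypothetical numbers: 200 concurrent reports, intended limit of 10 calls per window.
print(calls_allowed(200, 50, 10, shared=False))  # 2000: the limit is multiplied by the report count
print(calls_allowed(200, 50, 10, shared=True))   # 10: the intended limit
```

In a real deployment where code runs in isolated environments, an in-memory object cannot be shared this way. The limiter state would have to live somewhere all workers can reach. The article does not say which approach was used, so treat the shared case here as an illustration of the goal only.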

The result was a chain of problems: we overloaded providers, triggered failures, and set off cascading retries that made the system unstable under heavy load.

What we built to fix it

While tracking down the root cause, we made many improvements to make the system more durable. When errors inevitably occur, the system can now recover or pause without causing cascading failures, regardless of what caused the original error:

  • Smarter retry behavior. We added guardrails so reports do not endlessly retry when a dependency is clearly unhealthy. The workflow now detects when failure rates are too high and aborts early, rather than burning compute and usage limits on work that is unlikely to succeed.
  • Partial success handling. Previously, small failures could cause the entire report to fail. We changed the workflow to tolerate a limited amount of failure in each step and still complete a report when the majority of the data is available. This reduced the number of missing weekly updates and made the system more resilient to intermittent issues.
  • Better load distribution. We improved how work is distributed so the system does not swing between overloaded and idle. This reduced peak-time congestion and made report completion more predictable.
  • Faster execution. We redesigned parts of report generation to run more work in parallel and batch external calls more efficiently. Faster reports mean fewer timeouts, fewer retries, and fewer opportunities for partial failures.
  • Clearer status reporting. When a report fails, users should not have to guess whether the data is fresh. We improved how failures are surfaced so customers do not accidentally make decisions based on incomplete data.
  • Real-time monitoring. We added better internal monitoring and alerting so we can detect drops in completion rates quickly, identify the most common failure modes, and respond before customers notice.
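
The first two items in the list above can be sketched together. This is an illustrative example under stated assumptions, not the actual workflow code. `run_step`, `StepResult`, `DependencyUnhealthy`, and all threshold values are hypothetical. The step aborts early once the observed failure rate is clearly too high, and otherwise returns whatever succeeded along with a flag saying whether failures stayed within a tolerated fraction.

```python
from dataclasses import dataclass
from typing import Callable, List


class DependencyUnhealthy(RuntimeError):
    """Raised when the observed failure rate suggests the dependency is down."""


@dataclass
class StepResult:
    results: List[str]
    failures: int
    usable: bool  # True when failures stayed within the tolerated fraction


def run_step(prompts: List[str], call: Callable[[str], str],
             abort_rate: float = 0.5, min_samples: int = 10,
             tolerated_rate: float = 0.1) -> StepResult:
    results: List[str] = []
    failures = 0
    for attempted, prompt in enumerate(prompts, start=1):
        try:
            results.append(call(prompt))
        except Exception:
            failures += 1
        # Guardrail: stop early instead of burning usage limits on a dead dependency.
        if attempted >= min_samples and failures / attempted > abort_rate:
            raise DependencyUnhealthy(f"{failures}/{attempted} calls failed")
    total = len(results) + failures
    # Partial success: a small number of failures does not fail the whole step.
    usable = total > 0 and failures / total <= tolerated_rate
    return StepResult(results, failures, usable)


def flaky(prompt: str) -> str:
    """Stand-in for an LLM call that fails on every 20th prompt."""
    if int(prompt[1:]) % 20 == 0:
        raise TimeoutError("simulated provider timeout")
    return f"answer to {prompt}"


outcome = run_step([f"p{i}" for i in range(100)], flaky)
print(len(outcome.results), outcome.failures, outcome.usable)  # 95 5 True
```

The `usable` flag is also a natural hook for the status reporting item: a caller can surface it so that a partially complete report is never mistaken for a fresh, complete one.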

How Amplitude will continue to improve AI Visibility

Going forward, we are treating AI Visibility report generation as critical infrastructure. That means putting strong guardrails in place against cascades, improving failure visibility, and detecting problems early. We are also prioritizing stability whenever we make significant changes.

If you have not tried AI Visibility recently, now is a good time to try it out for free. You can use it to see how your brand appears across leading LLMs and track how your position changes from week to week. For those of you who have already tried AI Visibility, check out our latest updates and let us know what you think.

About the author
Leo Jiang is the Head of Engineering, AI Products at Amplitude, focused on building new AI and marketing products. He has helped build Ask Amplitude, Agents, and AI Visibility.
