FogTrail Team

The State of Startup Visibility in AI Search (Q1 2026)

In March 2026, FogTrail ran a 3-wave research study across 25 B2B SaaS brands, querying 5 AI engines with 20 queries over 3 weeks. The brands span enterprise, midmarket, and startup tiers. The engines are ChatGPT, Perplexity, Gemini, Grok, and Claude. The goal: produce the first quarterly benchmark on how startups actually appear (or do not appear) in AI-generated search results.

The findings are not encouraging for startups. Two brands were completely invisible across all engines and all waves. The average startup receives 2.4x fewer mentions than the average enterprise brand. And "alternative to X" queries, the one query type where startups should have an advantage, actually favor incumbents 87% of the time.

This report presents every major finding from the study, organized into four sections: the visibility gap, engine behavior, what is working for the startups that do break through, and what the data means for your AEO strategy.

Methodology

Brands tracked: 25 B2B SaaS companies across 5 categories (project management, analytics, email marketing, CRM, developer tools). Each category includes a mix of enterprise leaders, midmarket players, and startups.

Engines queried: ChatGPT, Perplexity, Gemini, Grok, Claude.

Queries per wave: 20 queries covering informational, comparative, and "alternative to" intent types.

Waves: 3 waves (March 6, March 10, March 15, 2026).

Metrics tracked: Brand mentions (name appears in response), brand citations (URL included in response), #1 recommendation position, cross-engine consensus.

Total data points: 100 engine-query pairs per wave (5 engines × 20 queries), 300 pairs across the study, each scored for mentions, citations, and #1 position. 900 scored data points total.
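The two per-pair metrics above can be sketched in a few lines. This is a simplified illustration of the scoring logic, not FogTrail's production pipeline; the brand names and response text are hypothetical:

```python
import re

def score_pair(response_text: str, brand: str, brand_domain: str):
    """Score one engine-query pair on the study's two per-pair metrics:
    a mention (brand name appears in the response text) and a citation
    (a URL on the brand's domain is included in the response)."""
    # Strip URLs first so a domain like asana.com does not count as a mention.
    text_without_urls = re.sub(r"https?://\S+", " ", response_text)
    mentioned = bool(re.search(re.escape(brand), text_without_urls, re.IGNORECASE))
    urls = re.findall(r"https?://[^\s)\]]+", response_text)
    cited = any(brand_domain in u for u in urls)
    return mentioned, cited

# Hypothetical engine response for illustration:
response = "For project management, Asana (https://asana.com/guide) is a strong pick."
print(score_pair(response, "Asana", "asana.com"))    # (True, True)
print(score_pair(response, "Height", "height.app"))  # (False, False)
```

Real scoring would need to handle brand-name variants and URL redirects, but the mention/citation distinction is the core of the methodology.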

Section 1: The Visibility Gap

Startups are structurally disadvantaged in AI search. The average startup gets 2.4x fewer mentions than the average enterprise brand, two tracked startups are completely invisible across all engines, and "alternative to X" queries favor incumbents 87% of the time.

8% of Brands Are Completely Invisible

Two of the 25 tracked brands received zero mentions across all 5 engines, all 20 queries, and all 3 waves. Both are startups: Height (project management) and Loops (email marketing).

These are not obscure products. Height has raised over $28M in funding. Loops has active developer adoption and community presence. Neither registers in any AI engine's recommendations for their core categories.

| Brand | Category | Funding | Mentions (All Engines, All Waves) | Citations |
|---|---|---|---|---|
| Height | Project Management | $28M+ | 0 | 0 |
| Loops | Email Marketing | $17M+ | 0 | 0 |

This is the baseline reality for startups in AI search. If you are not already in the training data or the retrieval index, you do not exist.

The 2.4x Gap: Startups vs. Enterprise

Across all waves and engines, enterprise brands averaged 17.3 mentions per brand. Startups averaged 7.1. The gap is consistent across categories.

| Tier | Brands | Avg Mentions Per Brand | Avg Citations Per Brand |
|---|---|---|---|
| Enterprise | 8 | 17.3 | 6.8 |
| Midmarket | 9 | 12.4 | 4.1 |
| Startup | 8 | 7.1 | 2.3 |

The 2.4x mention gap between enterprise and startup tiers is not surprising on its own. What matters is that this gap exists even when the query is category-generic ("best email marketing platform") rather than brand-specific. AI engines are not just reflecting market share. They are amplifying it.
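The headline ratio falls directly out of the tier averages in the table above:

```python
# Per-brand mention averages from the tier table.
enterprise_avg_mentions = 17.3
startup_avg_mentions = 7.1

# Enterprise brands are mentioned ~2.4x as often as startups.
gap = enterprise_avg_mentions / startup_avg_mentions
print(round(gap, 1))  # 2.4
```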

"Alternative to X" Queries Favor Incumbents

One might expect that queries like "best alternative to Salesforce" or "HubSpot alternative for startups" would surface challengers. They do not. Across all engines and waves, incumbents appeared as the #1 recommendation in 87% of "alternative to X" queries.

| Query Type | Incumbent at #1 | Startup at #1 | Other |
|---|---|---|---|
| Informational | 62% | 18% | 20% |
| Comparative | 71% | 14% | 15% |
| "Alternative to X" | 87% | 8% | 5% |

The engines interpret "alternative to X" as "something similar to X" and then recommend... brands that look like X. Enterprise brands with similar feature sets, pricing tiers, and market positioning dominate. The full analysis of this pattern shows that even when a startup is objectively a better fit for the query, the engines default to established names.

Section 2: Engine Behavior and Disagreement

AI engines produce structurally different recommendation sets and are not converging over time. They disagree on the #1 pick in 50% of queries, ChatGPT's startup-friendliness is declining, and category-level fragmentation varies dramatically.

AI Engines Disagree on the Top Pick Half the Time

Across the 20 queries, engines disagreed on the #1 recommendation in 50% of queries. Consensus is not improving over time. It oscillated: 50%, 55%, 50% across the three waves.

| Metric | Wave 1 | Wave 2 | Wave 3 | Pattern |
|---|---|---|---|---|
| Strong consensus (4+/5 agree on #1) | 50% | 55% | 50% | Oscillating |
| Pairwise overlap floor | 58% | 63% | 58% | Oscillating |

The pairwise overlap floor (the lowest agreement between any two engines on which brands to mention at all) also oscillated between 58% and 63%. The engines are not converging. They are producing structurally different recommendation sets on each run.

This matters for startups because engine disagreement creates openings. If all engines agreed on the same recommendations, there would be no way in. The fact that they disagree means a startup can gain traction on one engine and use it as a foothold.
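Both consensus metrics are straightforward to compute from per-engine results. Here is a sketch under stated assumptions: the study does not publish its overlap formula, so Jaccard similarity is used as a stand-in, and the brand/query data below is toy data, not the study's:

```python
from collections import Counter
from itertools import combinations

def strong_consensus_rate(top_picks: dict) -> float:
    """Share of queries where 4+ of 5 engines agree on the #1 pick.
    top_picks maps query -> {engine: brand}."""
    agreeing = sum(
        1 for picks in top_picks.values()
        if Counter(picks.values()).most_common(1)[0][1] >= 4
    )
    return agreeing / len(top_picks)

def pairwise_overlap_floor(mention_sets: dict) -> float:
    """Lowest overlap between any two engines' mentioned-brand sets.
    Jaccard similarity is an assumption; the study's exact formula
    is not published."""
    return min(
        len(a & b) / len(a | b)
        for a, b in combinations(mention_sets.values(), 2)
    )

# Toy data for illustration (not from the study):
picks = {
    "best crm": {"ChatGPT": "Salesforce", "Perplexity": "Salesforce",
                 "Gemini": "Salesforce", "Grok": "Salesforce", "Claude": "HubSpot"},
    "best pm tool": {"ChatGPT": "Asana", "Perplexity": "Linear",
                     "Gemini": "Monday", "Grok": "Asana", "Claude": "Notion"},
}
print(strong_consensus_rate(picks))  # 0.5
```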

ChatGPT Is Pulling Away From Startups

ChatGPT's rate of placing a startup as its #1 recommendation dropped across the study.

| Engine | Startup-at-#1 Rate (Wave 1) | Startup-at-#1 Rate (Wave 3) | Change |
|---|---|---|---|
| ChatGPT | 25% | 15% | -10pp |
| Perplexity | 10% | 10% | 0pp |
| Gemini | 10% | 15% | +5pp |
| Grok | 15% | 15% | 0pp |
| Claude | 10% | 10% | 0pp |

No engine recommends startups first in more than 15% of queries by Wave 3. ChatGPT had the highest startup-at-#1 rate in Wave 1 at 25%, but it dropped to 15% by Wave 3. This aligns with ChatGPT's broader pattern of citation volatility: it gives more initially, then corrects toward established brands.

Category Fragmentation: Project Management vs. Analytics

Not all categories behave the same way. Project management is the most fragmented category in the dataset. Analytics is consolidating fastest.

| Category | #1 Consensus (4+/5 engines agree) | Pattern |
|---|---|---|
| Project Management | 0/4 queries | No consensus on any query |
| Email Marketing | 1/4 queries | Mailchimp dominates on 1 query |
| CRM | 2/4 queries | Salesforce + HubSpot split |
| Developer Tools | 2/4 queries | Category-dependent |
| Analytics | 3/4 queries | Google Analytics default, consolidating |

In project management, none of the four tracked queries produced a 4-of-5 consensus on the #1 recommendation; the engines split their picks on every query. There is no default. This is the most open category for startups to break in, because no brand has locked down the recommendation slot. Category-level consensus patterns are the clearest signal for where AEO effort will have the highest return.

Section 3: What Is Working

The startups that break through share a pattern: deep technical content, niche positioning, and strong community presence. PostHog and Beehiiv demonstrate that startups can win specific queries against category leaders when their positioning aligns exactly with what the user asked.

PostHog: The Breakout Startup

PostHog is the only startup in the dataset whose citation count increased in every wave.

| Wave | PostHog Citations (URL links) | PostHog Mentions (name in text) |
|---|---|---|
| Wave 1 (Mar 6) | 2 | 8 |
| Wave 2 (Mar 10) | 3 | 10 |
| Wave 3 (Mar 15) | 5 | 12 |

No other brand in the study, at any tier, increased its citation count in every wave-over-wave interval. PostHog's growth is coming from developer-focused queries where its documentation, open-source presence, and community content give it a structural advantage. The engines are surfacing PostHog not because of brand awareness but because the content PostHog produces matches what LLMs treat as credible sources.

PostHog demonstrates that startups can gain AI search visibility without enterprise-scale marketing budgets. The prerequisites are specific: deep technical content, active community, third-party validation, and consistent publishing cadence.

Beehiiv vs. Mailchimp: Niche Positioning Wins on 3 of 5 Engines

Beehiiv, a startup newsletter platform, beat Mailchimp (the category incumbent) for newsletter-specific queries on 3 of 5 engines. This is the clearest example of a startup outranking a category leader through niche positioning.

| Engine | #1 for "best newsletter platform" | #1 for "best email marketing platform" |
|---|---|---|
| ChatGPT | Beehiiv | Mailchimp |
| Perplexity | Beehiiv | Mailchimp |
| Gemini | Mailchimp | Mailchimp |
| Grok | Beehiiv | Mailchimp |
| Claude | Substack | Mailchimp |

When the query is broad ("best email marketing platform"), Mailchimp wins on all 5 engines. When the query narrows to newsletters, Beehiiv takes 3 of 5. The lesson: startups do not beat incumbents on generic category queries. They beat them on specific sub-category queries where they have clear positioning and content depth.

This is a repeatable strategy. The engines are not randomly assigning recommendations. They are matching query specificity to brand positioning. If your content is all about newsletters and the query is about newsletters, you have a structural advantage over a generalist platform even if that platform has 100x your market share.

Section 4: Citation Sources and Engine Quirks

The engines differ dramatically in which sources they trust. 92.5% of all citation URLs point to third-party sources, Grok cites Reddit 13x more than other engines, and single-snapshot measurements are unreliable due to week-over-week citation swings of up to 48%.

92.5% of Citations Go to Third-Party Sources

Across all engines and waves, 92.5% of all citation URLs point to third-party sources, not brand websites. The engines overwhelmingly cite review sites, comparison articles, blog posts, documentation hubs, and community threads over official brand pages.

| Source Type | Share of All Citation URLs |
|---|---|
| Third-party review/comparison sites | 41.2% |
| Documentation and technical content | 22.8% |
| Community threads (Reddit, forums, etc.) | 15.3% |
| News and press coverage | 13.2% |
| Brand websites (official pages) | 7.5% |

This means your own website is not the primary vector for AI search visibility. The sources that LLMs trust are the third-party pages that mention you. Your AEO strategy needs to account for where your brand appears on sites you do not control.
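The brand-vs-third-party split above comes down to classifying each citation URL by its domain. A minimal sketch, assuming a hypothetical (and far smaller than the study's) domain-to-bucket mapping:

```python
from urllib.parse import urlparse

# Illustrative domain buckets; the study's actual taxonomy is broader.
SOURCE_BUCKETS = {
    "g2.com": "third-party review/comparison",
    "capterra.com": "third-party review/comparison",
    "reddit.com": "community thread",
}

def classify_citation(url: str, brand_domains: set) -> str:
    """Bucket a citation URL as brand-owned or third-party, mirroring
    the source-type breakdown above (simplified sketch)."""
    host = urlparse(url).netloc.removeprefix("www.")
    if host in brand_domains:
        return "brand website"
    return SOURCE_BUCKETS.get(host, "other third-party")

print(classify_citation("https://www.g2.com/products/posthog", {"posthog.com"}))
# third-party review/comparison
print(classify_citation("https://posthog.com/docs", {"posthog.com"}))
# brand website
```

The useful output of a classifier like this is the share of citations you control versus the share you can only influence indirectly.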

Grok Cites Reddit 13x More Than Other Engines

Engine-level differences in source preferences are dramatic.

| Metric | ChatGPT | Perplexity | Gemini | Grok | Claude |
|---|---|---|---|---|---|
| Brand site citation rate | 18.4% | 12.1% | 9.8% | 8.5% | 11.2% |
| Reddit citation rate | 2.1% | 3.4% | 1.8% | 27.3% | 2.0% |
| Review site citation rate | 38.5% | 42.1% | 44.2% | 31.8% | 41.3% |

Grok cites Reddit threads 13x more frequently than the average of the other four engines. If your brand has strong Reddit presence (genuine community mentions, not astroturfed threads), Grok is disproportionately likely to surface it. ChatGPT, by contrast, links to brand websites at roughly double the rate of Grok.

These engine-level source preferences determine where your AEO investment should go. A multi-engine strategy requires understanding which engines reward which source types. Optimizing for one engine at the expense of others is a common mistake, given that each engine has distinct retrieval and ranking behavior.

Single Snapshots Are Unreliable

Brand citation counts swung by as much as 48% between identical query runs one week apart, and even more on small bases. ChatGPT's total brand citations went from 23 to 12 between Wave 1 and Wave 2, a single-week drop of 48%.

| Engine | Max Week-Over-Week Citation Swing |
|---|---|
| ChatGPT | 48% |
| Grok | 250% (2 to 7) |
| Perplexity | 29% |
| Gemini | 17% |
| Claude | 0% |

Claude was the most stable engine: identical citation counts across all three waves. ChatGPT and Grok were the most volatile. Nondeterministic citation behavior means any single snapshot of your AI search visibility is unreliable. You need longitudinal tracking to separate signal from noise.
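The swing figures above are simple relative changes between consecutive waves, with the earlier wave as the base. A minimal sketch reproducing the ChatGPT and Grok numbers:

```python
def max_weekly_swing(citation_counts):
    """Largest week-over-week percentage change in an engine's total
    brand citations, measured relative to the earlier wave."""
    return max(
        abs(later - earlier) / earlier
        for earlier, later in zip(citation_counts, citation_counts[1:])
        if earlier > 0
    )

print(round(max_weekly_swing([23, 12]) * 100))  # 48  (ChatGPT, Wave 1 -> Wave 2)
print(round(max_weekly_swing([2, 7]) * 100))    # 250 (Grok)
```

Note how base size matters: Grok's 250% swing is only 5 citations in absolute terms, which is why low-volume brands see the wildest percentage moves.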

What This Means for Startups

The data points to five actionable conclusions.

1. If you are invisible today, you will stay invisible without intervention. Height and Loops received zero mentions across 900 data points. AI engines do not discover brands organically. If you are not in the retrieval set, you need to get there deliberately.

2. Generic category queries are a losing game for startups. The 2.4x mention gap and 87% incumbent advantage on "alternative to" queries make broad positioning a dead end. Target specific sub-categories where your positioning is clear and your content is deepest.

3. Third-party sources are your primary lever. With 92.5% of citations going to third-party URLs, your brand website alone is insufficient. Focus on getting mentioned in the review articles, comparison posts, documentation hubs, and community threads that the engines actually cite.

4. Track multiple engines over multiple weeks. Engine disagreement on the #1 recommendation at 50%, pairwise overlap floors oscillating between 58-63%, and single-engine citation swings up to 48% all point to the same conclusion. Single snapshots from single engines are not strategy. Longitudinal, multi-engine tracking is the minimum viable measurement.

5. PostHog and Beehiiv show the path. Deep technical content, niche positioning, community presence, and consistent publishing cadence are the common threads. PostHog grew from 2 to 5 citations across three waves. Beehiiv beats a 20-year-old category leader on 3 of 5 engines for newsletter queries. The engines reward specificity and depth over scale.

Quarterly Benchmark: Summary Table

| Finding | Metric | Value |
|---|---|---|
| Completely invisible brands | Count | 2 of 25 (8%), both startups |
| Startup vs. enterprise mention gap | Ratio | 2.4x (7.1 vs. 17.3 avg mentions) |
| Engine disagreement on #1 pick | Rate | 50% of queries |
| ChatGPT startup-at-#1 trend | Change | 25% to 15% (Wave 1 to Wave 3) |
| PostHog citation growth | Trajectory | 2 → 3 → 5 across three waves |
| Beehiiv vs. Mailchimp (newsletter queries) | Engine wins | 3 of 5 engines |
| Third-party citation share | Rate | 92.5% of all citation URLs |
| Grok Reddit citation premium | Multiple | 13x vs. other engines |
| "Alternative to X" incumbent advantage | Rate | 87% |
| Most fragmented category | Category | Project Management (0/4 consensus) |
| Max citation swing (week-over-week) | Magnitude | 48% |
| Pairwise overlap floor | Range | 58-63%, oscillating |

This is the Q1 2026 baseline. We will repeat this study quarterly to track whether the startup visibility gap narrows, which engines shift their source preferences, and whether the brands investing in AEO move the needle.

If your startup is not showing up in AI search results today, the data is clear: it will not start showing up on its own. The brands that are gaining ground are the ones treating AI search visibility as an active, measurable, multi-engine discipline.


The FogTrail AEO platform tracks your brand's visibility across all 5 major AI search engines, identifies gaps, and produces verified content to close them. As of March 2026, FogTrail is $499/mo ($399/mo annual) with 100 queries, 5 engines, and 48-hour monitoring cycles.

Frequently Asked Questions

How many brands are completely invisible in AI search?

In our study of 25 B2B SaaS brands, 2 (8%) received zero mentions across all 5 engines, all 20 queries, and all 3 waves. Both are startups with significant funding (Height at $28M+, Loops at $17M+). AI engines do not discover brands organically. If you are not in the retrieval set, you need to get there deliberately.

Which AI engine is most startup-friendly?

No engine recommends startups first in more than 15% of queries by Wave 3. ChatGPT had the highest startup-at-#1 rate in Wave 1 at 25%, but it dropped to 15% by Wave 3. Startup-friendliness across engines is volatile, not systematically rising.

Do "alternative to X" queries help startups get discovered?

Counter to expectations, incumbents appeared as the #1 recommendation in 87% of "alternative to X" queries. The engines interpret "alternative to X" as "something similar to X" and default to established brands with similar feature sets and market positioning.

How much do AI engine recommendations vary between identical runs?

Citation counts swung up to 48% between identical query runs one week apart. ChatGPT and Grok were the most volatile. Claude was the most stable, returning identical citation counts across all three waves. Single snapshots are unreliable; longitudinal, multi-engine tracking is the minimum viable measurement.
