AEO · AI Search Engines · Data Research · Pairwise Overlap · Multi-Engine Optimization
FogTrail Team

AI Search Engines Are Forming Alliances: Which Engines Agree and Why It Matters

FogTrail analyzed pairwise brand mention overlap across ChatGPT, Perplexity, Gemini, Grok, and Claude over 3 weekly waves (300 engine-query pairs per wave, 20 queries, 25 B2B SaaS brands). The data reveals a Grok-Claude-Gemini cluster forming, with all three engines sharing 69% or higher overlap with each other as of March 2026. ChatGPT is going its own way: its overlap with Gemini dropped to 58%, the lowest pairwise agreement in the entire dataset, while its overlap with Grok climbed to 71%. Perplexity sits in the middle, with moderate overlap across all engines. The highest-agreement pair rotated every week (Perplexity-Gemini 71% in Wave 1, Grok-Gemini 79% in Wave 2, Grok-Claude 75% in Wave 3), and the overlap floor oscillated between 58% and 63%, disproving the hypothesis that engines are converging toward shared recommendations.

These shifting alliances have direct consequences for AEO strategy. Optimizing for one engine inside the Grok-Claude-Gemini cluster is likely to benefit the other two, but ChatGPT increasingly requires its own playbook.

The Bottom Line

  • A Grok-Claude-Gemini cluster is forming: all three engines share 69%+ pairwise overlap, meaning content that works for one tends to work for the others
  • ChatGPT is diverging from the pack, sharing 64% or less overlap with 3 of 4 engines in Wave 3 and just 58% with Gemini
  • Engine alliances shift weekly. The highest-agreement pair rotated every wave, and the strong consensus rate oscillated from 50% to 55% and back to 50%, proving these relationships are not fixed

What Pairwise Overlap Actually Measures

Pairwise overlap is the percentage of brand mentions shared between two engines across all queries in a wave. If Grok mentions 18 brands across 20 queries and Claude mentions 17 of those same 18, their overlap is high. If ChatGPT mentions a substantially different set, its overlap with both is lower. This is not about whether two engines agree on the number one pick for a single query. It measures how similar their entire recommendation universes are across the full query set.

FogTrail calculates this as the Jaccard index of brand mentions shared between engine pairs across all 20 queries. The metric captures structural similarity in which brands each engine considers relevant, independent of ranking order.
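
For readers who want to compute this themselves, here is a minimal sketch of a Jaccard-style overlap calculation, assuming each engine's mentions are collapsed into one set of brand names per wave. The brand sets below are illustrative placeholders, and the function is an assumption about how such a calculation could be written, not FogTrail's actual pipeline.

```python
# Minimal sketch: pairwise brand-mention overlap as a Jaccard index.
# The engine names match this post; the brand sets are illustrative only.
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of all brands mentioned by either engine that both engines mention."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# One wave of data: every brand an engine mentioned across the full query set.
mentions = {
    "Grok":    {"BrandA", "BrandB", "BrandC", "BrandD"},
    "Claude":  {"BrandA", "BrandB", "BrandC"},
    "ChatGPT": {"BrandA", "BrandE", "BrandF"},
}

for engine_1, engine_2 in combinations(mentions, 2):
    print(f"{engine_1}-{engine_2}: {jaccard(mentions[engine_1], mentions[engine_2]):.0%}")
```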

The Full Pairwise Tables Across 3 Waves

Three waves of data reveal how engine relationships evolved from March 6 to March 15, 2026.

Wave 1 (March 6, 2026)

              Perplexity   ChatGPT   Gemini   Grok   Claude
Perplexity        -          64%      71%     63%     66%
ChatGPT          64%          -       62%     58%     61%
Gemini           71%         62%       -      67%     70%
Grok             63%         58%      67%      -      62%
Claude           66%         61%      70%     62%      -

Highest pair: Perplexity-Gemini (71%). Lowest pair: ChatGPT-Grok (58%).

Wave 2 (March 10, 2026)

              Perplexity   ChatGPT   Gemini   Grok   Claude
Perplexity        -          65%      72%     66%     63%
ChatGPT          65%          -       67%     70%     67%
Gemini           72%         67%       -      79%     69%
Grok             66%         70%      79%      -      72%
Claude           63%         67%      69%     72%      -

Highest pair: Grok-Gemini (79%). Lowest pair: Perplexity-Claude (63%).

Wave 3 (March 15, 2026)

              Perplexity   ChatGPT   Gemini   Grok   Claude
Perplexity        -          64%      67%     67%     62%
ChatGPT          64%          -       58%     71%     62%
Gemini           67%         58%       -      74%     69%
Grok             67%         71%      74%      -      75%
Claude           62%         62%      69%     75%      -

Highest pair: Grok-Claude (75%). Lowest pair: ChatGPT-Gemini (58%).

The Grok-Claude-Gemini Cluster

Grok, Claude, and Gemini maintained pairwise overlap of 69% or higher with each other in Wave 3: Grok-Claude at 75%, Grok-Gemini at 74%, and Gemini-Claude at 69%. This cluster has been visible since Wave 1, where Gemini-Claude was already at 70% and Gemini-Grok at 67%. By Wave 2, Grok-Gemini surged to 79%, the highest single-pair overlap in the entire dataset. The cluster tightened further in Wave 3 as Grok-Claude rose to 75%.

What makes this cluster significant is the consistency. Apart from Wave 1's Grok-Claude reading of 62%, no pair within the trio has dropped below 67% in any wave, and every pair has sat at 69% or above since Wave 2. Compare that to ChatGPT's relationship with Gemini, which has bounced between 58% and 67%, a 9-point swing.

The practical implication: content that earns mentions from one engine inside this cluster is more likely to earn mentions from the other two. A brand that Grok cites has a strong chance of also being cited by Claude (the pair shares 75% of its brand mentions) and by Gemini (74%). That is not a guarantee, but it is a meaningfully better hit rate than optimizing for ChatGPT and hoping it transfers.

ChatGPT Is Going Its Own Way

ChatGPT's overlap with Gemini moved from 62% (Wave 1) to 67% (Wave 2), then dropped to 58% (Wave 3). That 58% is the lowest pairwise agreement between any two engines in the entire three-wave dataset. ChatGPT's Wave 3 overlaps with Claude (62%) and Perplexity (64%) also sit near the bottom of the matrix, making it the most isolated engine in the recommendation landscape.

The divergence is not just statistical. ChatGPT's behavior is qualitatively different from the other four engines. It is the only engine that heavily cites Wikipedia (10.4% of its URLs). It dropped ActiveCampaign entirely from email marketing responses in Wave 3 while the other four engines continued to mention it. It gave Netlify its first-ever number one position, a structural break that no other engine mirrored. And its brand citation count swings more than any other engine's: 23, 12, and 14 across three waves, compared to Claude's unwavering 6, 6, 6.

There is one exception to ChatGPT's isolation. Its overlap with Grok has been climbing steadily: 58% (Wave 1), 70% (Wave 2), 71% (Wave 3). ChatGPT and Grok are converging with each other even as ChatGPT diverges from the other three. This creates a loose secondary axis in the data: a Grok-ChatGPT alignment that coexists with the broader Grok-Claude-Gemini cluster, with Grok serving as the bridge between the two groups.

For brands, this means optimizing for ChatGPT requires distinct tactics. Its unique sourcing behavior, its Wikipedia reliance, its volatility, and its willingness to break structural patterns all suggest it processes brand authority signals differently than the other four engines.

Perplexity: The Neutral Middle

Perplexity belongs to neither the Grok-Claude-Gemini cluster nor the Grok-ChatGPT axis. Its overlap with every engine falls between 62% and 72% across all three waves, consistently moderate. It has appeared in an extreme pair only twice: Perplexity-Gemini was the highest pair in Wave 1 (71%), and Perplexity-Claude was the lowest pair in Wave 2 (63%).

Perplexity's position makes it an interesting diagnostic tool. A brand that appears on Perplexity but not on the Grok-Claude-Gemini cluster may have retrieval signals that are too narrow. A brand that appears on the cluster but not Perplexity may lack the recency or source-diversity signals that Perplexity weights. Perplexity was also the first engine to surface Loops, a near-invisible email startup, suggesting it has the most independent retrieval behavior for discovering new brands.
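
Here is a short sketch of that diagnostic, assuming per-engine brand-mention sets like the ones used earlier; the cluster_gap helper and the placeholder brands are hypothetical illustrations, not part of FogTrail's tooling.

```python
# Sketch of the Perplexity diagnostic: split brands by whether Perplexity
# or the Grok-Claude-Gemini cluster surfaces them. The engine grouping comes
# from this post; the brand sets are hypothetical placeholders.
CLUSTER = ("Grok", "Claude", "Gemini")


def cluster_gap(mentions: dict[str, set[str]]) -> dict[str, set[str]]:
    """Compare Perplexity's brand mentions against the cluster's combined mentions."""
    cluster_brands = set().union(*(mentions[engine] for engine in CLUSTER))
    perplexity_brands = mentions["Perplexity"]
    return {
        # Only Perplexity sees the brand: retrieval signals may be too narrow.
        "perplexity_only": perplexity_brands - cluster_brands,
        # Only the cluster sees it: may lack the recency/source-diversity signals Perplexity weights.
        "cluster_only": cluster_brands - perplexity_brands,
    }


example = {
    "Perplexity": {"Loops", "BrandX"},
    "Grok": {"BrandX", "BrandY"},
    "Claude": {"BrandX"},
    "Gemini": {"BrandX", "BrandY"},
}
print(cluster_gap(example))
```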

The Convergence Hypothesis Was Wrong

After Wave 2, the data appeared to show engines converging. The pairwise overlap floor rose from 58% to 63%. Unanimous consensus jumped from 20% to 30%. Strong consensus climbed from 50% to 55%. It looked like AI search was settling into shared recommendations.

Wave 3 disproved this. The floor dropped back to 58%. Strong consensus returned to 50%. The highest-agreement pair rotated for the third consecutive wave. The three-wave pattern is oscillation, not convergence.

Metric                    Wave 1                  Wave 2                  Wave 3
Pairwise floor            58%                     63%                     58%
Strong consensus (4+/5)   50%                     55%                     50%
Highest pair              Perplexity-Gemini 71%   Grok-Gemini 79%         Grok-Claude 75%
Lowest pair               ChatGPT-Grok 58%        Perplexity-Claude 63%   ChatGPT-Gemini 58%
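
This wave-level summary can be derived mechanically from each wave's pairwise matrix. The sketch below shows one way to do it, using the Wave 3 values quoted in this post; running the same code on each wave's matrix is what surfaces the oscillation. The dictionary layout is an assumption, not FogTrail's internal format.

```python
# Sketch: derive the pairwise floor and the highest/lowest pair from one
# wave's overlap matrix. Values are the Wave 3 figures from this post.
wave3 = {
    ("Perplexity", "ChatGPT"): 0.64, ("Perplexity", "Gemini"): 0.67,
    ("Perplexity", "Grok"):    0.67, ("Perplexity", "Claude"): 0.62,
    ("ChatGPT", "Gemini"):     0.58, ("ChatGPT", "Grok"):      0.71,
    ("ChatGPT", "Claude"):     0.62, ("Gemini", "Grok"):       0.74,
    ("Gemini", "Claude"):      0.69, ("Grok", "Claude"):       0.75,
}

lowest_pair = min(wave3, key=wave3.get)   # the pairwise floor
highest_pair = max(wave3, key=wave3.get)  # the highest-agreement pair
print("Pairwise floor:", lowest_pair, f"{wave3[lowest_pair]:.0%}")
print("Highest pair:  ", highest_pair, f"{wave3[highest_pair]:.0%}")
```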

This matters because it means engine alliances are not fixed. Optimizing based on a single week's pairwise data is like trading stocks based on one day's price. The relationships are real but they shift. Continuous monitoring, not point-in-time snapshots, is what separates actionable AEO intelligence from noise. FogTrail's 48-hour monitoring cycle exists precisely because these dynamics move this fast.

What This Means for Multi-Engine AEO Strategy

The pairwise data suggests a practical framework for engine prioritization. If you are optimizing across all five engines, the cluster structure tells you where effort transfers and where it does not.

High transfer: Grok, Claude, Gemini. Content that earns visibility on one of these three engines has the highest probability of transferring to the other two. All three share 69%+ overlap. Start with the engine where you have the strongest existing presence within this trio and let the cluster effect carry you.

Low transfer: ChatGPT. ChatGPT requires dedicated effort. Its 58% overlap with Gemini, the lowest in the dataset, means a substantial share of the brands Gemini mentions are absent from ChatGPT's responses, and vice versa. ChatGPT's sourcing behavior (Wikipedia reliance, brand-owned URL preference at 18.4%, high citation volatility) demands a different optimization approach.

Bridge engine: Grok. Grok sits at the intersection. It overlaps 74-75% with both Gemini and Claude (cluster members) and 71% with ChatGPT (the outlier). Grok is the single engine where gains are most likely to echo across the entire landscape.

Monitoring canary: Perplexity. Perplexity's moderate, independent overlap makes it the best early warning system. If a brand appears on Perplexity first (as Loops did), it may signal emerging retrieval signals that other engines will pick up in subsequent waves.
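
One way to operationalize this framework is to rank engines by their mean overlap with the other four, which is the logic behind treating Grok as the bridge. The sketch below does that for the Wave 3 matrix; the averaging heuristic is an illustration, not FogTrail's scoring method.

```python
# Sketch: rank engines by mean pairwise overlap with the other four engines.
# Uses the Wave 3 values from this post; the averaging rule is illustrative.
from collections import defaultdict

wave3 = {
    ("Perplexity", "ChatGPT"): 0.64, ("Perplexity", "Gemini"): 0.67,
    ("Perplexity", "Grok"):    0.67, ("Perplexity", "Claude"): 0.62,
    ("ChatGPT", "Gemini"):     0.58, ("ChatGPT", "Grok"):      0.71,
    ("ChatGPT", "Claude"):     0.62, ("Gemini", "Grok"):       0.74,
    ("Gemini", "Claude"):      0.69, ("Grok", "Claude"):       0.75,
}

overlaps_by_engine = defaultdict(list)
for (engine_1, engine_2), overlap in wave3.items():
    overlaps_by_engine[engine_1].append(overlap)
    overlaps_by_engine[engine_2].append(overlap)

# Highest mean overlap = the engine whose gains echo most widely across the landscape.
for engine in sorted(overlaps_by_engine, key=lambda e: -sum(overlaps_by_engine[e]) / len(overlaps_by_engine[e])):
    mean = sum(overlaps_by_engine[engine]) / len(overlaps_by_engine[engine])
    print(f"{engine}: {mean:.1%} mean overlap")
```

On the Wave 3 numbers this ranks Grok first at roughly 72% mean overlap, consistent with its bridge role above.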

Why Nobody Else Is Publishing This Data

Most AEO monitoring platforms track whether your brand appears on each engine individually. They report presence and position per engine, per query. That is useful but incomplete. Pairwise overlap data reveals the relationships between engines, which is what actually determines whether optimizing for one engine creates a multiplier effect across others or wastes effort on an island.

As of March 2026, no other AEO platform publishes cross-engine pairwise agreement data. The FogTrail AEO platform tracks these relationships across 5 AI engines simultaneously as part of its intelligence briefing cycle, surfacing cluster shifts and alliance changes that per-engine dashboards miss entirely.

Frequently Asked Questions

Do AI search engines give the same answers?

Partially. As of March 2026, AI engines share between 58% and 75% of their brand mentions for the same queries. Grok, Claude, and Gemini form a high-agreement cluster (69%+ overlap), while ChatGPT diverges, sharing only 58% overlap with Gemini. No two engines give identical answers, but some pairs are far more similar than others.

Which AI engines agree the most?

Grok and Claude had the highest pairwise overlap at 75% in FogTrail's Wave 3 study (March 15, 2026). Grok-Gemini was second at 74%. The highest pair rotates weekly: it was Perplexity-Gemini (71%) in Wave 1, Grok-Gemini (79%) in Wave 2, and Grok-Claude (75%) in Wave 3. The relationships are not fixed.

Should I optimize for all 5 AI engines or focus on one?

Optimizing for one engine inside the Grok-Claude-Gemini cluster provides partial transfer to the other two, but ChatGPT requires separate effort due to its 58% overlap with Gemini. A practical approach: start with one cluster engine, then dedicate separate effort to ChatGPT. Perplexity serves as a monitoring canary for emerging brand signals.

Are AI search engines converging on the same recommendations?

No. FogTrail's three-wave data shows oscillation, not convergence. Strong consensus went 50%, 55%, 50% across three weekly waves, and the pairwise overlap floor oscillated 58%, 63%, 58%. Engine alliances shift weekly, disproving the hypothesis that AI engines are settling into shared recommendations.

Why does ChatGPT disagree with other AI engines?

ChatGPT's divergence stems from distinct sourcing behavior. It is the only engine that heavily cites Wikipedia (10.4% of URLs), has the most volatile brand citation counts (swinging from 23 to 12 to 14 across three waves), and makes unique editorial decisions such as dropping entire brands from responses between waves. Its overlap with Gemini (58%) is the lowest of any engine pair in the dataset.

Related Resources