How AI engines decide which brands to mention

By Rankply · 19 May 2026 · 6 min read

## The black box buyers actually use

When an AI engine answers a buyer question, it's making thousands of micro-decisions about which brands to surface. Those decisions look opaque from the outside, but the underlying signal mix is consistent across engines. ChatGPT, Claude, Gemini, and Perplexity all draw on the same broad universe of public web content, and they all converge on roughly the same four signals to decide who gets named.

Understanding those signals is the entire game. Once you know what the model is weighting, every other GEO decision becomes a budget-allocation question.

## The four signals that move citations

**1. Authority of the source.** AI engines score every domain on a trust axis. Wikipedia and government sites are at the top. Established editorial outlets (TechCrunch, Bloomberg, The Verge, Financial Times) come next. UGC platforms (Reddit, Quora, Hacker News) carry weight but with sentiment-aware adjustments. Your own marketing site barely registers — that's why owned content alone doesn't move the needle. In practical terms, one mention on a top-tier reference domain is worth roughly 30-50 mentions on your own blog.

**2. Recency.** Citations from the last 12 months weigh approximately 3x more than older content; citations from the last 90 days weigh another 1.5x on top of that for fast-moving categories. This is why podcast appearances and recent press placements compound so fast: each one resets the freshness clock, adds a new citation node, and keeps the brand visible in retrieval indices that re-rank by date.

**3. Co-occurrence.** When your brand appears alongside category-defining terms ("CRM for sustainability startups", "carbon-accounting platform", "fintech for SMB"), the AI builds an association. The more often that pairing shows up across distinct sources, the more confident the model becomes that you belong in answers about that category. Co-occurrence is the signal most brands underestimate — it's why an unrelated podcast appearance can boost your visibility on a category query you never targeted directly.

**4. Sentiment polarity.** AI engines analyse the tone around your brand mentions. A single high-impact negative — a viral Reddit thread, an unhappy ex-employee Medium post, a damning investigative article — can suppress your visibility for months across all engines simultaneously. The sentiment signal is asymmetric: one strong negative cancels roughly ten neutral mentions, and recovery from a sentiment hit typically takes 4-9 months even with active mitigation.
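To make the recency and sentiment figures above concrete, here is a minimal sketch that turns them into arithmetic. It is purely illustrative: the function names, the hard 365-day and 90-day boundaries, and the idea that the multipliers stack as simple products are assumptions made for the example, not a description of how any engine actually scores content.

```python
from datetime import date


def recency_weight(published: date, today: date, fast_moving: bool = False) -> float:
    """Illustrative recency multiplier using the rough figures quoted above.

    Assumption: weights stack multiplicatively and the cut-offs are hard
    boundaries. Real engines do not publish their weighting.
    """
    age_days = (today - published).days
    weight = 1.0
    if age_days <= 365:
        weight *= 3.0  # "last 12 months weigh approximately 3x more"
    if fast_moving and age_days <= 90:
        weight *= 1.5  # "another 1.5x on top" for fast-moving categories
    return weight


def net_mention_score(neutral_mentions: int, strong_negatives: int) -> int:
    """Illustrative asymmetry: one strong negative cancels ~ten neutral mentions."""
    return neutral_mentions - 10 * strong_negatives


today = date(2026, 5, 19)
fresh = recency_weight(date(2026, 4, 1), today, fast_moving=True)  # 48 days old
stale = recency_weight(date(2025, 1, 1), today)                    # older than 12 months
print(fresh, stale, net_mention_score(25, 2))
```

Under these assumptions, a 48-day-old placement in a fast-moving category carries 4.5 times the weight of one older than a year, and 25 neutral mentions offset by two strong negatives net out to the equivalent of five.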

## How this plays out in practice

When a buyer asks "best CRM for sustainability startups", the AI runs something close to this sequence:

1. **Candidate generation.** Pull brands from training data + live web search (Perplexity, ChatGPT Search, Gemini Grounding) + retrieval index. Typically 30-80 candidate brands at this stage.
2. **Authority filter.** Drop brands whose supporting sources are below a trust threshold. Leaves 10-25 candidates.
3. **Co-occurrence rank.** Score each by how often they appear alongside the specific query terms in the candidate's supporting sources.
4. **Sentiment discount.** Penalise brands with recent negative coverage. A strong negative can move a candidate from top-3 to unmentioned.
5. **Diversity rule.** Most engines impose a soft cap (3-5 brands named) to keep answers readable.
6. **Citation pick.** Emit a final ranked list, citing the highest-authority source per brand.
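The sequence above can be sketched as a small filter-and-rank pipeline. This is a toy model under stated assumptions, not any engine's real implementation: the `Candidate` fields, the 0-to-1 authority scale, the `trust_threshold` and `negative_penalty` values, and the brand names are all hypothetical, and candidate generation and citation picking are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    brand: str
    authority: float       # hypothetical 0..1 trust score of the best supporting source
    co_occurrence: int     # sources pairing the brand with the query terms
    recent_negative: bool  # strong negative coverage in the recent past


def shortlist(candidates, trust_threshold=0.5, negative_penalty=0.2, cap=3):
    """Authority filter -> co-occurrence rank -> sentiment discount -> soft cap."""
    # Step 2: drop brands whose supporting sources fall below the trust threshold.
    trusted = [c for c in candidates if c.authority >= trust_threshold]

    # Steps 3-4: rank by co-occurrence, discounted when recent coverage is negative.
    def score(c):
        base = float(c.co_occurrence)
        return base * negative_penalty if c.recent_negative else base

    ranked = sorted(trusted, key=score, reverse=True)

    # Step 5: keep only the first few brands so the answer stays readable.
    return [c.brand for c in ranked[:cap]]


candidates = [
    Candidate("Alpha", 0.9, 12, False),
    Candidate("Bravo", 0.8, 20, True),     # strongest pairing, but recent negative
    Candidate("Charlie", 0.3, 30, False),  # fails the authority filter
    Candidate("Delta", 0.7, 8, False),
    Candidate("Echo", 0.6, 5, False),
]
print(shortlist(candidates))  # ['Alpha', 'Delta', 'Echo']
```

In this toy run, Bravo has the second-highest raw co-occurrence count yet falls outside the cap once the sentiment discount is applied, and Charlie never reaches the ranking step at all.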

The buyer sees the output and treats it as objective. The vendors who optimised for these signals are the ones on stage. The ones who didn't are invisible — not punished, just not present.

## Engine-by-engine differences worth knowing

Although the underlying signal mix converges, each engine has a tilt:

- **ChatGPT** leans on training-data recency and live Bing-style retrieval. Strong on direct-recall queries, moderate on long-tail. Updates to training data roll out in roughly 6-month cycles; live retrieval refreshes daily for indexed sources.
- **Claude** weights editorial and reference-graph sources heavily, and tends to be more conservative about naming brands without strong evidence. Brands cited frequently in tier-1 editorial see disproportionate lift here. Conversely, brands relying on UGC supply often underperform on Claude relative to ChatGPT.
- **Gemini** integrates Google's Knowledge Graph aggressively, which makes structured data and entity disambiguation matter more here than elsewhere. A correct Organization schema block and a clean Knowledge Graph entry can shift Gemini citations more than a moderate PR push.
- **Perplexity** is the most retrieval-driven: recency and citation density move the needle fastest, sometimes within days of a new placement. Perplexity is the canary in the coal mine for new PR work: if you don't see lift in Perplexity within 14 days of a major placement, the placement isn't being indexed correctly.
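For readers unsure what an Organization schema block looks like, here is a minimal sketch that builds one as schema.org JSON-LD. The brand name and URLs are placeholders, and the helper name `organization_jsonld` is made up for this example. The snippet only shows the shape of the markup; it makes no claim about how much weight any engine gives it.

```python
import json


def organization_jsonld(name: str, url: str, same_as: list[str]) -> str:
    """Build a minimal schema.org Organization block as a JSON-LD string."""
    block = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        # sameAs links the entity to its other official profiles,
        # which helps with entity disambiguation.
        "sameAs": same_as,
    }
    return json.dumps(block, indent=2)


# Placeholder values for illustration only.
markup = organization_jsonld(
    "ExampleCo",
    "https://www.example.com",
    ["https://www.linkedin.com/company/exampleco"],
)
print(markup)
```

The resulting string is typically embedded in a page inside a `<script type="application/ld+json">` element.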

A balanced GEO programme aims to be cited across all four. Single-engine optimisation is rarely worth the effort because buyers don't standardise on one assistant — and the engine landscape itself shifts faster than your optimisation timeline. The Rankply audit runs each prompt against all four by default; you can configure per-engine weights if your buyer base skews heavily to one assistant, but the platform-default equal weighting is a sensible starting point for most teams.
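As a rough illustration of what per-engine weighting means arithmetically, the sketch below blends per-engine citation rates into a single score. It is not Rankply's configuration format or API: the engine keys, the `weighted_visibility` helper, and the sample rates are all hypothetical.

```python
ENGINES = ("chatgpt", "claude", "gemini", "perplexity")


def weighted_visibility(citation_rates, weights=None):
    """Weighted average of per-engine citation rates (each rate in 0..1).

    With no weights supplied, every engine counts equally, mirroring the
    equal-weighting default described above.
    """
    if weights is None:
        weights = {engine: 1.0 for engine in ENGINES}
    total = sum(weights[engine] for engine in ENGINES)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(citation_rates[e] * weights[e] for e in ENGINES) / total


# Hypothetical share of audited prompts in which the brand was cited.
rates = {"chatgpt": 0.4, "claude": 0.2, "gemini": 0.3, "perplexity": 0.5}

equal = weighted_visibility(rates)
skewed = weighted_visibility(
    rates, {"chatgpt": 3.0, "claude": 1.0, "gemini": 1.0, "perplexity": 1.0}
)
print(round(equal, 3), round(skewed, 3))
```

With equal weights the blended score is the plain mean of the four rates. Tripling the ChatGPT weight pulls the score towards the ChatGPT rate, which is the effect you would want if most of your buyers use that assistant.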

## The signals you can't move (and what to do about them)

A few signals are worth naming because customers ask about them, even though they're not directly controllable:

- **Training-data freezes.** Each model has a knowledge cutoff. Content published after the cutoff won't influence the static-knowledge layer until the next training run. This is why live retrieval matters: Perplexity and the search-augmented modes of ChatGPT, Claude, and Gemini are the only routes for recent content to surface immediately.
- **Engine policy changes.** Each provider tweaks how it weights sources, what it refuses to discuss, and how it cites. A category that was easy to win last quarter can become harder this quarter for reasons that have nothing to do with your effort. The mitigation is diversification: across signals, across engines, across source types.
- **Aggregator API access.** Some category aggregators (G2, Capterra) make data programmatically available to AI engines; others don't. The ones that do are disproportionately influential. We surface this in the citation-source leaderboard so you know which aggregators are actually feeding the engines you care about.

## What we measure for you

Rankply's audit isolates each of these four signals so you know exactly which one is holding you back. Authority gap? We recommend PR placements in the domains AI cites for your category (the citation-source leaderboard surfaces these automatically). Recency gap? We trigger fresh content through the monthly delivery cycle. Co-occurrence gap? We adjust your brief in the brief editor to target the right pairings. Sentiment problem? We surface the offending URL and offer mitigation services — covered in detail in our lesson on negative citations.

The recommendations panel ranks the gaps by expected lift, so you're not guessing which lever to pull first. Most customers find that one or two signals dominate their visibility ceiling — fixing those first is dramatically higher-ROI than spreading effort thinly across all four.

You can't game these signals — but you can build them, deliberately, month over month. That's the entire product, and that's why GEO is a 12-month commitment, not a quarterly campaign.