Tracked prompts — the queries we test against

By Rankply · 19 May 2026 · 7 min read

## What a tracked prompt is

A tracked prompt is a question we run through every configured AI engine on every audit. The same prompts, every time. By running the same set repeatedly, we can chart whether your brand's appearance rate goes up or down month over month.
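
As a rough illustration of what "charting the appearance rate" means, here is a minimal sketch. The function names and the data shape (one boolean per prompt run) are assumptions made for the example, not Rankply's actual data model.

```python
def appearance_rate(mentions):
    """Share of prompt runs in one audit where the brand appeared."""
    return sum(mentions) / len(mentions) if mentions else 0.0


def month_over_month(audits):
    """audits: list of (label, [bool, ...]) in chronological order.

    Returns (label, rate, change_vs_previous) tuples. The change is None
    for the first audit because there is nothing to compare against.
    """
    rows = []
    previous_rate = None
    for label, mentions in audits:
        rate = appearance_rate(mentions)
        change = None if previous_rate is None else rate - previous_rate
        rows.append((label, rate, change))
        previous_rate = rate
    return rows


# Hypothetical data: the same four prompt runs in two consecutive audits.
audits = [
    ("2026-03", [True, False, False, False]),
    ("2026-04", [True, True, False, False]),
]
trend = month_over_month(audits)
```

The comparison only means something because the prompt list is identical in both audits; change the prompts and the two rates are no longer measuring the same thing.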

Without tracked prompts, every audit is a one-off snapshot. With them, you get a curve — and the curve is the only thing that matters when measuring AI-visibility growth. A single audit tells you where you stand today; a sequence of audits tells you whether your strategy is working.

## How Rankply seeds your prompt set

On your first audit, we generate a prompt pack that mixes three flavours, calibrated to the typical buyer journey in your category:

**Brand-recall prompts** (weighted 2x). These name your brand directly: "What does {brand} do?", "{brand} reviews and pricing", "Is {brand} a good {category} provider?", "How does {brand} compare to its competitors?". These test whether the AI knows you exist at all and what it thinks you do. We weight these heavily because direct recall is the foundation: if the AI doesn't know who you are, the comparison and category prompts can't move.

**Head-to-head comparison prompts** (weighted 1.5x). When we know your top competitor, we add prompts like "{brand} vs {competitor}", "Best alternatives to {competitor}", "{competitor} vs {brand}: which is better for {use case}". These test category-position visibility, which is where buyer intent peaks. Comparison prompts are the highest-value queries to win — they're the ones where a buyer has narrowed the field and is making a decision.

**Category prompts** (weighted 1x). The most generic — "Best {category} companies in 2026", "Top {category} platforms reviewed", "Leading {category} vendors for {audience}". These test whether you surface in unprompted discovery. They're the noisiest prompts (any small brand will look invisible) but the highest-ceiling — winning category prompts is what separates category-defining brands from also-rans.

The mix is deliberate. Pure category prompts give noisy results. Pure brand-recall prompts give artificially high scores (the AI mentions you because you're named in the prompt). The blend is what produces a meaningful number — one that moves predictably as your underlying signals strengthen.
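
To make the blend concrete, here is a minimal sketch of how such a pack could be assembled from the templates and weights listed above. `TEMPLATES`, `TrackedPrompt`, and `build_prompt_pack` are hypothetical names for illustration (with `{use case}` written as `use_case` so it is a valid field name); the real seeding also draws on search data and a category-prompt library, which this sketch leaves out.

```python
from dataclasses import dataclass

# Prompt classes, weights, and templates as described in this section.
TEMPLATES = {
    "brand_recall": (2.0, [
        "What does {brand} do?",
        "{brand} reviews and pricing",
        "Is {brand} a good {category} provider?",
        "How does {brand} compare to its competitors?",
    ]),
    "comparison": (1.5, [
        "{brand} vs {competitor}",
        "Best alternatives to {competitor}",
        "{competitor} vs {brand}: which is better for {use_case}",
    ]),
    "category": (1.0, [
        "Best {category} companies in 2026",
        "Top {category} platforms reviewed",
        "Leading {category} vendors for {audience}",
    ]),
}


@dataclass(frozen=True)
class TrackedPrompt:
    text: str
    prompt_class: str
    weight: float


def build_prompt_pack(brand, category, audience, use_case, competitor=None):
    """Fill every template; skip comparison prompts if no competitor is known."""
    fields = dict(brand=brand, category=category, audience=audience,
                  use_case=use_case, competitor=competitor)
    pack = []
    for prompt_class, (weight, templates) in TEMPLATES.items():
        if prompt_class == "comparison" and competitor is None:
            continue
        for template in templates:
            pack.append(TrackedPrompt(template.format(**fields), prompt_class, weight))
    return pack


# Hypothetical brand and competitor, for illustration only.
pack = build_prompt_pack("Acme", "CRM", "startups", "small teams", competitor="Globex")
```

Keeping the class and weight attached to each prompt is what later lets the score treat a brand-recall hit differently from a category hit.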

## How many prompts is enough

For a tight B2B niche, 15-25 prompts is usually enough to produce a stable score. For a broader category or a brand operating across multiple sub-segments, 30-60 prompts gives better signal. Above 100, the marginal value of each new prompt drops sharply.

The right number is the one that covers the queries your buyers actually use. We seed the pack from a combination of search data, your own sales team's anecdotes (captured in the brief editor), and our category-prompt library. You can review and edit the set before your first paid audit runs.

## Editing the prompt set

You can add or remove prompts at any time via **Settings → Brief**. Add prompts that buyers in your specific niche actually use — internal sales teams usually know these by heart, and adding 3-5 of them within the first month is the single most valuable customisation you can make. Remove prompts that don't apply (a B2B SaaS doesn't need "best holiday gifts for X" in its prompt pack).

A common pattern: customers add prompts for objections they hear in sales calls. If buyers regularly ask "is {brand} compliant with {regulation}?", that's a tracked prompt worth adding. If they ask "does {brand} integrate with {tool}?", that's another. Each one becomes a measurable, trackable position in your visibility surface.

## How the score is calculated

For each prompt × engine combination, we record whether your brand was mentioned, in what position, and with what sentiment. The overall visibility score is a weighted mention rate built from these factors:

- **Mention or not** (binary, but capturing the core signal)
- **Position weight** (mentions in the first paragraph count more than passing mentions in the third)
- **Sentiment polarity** (positive > neutral > negative, with explicit penalties for negative)
- **Engine weight** (configurable: most customers run all four equally, but you can downweight an engine if your buyers don't use it)
- **Prompt-class weight** (brand-recall 2x, comparison 1.5x, category 1x as above)
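
The factors above could be combined along the following lines. This is a sketch under stated assumptions, not the production formula: the prompt-class weights come from this article, but the position and sentiment factors, the clamp at zero, and the names `Result` and `visibility_score` are all invented for illustration.

```python
from dataclasses import dataclass

CLASS_WEIGHT = {"brand_recall": 2.0, "comparison": 1.5, "category": 1.0}
# Assumed values: the article gives only the ordering, not the numbers.
POSITION_FACTOR = {1: 1.0, 2: 0.75, 3: 0.5}
SENTIMENT_FACTOR = {"positive": 1.0, "neutral": 0.75, "negative": -0.25}


@dataclass
class Result:
    prompt_class: str
    engine: str
    mentioned: bool
    paragraph: int = 0          # 1-based paragraph of the first mention
    sentiment: str = "neutral"


def visibility_score(results, engine_weight=None):
    """Weighted mention rate as a percentage, clamped at 0 (an assumption)."""
    engine_weight = engine_weight or {}
    earned = 0.0
    possible = 0.0
    for r in results:
        weight = CLASS_WEIGHT[r.prompt_class] * engine_weight.get(r.engine, 1.0)
        possible += weight
        if r.mentioned:
            # Mentions after the third paragraph reuse the lowest factor.
            position = POSITION_FACTOR.get(r.paragraph, 0.5)
            earned += weight * position * SENTIMENT_FACTOR[r.sentiment]
    if possible == 0:
        return 0.0
    return max(0.0, 100.0 * earned / possible)


# One strong brand-recall hit and two category misses on a hypothetical engine.
results = [
    Result("brand_recall", "engine_a", True, paragraph=1, sentiment="positive"),
    Result("category", "engine_a", False),
    Result("category", "engine_a", False),
]
score = visibility_score(results)  # 2.0 earned out of 4.0 possible
```

Filtering `results` to a single engine before calling `visibility_score` gives a per-engine figure, and passing `engine_weight={"engine_b": 0.0}` removes an engine from the composite entirely.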

The result lands as a single percentage on your dashboard, plus a per-engine breakdown so you can see exactly which AI engine knows you best — and which one needs work. The platform-scan component then maps the prompt-by-prompt results back to the underlying content and citation gaps driving them, so you can see not just *what* the number is but *why*.

## Why this measurement model matters

Most "AI visibility" products give you a single black-box number. We expose the prompts, the weights, and the per-engine responses, so when the number moves, you know why. And when you change your strategy, you can predict in advance which prompts should benefit.

This transparency matters for two reasons. First, board reporting: when your CMO needs to explain a quarterly visibility lift to the founder, "we went from 12% to 19% on brand-recall prompts, driven by the Bloomberg placement and the new G2 listing" is a credible story. "Our AI score went up" is not.

Second, debugging: when the number stalls or drops, you can see exactly which prompts moved against you and trace the cause. A drop in brand-recall? Probably a stale Wikipedia edit or a new negative on a high-authority source. A drop in category visibility? Probably a competitor just landed a major placement and pushed you out of the standard comparison set.

## What to do when the number doesn't move

A common scenario: the month 2 audit shows the same composite score as month 1. The instinct is to declare GEO (generative engine optimisation) broken. Almost always, the right read is different.

The composite score is a weighted average; individual prompt-level wins can be masked by stable performance on other prompts. Drill into the per-prompt view and check three things: which specific prompts moved (positively or negatively), which engines moved (sometimes one engine surges while another lags by a month), and which source domains shifted in the citation-source leaderboard.
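
The first two checks can be pictured as a diff between two audits. The sketch below assumes each audit is a mapping from `(prompt, engine)` to a mentioned/not-mentioned flag; that shape and the function names are hypothetical, chosen only to show how a flat headline rate can hide movement underneath.

```python
from collections import defaultdict


def prompt_deltas(previous, current):
    """Return the (prompt, engine) pairs that gained or lost a mention."""
    gained = sorted(k for k, hit in current.items() if hit and not previous.get(k, False))
    lost = sorted(k for k, hit in previous.items() if hit and not current.get(k, False))
    return gained, lost


def engine_rates(audit):
    """Mention rate per engine for one audit."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for (_prompt, engine), mentioned in audit.items():
        total[engine] += 1
        hits[engine] += int(mentioned)
    return {engine: hits[engine] / total[engine] for engine in total}


# Hypothetical audits: two prompts on two engines.
month_1 = {("p1", "a"): True, ("p2", "a"): False,
           ("p1", "b"): False, ("p2", "b"): True}
month_2 = {("p1", "a"): False, ("p2", "a"): True,
           ("p1", "b"): True, ("p2", "b"): True}

gained, lost = prompt_deltas(month_1, month_2)
rates_before = engine_rates(month_1)
rates_after = engine_rates(month_2)
```

In this made-up data, engine `a` reports the same rate in both months even though it lost one prompt and gained another, which is exactly the kind of movement a composite hides.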

Nine times out of ten, when the headline number looks stuck, the underlying signals are actually moving; they just haven't crossed the threshold that flips a previously unmentioned prompt into a mentioned one. Once enough citation density accumulates, multiple prompts flip simultaneously and the composite jumps. The jump can look sudden from the outside, but it's the visible result of weeks of underlying movement.

If after 3 months of consistent investment the underlying signals also haven't moved, the diagnostic question is which signal is stuck. The four-pillars breakdown plus the citation-source leaderboard usually narrow it down within a single review session.

## Refresh cadence

Tracked prompts run on every audit — monthly for paid customers, ad-hoc for free audits. The same prompts every time means the curve is comparable across months. We re-seed the prompt set annually (or sooner if your category shifts) to keep it relevant; you'll see proposed changes in the brief editor with the option to accept, edit, or reject each one.

Stable prompts, moving signals, visible curve. That's the measurement loop the entire product is built around.