The Science Of What AI Actually Rewards

In “The Science Of How AI Pays Attention,” I analyzed 1.2 million ChatGPT responses to understand exactly how AI reads a page. In “The Science Of How AI Picks Its Sources,” I analyzed 98,000 citation rows to understand which pages make it into the reading pool at all.

This is Part 3.

Where Part 1 told you where on a page AI looks, and Part 2 told you which pages AI routinely considers, this one tells you what AI actually rewards inside the content it reads.

The data clarifies:

Most AI SEO writing advice doesn’t hold at scale. There is no universal “write like this to get cited” formula – the signals that lift one industry’s citation rates can actively hurt another.
The entity types that predict citation are not the ones being targeted. DATE and NUMBER are universal positives. PRICE suppresses citation in six of seven verticals, and KG-verified entities are a negative signal.
The one writing signal that holds across all seven verticals: Declarative language in your intro, +14% aggregate lift.
Heading structure is binary. Commit to the right number for your vertical or use none. Three to four headings are worse than zero in every vertical.
Corporate content dominates. Reddit doesn’t. AI citation behavior does not mirror what happened to organic search in 2023-2024.

1. Specific Writing Signals Influence Citation, While Others Harm It

While “The Science Of How AI Pays Attention” covers parts of the page and types of writing that influence ChatGPT visibility, I wanted to understand which writing-level signals – word count, structure, language style – predict higher AI citation rates across verticals.

Approach

I compared high-cited pages (more than three unique prompt citations) with low-cited pages across seven writing metrics: word count, definitive language, hedging, list items, named entity density, and two intro-specific signals (intro definitive language density and intro number count).
I analyzed the first 1,000 words for list item count, named entity density, intro definitive language token density, and intro number count.
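As a rough illustration of how intro-level signals like these could be measured, here is a minimal Python sketch. The word lists, the `intro_signals` function name, and the character-based intro window are assumptions made for the example; the study's actual lexicons and tokenization are not described in this article.

```python
import re

# Assumed lexicons, for illustration only (not the study's actual word lists).
HEDGES = {"may", "might", "could", "possibly", "perhaps", "potentially"}
DEFINITIVE = {"is", "are", "does", "will", "must"}

TOKEN_RE = re.compile(r"[a-z']+")
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")


def intro_signals(text: str, intro_chars: int = 1000) -> dict:
    """Compute simple intro-level writing signals for one page."""
    intro = text[:intro_chars]
    tokens = TOKEN_RE.findall(intro.lower())
    total = len(tokens) or 1  # avoid division by zero on empty intros
    return {
        # Share of intro tokens that are hedging words.
        "hedge_density": sum(t in HEDGES for t in tokens) / total,
        # Share of intro tokens that are definitive verbs.
        "definitive_density": sum(t in DEFINITIVE for t in tokens) / total,
        # Count of numeric strings such as "3", "1.2", or "1,000".
        "number_count": len(NUMBER_RE.findall(intro)),
    }
```

Comparing the averages of these values between high-cited and low-cited pages would give a per-signal ratio similar in spirit to the ones reported below.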

Results: Across all verticals, definitive phrasing and including relevant entities matter. But most signals are flat.

What The Industry Patterns Showed

Splitting the data by vertical reveals distinct preferences:

Total word count was strongest in CRM/SaaS (1.59x).
Finance was an anomaly with word count: Shorter pages win (0.86x word count).
Definitive phrases in the first 1,000 characters were positive for most verticals.
Education is a signal void. Writing style explains almost nothing about citation likelihood there.

Top Takeaways

1. There is no universal “write like this to get cited” formula. For example, the signals that lift CRM/SaaS citation rates actively hurt Finance. Instead, match content format to vertical norms.

2. The one universal rule: open with a direct declarative statement. Not a question, not context-setting, not preamble. The form is “[X] is [Y]” or “[X] does [Z].” This is the only writing instruction that holds regardless of vertical, content type, or length.

3. LLMs “penalize” hedging in your intro. “This may help teams understand” performs worse than “Teams that do X see Y.” Remove qualifiers from your opening paragraph before any other optimization.

2. The Entity Types That Predict Citation Are Not The Ones Being Targeted

Most AEO advice focuses on named entities as a category: Pack in more known brand names, tool names, numbers. The cross-vertical entity type analysis below tells a more specific (and more useful) story.

Approach

Ran Google’s Natural Language API on the first 1,000 characters (about 200-250 words) of each unique URL.
Computed lift per entity type: % of high-cited pages with that type / % of low-cited pages.
Analyzed 5,000 pages across seven verticals.

* A quick note on terminology: Google NLP classifies software products, apps, and SaaS tools as CONSUMER_GOOD, a legacy label from when the API was built for physical retail. Throughout this analysis, CONSUMER_GOOD means software/product entities.
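The lift calculation in the approach above (share of high-cited pages containing an entity type, divided by the share of low-cited pages containing it) can be sketched in Python as follows. The page dictionary shape and the `entity_type_lift` name are hypothetical; the sketch assumes entity types have already been extracted for each page and does not call any NLP API.

```python
from collections import Counter


def entity_type_lift(pages: list[dict]) -> dict[str, float]:
    """Lift = share of high-cited pages with a type / share of low-cited pages with it.

    Each page is a dict with hypothetical keys:
      "high_cited": bool
      "entity_types": set of entity type labels found in the page intro
    """
    high = [p for p in pages if p["high_cited"]]
    low = [p for p in pages if not p["high_cited"]]
    # Count pages (not mentions): each type counts at most once per page.
    high_counts = Counter(t for p in high for t in set(p["entity_types"]))
    low_counts = Counter(t for p in low for t in set(p["entity_types"]))

    lifts = {}
    for etype in high_counts.keys() | low_counts.keys():
        if not high or low_counts[etype] == 0:
            continue  # lift is undefined without both groups and a low-cited baseline
        high_share = high_counts[etype] / len(high)
        low_share = low_counts[etype] / len(low)
        lifts[etype] = high_share / low_share
    return lifts
```

A lift above 1.0 means the entity type appears more often on high-cited pages than on low-cited ones; a lift below 1.0 means the opposite.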

Results: DATE and NUMBER are the most universal positive signals. Interestingly, PRICE is the strongest universal negative.

What The Industry Patterns Showed

DATE is the most universal positive signal, with the exception of Finance (0.65x).
NUMBER is the second most universal. Specific counts, metrics, and statistics in the intro consistently predict higher citation rates. Finance (0.98x) and Product Analytics (1.10x) mark the floor and ceiling of that range.
PRICE is the strongest universal negative. Pages that open with pricing signal commercial intent. Finance is the sole exception at 1.16x, likely because price here means fee percentages and rate comparisons, which are the actual reference data financial queries are looking for.
CONSUMER_GOOD (software/product entities) is mixed. In Healthcare, product entities signal established brands and tools. In Crypto, naming specific protocols and products is core to answering technical queries.
PHONE_NUMBER is a positive signal in Healthcare (1.41x) and Education (1.40x). In both cases, it is almost certainly a proxy for established brands/institutions/providers with real physical presence, not a literal signal to add phone numbers to your pages.

The Knowledge Graph inversion deserves its own note here:

The data showed that high-cited pages average 1.42 KG-verified entities vs. 1.75 for low-cited pages (lift: 0.81x).
Pages built around well-known, KG-verified entities (major brands, institutions, famous people) tend toward generic coverage, which isn’t preferred by ChatGPT.
High-cited pages are dense with specific, niche entities: a particular methodology, a precise statistic, a named comparison. Many of those niche entities have no KG entries at all. That specificity is what AI reaches for.

Top Takeaways

1. Add the publish date to your pages and aim to use at least one specific number in your content. That combination is the closest thing to a universal AI citation signal this dataset produced. But Finance gets there through price data and location specificity instead.

2. Avoid opening with pricing in non-finance verticals. Price-dominant intros correlate with lower citation rates.

3. KG presence and brand authority do not translate to an AI citation advantage. Chasing Wikipedia entries, brand panels, or KG verification is the wrong lever. Specific, niche entities (even ones without KG entries) outperform famous ones.

3. Heading Structure: Commit To One Or Don’t Bother

We know headings matter for citations from the previous two analyses. Next, I wanted to understand whether heading count predicts citation rates and whether the optimal structure varies by vertical.

Approach

Counted total headings per page (H1+H2+H3) across all cited URLs.
Grouped pages into 7 heading-count buckets: 0, 1-2, 3-4, 5-9, 10-19, 20-49, 50+.
Computed high-cited rate (% of URLs that are high-cited) per bucket per vertical.
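The three steps above could look roughly like this in Python. This is a sketch under assumptions: the row shape `(vertical, heading_count, is_high_cited)` and all function names are illustrative, and headings are counted with the standard library HTML parser rather than whatever tooling the study used.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Bucket edges from the approach above; (low, high) inclusive, None = open-ended.
BUCKETS = [(0, 0), (1, 2), (3, 4), (5, 9), (10, 19), (20, 49), (50, None)]


class HeadingCounter(HTMLParser):
    """Counts H1, H2, and H3 opening tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag in ("h1", "h2", "h3"):
            self.count += 1


def count_headings(html: str) -> int:
    parser = HeadingCounter()
    parser.feed(html)
    return parser.count


def bucket_label(n: int) -> str:
    """Map a heading count to its bucket label, e.g. 7 -> "5-9"."""
    for low, high in BUCKETS:
        if high is None and n >= low:
            return f"{low}+"
        if high is not None and low <= n <= high:
            return str(low) if low == high else f"{low}-{high}"
    raise ValueError(f"heading count must be non-negative, got {n}")


def high_cited_rate(rows) -> dict:
    """rows: iterable of (vertical, heading_count, is_high_cited) tuples (assumed shape)."""
    totals = defaultdict(int)
    highs = defaultdict(int)
    for vertical, n_headings, is_high in rows:
        key = (vertical, bucket_label(n_headings))
        totals[key] += 1
        highs[key] += int(is_high)
    # Share of URLs in each (vertical, bucket) cell that are high-cited.
    return {key: highs[key] / totals[key] for key in totals}
```

Keeping the result keyed by `(vertical, bucket)` makes it straightforward to compare any bucket against the zero-heading bucket within the same vertical.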

Results: Including more headings in your content is not universally better. The sweet spot depends on vertical and content type. One finding holds everywhere: Strangely, 3-4 headings are worse than zero.

What The Industry Patterns Showed

CRM/SaaS is the only vertical where the 20+ heading lift is confirmed: 12.7% high-cited rate at 20-49 headings vs. a 5.9% baseline. The 50+ bucket reaches 18.2%. Long structured reference pages and comparison guides with one section per tool outperform everything else here.
Healthcare inverts most sharply. The high-cited rate drops from 15.1% at zero headings to 2.5% at 20-49 headings. A page with 30 H2s on telehealth topics signals optimization intent, not clinical authority.
Finance peaks at 10-19 headings (29.4% high-cited rate). Structured but not exhaustive: think rate tables, regulatory breakdowns, and advisor comparison pages with moderate heading depth.
Crypto peaks at five to nine headings (34.7% high-cited rate). Technical documentation in this vertical tends toward dense prose with moderate navigation structure. Over-structuring breaks up the technical depth.
Education is flat across all heading counts, which is consistent with the writing signals finding. Heading structure explains almost nothing about citation likelihood in education content.
The three to four heading dead zone holds across every vertical without exception. Partial structure confuses AI navigation without providing the full benefit of a committed hierarchy.

Top Takeaways

1. The 20+ heading finding from Part 1 is a CRM/SaaS finding, not a universal one. Applying it to healthcare, education, or finance could actively suppress citation rates in those verticals.

2. The principle that holds everywhere: Commit to structure or don’t use it. A fully structured page with the right heading depth outperforms a half-structured page in every vertical.

3. Use the optimal heading range for your vertical. Crypto: 5-9. Finance and Education: 10-19. CRM/SaaS: 20+ (with H3s). Healthcare: 0 or 5-9 at most. Long CRM reference pages with 50+ sections are the one case where maximum heading depth pays off.

4. UGC Doesn’t Dominate

The “Reddit effect” reshaped organic search between 2024 and 2025. I wanted to understand whether ChatGPT cites user-generated content (Reddit, forums, reviews) at meaningful rates or whether corporate/editorial content dominates.

The common industry assumption – that AI also preferentially cites community voices – is not what we found in the data.

Approach

Classified cited URLs by domain as (1) UGC: Reddit, Quora, Stack Overflow, Medium, Substack, Product Hunt, Tumblr, forum subdomains, and community/forum prefixes, or (2) corporate/editorial.
Computed citation share per category per vertical.
Dataset: 98,217 citations across 7 verticals.
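A domain-based classification like the one described above might be sketched as follows. The domain set mirrors the platforms named in the approach, but the subdomain prefixes, the function names, and the `(vertical, url)` input shape are assumptions for illustration; the study's exact rules are not given.

```python
from collections import Counter
from urllib.parse import urlparse

# Platforms named in the approach above.
UGC_DOMAINS = {
    "reddit.com", "quora.com", "stackoverflow.com", "medium.com",
    "substack.com", "producthunt.com", "tumblr.com",
}
# Assumed community/forum subdomain prefixes (the study's exact list is not given).
UGC_PREFIXES = ("forum.", "forums.", "community.")


def classify(url: str) -> str:
    """Return "ugc" or "corporate" for a URL that includes a scheme."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    # Match the platform itself or any of its subdomains (e.g. name.substack.com).
    if any(host == d or host.endswith("." + d) for d in UGC_DOMAINS):
        return "ugc"
    if host.startswith(UGC_PREFIXES):
        return "ugc"
    return "corporate"


def ugc_share(citations) -> dict[str, float]:
    """citations: iterable of (vertical, url) pairs; returns UGC share per vertical."""
    totals = Counter()
    ugc = Counter()
    for vertical, url in citations:
        totals[vertical] += 1
        if classify(url) == "ugc":
            ugc[vertical] += 1
    return {v: ugc[v] / totals[v] for v in totals}
```

Anything that does not match a UGC rule falls through to corporate/editorial, so the corporate share in a sketch like this is only as accurate as the UGC rule list.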

Results: Corporate content accounts for 94.7% of all citations. UGC is nearly invisible.

What The Industry Patterns Showed

Finance is the most corporate-locked vertical at 0.5% UGC. YMYL (Your Money, Your Life) content appears to systematically suppress citations to community opinion.
Healthcare sits at 1.8% UGC for the same structural reason. Clinical, telehealth, and HIPAA content draws almost exclusively from institutional sources.
Crypto has the highest UGC penetration in the dataset at 9.2%. Community-generated content (Reddit technical threads, Medium tutorials, developer forum posts) answers a meaningful proportion of analyzed queries. In a fast-moving technical niche where official documentation consistently lags, community posts fill the gap.
Product Analytics and HR Tech sit at 6.9% and 5.8% UGC. Both are verticals where Reddit comparison threads and product review communities provide genuine signal alongside corporate content.

Top Takeaways

1. The “Reddit effect” in SEO has not translated proportionally to AI citations. In most verticals, reddit.com captures 2-5% of total citations. This finding is in line with other industry research, including this report from Profound.

2. For finance and healthcare: UGC has near-zero AI citation value. Invest in structured, authoritative corporate content with clear sourcing. Community engagement may matter for other reasons, but it does not contribute meaningfully to AI citation share in these verticals.

3. For crypto, product analytics, and HR tech: Community presence has measurable citation value. Detailed Reddit comparison threads, technical Medium posts, and structured developer forum answers can supplement corporate content reach.

What This Means For How You Strategize For LLM Visibility

Across all three parts of this study, the consistent finding is that AI citation is not primarily a writing quality problem.

Part 2 showed it is a content architecture problem: Thin single-intent pages are structurally locked out regardless of how well they’re written. This piece shows the same logic applies inside the content itself.

The aggregate writing signals table is the most important chart in this analysis. Not because it shows you what to do, but because it shows how much of what the AI SEO/GEO/AEO industry is telling you doesn’t survive cross-vertical scrutiny. Word count, list density, named entity counts … all flat or negative at the aggregate. The signals that work are vertical-specific and smaller than our industry’s consensus implies.

The meta-lesson from this analysis is that findings are vertical (and probably topic) specific, which is no different from SEO.

This part concludes the Science of AI series, at least for now, because the AI ecosystem is constantly changing.

Methodology

We analyzed ~98,000 ChatGPT citation rows pulled from approximately 1.2 million ChatGPT responses from Gauge.

Because AI behaves differently depending on the topic, we isolated the data across seven distinct, verified verticals to ensure the findings weren’t skewed by one specific industry.

Analyzed verticals:

B2B SaaS
Finance
Healthcare
Education
Crypto
HR Tech
Product Analytics

Featured Image: CoreDESIGN/Shutterstock; Paulo Bobita/Search Engine Journal
