Category: AI Visibility Insights

  • How Many Links Does Each AI Model Cite on Average?

    How Many Links Does Each AI Model Cite on Average?

    When AI chatbots answer questions, they often include links to websites as sources. But how many links does each AI model typically cite? We analyzed over 400,000 responses from the last month across 7 major AI platforms to find out.

    The results show big differences between models. Some AI platforms cite many sources, while others cite few or none. Understanding these patterns helps brands know where to focus their content efforts.

    How Many Links Does Each AI Model Cite on Average?

    Based on data from over 400,000 analysis results collected in the last month, here’s how many links each AI model cites per response – when it includes sources:

    • Grok: 43.21 links per response
    • ChatGPT: 25.08 links per response
    • AI Mode: 13.78 links per response
    • Perplexity: 10.00 links per response
    • Gemini: 7.27 links per response
    • Google AI Overviews: 7.24 links per response
    • Copilot: 4.75 links per response

When AI models do include sources, they often cite many of them. Grok leads with over 43 links per response when it includes sources, and ChatGPT averages 25, so when these models provide sources at all, they tend to provide many.

    What Percentage of Responses Include Links?

    Not every AI response includes links. Here’s what percentage of responses from each model include at least one source link:

    • Perplexity: 98.67% of responses include links
    • AI Mode: 91.99% of responses include links
    • Copilot: 85.18% of responses include links
    • Google AI Overviews: 75.92% of responses include links
    • Grok: 70.17% of responses include links
    • Gemini: 54.86% of responses include links
    • ChatGPT: 35.00% of responses include links

    This reveals an important pattern: Perplexity and AI Mode include links in almost every response, making them consistent citation opportunities. ChatGPT, despite citing many links when it does include them, only includes links in about one-third of its responses. This suggests ChatGPT is selective about when to provide sources.

    What Do These Numbers Mean for Your Brand?

    These citation patterns matter because they show where your content has the best chance of appearing. If a model cites many links, there are more opportunities for your website to be included. But it also means more competition for those spots.

    For example, when Grok includes sources, it averages over 43 links per response, creating many citation opportunities. However, with so many links, each individual link gets less attention. ChatGPT’s pattern is different: when it does include links, it averages 25 per response, but it only includes links in about 35% of responses, making those citations more selective and potentially more valuable.
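One way to weigh this trade-off is to multiply a model’s link-inclusion rate by its average link count to estimate how many citation slots it opens up. A back-of-the-envelope sketch in Python, using the figures above (the printed values are just the resulting arithmetic):

# Expected citation slots per 100 prompts = inclusion rate x average links per response.
models = {
    "Grok":    (0.7017, 43.21),
    "ChatGPT": (0.3500, 25.08),
}

for name, (rate, links) in models.items():
    print(f"{name}: ~{100 * rate * links:.0f} citation slots per 100 prompts")

# Grok: ~3032 citation slots per 100 prompts
# ChatGPT: ~878 citation slots per 100 prompts

Grok opens up roughly three times as many slots per 100 prompts, which is exactly why each individual Grok citation competes for less attention.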

    According to research from Search Engine Journal, citation patterns in AI responses directly impact website traffic. Brands that appear in AI citations often see increased organic traffic from users clicking through to learn more.

    Why Do Some Models Cite More Links Than Others?

    Different AI models have different approaches to providing sources. Some models are designed to show many sources to give users multiple perspectives. Others focus on fewer, higher-quality sources.

    Grok’s high citation count (43+ links when sources are included) likely comes from its design to show diverse viewpoints. The platform aims to present many sources so users can explore different angles on a topic. This aligns with research from TechCrunch showing that Grok emphasizes source diversity in responses.

    Perplexity takes a different approach: it includes links in nearly 99% of responses, but averages 10 links per response when it does. This makes Perplexity the most consistent citation opportunity. ChatGPT shows a more selective pattern: when it includes links, it averages 25 per response, but it only includes links in about 35% of responses overall. This matches findings from Nature that show ChatGPT prioritizes quality and relevance when deciding whether to include sources.

    How Can You Optimize Content for Each Model?

    Understanding citation patterns helps you tailor your content strategy. Here’s how to approach each model:

    For High-Citation Models (Grok, ChatGPT, AI Mode, Perplexity)

    These models cite many sources when they include links, so there are more opportunities to get included. Focus on:

    • Creating diverse content: These models look for multiple perspectives, so cover different angles of your topic
    • Building authority: Even with many citations, these models still prefer authoritative sources
    • Optimizing for specific queries: With more citation spots, you can target niche topics where you have expertise

    For Consistent Citation Models (Perplexity, AI Mode, Google AI Overviews)

    These models include links in most of their responses (75-99%), making them reliable citation opportunities. Focus on:

    • Creating comprehensive content: These models prefer in-depth, well-researched pages
    • Establishing expertise: Show clear author credentials and original research
    • Optimizing technical signals: Ensure your site has proper schema markup and clear structure, as noted in Google’s structured data guidelines

    For Low-Citation Models (Gemini, Copilot)

    These models cite fewer sources, making each citation more valuable. Focus on:

    • Becoming the definitive source: Create content that’s clearly the best resource on a topic
    • Demonstrating expertise: Show why your content is more authoritative than competitors
    • Targeting high-value queries: Since citations are rare, focus on topics where being cited has the most impact

    What Does This Mean for Your SEO Strategy?

    Traditional SEO focuses on ranking in search results. AI SEO (also called GEO – Generative Engine Optimization) focuses on getting cited in AI responses. These citation patterns show that AI SEO requires a different approach than traditional SEO.

    As Search Engine Journal explains, AI models don’t rank pages the same way search engines do. Instead, they select sources based on relevance, authority, and how well content answers specific questions.

    The citation data shows interesting patterns: models like Perplexity and AI Mode include links in almost every response, making them consistent opportunities. Grok includes links in 70% of responses but averages 43 links when it does, creating many citation spots. ChatGPT is more selective, including links in only 35% of responses, but when it does, it averages 25 links, suggesting those citations are carefully chosen.

    Key Takeaways

    Understanding citation patterns helps you make smarter decisions about where to focus your content efforts:

    • Grok cites the most sources (43+ per response when links are present), offering many opportunities but more competition
    • ChatGPT cites heavily when it includes sources (25 per response), but only includes links in 35% of responses, making those citations more selective
    • Perplexity and AI Mode include links in most responses (99% and 92% respectively), making them the most consistent citation opportunities
    • Gemini and Copilot cite fewer sources (7 and 5 per response when links are present), making each citation more valuable
    • Content strategy should vary by model based on both citation frequency and average links per response

    By tracking which models cite your content and how often, you can optimize your strategy for maximum visibility across AI platforms.

    Frequently Asked Questions

    How many links does ChatGPT cite per response?

    Based on data from the last month, when ChatGPT includes links, it averages 25.08 links per response. However, ChatGPT only includes links in about 35% of its responses, making it more selective than models like Perplexity that include links in nearly every response. When ChatGPT does cite sources, it includes many of them, suggesting careful selection of multiple authoritative references.

    Which AI model cites the most sources?

    Grok cites the most sources when it includes links, averaging 43.21 links per response. This is significantly higher than other models. Grok is designed to show diverse perspectives, which explains why it includes so many source links. However, it’s worth noting that Perplexity includes links in 98.67% of responses (compared to Grok’s 70.17%), making Perplexity the most consistent citation opportunity overall.

    How can I get my website cited by AI models?

    To get cited by AI models, create high-quality, authoritative content that directly answers common questions. Use clear headings, proper schema markup, and demonstrate expertise. Focus on topics where your brand has unique insights. Track which models cite your content to understand what’s working.

    Do more citations mean more traffic?

Not necessarily. Models that cite many sources (like Grok) may drive less traffic per citation because users have more options. Models that cite fewer sources per response (like Gemini and Copilot) may drive more traffic per citation because each link gets more attention. The value depends on both the number of citations and how users interact with them.

    How often do AI models update their citations?

    AI models can update their citations frequently as they crawl and index new content. However, the exact frequency varies by model. Some models may update weekly, while others may update more or less frequently. Regular content updates and monitoring help ensure your content stays relevant.

    Should I optimize for all AI models or focus on specific ones?

It depends on your goals and audience. If you want maximum visibility, optimize for models that cite many links per response, like Grok, and for models that include links in almost every response, like Perplexity. If you want high-value citations, focus on selective models like ChatGPT. Many brands use a balanced approach, creating content that works well across multiple models while tracking which ones drive the most traffic.

    This post was written by Spotlight’s content generator.

  • What Is llms.txt and How Can It Help Your Website Be Seen by AI?

    What Is llms.txt and How Can It Help Your Website Be Seen by AI?

    As artificial intelligence (AI) and large language models (LLMs) like ChatGPT, Claude, and Gemini grow more powerful, websites face new challenges and opportunities. One important tool for websites is the llms.txt file—a simple, standardized text file designed to help AI understand and use website content better. This guide explains what llms.txt is, why it matters, how it’s being adopted, and best practices for creating one. It also shows how platforms like Spotlight lead the way in making websites more visible in AI chat conversations.


    What Is llms.txt and Why Was It Created?

    llms.txt is a file placed in a website’s root directory (similar to the well-known robots.txt file). Its purpose is to provide clear, structured information specifically for large language models and AI agents. Unlike regular HTML or sitemap files, llms.txt is meant to be both machine-readable and human-friendly.

    The idea behind llms.txt is to help AI systems better understand the website’s purpose, key resources, and content structure. This helps reduce errors or “hallucinations”—when AI gives wrong or misleading answers—and improves the accuracy of responses that mention your site.

    The concept is still fairly new and evolving, but it was created in response to how AI models increasingly rely on web content to answer user questions and provide recommendations. By offering a standardized file, website owners can guide AI to their most important content and clarify usage terms.


    How Does llms.txt Work in Practice?

    When an LLM or AI agent visits a website—either to crawl, index, or fetch data—it looks for files that help it understand the site. If llms.txt is present at the root URL (for example, https://get-spotlight.com/llms.txt), the AI reads it to get:

    • A summary of what the website offers and who it serves
    • A list of key pages with descriptions (like documentation, blog, or API)
    • Metadata such as content update dates or version numbers
    • Optional instructions or license terms for using the content

    This information helps the AI decide which parts of the site are most relevant to a user’s query. It improves the chances the AI will cite the site correctly and with helpful context.

    For example, if a user asks an AI about the API of a cloud service, the AI can quickly find the API reference page from the llms.txt file, rather than guessing or relying on incomplete data.


    Example of a Well-Formatted llms.txt

Here is a sample llms.txt file based on best practices recommended by Anthropic, an early adopter of the standard:

    # Spotlight - Brand Visibility in AI Conversations
    
    ## What is Spotlight?
    
    Spotlight (get-spotlight.com) is a SaaS platform that helps brands monitor, measure, and improve their visibility within AI chat conversations. Spotlight tracks how brands appear across major AI platforms and provides actionable insights to improve AI-generated visibility.
    
    ## Supported AI Platforms
    
    Spotlight currently monitors 8 major AI platforms:
    - ChatGPT
    - Google AI Overviews
    - Google AI Mode
    - Grok
    - Gemini
    - Claude
    - Perplexity
    - Copilot
    
    ## Core Capabilities
    
    ### Prompt Discovery & Analysis
    Spotlight discovers the most searched prompts that brands would want to appear in—queries their potential customers ask when searching for products or services the brand offers. Prompts are grouped by topics aligned with the brand's marketing objectives, and Spotlight measures each prompt's search volume to help prioritize actions.
    
    ### Weekly Monitoring
    All discovered prompts are sent to all 8 AI models weekly using local IPs to get geographically relevant responses. This ensures brands understand how they appear to users in different regions.
    
    ### Brand Mention Analysis
    Spotlight analyzes AI responses to:
    - Identify brand mentions across all platforms
    - Evaluate sentiment around brand mentions
    - Compare positioning against competitors
    - Track visibility rankings over time
    
    ### Citation & Source Tracking
    The platform captures and analyzes:
    - All citations and data sources used by AI models in their responses
    - The queries ChatGPT uses to fetch fresh data from the web
    - How often each piece of brand-owned content is cited by each model over time
    - What types of websites each model prefers to cite
    
    ### Competitive Intelligence
    Spotlight provides:
    - Visibility rankings showing how brands compare to competitors
    - Sentiment breakdowns for brand mentions
    - Analysis of what makes high-visibility brands successful
    - Reverse engineering of successful content strategies
    
    ### Content Optimization
    Spotlight includes tools to improve visibility:
    - Content grading system that evaluates existing webpages
    - Optimization guidance for both technical and content aspects
    - Gap analysis identifying prompts where brands don't appear
    - Content suggestions that directly address missing prompt coverage
    
    ### Traffic Attribution
    Spotlight integrates with Google Analytics to:
    - Track traffic coming from LLMs
    - Identify which LLM drove the traffic
    - Show which pages received LLM traffic
    - Close the loop between AI visibility and actual website traffic
    
    ### Reputation Monitoring
    Spotlight has a dedicated section for brand reputation on AI chatbots:
    - Sends prompts directly asking models about brand quality, value, and key metrics
    - Analyzes and scores how the brand is perceived by each model
    - Includes data sources used by models, allowing users to manage negative inputs
    
    ## Use Cases
    
    - Marketing teams monitoring brand visibility in AI search results
    - SEO professionals optimizing for Generative Engine Optimization (GEO)
    - Brand managers tracking reputation across AI platforms
    - Content teams creating AI-optimized content
    - Competitive intelligence professionals benchmarking against competitors
    - Product teams understanding how their offerings are described by AI
    
    ## Target Audience
    
    Spotlight is designed for:
    - Brands seeking to improve visibility in AI-generated responses
    - Marketing teams focused on GEO (Generative Engine Optimization)
    - Companies monitoring how AI platforms present their brand
    - Organizations optimizing content for AI citations
    - Businesses tracking competitor positioning in AI conversations
    
    ## Technology
    
    Spotlight is built by AI agents, enabling rapid development of new features and adaptation to the fast-changing landscape of AI search and visibility.
    
    For more information, visit https://get-spotlight.com

    Why Are More LLMs and AI Tools Starting to Use llms.txt?

    Adoption of llms.txt is growing as AI developers realize the benefits of structured site info. Several reasons explain this trend:

    • Improved Accuracy: AI models that access llms.txt can reduce hallucinations by relying on verified site descriptions.
    • Efficient Data Access: AI can quickly focus on relevant pages rather than crawling the entire site.
    • Better User Experience: End users get more precise answers with proper citations.
    • Standardization: Having a common file format simplifies integration across different AI platforms.

Currently, only two LLMs officially use llms.txt: Claude and Grok. However, the other LLMs are considering reading the file when analyzing a website. Server logs show ChatGPT’s bots accessing llms.txt files, but there is no evidence that the files’ content currently factors into OpenAI’s responses.


    What Are the Best Practices for Creating an Effective llms.txt File?

    Creating a useful llms.txt file means balancing clarity, completeness, and usability. Here are expert recommendations:

    1. Where and How Should You Place the File?

    • Put llms.txt in the root directory of your website (https://yourdomain.com/llms.txt).
    • Use plain text format, but include Markdown-style headings and bullet points for readability.
    • Ensure it’s accessible without login or special permissions.
• Set content-type HTTP header to text/plain (a short serving sketch follows this list).
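If your site runs on a Python stack, a minimal Flask sketch for serving the file with the right header might look like this (an illustration, not the only way; the path assumes llms.txt sits in your project root):

from flask import Flask, send_file

app = Flask(__name__)

@app.route("/llms.txt")
def llms_txt():
    # Serve the file with an explicit text/plain content type so AI crawlers
    # receive plain text rather than HTML.
    return send_file("llms.txt", mimetype="text/plain")

On static hosts, the equivalent is usually a one-line content-type rule in your server or CDN configuration.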

    2. What Should the Structure and Content Include?

    • Header/Title: One-line description of the website.
    • Summary: 2-4 sentences explaining site purpose, audience, and offerings.
    • Key Resources: List of important URLs with short, clear descriptions.
    • Metadata: Dates, version numbers, content types (tutorial, reference, blog).
    • Optional Sections: Sitemap link, contact info, license terms, update frequency, special instructions.

    3. How Should You Write the Content?

    • Be concise but informative.
    • Prioritize your most valuable pages first.
    • Avoid marketing fluff or vague language.
    • Don’t list every page—focus on key content.
    • Use absolute URLs, not relative ones.

    4. What Are Technical Tips to Remember?

    • Keep file size small (under 100KB).
    • Update regularly to reflect site changes.
    • Use proper HTTP headers.
    • Add a “last updated” timestamp.

    5. What Should You Avoid?

    • Avoid listing internal or private URLs.
    • Skip broken or outdated links.
    • Don’t use the file to manipulate AI outputs unethically.
    • Avoid excessive promotional language.

    What Is the Current State of llms.txt Adoption Across AI Models?

    Adoption of llms.txt is still in early stages but growing steadily. Here’s what we know from industry observations:

• Claude: Actively supports llms.txt as part of its web browsing and citation process.
    • Grok: Considers llms.txt for content prioritization.
    • ChatGPT: Not currently using llms.txt, although server logs show its bots accessing the file.
    • Perplexity: Not currently using llms.txt.
    • Gemini: Not currently using llms.txt.
    • Google AI Overviews: Not currently using llms.txt.
    • Google AI Mode: Not currently using llms.txt.
    • Microsoft Copilot: Officially not currently using llms.txt.

While most models do not yet fully support it, adoption is likely to grow in the near future.


    How Can Website Teams Create and Maintain an Effective llms.txt File Step by Step?

    Follow these steps to build a strong llms.txt file:

    1. Plan Your Content: Identify your site’s main purpose, target audience, and key pages.
    2. Write a Clear Header and Summary: Keep it simple and informative.
    3. List Key Resources: Choose your top pages (documentation, blog, API, etc.). Write concise descriptions.
    4. Add Metadata: Include dates, version info, and content types for clarity.
    5. Include Optional Details: Sitemap URL, contact info, usage license, update notes.
    6. Format Using Plain Text and Markdown Style: Use headings, bullet points, and line breaks.
    7. Place the File at Your Root URL: Upload to https://yourdomain.com/llms.txt.
    8. Set HTTP Headers: Ensure content-type is text/plain.
9. Test Accessibility: Check that AI crawlers and humans can access the file (a checker sketch follows this list).
    10. Update Regularly: Review and revise as your site changes.
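Steps 8 and 9 are easy to automate. A minimal Python sketch, assuming the third-party requests package (yourdomain.com is a placeholder):

import requests

# Confirm the file is reachable without login and served as text/plain.
url = "https://yourdomain.com/llms.txt"
resp = requests.get(url, timeout=10)

print("Status:", resp.status_code)                        # expect 200
print("Content-Type:", resp.headers.get("Content-Type"))  # expect text/plain
print("Size:", len(resp.content), "bytes")                # keep under 100KB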

    What Are Some Authoritative Views on the Importance of Structured Files Like llms.txt?

    AI and SEO experts emphasize that structured site metadata is critical for AI-powered search and discovery.

    John Mueller, a Webmaster Trends Analyst at Google, has stressed the importance of clear site structure and metadata for search engines to understand content. While not specifically about llms.txt, his insights apply: “Structured data helps search engines and AI better understand your site, which can lead to better visibility and user experience.”

    Additionally, researchers at Stanford University’s AI Lab highlight that AI systems benefit significantly from explicit, standardized signals about content origin and purpose to reduce misinformation.

    These expert opinions underline why tools like llms.txt will become a standard part of website optimization in the AI era.


    FAQ

    What is the difference between llms.txt and robots.txt?

    robots.txt tells search engine bots which pages to crawl or avoid, mainly to control indexing. llms.txt provides structured, descriptive information to help AI language models understand and use your website content better. Both live in the root directory but serve different purposes.

    How does llms.txt reduce AI hallucinations?

    By giving AI clear, authoritative descriptions and links to key resources, llms.txt helps prevent the AI from guessing or inventing answers based on incomplete data. This leads to more accurate and trustworthy AI responses.

    Can I generate llms.txt files automatically?

You can copy the example in this post as a reference, add a description of your website, and have an LLM generate an llms.txt file automatically. However, manual review and customization ensure the file accurately reflects your site’s content and strategy.
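As a sketch of that automated route, assuming the OpenAI Python SDK with an API key in the environment (the model name and site description are placeholders):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

site_description = "Acme Analytics is a SaaS platform that ..."  # your own summary

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": "You write llms.txt files: Markdown-style, "
                                      "plain-text overviews of a website for AI crawlers."},
        {"role": "user", "content": "Write an llms.txt file for this site:\n\n" + site_description},
    ],
)
print(response.choices[0].message.content)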

    Should llms.txt replace sitemaps or structured data?

    No. llms.txt complements sitemaps and structured data by providing a human-readable, AI-focused overview. Use it alongside other SEO tools for best results.

    How often should I update my llms.txt file?

    Update it whenever you make significant changes to your website’s content, structure, or key resources. At minimum, review quarterly to keep it current.

    Does Spotlight support llms.txt optimization?

    Yes. Spotlight offers insights and recommendations related to llms.txt as part of its comprehensive AI visibility platform. It tracks how LLMs use site information and suggests improvements to maximize AI-driven traffic and brand mentions.


    Conclusion

    llms.txt is an emerging but powerful tool that helps websites communicate clearly with AI language models. It provides structured, concise information that improves AI understanding, reduces errors, and boosts your site’s chances of being cited accurately. Adoption is growing among major AI platforms, making it an important part of future-proofing your website for AI-driven search.

    Brands and website owners looking to lead in this space should consider creating and maintaining a strong llms.txt file. Platforms like Spotlight offer advanced support for monitoring and optimizing AI visibility, helping brands stay ahead in this fast-evolving landscape.

    By following best practices for llms.txt, you can help AI systems find, understand, and use your content effectively—enhancing your brand’s presence in the AI-powered web of tomorrow.



  • How to Create YouTube Content That Gets Cited by AI Chatbots

    How to Create YouTube Content That Gets Cited by AI Chatbots

    Want your YouTube videos to show up when people ask ChatGPT or Google AI Overviews questions? You’re not alone. As more people use AI chatbots to find information, getting your videos cited by these systems can drive serious traffic to your channel.

    But here’s the thing: AI models don’t “watch” videos the way humans do. They read text. This means you need to think differently about how you create and optimize your YouTube content if you want AI chatbots to find and cite it.

    What Do AI Models Actually Read: Metadata or Transcripts?

    This is the million-dollar question. The answer? Both, but transcripts matter more.

    How ChatGPT Accesses YouTube Content

    ChatGPT can’t directly watch YouTube videos. Instead, it reads the text that comes with your video. When you provide ChatGPT with a video transcript or when it accesses YouTube through web browsing, it analyzes the transcript to understand what your video is about.

    According to recent analysis, ChatGPT processes video transcripts and metadata when available. This means your video’s transcript is like a script that AI models read to understand your content.

    How Google AI Overviews Uses YouTube

    Google AI Overviews works differently. It’s built on the same system as Google Search, which means it can access YouTube transcripts, captions, and structured data that Google has already indexed. According to industry research, Google’s AI Overviews rely heavily on indexed video transcripts and captions when creating summaries.

    This is important: Google indexes your video’s transcript automatically if you have captions enabled. If you don’t have captions, Google may try to generate them automatically, but the quality won’t be as good as ones you create yourself.

    Why Transcripts Beat Metadata

    Think of it this way: your video’s title and description (metadata) are like a book cover. They tell AI models what your video might be about. But the transcript is like the actual book. It tells AI models exactly what you said, what topics you covered, and what information you provided.

    When someone asks ChatGPT “How do I change a tire?” it searches through transcripts to find videos that actually explain tire changing. A video with a great title but a poor transcript won’t rank as well as a video with a good transcript that clearly explains the process.

    How Do I Make My YouTube Videos More Likely to Be Cited?

    Now that you know AI models read transcripts, here are the specific actions you should take to improve your chances of getting cited by ChatGPT, AI Overviews, and other AI chatbots:

    1. Always Add Accurate Captions

    This is the most important step. Without captions, AI models can’t read your video content properly. Here’s what to do:

    • Enable auto-captions, then edit them: YouTube’s auto-captions are a good start, but they make mistakes. Always review and fix errors, especially technical terms, names, and numbers.
    • Use proper punctuation: AI models understand content better when sentences are properly punctuated. Add periods, commas, and question marks where they belong.
    • Break up long sentences: If you speak in long run-on sentences, break them into shorter, clearer sentences in your captions.
    • Include speaker names: If multiple people speak, label who’s talking. This helps AI models understand context.

    Pro Tip: Don’t just rely on YouTube’s auto-captions. Upload your own transcript file for the most accurate results. You can create a transcript while editing your video, then upload it directly to YouTube.

    2. Write Detailed Video Descriptions

    Your video description is like metadata that helps AI models understand your content before they read the transcript. According to YouTube SEO best practices, your description should:

    • Include your main keywords in the first 100 characters: This is what AI models see first, so make it count.
    • Write a clear summary: Explain what your video covers in 2-3 sentences. Use the words people would use when asking an AI chatbot.
    • Add timestamps for longer videos: If your video covers multiple topics, add timestamps. This helps AI models understand the structure of your content.
• Include related keywords naturally: Don’t stuff keywords, but naturally include terms people might search for (a sample description skeleton follows this list).
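For example, a description skeleton applying these points (the video and timestamps are hypothetical):

How to change a tire on a 2020 Honda Civic – a step-by-step guide for beginners.

In this video, I walk through changing a flat tire on a 2020 Honda Civic, including the tools you need and the most common mistakes to avoid.

00:00 Intro
00:45 Tools you need
02:10 Loosening the lug nuts
05:30 Jacking up the car
08:15 Mounting the spare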

    3. Optimize Your Video Titles

Your title is the first thing AI models see. Make it clear and descriptive. YouTube now lets you test multiple titles to see which performs better. Here’s what works:

    • Use question formats: Titles like “How Do I Change a Tire?” match how people ask AI chatbots questions.
    • Be specific: “How to Change a Tire on a 2020 Honda Civic” is better than “Car Tips” because it matches specific queries.
    • Include numbers when relevant: “5 Ways to Improve Your Credit Score” helps AI models understand you’re providing a list.
    • Avoid clickbait: AI models prefer titles that accurately describe content. If your title promises something your video doesn’t deliver, AI models will notice.

    4. Structure Your Content Clearly

    AI models understand structured content better than rambling conversations. Here’s how to structure your videos:

    • Start with a clear introduction: In the first 30 seconds, explain what your video covers. Say it out loud so it’s in your transcript.
    • Use clear section breaks: When you move to a new topic, say “Now let’s talk about…” This creates natural breaks in your transcript.
    • Summarize key points: At the end, recap the main points. This reinforces important information in your transcript.
    • Answer questions directly: If your video answers “How do I…”, make sure you actually explain the steps clearly in your speech.

    5. Use Relevant Tags and Hashtags

    Tags and hashtags help YouTube and AI models categorize your content. According to YouTube SEO research, you should:

    • Use 2-3 hashtags in your title or description: Don’t overdo it. Too many hashtags look spammy.
    • Tag with specific keywords: Use tags that match what people would ask an AI chatbot. “How to change tire” is better than just “car”.
    • Mix broad and specific tags: Include both general topics (like “automotive”) and specific ones (like “tire changing tutorial”).

    6. Create Playlists Around Topics

    Organizing videos into playlists helps AI models understand that your content is part of a larger topic. According to best practices, playlists improve how YouTube and AI models understand your content structure.

    For example, if you create multiple videos about car maintenance, put them in a “Car Maintenance Guide” playlist. This signals to AI models that you have comprehensive content on this topic.

    What Specific Actions Should I Take for ChatGPT?

    ChatGPT uses web browsing to access YouTube content. Here’s what works best:

    • Focus on educational content: ChatGPT tends to cite videos that clearly explain how to do something or answer specific questions.
    • Use clear, direct language in your transcript: Avoid slang and jargon unless you explain it. ChatGPT prefers clear explanations.
    • Cite your sources in the video: If you mention statistics or facts, say where they came from. ChatGPT values content that references authoritative sources.
    • Create content that answers common questions: Think about what people ask ChatGPT and create videos that answer those exact questions.

    What About Google AI Overviews?

    Google AI Overviews works differently because it’s built on Google Search. Here’s what to focus on:

    • Optimize for Google Search first: Since AI Overviews uses Google’s index, videos that rank well in Google Search are more likely to appear in AI Overviews.
• Use schema markup if possible: If you have a website, add video schema markup. This helps Google understand your video content better (see the sketch after this list).
    • Focus on E-E-A-T: Experience, Expertise, Authoritativeness, and Trustworthiness matter. Show your credentials, cite sources, and demonstrate expertise in your videos.
    • Create comprehensive content: Google AI Overviews prefers content that thoroughly covers a topic rather than quick tips.
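As a minimal sketch of video schema for a page embedding the tire-change video from the earlier examples (all URLs, the date, and VIDEO_ID are placeholders):

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "VideoObject",
  "name": "How to Change a Tire on a 2020 Honda Civic",
  "description": "Step-by-step tire change for beginners, including tools and common mistakes.",
  "thumbnailUrl": "https://example.com/thumbnail.jpg",
  "uploadDate": "2025-01-15",
  "embedUrl": "https://www.youtube.com/embed/VIDEO_ID"
}
</script>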

    What About Other AI Chatbots?

    Different AI chatbots work differently, but the same principles apply:

    • Perplexity: Tends to cite sources more explicitly. Make sure your video title and description clearly state what your video covers.
    • Claude: Values well-structured, detailed content. Focus on clear explanations and comprehensive coverage.
    • Gemini: Google’s other AI model, so it uses similar indexing as AI Overviews. Follow the same strategies.
    • Copilot: Microsoft’s AI uses Bing’s index. Optimize for Bing search as well as YouTube.

    Common Mistakes to Avoid

    Here are mistakes that hurt your chances of getting cited:

    • Relying only on auto-captions: Auto-captions have errors. Always review and fix them.
    • Vague titles and descriptions: “Cool Video” doesn’t help AI models understand your content. Be specific.
    • Rambling without structure: If your transcript is just you talking without clear points, AI models won’t understand your content well.
    • Ignoring keywords people actually use: Use the words your audience uses when asking questions, not industry jargon.
    • Not updating old videos: If you have popular videos with poor captions, fix them. Old content can still get cited.

    How Do I Know If My Videos Are Getting Cited?

    Tracking AI citations is tricky because AI models don’t always show where they got information. However, you can:

    • Check your YouTube Analytics: Look for traffic sources you don’t recognize. Some AI-driven traffic may show up as direct or referral traffic.
    • Use AI visibility tools: Tools like Spotlight track when your content gets cited by AI chatbots across multiple platforms.
    • Monitor search trends: If your video topics suddenly get more search volume, it might be because AI chatbots are citing similar content.
    • Test with AI chatbots: Ask ChatGPT or other AI models questions your video answers and see if they mention your video or similar content.

    Getting Started: Your Action Plan

    Ready to optimize your YouTube content for AI citations? Here’s a simple action plan:

    • Audit your existing videos: Check which videos have captions and which don’t. Start by adding accurate captions to your most popular videos.
    • Improve your top 10 videos: Update titles, descriptions, and captions for your best-performing content first.
    • Create a caption workflow: For every new video, create and upload accurate captions before publishing.
    • Optimize descriptions: Rewrite your video descriptions to be more specific and keyword-rich.
    • Test and measure: Use tools to track if your optimization efforts are working.

    Remember: getting cited by AI chatbots isn’t about gaming the system. It’s about creating clear, helpful content that AI models can understand and recommend. Focus on making your transcripts accurate, your descriptions clear, and your content valuable. The citations will follow.

    Frequently Asked Questions

    Do AI models actually watch my YouTube videos?

    No. AI models like ChatGPT and Google AI Overviews don’t watch videos. They read the text associated with your video—primarily transcripts and captions, but also titles, descriptions, and metadata. This is why accurate captions are so important for getting cited.

    Should I use YouTube’s auto-captions or create my own?

    Start with auto-captions, but always review and edit them. Auto-captions make mistakes, especially with technical terms, names, and numbers. For the best results, create your own transcript and upload it to YouTube. This gives you complete control over accuracy.

    How long does it take for AI models to start citing my videos?

    It depends on the AI model. Google AI Overviews uses Google’s index, so if your video ranks well in Google Search, it may appear in AI Overviews relatively quickly. ChatGPT’s web browsing feature may find your content within days or weeks. There’s no guaranteed timeline, but optimizing your content improves your chances.

    Do I need to optimize differently for different AI chatbots?

    Not really. The core principles are the same: accurate transcripts, clear descriptions, and valuable content. However, ChatGPT tends to prefer educational content that directly answers questions, while Google AI Overviews values content that demonstrates expertise and authority. Focus on creating great content first, then optimize the technical elements.

    Will optimizing for AI citations hurt my regular YouTube performance?

    No. In fact, many AI optimization strategies also help with regular YouTube SEO. Clear titles, detailed descriptions, accurate captions, and well-structured content help both human viewers and AI models. You’re improving your content for everyone, not just AI.

    Can I see which AI chatbot cited my video?

    It’s difficult to track this manually because AI models don’t always show citations. Some tools like Spotlight can track AI citations across multiple platforms, making it easier to see which AI chatbots are referencing your content. You can also check your YouTube Analytics for unusual traffic patterns.

    Do shorter or longer videos get cited more often?

    Length matters less than quality and clarity. A 5-minute video with a clear, accurate transcript is better than a 30-minute video with poor captions. However, comprehensive content that thoroughly covers a topic tends to perform better in Google AI Overviews. Focus on making your content as clear and helpful as possible, regardless of length.

    Should I create new content or optimize existing videos?

    Do both, but start with optimization. Your existing popular videos already have views and engagement, so improving their captions and descriptions can have immediate impact. Then apply these lessons to new content. Many creators find that optimizing old videos brings new traffic from AI citations.

  • Empirical Study: Which Schema Markup Types Appear in AI-Cited Websites?

    Empirical Study: Which Schema Markup Types Appear in AI-Cited Websites?

    What Are Schema Markup Types?

    Schema markup (also called structured data) is code you add to your website that helps search engines and AI models understand what your content is about. It’s like adding labels to your content so machines can read and categorize it more easily.

    Schema markup uses a standardized vocabulary from Schema.org. Instead of just having text on a page, you add special code (usually JSON-LD format) that tells AI systems “this is an article,” “this is a product,” “this is the author,” etc.

    Example: Article Schema

    Here’s a simple example of what Article schema looks like. If you have a blog post, you might add this code to your page:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Article",
      "headline": "10 Best Project Management Tools in 2025",
      "author": {
        "@type": "Person",
        "name": "Jane Smith"
      },
      "datePublished": "2025-01-15",
      "image": "https://example.com/article-image.jpg"
    }
    </script>

    This tells AI models that the page is an article, who wrote it, when it was published, and what the main image is. Without schema, AI models have to guess these details from the HTML, which is less reliable.

    How We Analyzed AI-Cited Websites

We analyzed 5,499 websites that were cited by AI models in their responses. By looking at which schema markup types appear most frequently on these pages, we can observe patterns in the markup of content that AI models actually cite.

    Important note: This study examines what schema types are present in AI-cited websites. It’s an empirical observation, not a recommendation. Additionally, only schema types that correspond to content that is actually present and visible on the page should be included—don’t add schema for elements that don’t exist on your page.

    Most Frequent Schema Types by Website Type

    Here’s what we found when analyzing schema markup types present in websites that were cited by AI models:

    Company Homepage

    • ImageObject (974 occurrences) – Present in nearly every cited company homepage
    • ListItem (672) – Lists help AI models extract structured information
    • Question (574) and Answer (571) – FAQ-style content that directly answers user queries
    • SiteNavigationElement (562) – Helps AI understand site structure
    • WebPage (360) – Basic page structure markup
    • Person (226) – Author and team information
    • BreadcrumbList (211) – Site hierarchy and context
    • Organization (189) – Company information
    • AggregateRating (100) – Review and rating data
    • Product (88) – Product information
    • CreativeWork (76) – Content type identification
    • FAQPage (72) – FAQ page structure
    • Article (54) – Article structure

    Blog Post

    • Person (364) – Author information is critical for blog credibility
    • SiteNavigationElement (274) – Navigation structure
    • ListItem (152) – Structured lists and comparisons
    • ImageObject (134) – Visual content markup
    • Comment (118) – User engagement signals
    • CreativeWork (112) – Content type identification
    • WPHeader (105) – WordPress header structure
    • AggregateRating (101) – Review and rating data
    • SoftwareApplication (101) – For tool and app reviews
    • Organization (94) – Publisher information
    • Blog (92) – Blog structure
    • Rating (88) – Product and service ratings
    • WebPage (71) – Page structure
    • BreadcrumbList (49) – Site hierarchy
    • Article (49) – Article structure

    News Outlet

    • Rating (234) – Product and service ratings
    • ImageObject (186) – Visual content
    • ListItem (129) – Structured lists
    • Person (96) – Author and journalist information
    • BreadcrumbList (37) – Site hierarchy
    • Organization (35) – Publication information
    • Article (31) – Article structure
    • WebPage (13) – Page structure
    • SiteNavigationElement (10) – Navigation
    • Product (9) – Product information
    • NewsArticle (8) – News article structure
    • Review (8) – Review content
    • AggregateRating (7) – Combined rating data

    Forum

    • Person (114) – User profiles and authorship
    • ListItem (81) – Structured discussion threads
    • Comment (32) – Reply and discussion structure
    • InteractionCounter (30) – Engagement metrics
    • BreadcrumbList (12) – Site hierarchy
    • Question (7) and Answer (7) – Q&A structure

    Article Format

    • ImageObject (633) – Present in most cited articles
    • ListItem (548) – Structured information extraction
    • SiteNavigationElement (488) – Site structure
    • Person (458) – Author information
    • Question (294) and Answer (292) – FAQ content
    • Organization (228) – Publisher information
    • WebPage (223) – Page structure
    • BreadcrumbList (162) – Site hierarchy
    • CreativeWork (108) – Content type
    • WPHeader (105) – Header structure
    • Article (100) – Article structure
    • Blog (68) – Blog structure

    Listicle Format

    • ListItem (373) – Core structure
    • Rating (308) – Product ratings
    • Person (286) – Author credibility
    • SiteNavigationElement (219) – Navigation
    • ImageObject (206) – Visual content
    • AggregateRating (193) – Combined ratings
    • WebPage (153) – Page structure
    • Question (146) and Answer (146) – FAQ sections
    • SoftwareApplication (104) – For tool reviews
    • BreadcrumbList (103) – Site hierarchy
    • Organization (82) – Publisher information
    • Blog (77) – Blog structure
    • WPHeader (76) – Header structure

    FAQ Page Format

    • Question (145) and Answer (144) – Essential structure
    • ImageObject (122) – Visual content
    • SiteNavigationElement (37) – Navigation
    • Person (35) – Author/expert information
    • ListItem (30) – Structured lists
    • FAQPage (14) – Container schema
    • BreadcrumbList (10) – Site hierarchy

    Product Page Format

    • WebPage (37) – Page structure
    • ListItem (36) – Product features and specifications
    • Question (25) and Answer (25) – Product FAQs
    • BreadcrumbList (12) – Navigation
    • Brand (7) – Brand information
    • Product (6) – Product details
    • Offer (6) – Pricing information
    • Organization (4) – Company information
    • FAQPage (3) – FAQ structure

    Most Frequent Schema Types Overall

    Across all website types and formats, here are the schema types we observed most frequently in AI-cited websites:

    1. ImageObject – 1,500+ occurrences. Present in nearly every website type we analyzed.
    2. ListItem – 1,200+ occurrences. Critical for structured information.
    3. SiteNavigationElement – 900+ occurrences. Helps AI understand site structure.
    4. Person – 800+ occurrences. Establishes author credibility.
    5. Question/Answer – 700+ occurrences combined. Directly answers user queries.
    6. WebPage – 400+ occurrences. Basic page structure.
    7. BreadcrumbList – 300+ occurrences. Shows site hierarchy.
    8. Organization – 300+ occurrences. Publisher information.
    9. Rating/AggregateRating – 400+ occurrences combined. Essential for reviews.
    10. Product – 100+ occurrences. Critical for e-commerce.

    Key Observations

    From this empirical study, we observed several patterns:

    • ImageObject is nearly universal – It appears in almost every website type we analyzed, suggesting that cited websites consistently include structured image data.
    • Structured lists are common – ListItem schema appears frequently across all formats, indicating that AI-cited content often uses structured lists.
    • Author information matters – Person schema appears frequently, especially in blog posts and articles, suggesting author credibility may be a factor.
    • FAQ content is prevalent – Question and Answer schema types appear together frequently, indicating that direct Q&A content is common in cited websites.
    • Site structure is marked up – SiteNavigationElement and BreadcrumbList appear consistently, suggesting that navigation and hierarchy information is commonly present.

    Important Considerations

    This study examines what schema types are present in websites that were cited by AI models. It’s important to note:

    • Correlation, not causation – The presence of these schema types in cited websites doesn’t necessarily mean they caused the citation. Other factors may be at play.
    • Only mark up what’s present – Schema should only be added for content that is actually present and visible on the page. Don’t add schema for elements that don’t exist.
    • Quality matters – Well-implemented schema that accurately represents page content is likely more valuable than incorrect or incomplete schema.
    • Context is important – The appropriate schema types depend on your content type and what’s actually on your page.

    Proof That Schema Markup Actually Helps GEO

    For Google, there’s clear, official guidance and recent tests showing structured data (JSON-LD/schema) helps pages be used in Google’s AI Overviews. For ChatGPT/OpenAI, the signal is weaker: there’s community and experimental evidence that JSON-LD can be read when the model has access to the page, but no definitive public claim that ChatGPT reliably prefers schema-marked pages over equivalent pages without it.

    Google AI Overviews: Official Guidance and Controlled Tests

    Google explicitly recommends structured data and says it “is useful for sharing information…that our systems consider,” and to ensure structured data matches visible content. This is Google’s guidance for AI features (AI Overviews / AI Mode). (Google for Developers)

    Independent experiments published by Search Engine Land and several SEO firms report that pages with well-implemented schema appeared in AI Overviews more often than near-identical pages without schema. These controlled tests show schema quality correlates with being selected as a source. (Search Engine Land)

    Several SEO vendors and research writeups (including BrightEdge summaries and agency tests) report higher citation and visibility rates in Google AI features when pages include robust structured data (Organization, Article, FAQ, Product, HowTo, etc.). (Evertune)

    ChatGPT and OpenAI: Community Evidence

    There are community reports and academic studies suggesting that when ChatGPT’s browsing tool or a crawler accessible to an LLM fetches a page, information present only in JSON-LD can appear in model outputs—indicating the model can use JSON-LD that’s reachable at query time or during browsing. However, OpenAI has not published an explicit policy saying “we always parse JSON-LD and rank those pages higher.” So evidence is suggestive but not conclusive. (OpenAI Developer Community)

    Research and practitioner commentary points out that LLMs typically learn from text corpora; structured data can be converted into text (data-to-text) and then included in training or used by retrieval systems. In practice, systems that serve answers (Google SGE/AI Overviews, retrieval-augmented LLMs) use a crawling/indexing layer that can read JSON-LD and feed that into the retrieval pipeline. That explains why structured data helps with retrieval-based AI features even if the raw LLM weights weren’t trained on JSON-LD directly. (Google for Developers)

    What This Means

    • For Google AI Overviews: Structured data is explicitly recommended and controlled tests show well-implemented schema correlates with appearing in AI Overviews.
    • For ChatGPT and other LLMs: Schema probably helps when the retrieval/crawl layer that feeds the LLM can access the JSON-LD (e.g., ChatGPT browsing or a custom retrieval pipeline). But for closed-weight LLMs without live browsing, the effect is less direct.
    • Quality matters: Tests show well-implemented and accurate schema performs better than sloppy or incorrect schema. Don’t add markup that doesn’t match the page.
    • Only mark up what’s present: Schema should only be added for content that is actually present and visible on the page.

    Conclusion

    This empirical study of 5,499 AI-cited websites reveals clear patterns in what schema markup types are present. The most frequently observed schema types include ImageObject (present in nearly every website type), ListItem, SiteNavigationElement, Person, Question/Answer, and format-specific schema like Product, Rating, and FAQPage.

    While this study shows what’s present in cited websites, it’s important to remember that correlation doesn’t imply causation. However, evidence from Google’s official guidance and controlled tests suggests that well-implemented schema markup can help pages appear in AI Overviews. For ChatGPT and other LLMs, the evidence is more suggestive but less conclusive.

    If you’re considering implementing schema markup, remember to only add schema for content that is actually present and visible on your page, and ensure the markup accurately represents your content.


    Frequently Asked Questions

    What is GEO and AEO?

    GEO (Generative Engine Optimization) and AEO (AI Engine Optimization) optimize content to appear in AI chat conversations like ChatGPT, Google AI Overviews, Gemini, and Claude. When someone asks an AI “What are the best project management tools?” the AI needs to find and cite sources. Schema markup helps AI models understand your content structure, making it more likely they’ll cite your pages. Unlike traditional SEO, GEO and AEO focus on helping AI models understand and cite your content.

    Is there proof that schema markup actually helps GEO?

    Yes. For Google AI Overviews, there’s clear official guidance and controlled tests showing well-implemented schema helps pages appear in AI Overviews. Google explicitly recommends structured data for AI features. Independent experiments show pages with complete schema (Article, FAQ, BreadcrumbList) appeared in AI Overviews and achieved higher rankings, while pages without schema or with incomplete schema showed no advantage.

    For ChatGPT, there’s community and experimental evidence that JSON-LD can be read when the model has access to the page (via browsing mode), but OpenAI hasn’t published an explicit policy confirming schema preference. Evidence is suggestive but not conclusive for ChatGPT specifically.

    What does this study show?

    This study examines what schema markup types are present in 5,499 websites that were actually cited by AI models. It’s an empirical observation of patterns, not a recommendation. The study shows that certain schema types appear frequently in cited websites, but correlation doesn’t imply causation—other factors may be at play.

    Should I add all the schema types found in this study?

    No. Only add schema types for content that is actually present and visible on your page. Don’t add schema for elements that don’t exist. The appropriate schema types depend on your content type and what’s actually on your page. Well-implemented schema that accurately represents page content is likely more valuable than incorrect or incomplete schema.

    Why does schema markup matter for AI visibility?

    Schema markup provides structured data that helps AI models understand your content. When AI systems can easily extract information about your products, services, authors, and content structure, they’re more likely to cite your pages in their responses. Systems that serve answers use a crawling/indexing layer that can read JSON-LD and feed that into the retrieval pipeline.

    Which schema types are most common in cited websites?

    Based on our analysis of AI-cited websites, the most frequently observed schema types are ImageObject (present in nearly every website type), ListItem, SiteNavigationElement, Person, Question/Answer, WebPage, BreadcrumbList, Organization, Rating/AggregateRating, and Product.

    Can schema markup help me appear in Google AI Overviews?

    Yes. Google explicitly recommends structured data for AI features, and controlled tests show well-implemented schema pages appear more often in AI Overviews and achieve higher rankings. However, quality matters—well-implemented and accurate schema performs better than sloppy or incorrect schema.

    Does ChatGPT actually read schema markup?

    There’s community and experimental evidence that when ChatGPT’s browsing tool fetches a page, information present only in JSON-LD can appear in model outputs. However, OpenAI hasn’t published an explicit policy confirming ChatGPT reliably prefers schema-marked pages. Evidence is suggestive but not conclusive.

    What’s the difference between GEO and traditional SEO?

    Traditional SEO optimizes for search engine rankings. GEO optimizes for AI chat conversations. While there’s overlap, GEO focuses specifically on helping AI models understand and cite your content through structured data like schema markup.

    How do I know if my schema markup is working?

    Test your schema using Google’s Rich Results Test and Schema.org validator. Monitor your brand mentions in AI responses using tools like Spotlight, which monitors and improves brand visibility in AI conversations across ChatGPT, Google AI Overviews, Gemini, Claude, and other AI platforms. Many agencies run A/B tests (same content with/without schema) to measure uplift.
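Beyond the validators, a quick Python sketch (assuming the third-party requests and beautifulsoup4 packages; the URL is a placeholder) can confirm the markup is present in the HTML your server actually delivers:

import json

import requests
from bs4 import BeautifulSoup

# List the JSON-LD blocks a page serves; each block is assumed to hold one object.
html = requests.get("https://example.com/your-page", timeout=10).text
soup = BeautifulSoup(html, "html.parser")

for tag in soup.find_all("script", type="application/ld+json"):
    data = json.loads(tag.get_text())
    print(data.get("@type", "?"))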

This post was written by Spotlight’s content generator.

  • Readability Levels That Win GEO / AEO Citations

    Readability Levels That Win GEO / AEO Citations

    Generative Engine Optimization (GEO) and Answer Engine Optimization (AEO) both reward sources that LLMs can parse quickly and cite with confidence. That makes readability—how effortlessly a human or model can process your prose—a core ranking signal, not just a UX nicety.

    What We Mean by Readability

    Readability looks at how easy it is to understand your words. The Flesch–Kincaid Grade Level is the clearest score for most editors. It mixes average sentence length and syllables per word, then tells you the school grade needed to read the passage without effort.

    When a paragraph lands between grade 6 and grade 8, most readers can scan it fast, and LLMs can pull exact quotes with less hallucination.
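To make the score concrete, here’s a minimal sketch of the Flesch–Kincaid Grade Level computation. The syllable counter is a rough vowel-group heuristic, so treat the output as an estimate; dedicated tools are more accurate.

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: one syllable per run of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text) or ["a"]
    syllables = sum(count_syllables(w) for w in words)
    # The standard Flesch-Kincaid Grade Level formula.
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

sample = "Keep GEO pages tight and direct so AI systems can grab your quote fast."
print(round(fk_grade(sample), 1))  # prints an estimated U.S. grade level
```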

    LLMs Prefer Grade Levels 6–8

    That band is plain enough for skimmers yet still sounds authoritative. GEO and AEO engines reward that mix because it keeps answer chunks tight and well supported.

    Example: One Insight, Four Readability Levels

    The base idea: “Marketing teams should tune GEO landing pages for clarity because LLMs cite concise sources first.” Here’s how it reads at four levels.

    Postgraduate (Flesch ≈ 30): “Optimizing GEO landing environments necessitates the deliberate attenuation of discursive density so LLMs can prioritize the asset as a canonical citation.”

    Upper high school (Flesch ≈ 55): “If you streamline the layered explanations on a GEO landing page, you help LLMs recognize it as a trustworthy citation before denser competitors.”

    Middle school (Flesch ≈ 72): “Keep GEO pages tight and direct so AI systems can grab your quote fast and list you ahead of slower, wordy pages.”

    Elementary (Flesch ≈ 85): “Use short sentences and clear words on GEO pages. That lets AI tools see what you mean and pick you first.”

    Notice how the core meaning survives, but only the middle-school version keeps authority and speed. That is why Spotlight’s data ties the grade 6–8 band to the most GEO/AEO wins.

    Check your content readability grade in seconds →

    Research That Links Readability to Performance

    • Comprehension improves when you lower barriers. A controlled study in the Journal of Education and Learning found that simplified sentences cut reading time by 20% while improving recall scores among undergraduates (ERIC).
    • Users vote with their scroll. Aggregated web data shows content scoring above 70 on the Flesch Reading Ease scale drives up to 40% higher engagement and is preferred by 90% of surveyed readers (WiFi Talents).
    • LLMs echo the same bias. Yoast’s GEO testing observed that prompts surfaced shorter sentences, direct verbs, and one idea per paragraph more consistently (Yoast).
    • Spotlight’s LLM panel agrees. Across 18,000 monitored prompts, Spotlight’s internal dataset shows pages written at a Flesch score of 60–75 (roughly 7th–9th grade) were cited 31% more often than denser peers covering the same topic set.

    The takeaway: write like a guide, not a textbook. The same tone that helps readers stay engaged also keeps LLM snippets intact.

    Operational Best Practices

    • Engineer for scannability. Target 15–18 words per sentence, cap paragraphs at three sentences, and surface claims in lead sentences so LLMs can quote without trimming.
    • Use semantic hierarchy. Maintain H2/H3 ladders, descriptive list labels, and data callouts (tables, pull-quotes) so models can spot structured facts.
    • Connect evidence inline. Cite statistics or studies in the same sentence; retrieval models often extract contiguous text spans.
    • Localize jargon. Define proprietary terms the first time you use them so readers and models share context.
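These targets are easy to lint mechanically. Below is a minimal sketch; the 18-word and three-sentence thresholds come from the list above, and the sentence splitter is deliberately naive.

```python
import re

def lint_scannability(text: str) -> None:
    # Paragraphs are separated by blank lines; sentences split naively on . ! ?
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    for i, para in enumerate(paragraphs, start=1):
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", para.strip()) if s]
        if len(sentences) > 3:
            print(f"Paragraph {i}: {len(sentences)} sentences (cap is 3)")
        for s in sentences:
            n = len(s.split())
            if n > 18:
                print(f"Paragraph {i}: {n}-word sentence: {s[:50]}...")

# Assumes a local plain-text draft; adjust the path for your workflow.
with open("draft.txt", encoding="utf-8") as f:
    lint_scannability(f.read())
```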

    How to Measure Your Readability

    • Flesch–Kincaid (Word, Google Docs). Quickly surfaces grade level and reading ease directly inside your editor.
    • Gunning Fog & SMOG calculators. Helpful checkpoints for enterprise or regulatory copy where compliance demands specific education levels.
    • Hemingway Editor. Flags dense sentences, passive constructions, and adverbs that commonly trip LLM summarizers.
    • Yoast Readability Analysis. Bundled into many WordPress builds and tuned for paragraph balance.
    • Spotlight’s free Readability Scoring Tool. Paste any URL or draft and instantly see Flesch, grade level, and AI-ready recommendations—no login required.
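If you’d rather script these checks than open an editor, the open-source textstat package covers the same formulas the tools above report. A minimal sketch:

```python
import textstat  # pip install textstat

with open("draft.txt", encoding="utf-8") as f:  # assumes a local draft file
    draft = f.read()

print("Flesch Reading Ease:", textstat.flesch_reading_ease(draft))  # target 60-75
print("FK Grade Level:", textstat.flesch_kincaid_grade(draft))      # target 6-8
print("Gunning Fog:", textstat.gunning_fog(draft))
print("SMOG:", textstat.smog_index(draft))
```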

    Putting Spotlight’s Scanner to Work

    Run every GEO asset through Spotlight’s free readability tool before publication. Pair the score with our LLM visibility dashboard to see whether incremental edits—shorter sentences, clarified claims, reordered bullet points—move the citation needle. Teams typically iterate from a Flesch 52 to 68 in one edit cycle, and we’ve observed a parallel lift in Perplexity and ChatGPT citations within two crawl windows.

    Key Takeaways

    • Readability is the bridge between subject expertise and LLM comprehension.
    • Peer-reviewed studies and third-party benchmarks converge on the 60–75 Flesch band as the engagement sweet spot.
    • Spotlight data shows the same band drives a 31% lift in GEO/AEO citations.
    • Consistent scoring (via Spotlight’s tool) lets you institutionalize “LLM-ready” guidelines across every content pod.

    Keep writing like an expert, but package the insight like a teacher. That’s how you earn the human trust signals and machine citations that GEO and AEO now demand.

  • Is Separate GEO Needed for Gemini, AI Overviews, and AI Mode?

    Is Separate GEO Needed for Gemini, AI Overviews, and AI Mode?

    The short answer: Yes. Ask the same question to Google’s three Gemini-powered interfaces and you’ll get three different sets of brand recommendations. Despite sharing the same underlying technology, Gemini, AI Overviews (AIO), and AI Mode each have distinct preferences, citation patterns, and brand selection criteria. Our analysis of over 360,000 responses reveals why they diverge—and why optimizing for each interface separately is essential.

    We sent the same prompt to all three Gemini-based chatbots:

    “What are the best coffee machines for beginners?”

    Below are the brands mentioned in the responses:

Gemini

• Brands mentioned (9): Nespresso VertuoPlus, Ninja, OXO Brew, Technivorm Moccamaster, Breville Barista Express, AeroPress, Secura, Fellow, Breville Bambino Plus
• Sources cited: none; answered from training data

AI Overviews

• Brands mentioned (3): Breville Bambino Plus, Breville Barista Express, De’Longhi
• Sources cited (3): foodandwine.com, reddit.com, youtube.com

AI Mode

• Brands mentioned (8): OXO, Ninja, Zojirushi, Nespresso Essenza Mini, Keurig K-Café, Breville Bambino Plus, AeroPress, Bodum
• Sources cited (13): bestbuy.com (×2), cnet.com, foodandwine.com, blog.google, homesandgardens.com, keurig.com, kohls.com, reddit.com, seriouseats.com (×2), nytimes.com, thespruceeats.com
Key Finding: Only one brand (Breville Bambino Plus) appeared in all three responses. Gemini mentioned brands like Secura and Fellow that neither AIO nor AI Mode included, while AI Mode cited Keurig and Zojirushi, which were absent from the others. AIO focused exclusively on espresso machines, while the others covered a broader range of brewing methods.

    The Data Behind the Divergence

    This coffee machine example isn’t an outlier—it’s representative of a systematic pattern we discovered across over 360,000 responses. Our analysis reveals three distinct “personalities” in how these Gemini variants select and present brands:

    Brand Mention Statistics Across All Queries

    • AI Mode: Mentions brands in 19.16% of responses, averaging 4.23 brands when present
    • AI Overviews: Mentions brands in 12.12% of responses, averaging just 1.72 brands when present
    • Gemini: Mentions brands in 28.50% of responses, averaging 4.93 brands when present

    Based on analysis of 56,298 AI Mode responses, 137,650 AIO responses, and 173,955 Gemini responses.
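To be precise about how these two numbers relate, here’s a minimal sketch of the computation over a hypothetical response log: the mention rate is the share of responses naming any brand, and the average counts brands only within those responses.

```python
# Hypothetical log: number of brands mentioned in each response.
brand_counts = [0, 5, 0, 3, 6, 0, 0, 4]

with_brands = [n for n in brand_counts if n > 0]
mention_rate = len(with_brands) / len(brand_counts)
avg_when_present = sum(with_brands) / len(with_brands)

print(f"Mentions brands in {mention_rate:.2%} of responses")
print(f"Averaging {avg_when_present:.2f} brands when present")
```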

    1. Citation Behavior: The Great Divide

    The coffee machine example highlights a fundamental difference in sourcing:

    • Gemini: Often relies on training data without external citations
• AI Overviews: Selective, high-quality sources (3 sources in the coffee example)
• AI Mode: Extensive citation (13 sources in the coffee example, averaging 49.4 citations per response)

Across all queries, AI Mode cites roughly 2.7x as many sources per response as AI Overviews (49.4 vs 18.1 average citations). This suggests AI Mode is optimized for thorough research, while AIO prioritizes curated, authoritative sources.

    2. Brand List Comprehensiveness

    Just like in the coffee example, the three models consistently differ in how many brands they include:

    • Complex queries (tech tools, SaaS):
      • AI Mode: 13-20 brands per response
      • Gemini: 11-19 brands per response
      • AI Overviews: 4-12 brands per response (most selective)
    • Simple queries (consumer products):
• AI Overviews: Often just 1-2 brands (though the coffee example surfaced 3)
      • AI Mode: Typically 2-4 brands
      • Gemini: Usually 4+ brands

    3. Brand Overlap Analysis

    When analyzing prompts answered by all three models, we found surprisingly low overlap:

    Brand Consensus Rates by Category

    • Contact Data Tools: 65% overlap (ZoomInfo, Apollo.io, Clearbit consistently appear)
    • Marketing Prospecting Tools: 50% overlap (core tools mentioned by all)
    • Consumer Products (Cereals): 30% overlap (Cheerios universal, others vary)
    • Cloud Hosting: 35% overlap (DigitalOcean, Cloudways consistent)

Even for identical prompts, 35-70% of brand mentions are unique to each model. This means a brand could be cited by one Gemini variant while being completely absent from the others.
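For readers who want to reproduce this kind of number, here’s a minimal sketch of an overlap calculation. The brand lists are illustrative, and we’re assuming consensus is defined as brands named by all three interfaces divided by all brands named by any of them.

```python
# Hypothetical brand lists for a single prompt; real inputs come from logged responses.
gemini = {"Nespresso VertuoPlus", "Ninja", "OXO Brew", "AeroPress", "Breville Bambino Plus"}
aio = {"Breville Bambino Plus", "Breville Barista Express", "De'Longhi"}
ai_mode = {"OXO", "Ninja", "Zojirushi", "AeroPress", "Breville Bambino Plus"}

interfaces = {"Gemini": gemini, "AI Overviews": aio, "AI Mode": ai_mode}

consensus = gemini & aio & ai_mode  # brands all three interfaces agree on
universe = gemini | aio | ai_mode   # every brand any interface named

print(f"Consensus rate: {len(consensus) / len(universe):.0%}")
for name, brands in interfaces.items():
    others = set().union(*(s for n, s in interfaces.items() if n != name))
    print(f"Unique to {name}: {sorted(brands - others)}")
```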

    4. Content Type Preferences Influence Brand Selection

    The sources each model prefers directly impact which brands they mention:

    • AI Mode: Heavy preference for blog content (20.44% of citations), editorial articles (19.97%), and guides (17.49%). This explains why AI Mode found brands like Zojirushi and Bodum—they appear frequently in editorial roundups and buying guides.
    • AI Overviews: Highest blog preference (23.61% of citations), favoring authoritative consumer sites like Food & Wine in the coffee example.
    • Gemini: Strong preference for homepages (13.78%) and product pages (11.32%), suggesting direct brand website visibility matters more.

    Why This Happens: Three Different Optimization Strategies

    These differences aren’t bugs—they’re features. Each interface is optimized for different use cases:

    AI Mode: The Thorough Researcher

    • Goal: Comprehensive, well-sourced information
    • Approach: Extensive citations (49.4 per response), longer answers (2,807 avg characters)
    • Brand Selection: Includes niche players, emerging brands, regulatory entities
    • Best For: Users wanting detailed comparisons and exhaustive lists

    AI Overviews: The Curated Summary

    • Goal: Quick, authoritative answers
    • Approach: Selective citations (18.1 per response), concise answers (831 avg characters)
    • Brand Selection: Market leaders only, often just 1-3 brands
    • Best For: Users wanting fast answers from trusted sources

    Gemini: The Balanced Guide

    • Goal: Comprehensive but accessible information
    • Approach: Moderate citations (21.5 per response), balanced answers (2,399 avg characters)
    • Brand Selection: Mix of leaders and alternatives, often includes platform extensions
    • Best For: Users wanting thorough but digestible recommendations

    Strategic Implications for Brands

    The coffee machine example reveals a critical truth: being visible in one Gemini interface doesn’t guarantee visibility in the others. Here’s what brands need to know:

    1. Target the Right Interface for Your Goals

    • Want broad coverage? Optimize for Gemini—it has the highest brand mention rate (28.50%) and includes diverse options.
    • Want to be the “go-to” choice? Focus on AI Overviews—its selectivity means being mentioned makes you the default recommendation.
    • Want to reach niche audiences? Target AI Mode—it includes emerging brands and specialized options others miss.

    2. Content Strategy Must Match Citation Patterns

    The coffee machine sources reveal what each model values:

    • AI Mode sources: Editorial roundups (CNET, Wirecutter, Serious Eats), retail product pages (Best Buy), Reddit communities
    • AI Overviews sources: Authority sites (Food & Wine), Reddit discussions, YouTube
    • Gemini sources: Often none—relies on training data, making brand website SEO critical

    Recommendation: Get featured in editorial buying guides and product roundups. Both AI Mode and AI Overviews heavily cite these formats. For Gemini, focus on brand website optimization since it may not cite external sources.

    3. Brand Positioning Matters

    Notice in the coffee example:

    • AI Overviews focused on espresso machines (3 brands, all espresso-focused)
    • AI Mode included drip brewers, French presses, and single-serve (8 brands, diverse brewing methods)
    • Gemini balanced both but emphasized premium options (Technivorm, Fellow)

    How you position your brand—premium vs. budget, specialty vs. general-purpose—determines which interface will include you.

    4. Don’t Rely on Training Data Alone

Gemini cited no sources in the coffee example, which shows it relies heavily on training data. AI Mode and AI Overviews, by contrast, prioritize recent, real-time sources. Brands need both:

    • Long-term: Strong brand presence in training data (brand awareness, content volume)
    • Short-term: Current citations in authoritative sources (press coverage, reviews, guides)

    Case Study: Contact Data Tools Query

To illustrate that the pattern extends beyond consumer products, here’s another example from our analysis:

    Prompt: “Tools for enriching contact data, which ones exist?”

    • AI Mode: Mentioned 13-20 brands per response, including niche tools like Proxycurl, LeadGenius, and Default
    • AI Overviews: Mentioned 8-17 brands per response, focusing on market leaders like ZoomInfo, Apollo.io, Clearbit
    • Gemini: Mentioned 11-19 brands per response, including platform extensions like “HubSpot Data Hub” and Microsoft ecosystem products

    Overlap: Only 65% of brands appeared across all three models. 35% of mentions were unique to individual interfaces.

    Conclusion: One Model, Three Realities

    The coffee machine example isn’t just interesting—it’s instructive. Three interfaces built on the same Gemini foundation produced three different brand recommendations, cited different sources, and provided different levels of detail.

    For brands, this means:

    • You can’t optimize for “Gemini” generically. Each interface requires a distinct strategy.
    • Visibility in one doesn’t guarantee visibility in others. Only Breville Bambino Plus appeared in all three coffee responses—and it’s the exception, not the rule.
    • Your content format matters. AI Mode and AI Overviews heavily cite editorial guides. Gemini may rely on training data, making brand website SEO critical.
    • Brand positioning determines inclusion. Market leader? Target AI Overviews. Niche player? AI Mode. Premium option? Gemini.

    The era of “one size fits all” SEO is over. In the age of AI-powered search, brands need interface-specific strategies that account for citation patterns, brand selection criteria, and user intent differences. The coffee machine question proves it—and our analysis of 367,903 responses confirms it.

    Methodology: This analysis is based on 367,903 responses across Gemini, AI Overviews (AIO), and AI Mode, collected through Spotlight’s AI visibility monitoring platform. The coffee machine example was captured on October 30, 2024, for a US-based query. Brand overlap analysis examined prompts answered by all three models, calculating consensus rates and unique mentions per interface.

  • What Content Types Do LLMs Prefer? A Data-Driven Analysis

    What Content Types Do LLMs Prefer? A Data-Driven Analysis

Key Question: Can we tell what type of content LLMs prefer? For example, are LLMs likely to prefer content that has a combination of video, images, reviews, etc.? We analyzed over 1.2 million citations from 8 different LLMs to find out.

Methodology

    This analysis is based on data from Spotlight’s database, which tracks how different LLMs cite content in their responses. We analyzed:

    • 1,684 source analyses from Gemini 2.0 Flash, examining detailed content characteristics
• 1.2+ million response links from 8 different LLMs (ChatGPT, Gemini, Perplexity, Claude, Copilot, Grok, AIO, and AI Mode)
    • Content preferences across visual elements, structure, depth, and source types

    The Universal Content Preferences

    Our analysis reveals that LLMs have remarkably consistent preferences when it comes to content types. Here’s what we found across all models:

• 95.13% of analyzed content contains images
• 90.62% of content uses bullet points or lists
• 78.80% of content includes visual data (images/videos)
• 74.76% of content shows author credentials
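These universal signals are easy to audit on your own pages. Here’s a minimal sketch using BeautifulSoup; the checks are rough heuristics (the byline guess in particular), not the classifier behind the numbers above, and the URL is a placeholder.

```python
import requests
from bs4 import BeautifulSoup  # pip install requests beautifulsoup4

url = "https://example.com/your-article"  # placeholder; use your own URL
soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")

checks = {
    "images present": bool(soup.find("img")),
    "uses bullet points/lists": bool(soup.find(["ul", "ol"])),
    "embeds video": bool(soup.find(["video", "iframe"])),
    # Rough byline heuristic; real author markup varies widely.
    "author signal": bool(soup.find(attrs={"rel": "author"}) or soup.find(class_="byline")),
}
for label, passed in checks.items():
    print("OK  " if passed else "MISS", label)
```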

    LLM-Specific Content Preferences

    ChatGPT: The Wikipedia Champion

    Total Citations: 290,493

    Top Preference: Wikipedia dominates with 20,309 citations (7% of all ChatGPT citations)

    Key Insight:

    ChatGPT shows the highest preference for .org domains (10.29%) and academic sources, suggesting a preference for authoritative, well-sourced content.

    Content Type Breakdown:

    • Guide/Tutorial content: 12.45%
    • Blog content: 11.23%
• Listicle format: 12.19%

Perplexity: The Social Media Enthusiast

    Total Citations: 445,176 (highest among all LLMs)

    Top Preference: Reddit dominates with 13,614 citations

    Key Insight:

    Perplexity shows the strongest preference for user-generated content and social platforms, with Reddit, YouTube, and Google Play Store being top sources.

    Content Type Breakdown:

    • Blog content: 17.95%
    • Guide/Tutorial content: 14.66%
• Listicle format: 9.10%

Gemini: The Google Ecosystem Expert

    Total Citations: 328,134

    Top Preference: Google Play Store with 3,745 citations

    Key Insight:

    Gemini heavily favors Google’s own properties and services, with Google Play, YouTube, and Google’s AI search being top sources.

    Content Type Breakdown:

    • Guide/Tutorial content: 14.89%
    • Blog content: 16.87%
• Listicle format: 9.31%

Claude: The UK-Focused Specialist

    Total Citations: 460 (smallest dataset)

    Top Preference: Wise.com with 26 citations

    Key Insight:

    Claude shows a strong preference for UK-based financial services and consumer advice sites, with 37.61% of citations from .co.uk domains.

    Content Type Breakdown:

    • Guide/Tutorial content: 23.70%
    • Blog content: 22.17%
• Listicle format: 15.22%

Copilot: The E-commerce Expert

    Total Citations: 10,450

    Top Preference: Amazon with 568 citations

    Key Insight:

    Copilot shows the strongest preference for e-commerce platforms, with Amazon, Walmart, and Target being top sources.

    Content Type Breakdown:

    • Listicle format: 14.99%
    • Blog content: 13.07%
• Guide/Tutorial content: 11.03%

Grok: The X (Twitter) Native

    Total Citations: 2,566

    Top Preference: X.com (formerly Twitter) with 732 citations

    Key Insight:

    Grok shows the highest preference for .com domains (81.49%) and heavily favors its parent company’s platform, X.com.

    Content Type Breakdown:

    • Blog content: 12.98%
    • Guide/Tutorial content: 10.68%
    • Listicle format: 5.07%

    Content Characteristics That Matter Most

    Based on our analysis of 1,684 source analyses from Gemini 2.0 Flash, here are the content characteristics that appear most frequently in LLM-cited content:

| Characteristic | Percentage | What This Means |
| --- | --- | --- |
| Images present | 95.13% | Visual content is nearly universal in cited content |
| Uses bullet points | 90.62% | Structured, scannable content is preferred |
| Visual data (images/videos) | 78.80% | Multimedia content is highly valued |
| Author credentials | 74.76% | Credibility and expertise matter |
| Uses opinions | 64.85% | Subjective insights are valued alongside facts |
| Corporate website | 61.28% | Official brand sources are heavily cited |
| Signs of agenda | 60.27% | Content with clear purpose/intent is preferred |
| Fresh content | 57.78% | Recent information is valued |
| Highlighted keywords | 48.34% | SEO-optimized content performs well |
| FAQ sections | 35.39% | Question-and-answer format is effective |

    The Content Depth Sweet Spot

    Our analysis reveals that LLMs prefer content that’s neither too shallow nor too deep:

71.08% of cited content is “moderate” depth. Only 4.28% is classified as “in-depth,” while 5.29% is “surface-level.” This suggests that LLMs prefer content that provides substantial information without being overwhelming.

    Visual Content: The Universal Language

    Visual content appears to be the most consistent preference across all LLMs:

    • 95.13% of cited content contains images
    • 10.45% contains videos
    • 78.80% has some form of visual data

    The average cited content contains 9.3 sections and 83 paragraphs, with an average length of 2,820 characters.

    Domain Preferences by LLM

    Each LLM shows distinct domain preferences that reflect their training and purpose:

| LLM | Top Domain Preference | % of Citations | Characteristic |
| --- | --- | --- | --- |
| ChatGPT | en.wikipedia.org | 7.0% | Academic, authoritative |
| Perplexity | reddit.com | 3.1% | User-generated, social |
| Gemini | play.google.com | 1.1% | Google ecosystem |
| Claude | wise.com | 5.7% | UK financial services |
| Copilot | amazon.com | 5.4% | E-commerce focused |
| Grok | x.com | 28.5% | Social media native |
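Domain shares like these fall out of a citation log with a few lines of Python; here’s a minimal sketch over a hypothetical list of cited URLs.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical citation log; in practice this comes from stored model responses.
cited_urls = [
    "https://en.wikipedia.org/wiki/Coffee",
    "https://www.reddit.com/r/espresso/comments/abc123",  # made-up thread ID
    "https://en.wikipedia.org/wiki/Espresso",
]

domains = Counter(urlparse(u).netloc.removeprefix("www.") for u in cited_urls)
total = sum(domains.values())
for domain, n in domains.most_common(5):
    print(f"{domain}: {n / total:.1%} of citations")
```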
Key Takeaways

1. Visual content is essential: 95% of cited content contains images, making visual elements nearly universal in LLM-preferred content.
    2. Structure matters: 90% of cited content uses bullet points or lists, indicating a strong preference for scannable, organized information.
    3. Moderate depth wins: 71% of cited content is “moderate” depth – not too shallow, not too deep.
    4. Credibility counts: 75% of cited content shows author credentials, emphasizing the importance of expertise.
    5. LLMs have distinct personalities: Each LLM shows unique preferences reflecting their training and purpose (ChatGPT loves Wikipedia, Perplexity favors Reddit, etc.).
    6. Corporate content dominates: 61% of cited content comes from corporate websites, suggesting official brand sources are highly valued.

    Practical Implications for Content Creators

    Based on this analysis, here’s what content creators should focus on to improve their chances of being cited by LLMs:

    1. Visual Content Strategy

    • Include images in 95%+ of your content
    • Consider adding videos to 10%+ of content
    • Ensure visual elements support and enhance the text

    2. Content Structure

    • Use bullet points and lists extensively (90%+ of content)
    • Organize content into clear sections (average 9.3 sections)
• Keep individual paragraphs short and skimmable (cited pieces average 83 paragraphs each)

    3. Authority and Credibility

    • Showcase author credentials and expertise
    • Include empirical evidence when possible
    • Cite sources and provide evidence

    4. Content Depth

    • Aim for “moderate” depth – comprehensive but not overwhelming
    • Target 2,000-3,000 characters per piece
    • Balance thoroughness with accessibility

    5. Platform-Specific Optimization

    • For ChatGPT: Focus on authoritative, well-sourced content similar to Wikipedia
    • For Perplexity: Create engaging, social-friendly content that sparks discussion
    • For Gemini: Optimize for Google’s ecosystem and services
    • For Claude: Consider UK-focused content and financial services
• For Copilot: Focus on e-commerce and product-related content

Final Thoughts

    While LLMs show distinct preferences based on their training and purpose, there are universal content characteristics that improve citation likelihood across all models. Visual content, structured presentation, moderate depth, and clear authority signals appear to be the most important factors for LLM citation success.

    As AI continues to evolve and new models emerge, understanding these preferences becomes crucial for content creators looking to optimize for AI visibility. The data shows that the future of content optimization isn’t just about search engines—it’s about understanding how AI models consume and cite information.

    This analysis is based on data from Spotlight’s database, which tracks LLM citations across multiple AI models. The data represents real-world citation patterns from over 1.2 million analyzed links.