Author: Michael Hermon

  • What Content Types Do LLMs Prefer? A Data-Driven Analysis

    Key Question: Can we tell what type of content LLMs prefer? For example, are LLMs likely to prefer content that has a combination of video, images, reviews, etc.? We analyzed over 1.2 million citations from 8 different LLMs to find out.

    Methodology

    This analysis is based on data from Spotlight’s database, which tracks how different LLMs cite content in their responses. We analyzed:

    • 1,684 source analyses from Gemini 2.0 Flash, examining detailed content characteristics
    • 1.2+ million response links from 8 different LLMs (ChatGPT, Gemini, Perplexity, Claude, Copilot, Grok, AIO, and AIMode)
    • Content preferences across visual elements, structure, depth, and source types

    The Universal Content Preferences

    Our analysis reveals that LLMs have remarkably consistent preferences when it comes to content types. Here’s what we found across all models:

    95.13% of analyzed content contains images
    90.62% of content uses bullet points or lists
    78.80% of content includes visual data (images/videos)
    74.76% of content shows author credentials

    LLM-Specific Content Preferences

    ChatGPT: The Wikipedia Champion

    Total Citations: 290,493

    Top Preference: Wikipedia dominates with 20,309 citations (7% of all ChatGPT citations)

    Key Insight:

    ChatGPT cites .org domains (10.29%) and academic sources more than any other model, suggesting that it favors authoritative, well-sourced content.

    Content Type Breakdown:

    • Guide/Tutorial content: 12.45%
    • Blog content: 11.23%
    • Listicle format: 12.19%

    Perplexity: The Social Media Enthusiast

    Total Citations: 445,176 (highest among all LLMs)

    Top Preference: Reddit dominates with 13,614 citations

    Key Insight:

    Perplexity shows the strongest preference for user-generated content and social platforms, with Reddit, YouTube, and Google Play Store being top sources.

    Content Type Breakdown:

    • Blog content: 17.95%
    • Guide/Tutorial content: 14.66%
    • Listicle format: 9.10%

    Gemini: The Google Ecosystem Expert

    Total Citations: 328,134

    Top Preference: Google Play Store with 3,745 citations

    Key Insight:

    Gemini heavily favors Google’s own properties and services, with Google Play, YouTube, and Google’s AI search being top sources.

    Content Type Breakdown:

    • Guide/Tutorial content: 14.89%
    • Blog content: 16.87%
    • Listicle format: 9.31%

    Claude: The UK-Focused Specialist

    Total Citations: 460 (smallest dataset)

    Top Preference: Wise.com with 26 citations

    Key Insight:

    Claude shows a strong preference for UK-based financial services and consumer advice sites, with 37.61% of citations from .co.uk domains.

    Content Type Breakdown:

    • Guide/Tutorial content: 23.70%
    • Blog content: 22.17%
    • Listicle format: 15.22%

    Copilot: The E-commerce Expert

    Total Citations: 10,450

    Top Preference: Amazon with 568 citations

    Key Insight:

    Copilot shows the strongest preference for e-commerce platforms, with Amazon, Walmart, and Target being top sources.

    Content Type Breakdown:

    • Listicle format: 14.99%
    • Blog content: 13.07%
    • Guide/Tutorial content: 11.03%

    Grok: The X (Twitter) Native

    Total Citations: 2,566

    Top Preference: X.com (formerly Twitter) with 732 citations

    Key Insight:

    Grok shows the highest preference for .com domains (81.49%) and heavily favors its parent company’s platform, X.com.

    Content Type Breakdown:

    • Blog content: 12.98%
    • Guide/Tutorial content: 10.68%
    • Listicle format: 5.07%
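
    The domain-suffix percentages quoted in the sections above (.org for ChatGPT, .co.uk for Claude, .com for Grok) could be computed roughly as follows. This is a minimal sketch, not Spotlight’s actual pipeline: the function names, the suffix list, and the sample URLs are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Two-label suffixes must be checked before the generic last-label rule.
# This tuple is illustrative and far from complete (assumption).
MULTI_LABEL_SUFFIXES = (".co.uk",)


def domain_suffix(url: str) -> str:
    """Return the suffix of a URL's host, e.g. '.org' or '.co.uk'."""
    host = urlparse(url).hostname or ""
    for suffix in MULTI_LABEL_SUFFIXES:
        if host.endswith(suffix):
            return suffix
    # Fall back to the last dot-separated label of the host.
    return "." + host.rsplit(".", 1)[-1] if "." in host else ""


def suffix_shares(urls: list[str]) -> dict[str, float]:
    """Percentage of citations per domain suffix, rounded to two decimals."""
    counts = Counter(domain_suffix(u) for u in urls)
    total = sum(counts.values())
    return {suffix: round(100 * n / total, 2) for suffix, n in counts.items()}


# Made-up sample citations, for illustration only (not Spotlight data).
sample_citations = [
    "https://en.wikipedia.org/wiki/Example",
    "https://www.reddit.com/r/example",
    "https://wise.com/gb/",
    "https://www.example.co.uk/guide",
]

shares = suffix_shares(sample_citations)
print(shares)
```

    A real citation set would need a fuller suffix list (for example the Public Suffix List) so that hosts such as .com.au or .org.uk are grouped correctly.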

    Content Characteristics That Matter Most

    Based on our analysis of 1,684 source analyses from Gemini 2.0 Flash, here are the content characteristics that appear most frequently in LLM-cited content:

    Characteristic              | Percentage | What This Means
    Images Present              | 95.13%     | Visual content is nearly universal in cited content
    Uses Bullet Points          | 90.62%     | Structured, scannable content is preferred
    Visual Data (Images/Videos) | 78.80%     | Multimedia content is highly valued
    Author Credentials          | 74.76%     | Credibility and expertise matter
    Uses Opinions               | 64.85%     | Subjective insights are valued alongside facts
    Corporate Website           | 61.28%     | Official brand sources are heavily cited
    Signs of Agenda             | 60.27%     | Content with clear purpose/intent is preferred
    Fresh Content               | 57.78%     | Recent information is valued
    Highlighted Keywords        | 48.34%     | SEO-optimized content performs well
    FAQ Sections                | 35.39%     | Question-and-answer format is effective

    The Content Depth Sweet Spot

    Our analysis reveals that LLMs prefer content that’s neither too shallow nor too deep:

    71.08% of cited content is “moderate” depth

    Only 4.28% of cited content is classified as “in-depth,” while 5.29% is “surface-level.” This suggests that LLMs prefer content that provides substantial information without being overwhelming.
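
    The three depth figures quoted above do not add up to 100%. A quick check, using only the percentages stated in this article, shows how much of the cited content falls outside the three named depth classes. The article does not say how that remainder was classified.

```python
# Depth shares quoted in this article (percent of cited content).
depth_shares = {"moderate": 71.08, "in-depth": 4.28, "surface-level": 5.29}

# Share covered by the three named depth classes.
labeled = round(sum(depth_shares.values()), 2)

# Share the article does not attribute to any depth class.
unlabeled = round(100 - labeled, 2)

print(f"named classes: {labeled}%, not accounted for: {unlabeled}%")
```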

    Visual Content: The Universal Language

    Visual content appears to be the most consistent preference across all LLMs:

    • 95.13% of cited content contains images
    • 10.45% contains videos
    • 78.80% has some form of visual data

    The average cited content contains 9.3 sections and 83 paragraphs, with an average length of 2,820 characters.

    Domain Preferences by LLM

    Each LLM shows distinct domain preferences that reflect its training and purpose:

    LLM        | Top Domain Preference | % of Citations | Characteristic
    ChatGPT    | en.wikipedia.org      | 7.0%           | Academic, authoritative
    Perplexity | reddit.com            | 3.1%           | User-generated, social
    Gemini     | play.google.com       | 1.1%           | Google ecosystem
    Claude     | wise.com              | 5.7%           | UK financial services
    Copilot    | amazon.com            | 5.4%           | E-commerce focused
    Grok       | x.com                 | 28.5%          | Social media native
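
    The “% of Citations” column can be reproduced from the per-model counts reported earlier in this article: the top domain’s citation count divided by that model’s total citations. A short sketch of the arithmetic (the variable and function names are ours):

```python
# (top domain, citations for that domain, total citations), as reported above.
top_domain = {
    "ChatGPT": ("en.wikipedia.org", 20_309, 290_493),
    "Perplexity": ("reddit.com", 13_614, 445_176),
    "Gemini": ("play.google.com", 3_745, 328_134),
    "Claude": ("wise.com", 26, 460),
    "Copilot": ("amazon.com", 568, 10_450),
    "Grok": ("x.com", 732, 2_566),
}


def citation_share(count: int, total: int) -> float:
    """Share of a model's citations going to one domain, in percent."""
    return round(100 * count / total, 1)


shares = {
    llm: citation_share(count, total)
    for llm, (_domain, count, total) in top_domain.items()
}

for llm, (domain, _count, _total) in top_domain.items():
    print(f"{llm}: {domain} {shares[llm]}%")
```

    Note how different the sample sizes are: Claude’s 5.7% rests on 26 of 460 citations, while ChatGPT’s 7.0% rests on 20,309 of 290,493.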

    Key Takeaways
    1. Visual content is essential: 95% of cited content contains images, making visual elements nearly universal in LLM-preferred content.
    2. Structure matters: 90% of cited content uses bullet points or lists, indicating a strong preference for scannable, organized information.
    3. Moderate depth wins: 71% of cited content is “moderate” depth – not too shallow, not too deep.
    4. Credibility counts: 75% of cited content shows author credentials, emphasizing the importance of expertise.
    5. LLMs have distinct personalities: Each LLM shows unique preferences reflecting its training and purpose (ChatGPT loves Wikipedia, Perplexity favors Reddit, etc.).
    6. Corporate content dominates: 61% of cited content comes from corporate websites, suggesting official brand sources are highly valued.

    Practical Implications for Content Creators

    Based on this analysis, here’s what content creators should focus on to improve their chances of being cited by LLMs:

    1. Visual Content Strategy

    • Include images in 95%+ of your content
    • Consider adding videos to 10%+ of content
    • Ensure visual elements support and enhance the text

    2. Content Structure

    • Use bullet points and lists extensively (90%+ of content)
    • Organize content into clear sections (average 9.3 sections)
    • Keep paragraphs manageable (average 83 paragraphs per piece)
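
    One rough way to check a page against the visual and structural points above is to count the relevant HTML tags. The sketch below uses Python’s standard-library HTML parser. It is only a heuristic under our own assumptions: the article does not describe how the characteristics were actually detected, and treating each h2 as a “section” is our simplification. The sample markup is made up.

```python
from html.parser import HTMLParser


class ContentAudit(HTMLParser):
    """Counts tags loosely tied to the characteristics discussed above."""

    def __init__(self):
        super().__init__()
        self.counts = {"images": 0, "videos": 0, "lists": 0, "sections": 0, "paragraphs": 0}

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.counts["images"] += 1
        elif tag == "video":
            self.counts["videos"] += 1
        elif tag in ("ul", "ol"):
            self.counts["lists"] += 1
        elif tag == "h2":  # assumption: one h2 heading per section
            self.counts["sections"] += 1
        elif tag == "p":
            self.counts["paragraphs"] += 1


def audit(html: str) -> dict[str, int]:
    """Parse an HTML string and return the tag counts."""
    parser = ContentAudit()
    parser.feed(html)
    parser.close()
    return parser.counts


# Made-up markup, for illustration only.
sample_html = """
<h2>Intro</h2><p>Text</p><img src="a.png" alt="">
<h2>Steps</h2><ul><li>One</li><li>Two</li></ul><p>More</p>
"""

report = audit(sample_html)
print(report)
```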

    3. Authority and Credibility

    • Showcase author credentials and expertise
    • Include empirical evidence when possible
    • Cite your sources

    4. Content Depth

    • Aim for “moderate” depth – comprehensive but not overwhelming
    • Target 2,000-3,000 characters per piece
    • Balance thoroughness with accessibility

    5. Platform-Specific Optimization

    • For ChatGPT: Focus on authoritative, well-sourced content similar to Wikipedia
    • For Perplexity: Create engaging, social-friendly content that sparks discussion
    • For Gemini: Optimize for Google’s ecosystem and services
    • For Claude: Consider UK-focused content and financial services
    • For Copilot: Focus on e-commerce and product-related content

    Final Thoughts

    While LLMs show distinct preferences based on their training and purpose, there are universal content characteristics that improve citation likelihood across all models. Visual content, structured presentation, moderate depth, and clear authority signals appear to be the most important factors for LLM citation success.

    As AI continues to evolve and new models emerge, understanding these preferences becomes crucial for content creators looking to optimize for AI visibility. The data shows that the future of content optimization isn’t just about search engines—it’s about understanding how AI models consume and cite information.

    This analysis is based on data from Spotlight’s database, which tracks LLM citations across multiple AI models. The data represents real-world citation patterns from over 1.2 million analyzed links.

  • Which Domains Do AI Models Trust Most? A 60-Day Analysis of Citation Patterns

    In the rapidly evolving world of AI-powered search and content generation, understanding which sources AI models trust most is crucial for brands looking to optimize their visibility. Our latest analysis of over 850,000 citations across major AI models reveals fascinating patterns in domain preferences that could reshape your content strategy.

    Key Finding

    Each AI model has distinct domain preferences: Wikipedia dominates ChatGPT’s citations (20,122), Reddit leads Perplexity’s (12,774), and YouTube tops Gemini’s trusted sources (1,821).

    The Methodology

    We analyzed citation data from our Spotlight platform, examining over 850,000 URL citations across seven major AI models over the past 60 days. The data reveals not just which domains get cited most frequently, but also the unique preferences of each AI model.

    ChatGPT: The Wikipedia Champion

    ChatGPT shows a clear preference for authoritative, encyclopedia-style content. Wikipedia dominates its citations with an astonishing 20,122 references in just 60 days.

    Domain           | Citations | Domain Type
    en.wikipedia.org | 20,122    | Encyclopedia
    reddit.com       | 11,251    | Community
    techradar.com    | 3,424     | Tech News
    investopedia.com | 1,530     | Financial Education
    tomsguide.com    | 1,330     | Tech Reviews

    Insight: ChatGPT heavily favors established, authoritative sources. Wikipedia’s dominance suggests that comprehensive, well-sourced content performs exceptionally well with this model.

    Perplexity: The Community-Driven Model

    Perplexity shows a different pattern, with Reddit leading its citations at 12,774 references. This suggests Perplexity values real-world user experiences and community discussions.

    Domain               | Citations | Domain Type
    reddit.com           | 12,774    | Community
    youtube.com          | 6,345     | Video Content
    translate.google.com | 2,970     | Translation Tool
    play.google.com      | 1,871     | App Store
    bestbrokers.com      | 1,800     | Financial Services

    Insight: Perplexity’s preference for Reddit and YouTube suggests it values authentic user experiences and visual content. Brands should consider creating community-focused content and video materials.

    Gemini: The Google Ecosystem Player

    Google’s Gemini cites YouTube most often, with 1,821 citations, followed by Google’s own Vertex AI Search with 1,631 citations.

    Domain           | Citations | Domain Type
    youtube.com      | 1,821     | Video Content
    play.google.com  | 1,261     | App Store
    investopedia.com | 1,072     | Financial Education
    pcmag.com        | 1,059     | Tech Reviews

    Insight: Gemini’s heavy reliance on Google’s own tools and YouTube suggests strong integration within the Google ecosystem. Video content and Google-optimized materials may perform better with this model.

    Cross-Model Patterns: Universal Winners

    • Reddit: Top performer in Perplexity (12,774), strong in ChatGPT (11,251)
    • YouTube: Leading in Gemini (1,821), strong in Perplexity (6,345)
    • Investopedia: Consistently cited across ChatGPT (1,530), Gemini (1,072)
    • TechRadar: Strong performance across ChatGPT (3,424), Perplexity (1,208), Gemini (770)
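
    The overlaps listed above can be recovered mechanically from the three per-model tables in this article. The sketch below copies the table counts into a dictionary and keeps every domain that appears in at least two models’ tables. The helper name and the two-model threshold are our own choices.

```python
from collections import defaultdict

# Top-domain citation counts copied from the three tables in this article.
top_domains = {
    "ChatGPT": {
        "en.wikipedia.org": 20_122,
        "reddit.com": 11_251,
        "techradar.com": 3_424,
        "investopedia.com": 1_530,
        "tomsguide.com": 1_330,
    },
    "Perplexity": {
        "reddit.com": 12_774,
        "youtube.com": 6_345,
        "translate.google.com": 2_970,
        "play.google.com": 1_871,
        "bestbrokers.com": 1_800,
    },
    "Gemini": {
        "youtube.com": 1_821,
        "play.google.com": 1_261,
        "investopedia.com": 1_072,
        "pcmag.com": 1_059,
    },
}


def shared_domains(tables, min_models=2):
    """Map each domain to its per-model counts, keeping only domains
    that appear in the tables of at least `min_models` models."""
    by_domain = defaultdict(dict)
    for model, counts in tables.items():
        for domain, n in counts.items():
            by_domain[domain][model] = n
    return {d: m for d, m in by_domain.items() if len(m) >= min_models}


shared = shared_domains(top_domains)
for domain in sorted(shared):
    print(domain, shared[domain])
```

    TechRadar does not surface here because its Perplexity and Gemini counts (1,208 and 770) fall below the lowest rows of those models’ tables. play.google.com, on the other hand, appears in both the Perplexity and Gemini tables.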

    What This Means for Your Brand

    1. Model-Specific Strategies

    • For ChatGPT: Focus on comprehensive, encyclopedia-style content that could be referenced in Wikipedia
    • For Perplexity: Engage with community platforms like Reddit and create video content for YouTube
    • For Gemini: Optimize for Google’s ecosystem and create video content

    2. Universal Strategies

    • Create comprehensive, authoritative content
    • Engage with community platforms
    • Develop video content
    • Focus on expert reviews and technical analysis

    Key Takeaways

    1. Model Preferences Vary Significantly: Each AI model has distinct domain preferences that require tailored strategies.
    2. Authority Matters: Established, authoritative sources consistently perform well across models.
    3. Community Engagement Works: Platforms like Reddit show strong citation patterns, indicating value in community-focused content.
    4. Video Content is Powerful: YouTube’s strong performance across models suggests video content is highly valued.
    5. Industry-Specific Patterns: Financial services and technology sectors show particularly strong citation patterns.

    This analysis is based on data from Spotlight’s AI visibility monitoring platform, covering over 850,000 citations across seven major AI models over the past 60 days.