Validation
Confirm your AI-readable files are correctly formatted, publicly accessible, and being discovered by search engines and AI crawlers.
Step 1 — Confirm Files Are Live
Visit each file directly in your browser. They should display as plain text or JSON — not a 404 or 403 error.
https://yourdomain.com/robots.txt
https://yourdomain.com/llms.txt
https://yourdomain.com/llms-full.txt
https://yourdomain.com/semantic/index.json
https://yourdomain.com/markdown/index.md
If any of these fail, see Troubleshooting before continuing.
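If you would rather script this check than open each URL by hand, the following is a minimal sketch using only the Python standard library. The function names, the `ai-file-check/0.1` user-agent string, and the verdict wording are illustrative assumptions; the paths are the ones listed above, and `yourdomain.com` remains a placeholder.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# Paths taken from the list above; swap in your own if they differ.
AI_FILE_PATHS = [
    "/robots.txt",
    "/llms.txt",
    "/llms-full.txt",
    "/semantic/index.json",
    "/markdown/index.md",
]


def build_urls(base_url, paths=AI_FILE_PATHS):
    """Join each path onto the site root."""
    return [urljoin(base_url, path) for path in paths]


def describe_status(status):
    """Turn an HTTP status code into a short verdict."""
    if status == 200:
        return "live"
    if status == 403:
        return "blocked (403)"
    if status == 404:
        return "missing (404)"
    return f"unexpected ({status})"


def check_url(url, timeout=10):
    """Fetch one URL and return its verdict; network errors are reported, not raised."""
    # Hypothetical user-agent string, chosen only for this sketch.
    request = Request(url, headers={"User-Agent": "ai-file-check/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return describe_status(response.status)
    except HTTPError as error:  # 4xx and 5xx responses arrive as exceptions
        return describe_status(error.code)
    except URLError as error:  # DNS failure, refused connection, and similar
        return f"unreachable ({error.reason})"


def main(base_url):
    """Print one verdict per file."""
    for url in build_urls(base_url):
        print(f"{check_url(url):<20} {url}")
```

Calling `main("https://yourdomain.com")` prints one line per file. A "live" verdict only confirms the HTTP status; it does not check that the file's contents are well formed.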
Step 2 — Validate Structured Data
Use Google's Rich Results Test to check your Schema.org markup for errors. Paste your homepage URL and review any flagged issues — common problems include missing required fields (such as a BreadcrumbList missing its item field) or incorrect nesting.
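As a quick local pre-check before pasting a URL into the Rich Results Test, a sketch like the one below can flag the missing-field problem mentioned above. It is an illustration, not a substitute for Google's tool: the function name is hypothetical, it only inspects top-level BreadcrumbList objects (not ones nested under `@graph`), and it assumes each ListItem should carry `position`, `name`, and `item`, with `item` allowed to be absent on the final breadcrumb.

```python
import json


def find_breadcrumb_issues(jsonld_text):
    """Return a list of human-readable problems found in BreadcrumbList markup."""
    data = json.loads(jsonld_text)
    # JSON-LD may be a single object or a list of objects.
    nodes = data if isinstance(data, list) else [data]
    issues = []
    for node in nodes:
        if not isinstance(node, dict) or node.get("@type") != "BreadcrumbList":
            continue
        elements = node.get("itemListElement", [])
        for index, element in enumerate(elements, start=1):
            is_last = index == len(elements)
            for field in ("position", "name", "item"):
                # Assumption: the final breadcrumb may omit its URL.
                if field == "item" and is_last:
                    continue
                if field not in element:
                    issues.append(f"ListItem {index}: missing '{field}'")
    return issues
```

Pass it the text inside a `<script type="application/ld+json">` tag. An empty list means this particular check found nothing, not that the markup is fully valid.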
Step 3 — Check Google Search Console
Google Search Console is the most reliable way to see whether Google has actually discovered and crawled your new files.
URL Inspection:
- Go to URL Inspection in Search Console
- Paste the full URL of a file (e.g., https://yourdomain.com/llms.txt)
- Review the result:
  - "URL is not on Google" is expected for brand-new files. It is not an error; it only means Google hasn't crawled the file yet
  - Check the Discovery section for "Sitemaps" and "Referring page". If both say "none detected," Google doesn't yet know the file exists
- Click Request Indexing to manually prompt a crawl
A known limitation: Google's indexer is built around HTML and JSON content. .txt and .md files are often crawled but not reliably indexed in standard search results — this is normal and doesn't mean your setup is broken. AI crawlers fetch these files directly regardless of Google's indexing status. See Markdown Knowledge Base for more detail.
Crawl Stats:
- Go to Settings → Crawl Stats
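- Review the "Other agent type" category. Crawl Stats only reports requests made by Google's own crawlers, so this category covers Google agents other than the main Googlebot types; third-party AI crawlers show up in your server logs instead (see Step 5)
- Look for your /semantic/ and /markdown/ URLs appearing here with 200 OK responses. This confirms Google's crawlers are fetching your AI-readable files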
Step 4 — Check for Content Signals and Bot Access
Run a free automated scan at isitagentready.com:
POST https://isitagentready.com/api/scan
Content-Type: application/json
{"url": "https://yourdomain.com"}
This checks your robots.txt configuration, Content Signals implementation, and overall bot access control in one pass.
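The request above can be sent from a script. The sketch below builds exactly that POST with the Python standard library; the endpoint, header, and body come from the request shown above, while the function names are hypothetical. The response format is not described here, so the sketch returns the raw response text and makes no assumption about its fields.

```python
import json
from urllib.request import Request, urlopen

# Endpoint as given in the request above.
SCAN_ENDPOINT = "https://isitagentready.com/api/scan"


def build_scan_request(site_url):
    """Build the POST request without sending it."""
    body = json.dumps({"url": site_url}).encode("utf-8")
    return Request(
        SCAN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_scan(site_url, timeout=30):
    """Send the request and return the response body as text (shape unknown)."""
    with urlopen(build_scan_request(site_url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `print(run_scan("https://yourdomain.com"))` shows whatever the service returns; inspect that output before writing any code that depends on particular keys.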
Step 5 — Monitor Server Logs
Your hosting provider's raw access logs are the most direct evidence of AI crawler activity. Look for requests from these user-agents:
GPTBot
ClaudeBot
OAI-SearchBot
Claude-SearchBot
PerplexityBot
Google-Extended
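Amazonbot
See AI Crawlers for what each one does. Note that Google-Extended is a robots.txt control token rather than a separate crawler: Google documents that it has no user-agent string of its own, so it will not appear by name in your logs. Most hosting control panels (cPanel, Plesk) expose raw access logs; if yours does not, ask your host directly.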
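Once you have a raw access log, counting hits per crawler is a simple substring search. The sketch below assumes each log line includes the user-agent string, as the common "combined" log format does; the function name and the `access.log` path in the usage note are hypothetical.

```python
from collections import Counter

# User-agent tokens from the list above.
AI_USER_AGENTS = [
    "GPTBot",
    "ClaudeBot",
    "OAI-SearchBot",
    "Claude-SearchBot",
    "PerplexityBot",
    # Google-Extended is a robots.txt token with no user-agent string of
    # its own, so expect zero hits for it here.
    "Google-Extended",
    "Amazonbot",
]


def count_ai_crawler_hits(log_lines, tokens=AI_USER_AGENTS):
    """Count log lines that mention each crawler token (case-insensitive)."""
    counts = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in tokens:
            if token.lower() in lowered:
                counts[token] += 1
    return counts
```

To run it against a file, open the log and pass the file object directly, for example `with open("access.log", encoding="utf-8", errors="replace") as log: print(count_ai_crawler_hits(log))`. User-agent strings can be spoofed, so treat the counts as an indication rather than proof.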
Step 6 — Test AI Answers Directly
The most direct test of whether your AI readability work is paying off: periodically ask the same questions across multiple AI platforms and track whether your business appears.
Example questions to test:
- "Who are the best [your service] providers in [your city]?"
- "Tell me about [your company name]"
- Direct questions about your specific offerings or expertise
Test across ChatGPT, Perplexity, Claude, and Gemini. Screenshot and date your results so you can track changes over time.
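Alongside screenshots, a dated log makes changes easier to compare over time. The sketch below appends one row per manual test to a CSV file; the column layout, function names, and file path are assumptions made for illustration, and you still decide by reading the answer whether your business was mentioned.

```python
import csv
from datetime import date


def make_row(platform, question, mentioned, on=None):
    """Build one log row: ISO date, platform, question, yes/no."""
    tested_on = on or date.today()
    return [tested_on.isoformat(), platform, question, "yes" if mentioned else "no"]


def record_result(path, platform, question, mentioned, on=None):
    """Append one manual test result to a CSV file and return the row written."""
    row = make_row(platform, question, mentioned, on)
    with open(path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow(row)
    return row
```

For example, `record_result("ai-visibility.csv", "Perplexity", "Tell me about [your company name]", False)` adds a line stamped with today's date.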
Realistic Timeline
| Timeframe | What to Expect |
|---|---|
| 24–72 hours | Manually requested URLs crawled by Google |
| 1 week | Most new files indexed (where applicable) after sitemap resubmission |
| 2–4 weeks | AI crawlers begin regularly hitting your new files |
| 1–3 months | Measurable change in AI answer visibility, if any |
AI readability is a foundation, not an instant-results channel — the goal is to ensure that when AI systems do look, they find accurate, well-structured information.
Related
- Troubleshooting — fixing specific errors found during validation
- AI Crawlers — reference for identifying crawler activity
- Content Signals — declaring AI usage preferences