547 Requests in One Day: What Happens When GPTBot Discovers Your Mirror Site
On January 7, 2026, GPTBot made 547 requests to rozz.genymotion.com. This accounted for 47% of all training bot activity recorded in 30 days. The mirror site is an AI publishing layer ROZZ builds automatically for clients. It had been live for weeks with minimal crawler attention. Then GPTBot found it. Within three weeks, ChatGPT users were receiving Genymotion content in their conversations. This is the first documented case study of the GEO pipeline: from mirror site deployment to training crawl to live citation.
Key Findings
GPTBot made 547 requests on January 7, 2026.
That single day accounted for 47% of all training bot activity over the 30-day period.
Total training bot requests (GPTBot + ClaudeBot) were 1,172 over 30 days.
OAI-SearchBot made 66 requests building retrieval indexes.
GPTBot prioritized GEO pages (493 requests) over Q&A pages (322 requests).
Citation events (ChatGPT-User) began appearing approximately three weeks after the major crawl.
42 citation events were recorded in 30 days, concentrated on four high-intent pages.
The Data
Daily GPTBot Activity (Jan 3 – Feb 2, 2026)
| Date | GPTBot Requests | Notable Activity |
|------|-----------------|------------------|
| Jan 3–6 | 0–8/day | Baseline; ClaudeBot discovers site |
| Jan 7 | 547 | Major crawl spike |
| Jan 8–17 | 1–2/day | Low activity period |
| Jan 18–19 | 124 total | Secondary wave |
| Jan 25–26 | 409 total | Tertiary wave |
| Jan 27 | 40 | Q&A deep dive (40+ Q&As in rapid succession) |
| Jan 28+ | 2–4/day | Maintenance crawling; citations begin |
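A daily table like this can be derived directly from the access logs. The sketch below assumes CloudFront standard log lines (tab-separated, with a `#Fields:` header naming the columns and a URL-encoded User-Agent value) and counts requests per day whose User-Agent contains `GPTBot`. The helper name `daily_bot_counts` and the sample lines are illustrative only and are not taken from the actual logs.

```python
from collections import Counter
from urllib.parse import unquote


def daily_bot_counts(lines, bot_token="GPTBot"):
    """Count requests per date whose User-Agent contains bot_token."""
    fields = []
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # The header row names the columns, e.g. "date time ... cs(User-Agent)".
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split("\t")))
        # Assumption: the User-Agent value is URL-encoded, so decode it first.
        user_agent = unquote(row.get("cs(User-Agent)", ""))
        if bot_token.lower() in user_agent.lower():
            counts[row["date"]] += 1
    return counts


# Hypothetical sample with a reduced column set, for illustration only.
sample_log = [
    "#Version: 1.0",
    "#Fields: date time cs-uri-stem cs(User-Agent)",
    "2026-01-07\t09:00:01\t/sitemap.xml\tMozilla/5.0%20(compatible;%20GPTBot/1.2)",
    "2026-01-07\t09:00:02\t/pages/a.html\tMozilla/5.0%20(compatible;%20GPTBot/1.2)",
    "2026-01-07\t09:00:03\t/pages/a.html\tMozilla/5.0%20(X11;%20Linux%20x86_64)",
    "2026-01-08\t10:15:00\t/pages/b.html\tMozilla/5.0%20(compatible;%20GPTBot/1.2)",
]

counts = daily_bot_counts(sample_log)
for day in sorted(counts):
    print(day, counts[day])
```

Reading column positions from the `#Fields:` header, rather than hard-coding an index, keeps the sketch usable if the log configuration includes a different set of fields.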
Bot Category Breakdown (30 Days)
| Category | Bot(s) | Requests | Purpose |
|----------|--------|----------|---------|
| Training | GPTBot, ClaudeBot | 1,172 | Content collection for model training |
| Index | OAI-SearchBot | 66 | Building retrieval indexes |
| Citation | ChatGPT-User | 42 | Real users receiving content in responses |
| Total LLM Bot Requests | — | 1,280 | — |
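The category split above rests on matching User-Agent strings. As a minimal sketch of that kind of classification, the function below maps a User-Agent to one of the three categories by substring match. The mapping mirrors the table; the function name and the sample User-Agent strings are simplified assumptions, not the exact strings seen in the logs.

```python
from collections import Counter

# Bot token -> category, following the table above.
BOT_CATEGORIES = {
    "GPTBot": "Training",
    "ClaudeBot": "Training",
    "OAI-SearchBot": "Index",
    "ChatGPT-User": "Citation",
}


def classify_user_agent(user_agent):
    """Return the bot category for a User-Agent string, or None for other traffic."""
    ua = user_agent.lower()
    for token, category in BOT_CATEGORIES.items():
        if token.lower() in ua:
            return category
    return None


# Simplified, hypothetical User-Agent strings for illustration.
sample_agents = [
    "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)",
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)",
    "Mozilla/5.0 (compatible; ChatGPT-User/1.0)",
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0",
]

# Tally only the requests that matched a known bot token.
tally = Counter(
    category
    for category in map(classify_user_agent, sample_agents)
    if category is not None
)
print(dict(tally))
```

User-Agent strings can be spoofed, so a count built this way is an upper bound on genuine bot traffic unless requests are also checked against the operators' published IP ranges.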
Content Type Distribution (GPTBot Only)
| Content Type | Requests | Percentage |
|--------------|----------|------------|
| GEO Pages | 493 | 57% |
| Q&A Pages | 322 | 37% |
| Sitemap | 27 | 3% |
| Other (APIs, llms.txt, homepage) | 16 | 2% |
What GPTBot Prioritized
1) Discovery via sitemap. GPTBot hit the sitemap first, then systematically worked through content pages.
2) GEO pages over Q&As. The mirror site has 450 GEO pages and 177 Q&A pages, and GPTBot made more requests to GEO pages than to Q&A pages (493 vs 322). GEO pages are AI-optimized versions of Genymotion's help center and documentation, rich in structured content.
3) Burst patterns for Q&As. On January 27, GPTBot returned specifically for Q&A pages, crawling 40+ in rapid succession (roughly one per second). This suggests GPTBot uses different crawl strategies for different content types.
4) Schema.org matters. Every page on the mirror site includes full Schema.org JSON-LD markup (QAPage for Q&As, WebPage for content pages, CollectionPage for topics). This structured data makes content trivially extractable.
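To make the Schema.org point concrete, here is a minimal sketch of QAPage JSON-LD generated for a single Q&A page. It uses the standard Schema.org `QAPage`, `Question`, and `Answer` types. The helper names, the example question and answer text, and the example.com URL are placeholders, and the markup ROZZ actually emits may carry more properties.

```python
import json


def qa_page_jsonld(question, answer, url):
    """Build a Schema.org QAPage structure for one question and its answer."""
    return {
        "@context": "https://schema.org",
        "@type": "QAPage",
        "mainEntity": {
            "@type": "Question",
            "name": question,
            "answerCount": 1,
            "acceptedAnswer": {
                "@type": "Answer",
                "text": answer,
                "url": url,
            },
        },
    }


def as_script_tag(data):
    """Serialize the structure into an embeddable JSON-LD script tag."""
    # Escape "</" so text inside the JSON cannot close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Placeholder content, for illustration only.
markup = qa_page_jsonld(
    question="Example question?",
    answer="Example answer text.",
    url="https://example.com/qa/example-question.html",
)
print(as_script_tag(markup))
```

Because the question and answer sit in named fields, a crawler can read them without parsing the page layout, which is the sense in which structured data makes content easy to extract.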
The Three-Phase Pipeline
Our data shows a clear progression from crawl to citation.
Phase 1: Training (Jan 7 + follow-up waves)
GPTBot mass-crawls the mirror site.
547 requests on January 7 alone.
Follow-up waves on Jan 18–19 (124 requests) and Jan 25–26 (409 requests).
A targeted Q&A crawl on Jan 27.
Content enters OpenAI's training pipeline.
Phase 2: Indexing (ongoing)
OAI-SearchBot operates separately from GPTBot.
It's building the retrieval index that powers ChatGPT's web search feature.
We recorded 66 OAI-SearchBot requests, most of them robots.txt checks (38 of 66) to verify that it has permission to index.
This bot works quietly in the background.
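The permission check a crawler performs can be reproduced locally with Python's standard `urllib.robotparser`. The sketch below parses a robots.txt body and asks whether a given user agent may fetch a URL. The robots.txt rules and the example.com URLs are hypothetical and are not the mirror site's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: AI crawlers allowed everywhere, others kept out of /private/.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for agent in ("OAI-SearchBot", "GPTBot", "SomeOtherBot"):
    allowed = is_allowed(ROBOTS_TXT, agent, "https://example.com/private/page.html")
    print(agent, "allowed" if allowed else "blocked")
```

Running a check like this before launch helps confirm that the bots in each phase of the pipeline are not blocked by a rule written for other crawlers.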
Phase 3: Citations Begin (Jan 28+)
ChatGPT-User requests appear.
Real users asking ChatGPT questions are now receiving Genymotion content from the mirror site.
Timeline: approximately three weeks from major crawl to first citations.
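The three-week figure is simple date arithmetic. The sketch below works it out, assuming the first citations fall on January 28, the start of the "Jan 28+" row in the daily activity table.

```python
from datetime import date

major_crawl = date(2026, 1, 7)       # day of the 547-request spike
first_citations = date(2026, 1, 28)  # assumed from the "Jan 28+" row

lag = first_citations - major_crawl
lag_days = lag.days
lag_weeks = lag_days / 7

print(f"{lag_days} days, or {lag_weeks:.1f} weeks")
```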
Citation Events: What Users Are Asking
The 42 ChatGPT-User requests weren't distributed evenly. They concentrated on specific pages:
| Page | Citations | What Users Are Asking |
|------|-----------|----------------------|
| /pages/what-are-genymotion-desktop-requirements.html | 7 | System requirements for Genymotion |
| /pages/which-android-versions-are-available.html | 5 | Android version support |
| Homepage | 5 | General discovery |
| /pages/how-to-enable-the-virtual-keyboard.html | 2 | Specific troubleshooting |
| /pages/genymotion-desktop-release-notes.html | 1 | Version information |
These are high-intent queries. Users asking ChatGPT about system requirements or Android version support are evaluating whether to use Genymotion. The mirror site is now part of that conversation.
What ROZZ Built
The mirror site at rozz.genymotion.com is infrastructure that ROZZ builds automatically for every client. It includes:
- 450 GEO pages: AI-optimized versions of help center articles, documentation, and blog posts.
- 177 Q&A pages: Generated from questions users ask the ROZZ chatbot on genymotion.com.
- 15 topic categories: Semantic organization for both humans and machines.
- Schema.org markup on every page: QAPage, WebPage, CollectionPage with full JSON-LD.
- llms.txt discovery files: Two formats—index with links and complete content inline.
- JSON APIs: Programmatic access for AI systems.
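As an illustration of the index-style discovery file, the sketch below builds an llms.txt document in the layout described by the public llms.txt proposal: an H1 title, a blockquote summary, then H2 sections of Markdown links. The function name, site name, section titles, and URLs are all hypothetical, and the files ROZZ generates may be organized differently.

```python
def build_llms_txt(site_name, summary, sections):
    """Render an llms.txt index: H1 title, blockquote summary, H2 link lists.

    sections maps a heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in pages:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site structure, for illustration only.
llms_txt = build_llms_txt(
    site_name="Example Docs",
    summary="AI-readable index of help center and Q&A content.",
    sections={
        "Guides": [
            ("System requirements", "https://example.com/pages/requirements.html",
             "Supported hardware and operating systems"),
        ],
        "Q&A": [
            ("Example question", "https://example.com/qa/example-question.html",
             "Short answer page"),
        ],
    },
)
print(llms_txt)
```

The inline-content variant would follow the same outline but embed each page's full text in place of the link.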
This isn't on-page optimization. It is a dedicated publishing layer designed specifically for how LLMs retrieve and cite content.
Genymotion is one client. ROZZ builds this infrastructure automatically for every domain.
Implications for GEO Strategy
1) Dedicated infrastructure beats on-page tweaks. You can't effectively optimize a marketing page for both human conversion and machine extraction. The mirror site provides a separate, purpose-built layer for AI discovery.
2) Structured data accelerates discovery. Every page on the mirror site includes Schema.org JSON-LD. GPTBot's systematic crawl pattern suggests it prioritizes structured, extractable content.
3) The timeline is weeks, not months. From major crawl (Jan 7) to first citations (late Jan): about three weeks. This case suggests GEO results can appear faster than traditional SEO results when the infrastructure is in place.
4) Citation events reveal user intent. The cited pages answer high-intent queries about requirements, compatibility, and features. This is where purchase decisions happen.
Get This for Your Site
ROZZ builds this infrastructure automatically. Mirror site. Q&A pages from your chatbot. Schema.org markup on every page. llms.txt discovery files. JSON APIs. The complete AI publishing layer.
Pricing
$997/month. Results like Genymotion's.
Data Source and Author
Data source: CloudFront access logs for rozz.genymotion.com, January 3 – February 2, 2026. Bot classification based on User-Agent strings.
Author: Adrien Schmidt, CEO, ROZZ. He is a serial tech entrepreneur with 10+ years of experience building AI systems, including Aristotle, and products for eBay and Cartier.
Date: February 2, 2026. Data period: Jan 3 – Feb 2, 2026.