How to Scrape Grok Answers with the Grok Scraper API
Specialist in Anti-Bot Strategies
TL;DR
- A Grok scraper API returns xAI's answer with both of its source panels as data. One POST to the scraper.grok actor captures the full answer plus web_search_results and x_search_results: the open-web pages and the X (Twitter) posts Grok cited, as separate arrays.
- Three inputs, one of them unusual. prompt carries the question, country pins residential egress, and a required reasoning mode (MODEL_MODE_FAST, MODEL_MODE_EXPERT, or MODEL_MODE_AUTO) controls how hard Grok reasons before answering.
- X citations are the differentiator. Grok blends live web search with X's real-time feed; capturing only the answer text throws away the half of the data that says who it credited.
- The envelope matches the other LLM actors. { status, task_id, task_result }, one x-api-token, the same endpoint: a ChatGPT capture client extends to Grok by changing the actor name and adding the mode.
- Run metadata comes along free. Follow-up suggestions, footnotes, token counts, and the run's conversation identifiers arrive in the same payload, ready for audit trails.
- Free to start. New Scrapeless accounts include free trial credits — sign up at app.scrapeless.com.
Introduction: the answer engine with a social feed inside
Grok answers questions by blending two source types no other major assistant combines: live web search and posts pulled straight from X. Ask it which tool to buy, which API holds up, which brand to trust, and the response folds web pages and X posts into one cited answer. For anyone tracking how a brand shows up in AI answers, that makes Grok a distinct surface — the citations include the social conversation, not just the indexed web.
Capturing those answers by hand is the usual story: a login-gated interface, streaming output, geo-sensitive responses, and a DOM that was never meant to be parsed. And Grok adds a twist of its own — the reasoning mode changes the answer, so a capture pipeline has to control it explicitly.
The scraper.grok actor turns all of that into one HTTP request: prompt, country, and mode in; structured answer and both citation panels out. This guide covers the request shape, the response schema, a runnable Python client, and the companion actors that cover the rest of the AI-answer landscape. For the ranked view of the category, see the best LLM scrapers guide.
What You Can Do With It
- Share-of-citation tracking across two panels. Count which domains appear in web_search_results and which accounts appear in x_search_results for a fixed prompt set over time.
- Brand monitoring where X drives the narrative. For categories where sentiment forms on X first, Grok's citations show which posts are shaping the model's answers.
- Reasoning-mode comparison. Capture the same prompt under FAST, EXPERT, and AUTO and measure how depth changes the answer and the sources.
- Multi-market capture. Pin runs per country and compare what Grok tells different markets about the same question.
- Competitive answer analysis. Track when Grok starts or stops recommending a product, and trace the change to the citations behind it.
- Dataset building. Store prompt–answer–panel triples as clean JSON for longitudinal analysis.
Why the Scrapeless Grok Scraper
The scraper.grok actor is part of the Scrapeless LLM Chat Scraper family inside the Universal Scraping API line:
- Both citation panels as discrete arrays. Open-web sources and X posts arrive separately — a share-of-citation report reads each panel directly, no re-parsing.
- The reasoning mode is a first-class input. You decide how hard Grok thinks per run, which keeps a scheduled series methodologically consistent.
- Country-pinned residential egress. Runs route through residential proxies across 195+ countries, so locale-specific answers are reproducible.
- One contract across platforms. The same endpoint, header, and { status, task_id, task_result } envelope cover the ChatGPT, Gemini, Perplexity, and Copilot actors.
The parameter reference lives in the LLM Chat Scraper docs.
Prerequisites
- A Scrapeless account and API key — sign up at app.scrapeless.com.
- curl for the quick test, or Python 3.10+ for the client below.
- Basic familiarity with HTTP and JSON.
Store your key in the environment so it never lands in code:
bash
export SCRAPELESS_API_KEY=your_api_token_here
How the Grok Scraper works
- Endpoint: POST https://api.scrapeless.com/api/v2/scraper/execute
- Actor: scraper.grok
- Auth header: x-api-token: $SCRAPELESS_API_KEY
Request parameters
| input field | required | description |
|---|---|---|
| prompt | yes | the question to send to Grok |
| country | yes | two-letter country code for the run's residential egress (e.g. US; JP and TW are unavailable) |
| mode | yes | reasoning depth: MODEL_MODE_FAST, MODEL_MODE_EXPERT, or MODEL_MODE_AUTO |
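Because all three inputs are required, it can be worth validating them locally before spending a request. The helper below is a minimal sketch, not part of any Scrapeless SDK: build_grok_payload and its error messages are hypothetical, the unavailable-country set only encodes the JP/TW note from the table above, and upper-casing the country code is an assumption made for consistency with the US example.

```python
ALLOWED_MODES = {"MODEL_MODE_FAST", "MODEL_MODE_EXPERT", "MODEL_MODE_AUTO"}
# Countries the parameter table lists as unavailable for this actor.
UNAVAILABLE_COUNTRIES = {"JP", "TW"}


def build_grok_payload(prompt: str, country: str, mode: str) -> dict:
    """Return the request body for scraper.grok, or raise ValueError on bad input.

    Hypothetical helper: the checks mirror the parameter table, nothing more.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    # Assumption: normalise to upper case, matching the "US" example.
    code = country.strip().upper()
    if len(code) != 2 or not code.isalpha():
        raise ValueError(f"country must be a two-letter code, got {country!r}")
    if code in UNAVAILABLE_COUNTRIES:
        raise ValueError(f"country {code} is listed as unavailable")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}, got {mode!r}")
    return {
        "actor": "scraper.grok",
        "input": {"prompt": prompt, "country": code, "mode": mode},
    }
```

The returned dict can be passed as the JSON body of the POST; a typo such as "EXPERT" instead of "MODEL_MODE_EXPERT" then fails before any network call is made.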
Quick capture with curl
bash
curl -sS -X POST https://api.scrapeless.com/api/v2/scraper/execute \
  -H "Content-Type: application/json" \
  -H "x-api-token: ${SCRAPELESS_API_KEY}" \
  -d '{
    "actor": "scraper.grok",
    "input": {
      "prompt": "Which web scraping API handles JavaScript-heavy sites?",
      "country": "US",
      "mode": "MODEL_MODE_EXPERT"
    }
  }'
Response envelope
json
// illustrative sample: schema from a live scraper.grok run; values abridged
{
  "status": "success",
  "task_id": "52fc9c96-…",
  "task_result": {
    "user_query": "Which web scraping API handles JavaScript-heavy sites?",
    "full_response": "For JavaScript-heavy sites, the options that hold up are…",
    "web_search_results": [
      { "title": "…", "url": "https://…", "preview": "…", "description": "…", "favicon": "…", "image": "…" }
    ],
    "x_search_results": [],
    "follow_up_suggestions": [ "…" ],
    "footnotes": [],
    "tool_usages": [ "…" ],
    "token_count": 1024,
    "user_model": "…",
    "response_id": "…",
    "conversation": { "conversation_id": "…", "title": "…", "create_time": "…" }
  }
}
Field by field:
| field | type | what it holds |
|---|---|---|
| task_result.user_query | string | the prompt as Grok received it |
| task_result.full_response | string | Grok's complete answer text |
| task_result.web_search_results[] | array | open-web citations: title, url, preview, plus description, favicon, and image when present |
| task_result.x_search_results[] | array | the X posts Grok cited; empty when the prompt pulled no social sources |
| task_result.follow_up_suggestions[] | array | the follow-up questions Grok offers after the answer |
| task_result.footnotes[] | array | footnote entries, when the answer carries them |
| task_result.tool_usages[] | array | the tools the run invoked (search, browse) |
| task_result.token_count | number | the run's token usage |
| task_result.conversation | object | run identifiers (conversation_id, title, timestamps), useful as audit keys |
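The field table maps naturally onto a small typed container. The following is a sketch under stated assumptions rather than an official client: GrokCapture and parse_envelope are hypothetical names, only a handful of the fields are kept, and treating any status other than "success" as a failure is an assumption, since the sample above only shows the success case.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GrokCapture:
    """Hypothetical container for the fields most reports need."""
    answer: str
    web_sources: list = field(default_factory=list)
    x_sources: list = field(default_factory=list)
    conversation_id: Optional[str] = None


def parse_envelope(envelope: dict) -> GrokCapture:
    """Turn a scraper.grok envelope into a GrokCapture, tolerating missing fields."""
    # Assumption: anything other than "success" means task_result is unusable.
    if envelope.get("status") != "success":
        raise RuntimeError(f"capture failed: status={envelope.get('status')!r}")
    result = envelope.get("task_result") or {}
    conversation = result.get("conversation") or {}
    return GrokCapture(
        answer=result.get("full_response") or "",
        # "or []" covers both a missing key and an explicit null.
        web_sources=result.get("web_search_results") or [],
        x_sources=result.get("x_search_results") or [],
        conversation_id=conversation.get("conversation_id"),
    )
```

Reading through `or {}` and `or []` keeps downstream code free of None checks when a panel or the conversation block is absent.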
Get your API key on the free plan: app.scrapeless.com
Integrating the API in Python
A complete client: send the prompt, check the envelope, and print both citation panels.
python
import os

import requests

ENDPOINT = "https://api.scrapeless.com/api/v2/scraper/execute"


def ask_grok(prompt: str, country: str = "US", mode: str = "MODEL_MODE_EXPERT") -> dict:
    """Send one prompt to scraper.grok and return the parsed JSON envelope."""
    resp = requests.post(
        ENDPOINT,
        headers={
            "Content-Type": "application/json",
            "x-api-token": os.environ["SCRAPELESS_API_KEY"],
        },
        json={
            "actor": "scraper.grok",
            "input": {"prompt": prompt, "country": country, "mode": mode},
        },
        timeout=300,  # generous: deeper reasoning modes can run long
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    data = ask_grok("Which web scraping API handles JavaScript-heavy sites?")
    # "or" guards against keys that are present but null.
    result = data.get("task_result") or {}
    web = result.get("web_search_results") or []
    x = result.get("x_search_results") or []
    print(f"status={data.get('status')} web_sources={len(web)} x_sources={len(x)}")
    for i, src in enumerate(web[:5], 1):
        title = (src.get("title") or "")[:60]
        url = (src.get("url") or "")[:60]
        print(f"  [web {i}] {title} → {url}")
    for i, post in enumerate(x[:5], 1):
        print(f"  [x {i}] {str(post)[:80]}")
For share-of-citation work, group web_search_results URLs by domain and x_search_results by account, and count per prompt — the two panels are independent signals and worth charting separately.
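The web half of that grouping can be sketched with the standard library alone. count_domains is a hypothetical helper, and folding www.example.com into example.com is a design choice of this sketch rather than anything the API does. The X panel is left out because the sample response above shows x_search_results empty, so the shape of its entries is not assumed here.

```python
from collections import Counter
from urllib.parse import urlparse


def count_domains(web_search_results: list) -> Counter:
    """Count cited domains in one capture's web_search_results panel."""
    counts = Counter()
    for entry in web_search_results or []:
        # hostname is already lower-cased by urlparse; it is None for empty URLs.
        host = urlparse(entry.get("url") or "").hostname or ""
        if host.startswith("www."):
            host = host[4:]  # fold www.example.com into example.com
        if host:
            counts[host] += 1
    return counts
```

Summing the per-capture counters over a prompt set (Counter supports `+`) gives the totals for a share-of-citation chart.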
Picking the reasoning mode
The required mode is the input that has no ChatGPT equivalent, and it changes both latency and output:
- MODEL_MODE_FAST: quickest answers; suits high-volume sweeps where breadth beats depth.
- MODEL_MODE_EXPERT: deeper reasoning and typically richer sourcing; suits the prompts you chart over time. Allow for longer runs.
- MODEL_MODE_AUTO: Grok chooses per prompt; convenient interactively, but a scheduled series is easier to interpret when the mode is held constant.
Whichever you pick, store it with each capture — comparing an EXPERT run against a FAST run is comparing two different processes.
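One way to quantify how much two modes diverge on the same prompt is to compare the sets of web URLs each run cited. The sketch below uses Jaccard overlap, which is a choice made for this illustration and not something the actor reports; cited_urls and source_overlap are hypothetical names, and both functions take the task_result objects of two captures.

```python
def cited_urls(task_result: dict) -> set:
    """Collect the non-empty URLs from a capture's web_search_results panel."""
    entries = task_result.get("web_search_results") or []
    return {e.get("url") for e in entries if e.get("url")}


def source_overlap(result_a: dict, result_b: dict) -> float:
    """Jaccard overlap (0.0 to 1.0) of web citation URLs between two captures."""
    a, b = cited_urls(result_a), cited_urls(result_b)
    if not a and not b:
        return 1.0  # two empty panels are treated as identical
    return len(a & b) / len(a | b)
```

Since panel sizes vary between runs even in a single mode, a same-mode baseline (two FAST runs of the same prompt, say) is a useful reference point before reading much into a FAST versus EXPERT score.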
Companion actors for the rest of the AI-answer landscape
The endpoint, header, and envelope stay the same across the family — only the actor name and platform-specific inputs change:
- scraper.chatgpt: prompt + optional country; returns result_text with content_references citations.
- scraper.gemini: same two-field input; returns result_text plus a citations array.
- scraper.perplexity: required country and a web_search flag; returns web_results, media_items, and related prompts.
- scraper.copilot: the Copilot answer surface under the same contract.
- scraper.overview / scraper.aimode: Google's AI Overview block and AI Mode tab; covered end to end in the AI Overview guide.
Pricing for the line is usage-based with free trial credits on signup — current tiers are on the pricing page.
How to avoid common problems
- An empty x_search_results is normal for many prompts. Technical and product questions often resolve entirely from the open web. Prompts about people, events, and sentiment are the ones that pull X posts; phrase accordingly when the X panel is the point.
- Panel sizes swing run to run. The same prompt can cite 35 web sources one run and 20 the next. Store every capture with its conversation_id and read the series, not a single run.
- Hold the mode constant in a series. Mode changes the reasoning process; mixing modes inside one tracked prompt set makes trend lines uninterpretable.
- Treat fields as nullable. footnotes is often empty, web-source entries carry description/image only sometimes, and x_search_results may be []; read what is present.
- Mind the country list. country is required and JP/TW are unavailable; pick the markets you report on and keep them fixed per series.
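Several of these points come down to storing each capture alongside the inputs that produced it. A minimal sketch of that, assuming a local JSON Lines file is an acceptable store: capture_record, append_capture, and the record's key names are all hypothetical, and the captured_at timestamp is added by the client, not returned by the API.

```python
import json
from datetime import datetime, timezone


def capture_record(envelope: dict, prompt: str, country: str, mode: str) -> dict:
    """Flatten one envelope plus its inputs into a storable record."""
    result = envelope.get("task_result") or {}
    conversation = result.get("conversation") or {}
    return {
        "captured_at": datetime.now(timezone.utc).isoformat(),  # client-side clock
        "prompt": prompt,
        "country": country,
        "mode": mode,  # kept so mixed-mode series can be detected later
        "task_id": envelope.get("task_id"),
        "conversation_id": conversation.get("conversation_id"),
        "web_search_results": result.get("web_search_results") or [],
        "x_search_results": result.get("x_search_results") or [],
        "footnotes": result.get("footnotes") or [],
    }


def append_capture(path: str, record: dict) -> None:
    """Append one record as a single JSON line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

With one line per run, a series can be filtered by prompt, country, and mode before any trend is read from it.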
Conclusion: both panels, one request
Capturing Grok reduces to one call: POST { actor: "scraper.grok", input: { prompt, country, mode } } with your x-api-token, read full_response for the answer, and chart web_search_results and x_search_results as separate citation signals. Hold the mode constant, pin the country, store the conversation_id, and the same client scales from one prompt to a scheduled multi-market monitoring program.
FAQ
Q: Is scraping Grok answers legal?
The actor captures publicly rendered answer content. Rules vary by jurisdiction and by the platform's terms of service — review the relevant ToS and consult counsel for your use case, especially before redistributing captures. Never collect personal data protected under GDPR or CCPA.
Q: How do I authenticate?
Every request carries x-api-token: <your key>. One account key covers scraper.grok and every other Scrapeless actor. Create a key on the free plan at app.scrapeless.com.
Q: Do I need a proxy?
No. Residential egress and geo-routing are built into the actor; the required country input is the whole configuration.
Q: Why is mode required?
Grok's reasoning depth materially changes the answer, so the actor makes it explicit instead of defaulting silently. In code the values are the API enums — MODEL_MODE_FAST, MODEL_MODE_EXPERT, MODEL_MODE_AUTO.
Q: How do I separate web citations from X citations?
They already arrive separated: web_search_results holds the open-web pages, x_search_results holds the X posts. Read each array directly.
Q: Can I run this without an SDK or AI agent?
Yes. It is plain HTTP — curl, Python requests, Node fetch, or any HTTP client works directly against POST /api/v2/scraper/execute.
Q: Does my ChatGPT capture code work for Grok?
The auth, endpoint, and envelope are identical. Change the actor name, add the required mode and country, and map the task_result keys (full_response instead of result_text, the two panels instead of content_references).
Ready to Build Your AI-Answer Data Pipeline?
Join our community to claim a free plan and connect with developers building AI-answer pipelines: Discord · Telegram.
Sign up at app.scrapeless.com for free trial credits, and point the scraper.grok actor at the prompts, modes, and markets your monitoring program needs.
At Scrapeless, we only access publicly available data while strictly complying with applicable laws, regulations, and website privacy policies. The content in this blog is for demonstration purposes only and does not involve any illegal or infringing activities. We make no guarantees and disclaim all liability for the use of information from this blog or third-party links. Before engaging in any scraping activities, consult your legal advisor and review the target website's terms of service or obtain the necessary permissions.



