Scrapegraphai API
scrapegraph_start_smartscraper
Extract content from a webpage using AI by providing a natural language prompt and a URL.
Pricing
Per call: $0.08
Model: flat
Pay only for what you use. No subscriptions.
Inputs
headers (object)
website_markdown (string)
steps (string)
cookies (object)
user_prompt * (string)
total_pages (number)
render_heavy_js (boolean)
number_of_scrolls (number)
website_html (string)
website_url * (string)
stealth (boolean)
output_schema (object)
mock (boolean)
Input Parameters
headers: Optional custom HTTP headers to send with the request. Useful for setting User-Agent, cookies, authentication tokens, and other request metadata. Example: {"User-Agent": "Mozilla/5.0...", "Cookie": "session=abc123"}
website_markdown: Raw Markdown content to process directly (max 2MB). Mutually exclusive with website_url and website_html. Suited to extracting structured data from Markdown documentation, README files, or any content already in Markdown format.
steps: Optional array of interaction steps to perform on the webpage before extraction. Each step is a string describing the action to take (e.g., “click on filter button”, “wait for results to load”). Example: ["click on search button", "type query in search box", "wait for results"]
cookies: Optional cookies object for authentication and session management. Useful for accessing authenticated pages or maintaining session state. Example: {"session_id": "abc123", "auth_token": "xyz789"}
user_prompt: Natural language description of the information you want to extract from the webpage.
total_pages: Optional parameter to enable pagination and scrape multiple pages. Specifies the number of pages to extract data from. Default: 1. Range: 1-100.
number_of_scrolls: Optional parameter for infinite-scroll pages. Specifies how many times to scroll down to load more content before extraction. Default: 0. Range: 0-50.
website_html: Raw HTML content to process directly (max 2MB). Mutually exclusive with website_url and website_markdown. Useful when you already have HTML content cached or want to process modified HTML.
website_url: The URL of the webpage you want to extract information from. You must provide exactly one of: website_url, website_html, or website_markdown.
output_schema: Optional schema to structure the output. If provided, the AI will attempt to format the results according to this schema.
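The constraints described above (exactly one content source, the 2MB cap on raw content, and the ranges for total_pages and number_of_scrolls) can be checked on the client before a paid call is made. The following is a minimal sketch, not the service's own client: the helper name build_smartscraper_input is hypothetical, "2MB" is assumed to mean 2 * 1024 * 1024 UTF-8 bytes, and the JSON-Schema-like shape used for output_schema is an assumption, since the listing does not specify a schema dialect. It only builds and validates the input object; how that object is sent depends on the platform.

```python
import json
from typing import Any, Optional

# Assumption: "max 2MB" is read as 2 MiB of UTF-8 encoded bytes.
MAX_RAW_BYTES = 2 * 1024 * 1024

# Optional inputs from the listing that this sketch passes through unchecked.
OPTIONAL_FIELDS = {
    "headers", "cookies", "steps", "output_schema",
    "render_heavy_js", "stealth", "mock",
}


def build_smartscraper_input(
    user_prompt: str,
    *,
    website_url: Optional[str] = None,
    website_html: Optional[str] = None,
    website_markdown: Optional[str] = None,
    total_pages: int = 1,
    number_of_scrolls: int = 0,
    **optional: Any,
) -> dict:
    """Hypothetical helper: validate inputs and return the input object."""
    if not user_prompt.strip():
        raise ValueError("user_prompt must not be empty")

    # Exactly one of the three content sources must be provided.
    sources = {
        "website_url": website_url,
        "website_html": website_html,
        "website_markdown": website_markdown,
    }
    provided = {k: v for k, v in sources.items() if v is not None}
    if len(provided) != 1:
        raise ValueError(
            "provide exactly one of website_url, website_html, website_markdown"
        )

    # Raw HTML or Markdown is capped in size.
    for name in ("website_html", "website_markdown"):
        raw = provided.get(name)
        if raw is not None and len(raw.encode("utf-8")) > MAX_RAW_BYTES:
            raise ValueError(f"{name} exceeds the 2MB limit")

    if not 1 <= total_pages <= 100:
        raise ValueError("total_pages must be between 1 and 100")
    if not 0 <= number_of_scrolls <= 50:
        raise ValueError("number_of_scrolls must be between 0 and 50")

    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown input(s): {sorted(unknown)}")

    payload = {
        "user_prompt": user_prompt,
        **provided,
        "total_pages": total_pages,
        "number_of_scrolls": number_of_scrolls,
    }
    # Include optional inputs only when they were actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


# Example input; the URL and the schema contents are illustrative only.
example = build_smartscraper_input(
    "List every product name and price",
    website_url="https://example.com/products",
    number_of_scrolls=3,
    output_schema={
        "type": "object",
        "properties": {
            "products": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "price": {"type": "string"},
                    },
                },
            }
        },
    },
)
print(json.dumps(example, indent=2))
```

Rejecting an invalid combination locally avoids spending a per-call charge on a request that the input rules above already rule out.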

