Skip to main content
POST
/
v1
/
extract
Extract
curl --request POST \
  --url https://sdk.nimbleway.com/v1/extract \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "url": "https://www.example.com",
  "render": true,
  "country": "US",
  "locale": "en-US"
}
'
{
  "url": "https://www.example.com/",
  "task_id": "e8ed8ef6-2657-43ba-98d5-a5c79ea7b551",
  "status": "success",
  "status_code": 200,
  "data": {
    "html": "...",
    "markdown": "MARKDOWN",
    "parsing": {},
    "cookies": {},
    "screenshot": "iVBORw0KGgoAAAANSUhEUgAAA...",
    "fetch_request": [],
    "network_capture": [],
    "browser_actions": [],
    "headers": {}
  },
  "metadata": {
    "query_time": "2026-02-09T10:26:05.817Z",
    "query_duration": 1877,
    "response_parameters": {
      "input_url": "https://www.example.com/"
    },
    "driver": "vx8"
  }
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Request body model for the /extract endpoint

url
string<uri>
required

Target URL to scrape

Example:

"https://example.com/page"

country
string

Country used to access the target URL, use ISO Alpha-2 Codes

Example:

"US"

state
string

State used to access the target URL (US and CA only), use ISO Alpha-2 Codes

Example:

"NY"

city
string

City used to access the target URL

Example:

"new_york"

locale
string

LCID standard locale used for the URL request. Alternatively, user can use 'auto' for automatic locale based on geo-location

Example:

"en-US"

render
boolean

Whether to render JavaScript content using a browser

Example:

true

parse
boolean

Whether to parse the response content

parser
object

Custom extraction recipe defining what data to extract and how to structure it. Each property represents a field in the output.

formats
enum<string>[]

Response format

Available options:
html,
markdown
Example:
["html", "markdown"]
driver
enum<string>

Browserless drivers available for web extraction

Available options:
vx6,
vx8,
vx8-pro,
vx10,
vx10-pro
Example:

"vx8"

network_capture
object[]

Intercept and capture network requests made by the page

browser_actions
object[]

Array of actions to perform sequentially during browser rendering

Examples:
{ "wait": "2s" }
{
"click": { "selector": "#load-more", "timeout": "5s" }
}
browser

Browser type to emulate

Available options:
chrome,
firefox
Example:

"chrome"

os
enum<string>

Operating system to emulate

Available options:
windows,
mac os,
linux,
android,
ios
Example:

"windows"

no_userbrowser
boolean

Whether to disable browser-based rendering

Example:

false

device
enum<string>

Device type for browser emulation

Available options:
desktop,
mobile,
tablet
Example:

"desktop"

tag
string

User-defined tag for request identification

Example:

"campaign-2024-q1"

is_xhr
boolean

Whether to emulate XMLHttpRequest behavior

Example:

true

http2
boolean

Whether to use HTTP/2 protocol

Example:

true

expected_status_codes
integer[]

Expected HTTP status codes for successful requests

Required range: -9007199254740991 <= x <= 9007199254740991
Examples:

200

201

referrer_type

Referrer policy for the request

Available options:
random,
no-referer,
same-origin
Example:

"no-referrer"

method
enum<string>

HTTP method for the request

Available options:
GET,
POST,
PUT,
PATCH,
DELETE
Example:

"GET"

render_options
object

Response

Successful Response

url
string<uri>
required

The URL that was extracted

task_id
string<uuid>
required

Unique identifier for the extraction task

status
enum<string>
required

Status of the extraction

Available options:
success,
failed
status_code
integer
required

HTTP status code from the target website

data
object
required

Data from the extraction

metadata
object
required

Metadata from the extraction

warnings
string[]

List of warnings