Extract

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Request body model for the /extract endpoint

url

string<uri>

required

Target URL to scrape

Example:

"https://example.com/page"

country

string

Country used to access the target URL, use ISO Alpha-2 Codes

Example:

"US"

state

string

State used to access the target URL (US and CA only), use ISO Alpha-2 Codes

Example:

"NY"

city

string

City used to access the target URL

Example:

"new_york"

locale

string

LCID standard locale used for the URL request. Alternatively, user can use 'auto' for automatic locale based on geo-location

Example:

"en-US"

render

boolean

Whether to render JavaScript content using a browser

Example:

true

parse

boolean

Whether to parse the response content

parser

object

Custom extraction recipe defining what data to extract and how to structure it. Each property represents a field in the output.

Show child attributes

formats

enum<string>[]

Response format

Available options:

html,

markdown,

screenshot,

headers,

links

Example:

["html", "markdown"]

driver

enum<string>

Browserless drivers available for web extraction

Available options:

vx6,

vx8,

vx8-pro,

vx10,

vx10-pro

Example:

"vx8"

network_capture

object[]

Intercept and capture network requests made by the page

Show child attributes

browser_actions

object[]

Array of actions to perform sequentially during browser rendering

Show child attributes

Examples:

{ "wait": "2s" }

{
  "click": { "selector": "#load-more", "timeout": "5s" }
}

browser

Browser type to emulate

Available options:

chrome,

firefox

Example:

"chrome"

enum<string>

Operating system to emulate

Available options:

windows,

mac os,

linux,

android,

ios

Example:

"windows"

no_userbrowser

boolean

Whether to disable browser-based rendering

Example:

false

device

enum<string>

Device type for browser emulation

Available options:

desktop,

mobile,

tablet

Example:

"desktop"

tag

string

User-defined tag for request identification

Example:

"campaign-2024-q1"

is_xhr

boolean

Whether to emulate XMLHttpRequest behavior

Example:

true

http2

boolean

Whether to use HTTP/2 protocol

Example:

true

expected_status_codes

integer[]

Expected HTTP status codes for successful requests

Required range: -9007199254740991 <= x <= 9007199254740991

Examples:

200

201

referrer_type

Referrer policy for the request

Available options:

random,

no-referer,

same-origin

Example:

"no-referrer"

method

enum<string>

HTTP method for the request

Available options:

GET,

POST,

PUT,

PATCH,

DELETE

Example:

"GET"

render_options

object

Show child attributes

Response

Successful Response

url

string<uri>

required

The URL that was extracted

task_id

string<uuid>

required

Unique identifier for the extraction task

status

enum<string>

required

Status of the extraction

Available options:

success,

failed

status_code

integer

required

HTTP status code from the target website

data

object

required

Data from the extraction

Show child attributes

metadata

object

required

Metadata from the extraction

Show child attributes

warnings

string[]

List of warnings

Agents API

Extract API

Search API

Map API

Crawl API

Tasks & Batches API

Authorizations

Body

Response