Use async requests when you need to extract data from multiple pages, run long-running operations, or integrate extraction into background job systems. Async requests return immediately with a task ID, letting you check status and retrieve results later.

When to use async

Use async extract for:
  • Batch processing: Extract data from hundreds or thousands of URLs
  • Background jobs: Integrate extraction into scheduled or queued workflows
  • Long-running operations: Handle complex browser actions or slow-loading sites
  • Webhook integration: Get notified when extraction completes
  • Cloud storage: Save results directly to S3 or Google Cloud Storage
Async requests are ideal for high-volume extraction where you don’t need immediate results. For single-page extractions where you need results right away, use the synchronous Extract API.

How it works

1. Submit extraction request

Send a POST request to /v1/extract/async with your extraction parameters. The API returns immediately with a task ID. Optionally include:
  • callback_url to receive a webhook notification when complete
  • storage_url and storage_type (s3/gs) to save results directly to cloud storage

2. Track task status

Use the task ID to check progress at /v1/tasks/{task_id}. The task moves from pending to running, then to completed or failed.

3. Retrieve results

Once complete, fetch results from /v1/tasks/{task_id}/results or from your configured cloud storage.

API endpoint

POST https://sdk.nimbleway.com/v1/extract/async
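
For example, you can submit via raw HTTP with Python's requests library (a sketch; the Bearer auth scheme shown is an assumption, so use whatever credentials your account requires):
import requests

API_KEY = "YOUR_API_KEY"

# Submit an async extraction; the response carries the task metadata
response = requests.post(
    "https://sdk.nimbleway.com/v1/extract/async",
    headers={"Authorization": f"Bearer {API_KEY}"},  # auth scheme assumed
    json={
        "url": "https://www.example.com",
        "render": True,
        "formats": ["html"],
    },
    timeout=30,
)
task = response.json()["task"]
print(task["id"], task["status_url"])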

Parameters

Async extract accepts all the same parameters as the synchronous extract endpoint, plus optional async-specific parameters:

Core extraction parameters

All parameters from the Extract API are supported:
  • url (required) - The webpage to extract
  • formats - Output formats (html, markdown, text, screenshot)
  • render - Enable JavaScript rendering
  • driver - Choose extraction engine (vx6, vx8, vx10, etc.)
  • country, state, city - Geo-targeting options
  • parse - Enable parsing with schemas
  • parser - Define extraction schema
  • browser_actions - Automate interactions
  • network_capture - Capture network requests
  • And all other extract parameters…

Async-specific parameters

storage_type
string
Storage provider for results. Use s3 for Amazon S3 or gs for Google Cloud Storage. When specified, results are automatically saved to your cloud storage instead of Nimble’s servers. Options: s3, gs
storage_url
string
Repository URL where results will be saved. Format: s3://Your.Bucket.Name/path/prefix/
Results are saved as {TASK_ID}.json in the specified location. Example: s3://my-bucket/nimble-results/
storage_compress
boolean
default:"false"
Compress results using GZIP before saving to storage. Reduces storage costs and transfer time. When true, results are saved as {TASK_ID}.json.gz
storage_object_name
string
Custom name for the stored object instead of the default task ID. Example: "my-custom-name" saves as my-custom-name.json
callback_url
string
Webhook URL to receive a POST request when the task completes. The API sends task metadata (without result data) to this URL when extraction finishes. Example: https://your-api.com/webhooks/extract-complete
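
Putting these together, a complete async request body might look like this sketch (the bucket and webhook URL are placeholders):
payload = {
    # Core extraction parameters (same as the synchronous Extract API)
    "url": "https://www.example.com/products",
    "render": True,
    "formats": ["html", "markdown"],
    # Async-specific parameters
    "storage_type": "s3",
    "storage_url": "s3://my-bucket/nimble-results/",  # placeholder bucket
    "storage_compress": True,  # results saved as {TASK_ID}.json.gz
    "callback_url": "https://your-api.com/webhooks/extract-complete",
}

Pass this dict to the SDK call shown below, or send it as the JSON body of a raw POST to /v1/extract/async.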

Response format

The async endpoint returns immediately with task information:
{
  "status": "success",
  "task": {
    "id": "8e8cfde8-345b-42b8-b3e2-0c61eb11e00f",
    "state": "pending",
    "status_url": "https://sdk.nimbleway.com/v1/tasks/8e8cfde8-345b-42b8-b3e2-0c61eb11e00f",
    "created_at": "2026-01-24T12:36:24.685Z",
    "modified_at": "2026-01-24T12:36:24.685Z",
    "input": {},
    "api_type": "extract",
    "download_url": "https://sdk.nimbleway.com/v1/tasks/8e8cfde8-345b-42b8-b3e2-0c61eb11e00f/results"
  }
}

Task states

State        Description
pending      Task queued, waiting to start
running      Extraction in progress
completed    Extraction finished successfully, results available
failed       Extraction failed, check error details
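
Only completed and failed are terminal; pending and running mean "poll again". A minimal dispatch sketch using the SDK helpers from the examples below:
def handle_task(nimble, task_id):
    """Return results if done, raise on failure, None if still in progress (sketch)."""
    status = nimble.get_task_status(task_id)
    if status["state"] == "completed":
        return nimble.get_task_results(task_id)
    if status["state"] == "failed":
        raise RuntimeError(status.get("error", "Unknown error"))
    return None  # pending or running: check back later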

Example usage

Basic async extraction

from nimble_python import Nimble

nimble = Nimble(api_key="YOUR_API_KEY")

# Submit async extraction
# Note: `async` is a reserved word in Python, so the method name is
# assumed to be `async_` here; check the SDK for the exact name.
response = nimble.extract.async_({
    "url": "https://www.example.com",
    "render": True,
    "formats": ["html", "markdown"]
})

task_id = response['task']['id']
print(f"Task created: {task_id}")

# Check status
import time
while True:
    status = nimble.get_task_status(task_id)
    print(f"Status: {status['state']}")

    if status['state'] == 'completed':
        # Get results
        results = nimble.get_task_results(task_id)
        print(results['html'][:200])
        break
    elif status['state'] == 'failed':
        print(f"Task failed: {status.get('error')}")
        break

    time.sleep(2)

Batch processing

Extract data from multiple URLs:
from nimble_python import Nimble

nimble = Nimble(api_key="YOUR_API_KEY")

urls = [
    "https://www.example.com/page1",
    "https://www.example.com/page2",
    "https://www.example.com/page3"
]

# Submit all tasks
task_ids = []
for url in urls:
    response = nimble.extract.async_({
        "url": url,
        "render": True,
        "formats": ["html"]
    })
    task_ids.append(response['task']['id'])
    print(f"Submitted: {url}{response['task']['id']}")

# Wait for all to complete
import time
completed = []
while len(completed) < len(task_ids):
    for task_id in task_ids:
        if task_id in completed:
            continue

        status = nimble.get_task_status(task_id)
        if status['state'] == 'completed':
            completed.append(task_id)
            print(f"Completed: {task_id}")
        elif status['state'] == 'failed':
            completed.append(task_id)
            print(f"Failed: {task_id}")

    if len(completed) < len(task_ids):
        time.sleep(5)

# Retrieve all results
for task_id in task_ids:
    results = nimble.get_task_results(task_id)
    print(f"\n{task_id}:\n{results['html'][:100]}")

Save to cloud storage

Store results directly in Amazon S3:
from nimble_python import Nimble

nimble = Nimble(api_key="YOUR_API_KEY")

response = nimble.extract.async_({
    "url": "https://www.example.com",
    "render": True,
    "formats": ["html", "markdown"],
    "storage_type": "s3",
    "storage_url": "s3://my-bucket/nimble-extracts/",
    "storage_compress": True,
    "storage_object_name": "example-com-extraction"
})

task_id = response['task']['id']
print(f"Task created: {task_id}")
print(f"Results will be saved to: s3://my-bucket/nimble-extracts/example-com-extraction.json.gz")

# Results are automatically saved to S3 when complete
# You can still check status and retrieve from Nimble's servers if needed
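
Once the task finishes, you can read the compressed object back with boto3 (a sketch; the bucket and key mirror the storage parameters above, and AWS credentials are assumed to be configured):
import gzip
import json

import boto3

s3 = boto3.client("s3")

# Key follows storage_url + storage_object_name + the .json.gz suffix
obj = s3.get_object(
    Bucket="my-bucket",
    Key="nimble-extracts/example-com-extraction.json.gz",
)
results = json.loads(gzip.decompress(obj["Body"].read()))
print(results.keys())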

Webhook notifications

Get notified when extraction completes:
from nimble_python import Nimble

nimble = Nimble(api_key="YOUR_API_KEY")

response = nimble.extract.async_({
    "url": "https://www.example.com",
    "render": True,
    "formats": ["html"],
    "callback_url": "https://your-api.com/webhooks/extract-complete"
})

task_id = response['task']['id']
print(f"Task created: {task_id}")
print("Webhook will be called when extraction completes")

# Your webhook endpoint will receive:
# POST https://your-api.com/webhooks/extract-complete
# {
#   "task": {
#     "id": "8e8cfde8-345b-42b8-b3e2-0c61eb11e00f",
#     "state": "completed",
#     "api_type": "extract",
#     "download_url": "https://sdk.nimbleway.com/v1/tasks/8e8cfde8-345b-42b8-b3e2-0c61eb11e00f/results"
#   }
# }
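
A minimal receiver for that payload might look like this Flask sketch (Flask is one option among many; because the webhook omits result data, the handler fetches it from download_url, with the auth scheme assumed):
import requests
from flask import Flask, request

app = Flask(__name__)
API_KEY = "YOUR_API_KEY"

@app.route("/webhooks/extract-complete", methods=["POST"])
def extract_complete():
    task = request.get_json()["task"]
    if task["state"] == "completed":
        # Webhook payload has no result data; fetch it separately
        results = requests.get(
            task["download_url"],
            headers={"Authorization": f"Bearer {API_KEY}"},  # auth scheme assumed
            timeout=30,
        ).json()
        # ... process results ...
    return "", 204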

Checking task status

Use the Tasks API to check status:
GET https://sdk.nimbleway.com/v1/tasks/{task_id}
Response:
{
  "task": {
    "id": "8e8cfde8-345b-42b8-b3e2-0c61eb11e00f",
    "state": "completed",
    "created_at": "2026-01-24T12:36:24.685Z",
    "modified_at": "2026-01-24T12:37:15.123Z",
    "api_type": "extract",
    "download_url": "https://sdk.nimbleway.com/v1/tasks/8e8cfde8-345b-42b8-b3e2-0c61eb11e00f/results"
  }
}

Retrieving results

Once the task is complete, retrieve results:
GET https://sdk.nimbleway.com/v1/tasks/{task_id}/results
The response format is identical to the synchronous extract endpoint.
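
With raw HTTP this is a single GET (auth scheme assumed, as in the earlier sketches):
import requests

task_id = "8e8cfde8-345b-42b8-b3e2-0c61eb11e00f"
results = requests.get(
    f"https://sdk.nimbleway.com/v1/tasks/{task_id}/results",
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # auth scheme assumed
    timeout=30,
).json()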

Best practices

Polling intervals

Don’t poll too frequently - it wastes resources and may trigger rate limits:
# ✅ Good: Reasonable polling interval
time.sleep(5)  # Check every 5 seconds

# ❌ Bad: Excessive polling
time.sleep(0.5)  # Checking twice per second
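
For tasks that may run for minutes, exponential backoff is a reasonable middle ground; a sketch:
import time

def wait_for_task(nimble, task_id, initial=2.0, cap=30.0, max_wait=600.0):
    """Poll with exponential backoff until the task reaches a terminal state."""
    delay, waited = initial, 0.0
    while waited < max_wait:
        status = nimble.get_task_status(task_id)
        if status["state"] in ("completed", "failed"):
            return status
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, cap)  # 2s, 4s, 8s, ... capped at 30s
    raise TimeoutError(f"Task {task_id} did not finish within {max_wait}s")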

Error handling

Always handle failed tasks:
status = nimble.get_task_status(task_id)

if status['state'] == 'failed':
    error = status.get('error', 'Unknown error')
    print(f"Task failed: {error}")
    # Implement retry logic or alert
elif status['state'] == 'completed':
    results = nimble.get_task_results(task_id)
    # Process results

Batch size limits

Don’t submit thousands of tasks simultaneously:
# ✅ Good: Submit in batches
from itertools import islice

def batch(iterable, size):
    iterator = iter(iterable)
    while chunk := list(islice(iterator, size)):
        yield chunk

urls = [...]  # Large list of URLs
for url_batch in batch(urls, 100):
    for url in url_batch:
        nimble.extract.async_({"url": url})
    time.sleep(10)  # Pause between batches

# ❌ Bad: Submit all at once
for url in urls:  # 10,000+ URLs
    nimble.extract.async_({"url": url})

Result retention

Results are typically retained for 24-48 hours. If you need longer retention:
  • Use cloud storage (storage_url) to persist results indefinitely
  • Download results promptly after completion
  • Implement your own archival system (see the sketch below)
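
For example, a minimal archival step that writes each result to local disk as soon as its task completes:
import json
from pathlib import Path

def archive_results(nimble, task_id, out_dir="archive"):
    """Download results and persist them before the retention window closes (sketch)."""
    results = nimble.get_task_results(task_id)
    path = Path(out_dir)
    path.mkdir(parents=True, exist_ok=True)
    out_file = path / f"{task_id}.json"
    out_file.write_text(json.dumps(results))
    return out_file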

Limitations

Task expiration

  • Pending tasks: Expire after 24 hours if not started
  • Completed results: Available for 24-48 hours (unless using cloud storage)
  • Failed tasks: Retry data available for 24 hours

Concurrent tasks

Concurrent task limits vary by plan:
Plan         Max Concurrent Tasks
Starter      10
Growth       50
Pro          100
Enterprise   Custom
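
To stay under your plan's ceiling, cap the number of in-flight tasks; a sketch (assumes the async_ method name from the examples above):
import time

MAX_CONCURRENT = 10  # match your plan's limit

def submit_bounded(nimble, urls):
    """Keep at most MAX_CONCURRENT tasks pending/running at once (sketch)."""
    urls, in_flight, done = list(urls), [], []
    while urls or in_flight:
        # Move finished tasks out of the in-flight set
        still_running = []
        for task_id in in_flight:
            state = nimble.get_task_status(task_id)["state"]
            (done if state in ("completed", "failed") else still_running).append(task_id)
        in_flight = still_running
        # Top up to the concurrency ceiling
        while urls and len(in_flight) < MAX_CONCURRENT:
            response = nimble.extract.async_({"url": urls.pop(0)})
            in_flight.append(response["task"]["id"])
        time.sleep(5)  # poll interval between sweeps
    return done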

Request limits

All extraction parameters have the same limits as synchronous requests:
  • Maximum timeout: 180000ms (3 minutes)
  • Maximum URL length: 2048 characters
  • Maximum cookies: 100 per request
  • Maximum headers: 50 per request