Quick Start
- Async
- Batch
- Run extractions without blocking your application
- Process large volumes of URLs efficiently
- Deliver results to cloud storage (S3 / GCS) automatically
- Receive webhook notifications when tasks complete
- Integrate extraction into scheduled or queued workflows
How it works
Submit a request
Send a POST request to the async or batch endpoint. The API returns immediately with a `task_id` (async) or `batch_id` (batch); there is no waiting for extraction to finish.

Nimble processes in the background
Extraction runs asynchronously. For batch, each URL becomes an independent task processed in parallel.
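A minimal sketch of this submit-and-return flow in Python. The base URL, submission path, and bearer-token auth scheme are assumptions for illustration, not taken from this page:

```python
import json
import urllib.request

API = "https://api.nimbleway.com"   # assumed base URL
SUBMIT_PATH = "/v1/extract/async"   # assumed submission path

def submit(url: str, token: str) -> str:
    """POST an extraction request; return the task_id from the immediate response."""
    req = urllib.request.Request(
        API + SUBMIT_PATH,
        data=json.dumps({"url": url}).encode(),
        headers={
            "Authorization": f"Bearer {token}",  # auth scheme assumed
            "Content-Type": "application/json",
        },
    )
    # The API answers right away with a task_id; extraction continues server-side.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["task_id"]
```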
Async
Submit a single URL and receive a `task_id` immediately. Nimble processes the extraction in the background; retrieve results via polling, webhook, or cloud storage.
Parameters
Accepts all parameters from the Extract API, plus async-specific delivery options:
- `url` (required): The webpage to extract
- `formats`: Output formats: `html`, `markdown`, `text`, `screenshot`
- `render`: Enable JavaScript rendering
- `driver`: Extraction engine: `vx6`, `vx8`, `vx10`, etc.
- `country`, `state`, `city`: Geo-targeting
- `parse`, `parser`: Structured data extraction
- `browser_actions`, `network_capture`: Advanced interactions
`storage_type`
Storage provider for results. When specified, results are saved to your cloud storage instead of Nimble's servers. Options: `s3` (Amazon S3), `gs` (Google Cloud Storage)

`storage_url`
Bucket path where results will be saved. Results are stored as `{task_id}.json` at the specified location. Format: `s3://your-bucket/path/prefix/`

`storage_compress`
Compress results with GZIP before saving. When `true`, results are saved as `{task_id}.json.gz`.

`storage_object_name`
Custom filename for the stored object instead of the default task ID. Example: `"my-custom-name"` saves as `my-custom-name.json`

`callback_url`
Webhook URL that receives a POST request when the task completes. Nimble sends task metadata (without result data) to this URL when extraction finishes. Example: `https://your-api.com/webhook/complete`

Examples
Submit a URL and receive a `task_id` immediately. All three delivery methods below return the same initial response; the difference is how you retrieve results once the task completes.
Example 1: Basic async extraction
Poll the task endpoint to check status and retrieve results when complete.
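A polling sketch using the documented `GET /v1/tasks/{task_id}` and `GET /v1/tasks/{task_id}/results` endpoints; the base URL is a placeholder and auth handling is omitted for brevity:

```python
import json
import time
import urllib.request

API = "https://api.nimbleway.com"  # assumed base URL; add your auth header in practice

def task_path(task_id: str) -> str:
    return f"/v1/tasks/{task_id}"

def get_json(path: str) -> dict:
    with urllib.request.urlopen(API + path) as resp:
        return json.load(resp)

def wait_for_results(task_id: str, interval: float = 5.0) -> dict:
    """Poll the task until it reaches a terminal state, then fetch results."""
    while True:
        state = get_json(task_path(task_id))["state"]
        if state == "success":
            return get_json(task_path(task_id) + "/results")
        if state == "error":
            raise RuntimeError(f"task {task_id} failed")
        time.sleep(interval)  # still pending: wait and poll again
```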
Example 2: Cloud storage delivery
Results are saved automatically to your bucket as the task completes. No need to poll; the file appears at `storage_url/storage_object_name.json.gz` when done.
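An illustrative request body for this delivery mode, using the storage parameters described above; the bucket path and object name are placeholders:

```python
import json

# Illustrative delivery options; bucket and object name are placeholders.
payload = {
    "url": "https://example.com/products",
    "storage_type": "s3",                         # or "gs" for Google Cloud Storage
    "storage_url": "s3://your-bucket/extractions/",
    "storage_compress": True,                     # result saved as {name}.json.gz
    "storage_object_name": "products-page",       # overrides the default task-ID filename
}
print(json.dumps(payload, indent=2))
```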
Example 3: Webhook notification
Nimble sends a POST request with task metadata (without result data) to your `callback_url` when the task completes.
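A minimal receiver sketch using Python's standard library. The exact fields Nimble sends are not specified on this page, so `task_id` and `state` below are assumptions:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body Nimble posts on completion.
        length = int(self.headers.get("Content-Length", 0))
        meta = json.loads(self.rfile.read(length) or b"{}")
        # Field names are assumptions; inspect a real callback to confirm.
        print(f"task {meta.get('task_id')} finished with state {meta.get('state')}")
        self.send_response(200)
        self.end_headers()

# To run locally: HTTPServer(("", 8000), WebhookHandler).serve_forever()
```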
No polling is required; your server receives the notification automatically.

Status & Results
When polling, the typical flow is:
- Poll `GET /v1/tasks/{task_id}` until `state: "success"`
- Call `GET /v1/tasks/{task_id}/results` to retrieve the extracted data
| State | Description |
|---|---|
| `pending` | Task queued, waiting to start |
| `success` | Extraction complete, results available |
| `error` | Extraction failed |
Retrieve results
Check a task
List all tasks (paginated)
Batch
Submit up to 1,000 URLs in a single request. Each URL runs as an independent async task. Use `shared_inputs` to apply common settings across all URLs; individual items in `inputs` can override any shared value.
Parameters
`inputs` — Required
Array of per-URL extraction requests. Supports up to 1,000 items per batch. Each item accepts all Core extraction parameters; `url` is the only required field per item. Per-item values override anything set in `shared_inputs`.

`shared_inputs`
Parameters set in `shared_inputs` are applied as defaults to all items in `inputs`. Any value set inside an individual item overrides the shared default.

Examples
Example 1: Collect data from multiple URLs
Extract several unique URLs with results delivered to S3 and a webhook callback on completion:
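A sketch of the request body for this example; the bucket path, webhook URL, and page URLs are placeholders:

```python
import json

# Shared delivery settings are applied to every URL in the batch.
batch = {
    "shared_inputs": {
        "storage_type": "s3",
        "storage_url": "s3://your-bucket/batch-results/",
        "callback_url": "https://your-api.com/webhook/complete",
    },
    "inputs": [
        {"url": "https://example.com/page-1"},
        {"url": "https://example.com/page-2"},
        {"url": "https://example.com/page-3"},
    ],
}
print(json.dumps(batch, indent=2))
```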
Example 2: Multiple URLs from multiple countries
Set a different country per URL. Items without a country fall back to the shared default (`CA`):
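An illustrative request body for per-item overrides, using the parameter spellings above; items without `country` inherit the shared `CA` default:

```python
import json

# Shared default country is CA; two items override it per URL.
batch = {
    "shared_inputs": {"country": "CA"},
    "inputs": [
        {"url": "https://example.com/a", "country": "US"},
        {"url": "https://example.com/b", "country": "GB"},
        {"url": "https://example.com/c"},  # no country set: falls back to CA
    ],
}
print(json.dumps(batch, indent=2))
```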
Example 3: Same URL from multiple countries
Set the URL once in `shared_inputs` and vary only the country per item; useful for geo-comparison. The response returns a `batch_id` and the initial task list.
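A sketch of the request body for this geo-comparison pattern; the target URL and country list are placeholders:

```python
import json

# The URL is set once in shared_inputs; each item varies only the country.
batch = {
    "shared_inputs": {"url": "https://example.com/pricing"},
    "inputs": [{"country": c} for c in ("US", "GB", "DE", "JP")],
}
print(json.dumps(batch, indent=2))
```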
Status & Results
When polling, the typical flow is:
- Poll `/v1/batches/{batch_id}/progress` until `completed: true`
- Fetch `/v1/batches/{batch_id}` to get all task IDs and states
- For each `success` task, call `GET /v1/tasks/{task_id}/results`
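The three polling steps above can be sketched in Python. The batch endpoints are as documented; the base URL and the task-list field name (`tasks`) are assumptions, and auth is omitted for brevity:

```python
import json
import time
import urllib.request

API = "https://api.nimbleway.com"  # assumed base URL; add your auth header in practice

def get_json(path: str) -> dict:
    with urllib.request.urlopen(API + path) as resp:
        return json.load(resp)

def progress_path(batch_id: str) -> str:
    return f"/v1/batches/{batch_id}/progress"

def collect_batch(batch_id: str, interval: float = 10.0) -> dict:
    """Poll until the batch completes, then return results keyed by task_id."""
    # 1. Poll the lightweight progress endpoint until the batch completes.
    while not get_json(progress_path(batch_id)).get("completed"):
        time.sleep(interval)
    # 2. Fetch the full batch for task IDs and states
    #    (the "tasks" field name is an assumption).
    batch = get_json(f"/v1/batches/{batch_id}")
    # 3. Fetch results for each successful task.
    return {
        t["task_id"]: get_json(f"/v1/tasks/{t['task_id']}/results")
        for t in batch["tasks"]
        if t["state"] == "success"
    }
```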
| State | Description |
|---|---|
| `pending` | Task queued, waiting to start |
| `in_progress` | Task is currently being processed |
| `success` | Extraction complete, results available |
| `error` | Extraction failed |
Poll for batch completion
Call `/v1/batches/{batch_id}/progress` repeatedly until `completed: true`. This is a lightweight endpoint; use it for polling.

Fetch the full batch details
Once `completed: true`, fetch the batch details to get all task IDs, states, and download URLs.

Retrieve results per task
Iterate over the task list and call `GET /v1/tasks/{task_id}/results` for each `success` task.

List all batches
Data Retention
Results are retained for a limited time; see the expiration table below. For longer retention, use cloud storage (`storage_url`) to persist results indefinitely.
| Item | Expiration |
|---|---|
| Pending tasks | 24 hours if not started |
| Completed results | 24–48 hours (indefinite with cloud storage) |
| Failed tasks | 24 hours |
API Reference
Tasks APIs
Check the status of a single async task
Batch APIs
Retrieve the full task list and states for a batch