Extract data in the background with async requests, or submit multiple URLs at once with batch. Both modes return immediately without blocking your application while Nimble processes the work.
Quick Start
- Async
- Batch
- Run extractions without blocking your application
- Process large volumes of URLs efficiently
- Deliver results to cloud storage (S3 / GCS) automatically
- Receive webhook notifications when tasks complete
- Integrate extraction into scheduled or queued workflows
How it works
Submit a request
Send a POST request to the async or batch endpoint. The API returns immediately with a task_id (async) or batch_id (batch), without waiting for extraction to finish.
Nimble processes in the background
Extraction runs asynchronously. For batch, each URL becomes an independent task processed in parallel.
Async
Submit a single URL and receive a task_id immediately. Nimble processes the extraction in the background; retrieve results via polling, webhook, or cloud storage.
Parameters
Accepts all parameters from the Extract API, plus async-specific delivery options:
- url (required): The webpage to extract
- formats: Output formats (html, markdown, text, screenshot)
- render: Enable JavaScript rendering
- driver: Extraction engine (vx6, vx8, vx10, etc.)
- country, state, city: Geo-targeting
- parse, parser: Structured data extraction
- browser_actions, network_capture: Advanced interactions
storage_type
Storage provider for results. When specified, results are saved to your cloud storage instead of Nimble’s servers. Options: s3 (Amazon S3), gs (Google Cloud Storage)
storage_url
Bucket path where results will be saved. Results are stored as {task_id}.json at the specified location. Format: s3://your-bucket/path/prefix/
storage_compress
Compress results with GZIP before saving. When true, results are saved as {task_id}.json.gz.
storage_object_name
Custom filename for the stored object instead of the default task ID. Example: "my-custom-name" saves as my-custom-name.json
callback_url
Webhook URL to receive a POST request when the task completes. Nimble sends task metadata (without result data) to this URL when extraction finishes. Example: https://your-api.com/webhook/complete
Examples
Submit a URL and receive a task_id immediately. All three delivery methods below return the same initial response; the difference is how you retrieve results once the task completes.
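To make the three delivery methods concrete, here is a minimal sketch that assembles the JSON body for each one from the parameters documented above. The helper name `build_async_payload` and the example URLs are hypothetical, and the sketch stops short of sending the request because the endpoint path and authentication are not covered in this section.

```python
from typing import Any, Optional


def build_async_payload(
    url: str,
    *,
    storage_type: Optional[str] = None,
    storage_url: Optional[str] = None,
    storage_compress: bool = False,
    storage_object_name: Optional[str] = None,
    callback_url: Optional[str] = None,
    **extract_params: Any,
) -> dict[str, Any]:
    """Assemble a request body from the documented async parameters (hypothetical helper)."""
    if storage_type is not None and storage_type not in ("s3", "gs"):
        raise ValueError("storage_type must be 's3' or 'gs'")
    # Extract API parameters such as render or country pass through unchanged.
    payload: dict[str, Any] = {"url": url, **extract_params}
    optional = {
        "storage_type": storage_type,
        "storage_url": storage_url,
        "storage_object_name": storage_object_name,
        "callback_url": callback_url,
    }
    # Only include delivery options the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    if storage_compress:
        payload["storage_compress"] = True
    return payload


# 1. Basic: no delivery options, results are retrieved by polling.
polling = build_async_payload("https://example.com/page", render=True)

# 2. Cloud storage delivery.
to_bucket = build_async_payload(
    "https://example.com/page",
    storage_type="s3",
    storage_url="s3://your-bucket/path/prefix/",
    storage_compress=True,
    storage_object_name="my-custom-name",
)

# 3. Webhook notification.
with_webhook = build_async_payload(
    "https://example.com/page",
    callback_url="https://your-api.com/webhook/complete",
)
```

Each dictionary would be serialized as the JSON body of the POST request; only the delivery-related keys differ between the three.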
Example 1: Basic async extraction
Poll the task endpoint to check status and retrieve results when complete.
Example 2: Cloud storage delivery
Results are saved automatically to your bucket once the task completes. No need to poll: the file appears at storage_url/storage_object_name.json.gz when done.
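The naming rules for stored objects can be sketched as a small helper that predicts where a result file should appear. It only encodes the rules stated in the parameter descriptions ({task_id}.json by default, a .gz suffix when storage_compress is true, and storage_object_name replacing the task ID); the function name and the task ID "abc123" are hypothetical.

```python
from typing import Optional


def expected_object_path(
    storage_url: str,
    task_id: str,
    *,
    compress: bool = False,
    object_name: Optional[str] = None,
) -> str:
    """Predict the stored object's location from the documented naming rules."""
    # Treat storage_url as a prefix and make sure it ends with a slash.
    base = storage_url if storage_url.endswith("/") else storage_url + "/"
    # storage_object_name, when given, replaces the default task ID.
    stem = object_name if object_name else task_id
    suffix = ".json.gz" if compress else ".json"
    return f"{base}{stem}{suffix}"


default_path = expected_object_path("s3://your-bucket/path/prefix/", "abc123")
custom_path = expected_object_path(
    "s3://your-bucket/path/prefix",
    "abc123",
    compress=True,
    object_name="my-custom-name",
)
```

A consumer could watch for `custom_path` in the bucket instead of polling the task endpoint.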
Example 3: Webhook notification
Nimble sends a POST with task metadata to your callback_url when the task completes. No polling required; your server receives the notification automatically.
Status & Results
When polling, the typical flow is:
- Poll GET /v1/tasks/{task_id} until state: "success"
- Call GET /v1/tasks/{task_id}/results to retrieve the extracted data
| State | Description |
|---|---|
| pending | Task queued, waiting to start |
| success | Extraction complete, results available |
| error | Extraction failed |
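The polling flow and the states in the table above can be sketched as a loop that stops on success or error. The HTTP calls are injected as callables so the sketch makes no assumptions about base URL or authentication: `fetch_status` stands in for GET /v1/tasks/{task_id} and `fetch_results` for GET /v1/tasks/{task_id}/results. The function name, the interval, and the timeout are illustrative choices, not documented values.

```python
import time
from typing import Any, Callable


def wait_for_task(
    task_id: str,
    fetch_status: Callable[[str], dict[str, Any]],
    fetch_results: Callable[[str], Any],
    *,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll a task until it succeeds, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_status(task_id).get("state")
        if state == "success":
            # Results are fetched from the separate results endpoint.
            return fetch_results(task_id)
        if state == "error":
            raise RuntimeError(f"task {task_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {state!r} after {timeout}s")
        # Any other state (for example pending) means keep waiting.
        sleep(interval)
```

In real use, the two callables would wrap your HTTP client of choice; passing `sleep` as a parameter keeps the loop easy to test without waiting.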
Retrieve results
Check a task
List all tasks (paginated)
Batch
Submit up to 1,000 URLs in a single request. Each URL runs as an independent async task. Use shared_inputs to apply common settings across all URLs; individual items in inputs can override any shared value.
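When a job has more URLs than the 1,000-item limit, the list has to be split across several batch requests. A minimal sketch of that client-side chunking follows; the helper name `chunk_inputs` and the example URLs are hypothetical.

```python
from typing import Iterator, Sequence

MAX_BATCH_ITEMS = 1000  # per-request limit stated above


def chunk_inputs(
    urls: Sequence[str], size: int = MAX_BATCH_ITEMS
) -> Iterator[list[dict[str, str]]]:
    """Yield inputs arrays of at most `size` items, one per batch request."""
    if not 1 <= size <= MAX_BATCH_ITEMS:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_ITEMS}")
    for start in range(0, len(urls), size):
        # url is the only field every item needs.
        yield [{"url": u} for u in urls[start:start + size]]


urls = [f"https://example.com/p/{i}" for i in range(2500)]
chunks = list(chunk_inputs(urls))
```

Each element of `chunks` can be sent as the inputs array of its own batch request, yielding one batch_id per chunk.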
Parameters
inputs (required)
Array of per-URL extraction requests. Supports up to 1,000 items per batch. Each item accepts all Core extraction parameters; url is the only required field per item. Per-item values override anything set in shared_inputs.
shared_inputs
Examples
Parameters set in shared_inputs are applied as defaults to all items in inputs. Any value set inside an individual item overrides the shared default.
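The override rule can be previewed locally before submitting a batch. The sketch below models it as a shallow dictionary merge in which item keys win; how the service treats nested values is not described here, so treat that part as an assumption. The helper name `effective_inputs` is hypothetical.

```python
from typing import Any


def effective_inputs(
    shared_inputs: dict[str, Any], inputs: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Preview the settings each item ends up with (shallow merge, item wins)."""
    # Later keys win in a dict merge, so per-item values override shared ones.
    return [{**shared_inputs, **item} for item in inputs]


preview = effective_inputs(
    {"render": True, "driver": "vx8"},
    [
        {"url": "https://example.com/a"},
        {"url": "https://example.com/b", "driver": "vx10"},
    ],
)
```

Here the first item inherits both shared values, while the second keeps render from the shared defaults and replaces driver with its own.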
Example 1: Collect data from multiple URLs
Extract several unique URLs with results delivered to S3 and a webhook callback on completion:
Example 2: Multiple URLs from multiple countries
Set a different country per URL. Items without a country fall back to the shared default (CA).
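A request body for this case might look like the sketch below. The shared_inputs and inputs keys and the CA default come from the text above; the example URLs and the other country codes are placeholders, and the endpoint the body is posted to is not shown.

```python
import json

# Shared default country is CA; two items override it, one inherits it.
batch_payload = {
    "shared_inputs": {"country": "CA"},
    "inputs": [
        {"url": "https://example.com/us-page", "country": "US"},
        {"url": "https://example.com/de-page", "country": "DE"},
        {"url": "https://example.com/default-page"},
    ],
}

# Serialized form that would be sent as the POST body.
body = json.dumps(batch_payload)

# Country each item resolves to once the shared default is applied.
countries = [
    item.get("country", batch_payload["shared_inputs"]["country"])
    for item in batch_payload["inputs"]
]
```

The third item carries no country of its own, so it resolves to the shared CA default.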
Example 3: Same URL from multiple countries
Set the URL once in shared_inputs and vary only the country per item, which is useful for geo-comparison. The response returns the batch_id and the initial task list.
Status & Results
When polling, the typical flow is:
- Poll /v1/batches/{batch_id}/progress until completed: true
- Fetch /v1/batches/{batch_id} to get all task IDs and states
- For each success task, call GET /v1/tasks/{task_id}/results
| State | Description |
|---|---|
| pending | Task queued, waiting to start |
| in_progress | Task is currently being processed |
| success | Extraction complete, results available |
| error | Extraction failed |
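The three-step batch flow can be sketched as a single function. The endpoint paths, the completed flag, and the state values come from the text above; the response field names `tasks` and `task_id` are assumptions about the batch details payload, and `get` is an injected callable standing in for an authenticated GET that returns parsed JSON.

```python
import time
from typing import Any, Callable


def collect_batch_results(
    batch_id: str,
    get: Callable[[str], dict[str, Any]],
    *,
    interval: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[dict[str, Any], list[str]]:
    """Wait for a batch, then fetch results for every successful task."""
    # Step 1: poll the lightweight progress endpoint.
    for _ in range(max_polls):
        if get(f"/v1/batches/{batch_id}/progress").get("completed") is True:
            break
        sleep(interval)
    else:
        raise TimeoutError(f"batch {batch_id} not completed after {max_polls} polls")

    # Step 2: fetch the full batch details ("tasks" is an assumed field name).
    batch = get(f"/v1/batches/{batch_id}")

    # Step 3: fetch results for success tasks, collect the rest as failed.
    results: dict[str, Any] = {}
    failed: list[str] = []
    for task in batch.get("tasks", []):
        task_id = task["task_id"]  # assumed field name
        if task.get("state") == "success":
            results[task_id] = get(f"/v1/tasks/{task_id}/results")
        else:
            failed.append(task_id)
    return results, failed
```

Returning the failed task IDs alongside the results lets the caller decide whether to resubmit those URLs.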
Poll for batch completion
Call /v1/batches/{batch_id}/progress repeatedly until completed: true. This is a lightweight endpoint, so use it for polling.
Fetch the full batch details
Once completed: true, fetch the batch details to get all task IDs, states, and download URLs.
Retrieve results per task
Iterate over the task list and call GET /v1/tasks/{task_id}/results for each success task.
List all batches
Data Retention
Results are retained for 7 days. For longer retention, use cloud storage (storage_url) to persist results indefinitely.
| Item | Expiration |
|---|---|
| Pending tasks | 24 hours if not started |
| Completed results | 24–48 hours (indefinite with cloud storage) |
| Failed tasks | 24 hours |
API Reference
Tasks APIs
Check the status of a single async task
Batch APIs
Retrieve the full task list and states for a batch