When you submit a request to /extract/async, /extract/batch, /agent/async, or /crawl, you have three flexible options for receiving your results. Choose the method that best fits your infrastructure and workflow.
Polling
Pull results on-demand using task IDs
Callbacks
Receive push notifications when tasks complete
Cloud Delivery
Automatic delivery to your S3 or GCS bucket
Option 1: Polling (Pull)
The simplest approach - submit your async request, receive a task ID, and poll for results when ready.
Submit async request
Send a request to the async endpoint. You’ll receive a task or crawl ID to track your request.
- Extract
- Agent
- Crawl
- Batch
Polling endpoints reference
| API | Submit | Check Status | Get Results |
|---|---|---|---|
| Extract | POST /v1/extract/async | GET /v1/tasks/{task_id} | GET /v1/tasks/{task_id}/results |
| Agent | POST /v1/agent/async | GET /v1/tasks/{task_id} | GET /v1/tasks/{task_id}/results |
| Batch | POST /v1/extract/batch | GET /v1/batches/{batch_id}/progress | GET /v1/tasks/{task_id}/results (per task) |
| Crawl | POST /v1/crawl | GET /v1/crawl/{crawl_id} | GET /v1/tasks/{task_id}/results (per page) |
To list all tasks, use GET /v1/tasks (supports cursor and limit for pagination). To list all batches, use GET /v1/batches.
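The endpoints in the table can be strung together into a simple polling loop. The sketch below is a minimal illustration, not the official client: the base URL is a placeholder, the `status` field and its terminal values (`completed`, `failed`) are assumptions about the response shape, and `fetch_status` is a hypothetical callable that you would back with your own HTTP client.

```python
import time
from typing import Callable, Dict

# Paths come from the polling endpoints table; the host is a placeholder.
BASE_URL = "https://api.example.com"  # assumption: replace with the real API host


def status_url(api: str, identifier: str) -> str:
    """Return the status URL for a task, batch, or crawl ID."""
    paths = {
        "extract": "/v1/tasks/{id}",
        "agent": "/v1/tasks/{id}",
        "batch": "/v1/batches/{id}/progress",
        "crawl": "/v1/crawl/{id}",
    }
    return BASE_URL + paths[api].format(id=identifier)


def results_url(task_id: str) -> str:
    """Return the results URL for a single task."""
    return f"{BASE_URL}/v1/tasks/{task_id}/results"


def poll_until_done(
    fetch_status: Callable[[], Dict],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Call fetch_status() until it reports a terminal state or the timeout passes.

    Assumption: the status payload has a "status" key whose terminal values
    are "completed" or "failed". Adjust this to the real response shape.
    """
    deadline = clock() + timeout
    while True:
        payload = fetch_status()
        if payload.get("status") in ("completed", "failed"):
            return payload
        if clock() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop easy to test without real waiting. Once the loop returns a completed status, fetch the payload from `results_url(task_id)`.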
Option 2: Webhooks (Push)
Get notified automatically when your tasks complete. Perfect for event-driven architectures.
Submit request with callback URL
Include callback_url (or a callback object for crawl) in your async request.
- Extract
- Agent
- Crawl
Webhook configuration options
| API | Parameter | Type | Description |
|---|---|---|---|
| Extract | callback_url | string | Your callback URL |
| Agent | params.callback_url | string | Your callback URL |
| Crawl | callback.url | string | Your callback URL |
| | callback.headers | object | Custom headers for authentication |
| | callback.metadata | object | Custom data included in payload |
| | callback.events | array | Filter events: started, page, completed, failed |
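Because each API expects the callback in a different place, a small helper can keep request construction consistent. The sketch below follows the parameter names in the table; the function name `add_callback` and any other body fields passed in are hypothetical, and the assumption that `headers`, `metadata`, and `events` apply only to the crawl callback object is inferred from the table layout.

```python
from typing import Any, Dict, List, Optional


def add_callback(
    api: str,
    body: Dict[str, Any],
    url: str,
    headers: Optional[Dict[str, str]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    events: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Return a copy of `body` with the callback placed where each API expects it."""
    request = dict(body)  # shallow copy so the caller's dict is not mutated
    if api == "extract":
        request["callback_url"] = url
    elif api == "agent":
        # Agent requests carry the URL inside the params object.
        params = dict(request.get("params", {}))
        params["callback_url"] = url
        request["params"] = params
    elif api == "crawl":
        # Crawl requests use a callback object with optional extras.
        callback: Dict[str, Any] = {"url": url}
        if headers:
            callback["headers"] = headers
        if metadata:
            callback["metadata"] = metadata
        if events:
            callback["events"] = events
        request["callback"] = callback
    else:
        raise ValueError(f"unknown API: {api}")
    return request
```

For example, `add_callback("crawl", body, "https://hooks.example.com/nimble", events=["completed", "failed"])` would limit notifications to the two terminal events.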
Option 3: Cloud Delivery (Async API)
Applies to async API requests - /extract/async, /extract/batch, /agent/async, /crawl. For Nimble Jobs, see Option 4: Jobs cloud delivery below.
Amazon S3
Deliver to any S3 bucket in your AWS account
Google Cloud Storage
Deliver to any GCS bucket in your GCP project
Configure bucket permissions (one-time)
Grant Nimble’s service account write access to your bucket.
- Amazon S3
- Google Cloud Storage
Nimble Service User ARN:
Add this bucket policy:
KMS-Encrypted Buckets
For KMS-encrypted buckets, add this to your KMS key policy:
Submit request with storage config
Include storage_type and storage_url in your request.
Cloud delivery parameters
| Parameter | Type | Description |
|---|---|---|
| storage_type | s3 or gs | Cloud provider |
| storage_url | string | Bucket path with prefix (e.g., s3://bucket/prefix/) |
| storage_compress | boolean | Enable GZIP compression |
| storage_object_name | string | Custom filename (default: task ID) |
- Extract
- Agent
- S3
- GCS
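The cloud delivery parameters can be assembled and sanity-checked before a request is sent. The sketch below is an illustration under stated assumptions: the helper name `storage_params` is hypothetical, the `gs://` scheme for Google Cloud Storage is assumed (the page only shows an `s3://` example), and the `url` field in the sample body stands in for whatever the target endpoint actually requires.

```python
from typing import Any, Dict, Optional

# Assumption: each storage_type pairs with a URL scheme of the same name.
# The s3:// form appears in the parameter table; gs:// for GCS is assumed.
_SCHEMES = {"s3": "s3://", "gs": "gs://"}


def storage_params(
    storage_type: str,
    storage_url: str,
    compress: bool = False,
    object_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the cloud delivery fields to merge into an async request body."""
    if storage_type not in _SCHEMES:
        raise ValueError("storage_type must be 's3' or 'gs'")
    if not storage_url.startswith(_SCHEMES[storage_type]):
        raise ValueError(f"storage_url should start with {_SCHEMES[storage_type]}")
    params: Dict[str, Any] = {
        "storage_type": storage_type,
        "storage_url": storage_url,
    }
    if compress:
        params["storage_compress"] = True
    if object_name is not None:
        params["storage_object_name"] = object_name
    return params


# Hypothetical request body: "url" is a placeholder for the endpoint's own fields.
request_body = {
    "url": "https://example.com",
    **storage_params("s3", "s3://my-bucket/nimble/", compress=True),
}
```

Validating the scheme locally catches a mismatched `storage_type` and `storage_url` before the task is submitted, rather than after delivery fails.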
Option 4: Jobs cloud delivery
Applies to Nimble Jobs only - used as the input source and/or destination for a Job. For async API requests, see Option 3 above.
Nimble’s IAM principal
Every Job runs under a single IAM user. Use that user's ARN as the Principal in the bucket policy.
Required permissions
| Prefix | Permission |
|---|---|
| s3://YOUR_BUCKET (whole bucket) | s3:ListBucket |
| s3://YOUR_BUCKET/input/* | s3:GetObject |
| s3://YOUR_BUCKET/output/* | s3:GetObject, s3:PutObject, s3:AbortMultipartUpload, s3:ListMultipartUploadParts |
s3:AbortMultipartUpload and s3:ListMultipartUploadParts are required because large output files are written in parts. If an upload fails mid-way, Nimble aborts the incomplete multipart upload so partial bytes do not accumulate in the bucket.
s3:DeleteObject is not requested. Nimble never deletes files in the bucket - including the connection-test probe described below.
Bucket policy template
Replace YOUR_BUCKET with the bucket name. Adjust the input/ and output/ prefixes to match the paths used in the Job form.
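The required permissions table translates into a bucket policy document along the following lines. This is a sketch generated from that table, not Nimble's official template: the function name `jobs_bucket_policy` and the `Sid` labels are hypothetical, and `NIMBLE_IAM_USER_ARN` is a placeholder for the principal ARN, which is not reproduced here.

```python
import json
from typing import Any, Dict


def jobs_bucket_policy(
    bucket: str,
    principal_arn: str,
    input_prefix: str = "input/",
    output_prefix: str = "output/",
) -> Dict[str, Any]:
    """Build an S3 bucket policy granting the permissions in the table above."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    principal = {"AWS": principal_arn}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Listing applies to the bucket itself, not to object keys.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {   # Read-only access to input files.
                "Sid": "ReadInput",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:GetObject"],
                "Resource": f"{bucket_arn}/{input_prefix}*",
            },
            {   # Read and write output, including multipart upload handling.
                "Sid": "ReadWriteOutput",
                "Effect": "Allow",
                "Principal": principal,
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts",
                ],
                "Resource": f"{bucket_arn}/{output_prefix}*",
            },
        ],
    }


# Placeholder values: substitute the real bucket name and principal ARN.
policy = jobs_bucket_policy("YOUR_BUCKET", "NIMBLE_IAM_USER_ARN")
print(json.dumps(policy, indent=2))
```

Note that no statement grants `s3:DeleteObject`, matching the permissions listed above.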
Applying the policy
Test Connection - what it does
| Mode | Operation | Verifies |
|---|---|---|
| Read (input path) | head_bucket + list one object under the prefix | s3:ListBucket + s3:GetObject |
| Write (output path) | head_bucket + put_object of an empty .nimble-connection-test file at the prefix | s3:ListBucket + s3:PutObject |
The write test leaves an empty .nimble-connection-test file under the output prefix. The file is not deleted - the policy does not grant s3:DeleteObject, so cleanup is not possible. The key is deterministic, so repeated tests overwrite the same object and at most one residue file exists per output prefix. Remove it manually at any time.
Interpreting the result
| Chip | Meaning | Fix |
|---|---|---|
| ✓ Connection OK | Permissions correct. The prefix has files. | Ready to use. |
| ✓ Connection OK - prefix is empty | Permissions correct. The prefix has no files yet. | Upload the first input file. |
| ✗ Access denied | The bucket policy is missing a required permission. | Re-apply the template above. |
| ✗ Bucket does not exist | The bucket name in the path is wrong. | Verify the bucket name. |
| ✗ Connection failed | Network, throttling, or unexpected S3 error. | Retry. Contact support if it persists. |
Comparison
| Feature | Polling | Webhooks | Cloud Delivery |
|---|---|---|---|
| Setup complexity | None | Requires endpoint | Requires bucket setup |
| Real-time notifications | No (you poll) | Yes | No |
| Automatic storage | No | No | Yes |
| Best for | Simple integrations, testing | Event-driven apps | Data pipelines, ETL |
| Infrastructure needed | None | Web server | Cloud storage bucket |
Combining methods
You can combine delivery methods for redundancy:
Best Practices
Polling
- Check status first - Use /tasks/{id} before fetching full results
- Use reasonable intervals - Poll every 2-5 seconds, not continuously
- Handle rate limits - Implement retry logic for 429 responses
- Set timeouts - Most tasks complete within seconds to minutes
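The rate-limit advice above can be sketched as a retry wrapper with exponential backoff. This is one common approach rather than a documented requirement: `send` is a hypothetical callable returning `(status_code, parsed_body)` from your HTTP client, and the delay constants are illustrative (the 2-second base mirrors the lower end of the suggested polling interval).

```python
import random
import time
from typing import Callable, Tuple


def get_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 2.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() and retry with exponential backoff while it returns HTTP 429.

    Any other status is returned to the caller unchanged, so errors such as
    4xx or 5xx responses still need to be handled by the calling code.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Double the wait each time, cap it, and add up to 10% jitter
            # so that many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("still rate limited after retries")
```

If the API returns a `Retry-After` header on 429 responses, honoring it would be preferable to a computed delay; whether it does is not stated here.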
Webhooks
- Use HTTPS - Always use secure endpoints
- Verify authenticity - Use custom headers for authentication
- Respond quickly - Return 200 OK immediately, process async
- Handle retries - Nimble retries failed deliveries
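A receiver that follows these practices can be sketched with the Python standard library. The header name `X-Webhook-Token`, the shared secret, and the port are all assumptions chosen for illustration; they correspond to whatever custom callback headers you configure, and the payload is treated as opaque JSON because its schema is not described here.

```python
import hmac
import json
import queue
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Mapping

# Assumptions: the header name and secret are values you chose when
# configuring custom callback headers; neither is defined by the API.
AUTH_HEADER = "X-Webhook-Token"
SHARED_SECRET = "change-me"

# Deliveries wait here until a separate worker processes them.
work_queue: "queue.Queue[dict]" = queue.Queue()


def accept_delivery(headers: Mapping[str, str], raw_body: bytes) -> int:
    """Authenticate a delivery, queue it for later processing, return an HTTP status."""
    token = headers.get(AUTH_HEADER, "")
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(token.encode(), SHARED_SECRET.encode()):
        return 401
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return 400
    work_queue.put(payload)  # no heavy work on the request path
    return 200


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status = accept_delivery(self.headers, self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver; terminate TLS in front of it, e.g. at a reverse proxy."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Call `serve()` to start listening. Because failed deliveries are retried, the worker that drains `work_queue` should tolerate receiving the same payload more than once.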
Cloud Delivery
- Use prefixes - Organize by date, project, or type
- Enable compression - Use storage_compress: true for large files
- Set lifecycle policies - Auto-delete old files to manage costs
- Use custom names - storage_object_name for meaningful filenames
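As one way to apply the prefix and naming advice, delivery fields can be derived from a project name and a date so that lifecycle rules can target whole date ranges. The layout `project/YYYY/MM/DD/` and the helper name `dated_storage` are illustrative choices, not a convention required by the API.

```python
from datetime import date
from typing import Dict


def dated_storage(bucket: str, project: str, day: date, task_label: str) -> Dict[str, object]:
    """Build S3 delivery fields with a project/YYYY/MM/DD/ prefix and a readable name."""
    prefix = f"s3://{bucket}/{project}/{day:%Y/%m/%d}/"
    return {
        "storage_type": "s3",
        "storage_url": prefix,
        "storage_compress": True,           # GZIP, useful for large results
        "storage_object_name": task_label,  # replaces the default task ID name
    }
```

For instance, `dated_storage("my-bucket", "pricing", date(2024, 3, 7), "competitor-a")` produces the prefix `s3://my-bucket/pricing/2024/03/07/`.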
Next Steps
Async Extract
Learn about async extraction options
Crawl API
Deep website crawling with async delivery
Agent Gallery
Browse available search agents
Rate Limits
Understand API rate limits