When using async operations like /extract/async, /extract/batch, /agent/async, or /crawl, you have three flexible options for receiving your results. Choose the method that best fits your infrastructure and workflow.

Polling

Pull results on-demand using task IDs

Callbacks

Receive push notifications when tasks complete

Cloud Delivery

Automatic delivery to your S3 or GCS bucket

Option 1: Polling (Pull)

The simplest approach - submit your async request, receive a task ID, and poll for results when ready.
1. Submit async request

Send a request to the async endpoint. You’ll receive a task or crawl ID to track your request.
from nimble_python import Nimble

nimble = Nimble(api_key="YOUR-API-KEY")

response = nimble.extract_async(
    url="https://www.nimbleway.com",
    render=True,
    formats=["html", "markdown"]
)

task_id = response.task_id
print(f"Task submitted: {task_id}")
2. Check status

Poll the status endpoint to monitor progress.
import time

while True:
    my_task = nimble.tasks.get(task_id)
    print(f"Status: {my_task.task.state}")

    if my_task.task.state == "success":
        break
    elif my_task.task.state == "error":
        print(f"Task failed: {my_task.task.error}")
        break

    time.sleep(15)
3. Retrieve results

Once complete, fetch the full results.
results = nimble.tasks.results(task_id)

print(f"HTML length: {len(results.data.html)}")
print(f"Markdown length: {len(results.data.markdown)}")

Example response:
{
    "url": "https://www.nimbleway.com/blog/post",
    "task_id": "ec89b1f7-1cf2-40eb-91b4-78716093f9ed",
    "status": "success",
    "task": {
        "id": "ec89b1f7-1cf2-40eb-91b4-78716093f9ed",
        "state": "success",
        "created_at": "2026-02-09T23:15:43.549Z",
        "modified_at": "2026-02-09T23:16:39.094Z",
        "account_name": "your-account"
    },
    "data": {
        "html": "<!DOCTYPE html>...",
        "markdown": "# Page Title\n\nContent...",
        "headers": { ... }
    },
    "metadata": {
        "query_time": "2026-02-09T23:15:43.549Z",
        "query_duration": 1877,
        "response_parameters": {
            "input_url": "https://www.nimbleway.com/blog/post"
        },
        "driver": "vx6"
    },
    "status_code": 200
}

Polling endpoints reference

| API | Submit | Check Status | Get Results |
| --- | --- | --- | --- |
| Extract | POST /v1/extract/async | GET /v1/tasks/{task_id} | GET /v1/tasks/{task_id}/results |
| Agent | POST /v1/agent/async | GET /v1/tasks/{task_id} | GET /v1/tasks/{task_id}/results |
| Batch | POST /v1/extract/batch | GET /v1/batches/{batch_id}/progress | GET /v1/tasks/{task_id}/results (per task) |
| Crawl | POST /v1/crawl | GET /v1/crawl/{crawl_id} | GET /v1/tasks/{task_id}/results (per page) |

To list all tasks across your account, use GET /v1/tasks (supports cursor and limit for pagination). To list all batches, use GET /v1/batches.
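
The polling loop in step 2 runs until the task reaches a terminal state, so a stuck task would keep it running forever. A minimal sketch of the same loop with a deadline follows. `wait_for_task`, its parameters, and the `"pending"` state used in the usage note are illustrative names, not part of the Nimble SDK; only the `success` and `error` states come from the example above.

```python
import time
from typing import Callable

# The two end states used by the polling example above.
TERMINAL_STATES = {"success", "error"}


def wait_for_task(
    get_state: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_state() until it returns a terminal state or the deadline passes.

    get_state is any zero-argument callable that returns the current task
    state as a string (hypothetical hook; wire it to your own client call).
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        # Stop if the next sleep would carry us past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"task still '{state}' after {timeout_s} seconds")
        sleep(interval_s)
```

With the client from step 1, this could be called as `wait_for_task(lambda: nimble.tasks.get(task_id).task.state)`. The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting.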

Option 2: Webhooks (Push)

Get notified automatically when your tasks complete. Perfect for event-driven architectures.
1. Submit request with callback URL

Include callback_url (params.callback_url for Agent, or a callback object for Crawl) in your async request.
from nimble_python import Nimble

nimble = Nimble(api_key="YOUR-API-KEY")

response = nimble.extract_async(
    url="https://www.nimbleway.com",
    render=True,
    formats=["html", "markdown"],
    callback_url="https://your-server.com/webhooks/nimble"
)

task_id = response.task_id
print(f"Task submitted: {task_id}")
print("Results will be POSTed to your callback URL when ready")
2. Receive webhook notification

Nimble sends a POST to your callback URL when complete:
{
  "task": {
    "id": "8e8cfde8-345b-42b8-b3e2-0c61eb11e00f",
    "state": "success",
    "status_code": 200,
    "created_at": "2026-01-24T12:36:24.685Z",
    "modified_at": "2026-01-24T12:36:24.685Z",
    "input": {},
    "api_type": "extract"
  }
}

Webhook configuration options

| API | Parameter | Type | Description |
| --- | --- | --- | --- |
| Extract | callback_url | string | Your callback URL |
| Agent | params.callback_url | string | Your callback URL |
| Crawl | callback.url | string | Your callback URL |
| Crawl | callback.headers | object | Custom headers for authentication |
| Crawl | callback.metadata | object | Custom data included in payload |
| Crawl | callback.events | array | Filter events: started, page, completed, failed |
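
On the receiving side, a callback endpoint only needs to parse the notification and acknowledge it quickly. The sketch below uses Python's standard `http.server` and assumes the payload shape shown in the sample notification above. The handler name, the in-memory queue, and the `serve` helper with its port are hypothetical choices for illustration, not part of the Nimble SDK.

```python
import json
import queue
from http.server import BaseHTTPRequestHandler, HTTPServer

# Notifications wait here for a separate worker, so the handler never does slow work.
notifications: "queue.Queue[dict]" = queue.Queue()


def parse_notification(body: bytes) -> dict:
    """Pull out the fields shown in the sample webhook payload."""
    task = json.loads(body)["task"]
    return {
        "task_id": task["id"],
        "state": task["state"],
        "api_type": task.get("api_type"),
    }


class NimbleWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            notifications.put(parse_notification(self.rfile.read(length)))
        except (ValueError, KeyError, TypeError):
            self.send_response(400)  # body was not the expected JSON shape
        else:
            self.send_response(200)  # acknowledge now, process later
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Start the receiver (the port is an arbitrary example value)."""
    HTTPServer(("0.0.0.0", port), NimbleWebhookHandler).serve_forever()
```

A worker can then read from `notifications` and fetch the full results for each `task_id`. In production this endpoint would sit behind HTTPS and check whatever authentication header you configured.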

Option 3: Cloud Delivery (Async API)

Applies to async API requests - /extract/async, /extract/batch, /agent/async, /crawl. For Nimble Jobs, see Option 4: Jobs cloud delivery below.
Automatically deliver results directly to your cloud storage bucket.

Amazon S3

Deliver to any S3 bucket in your AWS account

Google Cloud Storage

Deliver to any GCS bucket in your GCP project
1. Configure bucket permissions (one-time)

Grant Nimble’s service account write access to your bucket.
Nimble Service User ARN:
arn:aws:iam::744254827463:user/webit-uploader
Add this bucket policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "NimbleCloudDelivery",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::744254827463:user/webit-uploader"
      },
      "Action": [
        "s3:PutObject",
        "s3:PutObjectACL",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::YOUR_BUCKET_NAME",
        "arn:aws:s3:::YOUR_BUCKET_NAME/*"
      ]
    }
  ]
}
Replace YOUR_BUCKET_NAME with your actual bucket name.
For KMS-encrypted buckets, add this to your KMS key policy:
{
  "Sid": "NimbleKMSAccess",
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::744254827463:user/webit-uploader"
  },
  "Action": [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey"
  ],
  "Resource": "*"
}
2. Submit request with storage config

Include storage_type and storage_url in your request.

Cloud delivery parameters

| Parameter | Type | Description |
| --- | --- | --- |
| storage_type | "s3" or "gs" | Cloud provider |
| storage_url | string | Bucket path with prefix (e.g., s3://bucket/prefix/) |
| storage_compress | boolean | Enable GZIP compression |
| storage_object_name | string | Custom filename (default: task ID) |

from nimble_python import Nimble

nimble = Nimble(api_key="YOUR-API-KEY")

response = nimble.extract_async(
    url="https://www.nimbleway.com",
    render=True,
    formats=["html", "markdown"],
    storage_type="s3",
    storage_url="s3://your-bucket/nimble-results/",
    storage_compress=True,
    storage_object_name="my-result"
)

task_id = response.task_id
print("Results will be saved to: s3://your-bucket/nimble-results/my-result.json.gz")
3. Results delivered automatically

When complete, results are written to your bucket as {task_id}.json (or .json.gz if compressed). If you set storage_object_name, that name replaces the task ID.
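
The naming rule can be sketched as a small helper that predicts where a result will land, plus a decoder for the downloaded bytes. Both functions are illustrative, not part of the Nimble SDK. They assume the object name is appended to the `storage_url` prefix with a single slash and that the delivered file contains JSON; neither detail is confirmed beyond the example path above.

```python
import gzip
import json
from typing import Optional


def expected_object_uri(
    storage_url: str,
    task_id: str,
    storage_object_name: Optional[str] = None,
    storage_compress: bool = False,
) -> str:
    """Predict the delivered object's URI from the request parameters."""
    name = storage_object_name or task_id  # the default filename is the task ID
    suffix = ".json.gz" if storage_compress else ".json"
    # Assumption: the name is joined to the prefix with exactly one slash.
    return storage_url.rstrip("/") + "/" + name + suffix


def decode_result(raw: bytes, compressed: bool) -> dict:
    """Decode a downloaded result file, assuming its content is JSON."""
    payload = gzip.decompress(raw) if compressed else raw
    return json.loads(payload)
```

For the request in step 2, `expected_object_uri("s3://your-bucket/nimble-results/", task_id, "my-result", True)` gives the same path that the example prints.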

Option 4: Jobs cloud delivery

Applies to Nimble Jobs only - used as the input source and/or destination for a Job. For async API requests, see Option 3 above.
A Job can read its input set from an S3 bucket and write its assembled results back to one. Both directions are wired through a single bucket policy that grants Nimble’s IAM principal access to specific prefixes. Nimble does not assume a role in the customer account.

Nimble’s IAM principal

Every Job runs under a single IAM user:
arn:aws:iam::744254827463:user/crawlit-scrapy
Use this ARN as the Principal in the bucket policy.

Required permissions

| Prefix | Permission |
| --- | --- |
| s3://YOUR_BUCKET (whole bucket) | s3:ListBucket |
| s3://YOUR_BUCKET/input/* | s3:GetObject |
| s3://YOUR_BUCKET/output/* | s3:GetObject, s3:PutObject, s3:AbortMultipartUpload, s3:ListMultipartUploadParts |

s3:AbortMultipartUpload and s3:ListMultipartUploadParts are required because large output files are written in parts. If an upload fails mid-way, Nimble aborts the incomplete multipart upload so partial bytes do not accumulate in the bucket.
s3:DeleteObject is not requested. Nimble never deletes files in the bucket - including the connection-test probe described below.

Bucket policy template

Replace YOUR_BUCKET with the bucket name. Adjust the input/ and output/ prefixes to match the paths used in the Job form.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "NimbleListBucket",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::744254827463:user/crawlit-scrapy" },
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::YOUR_BUCKET"
    },
    {
      "Sid": "NimbleReadInputPrefix",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::744254827463:user/crawlit-scrapy" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::YOUR_BUCKET/input/*"
    },
    {
      "Sid": "NimbleReadWriteOutputPrefix",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::744254827463:user/crawlit-scrapy" },
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::YOUR_BUCKET/output/*"
    }
  ]
}
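
If you manage several buckets or prefixes, the template can be generated rather than edited by hand. The following is a minimal sketch that reproduces the template above for a given bucket and prefix pair. `build_jobs_bucket_policy` is a hypothetical helper name, and the sketch assumes both prefixes are non-empty.

```python
import json

# Nimble's Jobs IAM principal, as listed above.
NIMBLE_JOBS_PRINCIPAL = "arn:aws:iam::744254827463:user/crawlit-scrapy"


def build_jobs_bucket_policy(
    bucket: str, input_prefix: str = "input/", output_prefix: str = "output/"
) -> dict:
    """Return the bucket policy template with the bucket and prefixes filled in."""

    def objects_under(prefix: str) -> str:
        cleaned = prefix.strip("/")
        if not cleaned:
            raise ValueError("prefix must not be empty")
        return f"arn:aws:s3:::{bucket}/{cleaned}/*"

    def allow(sid: str, action, resource: str) -> dict:
        return {
            "Sid": sid,
            "Effect": "Allow",
            "Principal": {"AWS": NIMBLE_JOBS_PRINCIPAL},
            "Action": action,
            "Resource": resource,
        }

    return {
        "Version": "2012-10-17",
        "Statement": [
            allow("NimbleListBucket", "s3:ListBucket", f"arn:aws:s3:::{bucket}"),
            allow("NimbleReadInputPrefix", "s3:GetObject", objects_under(input_prefix)),
            allow(
                "NimbleReadWriteOutputPrefix",
                ["s3:GetObject", "s3:PutObject",
                 "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts"],
                objects_under(output_prefix),
            ),
        ],
    }


# Print the policy for the default input/ and output/ prefixes.
print(json.dumps(build_jobs_bucket_policy("YOUR_BUCKET"), indent=2))
```

The printed JSON can be pasted into the bucket policy editor as described in the next section.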

Applying the policy

1. Open the bucket

In the AWS Console, open S3 and click the bucket Nimble should access.

2. Edit the bucket policy

Open the Permissions tab. Click Edit under Bucket policy.

3. Paste the template

Paste the JSON above. Replace YOUR_BUCKET and the prefixes. Save changes.

4. Verify in Nimble

Open the Job form. Paste the s3://... path and click Test Connection. A green chip confirms the policy is correct.
Keep Block Public Access enabled. The policy grants access only to Nimble’s IAM user, not to the public.

Test Connection - what it does

| Mode | Operation | Verifies |
| --- | --- | --- |
| Read (input path) | head_bucket + list one object under the prefix | s3:ListBucket + s3:GetObject |
| Write (output path) | head_bucket + put_object of an empty .nimble-connection-test file at the prefix | s3:ListBucket + s3:PutObject |

The write test leaves an empty .nimble-connection-test file under the output prefix. The file is not deleted: the policy does not grant s3:DeleteObject, so cleanup is not possible. The key is deterministic, so repeated tests overwrite the same object and leave at most one residue file per output prefix. You can remove it manually at any time.

Interpreting the result

| Chip | Meaning | Fix |
| --- | --- | --- |
| ✓ Connection OK | Permissions correct. The prefix has files. | Ready to use. |
| ✓ Connection OK - prefix is empty | Permissions correct. The prefix has no files yet. | Upload the first input file. |
| ✗ Access denied | The bucket policy is missing a required permission. | Re-apply the template above. |
| ✗ Bucket does not exist | The bucket name in the path is wrong. | Verify the bucket name. |
| ✗ Connection failed | Network, throttling, or unexpected S3 error. | Retry. Contact support if it persists. |
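
When the chip shows Access denied, it can help to check the saved policy against the required permissions before re-applying the template. The sketch below is a rough local check with a hypothetical helper name. It only compares action names in Allow statements for Nimble's principal; it ignores resources, wildcards, conditions, and Deny statements, so an empty result does not guarantee that Test Connection will pass.

```python
from typing import Set

NIMBLE_JOBS_PRINCIPAL = "arn:aws:iam::744254827463:user/crawlit-scrapy"

# Actions listed in the Required permissions table.
REQUIRED_ACTIONS = {
    "s3:ListBucket",
    "s3:GetObject",
    "s3:PutObject",
    "s3:AbortMultipartUpload",
    "s3:ListMultipartUploadParts",
}


def _as_list(value) -> list:
    """Policy fields may hold a single string or a list of strings."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def missing_actions(policy: dict, principal: str = NIMBLE_JOBS_PRINCIPAL) -> Set[str]:
    """Return the required actions that no Allow statement grants to the principal."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    granted: Set[str] = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal_field = stmt.get("Principal", {})
        if not isinstance(principal_field, dict):  # e.g. "*", not handled here
            continue
        if principal in _as_list(principal_field.get("AWS")):
            granted.update(_as_list(stmt.get("Action")))
    return REQUIRED_ACTIONS - granted
```

Load the policy with `json.loads` from the text in the bucket policy editor and pass it in; any action names returned are the ones to add back.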

Comparison

| Feature | Polling | Webhooks | Cloud Delivery |
| --- | --- | --- | --- |
| Setup complexity | None | Requires endpoint | Requires bucket setup |
| Real-time notifications | No (you poll) | Yes | No |
| Automatic storage | No | No | Yes |
| Best for | Simple integrations, testing | Event-driven apps | Data pipelines, ETL |
| Infrastructure needed | None | Web server | Cloud storage bucket |

Combining methods

You can combine delivery methods for redundancy:
from nimble_python import Nimble

nimble = Nimble(api_key="YOUR-API-KEY")

# Receive webhook AND store in S3
response = nimble.extract_async(
    url="https://www.nimbleway.com",
    formats=["html", "markdown"],
    callback_url="https://your-server.com/webhooks/nimble",
    storage_type="s3",
    storage_url="s3://your-bucket/results/"
)

Best Practices

Polling

  • Check status first - Use /tasks/{id} before fetching full results
  • Use reasonable intervals - Poll every 2-5 seconds, not continuously
  • Handle rate limits - Implement retry logic for 429 responses
  • Set timeouts - Most tasks complete within seconds to minutes

Webhooks

  • Use HTTPS - Always use secure endpoints
  • Verify authenticity - Use custom headers for authentication
  • Respond quickly - Return 200 OK immediately, process async
  • Handle retries - Nimble retries failed deliveries

Cloud Delivery

  • Use prefixes - Organize by date, project, or type
  • Enable compression - Use storage_compress: true for large files
  • Set lifecycle policies - Auto-delete old files to manage costs
  • Use custom names - storage_object_name for meaningful filenames
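
For the advice on handling 429 responses, one common approach is to retry with exponentially growing delays. The sketch below is transport-agnostic: `call_with_retry` and its parameters are hypothetical names, and `request` stands for any function of yours that performs one API call and returns a `(status_code, body)` pair. The delay values are illustrative, not limits published by Nimble.

```python
import time
from typing import Any, Callable, Tuple


def call_with_retry(
    request: Callable[[], Tuple[int, Any]],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call request() and retry on HTTP 429 with exponential backoff.

    Returns the first non-429 response, or the last 429 response once
    max_attempts calls have been made.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = request()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay_s * (2 ** attempt))
    return status, body
```

Adding random jitter to each delay is a common refinement when many workers poll at once; it is left out here to keep the sketch short.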

Next Steps

Async Extract

Learn about async extraction options

Crawl API

Deep website crawling with async delivery

Agent Gallery

Browse available search agents

Rate Limits

Understand API rate limits