Bulk content

POST /v1/content/bulk

Extract content from many URLs (or a sitemap) in one request. Results are delivered asynchronously by email or webhook.

curl -X POST https://api.chuger.com/v1/content/bulk \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://example.com/article-1",
      "https://example.com/article-2"
    ],
    "email": true,
    "callback_url": "https://hooks.your-app.com/chuger"
  }'

Response:

{
  "message": "Content processing job has been queued successfully",
  "job_id": "content-bulk-9f4e2a...",
  "urls_count": 2,
  "url_source": "urls",
  "delivery_method": "email, webhook",
  "estimated_completion_time": "1 minute",
  "status": "queued",
  "queued_at": "2026-05-13T14:00:00.000000Z"
}

POST https://api.chuger.com/v1/content/bulk

Use this when you have a list of URLs (or a sitemap) and want to extract them all without managing rate limits, pacing, or retries yourself.

Authentication

Bearer token in the Authorization header. See Authentication.

Cost

Bulk extraction is billed per URL successfully extracted. Each successful page costs the same as a single /v1/content request:

Plan      Credits per successful URL
Basic     2
Pro       2
Business  2

URLs that fail to extract are not charged. Before accepting the job, Chuger checks that you have enough credits to cover the worst case, in which every URL in the batch succeeds.

Always call /v1/credits/preview-cost before launching a large batch.
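
The pre-check can be approximated locally with simple arithmetic. This is a minimal sketch, assuming the flat rate of 2 credits per URL listed in the plan table; `worst_case_credits` and `can_afford` are hypothetical helper names, not part of the Chuger API, and the API's own check remains authoritative.

```python
CREDITS_PER_URL = 2  # flat rate listed for Basic, Pro, and Business


def worst_case_credits(url_count: int) -> int:
    """Credits needed if every URL in the batch extracts successfully."""
    if url_count < 0:
        raise ValueError("url_count must be non-negative")
    return url_count * CREDITS_PER_URL


def can_afford(url_count: int, available_credits: int) -> bool:
    """Local approximation of the pre-check (assumption: same flat rate)."""
    return available_credits >= worst_case_credits(url_count)
```

For a full batch of 100 URLs this gives 200 credits as the upper bound; the amount actually charged depends on how many URLs succeed.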

Request body

Content-Type: application/json

urls (string[])

1–100 HTTP/HTTPS URLs, max 180 characters each. Required unless sitemap is provided.

sitemap (string)

URL of an XML sitemap. Max 500 characters. Required unless urls is provided.

email (boolean)

Set to true to receive results by email. At least one of email or callback_url must be set.

callback_url (string)

Webhook URL that results are POSTed to when the batch completes. Max 500 characters. At least one of email or callback_url must be set.

sync (boolean)

If true and the resolved URL count is ≤ 10, the request is processed inline and returns 200 with the final state. Larger jobs automatically fall back to async.

You must specify at least one delivery method (email, callback_url, or both). When both are set, both fire. If you pass both urls and sitemap, urls takes precedence.
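
The constraints above can be checked client-side before sending, which avoids a round trip that would end in a 422. The following is a sketch only: `validate_bulk_payload` is a hypothetical helper, the limits are copied from the field descriptions, and the server may apply further checks that are not documented here.

```python
from urllib.parse import urlparse

MAX_URLS = 100
MAX_URL_LENGTH = 180
MAX_SITEMAP_LENGTH = 500
MAX_CALLBACK_LENGTH = 500


def validate_bulk_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    urls = payload.get("urls")
    sitemap = payload.get("sitemap")

    if urls:
        # urls takes precedence over sitemap, so sitemap is not checked here.
        if len(urls) > MAX_URLS:
            problems.append(f"too many URLs: {len(urls)} > {MAX_URLS}")
        for url in urls:
            if urlparse(url).scheme not in ("http", "https"):
                problems.append(f"not an HTTP/HTTPS URL: {url}")
            if len(url) > MAX_URL_LENGTH:
                problems.append(f"URL exceeds {MAX_URL_LENGTH} characters: {url[:40]}...")
    elif sitemap:
        if len(sitemap) > MAX_SITEMAP_LENGTH:
            problems.append(f"sitemap URL exceeds {MAX_SITEMAP_LENGTH} characters")
    else:
        problems.append("either urls or sitemap is required")

    callback_url = payload.get("callback_url")
    if not payload.get("email") and not callback_url:
        problems.append("set email, callback_url, or both")
    if callback_url and len(callback_url) > MAX_CALLBACK_LENGTH:
        problems.append(f"callback_url exceeds {MAX_CALLBACK_LENGTH} characters")
    return problems
```

An empty return value means the payload passed these local checks; it does not guarantee that the API will accept it.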

Examples
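
As one possible starting point, the curl request at the top of this page could be written in Python using only the standard library. This is a sketch under stated assumptions: `build_bulk_request` and `submit_bulk_job` are hypothetical helper names, and the 30-second client timeout is an arbitrary choice, not a documented value.

```python
import json
import urllib.request
from typing import List, Optional

API_URL = "https://api.chuger.com/v1/content/bulk"


def build_bulk_request(token: str, urls: List[str],
                       callback_url: Optional[str] = None,
                       email: bool = False) -> urllib.request.Request:
    """Build (but do not send) the POST request for a bulk extraction job."""
    body = {"urls": urls, "email": email}
    if callback_url:
        body["callback_url"] = callback_url
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_bulk_job(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON response."""
    # Assumption: 30 s is a reasonable client-side timeout for this call.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `submit_bulk_job(build_bulk_request("YOUR_API_TOKEN", urls, email=True))` should return the JSON object described under Response fields. Keeping request construction separate from sending makes the payload easy to inspect or test without touching the network.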

Response fields

message (string, required)

Human-readable status. In sync mode it confirms completion; in async mode it confirms queuing.

job_id (string, required)

Unique identifier for this batch. Save it for support escalations.

urls_count (integer, required)

Number of URLs queued.

url_source (string, required)

Either "urls" or "sitemap", indicating which input was used.

delivery_method (string, required)

Comma-separated list of the delivery methods in use: email, webhook.

estimated_completion_time (string, required)

Human-readable estimate (e.g. "5 minutes"). In sync mode this is "Completed".

status (string, required)

Always "queued".

queued_at (string, required)

ISO 8601 timestamp of when the job was queued.

If sync: true is requested but more than 10 URLs are submitted, Chuger automatically falls back to async and tells you so in the message field.
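
The fallback rule can be anticipated on the client. A small sketch, assuming the threshold of 10 URLs stated above and the "Completed" value documented for estimated_completion_time; `expected_mode` and `ran_inline` are hypothetical helpers. For sitemap jobs the resolved URL count is not known in advance, so only the response check applies.

```python
SYNC_MAX_URLS = 10  # largest resolved URL count that is processed inline


def expected_mode(sync_requested: bool, resolved_url_count: int) -> str:
    """Predict whether a job runs inline ("sync") or is queued ("async")."""
    if sync_requested and resolved_url_count <= SYNC_MAX_URLS:
        return "sync"
    return "async"


def ran_inline(response: dict) -> bool:
    """True if the response reports a job that already completed inline."""
    return response.get("estimated_completion_time") == "Completed"
```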

Email delivery

When email: true, you'll receive a message with a ZIP download link. The archive contains:

  • html/ — readable HTML for every successful URL, organized by the URL path
  • markdown/ — Markdown version of the same content

Files preserve the directory structure of the source URLs, so it's easy to see which result came from which page. The ZIP link is hosted on a CDN and may expire, so download the archive promptly.
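
Once the archive is downloaded, its contents can be indexed by format. This sketch assumes html/ and markdown/ sit at the top level of the ZIP, as described above; `list_archive` is a hypothetical helper, and the exact file names inside each folder are not specified on this page.

```python
import io
import zipfile


def list_archive(zip_bytes: bytes) -> dict:
    """Group file names in the results ZIP by top-level folder."""
    grouped = {"html": [], "markdown": []}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            top, _, rest = name.partition("/")
            # Ignore anything outside the two documented folders.
            if top in grouped and rest:
                grouped[top].append(rest)
    return grouped
```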

Webhook delivery

When callback_url is set, Chuger sends a POST request to that URL once the batch finishes:

POST /your/webhook HTTP/1.1
Content-Type: application/json
User-Agent: Chuger-API/1.0
X-Chuger-Webhook: content-all-results

{
  "processed_count": 50,
  "success_count": 48,
  "failed_count": 2,
  "results": [
    {
      "url": "https://example.com/article-1",
      "success": true,
      "content": {
        "url": "https://example.com/article-1",
        "success": true,
        "statusCode": 200,
        "html": "...",
        "markdown": "...",
        "metadata": { "title": "..." }
      },
      "error": null
    },
    {
      "url": "https://example.com/broken",
      "success": false,
      "content": null,
      "error": "Timeout while fetching"
    }
  ],
  "user": { "id": 42, "email": "user@example.com" },
  "processing_completed_at": "2026-05-13T14:08:13.000Z"
}

If the entire job fails for systemic reasons, you'll receive a failure payload with X-Chuger-Webhook: content-bulk-failure instead.
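
A receiver typically branches on the X-Chuger-Webhook header and then separates successes from failures. The sketch below assumes the success payload shape shown above; the failure payload's structure is not documented on this page, so it is passed through untouched. `handle_webhook` is a hypothetical helper name.

```python
import json

SUCCESS_EVENT = "content-all-results"
FAILURE_EVENT = "content-bulk-failure"


def handle_webhook(event: str, raw_body: bytes) -> dict:
    """Split a webhook delivery into extracted pages and per-URL errors."""
    if event == FAILURE_EVENT:
        # Failure payload shape is an unknown here; keep it as-is.
        return {"ok": False, "pages": {}, "errors": {}, "raw": json.loads(raw_body)}
    if event != SUCCESS_EVENT:
        raise ValueError(f"unexpected X-Chuger-Webhook value: {event!r}")

    payload = json.loads(raw_body)
    pages, errors = {}, {}
    for result in payload["results"]:
        if result["success"]:
            # Keep the Markdown rendition; the HTML is in content["html"].
            pages[result["url"]] = result["content"]["markdown"]
        else:
            errors[result["url"]] = result["error"]
    return {"ok": True, "pages": pages, "errors": errors, "raw": payload}
```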

Webhook reliability

  • Up to 3 retries on non-2xx responses
  • 30-second timeout per attempt
  • Respond as soon as you can — queue the payload and process it asynchronously on your side
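
The "respond fast, process later" advice can be sketched with the standard library: the handler only enqueues the delivery and returns 200, while a background thread does the real work. The port number, the handler class, and the `process` callback are all assumptions made for illustration; a production receiver would more likely use a durable queue than an in-memory one.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

pending = queue.Queue()  # deliveries waiting to be processed


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # Enqueue first, then acknowledge immediately; no heavy work here.
        pending.put((self.headers.get("X-Chuger-Webhook"), body))
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def worker(process):
    """Drain the queue in the background; `process` is your own callback."""
    while True:
        event, body = pending.get()
        try:
            process(event, json.loads(body))
        except Exception as exc:
            print(f"webhook processing failed: {exc}")
        finally:
            pending.task_done()


def serve(port: int = 8080, process=print):
    """Start the worker thread and serve webhooks until interrupted."""
    threading.Thread(target=worker, args=(process,), daemon=True).start()
    ThreadingHTTPServer(("", port), WebhookHandler).serve_forever()
```

Because the handler returns before any processing happens, the response time stays well inside the 30-second limit regardless of how large the results array is.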

Errors

Status  When
401     Missing or invalid token
402     No plan, or insufficient credits for the whole batch
422     Validation failed (no URLs, too many URLs, missing delivery method, bad sitemap, etc.)
429     Rate limit or monthly quota exceeded

See Errors for the full reference.
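
In client code, the statuses above can be mapped to a next step. This is an illustrative sketch only: the suggested actions are this example's interpretation of the table, not official guidance, and `describe_error` is a hypothetical helper.

```python
SUGGESTED_ACTIONS = {
    401: "check the API token in the Authorization header",
    402: "choose a plan or add credits to cover the whole batch",
    422: "fix the request body (URLs, sitemap, or delivery method)",
    429: "slow down, or wait for the monthly quota to reset",
}


def describe_error(status: int) -> str:
    """Return a suggested next step for a documented error status."""
    return SUGGESTED_ACTIONS.get(status, "see the Errors reference")
```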

Tips

  • Estimate cost up front with /v1/credits/preview-cost, passing service_type: "content" and quantity: <url count>
  • For sitemaps, only the URL entries discovered in the XML are queued — large sitemaps may exceed your monthly quota, so preview first
  • Use the job_id if you need to follow up with support about a specific batch