Batch PDF Generation at Scale with Blink PDF REST API

When you need to produce hundreds or thousands of PDFs in a single run — month-end invoices, bulk certificate generation, mass document digitization — the key is concurrency. Blink PDF can render roughly 30 PDFs per second per node, and your throughput scales with the number of parallel requests you issue within your plan’s rate limit. This guide shows you how to structure a concurrent batch job in Python and Node.js, handle rate limit errors gracefully, and size your plan for the throughput you need.

Concurrency Limits by Plan

Your plan’s rate limit determines how many requests you can issue per minute. For batch workloads, translate that into a practical concurrency ceiling:

Plan	Rate Limit	Recommended Max Concurrency	500 renders in ~…
Free	10/min	1	~50 min
Pro ($9/mo)	120/min	10	~4 min
Business ($79/mo)	600/min	40	~50 sec
Enterprise	Custom	per capacity	Custom

Rate limits count requests, while your monthly allowance is metered in render units (a text render costs 1 unit; image-heavy renders cost more, up to 16). Plan for both: requests/min bound your throughput, render units bound your monthly volume.

These concurrency figures are conservative estimates that keep you safely below the rate limit ceiling while accounting for variable render times. You can tune the concurrency value up or down based on observed throughput in your environment.
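The duration column in the table is plain arithmetic: render count divided by the plan's requests-per-minute limit. The sketch below reproduces it as a best-case estimate that ignores retries and render time. The helper name `estimated_minutes` is illustrative and not part of the Blink PDF API.

```python
# Plan rate limits (requests per minute), copied from the table above.
RATE_LIMITS_PER_MIN = {"Free": 10, "Pro": 120, "Business": 600}


def estimated_minutes(render_count: int, rate_limit_per_min: int) -> float:
    """Best-case batch duration if every minute's request quota is fully used."""
    if rate_limit_per_min <= 0:
        raise ValueError("rate_limit_per_min must be positive")
    return render_count / rate_limit_per_min


for plan, limit in RATE_LIMITS_PER_MIN.items():
    minutes = estimated_minutes(500, limit)
    print(f"{plan}: about {minutes:.1f} min ({minutes * 60:.0f} s)")
```

For 500 renders this gives 50 minutes on Free, a little over 4 minutes on Pro, and 50 seconds on Business, which matches the table.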

Python: Async Batch with `asyncio` and `aiohttp`

The most efficient Python approach uses asyncio with aiohttp to issue multiple render requests in parallel, capped by a semaphore set to your concurrency limit.

import asyncio
import aiohttp
import time
from pathlib import Path

BLINK_API_KEY = "bp_xxxxxxxxxxxxxxxxxxxx"
BLINK_URL = "https://api.blinkpdf.io/v1/render"

# Set this to the recommended concurrency for your plan:
# Free=1, Pro=10, Business=40
MAX_CONCURRENCY = 10  # Pro plan example


async def render_pdf(
    session: aiohttp.ClientSession,
    semaphore: asyncio.Semaphore,
    document: dict,
    retries: int = 4,
) -> tuple[str, bytes | None]:
    """
    Render a single PDF with exponential backoff on 429 / 5xx errors.
    Returns (document_id, pdf_bytes) or (document_id, None) on failure.
    """
    async with semaphore:
        delay = 1.0  # initial backoff in seconds
        for attempt in range(retries):
            try:
                async with session.post(
                    BLINK_URL,
                    headers={
                        "x-api-key": BLINK_API_KEY,
                        "Accept": "application/pdf",
                        "Content-Type": "application/json",
                    },
                    json={
                        "markdown": document["markdown"],
                        "metadata": {"title": document["title"]},
                    },
                ) as response:
                    if response.status == 200:
                        pdf_bytes = await response.read()
                        request_id = response.headers.get("X-Request-Id", "?")
                        print(
                            f"  ✓ {document['id']} rendered (request {request_id})"
                        )
                        return document["id"], pdf_bytes

                    elif response.status == 429 or response.status >= 500:
                        # Rate limited or transient server error: back off and retry
                        print(
                            f"  ⚠ {document['id']} got {response.status}, "
                            f"retrying in {delay:.1f}s (attempt {attempt + 1}/{retries})"
                        )
                        await asyncio.sleep(delay)
                        delay *= 2  # exponential backoff

                    else:
                        body = await response.text()
                        print(
                            f"  ✗ {document['id']} failed with "
                            f"{response.status}: {body[:120]}"
                        )
                        return document["id"], None

            except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
                # Connection failures and timeouts are retried with the same backoff
                print(f"  ✗ {document['id']} network error: {exc}")
                await asyncio.sleep(delay)
                delay *= 2

        print(f"  ✗ {document['id']} exhausted retries")
        return document["id"], None


async def batch_render(documents: list[dict], output_dir: str = "output") -> dict:
    """
    Render all documents concurrently and save PDFs to output_dir.
    Returns a summary dict with success/failure counts.
    """
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)

    connector = aiohttp.TCPConnector(limit=MAX_CONCURRENCY)
    async with aiohttp.ClientSession(connector=connector) as session:
        tasks = [
            render_pdf(session, semaphore, doc) for doc in documents
        ]

        start = time.monotonic()
        results = await asyncio.gather(*tasks)
        elapsed = time.monotonic() - start

    success, failure = 0, 0
    for doc_id, pdf_bytes in results:
        if pdf_bytes is not None:
            out_path = Path(output_dir) / f"{doc_id}.pdf"
            out_path.write_bytes(pdf_bytes)
            success += 1
        else:
            failure += 1

    print(
        f"\nBatch complete: {success} succeeded, {failure} failed "
        f"in {elapsed:.1f}s ({len(documents) / elapsed:.1f} PDFs/sec)"
    )
    return {"success": success, "failure": failure, "elapsed_s": elapsed}


# Example: build a list of 100 documents
documents = [
    {
        "id": f"report_{i:04d}",
        "title": f"Monthly Report — Account {i:04d}",
        "markdown": f"# Monthly Report\n\n**Account:** {i:04d}\n\nContent for account {i}.",
    }
    for i in range(1, 101)
]

if __name__ == "__main__":
    asyncio.run(batch_render(documents, output_dir="reports"))

Set MAX_CONCURRENCY to the value in the plan table above and tune it upward only after confirming you are not hitting 429 errors in practice. Starting conservative saves you from failed renders at the beginning of a large run.
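A semaphore caps how many requests are in flight, not how many start per minute, so fast renders at a concurrency of 10 can still exceed a 120/min limit. One possible addition, which is not part of the example above, is a small pacer that spaces request starts evenly. This is a minimal sketch: `RequestPacer` and its methods are hypothetical names, and how the server windows its rate limit is an assumption, so treat the spacing as a starting point to tune.

```python
import asyncio
import time


class RequestPacer:
    """Spaces request starts at least 60 / per_minute seconds apart.

    Hypothetical helper, not part of the Blink PDF API.
    """

    def __init__(self, per_minute: int) -> None:
        if per_minute <= 0:
            raise ValueError("per_minute must be positive")
        self.interval = 60.0 / per_minute  # seconds between request starts
        self._next_slot = 0.0  # monotonic time of the next free start slot

    def reserve(self, now: float) -> float:
        """Claim the next start slot and return how long the caller should wait."""
        slot = max(now, self._next_slot)
        self._next_slot = slot + self.interval
        return slot - now

    async def wait(self) -> None:
        """Sleep until this caller's reserved slot arrives."""
        delay = self.reserve(time.monotonic())
        if delay > 0:
            await asyncio.sleep(delay)
```

To use it, create one `RequestPacer(120)` for the whole batch (Pro plan value) and call `await pacer.wait()` immediately before each `session.post(...)` inside the retry loop, so that retries are paced as well. Because `reserve` is synchronous, tasks on a single event loop cannot claim the same slot.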

Node.js: Concurrent Batch with `Promise.all` and a Concurrency Pool

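Promise.all only waits on promises that are already running, so mapping every document to a request and passing the array to Promise.all fires all requests at once and overruns the rate limit. The runPool function below implements a simple pool that keeps at most MAX_CONCURRENCY requests in flight at any time. The script uses top-level await and the global fetch, so run it as an ES module (a .mjs file, or "type": "module" in package.json) on Node.js 18 or later.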

import { writeFileSync, mkdirSync } from "fs";
import { join } from "path";

const BLINK_API_KEY = "bp_xxxxxxxxxxxxxxxxxxxx";
const BLINK_URL = "https://api.blinkpdf.io/v1/render";
const MAX_CONCURRENCY = 10; // Adjust for your plan
const OUTPUT_DIR = "reports";

mkdirSync(OUTPUT_DIR, { recursive: true });

async function renderWithRetry(document, retries = 4) {
  let delay = 1000; // ms

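  for (let attempt = 0; attempt < retries; attempt++) {
    let response;
    try {
      response = await fetch(BLINK_URL, {
        method: "POST",
        headers: {
          "x-api-key": BLINK_API_KEY,
          Accept: "application/pdf",
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          markdown: document.markdown,
          metadata: { title: document.title },
        }),
      });
    } catch (err) {
      // Network failure: back off and retry instead of rejecting and stalling the pool
      console.error(`  ✗ ${document.id} network error: ${err.message}`);
      await new Promise((r) => setTimeout(r, delay));
      delay *= 2;
      continue;
    }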

    if (response.ok) {
      const buffer = Buffer.from(await response.arrayBuffer());
      console.log(
        `  ✓ ${document.id} rendered (request ${response.headers.get("X-Request-Id")})`
      );
      return { id: document.id, buffer };
    }

    if (response.status === 429) {
      console.log(
        `  ⚠ ${document.id} rate limited, retrying in ${delay}ms (attempt ${attempt + 1}/${retries})`
      );
      await new Promise((r) => setTimeout(r, delay));
      delay *= 2; // exponential backoff
      continue;
    }

    const text = await response.text();
    console.error(`  ✗ ${document.id} failed: ${response.status} ${text.slice(0, 120)}`);
    return { id: document.id, buffer: null };
  }

  console.error(`  ✗ ${document.id} exhausted retries`);
  return { id: document.id, buffer: null };
}

async function runPool(documents, concurrency) {
  const results = [];
  if (documents.length === 0) return results; // nothing to render; avoid waiting forever
  const queue = [...documents];
  const inFlight = new Set();

  await new Promise((resolve) => {
    function startNext() {
      while (inFlight.size < concurrency && queue.length > 0) {
        const doc = queue.shift();
        const promise = renderWithRetry(doc).then((result) => {
          inFlight.delete(promise);
          results.push(result);
          if (queue.length === 0 && inFlight.size === 0) resolve();
          else startNext();
        });
        inFlight.add(promise);
      }
    }
    startNext();
  });

  return results;
}

// Build example documents
const documents = Array.from({ length: 100 }, (_, i) => ({
  id: `report_${String(i + 1).padStart(4, "0")}`,
  title: `Monthly Report — Account ${String(i + 1).padStart(4, "0")}`,
  markdown: `# Monthly Report\n\n**Account:** ${String(i + 1).padStart(4, "0")}\n\nContent for account ${i + 1}.`,
}));

console.log(`Rendering ${documents.length} PDFs with concurrency=${MAX_CONCURRENCY}...`);
const start = Date.now();

const results = await runPool(documents, MAX_CONCURRENCY);

let success = 0;
let failure = 0;
for (const { id, buffer } of results) {
  if (buffer) {
    writeFileSync(join(OUTPUT_DIR, `${id}.pdf`), buffer);
    success++;
  } else {
    failure++;
  }
}

// Keep the elapsed time numeric so the throughput division does not rely on string coercion
const elapsedSec = (Date.now() - start) / 1000;
console.log(
  `\nBatch complete: ${success} succeeded, ${failure} failed in ${elapsedSec.toFixed(1)}s ` +
    `(${(documents.length / elapsedSec).toFixed(1)} PDFs/sec)`
);

Handling Rate Limit Errors

When you exceed your plan’s rate limit, the API returns HTTP 429 Too Many Requests. Both examples above implement exponential backoff — they wait 1 second after the first 429, 2 seconds after the second, 4 seconds after the third, and so on. Key principles for robust batch jobs:

Never retry immediately on 429. Back off before each retry attempt.
Cap your retries (4–5 attempts is usually sufficient). After that, log the failure and move on so the rest of the batch can complete.
Log failures with document IDs so you can re-run just the failed subset without re-processing the whole batch.
Monitor X-Render-Status and X-Render-Diagnostics across your batch run — a rising share of degraded renders or error-severity diagnostic codes can flag documents with missing glyphs, image fetch failures, or other issues. When you need a strict match to the source Markdown, also check X-Render-Rendered-As-Requested. Log X-Request-Id for any failures so support can trace them.
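The principles above can be sketched as two small helpers: one that produces the 1, 2, 4, 8 second schedule used by both examples, and one that picks out the failed documents for a re-run. This is a sketch under stated assumptions: the 30-second cap and the jitter are additions that the examples in this guide do not use, and all three function names are hypothetical. `failed_subset` expects the `(document_id, pdf_bytes)` tuples that the Python example collects from `asyncio.gather`.

```python
from __future__ import annotations

import random


def backoff_delays(retries: int = 4, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delay before each retry: base, 2*base, 4*base, ... limited to `cap` seconds."""
    return [min(cap, base * 2**attempt) for attempt in range(retries)]


def with_jitter(delay: float, rng: random.Random) -> float:
    """Pick a random wait in [delay / 2, delay] so parallel workers do not retry in lockstep."""
    return rng.uniform(delay / 2, delay)


def failed_subset(
    documents: list[dict], results: list[tuple[str, bytes | None]]
) -> list[dict]:
    """Return the documents whose render result carries no PDF bytes."""
    failed_ids = {doc_id for doc_id, pdf_bytes in results if pdf_bytes is None}
    return [doc for doc in documents if doc["id"] in failed_ids]
```

Passing the output of `failed_subset` back into `batch_render` re-runs only the documents that failed, leaving the PDFs already written untouched.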

If you consistently hit 429 errors even at the recommended concurrency levels, your workload has outgrown your current plan. Upgrading to the next plan tier immediately increases your rate limit and allows higher concurrency.

Pre-validate templates without spending render units or throughput: POST /v1/render/validate runs the full schema and semantic checks and has its own rate-limit bucket, separate from render. Also apply a consistent theme across the batch so that every document shares one look.

Shared Images: Stage Once, Reference Everywhere

Batch jobs often reuse the same logo or letterhead across every document. Instead of inlining that image as a large data: URL in every request — which counts toward your input-size limit and bloats each payload — stage it once and reference it by handle:

Stage it through the MCP upload_image tool — one call for a small image (or ordered parts for a larger one) that takes the image bytes (base64) and returns the blink://asset/<sha256> handle directly (identical bytes dedupe and return a dup flag). The server hashes the bytes for you, so no client-side SHA-256 step and no presigned PUT are involved.
Reference it in every render as ![Logo](blink://asset/<sha256>).

upload_image handles one image at a time, so if a batch mixes several logos, stage each distinct image before referencing it. Staged assets are fetched at render time and do not count toward the request input-size limit. They are content-addressed (identical bytes are stored once) and scoped per credential, so a handle staged with your batch key resolves in every POST /v1/render call that key makes. Assets live up to 24 hours, so one staging step covers a full batch run. For image-only outputs, such as digitizing scans or assembling photo batches, use the imageSequence field to lay staged assets out as pages (one per page or a grid) without hand-writing Markdown. See Document Options.
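Once an image is staged, referencing it from every document in the batch is a plain string operation. The sketch below assumes you already hold the handle that upload_image returned. `with_shared_logo` is a hypothetical helper, and the document dicts follow the shape used in the Python example (`id`, `title`, `markdown`).

```python
ASSET_PREFIX = "blink://asset/"


def with_shared_logo(
    documents: list[dict], handle: str, alt_text: str = "Logo"
) -> list[dict]:
    """Return copies of the documents with the staged image placed above the Markdown body."""
    if not handle.startswith(ASSET_PREFIX):
        raise ValueError(f"expected a handle starting with {ASSET_PREFIX!r}")
    image_line = f"![{alt_text}]({handle})"
    return [
        {**doc, "markdown": f"{image_line}\n\n{doc['markdown']}"}
        for doc in documents
    ]


# Placeholder: substitute the handle that upload_image returned for your image.
LOGO_HANDLE = "blink://asset/<sha256>"
```

Calling `with_shared_logo(documents, LOGO_HANDLE)` before `batch_render` leaves the original list unchanged and adds only a short reference line to each payload, instead of a full data: URL.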

Sizing Your Plan for Batch Workloads

Use this formula to estimate the minimum plan you need:

Required rate (PDFs/min) = batch_size / acceptable_duration_minutes

Example: You need to generate 10,000 invoices within 30 minutes.

Required rate = 10,000 / 30 ≈ 333 PDFs/min

The Business plan (600/min) covers this comfortably and could finish in about 17 minutes. The Pro plan (120/min) would require ~83 minutes.
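The same calculation can be scripted to check several batch sizes at once. This is a minimal sketch using the rate limits listed in this guide. The function names are hypothetical, and the estimate assumes the rate limit is fully used with no retries.

```python
import math

# Requests-per-minute limits for the self-serve plans, as listed in this guide.
PLAN_LIMITS_PER_MIN = [("Free", 10), ("Pro", 120), ("Business", 600)]


def required_rate(batch_size: int, duration_minutes: float) -> float:
    """PDFs per minute needed to finish `batch_size` renders within the deadline."""
    if duration_minutes <= 0:
        raise ValueError("duration_minutes must be positive")
    return batch_size / duration_minutes


def minimum_plan(batch_size: int, duration_minutes: float) -> str:
    """Smallest listed plan whose rate limit meets the required rate, else Enterprise."""
    rate = required_rate(batch_size, duration_minutes)
    for name, limit in PLAN_LIMITS_PER_MIN:
        if limit >= rate:
            return name
    return "Enterprise"


def minutes_on_plan(batch_size: int, limit_per_min: int) -> int:
    """Whole minutes the batch takes when the plan's rate limit is fully used."""
    return math.ceil(batch_size / limit_per_min)
```

For the invoice example, `minimum_plan(10_000, 30)` selects Business. `minutes_on_plan` rounds up to whole minutes, so it reports 84 minutes for Pro and 17 for Business.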

Pro Plan — $9/mo

120 req/min · 30k render units/mo — Good for nightly batch jobs of a few hundred documents or on-demand runs of up to ~1,000 text documents.

Business Plan — $79/mo

600 req/min · 1M render units/mo — Suitable for large month-end batch runs, bulk document digitization, and high-frequency automated pipelines.

The Enterprise plan offers dedicated capacity with custom rate limits and no per-render metering — designed for workloads that require sustained high-volume generation. Contact sales to discuss capacity planning.
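Rate limits are only half of the sizing question, because the monthly render-unit allowance also has to cover the batch. The sketch below estimates a worst case from the figures quoted in this guide (1 unit per text render, up to 16 for image-heavy renders). The function names are hypothetical, and pricing every image-heavy document at 16 units is a deliberately pessimistic assumption.

```python
# Monthly render-unit allowances quoted in this guide.
MONTHLY_UNITS = {"Pro": 30_000, "Business": 1_000_000}

TEXT_RENDER_UNITS = 1  # a text render costs 1 unit
MAX_IMAGE_UNITS = 16   # image-heavy renders cost more, up to 16


def worst_case_units(text_docs: int, image_docs: int) -> int:
    """Upper bound on render units, pricing every image-heavy document at the maximum."""
    return text_docs * TEXT_RENDER_UNITS + image_docs * MAX_IMAGE_UNITS


def fits_allowance(plan: str, runs_per_month: int, text_docs: int, image_docs: int) -> bool:
    """True if the worst-case monthly usage stays within the plan's allowance."""
    monthly = runs_per_month * worst_case_units(text_docs, image_docs)
    return monthly <= MONTHLY_UNITS[plan]
```

A nightly run of 1,000 text documents for 30 days uses exactly the Pro allowance of 30,000 units. Adding 100 image-heavy documents per run pushes the worst case to 78,000 units, which calls for the Business plan.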
