API Documentation
Integrate CargoParse into your TMS, ERP, or custom workflow.
API access is available on Haul, Fleet, and Terminal plans. Generate keys in Account → Developer.
Authentication
All requests require an Authorization header with your API key:
Authorization: Bearer cp_live_your_key_here
Keys can be created with an expiration (30 days, 90 days, 1 year, or no expiry). Revoke and regenerate them from the Developer tab. Never include keys in client-side code or git repositories.
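One way to keep the key out of source code is to read it from the environment when building request headers. This is a minimal sketch, not an official client: the function name and the CARGOPARSE_API_KEY variable name are assumptions made for illustration.

```javascript
// Sketch only: CARGOPARSE_API_KEY is a hypothetical environment variable name.
function buildAuthHeaders(apiKey = process.env.CARGOPARSE_API_KEY) {
  if (!apiKey) {
    throw new Error("Missing CargoParse API key");
  }
  return {
    // Header format taken from the Authentication section above.
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
}
```

Failing fast on a missing key tends to surface configuration mistakes earlier than a 401 or 403 from the API would.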
API keys and teams
Personal API keys are tied to an individual user. When that user joins a team, their personal keys are paused: requests return 403 with a migration hint, which prevents a personal-scope credential from silently authenticating against the team's data. Paused keys stay on the user record, so they resume automatically if the user leaves the team, and they can still be revoked from the Developer tab while paused.
Team-scoped API keys are available to org members with the Developer role or above. Create them alongside personal keys from the Developer tab. Team keys read and write team-partition data. For non-programmatic integrations, teams can also use team webhooks or the shared team email-to-upload address.
Webhook Events
For real-time notifications instead of polling, configure a webhook endpoint in Account → Automation. CargoParse sends HTTP POST requests to your endpoint when document events occur. Each paid plan supports up to 10 endpoints with independent signing secrets and failure tracking.
Team webhooks fire for every team upload regardless of who submitted it. Personal webhook endpoints are paused while a user is on a team. A Developer or above can click Copy to team on any paused endpoint to add a team copy with a freshly rotated signing secret. The personal record stays in place so it's ready to resume if the user leaves.
Event Types
Example Payload
A document.completed event delivers the full extraction result:
{
  "event": "document.completed",
  "deliveryId": "evt_a1b2c3d4-5678-...",
  "timestamp": "2026-03-17T14:30:00.000Z",
  "documentId": "doc-456",
  "filename": "bol-march-17.pdf",
  "documentType": "BILL_OF_LADING",
  "qualityScore": 92,
  "qualityTier": "clean",
  "stats": { "total": 24, "ok": 22, "review": 1, "missing": 1 },
  "data": {
    "bol_number": { "value": "BOL-12345", "confidence": 96, "flag": "ok" },
    "shipper_name": { "value": "ACME Freight", "confidence": 92, "flag": "ok" },
    "consignee_name": { "value": "Globex Co.", "confidence": 88, "flag": "ok" },
    "pickup_date": { "value": "2026-03-17", "confidence": 71, "flag": "review" }
    // … one entry per field on this document type
  },
  "lineItems": [
    {
      "groupId": "commodity_items",
      "items": [
        {
          "commodity_description": { "value": "Pallets of widgets", "confidence": 90, "flag": "ok" },
          "commodity_weight": { "value": "1200 lbs", "confidence": 85, "flag": "ok" }
        }
        // … one object per row
      ]
    }
    // … one element per group on the document type (most types have one)
  ]
}
The data map is the same flat shape the document detail page renders. Each value is a FlaggedField: { value, confidence (0-100), flag ("ok"|"review"|"missing") }. lineItems is an array of { groupId, items } objects — groupId is the group identifier (e.g. commodity_items on a BOL, delivery_items on a POD), and items is the row array, where each row is a flat FlaggedField map. Both are present on document.completed, document.needs_review, and document.approved; absent on document.failed / document.rejected / automation_failure (no extraction happened).
Same shape across surfaces, different envelope. Webhooks put data and lineItems at the top level of the body. The REST APIs nest them: GET /api/v1/jobs/:jobId returns { result: { data, lineItems, stats, meta } }, and GET /api/v1/documents/:id returns { document: { latestResult: { data, lineItems, ... } } }. The fields inside are identical — only the wrapper changes — so a small adapter at the entry point of your handler keeps the rest of your pipeline shape-agnostic.
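The adapter described above could look roughly like this. It is a sketch under the envelope shapes listed in this section; the function name extractResult is hypothetical, and it returns null when no extraction result is present (for example on document.failed).

```javascript
// Normalizes the three documented envelopes to { data, lineItems }.
function extractResult(body) {
  if (!body || typeof body !== "object") return null;
  const source =
    // GET /api/v1/jobs/:jobId -> { result: { data, lineItems, ... } }
    (body.result && body.result.data && body.result) ||
    // GET /api/v1/documents/:id -> { document: { latestResult: { ... } } }
    (body.document && body.document.latestResult) ||
    // Webhook body -> data and lineItems at the top level
    (body.data && body) ||
    null;
  if (!source || !source.data) return null;
  return { data: source.data, lineItems: source.lineItems || [] };
}
```

Calling this once at the entry point of a handler lets downstream code work with a single shape regardless of where the payload came from.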
Two optional boolean flags may appear on document.completed when it isn't a fresh-extraction event:
- templateRecompute: true means the document was re-evaluated against an updated export template and just flipped from needs_review to clean. The original extraction didn't change. Useful if you want to filter these out of an intake pipeline.
- manualRefire: true means the user manually re-pushed this doc's current state via POST /api/v1/documents/:id/refire, typically after editing fields. Use this to ignore replays in idempotent intake systems, or to surface them differently in your UI.
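A handler that wants to treat these cases differently could branch on the two flags before doing any intake work. The sketch below assumes only the event and flag names documented above; the function name and the returned labels are illustrative.

```javascript
// Returns a label describing why a webhook event was delivered.
function classifyCompletedEvent(payload) {
  if (payload.event !== "document.completed") return "other";
  if (payload.templateRecompute === true) return "template_recompute";
  if (payload.manualRefire === true) return "manual_refire";
  return "fresh_extraction";
}
```

An intake pipeline might process only fresh_extraction events and log the other two, while a UI integration might surface manual_refire as an update to an existing record.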
Signature Verification
Every request includes an X-CargoParse-Signature header. Verify it to ensure the payload is authentic and untampered:
// Header format:
//   X-CargoParse-Signature: t=<unix_timestamp>,v1=<hex_hmac>

// Verification (Node.js):
const crypto = require("crypto");

// `body` must be the raw request body string, exactly as received.
function verifyWebhook(body, signatureHeader, secret, toleranceSeconds = 300) {
  // Split "t=...,v1=..." into { t, v1 }.
  const parts = Object.fromEntries(
    signatureHeader.split(",").map(p => p.trim().split("=", 2))
  );
  if (!parts.t || !parts.v1) return false;

  // Reject deliveries with a timestamp outside your tolerance window.
  // Defends against replay attacks if a delivery is captured and replayed later.
  const now = Math.floor(Date.now() / 1000);
  const timestamp = Number(parts.t);
  if (!Number.isFinite(timestamp)) return false;
  if (Math.abs(now - timestamp) > toleranceSeconds) return false;

  // Signed payload format: "t=<timestamp>.<raw_body>". Note that the
  // literal "t=" prefix is included in the HMAC input.
  const expected = crypto
    .createHmac("sha256", secret)
    .update("t=" + parts.t + "." + body)
    .digest("hex");

  // timingSafeEqual throws if the buffers differ in length, so check first.
  const expectedBuf = Buffer.from(expected, "hex");
  const receivedBuf = Buffer.from(parts.v1, "hex");
  if (expectedBuf.length !== receivedBuf.length) return false;
  return crypto.timingSafeEqual(expectedBuf, receivedBuf);
}
The t= timestamp is included in the HMAC input, so any tampering with it invalidates the signature. We recommend a tolerance window of 5 minutes (300 seconds) — long enough to absorb our 35-second retry window and clock skew, short enough to make replay impractical.
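For local testing it can help to generate a header in the documented format yourself and feed it to your verification code. The helper below is a sketch based only on the header and signed-payload formats described in this section; signPayload is a hypothetical name, not part of any CargoParse SDK.

```javascript
const crypto = require("crypto");

// Builds "t=<unix_timestamp>,v1=<hex_hmac>" for a raw body string.
// The HMAC input is "t=<timestamp>.<raw_body>", as documented above.
function signPayload(rawBody, secret, timestamp = Math.floor(Date.now() / 1000)) {
  const hmac = crypto
    .createHmac("sha256", secret)
    .update(`t=${timestamp}.${rawBody}`)
    .digest("hex");
  return `t=${timestamp},v1=${hmac}`;
}
```

Because the HMAC is computed over the exact bytes received, verification should use the raw request body. Re-serializing a parsed JSON object can change whitespace or key order and would produce a different digest.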
Delivery Behavior
CargoParse expects your endpoint to respond within 10 seconds. If delivery fails, it retries up to 3 times with exponential backoff (5s, then 30s delay). A 4xx response is treated as a permanent failure and is not retried. After all attempts fail, an automation_failure email notification is sent to the account owner.
Circuit breaker: After 10 consecutive delivery failures, the affected endpoint is automatically paused while others keep running. You can re-enable it from Account → Automation.
Multiple endpoints: You can configure up to 10 webhook endpoints per account or organization. Each endpoint has its own URL, secret, event filters, and independent failure tracking. Manage them at Account → Automation (solo) or Account → Team (org). Endpoints are configured via the UI, not the API.
Replay: Failed deliveries can be replayed from the delivery log in Account → Automation. Delivery logs (with per-attempt timestamps) are available for 7 days.
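Since deliveries can be retried and replayed, a receiver may see the same event more than once. The sketch below deduplicates on deliveryId using an in-memory map. It assumes that deliveryId stays the same across retries of one delivery, which this page does not state explicitly, so verify that against your delivery log before relying on it. The function names and the 24-hour default are illustrative.

```javascript
// In-memory dedupe; a shared store would be needed across multiple processes.
// ASSUMPTION: deliveryId is stable across retries of the same delivery.
function createDeliveryDeduper(ttlMs = 24 * 60 * 60 * 1000, now = () => Date.now()) {
  const seen = new Map(); // deliveryId -> expiry time in ms
  return function isDuplicate(deliveryId) {
    const current = now();
    // Drop expired entries so the map does not grow without bound.
    for (const [id, expiresAt] of seen) {
      if (expiresAt <= current) seen.delete(id);
    }
    if (seen.has(deliveryId)) return true;
    seen.set(deliveryId, current + ttlMs);
    return false;
  };
}
```

A handler would typically respond with 2xx for a duplicate without reprocessing it, so the sender does not count the delivery as failed.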
Upload Documents
Document upload is a 3-step flow: get a presigned URL, upload to S3, then enqueue for processing. File bytes go directly to S3 — never through the CargoParse API server.
Limits: up to 30 files per request, 15 MB per file. Accepted types: PDF, JPEG, PNG, TIFF. Multi-page captures (e.g. phone photos of a BOL) can be bundled into a single logical document via the optional groups parameter.
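These limits can be checked on the client before requesting presigned URLs. The following is a sketch, not the server's validation logic: the function name is hypothetical, the MIME strings other than application/pdf are the standard types for the listed formats rather than values confirmed by this page, and "15 MB" is interpreted conservatively as 15,000,000 bytes. The group limits (at most 5 groups, 2 to 10 files each) come from the request example in this section.

```javascript
const MAX_FILES_PER_REQUEST = 30;
const MAX_FILE_BYTES = 15 * 1000 * 1000; // assumption: decimal megabytes
const MAX_GROUPS = 5;
const ACCEPTED_MIME_TYPES = new Set([
  "application/pdf",
  "image/jpeg",
  "image/png",
  "image/tiff",
]);

// Returns a list of human-readable problems; an empty list means the request looks valid.
function validateUploadRequest(files, groups = []) {
  const errors = [];
  if (files.length < 1 || files.length > MAX_FILES_PER_REQUEST) {
    errors.push(`expected 1-${MAX_FILES_PER_REQUEST} files, got ${files.length}`);
  }
  for (const file of files) {
    if (file.size > MAX_FILE_BYTES) errors.push(`${file.name}: larger than 15 MB`);
    if (!ACCEPTED_MIME_TYPES.has(file.mimeType)) {
      errors.push(`${file.name}: unsupported type ${file.mimeType}`);
    }
  }
  if (groups.length > MAX_GROUPS) errors.push(`at most ${MAX_GROUPS} groups per request`);
  for (const group of groups) {
    const indices = group.fileIndices || [];
    if (indices.length < 2 || indices.length > 10) {
      errors.push(`${group.groupName}: a group needs 2-10 files`);
    }
    if (indices.some((i) => !Number.isInteger(i) || i < 0 || i >= files.length)) {
      errors.push(`${group.groupName}: fileIndices must reference the files array`);
    }
  }
  return errors;
}
```

This is only a pre-check. The server detects file types from content, so a file can still be rejected after passing it.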
# Step 1: get presigned upload URL(s)
POST /api/v1/jobs
Authorization: Bearer cp_live_your_key
Content-Type: application/json

{ "files": [ { "name": "bol-123.pdf", "size": 245000, "mimeType": "application/pdf" } ] }

# Optional: bundle several images into one logical document (multi-page capture).
# Max 5 groups per request, 2–10 files per group. fileIndices reference the files array above.
# "groups": [{ "groupName": "POD March 17", "fileIndices": [0, 1, 2] }]

→ {
  "ok": true,
  "jobs": [{
    "jobId": "abc-123",
    "documentId": "doc-456",
    "filename": "bol-123.pdf",
    "uploadUrl": "https://s3.amazonaws.com/...",
    "uploadFields": { "key": "...", "policy": "...", "x-amz-signature": "...", ... }
  }]
}

# Step 2: upload file bytes to S3 via presigned POST (5-min expiry)
POST <uploadUrl>
Content-Type: multipart/form-data
# Include all uploadFields as form fields, then append file as the last field

# Step 3: enqueue for processing
POST /api/v1/jobs/enqueue
Authorization: Bearer cp_live_your_key
Content-Type: application/json

{ "jobIds": ["abc-123"] }

→ { "ok": true, "jobs": [{ "jobId": "abc-123", "documentId": "doc-456", "filename": "bol-123.pdf" }] }
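The three steps can be chained in a single function. This sketch assumes Node.js 18 or later (for the global fetch, FormData, and Blob), a single file per call, and a caller-supplied baseUrl, since this page does not state the API host. The uploadDocument name and the shape of the file argument are hypothetical.

```javascript
// file: { name, mimeType, bytes } where bytes is a Buffer or Uint8Array.
async function uploadDocument({ baseUrl, apiKey, file, fetchImpl = fetch }) {
  const headers = {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };

  // Step 1: request a presigned upload URL.
  const initRes = await fetchImpl(`${baseUrl}/api/v1/jobs`, {
    method: "POST",
    headers,
    body: JSON.stringify({
      files: [{ name: file.name, size: file.bytes.length, mimeType: file.mimeType }],
    }),
  });
  if (!initRes.ok) throw new Error(`Upload init failed: HTTP ${initRes.status}`);
  const { jobs } = await initRes.json();
  const job = jobs[0];

  // Step 2: presigned POST to S3. All uploadFields go first, the file goes last.
  const form = new FormData();
  for (const [key, value] of Object.entries(job.uploadFields)) {
    form.append(key, value);
  }
  form.append("file", new Blob([file.bytes], { type: file.mimeType }), file.name);
  const s3Res = await fetchImpl(job.uploadUrl, { method: "POST", body: form });
  if (!s3Res.ok) throw new Error(`S3 upload failed: HTTP ${s3Res.status}`);

  // Step 3: enqueue the uploaded job for processing.
  const enqueueRes = await fetchImpl(`${baseUrl}/api/v1/jobs/enqueue`, {
    method: "POST",
    headers,
    body: JSON.stringify({ jobIds: [job.jobId] }),
  });
  if (!enqueueRes.ok) throw new Error(`Enqueue failed: HTTP ${enqueueRes.status}`);

  return { jobId: job.jobId, documentId: job.documentId };
}
```

The fetchImpl parameter exists so the flow can be exercised with a stub in tests. Note that the S3 request deliberately carries no Authorization header and no explicit Content-Type, so that fetch can set the multipart boundary itself.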
Poll for Results
Poll GET /api/v1/jobs/:jobId until status is terminal. No auth required — jobIds are UUIDs known only to you.
# Poll until status is SUCCEEDED, FAILED, or REJECTED (no auth required)
GET /api/v1/jobs/abc-123

→ { "status": "SUCCEEDED", "documentId": "doc-456" }   # or PROCESSING, QUEUED…
Typical processing time is 5–30 seconds. Recommended poll interval: 2s initial, backing off to 5s. Status values: QUEUED, PROCESSING, SUCCEEDED, FAILED, REJECTED.
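A polling loop following this guidance might look like the sketch below. The 1-second step between the 2s and 5s intervals and the maxPolls cap are choices made for illustration, and baseUrl is supplied by the caller because the API host is not stated on this page. It assumes Node.js 18 or later for the global fetch.

```javascript
const TERMINAL_STATUSES = new Set(["SUCCEEDED", "FAILED", "REJECTED"]);
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function pollJob(jobId, { baseUrl, fetchImpl = fetch, sleep = defaultSleep, maxPolls = 60 } = {}) {
  let delayMs = 2000; // recommended initial interval
  for (let poll = 0; poll < maxPolls; poll++) {
    const res = await fetchImpl(`${baseUrl}/api/v1/jobs/${encodeURIComponent(jobId)}`);
    if (!res.ok) throw new Error(`Poll failed: HTTP ${res.status}`);
    const job = await res.json();
    if (TERMINAL_STATUSES.has(job.status)) return job;
    await sleep(delayMs);
    delayMs = Math.min(delayMs + 1000, 5000); // back off toward the 5s ceiling
  }
  throw new Error(`Job ${jobId} did not reach a terminal status after ${maxPolls} polls`);
}
```

A returned job with status FAILED or REJECTED is still a normal return value here, so callers should check job.status rather than rely on an exception.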
Retrieve Extracted Data
GET /api/v1/documents/doc-456/export?format=json
Authorization: Bearer cp_live_your_key

→ {
  "meta": { "documentType": "BILL_OF_LADING", "textSource": "embedded" },
  "data": {
    "bol_number": { "value": "BOL-12345", "confidence": 95, "flag": "ok" },
    "shipper_name": { "value": "ACME Corp", "confidence": 88, "flag": "ok" },
    "weight_lbs": { "value": "5000 LBS", "confidence": 55, "flag": "review" }
  },
  "lineItems": [
    {
      "groupId": "commodity_items",
      "label": "Commodity Items",
      "items": [
        {
          "commodity_description": { "value": "Electronics — Desktop Computers", "confidence": 88, "flag": "ok" },
          "commodity_weight": { "value": "2,500 LBS", "confidence": 82, "flag": "ok" },
          "commodity_pieces": { "value": "50", "confidence": 90, "flag": "ok" }
        }
      ]
    }
  ],
  "stats": { "total": 24, "ok": 18, "review": 4, "missing": 2 }
}
confidence is an integer in the range 0–100 (or null when the field is absent). flag values: ok (high confidence), review (below threshold), missing (not found). Supported formats: ?format=json, ?format=csv, ?format=xlsx, ?format=pdf. Add &templateId=tmpl_... to apply a column mapping template.
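A common next step is to collect every field whose flag is not ok so it can be routed to a reviewer. The sketch below walks both the flat data map and the line item rows from a JSON export; the function name and the path format are illustrative choices.

```javascript
// Returns [{ path, flag, confidence }] for every field flagged "review" or "missing".
function fieldsNeedingAttention(result) {
  const flagged = [];
  const visit = (fields, prefix) => {
    for (const [key, field] of Object.entries(fields || {})) {
      if (field && field.flag !== "ok") {
        flagged.push({ path: prefix + key, flag: field.flag, confidence: field.confidence });
      }
    }
  };
  visit(result.data, "");
  for (const group of result.lineItems || []) {
    (group.items || []).forEach((row, index) => {
      visit(row, `${group.groupId}[${index}].`);
    });
  }
  return flagged;
}
```

Applied to the sample response above, this would report weight_lbs with flag review and confidence 55.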
Endpoint Reference
Documents
/api/v1/documents
List documents. Supports ?limit=, ?cursor=, ?search=, ?status=, ?documentType=.
/api/v1/documents/:id
Get a single document. Response includes the doc metadata + latestResult (data, lineItems, stats, meta), qualityScore (0-100 overall confidence), qualityTier (clean | needs_review | approved), and, when a template applies, templateStats and resolvedTemplate.
/api/v1/documents/:id
Permanently delete a document and its S3 file.
/api/v1/documents/:id/export
Download extracted data. ?format=json|csv|xlsx|pdf. Optional &templateId=.
/api/v1/documents/:id/reprocess
Re-run extraction on an existing document. Optionally pass { "documentType": "RATE_CONFIRMATION" } to override the auto-classified type. Free, does not use a credit.
/api/v1/documents/:id/fields
Update extracted field values. Body: { fields: { field_key: <value> }, lineItems?, expectedUpdatedAt? }. Each <value> can be (a) a bare string ("ACME Inc."), (b) null to clear the field, or (c) a full FlaggedField object { value, confidence, flag, edited }. expectedUpdatedAt is an optimistic-concurrency token (returns 409 if the server-side updatedAt has moved on since you read it).
/api/v1/documents/:id/approve
Mark a needs-review document as approved. Clears review flags on populated fields, sets qualityTier=approved, and fires the document.approved webhook/email.
/api/v1/documents/:id/refire
Manually re-fire document.completed to your webhooks and emails (e.g. after editing fields). Payload carries manualRefire: true so subscribers can distinguish a manual replay from a fresh extraction. Returns { ok, fired: { email, webhook } }.
/api/v1/documents/search
Advanced search. Body: { documentType?, fields?: { field_key: "substring" }, dateRange?: { from, to }, q?, cursor?, limit? }. Field filters use AND; q uses OR across high-value fields.
/api/v1/documents/:id/viewer
Get a presigned S3 URL (15-min expiry) for the original file. Returns { presignedUrl, pageUrls, fileType, isImage, sourceFileDeleted }. Useful for embedding the source document in your own UI.
Jobs
/api/v1/jobs
Initiate upload. Returns presigned S3 POST URLs. Body: { files: [{ name, size, mimeType }] }.
/api/v1/jobs/enqueue
Queue jobs for processing after S3 upload. Body: { jobIds: [...] }.
/api/v1/jobs/:jobId
Poll job status. No auth required.
Export Templates
/api/v1/export-templates
List templates. Optional ?documentType=BILL_OF_LADING.
/api/v1/export-templates
Create a template. Body: { name, documentType, columns: [{ sourceField, outputName }], lineItemGroups?: [{ groupId, sheetName, columns: [...] }] }.
/api/v1/export-templates/:id
Get a single template.
/api/v1/export-templates/:id
Update a template.
/api/v1/export-templates/:id
Delete a template.
Batch Export
/api/v1/documents/export-batch
Export multiple documents. Body: { documentIds: [...], templateId, format?: "xlsx"|"csv"|"json" }. Max 100 docs.
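To export more than 100 documents, the ID list has to be split across several requests. The sketch below builds one request body per batch using the body shape documented above; the function name is hypothetical, and sending the requests is left to the caller.

```javascript
const MAX_BATCH_EXPORT_DOCS = 100; // documented limit per export-batch request

// Returns an array of request bodies, each with at most 100 document IDs.
function buildBatchExportBodies(documentIds, templateId, format = "xlsx") {
  const bodies = [];
  for (let start = 0; start < documentIds.length; start += MAX_BATCH_EXPORT_DOCS) {
    bodies.push({
      documentIds: documentIds.slice(start, start + MAX_BATCH_EXPORT_DOCS),
      templateId,
      format,
    });
  }
  return bodies;
}
```

Each batch produces its own export, so results for large sets would need to be merged on the client.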
Errors
Error Handling
All error responses return JSON with an error field. Here's how to handle common cases:
# Rate limit exceeded: retry after the indicated wait
HTTP 429
{ "ok": false, "error": "Rate limit exceeded" }
# → Check Retry-After header and retry after that many seconds

# Plan limit reached: user needs to upgrade or wait for reset
HTTP 402
{ "ok": false, "error": "Monthly document limit reached" }
# → Show a message to the user; limits reset on their billing anniversary

# File validation failure
HTTP 400
{ "ok": false, "error": "Unsupported file type" }
# → Accepted types: PDF, JPEG, PNG, TIFF (detected from file content, not extension)

# Extraction failed: the document could not be processed
GET /api/v1/jobs/:id → { "status": "FAILED", "error": "..." }
# → Call POST /api/v1/documents/:id/reprocess to retry (free, no credit charged)
Rate Limits & Retry Strategy
When you exceed the rate limit, the API returns HTTP 429 with headers indicating when you can retry:
HTTP 429 Too Many Requests
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1710680460
Retry-After: 42

{ "ok": false, "error": "Rate limit exceeded" }
Recommended Retry Strategy
Honor the Retry-After header when it is present, and fall back to exponential backoff otherwise:
- If Retry-After is present, wait that many seconds before retrying.
- Otherwise, use exponential backoff: 1s, 2s, 4s, 8s, up to 60s max.
- Add random jitter (0-500ms) to prevent thundering herd on shared rate windows.
- After 5 consecutive failures, stop retrying and alert your monitoring system.
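The strategy above can be sketched as a delay calculator plus a small retry loop. The function names are hypothetical, the loop treats only HTTP 429 as retryable, and it expects doRequest to resolve to a fetch-style response with status and headers.get().

```javascript
// Delay before the next attempt, in ms. attempt is 0 for the first retry.
function retryDelayMs(attempt, retryAfterHeader, random = Math.random) {
  const jitterMs = Math.floor(random() * 501); // 0-500ms of jitter
  const retryAfterSeconds = Number(retryAfterHeader);
  const hasRetryAfter =
    retryAfterHeader != null &&
    retryAfterHeader !== "" &&
    Number.isFinite(retryAfterSeconds) &&
    retryAfterSeconds >= 0;
  if (hasRetryAfter) return retryAfterSeconds * 1000 + jitterMs;
  // Exponential backoff: 1s, 2s, 4s, 8s, ... capped at 60s.
  return Math.min(1000 * 2 ** attempt, 60000) + jitterMs;
}

const sleepMs = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function requestWithRetry(doRequest, { maxFailures = 5, sleep = sleepMs, random = Math.random } = {}) {
  let failures = 0;
  while (failures < maxFailures) {
    const res = await doRequest();
    if (res.status !== 429) return res;
    failures++;
    if (failures < maxFailures) {
      await sleep(retryDelayMs(failures - 1, res.headers.get("Retry-After"), random));
    }
  }
  // The caller decides how to alert monitoring after giving up.
  throw new Error(`Still rate limited after ${maxFailures} consecutive attempts`);
}
```

Passing sleep and random as options keeps the loop deterministic under test while defaulting to real timers and Math.random in production.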