/document-to-json
Extract structured JSON from any document.
Define your schema, send your file — get clean JSON back.
# Extract structured JSON from any document
curl -X POST https://api.documentai.dev/document-to-json/v1 \
  -H "x-api-key: YOUR_API_KEY" \
  -F "url=https://example.com/invoice.pdf" \
  -F 'schema={
    "company": "string",
    "paid": "boolean",
    "items": [{"name": "string", "price": "number"}]
  }'
Documentation
Endpoint
https://api.documentai.dev/document-to-json/v1
Authentication
| Header | Required | Description |
|---|---|---|
| x-api-key | Required | Your API key. Get one by signing up. |
Request Body
Upload a file, provide a public URL, or send plain text.
| Field | Type | Required |
|---|---|---|
| url | string | One of three |
| file | file | One of three |
| text | string | One of three |
| schema | object | Required |
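The "one of three" rule in the table can be enforced on the client before anything is sent. The following is a minimal sketch using only the Python standard library. It assumes the multipart form encoding used by the curl example and covers only the `url` and `text` inputs; the function name `build_request` is hypothetical, and file upload is left out for brevity.

```python
import json
import uuid
import urllib.request

ENDPOINT = "https://api.documentai.dev/document-to-json/v1"


def build_request(api_key, schema, url=None, text=None):
    """Build (but do not send) a multipart POST for a URL or text input."""
    # Exactly one input source may be set; file upload is omitted in this sketch.
    sources = {"url": url, "text": text}
    provided = {k: v for k, v in sources.items() if v is not None}
    if len(provided) != 1:
        raise ValueError("provide exactly one of url or text")

    fields = dict(provided)
    # Assumption: the schema travels as a JSON string form field, as in the curl example.
    fields["schema"] = json.dumps(schema)

    # Hand-rolled multipart/form-data body with a random boundary.
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")

    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; building and sending are kept separate here so the validation can be exercised without network access.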
Accepted File Types
JSON Schema Examples
Define the structure you want — from flat key-value pairs to deeply nested schemas.
{
  "type": "object",
  "properties": {
    "company": { "type": "string" },
    "total": { "type": "number" },
    "paid": { "type": "boolean" }
  }
}
Tip: use the description keyword to guide the extraction. Add it to any field to tell the AI what it should contain, or at the root of your schema to give overall extraction instructions. This improves accuracy on ambiguous documents.
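As an illustration of the tip, the flat schema above could carry a description at the root and on each field. This is only a sketch: the description texts are invented for the example, and the schema is serialized with `json.dumps` on the assumption that it is sent as a string form field.

```python
import json

# Same three fields as the flat example, with illustrative descriptions added.
schema = {
    "type": "object",
    "description": "Extract billing details from a vendor invoice.",
    "properties": {
        "company": {
            "type": "string",
            "description": "Legal name of the issuing vendor, not the customer.",
        },
        "total": {
            "type": "number",
            "description": "Grand total including tax.",
        },
        "paid": {
            "type": "boolean",
            "description": "True only if the invoice is marked as paid.",
        },
    },
}

# Serialized form, ready to be used as the value of the schema field.
schema_field = json.dumps(schema)
```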
Supported keywords
Example Response
{
  "status": "success",
  "data": {
    "company": "Acme Corp",
    "total": 1249.5,
    "paid": true
  }
}
Limits
Max Input
| Limit | Value |
|---|---|
| Max file size | 30 MB |
| Max pages per document | 1,000 |
| Max input tokens | 1,000,000 |
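A client can check these limits before uploading. The sketch below is hypothetical: the function name is made up, and because the page does not say whether "MB" is decimal or binary, it assumes the stricter decimal reading (30,000,000 bytes) so that a file passing the check fits under either interpretation. Page and token counts must be supplied by the caller.

```python
MAX_FILE_BYTES = 30 * 1000 * 1000  # assumption: decimal MB, the stricter reading
MAX_PAGES = 1_000
MAX_INPUT_TOKENS = 1_000_000


def check_input_limits(file_bytes=None, pages=None, input_tokens=None):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if file_bytes is not None and file_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {file_bytes} bytes, limit is {MAX_FILE_BYTES}")
    if pages is not None and pages > MAX_PAGES:
        problems.append(f"document has {pages} pages, limit is {MAX_PAGES}")
    if input_tokens is not None and input_tokens > MAX_INPUT_TOKENS:
        problems.append(f"input has {input_tokens} tokens, limit is {MAX_INPUT_TOKENS}")
    return problems
```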
Max Output
The API allocates output capacity based on your input size, so larger documents can produce richer extractions.
For URL or file input, the maximum output is 2,000 tokens per input page.
For text input, the maximum output is twice the input token count.
The minimum is always 2,000 tokens, and the absolute maximum is 60,000 tokens per request.
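The output rules above amount to a per-input allowance clamped between a floor and a ceiling. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the page or token count is assumed to be known by the caller.

```python
MIN_OUTPUT_TOKENS = 2_000
MAX_OUTPUT_TOKENS = 60_000


def max_output_tokens(pages=None, input_tokens=None):
    """Estimate the output cap for a page-based (URL/file) or text input."""
    if (pages is None) == (input_tokens is None):
        raise ValueError("pass exactly one of pages or input_tokens")
    # URL/file: 2,000 tokens per page. Text: twice the input token count.
    raw = 2_000 * pages if pages is not None else 2 * input_tokens
    # Clamp to the documented floor and ceiling.
    return max(MIN_OUTPUT_TOKENS, min(raw, MAX_OUTPUT_TOKENS))
```

For example, a 10-page PDF gets 20,000 tokens, a 50-page PDF is capped at 60,000, and a 500-token text input is raised to the 2,000-token floor.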
Cost
- File inputs: 1 credit per page.
- Text inputs: 1 credit per 1,000 tokens.
- Only input is billed; output size does not affect cost.
- Failed requests are not charged.
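The pricing rules can be turned into a rough estimate. This is a sketch with hypothetical function names; the page does not say how partial thousands of tokens are rounded, so the text estimate is returned as an unrounded fraction rather than guessing.

```python
def credits_for_file(pages):
    """File inputs: 1 credit per page."""
    return pages


def credits_for_text(input_tokens):
    """Text inputs: 1 credit per 1,000 tokens (rounding not specified, so left fractional)."""
    return input_tokens / 1000
```

Under these rules a 12-page file costs 12 credits, and a 4,500-token text input comes to 4.5 credits before whatever rounding the service applies.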