/document-to-markdown
Convert any document to clean Markdown.
Define formatting options, send your file — get Markdown back.
curl -X POST https://api.documentai.dev/document-to-markdown/v1 \
-H "x-api-key: YOUR_API_KEY" \
-F "url=https://example.com/invoice.pdf" \
-F 'options={"headings": true, "tables": true}'
Reference
Documentation
Endpoint
POST https://api.documentai.dev/document-to-markdown/v1
Authentication
| Header | Required | Description |
|---|---|---|
| x-api-key | Required | Your API key. Get one by signing up. |
Request Body
Upload a file, provide a public URL, or send plain text.
| Field | Type | Required |
|---|---|---|
| url | string | One of three |
| file | file | One of three |
| text | string | One of three |
| options | object | Optional |
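As a rough illustration of the table above, the sketch below assembles the form fields for one request. `build_form_fields` is a hypothetical helper, not part of the API, and it assumes that "One of three" means exactly one of `url`, `file`, or `text` per request. It only prepares field values and sends nothing.

```python
import json
from typing import Optional

ENDPOINT = "https://api.documentai.dev/document-to-markdown/v1"


def build_form_fields(
    url: Optional[str] = None,
    file_path: Optional[str] = None,
    text: Optional[str] = None,
    options: Optional[dict] = None,
) -> dict:
    """Return the form fields for one request (hypothetical helper).

    Assumption: exactly one of url, file, or text must be supplied.
    """
    provided = {
        name: value
        for name, value in (("url", url), ("file", file_path), ("text", text))
        if value is not None
    }
    if len(provided) != 1:
        raise ValueError("provide exactly one of: url, file, text")

    # For "file", the value here is only a local path; a real client
    # would upload the file contents as a multipart part.
    fields = dict(provided)
    if options is not None:
        # The options field carries a JSON string, not a nested object.
        fields["options"] = json.dumps(options)
    return fields


fields = build_form_fields(
    url="https://example.com/invoice.pdf",
    options={"headings": True, "tables": True},
)
print(ENDPOINT, fields)
```

The resulting fields could then be handed to any HTTP client that supports multipart form uploads, together with the `x-api-key` header.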
Accepted File Types
- PDF
- Word (.doc, .docx)
- Images (PNG, JPG, WebP, GIF, BMP)
- PowerPoint (.ppt, .pptx)
- Excel (.xls, .xlsx, .csv)
- HTML
- RTF / TXT
- OpenDocument (ODT, ODS, ODP)
- Code files
- Plain text
Options Reference
Customize how the Markdown is generated. Pass these options as a JSON string in the options field.
| Option | Type | Default | Description |
|---|---|---|---|
| headings | boolean | true | Render text serving as titles or section labels with heading prefixes (#, ##). |
| lists | boolean | true | Render list items with list markers (-, 1.). |
| nested_lists | boolean | true | Maintain hierarchical indentation for nested list items. |
| tables | boolean | true | Extract tabular data using Markdown table syntax. |
| bold | boolean | true | Apply bold formatting (**text**) for emphasized text. |
| italic | boolean | true | Apply italic formatting (*text*) where appropriate. |
| code_blocks | boolean | true | Wrap code snippets in triple backticks. |
| links | boolean | false | Extract URLs as Markdown links [text](url). |
| math | boolean | true | Render mathematical formulas using LaTeX math syntax. |
| page_delimiters | boolean | false | Insert <!-- page N --> markers between pages instead of returning one continuous output. |
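The defaults in the table can be mirrored client-side so that only the overrides are sent. The sketch below is one possible approach: `DEFAULTS` copies the documented values, `options_field` is a hypothetical helper, and it assumes that any option left out of the JSON string falls back to its documented default on the server.

```python
import json

# Documented defaults, copied from the options table.
DEFAULTS = {
    "headings": True,
    "lists": True,
    "nested_lists": True,
    "tables": True,
    "bold": True,
    "italic": True,
    "code_blocks": True,
    "links": False,
    "math": True,
    "page_delimiters": False,
}


def options_field(**overrides: bool) -> str:
    """Serialize only the options that differ from the documented defaults."""
    for name, value in overrides.items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown option: {name}")
        if not isinstance(value, bool):
            raise TypeError(f"option {name} must be a boolean")
    # Assumption: omitted options fall back to their defaults server-side.
    changed = {k: v for k, v in overrides.items() if DEFAULTS[k] != v}
    return json.dumps(changed, sort_keys=True)


print(options_field(links=True, page_delimiters=True, tables=True))
# prints {"links": true, "page_delimiters": true}
```

Rejecting unknown names locally catches typos such as `nested_list` before a request is made.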
Example Response
200 OK
{
"status": "success",
"data": {
"markdown": "# Invoice #12345\n\n**From:** Acme Corp \n**Date:** April 28, 2026 \n**Due:** May 15, 2026\n\n### Items: \n- Widget Pro x5 — $199.99 each \n- Gadget Lite x10 — $24.50 each\n\n| Description | Amount |\n| --- | --- |\n| Subtotal | $1,244.95 |\n| Tax (8%) | $99.60 |\n| **Total** | **$1,344.55** |\n\n**Status:** *Unpaid*",
"text": "Invoice #12345\n\nFrom: Acme Corp\nDate: April 28, 2026\nDue: May 15, 2026\n\nItems:\n- Widget Pro x5 — $199.99 each\n- Gadget Lite x10 — $24.50 each\n\nDescription Amount\nSubtotal $1,244.95\nTax (8%) $99.60\nTotal $1,344.55\n\nStatus: Unpaid"
}
}
Limits
Max Input
| Limit | Value |
|---|---|
| Max file size | 30 MB |
| Max pages per document | 1,000 |
| Max input tokens | 1,000,000 |
Max Output
If the input is a URL or a file, the maximum output is 2,000 tokens per input page.
If the input is text, the maximum output is twice the input token count.
The output limit is never lower than 2,000 tokens and never higher than 60,000 tokens per request.
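The output rules can be read as a clamp, sketched below. `max_output_tokens` is a hypothetical helper, and it assumes the 2,000-token floor and the 60,000-token ceiling apply the same way to URL, file, and text inputs.

```python
from typing import Optional

TOKENS_PER_PAGE = 2_000   # URL or file input
FLOOR = 2_000             # the limit is never below this
CEILING = 60_000          # absolute maximum per request


def max_output_tokens(
    pages: Optional[int] = None,
    input_tokens: Optional[int] = None,
) -> int:
    """Estimate the output token limit for one request."""
    if (pages is None) == (input_tokens is None):
        raise ValueError("pass exactly one of pages or input_tokens")
    if pages is not None:
        raw = pages * TOKENS_PER_PAGE   # URL or file: per-page allowance
    else:
        raw = 2 * input_tokens          # text: twice the input size
    # Clamp the raw allowance between the floor and the ceiling.
    return max(FLOOR, min(CEILING, raw))


print(max_output_tokens(pages=3))             # 6000
print(max_output_tokens(pages=100))           # 60000 (ceiling)
print(max_output_tokens(input_tokens=500))    # 2000 (floor)
print(max_output_tokens(input_tokens=10000))  # 20000
```

Under these rules a file of 30 pages or more already reaches the 60,000-token ceiling.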
Cost
- File inputs: 1 credit per page.
- Text inputs: 1 credit per 1,000 tokens.
- Only input is billed; output size does not affect cost.
- Failed requests are not charged.
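A minimal sketch of the pricing rules follows. Both function names are hypothetical. How a partial block of 1,000 tokens is rounded is not stated above, so the text estimate is left unrounded rather than guessing.

```python
def credits_for_file(pages: int) -> int:
    """File inputs: 1 credit per page."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages


def credits_for_text(input_tokens: int) -> float:
    """Text inputs: 1 credit per 1,000 input tokens.

    Rounding of partial thousands is unspecified in the source,
    so no rounding is applied here.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1000


print(credits_for_file(12))    # 12
print(credits_for_text(2500))  # 2.5
```

Output size is deliberately absent from both functions, since only input is billed.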