/store-and-search-documents
Store and semantically search your documents.
Upload your files with metadata – Query them by meaning and filters.
# 1. Store a document with metadata filters in a dataspace
curl -X POST https://api.documentai.dev/store-and-search-documents/v1/dataspaces/:dataspaceId/documents \
-H "x-api-key: YOUR_API_KEY" \
-F "file=@company-wiki.pdf" \
-F 'filters={"string_1": "hr-policies", "number_1": 2025}'
# 2. Search your documents by meaning + filters
curl -X POST https://api.documentai.dev/store-and-search-documents/v1/dataspaces/:dataspaceId/documents/search \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "What is the remote work policy?",
"filters": [{"field": "string_1", "operator": "==", "value": "hr-policies"}],
"limit": 5
}'
We handle document conversion, markdown extraction, chunking, embedding and vector databases for you.
Use our search back-end to build your knowledge bases, RAG pipelines, internal search tools, support bots...
Why This API
Cross-Language Search
Search in any language and find results in any other. Store a document in German and query it in English: the API supports more than 100 languages.
Semantic Understanding
Powered by embeddings, the search engine matches by meaning, not literal text. Synonyms, paraphrases and related concepts are all captured.
Metadata Filters
Attach up to 7 typed filter fields per document. Combine semantic search with precise metadata conditions to narrow results before ranking.
No Infrastructure to Manage
A single POST stores your document. Conversion, chunking, embedding and indexing all happen automatically behind one endpoint.
Dataspaces
Organize documents into isolated dataspaces. Keep different clients, projects or environments fully separated — search never leaks across boundaries.
All File Types Supported
PDF, Word, Excel, PowerPoint, images, HTML, code files and more. Send any document and we extract the content for you.
Getting Started
Get your API key
Sign up for free to receive your API key.
The free plan includes 100 credits per month and 200 document storage slots to get you started.
Create a dataspace
A dataspace is an isolated container for your documents.
Create one with a simple POST /dataspaces.
Create multiple dataspaces if you want to isolate data (e.g. for different environments, projects, or clients).
Store & search
Upload a file or plain text to your dataspace via our API.
Then search by meaning with a single query.
Results are ranked by semantic relevance and returned with a similarity score.
Documentation
Endpoint Base URL
https://api.documentai.dev/store-and-search-documents/v1
Authentication
| Header | Required | Description |
|---|---|---|
| x-api-key | Required | Your API key. Get one by signing up. |
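As a rough illustration of the base URL and the x-api-key header working together, the sketch below prepares an authenticated request using only Python's standard library. The helper name `build_request` is hypothetical and not part of the API; the request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.documentai.dev/store-and-search-documents/v1"


def build_request(method, path, api_key, json_body=None):
    """Build (but do not send) an authenticated request.

    `build_request` is an illustrative helper, not part of the API.
    """
    headers = {"x-api-key": api_key}
    data = None
    if json_body is not None:
        # JSON endpoints such as search expect application/json.
        data = json.dumps(json_body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Prepare a "create dataspace" call (no request body is required).
req = build_request("POST", "/dataspaces", "YOUR_API_KEY")
```

Passing `req` to `urllib.request.urlopen` would perform the call; any HTTP client that lets you set a custom header works the same way.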
Endpoints
| Method | Endpoint | Description |
|---|---|---|
| Documents | | |
| POST | /dataspaces/:id/documents | Store a new document |
| POST | /dataspaces/:id/documents/search | Semantic search over documents in a dataspace |
| GET | /dataspaces/:id/documents | List documents (paginated) |
| GET | /dataspaces/:id/documents/:docId | Retrieve a specific document |
| PATCH | /dataspaces/:id/documents/:docId | Update document metadata filters only |
| DELETE | /dataspaces/:id/documents/:docId | Delete a document |
| Dataspaces | | |
| POST | /dataspaces | Create a new dataspace |
| GET | /dataspaces | List dataspaces (paginated) |
| GET | /dataspaces/:id | Retrieve a dataspace |
| DELETE | /dataspaces/:id | Delete a dataspace (and all documents it contains) |
Store Document
/dataspaces/:id/documents
Upload a document to have it stored and searchable.
You can provide a file, a public URL, or raw text.
You can also attach optional metadata filters.
Request Body
Send data as multipart/form-data.
| Field | Type | Required |
|---|---|---|
| url | string | One of three |
| file | file | One of three |
| text | string | One of three |
| filters | object | Optional |
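The "one of three" rule can be checked on the client before uploading. The sketch below assumes that exactly one of url, file or text must be present (the table does not say what happens if several are sent) and that filters travels as a JSON string inside the form, as in the curl example at the top of this page. `build_store_fields` is a hypothetical helper name.

```python
import json


def build_store_fields(url=None, file_path=None, text=None, filters=None):
    """Return the multipart form fields for a store request.

    Assumption: exactly one of url / file / text must be given.
    """
    sources = {"url": url, "file": file_path, "text": text}
    provided = {name: value for name, value in sources.items() if value is not None}
    if len(provided) != 1:
        raise ValueError("provide exactly one of: url, file, text")
    fields = dict(provided)
    if filters is not None:
        # The filters object is sent as a JSON string in the form,
        # like -F 'filters={...}' in the curl example.
        fields["filters"] = json.dumps(filters)
    return fields


fields = build_store_fields(
    text="Employees are allowed to work remotely 3 days a week.",
    filters={"string_1": "hr-policies", "number_1": 2025},
)
```

For a file upload, the `file` entry here only holds the local path; the actual bytes still need to be attached by a multipart-capable HTTP client.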
Filter Fields for Metadata
Key names are fixed
You can attach up to 7 distinct filter fields per document.
These must be formatted as a flat JSON object in the filters field.
You must use these exact keys:
number_1, number_2, number_3
string_1, string_2, string_3, string_4
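Because the key names are fixed, a small client-side check can catch typos before a request is sent. This is only a sketch: it assumes string_* keys hold strings and number_* keys hold integers or floats, and the function name `validate_document_filters` is made up for illustration.

```python
STRING_KEYS = {"string_1", "string_2", "string_3", "string_4"}
NUMBER_KEYS = {"number_1", "number_2", "number_3"}


def validate_document_filters(filters):
    """Raise ValueError unless `filters` is a flat object using the fixed keys."""
    for key, value in filters.items():
        if key in STRING_KEYS:
            if not isinstance(value, str):
                raise ValueError(f"{key} must be a string")
        elif key in NUMBER_KEYS:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"{key} must be a number")
        else:
            raise ValueError(f"unknown filter key: {key}")
    return filters


validate_document_filters({"string_1": "hr-policies", "number_1": 2025})
```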
Example:
{
"string_1": "hr-policies",
"string_2": "internal",
"string_3": "onboarding",
"string_4": "v2",
"number_1": 2025,
"number_2": -3.14159,
"number_3": 42
}
Document Fields
Every stored document contains the following fields.
| Field | Type |
|---|---|
| id | string |
| text | string |
| markdown | string |
| filters | object |
| created_at | string |
Example Responses
File or URL input
{
"status": "success",
"data": {
"id": "8f3kLmNpQ2xR4vW1",
"markdown": "# Company Wiki\n\n## Remote Work\nEmployees are allowed...",
"text": "Company Wiki\nRemote Work\nEmployees are allowed...",
"filters": {
"string_1": "hr-policies",
"number_1": 2025
},
"created_at": "2026-04-30T12:00:00Z"
}
}
Text input
{
"status": "success",
"data": {
"id": "Kp9nWxYz5TmR3qL7",
"markdown": "",
"text": "Employees are allowed to work remotely 3 days a week.\nA monthly stipend of $200 is provided for home office equipment.",
"filters": {
"string_1": "hr-policies",
"number_1": 2025
},
"created_at": "2026-04-30T12:05:00Z"
}
}
Search Documents
/dataspaces/:id/documents/search
Perform a semantic vector search across all documents in a dataspace.
You can also apply complex metadata filters to narrow down the results before ranking.
Request Body
Send data as application/json.
| Field | Type | Required |
|---|---|---|
| query | string | Required |
| limit | number | Optional |
| filters | array | Optional |
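A search body can be assembled from these three fields with a few lines of Python. The helper names `rule` and `build_search_body` are illustrative only; optional fields are simply left out when not supplied.

```python
import json


def rule(field, operator, value):
    """One filter rule: an object with field, operator and value."""
    return {"field": field, "operator": operator, "value": value}


def build_search_body(query, limit=None, filters=None):
    """Assemble the JSON body for a search request."""
    if not query:
        raise ValueError("query is required")
    body = {"query": query}
    if limit is not None:
        body["limit"] = limit
    if filters:
        body["filters"] = list(filters)
    return body


body = build_search_body(
    "What is the remote work policy?",
    limit=5,
    filters=[rule("string_1", "==", "hr-policies")],
)
payload = json.dumps(body)  # send as application/json
```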
Filter Rules
Each rule in the filters array must be an object with three properties: field, operator, and value.
Operators can be combined freely, subject to the limits below.
| Field Type | Supported Operators |
|---|---|
| number_1, number_2, number_3 | ==, !=, >, >=, <, <=, in, not-in |
| string_1, string_2, string_3, string_4 | ==, !=, in, not-in |
Limits:
- You can use a maximum of 7 filter rules in a request.
- You can only use one in or not-in operator per search query.
- You can only use one != or not-in filter per search query.
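The operator table and the limits above can be turned into a pre-flight check. The following is a minimal sketch of those rules as stated; the function name `validate_search_filters` is hypothetical and the server remains the source of truth.

```python
NUMBER_FIELDS = {"number_1", "number_2", "number_3"}
STRING_FIELDS = {"string_1", "string_2", "string_3", "string_4"}
NUMBER_OPS = {"==", "!=", ">", ">=", "<", "<=", "in", "not-in"}
STRING_OPS = {"==", "!=", "in", "not-in"}
MAX_RULES = 7


def validate_search_filters(rules):
    """Raise ValueError if the rules break the documented limits."""
    if len(rules) > MAX_RULES:
        raise ValueError("at most 7 filter rules per request")
    for r in rules:
        field, op = r["field"], r["operator"]
        if field in NUMBER_FIELDS:
            allowed = NUMBER_OPS
        elif field in STRING_FIELDS:
            allowed = STRING_OPS
        else:
            raise ValueError(f"unknown field: {field}")
        if op not in allowed:
            raise ValueError(f"operator {op} is not supported for {field}")
    ops = [r["operator"] for r in rules]
    # Only one in / not-in rule per query.
    if sum(op in ("in", "not-in") for op in ops) > 1:
        raise ValueError("only one in or not-in operator per query")
    # Only one != / not-in rule per query.
    if sum(op in ("!=", "not-in") for op in ops) > 1:
        raise ValueError("only one != or not-in filter per query")
    return rules
```

Note that not-in counts toward both limits, so a query containing one not-in rule cannot also contain an in rule or a != rule.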
Request Examples
Basic query
{
"query": "What is the remote work policy?"
}
With limit
{
"query": "What is the remote work policy?",
"limit": 5
}
Single filter
{
"query": "How do I cancel my subscription?",
"limit": 5,
"filters": [
{ "field": "string_1", "operator": "==", "value": "support" }
]
}
Multiple filters
{
"query": "GDPR compliance requirements for user data",
"limit": 10,
"filters": [
{ "field": "string_1", "operator": "==", "value": "legal" },
{ "field": "number_1", "operator": ">=", "value": 2024 },
{ "field": "string_2", "operator": "!=", "value": "draft" }
]
}
Range + in operator
{
"query": "pricing model",
"limit": 20,
"filters": [
{ "field": "number_2", "operator": ">=", "value": 3.14 },
{ "field": "number_2", "operator": "<", "value": 3.15 },
{ "field": "string_4", "operator": "in", "value": ["finance", "sales", "marketing"] }
]
}
Example Response
{
"status": "success",
"data": [
{
"score": 0.892,
"document": {
"id": "8f3kLmNpQ2xR4vW1",
"text": "Employees are allowed to work remotely 3 days a week...",
"markdown": "## Remote Work\nEmployees are allowed to work **remotely** 3 days a week...",
"filters": {
"string_1": "hr-policies",
"number_1": 2025
}
}
},
{
"score": 0.814,
"document": {
"id": "Yt7nBcDe9FgH2jK5",
"text": "A monthly stipend of $200 is provided for home office equipment...",
"markdown": "",
"filters": {
"string_1": "hr-policies",
"number_1": 2025
}
}
},
{
"score": 0.743,
"document": {
"id": "Zw6mXsAp3RqV8uT4",
"text": "Remote employees must be available during core hours 10am-4pm...",
"markdown": "## Availability\nRemote employees must be available during core hours **10am-4pm**...",
"filters": {
"string_1": "hr-policies",
"number_1": 2024
}
}
}
]
}
The markdown field is populated for documents stored from files or URLs. For documents stored from text input, it is an empty string.
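Since markdown may be empty, a client that feeds search hits into a RAG prompt will probably want a fallback to text. The sketch below shows one way to do that; `best_content`, `build_context` and the `min_score` threshold are illustrative choices, not API features.

```python
def best_content(document):
    """Prefer markdown; fall back to text when markdown is an empty string."""
    return document["markdown"] or document["text"]


def build_context(search_response, min_score=0.0):
    """Join the content of all hits scoring at least `min_score`.

    `min_score` is a hypothetical client-side cutoff.
    """
    hits = [h for h in search_response["data"] if h["score"] >= min_score]
    return "\n\n".join(best_content(h["document"]) for h in hits)


response = {
    "status": "success",
    "data": [
        {"score": 0.892, "document": {"id": "a", "text": "Remote work...", "markdown": "## Remote Work"}},
        {"score": 0.814, "document": {"id": "b", "text": "A monthly stipend...", "markdown": ""}},
    ],
}
context = build_context(response, min_score=0.8)
```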
Other Endpoints
Detailed specifications for all remaining CRUD and management endpoints.
Pagination
Both GET /dataspaces and GET /dataspaces/:id/documents support cursor-based pagination.
| Query Parameter | Default |
|---|---|
| limit | 20 |
| cursor | — |
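A cursor loop over either paginated endpoint might look like the sketch below. It relies on the has_more and next_cursor fields of the pagination object returned with every page, and it takes a caller-supplied `fetch_page(limit, cursor)` function (a hypothetical callback that performs the GET and returns the parsed JSON), so the loop itself stays independent of any HTTP library.

```python
def iter_items(fetch_page, limit=20):
    """Yield every item from a paginated endpoint.

    `fetch_page(limit=..., cursor=...)` is supplied by the caller and must
    return the parsed response dict; cursor is None for the first page.
    """
    cursor = None
    while True:
        page = fetch_page(limit=limit, cursor=cursor)
        yield from page["data"]
        pagination = page["pagination"]
        # Stop on has_more rather than on a missing cursor: the last page
        # can still carry a next_cursor value.
        if not pagination["has_more"]:
            break
        cursor = pagination["next_cursor"]
```

Checking has_more instead of the cursor matters because, in the sample responses on this page, the final page still includes a next_cursor.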
Every paginated response includes a pagination object:
{
"status": "success",
"data": [ ... ],
"pagination": {
"has_more": true,
"next_cursor": "8f3kLmNpQ2xR4vW1"
}
}
List Documents
/dataspaces/:id/documents
Returns a paginated list of documents in a dataspace, ordered by creation date (newest first).
{
"status": "success",
"data": [
{
"id": "8f3kLmNpQ2xR4vW1",
"markdown": "# Company Wiki\n\n## Remote Work\nEmployees are allowed...",
"text": "Company Wiki\nRemote Work\nEmployees are allowed...",
"filters": {
"string_1": "hr-policies",
"number_1": 2025
},
"created_at": "2026-04-30T12:00:00Z"
},
{
"id": "Kp9nWxYz5TmR3qL7",
"markdown": "",
"text": "A monthly stipend of $200 is provided for home office equipment...",
"filters": {
"string_1": "hr-policies",
"number_1": 2025
},
"created_at": "2026-04-30T11:45:00Z"
}
],
"pagination": {
"has_more": true,
"next_cursor": "Kp9nWxYz5TmR3qL7"
}
}
Get Document
/dataspaces/:id/documents/:docId
Retrieve a single document by its ID.
{
"status": "success",
"data": {
"id": "8f3kLmNpQ2xR4vW1",
"markdown": "# Company Wiki\n\n## Remote Work\nEmployees are allowed...",
"text": "Company Wiki\nRemote Work\nEmployees are allowed...",
"filters": {
"string_1": "hr-policies",
"number_1": 2025
},
"created_at": "2026-04-30T12:00:00Z"
}
}
Update Document
/dataspaces/:id/documents/:docId
Update metadata filters only, on an existing document.
Send data as application/json with a filters object.
- Partial merge: Only the keys you include are updated. Existing filters you don't mention are left unchanged.
- Delete a filter: Send null for a key to remove it entirely.
- Content is immutable: You cannot update the text, file, or markdown of a document. Delete and re-store it instead.
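To predict what a PATCH will leave behind, the merge rules above can be modelled locally. This is a client-side approximation of the described behaviour, not the server's implementation, and `merge_filters` is a hypothetical name; Python's None stands in for JSON null.

```python
def merge_filters(existing, patch):
    """Apply partial-merge semantics: update given keys, drop keys set to None."""
    merged = dict(existing)
    for key, value in patch.items():
        if value is None:
            # null removes the filter entirely.
            merged.pop(key, None)
        else:
            merged[key] = value
    return merged


before = {"string_1": "hr-policies", "string_2": "internal", "number_1": 2025}
after = merge_filters(
    before, {"string_1": "updated-category", "number_1": 2026, "string_2": None}
)
```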
Request body
{
"filters": {
"string_1": "updated-category",
"number_1": 2026,
"string_2": null
}
}
Response
{
"status": "success",
"data": {
"id": "8f3kLmNpQ2xR4vW1",
"markdown": "# Company Wiki\n\n## Remote Work\nEmployees are allowed...",
"text": "Company Wiki\nRemote Work\nEmployees are allowed...",
"filters": {
"string_1": "updated-category",
"number_1": 2026
},
"created_at": "2026-04-30T12:00:00Z"
}
}
Note that string_2 is absent from the response because it was set to null in the request. Filter updates take effect immediately, starting with the next search.
Delete Document
/dataspaces/:id/documents/:docId
Permanently delete a document and all its associated search data.
{
"status": "success",
"data": {
"id": "8f3kLmNpQ2xR4vW1",
"deleted": true
}
}
Create Dataspace
/dataspaces
Create a new, empty dataspace. No request body is required.
{
"status": "success",
"data": {
"id": "Xr9pLmWq4TnK2vY8",
"count": 0,
"created_at": "2026-04-30T14:00:00Z"
}
}
List Dataspaces
/dataspaces
Returns a paginated list of your dataspaces, ordered by creation date (newest first).
Supports the same ?limit and ?cursor query parameters.
{
"status": "success",
"data": [
{
"id": "Xr9pLmWq4TnK2vY8",
"count": 47,
"created_at": "2026-04-30T14:00:00Z"
},
{
"id": "Bc3nYhTw7KpR5jM1",
"count": 12,
"created_at": "2026-04-28T09:30:00Z"
}
],
"pagination": {
"has_more": false,
"next_cursor": "Bc3nYhTw7KpR5jM1"
}
}
Get Dataspace
/dataspaces/:id
Retrieve a single dataspace. The count field reflects the number of documents currently stored.
{
"status": "success",
"data": {
"id": "Xr9pLmWq4TnK2vY8",
"count": 47,
"created_at": "2026-04-30T14:00:00Z"
}
}
Delete Dataspace
/dataspaces/:id
Permanently delete a dataspace. All documents inside it are permanently destroyed.
{
"status": "success",
"data": {
"id": "Xr9pLmWq4TnK2vY8",
"deleted": true,
"documents_destroyed": 47
}
}
Error Responses
All error responses follow a consistent envelope:
{
"status": "error",
"error": {
"code": "error_code",
"message": "Human-readable description."
}
}
Limits
| Limit | Value |
|---|---|
| Max input file size | 30 MB |
| Max pages per document | 1,000 |
| Filter fields per document | string_1, string_2, string_3, string_4, number_1, number_2, number_3 |
| Max filter rules per search | 7 |
Document storage limit: The total number of documents you can store across all dataspaces depends on your plan — 200 on the Free plan, 50,000 on Basic, 100,000 on Plus, and 200,000 on Max.
Cost
- Storing files: 1 credit per page.
- Storing text: 1 credit per 1,000 tokens.
- Searching: 0.2 credits per search query.
- All other endpoints (list, get, update, delete, and all dataspace management) are free.
- Failed requests are not charged.
- Storage is bounded by your plan's max_documents limit.
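For budgeting, the prices above can be combined into a quick estimate. The sketch assumes text is billed proportionally (2,500 tokens counted as 2.5 credits); the page does not say whether partial thousands are rounded, so treat the result as approximate. `estimate_credits` is an illustrative name.

```python
def estimate_credits(file_pages=0, text_tokens=0, searches=0):
    """Rough credit estimate from the listed prices.

    Assumption: text tokens are billed proportionally, with no rounding.
    """
    storing_files = file_pages * 1          # 1 credit per page
    storing_text = text_tokens / 1000       # 1 credit per 1,000 tokens
    searching = searches * 0.2              # 0.2 credits per search query
    return storing_files + storing_text + searching


# A 10-page PDF, 2,500 tokens of raw text and 5 searches.
total = estimate_credits(file_pages=10, text_tokens=2500, searches=5)
```

Under these assumptions the example comes to 13.5 credits, well within the 100 monthly credits of the free plan.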