Tabby’s Doc Ingestion API Is Here: Power Up with Your Own Docs

Why Doc Ingestion Matters
Tabby is only as smart as the documentation you give it.
The Doc Ingestion API offers a programmatic way to feed Tabby your own materials — like project docs, technical articles, or internal knowledge base entries.
Once submitted, these docs are processed asynchronously and indexed into Tabby’s knowledge system. That makes them available for context retrieval during code assistance tasks.
You’re no longer limited to built-in context. Tabby can now respond with your team’s actual docs — grounded, specific, and under your control.
What You Can Do With It
The Doc Ingestion API accepts documents from external systems and scripts, making it easy to integrate with your existing workflows.
Ingested docs are:
- Indexed in the background
- Usable as context in completions and answers
- Associated with a `source` and `id` for easy updates or removal
- Automatically expirable with a configurable TTL (time to live)

You can remove docs either by `source` or by a specific `id`.
All operations are processed asynchronously, giving you control over what Tabby knows and how long it remembers it.
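As a rough sketch, cleanup could look like the curl calls below. The delete routes shown here are an assumption based on the shape of the ingestion route; check the OpenAPI docs for the exact paths your Tabby version exposes.

```bash
# Sketch only: these DELETE paths are assumed, not confirmed -- verify them
# against the OpenAPI docs before relying on them.

# Remove every document ingested under a given source:
curl -X DELETE 'http://localhost:8080/v1beta/ingestion/my-project-docs' \
  -H 'Authorization: Bearer <YOUR_TOKEN_HERE>'

# Remove a single document by source and id:
curl -X DELETE 'http://localhost:8080/v1beta/ingestion/my-project-docs/setup-guide-v1' \
  -H 'Authorization: Bearer <YOUR_TOKEN_HERE>'
```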
👀 See Tabby in Action
Watch a doc go from ingestion → selection → answer reference in just seconds.
Getting Started
To use the Ingestion API, you'll need an authentication token.
🔐 Authentication
Tabby uses Bearer Token authentication.

To get your token:
- Go to the Information section
- Click on the System tab
- Copy the value labelled Registration token
Use this token in your API requests:
Authorization: Bearer <YOUR_TOKEN_HERE>
If the token is missing or invalid, the API will return a 401 Unauthorized response.
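If you want to check a token before wiring it into a script, a dry run against the ingestion endpoint is enough. Here is a minimal sketch, assuming the local server address used later in this walkthrough:

```bash
# Minimal sketch: confirm the token is accepted before scripting against the API.
# An invalid or missing token yields "401 Unauthorized"; a valid token gets past
# authentication (the request itself may still be rejected because this payload
# is empty).
export TABBY_TOKEN="<YOUR_TOKEN_HERE>"

curl -i -X POST 'http://localhost:8080/v1beta/ingestion' \
  -H "Authorization: Bearer $TABBY_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{}'
```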
Walkthrough: From Document to Answer
Let's walk through the full flow of using the Ingestion API, from crafting your request to seeing the document show up in Tabby's Answer Engine.
We'll use this Tabby llama.cpp model configuration page as our live example.
1. Prepare the request
Create a JSON payload with your document's content and metadata:
{
"source": "tabby-documents",
"id": "tabby-llama.cpp-models-configuration",
"title": "Tabby llama.cpp Model Configuration",
"body": "import Collapse from '@site/src/components/Collapse';\n\n# llama.cpp\n\n[llama.cpp](https://github.com/ggml-org/llama.cpp/blob/master/examples/server/README.md#api-endpoints) is a popular C++ library for serving gguf-based models. It provides a server implementation that supports completion, chat, and embedding functionalities through HTTP APIs.\n\n## Chat model\n\nllama.cpp provides an OpenAI-compatible chat API interface.\n\n```toml title=\"~/.tabby/config.toml\"\n[model.chat.http]\nkind = \"openai/chat\"\napi_endpoint = \"http://localhost:8888\"\n```\n\n## Completion model\n\nllama.cpp offers a specialized completion API interface for code completion tasks.\n\n```toml title=\"~/.tabby/config.toml\"\n[model.completion.http]\nkind = \"llama.cpp/completion\"\napi_endpoint = \"http://localhost:8888\"\nprompt_template = \"<PRE> {prefix} <SUF>{suffix} <MID>\" # Example prompt template for the CodeLlama model series.\n```\n\n## Embeddings model\n\nllama.cpp provides embedding functionality through its HTTP API.\n\nThe llama.cpp embedding API interface and response format underwent some changes in version `b4356`.\nTherefore, we have provided two different kinds to accommodate the various versions of the llama.cpp embedding interface.\n\nYou can refer to the configuration as follows:\n\n```toml title=\"~/.tabby/config.toml\"\n[model.embedding.http]\nkind = \"llama.cpp/embedding\"\napi_endpoint = \"http://localhost:8888\"\n```\n\n<Collapse title=\"For the versions prior to b4356\">\n\n```toml title=\"~/.tabby/config.toml\"\n[model.embedding.http]\nkind = \"llama.cpp/before_b4356_embedding\"\napi_endpoint = \"http://localhost:8888\"\n```\n\n</Collapse>",
"link": "https://tabby.tabbyml.com/docs/references/models-http-api/llama.cpp/",
"ttl": "180d"
}
Key fields:
- `source` — group multiple docs together
- `id` — unique identifier for this doc
- `ttl` — optional expiry window (e.g. "180d")
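In practice the body is rarely pasted by hand; it usually comes straight from a file. One way to assemble the payload, sketched here with jq and a placeholder file path, is:

```bash
# Sketch: build the ingestion payload from a local markdown file.
# "docs/llama-cpp.md" is a placeholder path; point it at your own document.
jq -n \
  --arg source "tabby-documents" \
  --arg id "tabby-llama.cpp-models-configuration" \
  --arg title "Tabby llama.cpp Model Configuration" \
  --rawfile body docs/llama-cpp.md \
  --arg link "https://tabby.tabbyml.com/docs/references/models-http-api/llama.cpp/" \
  --arg ttl "180d" \
  '{source: $source, id: $id, title: $title, body: $body, link: $link, ttl: $ttl}' \
  > payload.json
```

The resulting payload.json can then be posted with curl's `-d @payload.json`, as shown in the next step.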
2. Send it to Tabby
Send the request to Tabby's ingestion endpoint using any HTTP client (e.g. curl, Postman, or a script):
Example command:
curl -X POST 'http://localhost:8080/v1beta/ingestion' \
-H 'Authorization: Bearer <YOUR_TOKEN_HERE>' \
-H 'Content-Type: application/json' \
-d '{
"source": "my-project-docs",
"id": "setup-guide-v1",
"title": "Project Setup Guide",
"body": "This document explains how to set up the project environment...",
"link": "https://internal.example.com/docs/setup",
"ttl": "180d"
}'
For full request details and advanced options, explore the OpenAPI docs.
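The same request shape composes nicely with existing tooling. For instance, a small script could ingest a whole directory of markdown files in one go; the directory layout, source name, and id scheme below are purely illustrative:

```bash
# Illustrative sketch: ingest every markdown file under ./docs as one source.
# File names double as document ids here; adapt the naming to your own setup,
# and add a "link" field if your docs have canonical URLs.
for file in ./docs/*.md; do
  id="$(basename "$file" .md)"
  jq -n \
    --arg source "my-project-docs" \
    --arg id "$id" \
    --arg title "$id" \
    --rawfile body "$file" \
    --arg ttl "180d" \
    '{source: $source, id: $id, title: $title, body: $body, ttl: $ttl}' |
  curl -X POST 'http://localhost:8080/v1beta/ingestion' \
    -H 'Authorization: Bearer <YOUR_TOKEN_HERE>' \
    -H 'Content-Type: application/json' \
    -d @-
done
```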
3. Confirm ingestion
Go to the System tab under Information. There, under Documents Ingestion Status, you'll see your source listed: `tabby-documents` — Total: 1
That means your doc was successfully ingested and is now queued for indexing.

4. Query with the document
In the Answer Engine UI:
- Open the Documents dropdown
- Select your ingested source (e.g. `tabby-documents`)
- Ask a relevant question — Tabby will now include your doc's context when generating answers.

5. View the answer with references
If Tabby pulls from your document, you’ll see it cited in the response card, with a link back to the original doc.

Tabby just answered with your documentation. ✨
You Just Ingested Your First Doc — What’s Next?
Boom — Tabby’s now supercharged with your team’s own docs, making answers sharper, faster, and tailored to you.
From here, you can:
- Keep adding more internal docs to grow Tabby's brain
- Check out the OpenAPI docs if you want to peek under the hood
🚀 The more you feed it, the more unstoppable Tabby becomes.