Orthanc API
Persistent Memory for Voice AI Agents. Give your voice agents memory that persists across every call. Works with Vapi, Retell, Bland, Twilio & any voice AI platform. Two API calls, sub-200ms retrieval.
Base URL
https://api.orthanc.ai
For local development: http://localhost:3001
Voice-Grade Speed
Sub-200ms retrieval. Built for real-time voice.
Automatic Extraction
Facts extracted from call transcripts automatically.
Per-Caller Isolation
Each caller's memory is fully isolated. HIPAA-ready.
Quickstart
Add persistent memory to your voice AI in under 5 minutes. Works with Vapi, Retell, Bland, or any voice platform. Test the API in the Playground.
Get your API Key
Create an API key from your dashboard. Keys look like sk-mem-xxxxx.
Sync call transcripts after each call
Send call transcripts to /api/sync after each call. Facts are extracted automatically. Use the caller's phone number or ID as the userId.
const response = await fetch('https://api.orthanc.ai/api/sync', {
method: 'POST',
headers: {
'Authorization': 'Bearer sk-mem-your-api-key',
'Content-Type': 'application/json'
},
body: JSON.stringify({
userId: '+14155551234', // caller's phone number
messages: [
{ role: 'user', content: 'Hi, I need to reschedule my appointment' },
{ role: 'assistant', content: 'Of course! I see you have an appointment Thursday at 2pm' },
{ role: 'user', content: 'Can we move it to Friday at 10am? Also my insurance changed to Aetna' }
]
})
});
// Response: { "status": "queued", "requestId": "a1b2c3d4", "result": { "factsExtracted": 3 } }
Fetch context before each call
Query memories with /api/context before the call starts. Inject the results into your voice AI's system prompt.
const response = await fetch('https://api.orthanc.ai/api/context', {
method: 'POST',
headers: {
'Authorization': 'Bearer sk-mem-your-api-key',
'Content-Type': 'application/json'
},
body: JSON.stringify({
userId: '+14155551234', // caller's phone number
messages: [
{ role: 'user', content: 'incoming call' }
]
})
});
const data = await response.json();
// data.memories: ["Caller rescheduled appointment to Friday 10am", "Insurance changed to Aetna"]
Inject into your AI
Inject retrieved memories into your voice AI's system prompt so it knows the caller.
// Example: Vapi function call / Retell pre-call hook
async function getCallerContext(callerPhone) {
const memoryResponse = await fetch('https://api.orthanc.ai/api/context', {
method: 'POST',
headers: {
'Authorization': 'Bearer sk-mem-your-api-key',
'Content-Type': 'application/json'
},
body: JSON.stringify({
userId: callerPhone,
messages: [{ role: 'user', content: 'incoming call' }]
})
});
const { memories } = await memoryResponse.json();
// Build the system prompt for your voice AI
return `You are a helpful voice assistant.
What you know about this caller:
${memories.map(m => '- ' + m).join('\n')}
Use this context naturally in conversation. Don't repeat it back unless asked.`;
}
Authentication
All API requests require authentication via Bearer token in the Authorization header.
Authorization: Bearer sk-mem-your-api-key
Security note: Never expose your API key in client-side code. Make API calls from your backend server only.
API Key Format
API keys follow the format sk-mem-[random-string]. Create and manage keys in your dashboard.
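As a rough illustration of keeping the key on the server, the sketch below builds the request headers from an environment variable. The helper name `buildAuthHeaders` and the variable name `ORTHANC_API_KEY` are assumptions made for this example; only the `sk-mem-` prefix and the Bearer scheme come from the text above.

```javascript
// Hypothetical helper: builds request headers from a key kept in the server environment.
// ORTHANC_API_KEY is an assumed variable name, not one defined by Orthanc.
function buildAuthHeaders(apiKey = process.env.ORTHANC_API_KEY) {
  // Fail early if the key is missing or does not match the documented sk-mem- prefix.
  if (typeof apiKey !== 'string' || !apiKey.startsWith('sk-mem-')) {
    throw new Error('Expected an API key of the form sk-mem-...');
  }
  return {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  };
}
```

The returned object can be passed as the `headers` option of `fetch` in the examples on this page, so the literal key never appears in source code.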
/api/sync
Ingest conversation messages and extract facts about your user. Processing happens asynchronously in the background.
Request Body
userId (string): Unique identifier for your end user. All memories are isolated by this ID. Use a stable identifier like a database ID or auth provider ID.
messages (array): Array of conversation messages to process. Each message needs role and content.
role: "user" or "assistant"
content: The message text (max 32,000 chars)
Example request
curl -X POST https://api.orthanc.ai/api/sync \
-H "Authorization: Bearer sk-mem-your-api-key" \
-H "Content-Type: application/json" \
-d '{
"userId": "user_123",
"messages": [
{ "role": "user", "content": "I love spicy Thai food" },
{ "role": "assistant", "content": "Thai food is delicious!" },
{ "role": "user", "content": "My favorite restaurant is Kin Khao" }
]
}'
Response
{
"status": "queued",
"message": "Memory sync queued for processing",
"requestId": "a1b2c3d4",
"inputFormat": "messages",
"result": {
"factsExtracted": 4,
"memoriesInserted": 3,
"memoriesUpdated": 0,
"memoriesSkipped": 1,
"latency_ms": 1250
}
}
This endpoint returns immediately after queuing. Facts are extracted, embeddings are generated, and memories are stored in the background (typically 1-3 seconds).
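Because processing is asynchronous, a malformed payload is easiest to catch before it is sent. The following is a minimal sketch of a client-side check against the request body rules listed above (role values and the 32,000-character limit). The function name `validateSyncBody` is hypothetical, and the server remains the source of truth for validation.

```javascript
const MAX_CONTENT_CHARS = 32000; // per-message limit from the request body reference

// Hypothetical pre-flight check; returns a list of problems (empty when the body looks valid).
function validateSyncBody({ userId, messages }) {
  const errors = [];
  if (typeof userId !== 'string' || userId.length === 0) {
    errors.push('userId must be a non-empty string');
  }
  if (!Array.isArray(messages)) {
    errors.push('messages must be an array');
    return errors;
  }
  messages.forEach((m, i) => {
    if (m?.role !== 'user' && m?.role !== 'assistant') {
      errors.push(`messages[${i}].role must be "user" or "assistant"`);
    }
    if (typeof m?.content !== 'string' || m.content.length > MAX_CONTENT_CHARS) {
      errors.push(`messages[${i}].content must be a string of at most ${MAX_CONTENT_CHARS} chars`);
    }
  });
  return errors;
}
```

Calling this before `/api/sync` and logging any returned errors avoids spending a request on a body that would come back as a VALIDATION_ERROR.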
/api/context
Retrieve relevant memories for a user query. Uses hybrid search (semantic + keyword) with intelligent query understanding.
Request Body
userId (string): The user whose memories to search.
messages (array): Conversation context. The last user message is used as the search query. Previous messages help resolve pronouns and context.
options (object): Optional configuration for the search.
useHyDE (boolean): Enable HyDE for better accuracy on vague queries. Default: true
matchThreshold (number): Minimum similarity score (0-1). Default: 0.5
matchCount (number): Maximum memories to return (1-100). Default: 5
timeFilter (string): Filter by time: "hour", "day", "week", "month", "year", "all"
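The option ranges above can be enforced on the client before a request is built. This is a sketch only: `buildContextOptions` is a hypothetical helper, and clamping out-of-range values (rather than rejecting them) is a design choice made for this example, not documented API behavior.

```javascript
const TIME_FILTERS = ['hour', 'day', 'week', 'month', 'year', 'all'];

// Hypothetical helper: keeps only the documented options and clamps numbers to documented ranges.
function buildContextOptions({ useHyDE, matchThreshold, matchCount, timeFilter } = {}) {
  const options = {};
  if (typeof useHyDE === 'boolean') options.useHyDE = useHyDE;
  if (typeof matchThreshold === 'number') {
    options.matchThreshold = Math.min(1, Math.max(0, matchThreshold)); // 0-1
  }
  if (typeof matchCount === 'number') {
    options.matchCount = Math.min(100, Math.max(1, Math.round(matchCount))); // 1-100
  }
  if (TIME_FILTERS.includes(timeFilter)) options.timeFilter = timeFilter;
  return options;
}
```

Omitted fields are left out of the result, so the server-side defaults listed above still apply.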
Example request
curl -X POST https://api.orthanc.ai/api/context \
-H "Authorization: Bearer sk-mem-your-api-key" \
-H "Content-Type: application/json" \
-d '{
"userId": "user_123",
"messages": [
{ "role": "user", "content": "What restaurants do I like?" }
],
"options": {
"useHyDE": true,
"matchCount": 5
}
}'
Response
{
"memories": [
"User loves spicy Thai food",
"User's favorite restaurant is Kin Khao",
"User prefers Asian cuisine"
],
"scores": [0.92, 0.87, 0.81],
"count": 3,
"queryType": "question_match",
"latency_ms": 145,
"requestId": "a1b2c3d4"
}
Response Fields
memories (string[]): Array of relevant memory strings, ordered by relevance.
scores (number[]): Relevance scores for each memory (0-1).
count (number): Number of memories returned.
queryType (string): Query type used: question_match (fast path, ~100ms), vector_search, graph_relation, temporal, etc.
latency_ms (number): Total processing time in milliseconds.
requestId (string): Unique request ID for debugging and support.
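One way to use the scores field is to drop weaker matches before building a prompt. The sketch below assumes that memories[i] corresponds to scores[i], which the example response suggests but the field descriptions do not state outright. The function name `selectMemories` and the 0.8 default cutoff are illustrative.

```javascript
// Pairs each memory with its score and keeps only entries at or above minScore.
// Assumes memories and scores are parallel arrays.
function selectMemories(response, minScore = 0.8) {
  const { memories = [], scores = [] } = response;
  return memories
    .map((text, i) => ({ text, score: scores[i] ?? 0 }))
    .filter((m) => m.score >= minScore);
}
```

With the example response above and a cutoff of 0.85, this keeps the first two memories and drops the third.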
/api/upload
Upload audio, PDF, or text files for memory extraction. Audio is transcribed via Whisper, PDFs are parsed automatically.
Supported file types
Audio (25MB max)
mp3, wav, m4a, webm, ogg, flac
Documents (10MB max)
pdf
Text (5MB max)
txt, md, csv
Form Data
file (File): The file to upload.
userId (string): Unique identifier for your end user.
options (JSON string): Optional settings (sync, source, tags, importance, category).
Example: Upload audio file
const formData = new FormData();
formData.append('file', audioFile); // .mp3, .wav, etc.
formData.append('userId', 'user_123');
const response = await fetch('https://api.orthanc.ai/api/upload', {
method: 'POST',
headers: { 'Authorization': 'Bearer sk-mem-your-api-key' },
body: formData
});
// Response: { "status": "queued", "inputFormat": "audio", "fileName": "voice.mp3" }
Example: Upload PDF (sync mode)
const formData = new FormData();
formData.append('file', pdfFile);
formData.append('userId', 'user_123');
formData.append('options', JSON.stringify({ sync: true })); // Wait for processing
const response = await fetch('https://api.orthanc.ai/api/upload', {
method: 'POST',
headers: { 'Authorization': 'Bearer sk-mem-your-api-key' },
body: formData
});
const data = await response.json();
// data.result: { factsExtracted: 12, memoriesInserted: 8 }
curl example
curl -X POST https://api.orthanc.ai/api/upload \
-H "Authorization: Bearer sk-mem-your-api-key" \
-F "file=@document.pdf" \
-F "userId=user_123" \
-F 'options={"sync": true}'
By default, files are processed asynchronously (fast response). Set sync: true in options to wait for processing.
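Checking type and size limits locally can save an upload round trip. The sketch below encodes the limits listed under "Supported file types". Two things are assumptions: that "MB" means 1024 × 1024 bytes, and that the Documents category covers the pdf extension (inferred from the endpoint description). `checkUpload` is a hypothetical name.

```javascript
const MB = 1024 * 1024; // assumption: the docs do not say whether MB is binary or decimal
const UPLOAD_LIMITS = [
  { kind: 'audio', maxBytes: 25 * MB, extensions: ['mp3', 'wav', 'm4a', 'webm', 'ogg', 'flac'] },
  { kind: 'document', maxBytes: 10 * MB, extensions: ['pdf'] }, // pdf is inferred, not listed
  { kind: 'text', maxBytes: 5 * MB, extensions: ['txt', 'md', 'csv'] },
];

// Returns { ok: true, kind } when the file fits a documented category, else { ok: false, reason }.
function checkUpload(fileName, sizeBytes) {
  const ext = fileName.split('.').pop().toLowerCase();
  const rule = UPLOAD_LIMITS.find((r) => r.extensions.includes(ext));
  if (!rule) return { ok: false, reason: `unsupported file type: .${ext}` };
  if (sizeBytes > rule.maxBytes) {
    return { ok: false, reason: `${rule.kind} files are limited to ${rule.maxBytes / MB}MB` };
  }
  return { ok: true, kind: rule.kind };
}
```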
/api/webhooks
Get real-time notifications when memories change. Supports HMAC signatures for security.
Available events
memory.created
memory.updated
memory.deleted
memory.batch_created
memory.batch_deleted
Create a webhook
curl -X POST https://api.orthanc.ai/api/webhooks \
-H "Authorization: Bearer sk-mem-your-api-key" \
-H "Content-Type: application/json" \
-d '{
"url": "https://your-server.com/webhook",
"events": ["memory.created", "memory.updated"],
"secret": "whsec_your_secret"
}'
List webhooks
curl https://api.orthanc.ai/api/webhooks \
-H "Authorization: Bearer sk-mem-your-api-key"
# Response: { "webhooks": [...], "count": 2 }
JavaScript example
// Create webhook
const response = await fetch('https://api.orthanc.ai/api/webhooks', {
method: 'POST',
headers: {
'Authorization': 'Bearer sk-mem-your-api-key',
'Content-Type': 'application/json'
},
body: JSON.stringify({
url: 'https://your-server.com/webhook',
events: ['memory.created', 'memory.updated']
})
});
// List webhooks
const list = await fetch('https://api.orthanc.ai/api/webhooks', {
headers: { 'Authorization': 'Bearer sk-mem-your-api-key' }
});
// Delete webhook
await fetch('https://api.orthanc.ai/api/webhooks/webhook-id', {
method: 'DELETE',
headers: { 'Authorization': 'Bearer sk-mem-your-api-key' }
});
Webhook payload
{
"event": "memory.created",
"timestamp": "2026-02-04T12:00:00Z",
"data": {
"userId": "user_123",
"memory": "User works at Stripe"
}
}
If you set a secret, verify the X-Webhook-Signature header using HMAC-SHA256.
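A verification routine might look like the sketch below. The text above only says the header is checked with HMAC-SHA256, so the details here are assumptions: that the signature is computed over the raw request body and that the header value is hex-encoded. Confirm both against a real delivery before relying on this.

```javascript
const crypto = require('node:crypto');

// Assumption: X-Webhook-Signature carries a hex-encoded HMAC-SHA256 of the raw request body.
function isValidSignature(rawBody, signatureHeader, secret) {
  const expected = crypto.createHmac('sha256', secret).update(rawBody).digest();
  const received = Buffer.from(String(signatureHeader || ''), 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}
```

Pass the body exactly as received (bytes or the unparsed string). Re-serializing parsed JSON can change whitespace or key order and make a valid signature fail.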
/api/memories/batch
Create, update, or delete up to 100 memories in a single request. Perfect for migrations and bulk operations.
Request Body
userId (string): User whose memories to modify.
operations (array): Array of operations (max 100).
create
{ action: 'create', text: 'fact to store', options: {...} }
update
{ action: 'update', id: 'mem-uuid', updates: { category, importance } }
delete
{ action: 'delete', id: 'mem-uuid' }
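For migrations with more than 100 operations, the list has to be split across several requests. A minimal sketch of that split follows; `chunkOperations` is a hypothetical helper, and sending the chunks one after another is left to the caller.

```javascript
const MAX_BATCH_OPERATIONS = 100; // documented per-request maximum

// Splits a long list of operations into arrays of at most `size` items, preserving order.
function chunkOperations(operations, size = MAX_BATCH_OPERATIONS) {
  const chunks = [];
  for (let i = 0; i < operations.length; i += size) {
    chunks.push(operations.slice(i, i + size));
  }
  return chunks;
}
```

Each chunk can then be sent as the operations array of its own `/api/memories/batch` request.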
Example request
curl -X POST https://api.orthanc.ai/api/memories/batch \
-H "Authorization: Bearer sk-mem-your-api-key" \
-H "Content-Type: application/json" \
-d '{
"userId": "user_123",
"operations": [
{ "action": "create", "text": "User prefers dark mode" },
{ "action": "create", "text": "User lives in NYC" },
{ "action": "delete", "id": "mem-old-123" }
]
}'
Response
{
"processed": 3,
"results": {
"created": 2,
"updated": 0,
"deleted": 1,
"failed": 0
}
}
JavaScript example
const response = await fetch('https://api.orthanc.ai/api/memories/batch', {
method: 'POST',
headers: {
'Authorization': 'Bearer sk-mem-your-api-key',
'Content-Type': 'application/json'
},
body: JSON.stringify({
userId: 'user_123',
operations: [
{ action: 'create', text: 'User prefers dark mode' },
{ action: 'create', text: 'User lives in NYC' },
{ action: 'update', id: 'mem-uuid', updates: { importance: 0.9 } },
{ action: 'delete', id: 'mem-old-123' }
]
})
});
const data = await response.json();
console.log(data.results); // { created: 2, updated: 1, deleted: 1, failed: 0 }
/api/memories/export
Export all memories for a user as JSON or CSV. Full data portability, no lock-in.
Query Parameters
userId (string): Filter by user ID. Omit to export all project memories.
format (string): Export format: json or csv. Default: json
limit (number): Memories per page (1-1000). Default: 100
offset (number): Pagination offset. Default: 0
includeEmbeddings (boolean): Include embedding vectors (large!). Default: false
Example: Export as JSON
curl "https://api.orthanc.ai/api/memories/export?userId=user_123&limit=100" \
-H "Authorization: Bearer sk-mem-your-api-key"
# Response:
{
"userId": "user_123",
"total": 247,
"count": 100,
"hasMore": true,
"memories": [...]
}
Example: Export as CSV
curl "https://api.orthanc.ai/api/memories/export?userId=user_123&format=csv" \
-H "Authorization: Bearer sk-mem-your-api-key" \
-o memories.csv
JavaScript example
// Export memories for a user
const response = await fetch(
'https://api.orthanc.ai/api/memories/export?userId=user_123&limit=100',
{ headers: { 'Authorization': 'Bearer sk-mem-your-api-key' } }
);
const data = await response.json();
console.log(data.memories); // Array of memory objects
console.log(data.hasMore); // true if more pages exist
// Paginate through all memories
let allMemories = [];
let offset = 0;
while (true) {
const res = await fetch(
`https://api.orthanc.ai/api/memories/export?userId=user_123&limit=100&offset=${offset}`,
{ headers: { 'Authorization': 'Bearer sk-mem-your-api-key' } }
);
const page = await res.json();
allMemories.push(...page.memories);
if (!page.hasMore) break;
offset += 100;
}
Use pagination (offset) to export large datasets. Check hasMore to know if more pages exist.
How It Works
Orthanc uses a "Split-Brain" architecture optimized for voice-grade speed, orchestrated into a seamless API.
Fact Extraction
When you call /api/sync, our extraction pipeline identifies discrete facts from the conversation. Greetings and small talk are filtered out. Each fact is atomic and self-contained.
Embedding Generation
Each fact is converted into a high-dimensional vector using optimized embedding models. This enables semantic search beyond simple keyword matching, with sub-50ms embedding latency.
Contradiction Detection
Before storing, we check for contradictions with existing memories. If a user says "I moved to New York" but previously said "I live in San Francisco", the old memory is updated intelligently.
Hybrid Retrieval
When you call /api/context, we combine semantic search (vector similarity) with keyword search (full-text) and re-ranking for 99% retrieval accuracy.
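To make the idea of combining two ranked result lists concrete, here is a sketch of reciprocal rank fusion, one common way to merge semantic and keyword rankings. This page does not say which fusion or re-ranking method Orthanc uses, so treat this purely as an illustration of the general technique. The constant k = 60 is a conventional default, not an Orthanc setting.

```javascript
// Reciprocal rank fusion: every list contributes 1 / (k + rank) to each item it contains,
// where rank starts at 1. Items ranked well in several lists rise to the top.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      scores.set(id, (scores.get(id) || 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

For example, fusing a semantic ranking `['a', 'b', 'c']` with a keyword ranking `['b', 'd', 'a']` puts `b` first, because it sits near the top of both lists.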
HyDE Enhancement
HyDE (Hypothetical Document Embeddings) improves search accuracy for vague or indirect queries.
Instead of embedding the raw query, HyDE first generates a hypothetical "ideal answer" that would match the query, then embeds that. This bridges the gap between how users ask questions and how facts are stored.
User Query
"What do I like to eat?"
HyDE Query
"The user enjoys eating Thai food, particularly spicy dishes..."
HyDE is enabled by default. Disable it with useHyDE: false for direct keyword-style queries.
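The control flow of HyDE can be sketched in a few lines. In this illustration `generateAnswer` and `embed` are injected placeholders standing in for a language-model call and an embedding call. They are hypothetical and are not part of the Orthanc API, which performs these steps server-side.

```javascript
// With HyDE on, the hypothetical answer is embedded; with it off, the raw query is embedded.
// generateAnswer and embed are caller-supplied async functions (placeholders for real models).
async function embedQuery(query, { generateAnswer, embed, useHyDE = true }) {
  const textToEmbed = useHyDE ? await generateAnswer(query) : query;
  return embed(textToEmbed);
}
```

The extra generation step is why HyDE helps with vague questions but adds little for direct keyword-style queries, where the query text already resembles the stored fact.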
Intent Detection
Orthanc automatically detects query intent and applies specialized handling.
| Intent | Example | Behavior |
|---|---|---|
| simple | "What's my job?" | Standard retrieval |
| list_all | "What restaurants do I like?" | Returns all matching items |
| count | "How many projects?" | Returns count + categories |
| comparison | "Do I prefer React or Vue?" | Retrieves both for comparison |
| temporal | "What did I say yesterday?" | Applies time filter |
| existence | "Have I mentioned my dog?" | Returns found + confidence |
| inference | "Would I enjoy this?" | Multi-hop reasoning |
Error Codes
All errors return a JSON response with an error message and code.
| Status | Code | Description |
|---|---|---|
| 400 | VALIDATION_ERROR | Missing or invalid field |
| 401 | MISSING_AUTH | Missing Authorization header |
| 401 | INVALID_API_KEY | Invalid or revoked API key |
| 429 | RATE_LIMIT_EXCEEDED | Rate limit exceeded |
| 500 | INTERNAL_ERROR | Internal server error |
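A client calling the API directly may want to separate errors worth retrying from errors that need a code change. The sketch below treats 429 and 5xx as retryable and 400 and 401 as fatal. That split, the helper names, and the delay values are assumptions for illustration, and it relies on the global `fetch` available in Node 18 and later.

```javascript
// Assumption: 429 and 5xx may succeed on retry; 400 and 401 will not until the request changes.
function isRetryable(status) {
  return status === 429 || status >= 500;
}

// Exponential backoff: baseMs * 2^attempt, capped at maxMs. Values are illustrative.
function backoffDelayMs(attempt, baseMs = 250, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Retries retryable failures up to maxAttempts, then throws an Error carrying code and status.
async function fetchWithRetry(url, init, maxAttempts = 4) {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    if (res.ok) return res.json();
    const body = await res.json().catch(() => ({}));
    if (!isRetryable(res.status) || attempt + 1 >= maxAttempts) {
      throw Object.assign(new Error(body.error || `HTTP ${res.status}`), {
        code: body.code,
        status: res.status,
      });
    }
    await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
  }
}
```

The thrown error exposes the `code` field from the error body, so callers can branch on values such as INVALID_API_KEY.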
Example error response
{
"error": "Invalid or revoked API key",
"code": "INVALID_API_KEY"
}
Best Practices
Sync after every call ends
Call /api/sync with the full transcript after each call ends. Use your voice platform's post-call webhook (Vapi, Retell, Bland all support this). The endpoint is async, so it won't block your flow.
Fetch context before each call
Always call /api/context before the voice AI answers. Sub-200ms retrieval fits within the call setup window. Inject the memories into your agent's system prompt.
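Since the lookup sits on the call setup path, it can help to cap how long the agent waits for it. The sketch below races a request against a timer and falls back to a default value if the budget runs out or the request fails. `withTimeout` is a hypothetical helper, and swallowing errors in favor of the fallback is a deliberate choice for this example so that a memory outage never blocks a call.

```javascript
// Resolves with the promise's value, or with `fallback` if it rejects or takes longer than `ms`.
function withTimeout(promise, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  return Promise.race([promise.catch(() => fallback), timeout])
    .finally(() => clearTimeout(timer));
}
```

Wrapping the context request with a fallback of `{ memories: [] }` lets the agent start with a generic prompt when retrieval is slow.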
Use phone number as user ID
Use the caller's phone number (E.164 format like +14155551234) as the userId. This ensures memory persists across all calls from the same number. Alternatively, use your CRM customer ID.
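Caller IDs arrive in varying formats, and two spellings of the same number would otherwise become two separate users. The following is a deliberately simplified normalizer, assuming numbers without a leading plus sign are national numbers for a single default country (US/Canada here). `toE164` is a hypothetical helper, and a dedicated phone-number library is a better choice for international traffic.

```javascript
// Simplified normalizer: strips formatting and prepends a default country code when no "+" is present.
// Returns null when the result is not a plausible E.164 number.
function toE164(raw, defaultCountryCode = '1') {
  const trimmed = String(raw).trim();
  const digits = trimmed.replace(/\D/g, '');
  const candidate = trimmed.startsWith('+') ? `+${digits}` : `+${defaultCountryCode}${digits}`;
  // E.164: a plus sign, then up to 15 digits, the first of which is non-zero.
  return /^\+[1-9]\d{1,14}$/.test(candidate) ? candidate : null;
}
```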
Include recent messages for context
When calling /api/context, include the last few transcript lines. This helps resolve references ("the issue we discussed") and improves retrieval relevance.
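Because the last user message serves as the search query, the messages array should end on a user turn. A minimal sketch of trimming a transcript that way is shown below; `recentContext` and the default of six lines are assumptions for illustration.

```javascript
// Returns up to `limit` transcript lines ending at the most recent user message.
// Returns an empty array when the transcript contains no user message.
function recentContext(transcript, limit = 6) {
  const lastUser = transcript.map((m) => m.role).lastIndexOf('user');
  if (lastUser === -1) return [];
  return transcript.slice(Math.max(0, lastUser + 1 - limit), lastUser + 1);
}
```

The result can be passed directly as the messages field of an `/api/context` request.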
Keep API keys server-side
Never expose your API key in client-side code. Make all Orthanc API calls from your backend or serverless functions.
3 Lines of Code
Add persistent memory to your voice AI in under 5 minutes.
Install the SDK
npm install @orthanc-protocol/client
Add 3 lines to your voice AI
Add these lines to your call handler (works with Vapi, Retell, Bland, or custom):
import { OrthancsClient } from '@orthanc-protocol/client';
const memory = new OrthancsClient({
endpoint: 'https://api.orthanc.ai',
apiKey: 'sk-mem-YOUR_API_KEY' // Get key from dashboard
});
// After call ends — sync the transcript:
await memory.syncMessages(callerPhone, transcript);
// Before next call — fetch caller context:
const { memories } = await memory.query(callerPhone, 'incoming call');
callerPhone = caller's phone number or CRM ID
transcript = array of message objects from the call
Inject into voice AI prompt
Add the memories to your voice agent's system prompt:
const systemPrompt = `You are a helpful voice assistant.
What you know about this caller:
${memories.map(m => '- ' + m).join('\n')}
Use this context naturally. Don't read it back verbatim.`;
// Pass to Vapi/Retell/Bland as the system prompt
Done
Your voice AI now remembers every caller. When someone mentions their insurance changed on Monday, your AI knows it on Friday's follow-up call.
All 6 SDK Methods
memory.syncMessages() — Sync call transcripts
await memory.syncMessages('+14155551234', [
{ role: 'user', content: 'I need to update my address to 123 Oak Street' },
{ role: 'assistant', content: 'I have updated your address on file.' }
]);
// Response: { status: 'queued', result: { factsExtracted: 2, memoriesInserted: 1 } }
memory.query(): Retrieve caller context
const { memories, count, latency_ms } = await memory.query('+14155551234', 'incoming call');
// memories: ['Caller address updated to 123 Oak Street', 'Insurance: Aetna']
// latency_ms: 45
// With options
const result = await memory.query('+14155551234', 'appointment history', {
timeFilter: 'month',
matchCount: 10
});
Upload via fetch: Ingest files (PDF, audio, text)
// Upload a PDF via the API directly
const formData = new FormData();
formData.append('file', pdfFile);
formData.append('userId', 'user_123');
await fetch('https://api.orthanc.ai/api/upload', {
method: 'POST',
headers: { 'Authorization': 'Bearer sk-mem-your-key' },
body: formData
});
memory.batch(): Bulk create/update/delete (max 100)
const result = await memory.batch({
userId: 'user_123',
operations: [
{ action: 'create', text: 'User loves hiking in mountains' },
{ action: 'create', text: 'User prefers morning workouts' },
{ action: 'update', id: 'mem-uuid', updates: { importance: 0.9 } },
{ action: 'delete', id: 'old-memory-id' }
]
});
console.log(result.results);
// { created: 2, updated: 1, deleted: 1, failed: 0 }
memory.exportAll(): Download memories
// Export all memories for a user (auto-paginates)
const memories = await memory.exportAll('user_123');
for (const mem of memories) {
console.log(mem.content, mem.createdAt);
}
memory.createWebhook(): Real-time notifications
// Create a webhook
await memory.createWebhook({
url: 'https://your-server.com/webhook',
events: ['memory.created', 'memory.updated'],
secret: 'whsec_your_secret'
});
// List all webhooks
const webhooks = await memory.listWebhooks();
// Delete a webhook
await memory.deleteWebhook('webhook-id');
SDK methods: @orthanc-protocol/client
syncMessages() — Save conversations
query() — Retrieve memories
POST /api/upload — Ingest files
batch() — Bulk create/update/delete
exportAll() — Download memories
createWebhook() — Real-time events
Auto-retry with exponential backoff
Full TypeScript support with types