Orthanc API

Persistent Memory for Voice AI Agents. Give your voice agents memory that persists across every call. Works with Vapi, Retell, Bland, Twilio, and any other voice AI platform. Two API calls, sub-200ms retrieval.

Base URL

https://api.orthanc.ai

For local development: http://localhost:3001

Voice-Grade Speed

Sub-200ms retrieval. Built for real-time voice.

Automatic Extraction

Facts extracted from call transcripts automatically.

Per-Caller Isolation

Each caller's memory is fully isolated. HIPAA-ready.

Quickstart

Add persistent memory to your voice AI in under 5 minutes. Works with Vapi, Retell, Bland, or any voice platform. Test the API in the Playground.

01

Get your API Key

Create an API key from your dashboard. Keys look like sk-mem-xxxxx.

02

Sync call transcripts after each call

Send call transcripts to /api/sync after each call. Facts are extracted automatically. Use the caller's phone number or ID as the userId.

After call ends — sync transcript
const response = await fetch('https://api.orthanc.ai/api/sync', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-mem-your-api-key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    userId: '+14155551234', // caller's phone number
    messages: [
      { role: 'user', content: 'Hi, I need to reschedule my appointment' },
      { role: 'assistant', content: 'Of course! I see you have an appointment Thursday at 2pm' },
      { role: 'user', content: 'Can we move it to Friday at 10am? Also my insurance changed to Aetna' }
    ]
  })
});

// Response: { "status": "queued", "requestId": "a1b2c3d4", "result": { "factsExtracted": 3 } }

03

Fetch context before each call

Query memories with /api/context before the call starts. Inject the results into your voice AI's system prompt.

Before call starts — fetch context
const response = await fetch('https://api.orthanc.ai/api/context', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-mem-your-api-key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    userId: '+14155551234', // caller's phone number
    messages: [
      { role: 'user', content: 'incoming call' }
    ]
  })
});

const data = await response.json();
// data.memories: ["Caller rescheduled appointment to Friday 10am", "Insurance changed to Aetna"]

04

Inject into your AI

Inject retrieved memories into your voice AI's system prompt so it knows the caller.

Inject into voice AI system prompt
// Example: Vapi function call / Retell pre-call hook
async function getCallerContext(callerPhone) {
  const memoryResponse = await fetch('https://api.orthanc.ai/api/context', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer sk-mem-your-api-key',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      userId: callerPhone,
      messages: [{ role: 'user', content: 'incoming call' }]
    })
  });
  // Default to an empty list so the prompt still builds if no memories are returned
  const { memories = [] } = await memoryResponse.json();

  // Build the system prompt for your voice AI
  return `You are a helpful voice assistant.

What you know about this caller:
${memories.map(m => '- ' + m).join('\n')}

Use this context naturally in conversation. Don't repeat it back unless asked.`;
}

Authentication

All API requests require authentication via Bearer token in the Authorization header.

http
Authorization: Bearer sk-mem-your-api-key

Security note: Never expose your API key in client-side code. Make API calls from your backend server only.

API Key Format

API keys follow the format sk-mem-[random-string]. Create and manage keys in your dashboard.

POST

/api/sync

Ingest conversation messages and extract facts about your user. Processing happens asynchronously in the background.

Request Body

userId (string, required)

Unique identifier for your end user. All memories are isolated by this ID. Use a stable identifier like a database ID or auth provider ID.

messages (array, required)

Array of conversation messages to process. Each message needs role and content.

role: "user" or "assistant"

content: The message text (max 32,000 chars)

Example request

bash
curl -X POST https://api.orthanc.ai/api/sync \
  -H "Authorization: Bearer sk-mem-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "user_123",
    "messages": [
      { "role": "user", "content": "I love spicy Thai food" },
      { "role": "assistant", "content": "Thai food is delicious!" },
      { "role": "user", "content": "My favorite restaurant is Kin Khao" }
    ]
  }'

Response

json
{
  "status": "queued",
  "message": "Memory sync queued for processing",
  "requestId": "a1b2c3d4",
  "inputFormat": "messages",
  "result": {
    "factsExtracted": 4,
    "memoriesInserted": 3,
    "memoriesUpdated": 0,
    "memoriesSkipped": 1,
    "latency_ms": 1250
  }
}

This endpoint returns immediately after queuing. Facts are extracted, embeddings are generated, and memories are stored in the background (typically 1-3 seconds).

POST

/api/context

Retrieve relevant memories for a user query. Uses hybrid search (semantic + keyword) with intelligent query understanding.

Request Body

userId (string, required)

The user whose memories to search.

messages (array, required)

Conversation context. The last user message is used as the search query. Previous messages help resolve pronouns and context.

options (object, optional)

Optional configuration for the search.

useHyDE (boolean)

Enable HyDE for better accuracy on vague queries. Default: true

matchThreshold (number)

Minimum similarity score (0-1). Default: 0.5

matchCount (number)

Maximum memories to return (1-100). Default: 5

timeFilter (string)

Filter by time: "hour", "day", "week", "month", "year", "all"
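
The ranges and allowed values above can be checked on your side before a request is sent. The following is a minimal sketch, not part of the Orthanc SDK: validateContextOptions is a hypothetical helper name, and treating matchCount as an integer is an assumption, since the reference only says "number".

```javascript
// Allowed timeFilter values, copied from the option description above.
const TIME_FILTERS = ['hour', 'day', 'week', 'month', 'year', 'all'];

// Hypothetical helper (not an Orthanc API): returns a list of problems found
// in an `options` object before it is sent to /api/context.
function validateContextOptions(options = {}) {
  const problems = [];
  const { useHyDE, matchThreshold, matchCount, timeFilter } = options;

  if (useHyDE !== undefined && typeof useHyDE !== 'boolean') {
    problems.push('useHyDE must be a boolean');
  }
  if (
    matchThreshold !== undefined &&
    !(typeof matchThreshold === 'number' && matchThreshold >= 0 && matchThreshold <= 1)
  ) {
    problems.push('matchThreshold must be a number between 0 and 1');
  }
  // Assumption: matchCount is a whole number.
  if (
    matchCount !== undefined &&
    !(Number.isInteger(matchCount) && matchCount >= 1 && matchCount <= 100)
  ) {
    problems.push('matchCount must be an integer between 1 and 100');
  }
  if (timeFilter !== undefined && !TIME_FILTERS.includes(timeFilter)) {
    problems.push('timeFilter must be one of: ' + TIME_FILTERS.join(', '));
  }
  return problems;
}
```

An empty array means the options passed every check. The server may still reject a request for reasons this sketch does not cover.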

Example request

bash
curl -X POST https://api.orthanc.ai/api/context \
  -H "Authorization: Bearer sk-mem-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "user_123",
    "messages": [
      { "role": "user", "content": "What restaurants do I like?" }
    ],
    "options": {
      "useHyDE": true,
      "matchCount": 5
    }
  }'

Response

json
{
  "memories": [
    "User loves spicy Thai food",
    "User's favorite restaurant is Kin Khao",
    "User prefers Asian cuisine"
  ],
  "scores": [0.92, 0.87, 0.81],
  "count": 3,
  "queryType": "question_match",
  "latency_ms": 145,
  "requestId": "a1b2c3d4"
}

Response Fields

memories (string[], optional)

Array of relevant memory strings, ordered by relevance.

scores (number[], optional)

Relevance scores for each memory (0-1).

count (number, optional)

Number of memories returned.

queryType (string, optional)

Query type used: question_match (fast path, ~100ms), vector_search, graph_relation, temporal, etc.

latency_ms (number, optional)

Total processing time in milliseconds.

requestId (string, optional)

Unique request ID for debugging and support.
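
Because every response field is marked optional, client code should tolerate missing arrays. The sketch below pairs each memory with its score and applies a stricter client-side cutoff. selectMemories is a hypothetical helper, and it assumes memories and scores are parallel arrays in the same order, as the example response suggests.

```javascript
// Hypothetical helper (not an Orthanc API): pairs each memory with its score
// and keeps only entries at or above `minScore`.
// Assumption: `memories[i]` corresponds to `scores[i]`.
function selectMemories(response, minScore = 0.5) {
  const memories = response.memories ?? [];
  const scores = response.scores ?? [];
  return memories
    .map((text, i) => ({ text, score: scores[i] ?? 0 }))
    .filter((entry) => entry.score >= minScore);
}

// Usage with the example response above:
const example = {
  memories: [
    'User loves spicy Thai food',
    "User's favorite restaurant is Kin Khao",
    'User prefers Asian cuisine'
  ],
  scores: [0.92, 0.87, 0.81]
};
const strongMatches = selectMemories(example, 0.85); // keeps the first two entries
```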

POST

/api/upload

Upload audio, PDF, or text files for memory extraction. Audio is transcribed via Whisper, PDFs are parsed automatically.

Supported file types

Audio (25MB max)

mp3, wav, m4a, webm, ogg, flac

Documents (10MB max)

pdf

Text (5MB max)

txt, md, csv
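
A local pre-check against these limits can avoid uploading a file that will be rejected. This is a minimal sketch under two assumptions: checkUpload is a hypothetical helper name, and 1MB is taken to mean 1024 × 1024 bytes, which the reference does not specify.

```javascript
// Assumption: "MB" means 1024 * 1024 bytes.
const MB = 1024 * 1024;

// Extensions and size limits copied from the list above.
const FILE_LIMITS = [
  { kind: 'audio', maxBytes: 25 * MB, extensions: ['mp3', 'wav', 'm4a', 'webm', 'ogg', 'flac'] },
  { kind: 'document', maxBytes: 10 * MB, extensions: ['pdf'] },
  { kind: 'text', maxBytes: 5 * MB, extensions: ['txt', 'md', 'csv'] }
];

// Hypothetical helper (not an Orthanc API): checks a file name and size
// against the documented limits before calling /api/upload.
function checkUpload(fileName, sizeBytes) {
  const dot = fileName.lastIndexOf('.');
  const ext = dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase();
  const rule = FILE_LIMITS.find((r) => r.extensions.includes(ext));
  if (!rule) {
    return { ok: false, reason: `unsupported file type: ${ext || '(none)'}` };
  }
  if (sizeBytes > rule.maxBytes) {
    return { ok: false, reason: `${rule.kind} files are limited to ${rule.maxBytes / MB}MB` };
  }
  return { ok: true, kind: rule.kind };
}
```

In a browser, checkUpload(file.name, file.size) works directly with a File object.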

Form Data

file (File, required)

The file to upload.

userId (string, required)

Unique identifier for your end user.

options (JSON string, optional)

Optional settings (sync, source, tags, importance, category).

Example: Upload audio file

javascript
const formData = new FormData();
formData.append('file', audioFile); // .mp3, .wav, etc.
formData.append('userId', 'user_123');

const response = await fetch('https://api.orthanc.ai/api/upload', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer sk-mem-your-api-key' },
  body: formData
});

// Response: { "status": "queued", "inputFormat": "audio", "fileName": "voice.mp3" }

Example: Upload PDF (sync mode)

javascript
const formData = new FormData();
formData.append('file', pdfFile);
formData.append('userId', 'user_123');
formData.append('options', JSON.stringify({ sync: true })); // Wait for processing

const response = await fetch('https://api.orthanc.ai/api/upload', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer sk-mem-your-api-key' },
  body: formData
});

const data = await response.json();
// data.result: { factsExtracted: 12, memoriesInserted: 8 }

curl example

bash
curl -X POST https://api.orthanc.ai/api/upload \
  -H "Authorization: Bearer sk-mem-your-api-key" \
  -F "file=@document.pdf" \
  -F "userId=user_123" \
  -F 'options={"sync": true}'

By default, files are processed asynchronously (fast response). Set sync: true in options to wait for processing.

REST

/api/webhooks

Get real-time notifications when memories change. Supports HMAC signatures for security.

Available events

memory.created, memory.updated, memory.deleted, memory.batch_created, memory.batch_deleted

Create a webhook

bash
curl -X POST https://api.orthanc.ai/api/webhooks \
  -H "Authorization: Bearer sk-mem-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-server.com/webhook",
    "events": ["memory.created", "memory.updated"],
    "secret": "whsec_your_secret"
  }'

List webhooks

bash
curl https://api.orthanc.ai/api/webhooks \
  -H "Authorization: Bearer sk-mem-your-api-key"

# Response: { "webhooks": [...], "count": 2 }

JavaScript example

javascript
// Create webhook
const response = await fetch('https://api.orthanc.ai/api/webhooks', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-mem-your-api-key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://your-server.com/webhook',
    events: ['memory.created', 'memory.updated']
  })
});

// List webhooks
const list = await fetch('https://api.orthanc.ai/api/webhooks', {
  headers: { 'Authorization': 'Bearer sk-mem-your-api-key' }
});

// Delete webhook
await fetch('https://api.orthanc.ai/api/webhooks/webhook-id', {
  method: 'DELETE',
  headers: { 'Authorization': 'Bearer sk-mem-your-api-key' }
});

Webhook payload

json
{
  "event": "memory.created",
  "timestamp": "2026-02-04T12:00:00Z",
  "data": {
    "userId": "user_123",
    "memory": "User works at Stripe"
  }
}

If you set a secret, verify the X-Webhook-Signature header using HMAC-SHA256.

POST

/api/memories/batch

Create, update, or delete up to 100 memories in a single request. Perfect for migrations and bulk operations.

Request Body

userId (string, required)

User whose memories to modify.

operations (array, required)

Array of operations (max 100).

create

{ action: 'create', text: 'fact to store', options: {...} }

update

{ action: 'update', id: 'mem-uuid', updates: { category, importance } }

delete

{ action: 'delete', id: 'mem-uuid' }

Example request

bash
curl -X POST https://api.orthanc.ai/api/memories/batch \
  -H "Authorization: Bearer sk-mem-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "user_123",
    "operations": [
      { "action": "create", "text": "User prefers dark mode" },
      { "action": "create", "text": "User lives in NYC" },
      { "action": "delete", "id": "mem-old-123" }
    ]
  }'

Response

json
{
  "processed": 3,
  "results": {
    "created": 2,
    "updated": 0,
    "deleted": 1,
    "failed": 0
  }
}
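
For migrations with more than 100 operations, the list has to be split across several requests. A minimal sketch of that follows. chunkOperations and batchAll are hypothetical helpers, and the fetchImpl parameter exists only so the network call can be stubbed in tests. It assumes each response has the shape shown above.

```javascript
const MAX_OPERATIONS_PER_BATCH = 100; // limit stated in the request body reference

// Splits an operations array into chunks of at most `size` items.
function chunkOperations(operations, size = MAX_OPERATIONS_PER_BATCH) {
  const chunks = [];
  for (let i = 0; i < operations.length; i += size) {
    chunks.push(operations.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical helper (not an Orthanc API): sends the chunks one at a time
// and sums the per-request result counters.
async function batchAll(userId, operations, apiKey, fetchImpl = fetch) {
  const totals = { created: 0, updated: 0, deleted: 0, failed: 0 };
  for (const chunk of chunkOperations(operations)) {
    const res = await fetchImpl('https://api.orthanc.ai/api/memories/batch', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ userId, operations: chunk })
    });
    if (!res.ok) {
      throw new Error(`Batch request failed with status ${res.status}`);
    }
    const { results = {} } = await res.json();
    for (const key of Object.keys(totals)) {
      totals[key] += results[key] ?? 0;
    }
  }
  return totals;
}
```

Sending chunks sequentially keeps the request rate low. Whether parallel batches are safe depends on rate limits this reference does not state.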

JavaScript example

javascript
const response = await fetch('https://api.orthanc.ai/api/memories/batch', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-mem-your-api-key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    userId: 'user_123',
    operations: [
      { action: 'create', text: 'User prefers dark mode' },
      { action: 'create', text: 'User lives in NYC' },
      { action: 'update', id: 'mem-uuid', updates: { importance: 0.9 } },
      { action: 'delete', id: 'mem-old-123' }
    ]
  })
});

const data = await response.json();
console.log(data.results); // { created: 2, updated: 1, deleted: 1, failed: 0 }

GET

/api/memories/export

Export all memories for a user as JSON or CSV. Full data portability, no lock-in.

Query Parameters

userId (string, optional)

Filter by user ID. Omit to export all project memories.

format (string, optional)

Export format: json or csv.

Default: json

limit (number, optional)

Memories per page (1-1000).

Default: 100

offset (number, optional)

Pagination offset.

Default: 0

includeEmbeddings (boolean, optional)

Include embedding vectors (large!).

Default: false

Example: Export as JSON

bash
curl "https://api.orthanc.ai/api/memories/export?userId=user_123&limit=100" \
  -H "Authorization: Bearer sk-mem-your-api-key"

# Response:
{
  "userId": "user_123",
  "total": 247,
  "count": 100,
  "hasMore": true,
  "memories": [...]
}

Example: Export as CSV

bash
curl "https://api.orthanc.ai/api/memories/export?userId=user_123&format=csv" \
  -H "Authorization: Bearer sk-mem-your-api-key" \
  -o memories.csv

JavaScript example

javascript
// Export memories for a user
const response = await fetch(
  'https://api.orthanc.ai/api/memories/export?userId=user_123&limit=100',
  { headers: { 'Authorization': 'Bearer sk-mem-your-api-key' } }
);

const data = await response.json();
console.log(data.memories);  // Array of memory objects
console.log(data.hasMore);   // true if more pages exist

// Paginate through all memories
let allMemories = [];
let offset = 0;
while (true) {
  const res = await fetch(
    `https://api.orthanc.ai/api/memories/export?userId=user_123&limit=100&offset=${offset}`,
    { headers: { 'Authorization': 'Bearer sk-mem-your-api-key' } }
  );
  const page = await res.json();
  allMemories.push(...page.memories);
  if (!page.hasMore) break;
  offset += 100;
}

Use pagination (offset) to export large datasets. Check hasMore to know if more pages exist.

How It Works

Orthanc uses a "Split-Brain" architecture optimized for voice-grade speed, orchestrated into a seamless API.

1

Fact Extraction

When you call /api/sync, our extraction pipeline identifies discrete facts from the conversation. Greetings and small talk are filtered out. Each fact is atomic and self-contained.

2

Embedding Generation

Each fact is converted into a high-dimensional vector using optimized embedding models. This enables semantic search beyond simple keyword matching, with sub-50ms embedding latency.

3

Contradiction Detection

Before storing, we check for contradictions with existing memories. If a user says "I moved to New York" but previously said "I live in San Francisco", the old memory is updated intelligently.

4

Hybrid Retrieval

When you call /api/context, we combine semantic search (vector similarity) with keyword search (full-text) and re-ranking for 99% retrieval accuracy.

HyDE Enhancement

HyDE (Hypothetical Document Embeddings) improves search accuracy for vague or indirect queries.

Instead of embedding the raw query, HyDE first generates a hypothetical "ideal answer" that would match the query, then embeds that. This bridges the gap between how users ask questions and how facts are stored.

User Query

"What do I like to eat?"

HyDE Query

"The user enjoys eating Thai food, particularly spicy dishes..."

HyDE is enabled by default. Disable it with useHyDE: false for direct keyword-style queries.

Intent Detection

Orthanc automatically detects query intent and applies specialized handling.

| Intent | Example | Behavior |
| --- | --- | --- |
| simple | "What's my job?" | Standard retrieval |
| list_all | "What restaurants do I like?" | Returns all matching items |
| count | "How many projects?" | Returns count + categories |
| comparison | "Do I prefer React or Vue?" | Retrieves both for comparison |
| temporal | "What did I say yesterday?" | Applies time filter |
| existence | "Have I mentioned my dog?" | Returns found + confidence |
| inference | "Would I enjoy this?" | Multi-hop reasoning |

Error Codes

All errors return a JSON response with an error message and code.

| Status | Code | Description |
| --- | --- | --- |
| 400 | VALIDATION_ERROR | Missing or invalid field |
| 401 | MISSING_AUTH | Missing Authorization header |
| 401 | INVALID_API_KEY | Invalid or revoked API key |
| 429 | RATE_LIMIT_EXCEEDED | Rate limit exceeded |
| 500 | INTERNAL_ERROR | Internal server error |

Example error response

json
{
  "error": "Invalid or revoked API key",
  "code": "INVALID_API_KEY"
}
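
When calling the REST endpoints directly, it can help to wrap fetch so that errors surface the code field and transient failures are retried. The sketch below is one possible approach, not the SDK's implementation. orthancRequest is a hypothetical helper, and the retry policy (429 and 5xx only, three retries, delays doubling from 250ms) is an assumption.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper (not an Orthanc API). Retries 429 and 5xx responses with
// exponential backoff; other error statuses throw immediately.
// `fetchImpl` and `sleep` are injectable so they can be stubbed in tests.
async function orthancRequest(url, init, options = {}) {
  const { retries = 3, baseDelayMs = 250, fetchImpl = fetch, sleep = defaultSleep } = options;

  for (let attempt = 0; ; attempt++) {
    const res = await fetchImpl(url, init);
    if (res.ok) return res.json();

    // Error bodies are documented as { error, code }; tolerate non-JSON bodies.
    const body = await res.json().catch(() => ({}));
    const retryable = res.status === 429 || res.status >= 500;

    if (!retryable || attempt >= retries) {
      const err = new Error(body.error || `HTTP ${res.status}`);
      err.code = body.code;
      err.status = res.status;
      throw err;
    }
    await sleep(baseDelayMs * 2 ** attempt); // 250ms, 500ms, 1000ms, ...
  }
}
```

Callers can then branch on err.code, for example treating INVALID_API_KEY as a configuration problem rather than something to retry.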

Best Practices

Sync after every call ends

Call /api/sync with the full transcript after each call ends. Use your voice platform's post-call webhook (Vapi, Retell, Bland all support this). The endpoint is async, so it won't block your flow.

Fetch context before each call

Always call /api/context before the voice AI answers. Sub-200ms retrieval fits within the call setup window. Inject the memories into your agent's system prompt.
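
Since the lookup sits in the call setup path, it is worth bounding how long it can take. The following sketch returns an empty list if the request fails or exceeds a deadline, so the call can proceed without context. fetchContextWithTimeout is a hypothetical helper, and the 500ms default is an arbitrary assumption to tune for your platform.

```javascript
// Hypothetical helper (not an Orthanc API): fetches memories for a caller,
// or returns [] if the request fails or takes longer than `timeoutMs`.
// `fetchImpl` is injectable so it can be stubbed in tests.
async function fetchContextWithTimeout(userId, apiKey, options = {}) {
  const { timeoutMs = 500, fetchImpl = fetch } = options;
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  try {
    const res = await fetchImpl('https://api.orthanc.ai/api/context', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        userId,
        messages: [{ role: 'user', content: 'incoming call' }]
      }),
      signal: controller.signal
    });
    if (!res.ok) return [];
    const data = await res.json();
    return data.memories ?? [];
  } catch {
    // Abort (timeout) or network error: start the call without memories.
    return [];
  } finally {
    clearTimeout(timer);
  }
}
```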

Use phone number as user ID

Use the caller's phone number (E.164 format like +14155551234) as the userId. This ensures memory persists across all calls from the same number. Alternatively, use your CRM customer ID.
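
Memory is keyed on the exact userId string, so "(415) 555-1234" and "+14155551234" would be stored as two different callers unless numbers are normalized first. The sketch below handles common cases only. toE164 is a hypothetical helper, and it assumes 10-digit inputs are national numbers under a default country code of 1. For production use, a dedicated phone-number library such as libphonenumber is a safer choice.

```javascript
// Hypothetical helper (not an Orthanc API): normalizes a phone number to
// E.164, or returns null if the result does not look valid.
// Assumption: a 10-digit input without "+" is a national number under
// `defaultCountryCode`; any other input already includes its country code.
function toE164(input, defaultCountryCode = '1') {
  const trimmed = String(input).trim();
  const digits = trimmed.replace(/\D/g, '');

  let normalized;
  if (!trimmed.startsWith('+') && digits.length === 10) {
    normalized = '+' + defaultCountryCode + digits;
  } else {
    normalized = '+' + digits;
  }

  // E.164: "+", a non-zero leading digit, at most 15 digits in total.
  return /^\+[1-9]\d{1,14}$/.test(normalized) ? normalized : null;
}
```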

Include recent messages for context

When calling /api/context, include the last few transcript lines. This helps resolve references ("the issue we discussed") and improves retrieval relevance.
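
One way to do this during a live call is to send only a trailing window of the transcript. This is a minimal sketch: buildContextMessages is a hypothetical helper, the window of six messages is an arbitrary assumption, and the 'incoming call' fallback mirrors the placeholder query used in the Quickstart.

```javascript
// Hypothetical helper (not an Orthanc API): returns the last `maxMessages`
// transcript entries for use as the `messages` field of /api/context.
// Falls back to a placeholder query when the window has no user message,
// because the last user message is what the endpoint searches on.
function buildContextMessages(transcript, maxMessages = 6) {
  const windowSize = Math.max(1, maxMessages);
  const recent = transcript.slice(-windowSize);
  const hasUserMessage = recent.some((m) => m.role === 'user');
  return hasUserMessage ? recent : [{ role: 'user', content: 'incoming call' }];
}
```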

Keep API keys server-side

Never expose your API key in client-side code. Make all Orthanc API calls from your backend or serverless functions.

3 Lines of Code

Add persistent memory to your voice AI in under 5 minutes.

1

Install the SDK

bash
npm install @orthanc-protocol/client

2

Add 3 lines to your voice AI

Add these lines to your call handler (works with Vapi, Retell, Bland, or custom):

javascript
import { OrthancsClient } from '@orthanc-protocol/client';

const memory = new OrthancsClient({
  endpoint: 'https://api.orthanc.ai',
  apiKey: 'sk-mem-YOUR_API_KEY'  // Get key from dashboard
});

// After call ends — sync the transcript:
await memory.syncMessages(callerPhone, transcript);

// Before next call — fetch caller context:
const { memories } = await memory.query(callerPhone, 'incoming call');

callerPhone = caller's phone number or CRM ID

transcript = array of message objects from the call

3

Inject into voice AI prompt

Add the memories to your voice agent's system prompt:

javascript
const systemPrompt = `You are a helpful voice assistant.

What you know about this caller:
${memories.map(m => '- ' + m).join('\n')}

Use this context naturally. Don't read it back verbatim.`;

// Pass to Vapi/Retell/Bland as the system prompt

Done

Your voice AI now remembers every caller. When someone mentions their insurance changed on Monday, your AI knows it on Friday's follow-up call.

3 lines of Orthanc code. Persistent voice AI memory.

Get your API key from the dashboard.

All 6 SDK Methods

memory.syncMessages() — Sync call transcripts

javascript
await memory.syncMessages('+14155551234', [
  { role: 'user', content: 'I need to update my address to 123 Oak Street' },
  { role: 'assistant', content: 'I have updated your address on file.' }
]);
// Response: { status: 'queued', result: { factsExtracted: 2, memoriesInserted: 1 } }

memory.query() — Retrieve caller context

javascript
const { memories, count, latency_ms } = await memory.query('+14155551234', 'incoming call');
// memories: ['Caller address updated to 123 Oak Street', 'Insurance: Aetna']
// latency_ms: 45

// With options
const result = await memory.query('+14155551234', 'appointment history', {
  timeFilter: 'month',
  matchCount: 10
});

upload via fetch — Ingest files (PDF, audio, text)

javascript
// Upload a PDF via the API directly
const formData = new FormData();
formData.append('file', pdfFile);
formData.append('userId', 'user_123');

await fetch('https://api.orthanc.ai/api/upload', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer sk-mem-your-key' },
  body: formData
});

memory.batch() — Bulk create/update/delete (max 100)

javascript
const result = await memory.batch({
  userId: 'user_123',
  operations: [
    { action: 'create', text: 'User loves hiking in mountains' },
    { action: 'create', text: 'User prefers morning workouts' },
    { action: 'update', id: 'mem-uuid', updates: { importance: 0.9 } },
    { action: 'delete', id: 'old-memory-id' }
  ]
});

console.log(result.results);
// { created: 2, updated: 1, deleted: 1, failed: 0 }

memory.exportAll() — Download memories

javascript
// Export all memories for a user (auto-paginates)
const memories = await memory.exportAll('user_123');

for (const mem of memories) {
  console.log(mem.content, mem.createdAt);
}

memory.createWebhook() — Real-time notifications

javascript
// Create a webhook
await memory.createWebhook({
  url: 'https://your-server.com/webhook',
  events: ['memory.created', 'memory.updated'],
  secret: 'whsec_your_secret'
});

// List all webhooks
const webhooks = await memory.listWebhooks();

// Delete a webhook
await memory.deleteWebhook('webhook-id');

SDK methods — @orthanc-protocol/client

syncMessages() — Save conversations

query() — Retrieve memories

POST /api/upload — Ingest files

batch() — Bulk create/update/delete

exportAll() — Download memories

createWebhook() — Real-time events

Auto-retry with exponential backoff

Full TypeScript support with types