Quick links
- Live (UI served from docs/ if deployed to GitHub Pages): static upload + ask demo
- Source code: GitHub repository
- Agent files: src/mastra/agents/*
- Tools: src/mastra/tools/*
- Workflow: src/mastra/workflows/pdf-workflow.ts
- Server: src/server.ts
What you’ll build
- Dual Mastra agents:
  - pdfAgent (multi‑PDF retrieval across all uploaded docs)
  - singlePdfAgent (restricted to the latest uploaded PDF)
- Deterministic retrieval + answer workflow (pdfWorkflow)
- Express API with upload, ask (JSON), and streaming (SSE) endpoints
- Persisted hybrid vector + BM25 store (.data/namespacepdf)
- Integration with CometChat AI Agents (endpoint wiring)
How it works
- Users upload PDF(s) → server parses pages → chunks + embeddings generated → persisted in vector + manifest store.
- Hybrid retrieval ranks candidate chunks: score = α * cosine + (1-α) * sigmoid(BM25), where α = hybridAlpha.
- Multi‑query expansion (optional) generates paraphrased variants to boost recall.
- Tools (retrieve-pdf-context, retrieve-single-pdf-context) return stitched context + source metadata.
- Workflow (pdfWorkflow) orchestrates retrieval + answer; streaming endpoints emit meta, then incremental token events.
- Mastra agent(s) are exposed via REST endpoints you wire into CometChat AI Agents.
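The hybrid scoring formula in the list above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function names (sigmoid, cosine, hybridScore) are hypothetical, and only the formula itself comes from this page.

```typescript
// Hypothetical helper names; only the formula is taken from the text.

/** Logistic squashing of an unbounded BM25 score into (0, 1). */
function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

/** Cosine similarity of two equal-length vectors; 0 if either is a zero vector. */
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

/** score = α * cosine + (1 - α) * sigmoid(BM25), with α = hybridAlpha in [0, 1]. */
function hybridScore(cosineSim: number, bm25: number, hybridAlpha: number): number {
  if (hybridAlpha < 0 || hybridAlpha > 1) {
    throw new Error("hybridAlpha must be in [0, 1]");
  }
  return hybridAlpha * cosineSim + (1 - hybridAlpha) * sigmoid(bm25);
}
```

Note that sigmoid(0) is 0.5, so under this formula a chunk with no lexical match still receives half of the lexical weight; whether the real store offsets or normalizes BM25 before squashing is not stated here.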
Repo layout (key files)
- src/mastra/agents/pdf-agent.ts – multi‑PDF agent
- src/mastra/agents/single-pdf-agent.ts – single latest PDF agent
- src/mastra/tools/retrieve-pdf-context.ts / retrieve-single-pdf-context.ts – hybrid retrieval tools
- src/mastra/workflows/pdf-workflow.ts – deterministic orchestration
- src/lib/* – vector store, embeddings, manifest, PDF parsing
- src/server.ts – Express API (upload, ask, streaming, manifest ops)
- docs/index.html – optional static UI
- .data/ – persisted vectors + manifest JSON
Prerequisites
- Node.js 20+
- OPENAI_API_KEY (embeddings + chat model)
- A CometChat app (to register the agent)
- (Optional) CORS_ORIGIN if restricting browser origins
Step 1 — Clone & install
Clone the example and install dependencies, then create a .env with at least OPENAI_API_KEY.
Step 2 — Define retrieval tools
File: src/mastra/tools/retrieve-pdf-context.ts (multi) and retrieve-single-pdf-context.ts (single)
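The tools are described as returning stitched context plus source metadata, capped by maxContextChars. The sketch below shows one way that stitching step might look. The RankedChunk and StitchedContext shapes and the stitchContext name are assumptions made for illustration; the actual tool files may structure their output differently.

```typescript
// Hypothetical shapes; the real tools' return types may differ.
interface RankedChunk {
  docId: string;
  page: number;
  text: string;
  score: number;
}

interface StitchedContext {
  context: string;
  sources: { docId: string; page: number; score: number }[];
}

/**
 * Joins the highest-scoring chunks with blank lines until adding another
 * chunk would exceed maxContextChars. Chunks that do not fit are skipped.
 */
function stitchContext(chunks: RankedChunk[], maxContextChars: number): StitchedContext {
  const ranked = chunks.slice().sort((a, b) => b.score - a.score);
  const parts: string[] = [];
  const sources: StitchedContext["sources"] = [];
  let used = 0;
  for (const chunk of ranked) {
    const separator = parts.length > 0 ? 2 : 0; // length of "\n\n" between chunks
    if (used + separator + chunk.text.length > maxContextChars) continue;
    parts.push(chunk.text);
    sources.push({ docId: chunk.docId, page: chunk.page, score: chunk.score });
    used += separator + chunk.text.length;
  }
  return { context: parts.join("\n\n"), sources };
}
```

Skipping an oversized chunk instead of truncating it keeps every cited source fully present in the context, at the cost of occasionally dropping a high-ranked chunk.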
Step 3 — Create agents
Files: src/mastra/agents/pdf-agent.ts, src/mastra/agents/single-pdf-agent.ts
Step 4 — Wire Mastra & workflow
File: src/mastra/index.ts registers agents + pdfWorkflow with storage (LibSQL or file‑backed).
Step 5 — Run locally
Step 6 — Upload and ask (API)
| Agent | File | Purpose | Tool |
|---|---|---|---|
| pdfAgent | src/mastra/agents/pdf-agent.ts | Multi‑PDF retrieval QA | retrieve-pdf-context |
| singlePdfAgent | src/mastra/agents/single-pdf-agent.ts | Latest single PDF QA | retrieve-single-pdf-context |
topK, more qVariants) if initial context is sparse.
JSON ask (Mastra dev agent route)
Mastra automatically exposes the API:
Step 7 — Deploy & connect to CometChat
- Deploy the project (e.g., Vercel, Railway, or AWS).
- Copy the deployed endpoint URL.
- In CometChat Dashboard → AI Agents, add a new agent:
  - Agent ID: knowledge
  - Endpoint: https://your-deployed-url/api/agents/knowledge/generate
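Before wiring the endpoint into CometChat, it can help to send it a request by hand. The sketch below only builds the request. It assumes the generate route accepts a JSON body with a messages array, which is how Mastra's agent routes are commonly called, but the exact body your deployed version expects should be confirmed against the Mastra documentation. The buildGenerateRequest helper is hypothetical.

```typescript
interface GenerateRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

/**
 * Builds a POST request for <baseUrl>/api/agents/<agentId>/generate.
 * Assumption: the route accepts { messages: [{ role, content }] }.
 */
function buildGenerateRequest(baseUrl: string, agentId: string, question: string): GenerateRequest {
  const trimmedBase = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${trimmedBase}/api/agents/${encodeURIComponent(agentId)}/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages: [{ role: "user", content: question }] }),
    },
  };
}

// Usage (not executed here):
//   const { url, init } = buildGenerateRequest("https://your-deployed-url", "knowledge", "Summarize the PDF");
//   const response = await fetch(url, init);
```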
Step 8 — Optimize & extend
- Add more documents to the docs/ folder.
- Use embeddings + vector DB (Pinecone, Weaviate) for large datasets.
- Extend the agent with memory or multi-tool workflows.
Environment variables
| Name | Description |
|---|---|
| OPENAI_API_KEY | Required for embeddings + model |
| CORS_ORIGIN | Optional allowed browser origin |
| PORT | Server port (default 3000) |
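The variables in the table could be read and validated at startup along these lines. This is a sketch only: loadConfig and ServerConfig are hypothetical names, and the repository's server may read its environment differently.

```typescript
interface ServerConfig {
  openaiApiKey: string;
  corsOrigin: string | undefined;
  port: number;
}

/** Validates the environment described in the table; PORT defaults to 3000. */
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const openaiApiKey = env["OPENAI_API_KEY"];
  if (!openaiApiKey) throw new Error("OPENAI_API_KEY is required");

  const rawPort = env["PORT"];
  const port = rawPort ? Number(rawPort) : 3000;
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }

  // An empty CORS_ORIGIN is treated the same as an unset one.
  const corsOrigin = env["CORS_ORIGIN"] || undefined;
  return { openaiApiKey, corsOrigin, port };
}

// Usage in Node (not executed here): const config = loadConfig(process.env);
```

Failing fast on a missing key surfaces the problem at boot, before the first upload triggers an embedding request.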
.env example:
Endpoint summary
| Method | Path | Description |
|---|---|---|
| POST | /api/upload | Upload a PDF (multipart); returns { docId, pages, chunks } |
| GET | /api/documents | List ingested documents |
| DELETE | /api/documents/:id | Delete a document + vectors |
| GET | /api/documents/:id/file | Stream stored original PDF |
| POST | /api/ask | Multi‑PDF retrieval + answer (JSON) |
| POST | /api/ask/full | Same as /api/ask (deterministic path) |
| POST | /api/ask/stream | Multi‑PDF streaming (SSE) |
| POST | /api/ask/single | Single latest PDF answer (JSON) |
| POST | /api/ask/single/stream | Single latest PDF streaming (SSE) |
Curl Examples
Upload:
SSE Events
| Event | Payload | Notes |
|---|---|---|
| meta | { sources, docId? } | First packet with retrieval metadata |
| token | { token } | Incremental answer token chunk |
| done | {} | Completion marker |
| error | { error } | Error occurred |
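A client has to split the stream into these events itself, because the streaming endpoints are POST routes and the browser's EventSource only issues GET requests. The sketch below parses already-received SSE text. It assumes standard SSE framing (an event: line, a data: line with a JSON payload, and a blank line between events); the server's exact wire format is not shown on this page, and parseSseEvents is a hypothetical helper.

```typescript
interface SseEvent {
  event: string;
  data: unknown;
}

/**
 * Parses complete SSE text into events. Assumes each event's data is JSON.
 * A streaming client would additionally buffer any trailing partial block
 * until its terminating blank line arrives.
 */
function parseSseEvents(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let event = "message"; // default event name in the SSE format
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        // trim() is safe here because the payload is JSON
        dataLines.push(line.slice("data:".length).trim());
      }
    }
    if (dataLines.length === 0) continue; // skip empty blocks and comments
    events.push({ event, data: JSON.parse(dataLines.join("\n")) });
  }
  return events;
}
```

With this in place, a consumer can read meta once for the sources, append each token payload to the answer, and stop on done or error.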
Tuning & retrieval knobs
| Parameter | Effect | Trade‑off |
|---|---|---|
| hybridAlpha | Higher = more semantic weight | Too high reduces keyword recall |
| topK | More chunks = broader context | Larger responses, slower |
| multiQuery | Recall across paraphrases | Extra model + embedding cost |
| qVariants | Alternative queries for expansion | Diminishing returns >5 |
| maxContextChars | Caps stitched context size | Too small omits evidence |
topK=8, qVariants=5.
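To make the interaction between qVariants and topK concrete, the sketch below merges the ranked results of several query variants into a single topK list. The merge rule used here (keep each chunk's best score across variants) is an assumption chosen for illustration; the repository may fuse variant results differently, and mergeVariantResults is a hypothetical name.

```typescript
interface ScoredChunk {
  id: string;
  score: number;
}

/**
 * Deduplicates chunks returned for several query variants, keeping each
 * chunk's best score, then returns the topK highest-scoring chunks.
 */
function mergeVariantResults(resultsPerVariant: ScoredChunk[][], topK: number): ScoredChunk[] {
  const best = new Map<string, number>();
  for (const results of resultsPerVariant) {
    for (const { id, score } of results) {
      const previous = best.get(id);
      if (previous === undefined || score > previous) best.set(id, score);
    }
  }
  return Array.from(best.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Each extra variant adds one more result list to embed and score, which is where the extra model and embedding cost noted in the table comes from.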
Troubleshooting & debugging
- Enable internal logging (if available) to inspect scoring.
- Inspect vectors: open .data/pdf-vectors.json.
- Manifest corrupted? Delete .data/manifest.json and re‑upload.
- Low lexical relevance? Lower hybridAlpha (e.g. 0.55).
- Noise / irrelevant chunks? Reduce topK or lower qVariants.
Hardening & roadmap
- SSE/WebSocket answer token streaming to clients (UI consumption)
- Source highlighting + per‑chunk confidence
- Semantic / layout‑aware advanced chunking
- Vector deduplication & compression
- Auth layer (API keys / JWT) & per‑user isolation
- Background ingestion queue for large docs
- Retrieval quality regression tests
Repository Links
- Source: GitHub Repository
- Multi agent: pdf-agent.ts
- Single agent: single-pdf-agent.ts
- Tools: retrieve-pdf-context.ts, retrieve-single-pdf-context.ts
- Workflow: pdf-workflow.ts
- Server: server.ts