Private document chat,
built in.
Drop a file. Ask anything. Nothing leaves your machine.
Most "chat with your documents" tools work by uploading your file to a server, embedding it with a cloud API, and storing the chunks in someone else's vector database. That's three places your data has touched before the model answers a single question. tailor. does the entire pipeline locally: chunking, embedding, retrieval, and inference all run on your device.
What you can drop in
PDFs (text and scanned, with OCR via a multimodal model). Word documents. Spreadsheets, where tailor. understands cell context and can run analysis. Source code with language-aware semantic chunking, so the model sees functions and classes as units instead of arbitrary line ranges. Images and charts, processed directly by vision models. Audio files, transcribed by local Whisper before retrieval. Markdown, plain text, ePub, JSON, CSV.
How retrieval actually works
tailor. runs a local embedding model (you can pick one in settings; the default is a strong general-purpose model under 500 MB). When you add a file, it's chunked, embedded, and stored in a local SQLite-backed vector index. Queries are embedded the same way, matching chunks are retrieved by cosine similarity with optional BM25 hybrid scoring, and the top-K chunks are passed to the chat model with citations. Citations link back to the exact page or line in the original file.
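To make that pipeline concrete, here is a minimal sketch of a SQLite-backed index with cosine retrieval and optional BM25 blending. It is an illustration under stated assumptions, not tailor.'s actual implementation: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model, and the table schema, the function names, and the `alpha` blend weight are all hypothetical.

```python
import json
import math
import re
import sqlite3
import zlib
from collections import Counter

DIM = 256  # toy vector size (assumption); real embedding models differ


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_embed(text):
    """Stand-in for a real embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIM
    for tok in tokenize(text):
        vec[zlib.crc32(tok.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    # Inputs are unit length (or all zero), so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenised chunk against the query."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n if n else 0.0
    df = Counter(tok for d in docs for tok in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for tok in set(tokenize(query)):
            if tok not in tf:
                continue
            idf = math.log(1 + (n - df[tok] + 0.5) / (df[tok] + 0.5))
            denom = tf[tok] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[tok] * (k1 + 1) / denom
        scores.append(score)
    return scores


def open_index(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        "id INTEGER PRIMARY KEY, source TEXT, location TEXT, "
        "text TEXT, embedding TEXT)"
    )
    return conn


def add_chunk(conn, source, location, text):
    # The location (page or line) is stored so answers can cite it later.
    conn.execute(
        "INSERT INTO chunks (source, location, text, embedding) VALUES (?, ?, ?, ?)",
        (source, location, text, json.dumps(toy_embed(text))),
    )


def search(conn, query, top_k=3, alpha=0.7):
    """Return the top_k chunks as (score, source, location, text) tuples."""
    rows = conn.execute(
        "SELECT source, location, text, embedding FROM chunks"
    ).fetchall()
    if not rows:
        return []
    q = toy_embed(query)
    dense = [cosine(q, json.loads(r[3])) for r in rows]
    sparse = bm25_scores(query, [tokenize(r[2]) for r in rows])
    top_sparse = max(sparse) or 1.0  # scale BM25 into [0, 1] before blending
    scored = [
        (alpha * d + (1 - alpha) * s / top_sparse, r[0], r[1], r[2])
        for d, s, r in zip(dense, sparse, rows)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


conn = open_index()
add_chunk(conn, "report.pdf", "page 3", "Quarterly revenue grew twelve percent year over year.")
add_chunk(conn, "report.pdf", "page 7", "Headcount stayed flat across engineering teams.")
add_chunk(conn, "notes.md", "line 12", "Remember to water the office plants on Friday.")

hits = search(conn, "How much did revenue grow?", top_k=2)
for score, source, location, text in hits:
    print(f"{score:.3f}  {source} ({location}): {text}")
```

Setting `alpha=1.0` gives pure cosine retrieval, and lower values give keyword overlap more weight. This sketch scans every row on each query, which is fine for a small example; a production index would likely use an approximate nearest-neighbour structure instead.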
Why this is better than uploading to ChatGPT
Privacy comes first: nothing leaves the device. But there's a performance angle too: ChatGPT-style document chat tends to truncate or summarize large files because of context-window economics on the cloud side. tailor. uses as much context as your local model supports, which means you can routinely work with 200-page reports or whole codebases without the model losing track.
Scanned PDFs and image-heavy documents
When a PDF has scanned pages or embedded images, tailor. uses a local vision-capable model to extract content directly. No separate OCR step, no cloud OCR API. The same loop handles whiteboard photos, screenshots, and chart images.
Codebases
Point tailor. at a directory and it walks every file with the right strategy: language-aware chunking that splits source code on function and class boundaries, Markdown handling for docs, structure-preserving handling for config. Then ask questions like "where is the authentication middleware defined" or "what's the difference between handleRequest in v2 and v3" and tailor. retrieves the right files before answering.
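As a rough illustration of splitting on function and class boundaries, the sketch below chunks Python source with the standard-library `ast` module. It covers only Python and only top-level definitions; the function name `chunk_python_source`, the chunk fields, and the sample code are hypothetical, and tailor.'s own chunker may work quite differently.

```python
import ast


def chunk_python_source(source):
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Decorators sit above node.lineno, so start from the earliest one.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            end = node.end_lineno  # available on Python 3.8+
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": start,
                "end_line": end,
                "text": "\n".join(lines[start - 1:end]),
            })
    return chunks


# Hypothetical sample file, used only to exercise the chunker.
EXAMPLE = '''import os

def handle_request(req):
    return req.upper()

class AuthMiddleware:
    def check(self, token):
        return bool(token)
'''

chunks = chunk_python_source(EXAMPLE)
for c in chunks:
    print(c["kind"], c["name"], f"lines {c['start_line']}-{c['end_line']}")
```

Keeping the start and end lines with each chunk is what lets a citation point back to an exact range in the original file, rather than to an arbitrary window of text.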