Local OpenAI API.

Drop-in /v1/chat/completions. No proxy. No API key in someone's database.

If you've ever wired your editor, a script, or an internal tool to OpenAI's API and then realized you don't want every call to leave your machine, tailor.'s local endpoint is the answer. It speaks the same wire format as OpenAI's /v1/chat/completions, runs on localhost:11435, and serves whatever local model you've selected in the app.

How to use it

Point any OpenAI-compatible client at http://localhost:11435/v1 and pass any string as the API key. The key is not validated; auth lives elsewhere, since the server is loopback-only by default. Example with the Python SDK:

Example: Python

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11435/v1",
    api_key="not-used",
)

resp = client.chat.completions.create(
    model="qwen2.5-coder:14b",
    messages=[{"role": "user", "content": "Write a haiku about TCP."}],
)
print(resp.choices[0].message.content)
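
If you'd rather not install the SDK, the same request can be sketched with the standard library alone. This is a minimal illustration of the wire format, not tailor.'s own client code: the helper names build_chat_request and send_chat are made up for this example, and the Authorization value is a placeholder, since the key is not validated.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11435/v1"  # local endpoint from the text above


def build_chat_request(prompt, model="qwen2.5-coder:14b"):
    """Build (but do not send) a POST to /chat/completions."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Placeholder value: the key string is not validated.
            "Authorization": "Bearer not-used",
        },
        method="POST",
    )


def send_chat(prompt, timeout=60.0):
    """Send the request and return the assistant's reply text.

    Needs the local server to be running; assumes an OpenAI-shaped response.
    """
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

With the app running, send_chat("Write a haiku about TCP.") should return the same kind of text as the SDK call above.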

Example: Cursor

In Cursor settings, under "Models" → "Add custom model," set the base URL to http://localhost:11435/v1, pick any model name (tailor. routes to the active model regardless), and Cursor will use your local model for completions and chat.

Streaming

Server-sent events work the same way as OpenAI's. Setting stream=true returns token-by-token chunks in the OpenAI delta format, so most clients work without changes.
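
To make the delta format concrete, here is a rough sketch of what a client does with the stream. It assumes the chunks look like OpenAI's (lines prefixed with "data:", a JSON body with choices[].delta.content, and a final "data: [DONE]" line). The sample lines are hand-written for illustration, not captured from tailor., and iter_delta_text is a hypothetical helper name.

```python
import json


def iter_delta_text(lines):
    """Yield content fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel (assumed to match OpenAI's)
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample in the OpenAI chunk shape, not real server output.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":", world"}}]}',
    "data: [DONE]",
]
print("".join(iter_delta_text(sample)))  # prints "Hello, world"
```

In practice the SDK does this parsing for you when you pass stream=True; the sketch only shows why existing clients tend to work unchanged.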

Why this matters

Cloud LLM costs add up fast: a heavy Cursor user can spend $200+/month on Claude through Anthropic, and a heavy script user can spend more. tailor. is $11/month flat for unlimited local inference. The privacy story is the same as the rest of the app: nothing leaves your device, and no key sits in someone else's database. There are no rate limits either.

Questions

Is the endpoint exposed to other machines on my network?

By default it binds to 127.0.0.1 (loopback only). You can opt in to LAN exposure in settings if you want other devices on your network, for example ones on the same WiFi, to use it.

Does it support function calling / tools?

Yes. The tools parameter follows the OpenAI tool-calling schema and works with local models that support tool calling.
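
As a rough illustration of that schema, the sketch below builds a request body with one tool and pulls tool calls out of a response. Everything named here is an assumption for the example: get_weather is a made-up tool, extract_tool_calls is a hypothetical helper, and example_response is hand-written in the OpenAI shape rather than real output from a local model.

```python
import json

# Hypothetical tool definition in the OpenAI tool-calling schema.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Body you would POST to /v1/chat/completions.
request_body = {
    "model": "qwen2.5-coder:14b",
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": [get_weather_tool],
}


def extract_tool_calls(response):
    """Return (function name, parsed arguments) pairs from the first choice."""
    message = response["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    return [
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in calls
    ]


# Hand-written response in the OpenAI shape, for illustration only.
example_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_1",
                        "type": "function",
                        "function": {
                            "name": "get_weather",
                            "arguments": "{\"city\": \"Oslo\"}",
                        },
                    }
                ],
            }
        }
    ]
}
print(extract_tool_calls(example_response))
```

Note that arguments arrives as a JSON string in this schema, which is why the helper parses it before use. Whether a given local model actually emits tool calls depends on the model.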

Which OpenAI endpoints are supported?

/v1/chat/completions (streaming and non-streaming), /v1/models, /v1/health, and /v1/search. Image generation has its own local endpoint, and /v1/embeddings is on the roadmap.
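
A quick way to see what the server offers is to query /v1/models. The sketch below assumes the response uses OpenAI's list shape ({"data": [{"id": ...}, ...]}), which this page does not spell out; model_ids and fetch_model_ids are hypothetical helper names, and the sample listing is hand-written.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11435/v1"  # local endpoint from the text above


def model_ids(listing):
    """Extract model ids from an OpenAI-style /models response."""
    return [entry["id"] for entry in listing.get("data", [])]


def fetch_model_ids(base_url=BASE_URL, timeout=5.0):
    """GET /models from the local server (needs the app to be running)."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return model_ids(json.load(resp))


# Hand-written sample in the assumed shape, not real server output.
sample_listing = {
    "object": "list",
    "data": [{"id": "qwen2.5-coder:14b", "object": "model"}],
}
print(model_ids(sample_listing))  # prints ['qwen2.5-coder:14b']
```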

Can I run my own model name routing?

Yes. By default, tailor. accepts any model parameter and routes it to the active local model. You can also pin specific names to specific local models in settings.

Try tailor. free for 7 days.

Full access. No credit card required. Mac, Windows, and Linux.

Start free trial → See pricing