Connect To Your Own LLM Server
Connect directly from your browser to Ollama, LM Studio, llama.cpp, Jan, vLLM, TGI, and other OpenAI-compatible model servers on localhost, your LAN, or private networks such as Tailscale.
Remote mode is browser-direct: if your setup is blocked by mixed content, CORS, or local-network access restrictions, this page surfaces the browser or server error instead of proxying around it.
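Because this mode talks to the server straight from the page, a connection is just a `fetch` to an OpenAI-compatible endpoint. A minimal sketch, assuming your server exposes `/v1/chat/completions` (Ollama does at `http://localhost:11434/v1`); the base URL, model name, and API key here are placeholders:

```javascript
// Build a request for an OpenAI-compatible chat completions endpoint.
// baseUrl, model, and apiKey are placeholders -- adjust for your server.
function buildChatRequest(baseUrl, model, messages, apiKey) {
  const headers = { "Content-Type": "application/json" };
  // Many local servers ignore the key but some (e.g. vLLM with --api-key) require it.
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return {
    url: `${baseUrl.replace(/\/$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}

// Browser-direct call: the fetch below is subject to the server's CORS
// headers and, on an https page, to mixed-content rules for http targets.
// const { url, init } = buildChatRequest("http://localhost:11434", "llama3.2", [
//   { role: "user", content: "Hello" },
// ]);
// const reply = await fetch(url, init).then((r) => r.json());
```

If the request fails here, it fails the same way it would in your own code, which is why CORS and mixed-content errors show up directly rather than being hidden behind a proxy.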
Saved Connections
Activate an existing profile or build a new one.
No saved profiles yet.
Create or select a profile to connect.
Model types
- Decoder
- The most common architecture for chat. Generates text left-to-right, one token at a time. Used by GPT, Llama, Qwen, Gemma, Phi, and SmolLM. Good for conversation, creative writing, and general instruction-following.
- Seq2Seq
- Encoder-decoder models that read the full input before generating output. Better for structured tasks like translation, summarisation, and Q&A. FLAN-T5 uses this architecture.
- Hybrid
- Combines convolution and attention layers for efficient on-device inference. LFM2.5 from Liquid AI uses this novel architecture, achieving strong performance at very small sizes.
- Vision
- Processes images as input and produces text descriptions. Florence 2 can caption images, read text via OCR, and detect objects.
- Multimodal
- Handles multiple input or output types. Janus Pro generates and understands images. Gemma 4 E2B accepts text, images, audio, and video.
- TTS (Text-to-Speech)
- Converts written text into natural-sounding audio. Kokoro produces speech across 54 voices and 8 languages.
- ASR (Automatic Speech Recognition)
- Converts spoken audio into text. Granite 4.0 is the top-ranked open model on the OpenASR leaderboard.
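The "left-to-right, one token at a time" behaviour of decoder models can be sketched as a loop: the model repeatedly predicts the next token from everything generated so far, until it emits an end-of-sequence token. This is a toy illustration, not any library's API; `nextToken` stands in for a real model forward pass:

```javascript
// Toy sketch of decoder-style (autoregressive) generation.
// nextToken is a stand-in for a real model's forward pass + sampling.
function generate(prompt, nextToken, maxTokens = 8, eos = "<eos>") {
  const tokens = [...prompt];
  for (let i = 0; i < maxTokens; i++) {
    const t = nextToken(tokens); // the model sees the full left context
    if (t === eos) break;        // stop at the end-of-sequence token
    tokens.push(t);              // generated token becomes part of the context
  }
  return tokens;
}
```

Seq2seq models differ only in that a separate encoder reads the whole input first; the decoding loop itself looks the same.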
How does in-browser AI work?
Everything on this page runs locally in your browser. These panels explain the technology behind it.