Quickstart

By the end of this page you will have a streaming, OpenAI-compatible chat completion running entirely on your laptop.

1. Install

npm install -g @infrarix/locopilot
locopilot init

See Installation if init reports a missing dependency.

2. Pull a model

locopilot models pull llama3:8b

You can use any Ollama-compatible model: llama3.1, mistral, qwen2.5, phi3, gemma2, codellama, etc.

3. Start the server

locopilot start
# 🐌 LocoPilot API running on port 8080

Leave this running. In a second terminal:

curl http://localhost:8080/v1/locopilot/health
# { "status": "ok", "ollama": "ok", "sqlite": "ok" }

4. Send a chat completion

curl

curl -N http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "llama3:8b",
    "messages": [{ "role": "user", "content": "Hello!" }],
    "stream": true
  }'

Node.js (OpenAI SDK)

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:8080/v1',
  apiKey: 'not-needed', // free tier accepts any value
});

const stream = await client.chat.completions.create({
  model: 'llama3:8b',
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-needed",
)

stream = client.chat.completions.create(
    model="llama3:8b",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
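
Both SDK examples above stream tokens as they are generated. If you want the whole reply in a single response instead, omit the stream flag and read the message content directly; this is standard OpenAI SDK usage and should behave the same way against LocoPilot's OpenAI-compatible endpoint:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# Without stream=True, the call returns one completed response object.
response = client.chat.completions.create(
    model="llama3:8b",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)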

5. (Optional) Enable Pro

If you want remote GPU fallback for models you haven't pulled, cloud training with live log streaming, or Cloudflare tunnels, link your machine to LocoPilot Cloud:

locopilot login --key qs_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Tier detection is local — Pro features unlock as soon as a qs_… token is present in ~/.locopilot/config.json. See locopilot login.
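
For example, a script could check which tier is active by looking for that token in the config file. A rough sketch; the "api_key" field name below is an assumption made for illustration, since only the file path and the qs_ prefix are documented here:

import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".locopilot" / "config.json"

def is_pro() -> bool:
    """Pro is active when a qs_… token is stored in the local config.

    NOTE: "api_key" is an assumed field name; only the file path and the
    qs_ prefix come from this page.
    """
    try:
        config = json.loads(CONFIG_PATH.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return False
    return str(config.get("api_key", "")).startswith("qs_")

print("Pro features unlocked" if is_pro() else "Free tier")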

What's next