How to set up a local & offline GitHub Copilot alternative
This guide sets up a local coding assistant inside your editor. If you instead want to run a local model and call it from your own app, see Build a Local LLM App, and Cloud vs Local Models for when local makes sense.
Local models are viable now
As of mid-2026, local models have closed much of the gap with frontier APIs for day-to-day coding tasks. Vicki Boykis documents this well in Running local models is good now (2026-06-15): on an M2 Mac with 64 GB RAM she runs agentic coding workflows (refactoring notebooks, generating unit tests, bootstrapping repos) at roughly 75% of the accuracy and speed of frontier models, without a cloud API call. The models she found most capable: Gemma 4 26B A4B, Gemma 4 12B QAT, Qwen 3 MOE, and Qwen 2.5 Coder.
The remaining limitations are real: inference is slower than a remote API, context windows are capped by your RAM, and results still warrant a second opinion on tricky problems. But for privacy-sensitive work or offline development, the tooling is now good enough to use daily.
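To get a feel for the RAM limit mentioned above, the weights of a quantized model can be estimated as parameters times bits per weight. The following is a rough back-of-the-envelope sketch, not a measurement: the function names and the 8 GB headroom default are assumptions made for illustration, and the estimate ignores the KV cache, which grows with context length.

```python
def estimate_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


def fits_in_ram(params_billion: float, bits_per_weight: float,
                ram_gb: float, headroom_gb: float = 8.0) -> bool:
    """True if the weights fit while leaving headroom for the OS, editor and KV cache.

    headroom_gb is an assumed safety margin, not a measured value.
    """
    return estimate_weights_gb(params_billion, bits_per_weight) + headroom_gb <= ram_gb
```

Under these assumptions, a 26B-parameter model at about 4 bits per weight needs on the order of 13 GB for weights alone, while the same model at 16 bits would not fit comfortably in 32 GB.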
Ollama + Continue Dev
- Install Ollama
- Install the continue.dev plugin for JetBrains or VS Code
- Optionally, use Open WebUI via Docker as a chat interface
Model Setup
- Quick Tab Completion
ollama pull qwen2.5-coder:1.5b
- Indexing and Codebase Search
ollama pull nomic-embed-text
- General Purpose Reasoning Model
ollama pull phi4
- https://ollama.com/library/phi4
- MIT License
- Alternatively use Gemma 4 (strong mid-2026 recommendation, especially on Apple Silicon with 32 GB+ RAM):
- https://ollama.com/library/gemma4
- gemma4:26b for the 26B variant, gemma4:12b for the lighter 12B variant
ollama pull gemma4:26b
# or the lighter variant
ollama pull gemma4:12b
- Or Qwen 3 MOE (mixture-of-experts, punches above its weight on coding):
ollama pull qwen3:30b-a3b
- Reranking Model
ollama pull linux6200/bge-reranker-v2-m3
- Update the continue.dev config.json (see Suggested continue.dev config below)
- Run the Ollama API locally
ollama serve
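Once the server is running, it can help to confirm that the pulled models are actually available before pointing the editor at them. The sketch below assumes Ollama is listening on its default address, http://localhost:11434, and uses its /api/tags endpoint, which lists local models. The helper names and the REQUIRED_MODELS list are illustrative, not part of any tool.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed: Ollama's default listen address

# Illustrative list; adjust to whichever models you pulled above.
REQUIRED_MODELS = ["qwen2.5-coder:1.5b", "nomic-embed-text", "phi4"]


def normalize(name: str) -> str:
    """Ollama lists untagged models as 'name:latest'; mirror that here."""
    last_segment = name.rsplit("/", 1)[-1]
    return name if ":" in last_segment else f"{name}:latest"


def missing_models(tags_response: dict, required: list[str]) -> list[str]:
    """Return the required models that do not appear in an /api/tags response."""
    installed = {m["name"] for m in tags_response.get("models", [])}
    return [r for r in required if normalize(r) not in installed]


def fetch_tags(base_url: str = OLLAMA_URL) -> dict:
    """GET /api/tags from a running Ollama server and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With `ollama serve` running, `missing_models(fetch_tags(), REQUIRED_MODELS)` returns the names that still need an `ollama pull`.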
Open WebUI

Nvidia GPU
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
Other
docker run -d -p 3456:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Docker Compose
The Compose file below connects Open WebUI to an Ollama instance running natively on Windows rather than in Docker.
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:cuda
    container_name: open-webui
    volumes:
      - ./data:/app/backend/data
    ports:
      - 3456:8080
    environment:
      # Reach Ollama on the Docker host; 0.0.0.0 would point back at the container itself
      - 'OLLAMA_BASE_URL=http://host.docker.internal:11434'
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: always
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [ gpu ]
volumes:
  open-webui: { }
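A common stumbling block with this setup is the Ollama base URL: from inside a container on Docker's default bridge network, addresses such as localhost, 127.0.0.1 or 0.0.0.0 refer to the container, not to the machine running Ollama. The following sketch flags such values. The function name is made up for illustration, and the check assumes bridge networking (with host networking, localhost does reach the host).

```python
import ipaddress
from urllib.parse import urlparse


def points_at_container_itself(base_url: str) -> bool:
    """True if base_url would resolve to the container rather than the Docker host.

    Assumes default bridge networking.
    """
    host = urlparse(base_url).hostname or ""
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name such as host.docker.internal
    return addr.is_loopback or addr.is_unspecified
```

A value like http://host.docker.internal:11434, combined with the extra_hosts entry, passes this check because it names the host rather than the container.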
Usage
Use directly in your editor

or via the chat-sidebar tab

Suggested continue.dev config
- Unix: ~/.continue/config.json
- Windows: %USERPROFILE%\.continue\config.json
{
"models": [
{
"title": "Gemma 4 26B",
"provider": "ollama",
"model": "gemma4:26b",
"systemMessage": "You are a helpful assistant supporting a software developer. Your tasks may involve explaining technical concepts, assisting with code, offering best practices, and solving programming-related issues across various languages and frameworks. Always provide clear, concise, and accurate answers. Always respond in English."
},
{
"title": "Qwen 3 MOE",
"provider": "ollama",
"model": "qwen3:30b-a3b",
"systemMessage": "You are a helpful assistant supporting a software developer. Your tasks may involve explaining technical concepts, assisting with code, offering best practices, and solving programming-related issues across various languages and frameworks. Always provide clear, concise, and accurate answers. Always respond in English."
},
{
"title": "Phi-4",
"provider": "ollama",
"model": "phi4",
"systemMessage": "You are a helpful assistant supporting a software developer. Your tasks may involve explaining technical concepts, assisting with code, offering best practices, and solving programming-related issues across various languages and frameworks. Always provide clear, concise, and accurate answers. Always respond in English."
}
],
"tabAutocompleteModel": {
"title": "Qwen2.5-Coder",
"provider": "ollama",
"model": "qwen2.5-coder:1.5b"
},
"embeddingsProvider": {
"title": "Nomic Embed Text",
"provider": "ollama",
"model": "nomic-embed-text"
},
"customCommands": [
{
"name": "test",
"prompt": "{{{ input }}}\n\nWrite a comprehensive set of unit tests for the provided code. Ensure to include setup, execution of correctness checks with important edge cases, and teardown. Present the tests as plain text output.",
"description": "Generate unit tests for the highlighted code."
},
{
"name": "refactor",
"prompt": "{{{ input }}}\n\nRefactor the provided code to improve its structure and readability without altering its functionality. Include a detailed explanation of your changes and reasoning.",
"description": "Improve the code's structure for better readability."
},
{
"name": "optimize",
"prompt": "{{{ input }}}\n\nOptimize the provided code for performance while maintaining its current behavior. Describe any trade-offs involved in your optimization process.",
"description": "Enhance code performance with a detailed explanation of changes and trade-offs."
},
{
"name": "explain",
"prompt": "{{{ input }}}\n\nExplain the logic and functionality of the provided code. Discuss any potential inefficiencies or unnecessary computations that could be improved for better performance.",
"description": "Analyze and explain the code's functionality and potential improvements."
},
{
"name": "document",
"prompt": "{{{ input }}}\n\nWrite language-specific documentation for the provided function. Use appropriate formats like Javadoc for Java or JSDoc for JavaScript. Ensure clarity and conciseness in your explanation.",
"description": "Create clear and concise function documentation using the correct language format."
}
],
"contextProviders": [
{
"name": "diff",
"params": {}
},
{
"name": "folder",
"params": {}
},
{
"name": "codebase",
"params": {}
},
{
"name": "file",
"params": {}
},
{
"name": "code",
"params": {}
},
{
"name": "currentFile",
"params": {}
},
{
"name": "terminal",
"params": {}
},
{
"name": "open",
"params": {}
},
{
"name": "web",
"params": {}
},
{
"name": "url",
"params": {}
},
{
"name": "repo-map",
"params": {}
},
{
"name": "os",
"params": {}
},
{
"name": "docs",
"params": {}
}
],
"slashCommands": [
{
"name": "share",
"description": "Export the current chat session to markdown"
},
{
"name": "commit",
"description": "Generate a git commit message"
}
],
"docs": [
{
"startUrl": "https://www.aem.live/docs",
"title": "aem.live",
"faviconUrl": "https://www.aem.live/favicon.ico"
},
{
"startUrl": "https://experienceleague.adobe.com/de/docs/experience-manager-cloud-service",
"title": "AEMaaCS",
"faviconUrl": "https://experienceleague.adobe.com/favicon.ico"
},
{
"startUrl": "https://lucanerlich.com",
"title": "lucanerlich",
"faviconUrl": ""
},
{
"startUrl": "https://react.dev/",
"title": "react",
"faviconUrl": ""
},
{
"startUrl": "https://www.typescriptlang.org/",
"title": "typescript",
"faviconUrl": ""
},
{
"startUrl": "https://react-spectrum.adobe.com/index.html",
"title": "react spectrum",
"faviconUrl": ""
}
],
"experimental": {
"useChromiumForDocsCrawling": true
}
}
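Since the config refers to models by name, a quick consistency check can save some debugging: every "model" entry with the "ollama" provider has to exist locally. The sketch below is one possible way to collect those names from a config shaped like the one above. The helper names are hypothetical, and load_config uses Python's strict JSON parser, which rejects trailing commas even where an editor might tolerate them.

```python
import json


def referenced_ollama_models(config: dict) -> list[str]:
    """Collect every Ollama model name the config refers to, in order, without duplicates."""
    entries = list(config.get("models", []))
    for key in ("tabAutocompleteModel", "embeddingsProvider"):
        if isinstance(config.get(key), dict):
            entries.append(config[key])
    names: list[str] = []
    for entry in entries:
        model = entry.get("model")
        if entry.get("provider") == "ollama" and model and model not in names:
            names.append(model)
    return names


def load_config(path: str) -> dict:
    """Parse the file as strict JSON; a trailing comma raises json.JSONDecodeError."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Each returned name can then be passed to `ollama pull` if it is not installed yet.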