How to set up a local & offline GitHub Copilot alternative

This guide sets up a local coding assistant inside your editor. If you instead want to run a local model and call it from your own app, see Build a Local LLM App, and Cloud vs Local Models for when local makes sense.

Local models are viable now

As of mid-2026, local models have closed much of the gap with frontier APIs for day-to-day coding tasks. Vicki Boykis documents this well in Running local models is good now (2026-06-15): on an M2 Mac with 64 GB RAM she runs agentic coding workflows (refactoring notebooks, generating unit tests, bootstrapping repos) at roughly 75% of the accuracy and speed of frontier models, without a cloud API call. The models she found most capable: Gemma 4 26B A4B, Gemma 4 12B QAT, Qwen 3 MOE, and Qwen 2.5 Coder.

The remaining limitations are real: inference is slower than a remote API, context windows are capped by your RAM, and results still warrant a second opinion on tricky problems. But for privacy-sensitive work or offline development, the tooling is now good enough to use daily.
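
The RAM ceiling mentioned above can be made concrete with back-of-the-envelope arithmetic. The sketch below assumes that weight memory is roughly parameter count times bits per weight, and it ignores the KV cache, activations, and runtime overhead, so real usage will be higher. The function name and the 4-bit and 16-bit widths are illustrative assumptions, not figures from the article cited above.

```python
def estimate_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Parameter counts roughly matching the model sizes named in this guide.
    for params in (1.5, 12, 26):
        for bits in (4, 16):
            gb = estimate_weight_gb(params, bits)
            print(f"{params}B @ {bits}-bit: ~{gb:.1f} GB")
```

By this estimate a 26B model at 4 bits per weight needs about 13 GB for weights alone, before any context is loaded.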

Ollama + Continue Dev

Install Ollama
- https://ollama.com/download
Install the continue.dev plugin for JetBrains or VS Code
- https://plugins.jetbrains.com/plugin/22707-continue
- https://marketplace.visualstudio.com/items?itemName=Continue.continue
Optionally, run Open WebUI via Docker as a chat interface

Model Setup

Quick Tab Completion
```
ollama pull qwen2.5-coder:1.5b
```
- https://ollama.com/library/qwen2.5-coder
- Open Source
Indexing and Codebase Search
```
ollama pull nomic-embed-text
```
General Purpose Reasoning Model
```
ollama pull phi4
```
- https://ollama.com/library/phi4
- MIT License
- Alternatively use Gemma 4 (strong mid-2026 recommendation, especially on Apple Silicon with 32 GB+ RAM):
  - https://ollama.com/library/gemma4
  - gemma4:26b for the 26B variant, gemma4:12b for the lighter 12B variant
```
ollama pull gemma4:26b
# or the lighter variant
ollama pull gemma4:12b
```
- Or Qwen 3 MOE, a mixture-of-experts model that punches above its weight on coding:
  - https://ollama.com/library/qwen3
```
ollama pull qwen3:30b-a3b
```
Reranking Model
```
ollama pull linux6200/bge-reranker-v2-m3
```
- https://docs.continue.dev/customize/model-roles/reranking
- https://ollama.com/linux6200/bge-reranker-v2-m3
Update the continue.dev config.json (see Suggested continue.dev config below)
Run the Ollama API locally
```
ollama serve
```
- nomic-embed-text (the embedding model pulled above): https://ollama.com/library/nomic-embed-text
- Apache License
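
Once `ollama serve` is running, it can help to confirm that every model from the steps above was actually pulled. The following is a minimal sketch using only the Python standard library. It assumes Ollama listens on its default address `http://localhost:11434` and relies on the `/api/tags` endpoint, which lists locally installed models. The helper names and the `REQUIRED` list are illustrative; adjust the list if you pulled Gemma 4 or Qwen 3 instead of Phi-4.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default; adjust if yours differs

# Models pulled in the steps above (edit to match your own choices).
REQUIRED = [
    "qwen2.5-coder:1.5b",
    "nomic-embed-text",
    "phi4",
    "linux6200/bge-reranker-v2-m3",
]


def normalize(name: str) -> str:
    """Ollama lists untagged models with an explicit ':latest' tag."""
    return name if ":" in name else f"{name}:latest"


def missing_models(tags_response: dict, required: list[str]) -> list[str]:
    """Return the required models that are absent from an /api/tags response."""
    installed = {m["name"] for m in tags_response.get("models", [])}
    return [r for r in required if normalize(r) not in installed]


def fetch_tags(base_url: str = OLLAMA_URL) -> dict:
    """Fetch the list of locally installed models from the Ollama API."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        missing = missing_models(fetch_tags(), REQUIRED)
    except (urllib.error.URLError, OSError) as exc:
        print(f"Could not reach Ollama at {OLLAMA_URL}: {exc}")
    else:
        print("All models present." if not missing else f"Missing: {missing}")
```

Any model reported as missing can be fetched with the matching `ollama pull` command.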

Open Web UI

Nvidia GPU

```
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
```

Other

```
docker run -d -p 3456:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```

Docker Compose

The Compose file below connects Open WebUI to an Ollama instance running natively on Windows rather than in Docker.

services:
    open-webui:
        image: ghcr.io/open-webui/open-webui:cuda
        container_name: open-webui
        volumes:
            - ./data:/app/backend/data
        ports:
            - 3456:8080
        environment:
            - 'OLLAMA_BASE_URL=http://host.docker.internal:11434'
        extra_hosts:
            - host.docker.internal:host-gateway
        restart: always
        deploy:
            resources:
                reservations:
                    devices:
                        -   driver: nvidia
                            count: all
                            capabilities: [ gpu ]

volumes:
    open-webui: { }

Usage

Use directly in your editor

or via the chat-sidebar tab

Suggested continue.dev config

Unix: ~/.continue/config.json
Windows: %USERPROFILE%\.continue\config.json

~/.continue/config.json
{
    "models": [
        {
            "title": "Gemma 4 26B",
            "provider": "ollama",
            "model": "gemma4:26b",
            "systemMessage": "You are a helpful assistant supporting a software developer. Your tasks may involve explaining technical concepts, assisting with code, offering best practices, and solving programming-related issues across various languages and frameworks. Always provide clear, concise, and accurate answers. Always respond in English."
        },
        {
            "title": "Qwen 3 MOE",
            "provider": "ollama",
            "model": "qwen3:30b-a3b",
            "systemMessage": "You are a helpful assistant supporting a software developer. Your tasks may involve explaining technical concepts, assisting with code, offering best practices, and solving programming-related issues across various languages and frameworks. Always provide clear, concise, and accurate answers. Always respond in English."
        },
        {
            "title": "Phi-4",
            "provider": "ollama",
            "model": "phi4",
            "systemMessage": "You are a helpful assistant supporting a software developer. Your tasks may involve explaining technical concepts, assisting with code, offering best practices, and solving programming-related issues across various languages and frameworks. Always provide clear, concise, and accurate answers. Always respond in English."
        }
    ],
    "tabAutocompleteModel": {
        "title": "Qwen2.5-Coder",
        "provider": "ollama",
        "model": "qwen2.5-coder:1.5b"
    },
    "embeddingsProvider": {
        "title": "Nomic Embed Text",
        "provider": "ollama",
        "model": "nomic-embed-text"
    },
    "customCommands": [
        {
            "name": "test",
            "prompt": "{{{ input }}}\n\nWrite a comprehensive set of unit tests for the provided code. Ensure to include setup, execution of correctness checks with important edge cases, and teardown. Present the tests as plain text output.",
            "description": "Generate unit tests for the highlighted code."
        },
        {
            "name": "refactor",
            "prompt": "{{{ input }}}\n\nRefactor the provided code to improve its structure and readability without altering its functionality. Include a detailed explanation of your changes and reasoning.",
            "description": "Improve the code's structure for better readability."
        },
        {
            "name": "optimize",
            "prompt": "{{{ input }}}\n\nOptimize the provided code for performance while maintaining its current behavior. Describe any trade-offs involved in your optimization process.",
            "description": "Enhance code performance with a detailed explanation of changes and trade-offs."
        },
        {
            "name": "explain",
            "prompt": "{{{ input }}}\n\nExplain the logic and functionality of the provided code. Discuss any potential inefficiencies or unnecessary computations that could be improved for better performance.",
            "description": "Analyze and explain the code's functionality and potential improvements."
        },
        {
            "name": "document",
            "prompt": "{{{ input }}}\n\nWrite language-specific documentation for the provided function. Use appropriate formats like Javadoc for Java or JSDoc for JavaScript. Ensure clarity and conciseness in your explanation.",
            "description": "Create clear and concise function documentation using the correct language format."
        }
    ],
    "contextProviders": [
        {
            "name": "diff",
            "params": {}
        },
        {
            "name": "folder",
            "params": {}
        },
        {
            "name": "codebase",
            "params": {}
        },
        {
            "name": "file",
            "params": {}
        },
        {
            "name": "code",
            "params": {}
        },
        {
            "name": "currentFile",
            "params": {}
        },
        {
            "name": "terminal",
            "params": {}
        },
        {
            "name": "open",
            "params": {}
        },
        {
            "name": "web",
            "params": {}
        },
        {
            "name": "url",
            "params": {}
        },
        {
            "name": "repo-map",
            "params": {}
        },
        {
            "name": "os",
            "params": {}
        },
        {
            "name": "docs",
            "params": {}
        }
    ],
    "slashCommands": [
        {
            "name": "share",
            "description": "Export the current chat session to markdown"
        },
        {
            "name": "commit",
            "description": "Generate a git commit message"
        }
    ],
    "docs": [
        {
            "startUrl": "https://www.aem.live/docs",
            "title": "aem.live",
            "faviconUrl": "https://www.aem.live/favicon.ico"
        },
        {
            "startUrl": "https://experienceleague.adobe.com/de/docs/experience-manager-cloud-service",
            "title": "AEMaaCS",
            "faviconUrl": "https://experienceleague.adobe.com/favicon.ico"
        },
        {
            "startUrl": "https://lucanerlich.com",
            "title": "lucanerlich",
            "faviconUrl": ""
        },
        {
            "startUrl": "https://react.dev/",
            "title": "react",
            "faviconUrl": ""
        },
        {
            "startUrl": "https://www.typescriptlang.org/",
            "title": "typescript",
            "faviconUrl": ""
        },
        {
            "startUrl": "https://react-spectrum.adobe.com/index.html",
            "title": "react spectrum",
            "faviconUrl": ""
        }
    ],
    "experimental": {
        "useChromiumForDocsCrawling": true
    }
}
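
A config like the one above only works if every model it names has been pulled. The sketch below lists the Ollama model names the config references, so they can be compared against the output of `ollama list`. It assumes the config lives at `~/.continue/config.json` and is strict JSON. Python's `json` module rejects trailing commas and comments, so a parse error here may point at a stray comma. The function name is illustrative.

```python
import json
from pathlib import Path

# Resolves to %USERPROFILE%\.continue\config.json on Windows.
CONFIG_PATH = Path.home() / ".continue" / "config.json"


def referenced_models(config: dict) -> list[str]:
    """Collect the model names of every Ollama-backed entry in the config."""
    entries = list(config.get("models", []))
    for key in ("tabAutocompleteModel", "embeddingsProvider"):
        if isinstance(config.get(key), dict):
            entries.append(config[key])
    return [
        entry["model"]
        for entry in entries
        if entry.get("provider") == "ollama" and "model" in entry
    ]


if __name__ == "__main__":
    try:
        config = json.loads(CONFIG_PATH.read_text(encoding="utf-8"))
    except OSError as exc:
        print(f"Could not read {CONFIG_PATH}: {exc}")
    except json.JSONDecodeError as exc:
        # Strict JSON forbids trailing commas and comments.
        print(f"{CONFIG_PATH} is not valid JSON: {exc}")
    else:
        for name in referenced_models(config):
            print(name)
```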
