Local LLM Workflows with Docker AI and N8N

Docker AI makes on-device LLM workflows insanely simple, with a near-perfect developer experience

I walked into the WeAreDevelopers Congress last week expecting the usual mix of keynotes and hallway chats. Instead, the Docker team dropped a bombshell: Docker AI, a bundle of new features that brings large language model (LLM) runtimes and an MCP gateway solution directly into Docker Desktop. The feature overview page is live at docker.com/solutions/docker-ai, and many capabilities are already shipping in the current Desktop release.

What excites me most is that Docker is treating LLM orchestration as a first-class developer experience issue rather than a bolt-on. Developer experience has always been one of my top concerns; poorly designed microservice systems routinely ignore it, but that's something I'll save for another post.

Local models can now be declared in docker-compose.yml, started with a single command, and wired to workflow tools such as n8n without resorting to third-party hacks. MCP servers can be added freely from a catalog. Because everything sits on the same Docker network, a container can talk to the gateway and the models as easily as it talks to Redis. That polish raises the bar for every other AI automation solution I touch, and it cements the rise of MCP and agentic AI workflows.

For anyone interested in the technical details, below is a working Docker Compose snippet that spins up a basic n8n instance and an MCP gateway backed by the DuckDuckGo server, which lets agents search the web or fetch content from specified websites. The example uses the qwen3:0.6B-Q4_K_M model (0.6B parameters, 4-bit quantization), which should run fine on any local machine at the cost of some response quality. The whole stack starts with a single docker compose up.

name: Local-Agents
services:
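  # n8n workflow engine, reachable from the host at localhost:5678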
  n8n:
    image: docker.n8n.io/n8nio/n8n
    restart: always
    ports:
      - "127.0.0.1:5678:5678"
    environment:
      - N8N_RUNNERS_ENABLED=true
      - N8N_PORT=5678
      - NODE_ENV=production
    volumes:
      - n8n_data:/home/node/.n8n
      - ./local-files:/files
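    # Attach the qwen3 model defined below; Compose injects its endpoint URL
    # and model name into the environment variables named here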
    models:
      qwen3:
        endpoint_var: MODEL_RUNNER_URL
        model_var: MODEL_RUNNER_MODEL

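  # MCP gateway exposing the DuckDuckGo search and fetch_content tools over SSE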
  mcp-gateway:
    image: docker/mcp-gateway:latest
    use_api_socket: true
    command:
      - --transport=sse
      - --servers=duckduckgo
      - --tools=search,fetch_content

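# Local model definition, pulled and served by Docker Model Runner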
models:
  qwen3:
    model: ai/qwen3:0.6B-Q4_K_M
    context_size: 8192
    runtime_flags:
      - --no-prefill-assistant

volumes:
  n8n_data:

The MCP server tools are exposed at http://mcp-gateway:8811/sse. The model runner answers OpenAI-compatible API requests at http://model-runner.docker.internal/engines/v1 inside the Docker network.
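
As a minimal sketch of what that looks like from another container on the same network, the snippet below points the openai Python package at the internal endpoint. The hard-coded URL and model name are taken from the compose file above (inside the n8n container the same values arrive via MODEL_RUNNER_URL and MODEL_RUNNER_MODEL); everything else is just an illustrative assumption.

from openai import OpenAI

# No real API key is needed for the local runner; the SDK just wants a non-empty string.
client = OpenAI(
    base_url="http://model-runner.docker.internal/engines/v1",
    api_key="not-needed",
)

# Standard OpenAI chat completions call, served by the local qwen3 model.
response = client.chat.completions.create(
    model="ai/qwen3:0.6B-Q4_K_M",
    messages=[{"role": "user", "content": "Explain in one sentence what an MCP gateway does."}],
)
print(response.choices[0].message.content)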

When you need local testing from the host machine, enable a TCP bridge with docker desktop enable model-runner --tcp=12434. After that, you can hook up any OpenAI-compatible client or framework at http://localhost:12434/engines/v1, or fire raw POST requests against it (a small example follows below). Some MCP servers need extra secrets, such as API tokens, to function. You can inject them like this: docker mcp secret set 'linkedin-mcp-server.linkedin_cookie=AQEDAS....'
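
Here is that raw POST as a quick sketch, assuming the TCP bridge is enabled on port 12434 and the requests package is installed; the payload follows the standard OpenAI chat completions schema.

import requests

# Raw POST against the host-side TCP bridge of the model runner.
resp = requests.post(
    "http://localhost:12434/engines/v1/chat/completions",
    json={
        "model": "ai/qwen3:0.6B-Q4_K_M",
        "messages": [{"role": "user", "content": "Say hello from the host machine."}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])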

Find more use cases here: https://github.com/docker/compose-for-agents/tree/main/crew-ai

For anyone building agentic AI systems, or just curious about running powerful models on-device, this is the cleanest on-ramp I have seen so far.