# Model Guide
This reference covers common Ollama models and selection guidance.
## Popular Models

### Chat/General Models
| Model | Params | Best For | Notes |
|---|---|---|---|
| `qwen3:4b` | 4B | Fast tasks, quick answers | Thinking-enabled, very fast |
| `llama3.1:8b` | 8B | General chat, reasoning | Good all-rounder |
| `gemma3:12b` | 12.2B | Creative, design tasks | Google model, good quality |
| `phi4-reasoning:latest` | 14.7B | Complex reasoning | Thinking-enabled |
| `mistral-small3.1:latest` | 24B | Technical tasks | May need CPU offload |
| `deepseek-r1:8b` | 8.2B | Deep reasoning | Thinking-enabled, chain-of-thought |
### Coding Models

| Model | Params | Best For | Notes |
|---|---|---|---|
| `qwen2.5-coder:7b` | 7.6B | Code generation, review | Best local coding model |
| `codellama:7b` | 7B | Code completion | Meta's code model |
| `deepseek-coder:6.7b` | 6.7B | Code tasks | Good alternative |
### Embedding Models

| Model | Params | Dimensions | Notes |
|---|---|---|---|
| `bge-m3:latest` | 567M | 1024 | Multilingual, good quality |
| `nomic-embed-text` | 137M | 768 | Fast, English-focused |
| `mxbai-embed-large` | 335M | 1024 | High quality embeddings |
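The Dimensions column matters in practice: vectors from different embedding models (or different dimensions) are not comparable, so similarity should only be computed within one model. As a minimal sketch, assuming you already have two vectors from the same model, cosine similarity can be computed with the standard library alone:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors from different models/dimensions are not comparable")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Identical vectors score 1.0, orthogonal vectors 0.0; for real workloads a vector library is the better choice, but the math is the same.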
## Model Selection Guide

### By Task Type
- Quick questions: `qwen3:4b` (fastest)
- General chat: `llama3.1:8b`
- Coding: `qwen2.5-coder:7b`
- Complex reasoning: `phi4-reasoning` or `deepseek-r1:8b`
- Creative/design: `gemma3:12b`
- Embeddings: `bge-m3:latest`
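The task-to-model mapping above can be captured in a small helper, useful when routing requests programmatically. This is an illustrative sketch (the function and task names are assumptions, not part of any Ollama API); the model names come from the tables in this guide:

```python
# Recommended models per task type, taken from the selection guide above.
TASK_MODELS = {
    "quick": "qwen3:4b",
    "chat": "llama3.1:8b",
    "coding": "qwen2.5-coder:7b",
    "reasoning": "deepseek-r1:8b",
    "creative": "gemma3:12b",
    "embeddings": "bge-m3:latest",
}

def pick_model(task: str) -> str:
    """Return the recommended model for a task, defaulting to the all-rounder."""
    return TASK_MODELS.get(task, "llama3.1:8b")
```

Unknown task types fall back to `llama3.1:8b`, the general-purpose pick from the chat table.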
### By Speed vs Quality

```
Fastest ←──────────────────────────────→ Best Quality
qwen3:4b → llama3.1:8b → gemma3:12b → mistral-small3.1
```
## Tool Use Support

Models with good tool/function calling support:

- ✅ `qwen2.5-coder:7b` - Excellent
- ✅ `qwen3:4b` - Good
- ✅ `llama3.1:8b` - Basic
- ✅ `mistral` models - Good
- ⚠️ Others - May not support tools natively
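For the models above, tool calling goes through Ollama's `/api/chat` endpoint, which accepts an OpenAI-style `tools` array of function schemas. A minimal sketch of building such a request body; the `get_weather` tool is a made-up example for illustration:

```python
def build_tool_chat_payload(model: str, prompt: str) -> dict:
    """Build an Ollama /api/chat request body with one example tool attached."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "tools": [
            {
                "type": "function",
                "function": {
                    # Hypothetical tool, shown only to illustrate the schema shape.
                    "name": "get_weather",
                    "description": "Get current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }
```

If the model decides to call the tool, the response message carries a `tool_calls` list instead of plain content; models flagged ⚠️ above may simply answer in prose and ignore the schema.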
## OpenClaw Integration

To use Ollama models in OpenClaw sub-agents, use these model paths:

```
ollama/qwen3:4b
ollama/llama3.1:8b
ollama/qwen2.5-coder:7b
ollama/gemma3:12b
ollama/mistral-small3.1:latest
ollama/phi4-reasoning:latest
ollama/deepseek-r1:8b
```
### Auth Profile Required

OpenClaw requires an auth profile even for Ollama, although no real credentials are checked; the key is a placeholder. Add to `auth-profiles.json`:

```json
"ollama:default": {
  "type": "api_key",
  "provider": "ollama",
  "key": "ollama"
}
```
## Hardware Considerations

- 8GB VRAM: Can run models up to ~13B comfortably
- 16GB VRAM: Can run most models, including 24B+
- CPU offload: Ollama automatically offloads to CPU/RAM for larger models
- Larger models may be slower due to partial CPU inference
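The VRAM figures above follow a rough rule of thumb: a Q4-quantized model needs on the order of 0.6 bytes per parameter plus a fixed allowance for context and runtime overhead. A back-of-the-envelope sketch, where both constants are approximations rather than measurements:

```python
def estimated_vram_gb(params_billion: float,
                      bytes_per_param: float = 0.6,
                      overhead_gb: float = 1.5) -> float:
    """Rough VRAM estimate (GB) for a Q4-ish quantized model.

    Both constants are approximations; actual usage varies with
    quantization level, context length, and runtime.
    """
    return params_billion * bytes_per_param + overhead_gb
```

By this estimate a 13B model lands near 9 GB (tight on an 8GB card) and a 24B model near 16 GB, which matches the guidance above; anything over budget is where Ollama's automatic CPU offload kicks in.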
## Installing Models

```bash
# Pull a model
ollama pull llama3.1:8b

# Or via the skill script
python3 scripts/ollama.py pull llama3.1:8b

# List installed models
python3 scripts/ollama.py list
```