# Model Guide
This reference covers common Ollama models and how to choose between them.
## Popular Models
### Chat/General Models
| Model | Params | Best For | Notes |
|-------|--------|----------|-------|
| `qwen3:4b` | 4B | Fast tasks, quick answers | Thinking-enabled, very fast |
| `llama3.1:8b` | 8B | General chat, reasoning | Good all-rounder |
| `gemma3:12b` | 12.2B | Creative, design tasks | Google model, good quality |
| `phi4-reasoning:latest` | 14.7B | Complex reasoning | Thinking-enabled |
| `mistral-small3.1:latest` | 24B | Technical tasks | May need CPU offload |
| `deepseek-r1:8b` | 8.2B | Deep reasoning | Thinking-enabled, chain-of-thought |
### Coding Models
| Model | Params | Best For | Notes |
|-------|--------|----------|-------|
| `qwen2.5-coder:7b` | 7.6B | Code generation, review | Strong local coding model |
| `codellama:7b` | 7B | Code completion | Meta's code model |
| `deepseek-coder:6.7b` | 6.7B | Code tasks | Good alternative |
### Embedding Models
| Model | Params | Dimensions | Notes |
|-------|--------|------------|-------|
| `bge-m3:latest` | 567M | 1024 | Multilingual, good quality |
| `nomic-embed-text` | 137M | 768 | Fast, English-focused |
| `mxbai-embed-large` | 335M | 1024 | High quality embeddings |
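The embedding models above are called through Ollama's `/api/embed` endpoint, which accepts `input` as a single string or a list of strings. A minimal sketch of the request body (offline: this only builds the JSON payload; you would POST it to `http://localhost:11434/api/embed` on a running Ollama instance):

```python
def build_embed_request(texts, model: str = "bge-m3:latest") -> dict:
    """Build an Ollama /api/embed request body for one or more texts."""
    return {"model": model, "input": list(texts)}

req = build_embed_request(["hello world", "ollama embeddings"])
print(req["model"], len(req["input"]))  # bge-m3:latest 2
```

The response contains an `embeddings` list with one vector per input, whose length matches the Dimensions column above (1024 for `bge-m3`).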
## Model Selection Guide
### By Task Type
- **Quick questions**: `qwen3:4b` (fastest)
- **General chat**: `llama3.1:8b`
- **Coding**: `qwen2.5-coder:7b`
- **Complex reasoning**: `phi4-reasoning` or `deepseek-r1:8b`
- **Creative/design**: `gemma3:12b`
- **Embeddings**: `bge-m3:latest`
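The task-to-model mapping above can be sketched as a simple routing table. This is a hypothetical helper (the `TASK_MODELS` keys are informal labels, not an Ollama API), mirroring the recommendations in this guide:

```python
# Routing table based on the "By Task Type" recommendations above.
TASK_MODELS = {
    "quick": "qwen3:4b",
    "chat": "llama3.1:8b",
    "coding": "qwen2.5-coder:7b",
    "reasoning": "deepseek-r1:8b",
    "creative": "gemma3:12b",
    "embeddings": "bge-m3:latest",
}

def pick_model(task: str, default: str = "llama3.1:8b") -> str:
    """Return the suggested model for a task type, falling back to a general model."""
    return TASK_MODELS.get(task, default)

print(pick_model("coding"))   # qwen2.5-coder:7b
print(pick_model("unknown"))  # llama3.1:8b
```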
### By Speed vs Quality
```
Fastest ←──────────────────────────────→ Best Quality
qwen3:4b → llama3.1:8b → gemma3:12b → mistral-small3.1
```
### Tool Use Support
Models with good tool/function calling support:
- `qwen2.5-coder:7b` - Excellent
- `qwen3:4b` - Good
- `llama3.1:8b` - Basic
- `mistral` models - Good
- ⚠️ Others - May not support tools natively
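Tool calling goes through Ollama's `/api/chat` endpoint via its OpenAI-style `tools` field. A minimal sketch that builds such a request body (the `get_weather` tool is a made-up example; offline, this only constructs the payload, which you would POST to `http://localhost:11434/api/chat`):

```python
import json

def build_tool_chat_request(model: str, prompt: str) -> dict:
    """Build an Ollama /api/chat request body with one example tool definition."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical example tool
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

req = build_tool_chat_request("qwen2.5-coder:7b", "What's the weather in Oslo?")
print(json.dumps(req, indent=2))
```

If the model supports tools, the response message carries a `tool_calls` list instead of (or alongside) plain text; models flagged ⚠️ above may ignore the `tools` field entirely.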
## OpenClaw Integration
To use Ollama models in OpenClaw sub-agents, use these model paths:
```
ollama/qwen3:4b
ollama/llama3.1:8b
ollama/qwen2.5-coder:7b
ollama/gemma3:12b
ollama/mistral-small3.1:latest
ollama/phi4-reasoning:latest
ollama/deepseek-r1:8b
```
### Auth Profile Required
OpenClaw requires an auth profile even for Ollama (no actual auth needed). Add to `auth-profiles.json`:
```json
"ollama:default": {
  "type": "api_key",
  "provider": "ollama",
  "key": "ollama"
}
```
## Hardware Considerations
- **8GB VRAM**: Can run quantized models up to ~13B comfortably
- **16GB VRAM**: Can run most models including 24B+
- **CPU offload**: Ollama automatically offloads to CPU/RAM for larger models
- **Larger models** may be slower due to partial CPU inference
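A rough back-of-envelope check for the VRAM figures above: quantized weights take `params × bits / 8` bytes, plus some flat overhead for KV cache and activations. This is a heuristic sketch only (the 1.5 GiB overhead is an assumption; real usage varies with context length and quantization scheme):

```python
def vram_estimate_gib(params_billion: float, bits_per_weight: int = 4,
                      overhead_gib: float = 1.5) -> float:
    """Rough VRAM estimate for a quantized model: weight bytes plus a
    flat allowance for KV cache and activations. Heuristic only."""
    weight_gib = params_billion * 1e9 * bits_per_weight / 8 / 2**30
    return round(weight_gib + overhead_gib, 1)

print(vram_estimate_gib(8))    # llama3.1:8b at q4 -> ~5.2 GiB
print(vram_estimate_gib(24))   # mistral-small3.1 at q4 -> ~12.7 GiB
```

By this estimate an 8B model at q4 fits easily in 8GB of VRAM, while a 24B model needs a 16GB card or CPU offload, matching the guidance above.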
## Installing Models
```bash
# Pull a model
ollama pull llama3.1:8b
# Or via the skill script
python3 scripts/ollama.py pull llama3.1:8b
# List installed models
python3 scripts/ollama.py list
```