Ollama Cheat Sheet

Essential Ollama commands for local AI model management

ollama run
Run a model interactively, downloading it first if it is not already present, then start chatting with it directly (see the one-shot example below).

ollama run llama2

Keywords: run, chat, interactive, model, download
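
You can also pass the prompt as an argument for a one-shot, non-interactive response, which is handy in scripts:

ollama run llama2 'Why is the sky blue?'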

ollama pull
Download a model without running it. Useful for pre-downloading models or updating to the latest version.

ollama pull codellama:13b

Keywords: pull, download, model, update

ollama list
List all locally installed models. Shows model names, sizes, and modification dates.

ollama list

Keywords: list, models, installed, local

ollama show
Display detailed information about a model, including its parameters, template, and system message (see the section flags below).

ollama show llama2

Keywords: show, info, details, parameters
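
To print a single section of that output, ollama show accepts flags such as --modelfile, --parameters, --template, and --system:

ollama show --modelfile llama2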

ollama rm
Remove a model from local storage. Frees up disk space by deleting the model files.

ollama rm llama2:7b

Keywords: remove, delete, model, cleanup

ollama cp
Copy a model to create a new model with a different name. Useful for creating model variants.

ollama cp llama2 my-llama2

Keywords: copy, duplicate, model, variant

ollama create
Create a new model from a Modelfile. Define custom models with specific parameters and system prompts (a minimal Modelfile is sketched below).

ollama create mymodel -f ./Modelfile

Keywords: create, custom, modelfile, build
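
A minimal Modelfile sketch (the base model, parameter value, and system prompt here are just example choices): start from an existing model with FROM, then set parameters and a system message:

# Modelfile
FROM llama2
PARAMETER temperature 0.7
SYSTEM """You are a concise assistant that answers in bullet points."""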

ollama push
Upload a model to the Ollama registry (ollama.com) to share your custom models with others. The model name must be prefixed with your registry username.

ollama push myusername/mymodel

Keywords: push, upload, share, registry

ollama serve
Start the Ollama server daemon. Required for API access and for running models as a service; it listens on http://localhost:11434 by default.

ollama serve

Keywords: serve, server, daemon, api, service
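
To verify the server is up, request the root endpoint; it replies with a short status string:

curl http://localhost:11434
# prints: Ollama is running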

ollama ps
List currently running models and their resource usage. Shows memory consumption, the CPU/GPU split, and when each model will be unloaded.

ollama ps

Keywords: ps, running, status, memory, resources

ollama stop
Stop a running model to free up memory. The model will be loaded again on its next use.

ollama stop llama2

Keywords: stop, unload, memory, free

curl API
Make API requests to the Ollama server. Send prompts and receive responses programmatically via the REST API; by default the response is streamed (see the non-streaming example below).

curl -X POST http://localhost:11434/api/generate -d '{"model":"llama2","prompt":"Hello"}'

Keywords: api, curl, rest, generate, programmatic
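
By default /api/generate streams the response as newline-delimited JSON chunks; set "stream": false to receive a single JSON object instead. There is also an /api/chat endpoint that takes a messages array:

curl http://localhost:11434/api/generate -d '{"model":"llama2","prompt":"Hello","stream":false}'

curl http://localhost:11434/api/chat -d '{"model":"llama2","messages":[{"role":"user","content":"Hello"}],"stream":false}'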

ollama --version
Display Ollama version information. Useful for troubleshooting and compatibility checks.

ollama --version

Keywords: version, info, troubleshoot

ollama help
Show help information for Ollama commands. Get detailed usage instructions for any command.

ollama help run

Keywords: help, usage, instructions, documentation

ollama run --verbose
Run a model with verbose output. Prints timing statistics such as load duration, token counts, and tokens per second after each response.

ollama run --verbose llama2

Keywords: verbose, debug, detailed, tokens

ollama run --format json
Get structured JSON responses from the model. The output is constrained to valid JSON; for best results, also ask for JSON in the prompt.

ollama run llama2 --format json 'Explain JSON in one sentence'

Keywords: json, structured, format, programmatic
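
Because the output is constrained to valid JSON, it can be piped straight into a JSON tool (assuming jq is installed):

ollama run llama2 --format json 'List three primary colors as a JSON array' | jq .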

/set system
Set the system message for model behavior from inside an interactive session; ollama run has no --system flag. For a persistent system message, add a SYSTEM instruction to a Modelfile.

/set system You are a helpful coding assistant

Keywords: system, role, context, behavior

/set template
Use a custom prompt template from inside an interactive session; ollama run has no --template flag. Templates use Go template syntax, and a TEMPLATE instruction in a Modelfile makes the override permanent.

/set template """{{ .System }} {{ .Prompt }}"""

Keywords: template, prompt, custom, format

ollama run --keepalive
Control how long the model stays loaded in memory after a request (the default is 5 minutes), trading memory usage against response time (see the server-wide variant below).

ollama run --keepalive 5m llama2

Keywords: keepalive, memory, optimization, timeout
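
The same behavior can be set server-wide with the OLLAMA_KEEP_ALIVE environment variable on the serve process; use a duration like 30m, -1 to keep models loaded indefinitely, or 0 to unload them immediately:

OLLAMA_KEEP_ALIVE=30m ollama serve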

ollama run --nowordwrap
Disable word wrapping in output. Useful for code generation or structured output.

ollama run --nowordwrap codellama 'Generate a Python function'

Keywords: nowordwrap, code, structured, formatting

OLLAMA_HOST
Set a custom Ollama server host for the client. Connect to a remote Ollama instance or change the default port (11434); the same variable sets the bind address when starting the server (see below).

OLLAMA_HOST=192.168.1.100:11434 ollama list

Keywords: host, remote, server, port, environment
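
For example, to start the server listening on all interfaces rather than only localhost (take care on untrusted networks):

OLLAMA_HOST=0.0.0.0:11434 ollama serve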

OLLAMA_MODELS
Set a custom models directory. Changes where Ollama stores downloaded models; it must be set in the environment of the server process, since the server manages the model store.

OLLAMA_MODELS=/custom/path ollama serve

Keywords: models, directory, storage, path, custom

Multiline input (""")
Send multiple lines of input before the model processes them. There is no --multiline flag; instead, wrap the input in triple quotes inside an interactive session.

>>> """Summarize the following text:
... first line of the text
... second line of the text"""

Keywords: multiline, input, multiple, lines