Run AI Models Locally: Master the Ollama CLI


In our previous post we saw how to install Ollama on our Ubuntu server. Now it’s time to start playing with it. With Ollama we skip the cloud and run large language models directly on our machine.

This cheat sheet gives us every essential Ollama command — organized, explained, and ready to use.


What is Ollama?

Ollama is an open-source tool. It lets us download, run, and manage AI models on our own hardware. We use one command to pull a model and another to run it. No API key. No internet required after download. Ollama works on macOS, Linux, and Windows. Install it from ollama.com.


01 — Model Management

These commands control the models we store on our machine.
| Command | What It Does | Notes |
|---|---|---|
| ollama pull <model> | Downloads a model from the Ollama registry to our machine. | Run this first. Required before use. |
| ollama list | Shows all models we have downloaded locally. | Displays name, ID, size, and modified date. |
| ollama show <model> | Prints details about a model: parameters, license, template, and system prompt. | Use before running an unfamiliar model. |
| ollama rm <model> | Deletes a model from our local disk. | Frees up storage. Cannot be undone. |
| ollama cp <src> <dest> | Copies a model under a new name. Useful before customizing a model. | Original model stays intact. |

Examples:
# Pull the Llama 3 8B model
ollama pull llama3

# Pull a specific version tag
ollama pull llama3:70b

# See what we downloaded
ollama list

# Remove a model we no longer need
ollama rm mistral

02 — Running Models

These commands start a model and send it prompts.

| Command | What It Does | Notes |
|---|---|---|
| ollama run <model> | Starts an interactive chat session with a model in our terminal. | Type /bye to exit the session. |
| ollama run <model> "<prompt>" | Sends a single prompt, prints the response, and exits. No interactive session. | Great for scripts and automation. |
| echo "<text>" \| ollama run <model> | Pipes text directly into a model as a prompt via standard input. | Combine with shell scripts for batch tasks. |
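The single-shot and stdin forms combine well in scripts. As a minimal sketch, here is a batch job that summarizes every .txt file in the current directory. It assumes the llama3 model is already pulled; the summarize_prompt helper is our own convention, not an Ollama command:

```shell
#!/bin/sh
# Batch-summarize every .txt file in the current directory.
# Assumes `ollama pull llama3` has already been run.

summarize_prompt() {
  # Prepend a fixed instruction to the text passed as $1.
  printf 'Summarize the following text in one sentence:\n%s\n' "$1"
}

for f in *.txt; do
  [ -e "$f" ] || continue            # no .txt files: skip the loop
  echo "=== $f ==="
  summarize_prompt "$(cat "$f")" | ollama run llama3
done
```

Because the prompt is built separately from the model call, we can swap llama3 for any other local model without touching the rest of the script.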

Session Commands — Inside an active ollama run session, we have special commands available:

| Command | What It Does |
|---|---|
| /help | Lists all session commands. |
| /show info | Displays model info mid-session. |
| /set parameter <key> <value> | Adjusts parameters like temperature on the fly. |
| /bye | Exits the session cleanly. |


03 — Server & API

Ollama runs a local REST server. We use it to connect any application to our models.

| Command | What It Does | Notes |
|---|---|---|
| ollama serve | Starts the Ollama REST API server. Listens on localhost:11434 by default. | Runs automatically on install. Use only if we stopped it manually. |
| OLLAMA_HOST=0.0.0.0 ollama serve | Starts the server and exposes it on all network interfaces, accessible from other machines on our network. | Use with caution on public networks. |

Examples:
# Call the running Ollama server directly with curl
curl http://localhost:11434/api/generate \
  -d '{
    "model": "llama3",
    "prompt": "Explain Mulesoft in one sentence.",
    "stream": false
  }'

# OpenAI-compatible endpoint — works with any OpenAI SDK client
curl http://localhost:11434/v1/chat/completions \
  -d '{"model":"llama3","messages":[{"role":"user","content":"Hello"}]}'
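Before calling the API from a script, it helps to verify the server is actually reachable. A small sketch; the server_up helper and the OLLAMA_URL variable are our own conventions, not part of Ollama:

```shell
#!/bin/sh
# Query the local Ollama API only if the server is reachable.
# Assumes the default host and port (localhost:11434).

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

server_up() {
  # A plain GET on the base URL succeeds when the server is alive.
  curl -s --max-time 2 "$1" >/dev/null 2>&1
}

if server_up "$OLLAMA_URL"; then
  curl -s "$OLLAMA_URL/api/generate" \
    -d '{"model":"llama3","prompt":"Say hello","stream":false}'
else
  echo "Ollama server not reachable at $OLLAMA_URL" >&2
fi
```

Overriding OLLAMA_URL lets the same script target a remote Ollama host exposed with OLLAMA_HOST=0.0.0.0.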

04 — Custom Models with Modelfile

A Modelfile lets us define a custom model. We set a base model, a system prompt, and parameters.

| Command | What It Does | Notes |
|---|---|---|
| ollama create <name> -f <Modelfile> | Builds a new custom model from a Modelfile. | Key feature for personalized AI behavior. |
| ollama push <model> | Uploads a model we created to the Ollama registry. Requires a registered account. | Share with our team or the community. |

Examples:
# File: Modelfile

FROM llama3

# Set a persistent system prompt for all conversations
SYSTEM """
You are a senior enterprise architect.
You give concise, accurate technical answers only.
"""

# Adjust model behavior — lower = more focused responses
PARAMETER temperature 0.3
PARAMETER num_ctx 4096

# Build the custom model
ollama create my-architect -f ./Modelfile

# Run it immediately
ollama run my-architect

05 — Quick Reference

Every essential Ollama CLI command at a glance.

| Command | Purpose |
|---|---|
| ollama pull <model> | Download a model |
| ollama list | List local models |
| ollama show <model> | Inspect model details |
| ollama rm <model> | Delete a model |
| ollama cp <src> <dst> | Copy a model |
| ollama run <model> | Interactive chat |
| ollama run <model> "…" | Single-shot prompt |
| ollama serve | Start REST API server |
| ollama create <name> -f <file> | Build custom model |
| ollama push <model> | Upload to registry |
| ollama ps | Show running models |
| ollama stop <model> | Unload a running model |
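The last two commands deserve a note: ollama ps shows which models are currently loaded in memory, and ollama stop unloads one without deleting it from disk. A small sketch that stops a model only when it is actually loaded; the model_loaded helper is our own, not an Ollama command:

```shell
#!/bin/sh
# Unload llama3 from memory only if it is currently loaded.
# The downloaded weights stay on disk (use `ollama rm` to delete them).

model_loaded() {
  # `ollama ps` lists loaded models by name in its output.
  ollama ps 2>/dev/null | grep -q "$1"
}

if model_loaded llama3; then
  ollama stop llama3
fi
```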
