Run AI Models Locally: Master the Ollama CLI


In our previous post we saw how to install Ollama on our Ubuntu server. Now it’s time to start playing with it. With Ollama we skip the cloud and run large language models directly on our machine.

This cheat sheet gives us every essential Ollama command — organized, explained, and ready to use.


What is Ollama?

Ollama is an open-source tool. It lets us download, run, and manage AI models on our own hardware. We use one command to pull a model and another to run it. No API key. No internet required after download. Ollama works on macOS, Linux, and Windows. Install it from ollama.com.


01 — Model Management

These commands control the models we store on our machine.
| Command | What It Does | Notes |
|---|---|---|
| ollama pull <model> | Downloads a model from the Ollama registry to our machine. | Run this first. Required before use. |
| ollama list | Shows all models we have downloaded locally. | Displays name, ID, size, and modified date. |
| ollama show <model> | Prints details about a model: parameters, license, template, and system prompt. | Use before running an unfamiliar model. |
| ollama rm <model> | Deletes a model from our local disk. | Frees up storage. Cannot be undone. |
| ollama cp <src> <dest> | Copies a model under a new name. Useful before customizing a model. | Original model stays intact. |

Examples:
# Pull the Llama 3 8B model
ollama pull llama3

# Pull a specific version tag
ollama pull llama3:70b

# See what we downloaded
ollama list

# Remove a model we no longer need
ollama rm mistral

02 — Running Models

These commands start a model and send it prompts.

| Command | What It Does | Notes |
|---|---|---|
| ollama run <model> | Starts an interactive chat session with a model in our terminal. | Type /bye to exit the session. |
| ollama run <model> "<prompt>" | Sends a single prompt, prints the response, and exits. No interactive session. | Great for scripts and automation. |
| echo "<text>" \| ollama run <model> | Pipes text directly into a model as a prompt via standard input. | Combine with shell scripts for batch tasks. |
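The single-shot and stdin forms combine well in scripts. As a minimal sketch, here is a batch job that summarizes every .txt file in the current directory. It assumes the llama3 model is already pulled; the summarize_prompt helper is our own convention, not an Ollama command:

```shell
#!/bin/sh
# Batch-summarize every .txt file in the current directory.
# Assumes `ollama pull llama3` has already been run.

summarize_prompt() {
  # Prepend a fixed instruction to the text passed as $1.
  printf 'Summarize the following text in one sentence:\n%s\n' "$1"
}

for f in *.txt; do
  [ -e "$f" ] || continue            # no .txt files: skip the loop
  echo "=== $f ==="
  summarize_prompt "$(cat "$f")" | ollama run llama3
done
```

Because the prompt is built separately from the model call, we can swap llama3 for any other local model without touching the rest of the script.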

Session Commands — Inside an active ollama run session, we have special commands available:

| Command | What It Does |
|---|---|
| /help | Lists all session commands. |
| /show info | Displays model info mid-session. |
| /set parameter <key> <value> | Adjusts parameters like temperature on the fly. |
| /bye | Exits the session cleanly. |


03 — Server & API

Ollama runs a local REST server. We use it to connect any application to our models.

| Command | What It Does | Notes |
|---|---|---|
| ollama serve | Starts the Ollama REST API server. Listens on localhost:11434 by default. | Runs automatically on install. Use only if we stopped it manually. |
| OLLAMA_HOST=0.0.0.0 ollama serve | Starts the server and exposes it on all network interfaces, accessible from other machines on our network. | Use with caution on public networks. |

Examples:
# Call the running Ollama server directly with curl
curl http://localhost:11434/api/generate \
  -d '{
    "model": "llama3",
    "prompt": "Explain Mulesoft in one sentence.",
    "stream": false
  }'

# OpenAI-compatible endpoint — works with any OpenAI SDK client
curl http://localhost:11434/v1/chat/completions \
  -d '{"model":"llama3","messages":[{"role":"user","content":"Hello"}]}'
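Before calling the API from a script, it helps to verify the server is actually reachable. A small sketch; the server_up helper and the OLLAMA_URL variable are our own conventions, not part of Ollama:

```shell
#!/bin/sh
# Query the local Ollama API only if the server is reachable.
# Assumes the default host and port (localhost:11434).

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

server_up() {
  # A plain GET on the base URL succeeds when the server is alive.
  curl -s --max-time 2 "$1" >/dev/null 2>&1
}

if server_up "$OLLAMA_URL"; then
  curl -s "$OLLAMA_URL/api/generate" \
    -d '{"model":"llama3","prompt":"Say hello","stream":false}'
else
  echo "Ollama server not reachable at $OLLAMA_URL" >&2
fi
```

Overriding OLLAMA_URL lets the same script target a remote Ollama host exposed with OLLAMA_HOST=0.0.0.0.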

04 — Custom Models with Modelfile

A Modelfile lets us define a custom model. We set a base model, a system prompt, and parameters.

| Command | What It Does | Notes |
|---|---|---|
| ollama create <name> -f <Modelfile> | Builds a new custom model from a Modelfile. | Key feature for personalized AI behavior. |
| ollama push <model> | Uploads a model we created to the Ollama registry. Requires a registered account. | Share with our team or the community. |

Examples:
# File: Modelfile

FROM llama3

# Set a persistent system prompt for all conversations
SYSTEM """
You are a senior enterprise architect.
You give concise, accurate technical answers only.
"""

# Adjust model behavior — lower = more focused responses
PARAMETER temperature 0.3
PARAMETER num_ctx 4096

# Build the custom model
ollama create my-architect -f ./Modelfile

# Run it immediately
ollama run my-architect

05 — Quick Reference

Every essential Ollama CLI command at a glance.

| Command | Purpose |
|---|---|
| ollama pull <model> | Download a model |
| ollama list | List local models |
| ollama show <model> | Inspect model details |
| ollama rm <model> | Delete a model |
| ollama cp <src> <dst> | Copy a model |
| ollama run <model> | Interactive chat |
| ollama run <model> "…" | Single-shot prompt |
| ollama serve | Start REST API server |
| ollama create <name> -f <file> | Build custom model |
| ollama push <model> | Upload to registry |
| ollama ps | Show running models |
| ollama stop <model> | Unload a running model |
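The last two commands deserve a note: ollama ps shows which models are currently loaded in memory, and ollama stop unloads one without deleting it from disk. A small sketch that stops a model only when it is actually loaded; the model_loaded helper is our own, not an Ollama command:

```shell
#!/bin/sh
# Unload llama3 from memory only if it is currently loaded.
# The downloaded weights stay on disk (use `ollama rm` to delete them).

model_loaded() {
  # `ollama ps` lists loaded models by name in its output.
  ollama ps 2>/dev/null | grep -q "$1"
}

if model_loaded llama3; then
  ollama stop llama3
fi
```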
