16  Using Ollama in the shell (bash)

This brief tutorial demonstrates how to use Ollama from the shell to generate responses from a local - and therefore private - LLM.

You will learn how to:

  • generate a free text response from a local model
  • parse the response with jq
  • request structured output that conforms to a JSON schema
  • validate the structured response

16.1 Requirements

  • Ollama
  • curl
  • jq
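
Before continuing, you can verify that each tool is available on your PATH (a minimal sketch; command -v prints the location of a command if it is installed):

```shell
# Report any of the required tools that are not installed.
for tool in ollama curl jq; do
  command -v "$tool" > /dev/null 2>&1 || echo "missing: $tool"
done
```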

16.2 Generate a free text response

To generate a free text response from an LLM using Ollama, we can use the curl command to send a POST request to the Ollama API endpoint (the -s flag silences curl's progress meter):

resp=$(
  curl -s http://localhost:11434/api/generate -d '{
    "model": "qwen3:8b",
    "system": "You are a meticulous research assistant.",
    "prompt": "What is your name and who made you?",
    "stream": false,
    "options": {
      "temperature": 0.2
    }
  }'
)

16.3 Parse the response

We can use printf '%s' and a tool like jq to extract the “thinking” and “response” fields:

echo "--- Reasoning ---"
printf '%s' "$resp" | jq -r '.thinking'
echo "--- Response ---"
printf '%s' "$resp" | jq -r '.response'
--- Reasoning ---
Okay, the user is asking for my name and who made me. Let me start by confirming my name. I should mention that I'm Qwen, developed by Alibaba Cloud. Then, I need to explain the development team and the purpose of my creation. I should highlight the collaboration between Alibaba Cloud and the Tongyi Lab, and mention the goal of providing helpful and efficient services. Also, I should keep the response friendly and open-ended, inviting the user to ask more questions. Let me make sure the information is accurate and the tone is approachable.

--- Response ---
My name is Qwen, and I was developed by Alibaba Cloud. I am a large-scale language model created through the collaboration of Alibaba Cloud and the Tongyi Lab. My purpose is to provide users with helpful, efficient, and accurate services across various tasks, such as answering questions, creating content, and assisting with problem-solving. If you have any questions or need help, feel free to ask! 😊
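
Besides “thinking” and “response”, the non-streaming reply also carries run metadata, such as eval_count (tokens generated) and total_duration (nanoseconds). Field availability may vary across Ollama versions, so treat this as a hedged sketch:

```shell
# Extract generation statistics from the stored reply; missing fields print as null.
printf '%s' "$resp" | jq '{model, eval_count, total_duration}'
```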

16.4 Generate structured output

There are many scenarios, especially in research, where we want a structured response instead of free text. Many users try to achieve this with instructions added to the user prompt, and LLMs are increasingly good at following them. However, Ollama has native support for defining an output schema, including the required field names and their descriptions. This is much cleaner, easier, and more likely to succeed, without requiring laborious and extensive prompting.

We begin by defining the output schema. To do this, we create a JSON object. For example, we can define a schema with three fields: name, manufacturer, and knowledge_cutoff.

LLMinfo_schema='{
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "description": "Your name"
    },
    "manufacturer": {
      "type": "string",
      "description": "The name of the person, group, or company that built you."
    },
    "knowledge_cutoff": {
      "type": "string",
      "description": "Your knowledge cutoff date."
    }
  },
  "required": ["name", "manufacturer", "knowledge_cutoff"]
}'
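
Because a malformed schema would be rejected by the server, it is worth confirming locally that the string parses as JSON before sending it (a minimal sketch; jq -e exits non-zero on a parse error):

```shell
# Fail fast if the schema string is not valid JSON.
if printf '%s' "$LLMinfo_schema" | jq -e . > /dev/null; then
  echo "schema is valid JSON"
else
  echo "schema is NOT valid JSON" >&2
fi
```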

Let’s rerun the previous query, but this time we will include the output schema in the request.

When generating structured output, you might choose to adjust the prompt to explicitly ask for responses that conform to the schema. This is usually unnecessary: models, especially those designed for structured outputs, typically adhere to the schema without additional prompting. Note that here we purposely use the same prompt as before, even though the schema requests a third field, knowledge_cutoff, which the prompt never mentions.

resp_structured=$(
  curl -s http://localhost:11434/api/generate -d "{
    \"model\": \"qwen3:8b\",
    \"system\": \"You are a meticulous research assistant.\",
    \"prompt\": \"What is your name and who made you?\",
    \"stream\": false,
    \"options\": {
      \"temperature\": 0.2
    },
    \"format\": $LLMinfo_schema
  }"
)

Now the “response” field contains a JSON string that conforms to the defined schema. We can use printf and jq to pretty-print it:

printf '%s' "$resp_structured" | jq -r '.response' | jq .
{
  "name": "Qwen",
  "manufacturer": "Alibaba Group",
  "knowledge_cutoff": "2024年04月"
}

or:

printf '%s' "$resp_structured" | jq '.response | fromjson'
{
  "name": "Qwen",
  "manufacturer": "Alibaba Group",
  "knowledge_cutoff": "2024年04月"
}
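
Once parsed, the individual fields can be assigned to shell variables for downstream use (a minimal sketch; jq -r strips the surrounding quotes):

```shell
# Pull single fields out of the nested JSON string in "response".
name=$(printf '%s' "$resp_structured" | jq -r '.response | fromjson | .name')
manufacturer=$(printf '%s' "$resp_structured" | jq -r '.response | fromjson | .manufacturer')
echo "name=$name; manufacturer=$manufacturer"
```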

Ollama does not currently support returning both the reasoning and a structured response in a single call. This may change in future versions.
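
You can confirm this yourself: in a structured call the “thinking” field comes back empty. This sketch assumes the field is null or absent; the exact behavior may differ between Ollama versions:

```shell
# Prints "absent" when the structured reply carries no reasoning text.
printf '%s' "$resp_structured" | jq -r '.thinking // "absent"'
```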

16.4.1 Validate the response

You can optionally validate the response:

if printf '%s' "$resp_structured" | jq -r '.response' |
   jq -e 'if .name and .manufacturer and .knowledge_cutoff then . else error("Invalid") end' \
   > /dev/null 2>&1; then
  echo "✓ The response conforms to the schema."
else
  echo "✗ The response does not conform to the schema."
fi
✓ The response conforms to the schema.