15  Using Ollama in TypeScript

The goal of this brief tutorial is to demonstrate how to use the Ollama JavaScript library to generate responses from a local - and therefore private - LLM in a programmatic way. This notebook uses the Deno TypeScript kernel.

You will learn how to:

  • Import the Ollama JavaScript library in Deno
  • Generate a free-text response from a local LLM
  • Generate structured output that conforms to a JSON schema

15.1 Requirements

  • Ollama
  • Deno installed with the Deno Jupyter kernel
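
Assuming both tools are already installed, the model used in this chapter can be pulled and the Jupyter kernel registered from the command line (exact flags may differ between Deno versions):

```shell
# Download the model used in the examples below
ollama pull qwen3:8b

# Register the Deno Jupyter kernel (older Deno versions also require --unstable)
deno jupyter --install
```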

15.2 Import the ollama library

We’ll use Deno’s npm compatibility to import the ollama package:

import ollama from "npm:ollama";

15.3 Use the Ollama client

The default export from npm:ollama is a pre-configured Ollama instance (not a class), so we can use it directly:

// The default export is already an Ollama instance, so we can use it directly
// It's pre-configured to connect to http://localhost:11434
const client = ollama;

15.4 Generate a free text response

Let’s ask the model a simple question:

const response = await client.generate({
  model: 'qwen3:8b',
  system: 'You are a meticulous research assistant.',
  prompt: 'What is your name and who made you?',
  stream: false,
  options: {
    temperature: 0.2
  }
});

If the model used is a “reasoning” model, like qwen3:8b, the response will contain a “thinking” field. All models include a “response” field.

In this case, since we used a reasoning model, we can print the reasoning step followed by the final response:

let output = '';
if (response.thinking) {
  output += '--- Reasoning ---\n' + response.thinking + '\n\n';
}
output += '--- Response ---\n' + response.response;
console.log(output);
--- Reasoning ---
Okay, the user is asking for my name and who made me. Let me start by confirming my name. I should mention that my name is Qwen, which is the name given by Alibaba Cloud. Then, I need to explain that I was developed by Alibaba Cloud's Tongyi Lab. It's important to highlight the team's expertise and the collaborative effort involved. I should also mention the technologies and models used, like the large-scale language models and machine learning techniques. Additionally, I should note that I was trained on a vast amount of text data to understand and generate human-like text. Finally, I should invite the user to ask any questions they have. Let me make sure the response is clear and concise, avoiding any technical jargon that might confuse the user.


--- Response ---
My name is Qwen, and I was developed by Alibaba Cloud's Tongyi Lab. I am a large-scale language model designed to assist with a wide range of tasks, from answering questions to creating content. My development involved a team of researchers and engineers who specialized in natural language processing and machine learning. I was trained on a vast amount of text data to understand and generate human-like text. If you have any questions or need assistance, feel free to ask!

In the above example, we concatenate the strings and print them with a single console.log call: multiple consecutive console.log calls in the Deno Jupyter kernel currently appear to be buggy and can produce repeated output. This is likely to be fixed in the future.
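
The same pattern can be factored into a small helper function (a hypothetical formatResponse, assuming only the thinking and response fields shown above):

```typescript
// Build a single printable string from a generate() response.
// The `thinking` field is only present for reasoning models.
function formatResponse(r: { thinking?: string; response: string }): string {
  let out = "";
  if (r.thinking) {
    out += "--- Reasoning ---\n" + r.thinking + "\n\n";
  }
  out += "--- Response ---\n" + r.response;
  return out;
}

console.log(formatResponse({ thinking: "step 1", response: "done" }));
```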

15.5 Generate structured output

There are many scenarios, especially in research, where we want to generate a structured response instead of free text. Many users try to achieve this with instructions added to the user prompt, and LLMs are increasingly good at following them. However, Ollama natively supports defining an output schema, including the required field names and their descriptions. This approach is much cleaner, easier, and more likely to succeed, without requiring laborious and extensive prompting.

We begin by defining the output schema. To do this, we create an object that follows the JSON Schema format:

  • type: "object": Indicates that the output is a JSON object.
  • properties: An object in which we define the fields we want in the output. The name of each entry is the field name, and its value is another object defining the field’s type (e.g. number, string, etc.) and its description.
  • required: An array of strings listing the names of the required fields that must be present in the output.

Define an output schema:

const LLMInfoSchema = {
  type: "object",
  properties: {
    name: {
      type: "string",
      description: "Your name"
    },
    manufacturer: {
      type: "string",
      description: "The name of the person, group, or company that built you."
    },
    knowledge_cutoff: {
      type: "string",
      description: "Your knowledge cutoff date."
    }
  },
  required: ["name", "manufacturer", "knowledge_cutoff"]
};
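
Since the schema is a plain object, it can also be reused on the client side to sanity-check a parsed response. A minimal sketch with a hypothetical missingFields helper (not part of the Ollama API):

```typescript
// Return the names of required fields absent from a parsed response.
type Schema = { required: string[] };

function missingFields(parsed: Record<string, unknown>, schema: Schema): string[] {
  return schema.required.filter((field) => !(field in parsed));
}

const parsed = { name: "Qwen", manufacturer: "Alibaba Cloud" };
console.log(missingFields(parsed, { required: ["name", "manufacturer", "knowledge_cutoff"] }));
// the only missing field is "knowledge_cutoff"
```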

Let’s rerun the previous query, but this time we will include the output schema in the request.

When generating structured output, you might choose to adjust the prompt to explicitly ask for responses that conform to the schema. This is usually not necessary: models, especially those designed for structured output, will typically adhere to the schema without additional prompting. Note that in this case we purposely use the same prompt as before, while the schema actually requests a third field, knowledge_cutoff, which is not mentioned in the prompt.

const structuredResponse = await client.generate({
  model: 'qwen3:8b',
  system: 'You are a meticulous research assistant.',
  prompt: 'What is your name and who made you?',
  stream: false,
  options: {
    temperature: 0.2
  },
  format: LLMInfoSchema
});

The response field is now a string in JSON format. We can parse it and pretty-print it using JSON.parse and JSON.stringify:

let output = '';
if (structuredResponse.thinking) {
  output += '--- Reasoning ---\n' + structuredResponse.thinking + '\n\n';
}
output += '--- Response ---\n' + JSON.stringify(JSON.parse(structuredResponse.response), null, 4);
console.log(output);
--- Response ---
{
    "name": "Qwen",
    "manufacturer": "Alibaba Cloud",
    "knowledge_cutoff": "2024年04月"
}

Ollama does not currently support returning both the reasoning and a structured response in a single call, which is why the reasoning block is absent above. This is likely to be fixed in future versions.