How to get JSON response from Ollama

Many developers are using ChatGPT in their applications, but may want to explore cheaper alternatives. Furthermore, for many applications you simply want a JSON response from the model, without the excessive courtesies. Ollama is a fantastic application for getting a Large Language Model (LLM) up and running quickly in your local environment. This guide explains how to set up Ollama to return JSON-formatted responses to your prompts.

Get Ollama installed


First, you should get Ollama installed. Simply download it from GitHub.

You'll also need to pull down a model (e.g. by running ollama pull llama2 in a terminal). llama2 is what I've used for now, but many other options exist.

Make a request to the Ollama server

When you start Ollama, it will start a server process. You can interact with the server through the included chat module, but in our case we're going to interact with it strictly through the Ollama API.

The default configuration for Ollama will start the server listening at

http://localhost:11434

We can make requests to this server at /api/generate and receive responses.
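
If you want a quick sanity check before writing any application code, here's a minimal sketch of hitting that endpoint from Node 18+ (which ships with a built-in fetch); the model name assumes you pulled llama2 as described above:

// Minimal request to Ollama's generate endpoint.
// stream: false asks for one complete JSON object instead of streamed chunks.
const response = await fetch('http://localhost:11434/api/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'llama2',
    prompt: 'Why is the sky blue?',
    stream: false,
  }),
});
const data = await response.json();
console.log(data.response); // the model's reply text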

Let's think of a prompt for what we'd like the model to do. One example might be parsing email text to determine some information about it.

const prompt = `This body of text may or may not be an automated response from a job applicant tracking system. Please parse the text and identify the company applied to and the job title, if possible.

If there's a division included, such as "Software Engineer, Ad Creative Management", don't include it, e.g. only return "Software Engineer".

Please format your response in JSON and use "company", "jobTitle", and "wasAtsEmail" as attributes.

If the email does not appear to be from a job applicant tracking system, for example if it's from an interest list for purchasing homes or a product, set wasAtsEmail to false.

Email text is as follows: \n\n${emailBody}`;

For each request we include the email body in the prompt, similar to calling a function with different arguments.
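
To make that concrete, here's a hypothetical buildPrompt helper (just a sketch, not part of Ollama or any library) that keeps the fixed instructions in one place and swaps in a different email body per request:

// Hypothetical helper: the fixed instructions live in one constant,
// and each email body is appended per request.
const instructions = 'This body of text may or may not be ...'; // the full instructions shown above

function buildPrompt(emailBody) {
  return `${instructions}\n\nEmail text is as follows:\n\n${emailBody}`;
}

// Usage: const prompt = buildPrompt(someEmailBody);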

In order to strip out the "Certainly! blah blah blah" obsequious comments the model tends to open with and force it to reply with only JSON, we pass the "format": "json" attribute in the request. We then make a normal POST request to the endpoint and receive our response.

const axios = require('axios');

// Default Ollama generate endpoint, as noted above.
const ollamaEndpoint = 'http://localhost:11434/api/generate';

const requestBody = {
  "model": "llama2:latest",
  "prompt": prompt,
  "format": "json",   // forces the reply to be valid JSON, no pleasantries
  "stream": false,    // one complete response instead of streamed chunks
};
const response = await axios.post(ollamaEndpoint, requestBody);

The model will return the response in response.data.response as a JSON string. We simply call JSON.parse() on it and we've got our object.
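
Even with "format": "json", it's worth guarding that parse, since a model can occasionally emit malformed output. A minimal sketch (the attribute names match the prompt above):

// Parse the model's reply defensively; log or retry if it isn't valid JSON.
let parsed;
try {
  parsed = JSON.parse(response.data.response);
} catch (err) {
  console.error('Model did not return valid JSON:', err);
}
console.log(parsed?.company, parsed?.jobTitle, parsed?.wasAtsEmail);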

Conclusion

Using Ollama is a great way to gain experience writing AI-powered applications and to get a prototype up and working quickly. Using open-source models also provides some insurance against vendor lock-in and surprise cost increases. As open-source models continue to develop, the software ecosystem will continue to provide new possibilities for developers to create amazing applications.

Jon P

I'm the author of these works. Technical professional and independent investor. Dabbler in everything. Good luck!