[Solved] OpenAI API: How do I enable JSON mode using the gpt-4-vision-preview model?

by
Alexei Petrov
chat-gpt-4 openai-api python

Quick Fix: Currently, JSON mode is supported only by the gpt-4-1106-preview and gpt-3.5-turbo-1106 models. Set the response_format parameter to {"type": "json_object"} on one of those models to constrain the output to a valid JSON string.

The Problem:

A developer is using the OpenAI API’s gpt-4-vision-preview model to generate JSON-formatted responses for image-based prompts. However, when they set the response_format parameter to {"type": "json_object"} as specified in the documentation, they encounter the error: 1 validation error for Request body -> response_format extra fields not permitted (type=value_error.extra). The developer needs a way to enable JSON mode for the gpt-4-vision-preview model.

The Solutions:

Solution 1: Use `gpt-4-1106-preview` or `gpt-3.5-turbo-1106` models

According to the updated OpenAI documentation, you can only get a guaranteed-JSON response when using the `gpt-4-1106-preview` or `gpt-3.5-turbo-1106` models, which support the response_format parameter.

Here’s a Python code example that demonstrates how to use the `gpt-4-1106-preview` model to get a JSON response:

import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

completion = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant. Your response should be in JSON format.",
        },
        {"role": "user", "content": "Hello!"},
    ],
    response_format={"type": "json_object"},
)

print(completion.choices[0].message.content)

When you run this code, you should get a JSON response similar to this:

{
  "response": "Hello! How can I assist you today?"
}
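Note that even in JSON mode, the API returns message.content as a string, so parse it with json.loads before use. A minimal sketch (the string below mirrors the sample output above):

```python
import json

# The content field is a JSON-formatted string; parse it into a dict.
content = '{"response": "Hello! How can I assist you today?"}'
data = json.loads(content)
print(data["response"])  # Hello! How can I assist you today?
```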

Here’s a Node.js code example that demonstrates how to use the `gpt-4-1106-preview` model to get a JSON response:

const OpenAI = require("openai");
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: "gpt-4-1106-preview",
    messages: [
      {
        role: "system",
        content:
          "You are a helpful assistant. Your response should be in JSON format.",
      },
      { role: "user", content: "Hello!" },
    ],
    response_format: { type: "json_object" },
  });

  console.log(completion.choices[0].message.content);
}

main();

When you run this code, you should also get a JSON response similar to the one shown above.

Solution 2: Check if the model supports JSON output

The release notes indicate that the models used in the original code (`gpt-3.5-turbo` and `gpt-4-vision-preview`) don’t support the new JSON output format option. To enable JSON mode, you should use `gpt-3.5-turbo-1106` or `gpt-4-1106-preview` instead.

Here’s an updated version of the code that uses `gpt-3.5-turbo-1106` as the model:

import requests

api_key = "{YOUR_API_KEY}"
prompt = "Generate a caption for the following image:"
base64_image = "{BASE64_ENCODED_IMAGE}"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}"
}

payload = {
    "model": "gpt-3.5-turbo-1106",
    "response_format": {"type": "json_object"},
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant. Your response should be in JSON format."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}"
                    }
                }
            ]
        }
    ],
    "max_tokens": 1000
}

response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
print(response.json())

With this change, you should be able to successfully enable JSON mode for the supported models. Note, however, that neither gpt-3.5-turbo-1106 nor gpt-4-1106-preview accepts image inputs, so the image_url content above will be rejected; this change only helps for text-only prompts.
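As a defensive sketch (build_payload is a hypothetical helper, not part of the OpenAI SDK), you can attach response_format only when the chosen model supports JSON mode:

```python
# Models that supported JSON mode at the time of writing (per the
# documentation referenced above).
JSON_MODE_MODELS = {"gpt-4-1106-preview", "gpt-3.5-turbo-1106"}

def build_payload(model, messages, max_tokens=1000):
    """Build a chat-completions payload, adding the response_format
    parameter only for models that accept it."""
    payload = {"model": model, "messages": messages, "max_tokens": max_tokens}
    if model in JSON_MODE_MODELS:
        payload["response_format"] = {"type": "json_object"}
    return payload
```

This avoids the "extra fields not permitted" validation error when the same request-building code is reused across models.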

Solution 3: Workaround with Body Response

To enable JSON mode for the gpt-4-vision-preview model using the OpenAI API, you can follow these steps:

  1. Make a POST request to the "https://api.openai.com/v1/chat/completions" endpoint.
  2. Set the "Content-Type" header to "application/json" and the "Authorization" header to "Bearer " followed by your API key.
  3. In the request body, specify the "model" as "gpt-4-vision-preview" and provide your prompt and image URL.
  4. Set "max_tokens" to 1000 or adjust as needed.

import requests

api_key = "{YOUR_API_KEY}"
prompt = "Generate a caption for the following image:"
base64_image = "{BASE64_ENCODED_IMAGE}"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}"
}

payload = {
    "model": "gpt-4-vision-preview",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant. Your response should be in JSON format."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}"
                    }
                }
            ]
        }
    ],
    "max_tokens": 1000
}

response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)

# Parse the HTTP response body as JSON
json_response = response.json()
print(json_response["choices"][0]["message"]["content"])

Note: Because the `response_format` parameter is not supported for this model, the model’s reply is ordinary text that may wrap the JSON object in prose or a Markdown code fence, so you may need string manipulation to extract the JSON object from the message content.
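A minimal sketch of such string manipulation (extract_json is a hypothetical helper; it assumes the reply contains a single JSON object, optionally inside a Markdown fence):

```python
import json
import re

def extract_json(text):
    """Extract and parse the first JSON object from a model reply that
    may surround it with prose or a Markdown code fence."""
    # Prefer an explicit ```json ... ``` fence if one is present.
    match = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", text, re.DOTALL)
    if match:
        candidate = match.group(1)
    else:
        # Otherwise, take everything between the first "{" and last "}".
        candidate = text[text.find("{"): text.rfind("}") + 1]
    return json.loads(candidate)
```

For example, extract_json('Here it is:\n```json\n{"caption": "a cat"}\n```') returns {"caption": "a cat"}.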

The code above demonstrates how to make the request, set the necessary headers and body, and parse the JSON response in Python. Remember to replace {YOUR_API_KEY} and {BASE64_ENCODED_IMAGE} with your actual values.

Solution 4: Opt for Alternative Models to Enable JSON Mode

Unfortunately, the gpt-4-vision-preview and gpt-3.5-turbo models don’t currently support the JSON output format.

Referencing the official documentation from OpenAI, JSON mode is limited to two models: gpt-4-1106-preview and gpt-3.5-turbo-1106.

Therefore, to enable JSON mode, your solution is to select one of these models:

Image of the "gpt-4-1106-preview" model in the OpenAI API dropdown menu

Image of the "gpt-3.5-turbo-1106" model in the OpenAI API dropdown menu

Q&A

Is there a way to get a JSON response from the "gpt-4-vision-preview" model?

Currently, the "gpt-4-vision-preview" model doesn’t support the JSON response format.

Which models can be used to get a JSON response?

Only `gpt-4-1106-preview` and `gpt-3.5-turbo-1106` can be used to enable JSON mode.

Video Explanation:

The following video, titled "Build a Chat App with NEW ChatGPT API | Full stack, React, Redux ...", provides additional insights and in-depth exploration related to the topics discussed in this post.
