Simplifying Data Extraction with OpenAI JSON Mode and JSON Schemas
When I first experimented with ChatGPT-3.5 after its release, I was thrilled about its potential for various applications. However, my excitement quickly faded when I encountered a major roadblock: although the information it returned was valuable and exceptionally readable, it was not in a form that an application could reliably ingest. Ironically, LLMs excel at extracting information from unstructured text but can only return it in unstructured form. Trying to programmatically extract results from LLMs felt like being at an incredible restaurant that serves the most delicious food but provides no utensils: you can see it and smell it, but you just can’t get to it.
I tried every trick in the book to coax it into giving me some semblance of structured data. “Please, just separate each item with a bar or a new line and skip the commentary,” I’d plead. Sometimes it worked, sometimes it didn’t. Sometimes it would “helpfully” number or reorder the items, like a well-meaning but slightly confused assistant. Other times it would still sneak in some commentary, reminiscent of a chatty co-worker. I even demanded, in no uncertain terms, that it return JSON and nothing else, but it sometimes left out a comma, almost as if it were taking a passive-aggressive jab. Eventually, I gave up and reluctantly returned to the less exciting but more predictable confines of traditional algorithms.
Fortunately, a few months later, OpenAI introduced JSON mode, a feature that forces the LLM to return valid JSON. I decided to try this feature and found it significantly more effective for processing results in my applications. Here’s an example of the output with JSON mode enabled:
PROMPT:
Parse the following sentence into words and then return the results as a list of the original word and the translation in English and return the results in JSON.
-- sentence --
早安

RESULTS:
{
  "results": [
    {
      "original": "早安",
      "translation": "Good morning"
    }
  ]
}
This output is certainly an improvement. However, while the output is valid JSON, its structure can vary depending on the contents of the prompt. A more predictable approach is to specify the desired return format. One way to achieve this is by providing a sample JSON structure for the LLM to follow. This method involves creating an example and separately writing code to parse it, so if the structure changes, both must be updated and kept in sync.
An alternative approach is to define a Data Transfer Object (DTO) to hold the results and use it both to instruct the LLM and to parse the results, avoiding synchronization issues. First, define the DTO, for example:
record Entries(List<Entry> entries) {
    record Entry(String originalWord, String wordInEnglish, String pronunciation) {}
}
Now the DTO can be used in the prompt instructions as well as by the parsing code:
// Construct the prompt with the output schema.
var prompt = MessageFormat.format("""
    Parse the following sentence into English and return the results in JSON according to the following JSON schema.
    人工智慧將引領未來,以智慧之光照亮人類無限可能的前程。
    --- output json schema ---
    {0}
    """, jsonSchemaOf(Entries.class));

var result = sendPrompt(prompt, Entries.class);
The jsonSchemaOf helper is built on the Jackson JSON Schema generator, which derives the schema from the record class.
Note: By default, the generated schema will include ID fields used for references, which can waste tokens. See the repository OpenAI JSON Mode Sample for code that removes these unused IDs.
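To make the idea of deriving a schema from the DTO concrete without pulling in Jackson, here is a minimal sketch that uses only JDK reflection. It is not the Jackson-based helper described above: the class name SchemaSketch is hypothetical, and this version assumes the DTO contains only String fields, List fields, and nested records. It also emits no ID fields, so there is nothing to strip afterwards.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.RecordComponent;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for jsonSchemaOf. Assumption: only String, List<T>,
// and nested records are supported; anything else is rejected.
class SchemaSketch {

    record Entries(List<Entry> entries) {
        record Entry(String originalWord, String wordInEnglish, String pronunciation) {}
    }

    static String jsonSchemaOf(Type type) {
        if (type == String.class) {
            return "{\"type\":\"string\"}";
        }
        if (type instanceof ParameterizedType p && p.getRawType() == List.class) {
            // A List<T> maps to a JSON array whose items follow T's schema.
            return "{\"type\":\"array\",\"items\":"
                + jsonSchemaOf(p.getActualTypeArguments()[0]) + "}";
        }
        if (type instanceof Class<?> c && c.isRecord()) {
            // A record maps to a JSON object with one property per component,
            // in declaration order.
            List<String> properties = new ArrayList<>();
            for (RecordComponent rc : c.getRecordComponents()) {
                properties.add("\"" + rc.getName() + "\":" + jsonSchemaOf(rc.getGenericType()));
            }
            return "{\"type\":\"object\",\"properties\":{" + String.join(",", properties) + "}}";
        }
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    public static void main(String[] args) {
        // Prints the schema text that would be substituted for {0} in the prompt.
        System.out.println(jsonSchemaOf(Entries.class));
    }
}
```

Because the schema is computed from the same record that the parsing code deserializes into, renaming or adding a component changes the prompt and the parser together.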
And finally, the sendPrompt helper sends the prompt to OpenAI using the Azure OpenAI Java SDK and parses the JSON response into the given DTO class.
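For readers who want to see what enabling JSON mode looks like on the wire, here is a rough sketch that builds the request with the JDK's java.net.http classes instead of the Azure OpenAI Java SDK. The class and method names are hypothetical, the model name in main is a placeholder, and the endpoint shown is OpenAI's public Chat Completions URL (an Azure deployment uses a different URL and authentication header). The part that matters is the response_format field set to json_object.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch: builds, but does not send, a Chat Completions request
// with JSON mode enabled.
class JsonModeRequestSketch {

    // Quotes a Java string as a JSON string, escaping characters that must
    // not appear raw inside it.
    static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char ch : s.toCharArray()) {
            switch (ch) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (ch < 0x20) sb.append(String.format("\\u%04x", (int) ch));
                    else sb.append(ch);
                }
            }
        }
        return sb.append('"').toString();
    }

    // The response_format field is what switches JSON mode on.
    static String requestBody(String model, String prompt) {
        return "{\"model\":" + jsonString(model)
            + ",\"response_format\":{\"type\":\"json_object\"}"
            + ",\"messages\":[{\"role\":\"user\",\"content\":" + jsonString(prompt) + "}]}";
    }

    static HttpRequest buildRequest(String apiKey, String model, String prompt) {
        return HttpRequest.newBuilder(URI.create("https://api.openai.com/v1/chat/completions"))
            .header("Content-Type", "application/json")
            .header("Authorization", "Bearer " + apiKey)
            .POST(HttpRequest.BodyPublishers.ofString(requestBody(model, prompt)))
            .build();
    }

    public static void main(String[] args) {
        // Placeholder model name; substitute whichever model or deployment you use.
        System.out.println(requestBody("gpt-4o-mini", "Return the results in JSON."));
    }
}
```

Note that JSON mode expects the word "JSON" to appear somewhere in the messages, which the prompts in this article already satisfy. The response content would then be deserialized into the DTO with a JSON library of your choice.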
The solution works well most of the time. The LLM generally follows the JSON schema, but a word of caution: I’ve seen cases where it gets it wrong. For example, if a field is a String and its name is plural (e.g. “exampleValues”), the LLM sometimes insists on returning an array of Strings instead.
LLMs can generate remarkable outputs, sometimes exceeding the capabilities of the average person. However, it’s intriguing that, at least for now, they struggle with the more mundane task of reliably formatting their generated output.