The future of AI data visualization
Since LLMs hit the scene, one of the very first use cases demonstrated was data analysis. At this stage, most of us have used ChatGPT, Claude, or some other AI to generate a chart, but the jury is still out on the role AI will play in data visualization. Will we continue to default to point-and-click charting? Will AI generate 100% of charts? Or is the future hybrid, intermixing some AI generation with some point and click?
As a founder in the AI and data visualization space, I find this topic almost existential. As a company founded post-2022 (i.e. after LLMs hit the scene in a real way), we have to decide how we want to handle charting. Do we invest hours and hours of dev work (and funds) to develop charting functionality, or is that going away, a sunk cost for all tools built pre-LLMs? Or is the future hybrid? I recently came across Data Formulator, a research project that explores some really interesting interactions between AI and traditional charting, and it revived this question for me.
In this post I’m going to take a look at where we are today for text-to-chart (or text-to-visualization) and where we might be headed in the future.
The current state of text-to-visualization
Like all things AI, this post likely won’t age very well. Some new piece of information or model will come out in the next 6 months and completely change how we think about this topic. Nonetheless, let’s take a look at the various states of data visualization and AI.
Pure point-and-click charting
I won’t linger on this one too much since most readers know this one well. Open up Excel, Google Sheets or any other data tool built pre-2023 and you’ll have some form of this. Sometimes you click to add data to an axis, sometimes you drag and drop a field, but the concept is the same: You structure the data appropriately, then you press a few buttons to generate a chart.
In this paradigm, the vast majority of data cleaning and transformation happens prior to the charting. You can generally apply aggregation metrics like average, median, count, min, max etc. but all transformations are fairly rudimentary.
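To make the "transform first, chart second" workflow concrete, here is a minimal pandas sketch of the kind of pre-aggregation a user typically performs before ever touching a legacy charting tool (the dataset and column names are hypothetical, purely for illustration):

```python
import pandas as pd

# Hypothetical raw data: one row per sale
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "amount": [100.0, 250.0, 80.0, 120.0, 300.0],
})

# In the point-and-click paradigm, this aggregation happens *before* charting:
# the user reshapes the data, then clicks to build a chart from `summary`.
summary = df.groupby("region")["amount"].agg(["mean", "count", "min", "max"])
print(summary)
```

The rudimentary aggregations mentioned above (average, count, min, max) map directly onto the `agg` call; anything beyond that is where legacy tools start to strain.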
100% AI generated charting
AI-generated charts, or text-to-visualization, have only really existed since the advent of modern LLMs. (If we dig around, there were experiments before then, but for all practical purposes we can focus on post-2022 LLMs.)
OpenAI’s ChatGPT can generate non-interactive charts using Python, or a limited set of interactive charts using front-end libraries (see OpenAI Canvas for some examples). Anthropic has its own analogous concept in Artifacts.
It’s worth noting here that AI-generated charts can be subdivided into two families: purely back-end (Python) generated charts, or a mix of back end and front end.
ChatGPT and Claude alternate between the two. Training an AI to generate front-end code, and integrating that code to create visualizations, can be a lot more work than just relying on Python and a library such as Plotly, Matplotlib, or Seaborn. On the other hand, front-end libraries give providers and users more control over the look, feel, and interactivity of the chart. This is why LLM providers have their AI generate basic charts like bar charts, line charts, or scatter plots on the front end, while anything more sophisticated, like a Sankey diagram or waterfall chart, falls back to Python.
A brief sidebar on Fabi.ai: seeing as we’re a data analysis platform, we obviously offer charting, and although we do have some point-and-click charting, the vast majority of charts created by our users are AI-generated. So far, we’ve found that AI is remarkably good at generating charts, and by relying on pure Python for charting, we’ve been able to train the AI to generate nearly any chart the user can dream up. For now, we’ve chosen that accuracy and flexibility over point-and-click functionality and custom UI designs.
Hybrid: AI generation in a point-and-click paradigm
This is where things start to get interesting in the debate over where AI text-to-visualization is headed. Fast forward three years: when someone is doing an analysis with AI, will they let the AI take 100% control, or will the AI be used in a mixed environment where it can only edit charts within the confines of certain point-and-click functionality?
To help make this picture more concrete, check out Data Formulator. This is a recent research project that attempts to offer a true mixed environment where AI can make certain edits, but the user can take over and use the point-and-click functionality as needed.
To put the question in car terms: do you believe that future cars will have no steering wheel at all, or that a driver will still sit there, pay attention, and occasionally take over, similar to how Tesla’s self-driving functionality currently works?
First principles: What I believe to be true
The question of where things are headed is really important to us at Fabi.ai seeing as this could greatly influence certain decisions we make: Do we invest in integrating a charting library in the front end? Do we even bother with point-and-click functionality at all? As a growing, innovative company leading in the AI data analysis space, we need to be thinking about where the puck is going, not where it currently is.
So to answer this question, I’m going to use some first-principle thinking.
AI is only getting better, faster, cheaper
From the very first time I used AI and heard the complaints about speed and cost, I’ve believed that AI would continue getting better, faster, and cheaper. Roughly speaking, the cost per token has fallen by about 87% per year over the past few years. Not only has the cost gone down, but accuracy and speed have both improved drastically as well.
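An 87% annual decline compounds quickly. Taking the figure above at face value, a quick back-of-the-envelope calculation shows what survives of the original cost after a few years:

```python
# If cost per token falls 87% per year, the remaining fraction after n years
# is 0.13 ** n (13% of the previous year's cost, compounded).
for years in (1, 2, 3):
    remaining = 0.13 ** years
    print(f"After {years} year(s): {remaining:.4%} of the original cost")

# After 3 years, roughly 0.22% of the original cost remains,
# i.e. a ~450x reduction.
```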
In the next 10 years, we’re going to look back on 2024 LLMs the same way we look back on “supercomputers” from the 80s and 90s now that we all have supercomputers in our pockets everywhere we go.
All that to say: any argument for or against the various charting approaches mentioned above cannot rest on AI being too slow, expensive, or inaccurate to generate charts. In other words, to believe that point-and-click charting will still exist in any way, shape, or form, you have to believe there is something about the user experience or the use case that merits that functionality.
The hard part about data visualization is the data wrangling and cleaning
In my experience, when doing any form of data analysis that involves visualization, the hard part is not the charting. The hard part is getting the data cleaned and ready in the right format for the chart I’m trying to create.
Say I have some user event data with the following fields:
- Event ID
- Event start timestamp
- Event end timestamp
Now say I want to plot the average event duration by hour to measure latency. Before I can do any sort of charting in a spreadsheet or legacy charting tool, I have to:
- Calculate the end time minus the start time (I probably have to do some sort of formatting first)
- Pivot the data by hour, which is actually surprisingly hard to do
But if I ask AI to do this, it takes care of all of that, plus the charting, in just a second or two:
```python
import pandas as pd
import plotly.express as px

# Calculate the event duration in hours
df['Event duration (hours)'] = (
    df['Event end datetime'] - df['Event start datetime']
).dt.total_seconds() / 3600

# Extract the start hour from the start datetime
df['Start hour'] = df['Event start datetime'].dt.hour

# Group by start hour and calculate the average duration
average_duration_by_hour = (
    df.groupby('Start hour')['Event duration (hours)'].mean().reset_index()
)

# Plot using Plotly
fig = px.bar(
    average_duration_by_hour,
    x='Start hour',
    y='Event duration (hours)',
    title='Average Event Duration by Hour',
    labels={
        'Event duration (hours)': 'Average Duration (hours)',
        'Start hour': 'Hour of Day',
    },
    text='Event duration (hours)',
)

# Show the figure
fig.show()
```
And this was one of the simplest possible examples. Most times real-world data is much more complicated.
The future of AI text-to-visualization: 100% AI generation, with some point and click
At this point, you likely have a sense of where I’m leaning. As long as you can get your dataset roughly right, with all the data needed for the analysis, AI already does a remarkably good job of manipulating and charting it in the blink of an eye. One, two, or three years from now, it’s hard to imagine this won’t be the standard.
That said, some interesting hybrid approaches are cropping up, like Data Formulator. The case for this type of approach is that our hands and brains may be able to make quick tweaks faster than we can think through what we want and explain it clearly enough for the AI to do its job. If I ask “Show me total sales by month over the last 12 months,” assuming this should be a stacked bar chart broken out by region, we may find it easier to just move the mouse around. If that’s the case, the hybrid approach may be the most interesting: ask the AI to take a first stab at it, then a few clicks and you have what you want.
The key to success for either a full AI approach or a hybrid approach is going to be in the user experience. Especially for the hybrid approach, the AI and human interactions have to work perfectly hand in hand and be incredibly intuitive to the user.
I’m excited to watch the space develop and see where we head with text-to-visualization in the next 12 months.