Tips for Writing SQL Queries with GPT
Learn how to make GPT produce SQL queries that work with your data.
July 9, 2023
ChatGPT can provide immense help in writing SQL queries. But to get the best results, there’s a couple of things you need to look out for.
This post will show you tips on how to make GPT-3.5 / GPT-4 produce SQL queries that work on your data. With these tips in hand, you’ll craft complex SQL queries in no time and find new insights in your data - all with the help of GPT.
Inform GPT about your schema
If you want SQL queries returned by GPT to work with your data, you’ll have to tell GPT about your database schema.
You can start out simple by mentioning specific tables and columns in your prompt. For example:
I have a table named `users` with field `created_at`.
That’s good for simple cases but quickly gets tedious for queries using more columns.
As an alternative, you can paste CREATE TABLE statements into your prompt. Saves you typing effort, and provides more context to GPT by also mentioning data types! Example:
These are the relevant tables:
CREATE TABLE public.users ( id uuid NOT NULL, created_at timestamp with time zone DEFAULT now() NOT NULL, email text NOT NULL );
CREATE TABLE public.purchases ( id uuid NOT NULL, user_id uuid NOT NULL, product_id uuid NOT NULL, created_at timestamp with time zone DEFAULT now() NOT NULL, );
Want to get even more out of GPT for SQL? Give DB Pilot a try!
DB Pilot is a database GUI with GPT integration.
It gives you a personal AI assistant that's aware of your complete database schema.
Don't worry about how to tell GPT how your database looks like - focus on finding new insights in your data.
How to get CREATE TABLE statements
Use any of the commands below, depending on which database you use, to get the
CREATE TABLE statement for a single table.
For PostgreSQL you need to use an external tool to get the
CREATE TABLE statement, for example the command line tool
pg_dump -t '<table_name>' --schema-only <database_name>
Run this SQL statement:
SHOW CREATE TABLE <table_name>;
Run this SQL statement:
SELECT sql FROM sqlite_master WHERE name = '<table_name>';
Provide sample values
In addition to letting GPT know about the database schema, it can help to provide sample values of important columns or even entire rows.
Doing so is especially helpful if the format or organization of your data isn’t visible just by looking at the database schema.
One such case is extracting details from JSON columns. Looking at just the table schema, GPT can’t know the format of your JSON data.
So you’ll either have to describe the format of your JSON data or, and this is often easier, paste one or more example values of the column into the prompt. Looking at those example values, GPT should then be able to figure out how to query the JSON column.
Remember to think about sensitive information when pasting data into a prompt.
It is up to you to decide if you can expose sensitive data to GPT (and thus OpenAI) or not.
Because of this, DB Pilot never exposes values from your database to GPT without your explicit confirmation.
Tell GPT which tables/columns to use
For the most part, GPT is able to infer which tables and columns it needs to query to fulfill the request you give it. In some situations though, it can be unsure.
In those cases it can help to give GPT some assistance: Tell it in which tables and columns it can find what you are looking for. Add something along these lines to your prompt:
Be aware that GPT makes errors occasionally
Both GPT-3.5 and GPT-4 are good at producing valid SQL if you follow the previous tips. But keep in mind that GPT does occasionally produce queries that return errors, or don’t match your intent.
Ideally, check each SQL query returned by GPT to make sure it does what you want.
Let GPT correct itself
As discussed, GPT does make errors at times. But the good thing? It’s usually happy to correct itself.
Say you asked GPT to write a SQL query. You run the query, but it returns an error. To address this, simply send a follow-up chat message along the lines of:
Or say, the query does return something, but the results seem unexpected to you. In this case, send another message explaining what is unexpected about the results.
In both cases, GPT will often be able to correct the error it made initially.
Why use GPT for SQL if it isn’t perfect?
As mentioned, you sometimes need to tell GPT which columns to use, and sometimes GPT doesn’t produce correct results right away. You might wonder then, why use GPT to write SQL queries at all? Why not just write them by hand?
And you’re right, for simple queries like
SELECT * FROM users, it’s faster to construct the query by hand than asking GPT to do the same.
But once things get a little more complex, you’ll save time, and more importantly mental energy, if you let GPT help you with your SQL queries.
Of course, you could write that complex analytical query by hand. But you probably have to think hard to achieve what you need. So why not let GPT do the hard thinking? All you’ll have to do is verify that GPT’s query is correct - usually much easier than coming up with a complex query from scratch all by yourself.
Secondly, how often have you wanted to use a function but didn’t remember the exact order of its parameters? Or, how exactly was the syntax for querying JSON again? By using GPT for SQL, you won’t have to constantly keep looking at documentation for things like these.
Even if GPT doesn’t always get everything correct immediately, it can still help you get to your desired results faster overall.