The AI revolution is bringing some crazy advancements in technology, and one of the most exciting developments is ChatGPT. This bad boy has the potential to seriously shake up the way we work. And guess what? Writing SQL is one of the tasks that’s already feeling the impact.
Now, get ready for some mind-blowing stuff because we’re about to go deep into the world of natural language and SQL databases. We’ll be using Vanna, an open-source Python package, so grab your coding hats and let’s get started!
In this guide, we’re gonna cover some important ground. We’ll tackle the challenges of writing SQL in data-driven projects, explore how generative AI can make our lives easier, and delve into the implementation of LLMs (Large Language Models) to write SQL queries using natural language prompts. Plus, we’ll learn how to connect and interact with an SQL database using the badass Vanna.
But hold up! Before we get into the nitty-gritty, we gotta give credit to the Data Science Blogathon for publishing this article. It’s a goldmine of information, my friends.
Okay, let’s kick things off by talking about the common struggles of working with SQL in data-driven projects. SQL is a beast of a query language, and not everyone in a company has the skills to tame it. This becomes a major bottleneck when it comes to answering business questions, because everyone ends up waiting on the few SQL-savvy individuals who hold the keys to the database.
But what if I told you there’s a way for everyone, regardless of their SQL expertise, to access and utilize that valuable data? Enter generative AI to save the day! Developers and researchers are already experimenting with training LLMs specifically for SQL tasks. One popular tool, LangChain, can connect and interact with SQL databases using natural language prompts. It’s still in its early stages, though, so there’s plenty of room for improvement.
And that’s where Vanna comes in. This AI agent is here to democratize SQL usage. With Vanna, you can create custom models tailored to your specific database. Just ask it business questions in plain English, and it’ll work its magic to translate those questions into SQL queries. You can even run the queries against the database and get the results, along with some neat visualizations and follow-up questions.
To create a custom model with Vanna, you gotta train it with the right information. That means feeding it SQL examples, database documentation, and database schemas. The more accurate and relevant your training data, the better the SQL your model generates. The cool thing is, Vanna keeps learning from its mistakes and gets smarter with every query it generates.
Now, let’s get down to business. First things first, you gotta install Vanna. It’s a piece of cake: a single pip install vanna does the job. Once it’s set up, import the package and get your API key. Trust me, this is gonna be a wild ride.
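Here’s a minimal sketch of that setup step. Heads up: get_api_key() and set_api_key() are the module-level calls from the older Vanna releases this kind of walkthrough was written against, and newer releases have reorganized the API, so treat the names as assumptions and check the current docs. The authenticate() helper is our own hypothetical wrapper, not part of Vanna.

```python
# Install first, from a shell:  pip install vanna


def authenticate(email: str) -> str:
    """Request an API key for `email` and register it with Vanna.

    Assumes the legacy module-level Vanna API (get_api_key / set_api_key).
    """
    # Imported lazily so this file still loads if vanna is not installed yet.
    import vanna as vn

    # Assumption: the legacy call may prompt for a code sent to this address.
    api_key = vn.get_api_key(email)
    vn.set_api_key(api_key)
    return api_key
```

You’d call it once per session, for example authenticate("you@example.com"), with your own address in place of the placeholder.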
With Vanna, you can create as many custom models as you want. So let’s say you’re part of the marketing department, and you work with a Snowflake data warehouse and a PostgreSQL database. You can create a separate model for each database, each one trained on that database’s own schema, documentation, and example queries. It’s all about customization, my friends.
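To make the one-model-per-database idea concrete, here’s a rough sketch. It assumes the legacy module-level create_model(model=..., db_type=...) call. The model names and the db_type strings in the usage note are hypothetical, and create_team_models() is our own helper, so double-check everything against the Vanna version you actually have installed.

```python
def create_team_models(model_db_types: dict[str, str]) -> list[str]:
    """Create one Vanna model per database and return the model names.

    `model_db_types` maps a model name to the type of database it targets.
    Assumes the legacy module-level Vanna API (create_model).
    """
    # Imported lazily so this file still loads if vanna is not installed yet.
    import vanna as vn

    created = []
    for name, db_type in model_db_types.items():
        # One model per database keeps each model's training data separate.
        vn.create_model(model=name, db_type=db_type)
        created.append(name)
    return created
```

For the marketing example, a call might look like create_team_models({"marketing-snowflake": "Snowflake", "marketing-postgres": "Postgres"}), with names you pick yourself.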
Once your models are ready, you can access them using Vanna’s handy functions. But here’s a tip: Vanna comes with some pre-trained models you can play around with for testing purposes. We’ll work with the “chinook” model, which is trained on Chinook, a sample SQLite database for a fictional digital music store. It’s gonna be a blast, I promise.
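Pointing Vanna at the demo model could look something like the sketch below. The set_model() and connect_to_sqlite() calls come from the legacy module-level API, and the Chinook URL is the one Vanna’s older docs used. Neither is confirmed by this article, so take both as assumptions. use_chinook_demo() is our own hypothetical helper.

```python
# Assumption: location of the hosted Chinook sample database in older Vanna docs.
CHINOOK_URL = "https://vanna.ai/Chinook.sqlite"


def use_chinook_demo():
    """Select the pre-trained "chinook" model and connect to its database.

    Assumes the legacy module-level Vanna API (set_model / connect_to_sqlite).
    """
    # Imported lazily so this file still loads if vanna is not installed yet.
    import vanna as vn

    vn.set_model("chinook")            # which model answers our questions
    vn.connect_to_sqlite(CHINOOK_URL)  # where the generated SQL gets executed
    return vn
```

The split matters: the model decides what SQL gets written, while the connection decides which database that SQL runs against.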
Now, let’s talk about the guts of Vanna: the models. You can check what data is in a model at any time using the get_training_data() function. It’ll give you a pandas DataFrame with all the training data. You can even add more training data manually using the train() function. Just pass in a question together with the SQL query that answers it, a DDL statement, or a piece of database documentation, and watch your model get even smarter.
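Here’s a small sketch of the three kinds of training input, followed by a look at what the model now holds. It assumes the legacy module-level train() and get_training_data() calls with question, sql, ddl, and documentation keyword arguments. The Customer table is a simplified, hypothetical stand-in for the real Chinook schema, and add_training_examples() is our own helper.

```python
def add_training_examples():
    """Add one example of each training-data type, then return the training data.

    Assumes the legacy module-level Vanna API (train / get_training_data).
    """
    # Imported lazily so this file still loads if vanna is not installed yet.
    import vanna as vn

    # 1) A question paired with the SQL that answers it.
    vn.train(
        question="How many customers are there?",
        sql="SELECT COUNT(*) FROM Customer",
    )

    # 2) A DDL statement, so the model knows the table's columns.
    #    Simplified, hypothetical version of the table for illustration.
    vn.train(
        ddl="CREATE TABLE Customer ("
            "CustomerId INTEGER PRIMARY KEY, "
            "FirstName TEXT, LastName TEXT, Country TEXT)"
    )

    # 3) Plain-English documentation about business terms.
    vn.train(
        documentation="Sales means the sum of UnitPrice * Quantity over invoice lines."
    )

    # Per the article, this comes back as a pandas DataFrame.
    return vn.get_training_data()
```

Each call passes only one kind of input, which keeps it obvious what the model is learning from each example.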
Alright, buckle up because now we’re getting to the fun part: asking questions. With Vanna, you simply call the ask() function and throw your question at it. Let’s start with an easy one: “What are the top 5 jazz artists by sales?” Vanna will translate the question into a SQL query, run it against the connected database, and return the results. It’s like having a genius SQL assistant at your fingertips.
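To get a feel for what’s hiding behind that one-line question, here’s a hand-written sketch of the kind of SQL it boils down to. This is not Vanna’s actual output. It runs against a tiny in-memory SQLite database whose tables borrow Chinook-style names, and the rows are made-up toy data just to make the example runnable.

```python
import sqlite3

# Tiny in-memory stand-in for a Chinook-like schema (toy rows, not real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT,
                    AlbumId INTEGER, GenreId INTEGER);
CREATE TABLE InvoiceLine (InvoiceLineId INTEGER PRIMARY KEY, TrackId INTEGER,
                          UnitPrice REAL, Quantity INTEGER);

INSERT INTO Artist VALUES (1, 'Artist A'), (2, 'Artist B'), (3, 'Artist C');
INSERT INTO Album VALUES (1, 'Album A', 1), (2, 'Album B', 2), (3, 'Album C', 3);
INSERT INTO Genre VALUES (1, 'Jazz'), (2, 'Rock');
INSERT INTO Track VALUES (1, 'Track 1', 1, 1), (2, 'Track 2', 2, 1),
                         (3, 'Track 3', 3, 2);
INSERT INTO InvoiceLine VALUES (1, 1, 1.0, 3), (2, 2, 1.0, 5), (3, 3, 1.0, 9);
""")

# "What are the top 5 jazz artists by sales?" written out by hand:
# join invoice lines to tracks, keep only Jazz, total the sales per artist.
TOP_JAZZ_SQL = """
SELECT ar.Name AS artist,
       SUM(il.UnitPrice * il.Quantity) AS sales
FROM InvoiceLine AS il
JOIN Track  AS t  ON t.TrackId   = il.TrackId
JOIN Genre  AS g  ON g.GenreId   = t.GenreId
JOIN Album  AS al ON al.AlbumId  = t.AlbumId
JOIN Artist AS ar ON ar.ArtistId = al.ArtistId
WHERE g.Name = 'Jazz'
GROUP BY ar.ArtistId, ar.Name
ORDER BY sales DESC
LIMIT 5
"""

rows = conn.execute(TOP_JAZZ_SQL).fetchall()
for artist, sales in rows:
    print(artist, sales)
```

Four joins, a filter, a group-by, and a sort, all for one plain-English question. That’s exactly the busywork a text-to-SQL tool is meant to take off your plate.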
And that’s the magic of Vanna, my friends. It’s all about making SQL accessible and easy for everyone, no matter their technical skills. So go ahead, explore the world of generative AI and SQL, and let Vanna be your guide. It’s gonna be one hell of a journey!