Quick Setup of Flights Data to Learn SQL
I realize that if someone is completely new to SQL, setting up an environment can be an initial pain point that gets in the way of actually learning SQL.
This article is a quick tutorial on how to set up and get going with the Kaggle 2015 Flight Delays and Cancellations data.
For this setup you will need:
- Git installed on your machine
- Docker Compose Version 2 installed (or update the commands in the Makefile from docker compose to docker-compose)
- A Kaggle account (or somebody who has one and can download the data for you :-))
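If you want to confirm the tools are in place before starting, a quick check along these lines may help. It is only a sketch: the check_tool helper is a made-up name for illustration, and make is included on the assumption that you will run the Makefile targets used later in this article.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

check_tool git
check_tool docker
check_tool make

# Compose Version 2 runs as a docker subcommand (docker compose);
# Version 1 is the separate docker-compose binary.
if docker compose version >/dev/null 2>&1; then
  echo "docker compose (v2): found"
else
  echo "docker compose (v2): missing"
fi
```

If the last line reports missing but you do have the older docker-compose binary, that is the case where the Makefile commands need updating.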
That’s it. Let’s jump right in.
Go to the GitHub Repository https://github.com/MichaelShoemaker/sql_practice_flight_data
Click the Code button, choose HTTPS and copy the link.
On your local machine run
git clone https://github.com/MichaelShoemaker/sql_practice_flight_data.git
Now we need the data. Head over to the Kaggle Flight Delay data (https://www.kaggle.com/datasets/usdot/flight-delays). If you don’t have an account, just create one; it’s free. Then download the data and copy it to the data directory in the cloned repository.
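Kaggle downloads usually arrive as a zip archive, so you may need to extract it before copying. The dataset ships three CSV files: airlines.csv, airports.csv and flights.csv. The sketch below checks that they landed in the data directory. It assumes the loader expects the files under their original Kaggle names, which is not confirmed here, and check_data_dir is a made-up helper name.

```shell
# Hypothetical helper: check that the three Kaggle CSVs are present in a directory.
check_data_dir() {
  dir="$1"
  rc=0
  for f in airlines.csv airports.csv flights.csv; do
    if [ -f "$dir/$f" ]; then
      echo "ok       $f"
    else
      echo "missing  $f"
      rc=1
    fi
  done
  return "$rc"
}

# Run from the root of the cloned repo.
check_data_dir data || echo "Copy the missing files into data/ before continuing."
```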
Now from the root of the cloned repo execute
make up
Navigate to localhost:8080 and log in with the username admin@admin.com and the password root.
Once logged in, right-click on Server, highlight Register and choose Server.
You can call the server whatever you want. In my case I just used flights.
Then, in the Connection tab, enter database for the host and use root for both the Username and the Password.
Once connected, you will notice that there is a flights database, but no data.
From the root of the cloned repo run
make load-data
You may need to right-click on Tables and choose Refresh. You should now see the tables populated.
Go to Tools at the top and choose Query Tool.
You just need to hit the little play button to execute the queries you write.
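As a first query, listing the loaded tables is a safe place to start, since it does not depend on any table or column names. The sketch below runs it from the terminal instead of the browser. It rests on several assumptions that are not confirmed in this article: that the database is PostgreSQL, that the Compose service is named database (matching the host name entered earlier), that psql is available inside that container, and that the tables live in the public schema. Check the Compose file in the repo before relying on it.

```shell
# Lists the tables in the public schema (assumed to be where the load puts them).
LIST_TABLES_SQL="SELECT table_name FROM information_schema.tables WHERE table_schema = 'public' ORDER BY table_name;"

# Assumed service name: database. User root and database flights come from the steps above.
if command -v docker >/dev/null 2>&1; then
  docker compose exec database psql -U root -d flights -c "$LIST_TABLES_SQL" \
    || echo "Query failed; check the service name and that 'make up' is still running."
else
  echo "docker not found"
fi
```

The SELECT statement on its own can also be pasted into the Query Tool and run with the play button.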
You are now free to walk about the cabin.