Since I started web development several months ago, I have always wondered what happens behind the scene when a user creates an account on a website or when they fill in their email and password and click on the "login" button, how does the website know the person entered the right details and then grant them access to the site? well, I found out a few days ago when I started learning SQL (specifically MySQL).
It's things like this that make programming so fun and interesting, that "Ah-ha" moments when you suddenly discover how things work in a website. I have been taking an introduction to computer science course (CS50) with Harvard university online (Harvardx) and so far, I have been introduced to C, python, and now SQL.
After each lecture week, we are usually given problem sets to solve which pertain to the language and topic that was taught that week. I recently worked on a project that involves writing codes to query an SQL file filled with data about movies from IMDb and there are over a million rows of data in there. There were 13 questions I had to answer using SQL queries and I will talk about how I solved 5 of them.
Warning: if you're currently taking CS50 and you haven't worked on this problem, you might want to stop reading at this point because I will be providing solutions to some of them. I assume you must have read the course's policy on academic honesty and you know what it means to use someone's code in your work.
Understanding the database
The first thing I did was to use
.schema to show the contents of the database file. There are 5 tables there and each of them holds different data.
The movies table has 3 columns: the title of the movies, the year they were released, and a column containing id numbers that uniquely identify each movie. The same thing can be seen in the people's table: unique id numbers, names of people (directors and movie stars), and their year of birth.
The stars and directors tables are used to link to the people's table and also the movies table. They have something called foreign keys, which are just id numbers that reference the unique id numbers of the movies and people's tables. The foreign keys can be used to link to the tables that have the unique id numbers it's referencing, we can combine two or more tables with this. Now let's answer the first question.
The first question
SELECT title FROM movies WHERE year = 2008 LIMIT BY 20;
You can sort of understand what's going on in the above code without me having to explain. The
SELECT command is used to select data from a database and since the question said we should list the title of movies, that's what I did: select title from the movies table. But there's a constraint: we only need movies that were released in 2008 and that's where the next command comes in:
With that command, I'm saying: Give me the titles of the movies that were released in 2008. The last line of code isn't supposed to be there but I added it just so I can show you guys what the result looks like. The
LIMIT command does what it says: it limits the number of results according to the number provided and in this case, I provided 20. So, give me the top 20 movies that were released in the year 2008.
This is the output:
This is similar to the first one but the only difference is we are printing the year of birth of a person. It's the same logic as the first, but this time we are going to access the people's table because that's where we can find Emma stone and also her birth year
SELECT birth FROM people WHERE name = "Emma Stone";
As you can see, the logic is similar to the first but the
WHERE command is now matching a string instead of numbers. The above code simply means: select the birth year from the people's table but only give me the birth year that matches Emma Stone. If we didn't provide that last command, it will just output the birth year of everyone in the table, and if Emma stone isn't in the table, it won't output anything.
This is the output:
The solution to this is almost the same as the first one, the difference is that we are not only listing movies from a particular year but from that year and upwards. So, instead of writing a query to list out movies released in only 2018, we need to also include the ones released in 2019, 2020, and so on
SELECT title FROM movies WHERE year >= 2018 ORDER BY title LIMIT 20;
So, select all the movie titles from the movies table but only select the ones that were released in 2018 and above. The last thing we should do is to arrange all those movies in alphabetical order and that's where the
ORDER BY command comes in. We are ordering the result according to the title, so the movies will be listed starting with numbers, before A, then B, then C, and so on.
This is the output:
The solution to this is also very similar to some of the previous ones but this time, we are to write a query that will output the total number of movies that have a rating of 10, does that mean we have to count? Exactly!
SELECT COUNT(rating) FROM ratings WHERE rating = 10.0;
The count command provides the total number of whatever you pass into the bracket. So in this case, we are selecting all the ratings from the rating table but we only need those that are 10, and we don't need to print them out, we only need the total number and that's why we have to enclose the rating with the
This is the output:
That's a lot of specifications but don't let the word "chronological" confuse you, it simply means to list the movies according to the year they were released, in simpler terms: order the movies by year of release. One thing you should note is that unlike before where we have to list out just one thing (maybe a movie title or a year of birth or ratings of a movie), this time we are listing out two things: the title of the movies and also the year they were released.
SELECT title, year FROM movies WHERE title LIKE "Harry Potter%" ORDER BY year;
To list out two or more things from a table, we simply use the
SELECT command as usual and then separate each of the things we are selecting with a comma. So, select all titles and years from the movies table but... Now this is where it gets sought of confusing, haven't I been using the equality sign (=) with the
WHERE command? How come I am using something else now?
Well, if we are to use the equality sign for this query, the program will only look for movies that exactly match the title "Harry Potter" and as far as I can tell, there are none. All the Harry Potter movies have some other things attached to them, for example; Harry Potter and the cursed child or Harry Potter and the sorcerer's stone. The equality sign won't pick up any of these movie titles, it will only look for those that are exactly Harry Potter.
To fix that, we will use the
LIKE command which will look for titles that have the words "Harry Potter". But I believe you notice I added a percentage sign (%) at the end, but why? According to the specifications of this question, if a movie title begins with the words "Harry Potter", then it's considered an Harry Potter movie, which means we don't have to worry about what comes after that and that's why the % is there, it strips off every other words that comes after "Harry Potter".
So, we are listing out all titles and years from the movies table where the movie title starts with Harry Potter. Any movie title that starts with Harry Potter will be listed along with the year it is released, and they will be listed according to the year they were released.
This is the output:
That's all for now
I will talk about 2 more questions and their solutions in another article. I mentioned earlier that you can join two or more tables using a foreign key in one table that is referencing the unique id numbers of data from another table but I never showed how that works, so I will talk about that in the next article.
I will also show how you can nest queries (we call them subqueries). By doing this, you can pass the result of one query (the inner or subquery) to another query (the outer query). This can be very useful when you want to access the data in one table and use it in another table.
Thanks for reading
Edited with Canva