Intermediate SQL
Intermediate SQL was another course I believe I’ve been through. If I haven’t, I’ve definitely worked through chapters 1 & 2 of this course as all of that was review. Intermediate SQL has 4 chapters: Selecting Data, Filtering Records, Aggregate Functions, and Sorting and Grouping. This is the first SQL course where you get a real idea of what SQL can actually do.
Selecting Data focused on the COUNT( ), COUNT(DISTINCT ) functions and debugging errors. Debugging was all about catching misspellings, missing commas, and correct formatting. The COUNT( ) function will count the number of values in a certain data set. And COUNT(DISTINCT ) will count the unique values in a data set.
Example: You want to find how many movies were released in a specific year. That code looks like:
SELECT release_year, COUNT(title)
FROM films;
This will return a table of the years listed in one column and the number of movies released in each respective year in the second column.
Example 2: You want to find out how many different countries movies were filmed in any given year. That would look like this:
SELECT release_year, COUNT(DISTINCT(country))
FROM films;
This will return a table of the years in one column and the number of different countries movies were filmed in. For example: in 1999 movies were filmed in the USA, Canada, and Mexico. The first column would be 1999 and the second column would return the number 3 since there were 3 different countries filmed in even though 100 movies could have been filmed in these countries, we were only curious in how many countries. Not the best example but this gets the point across…I think.
For Filtering Records, we got into WHERE, AND, OR, BETWEEN, LIKE, NOT LIKE, IN clauses. These are all different ways to filter the data. You want to find out what movies were filmed between 1999 and 2005, you want to return a list of movie titles that start with a certain letter, etc.
Example:
SELECT release_year, title
FROM films
WHERE release_year = 2005;
This will return the movies released in 2005. This definitely gets more in detail than that simple example but this was the general theme.
Aggregate Functions was next. Quickly, this means AVG( ), SUM( ), MIN( ), MAX( ), COUNT( ) functions. A few nuances I noted:
MIN, MAX, SUM don’t have to be numeric values
If you are trying to pull the most recent year of a dataset for example, you would use the MAX( ) function
Simple things for this would be finding the average budget for movies:
SELECT release_year, AVG(budget)
FROM films;
This returns the release years and the average budget for those years. You can insert any aggregate function there and you would return a value.
Aggregate functions also went over the ROUND( ) function. This is where you can round your results to any number of decimals or values. Example:
SELECT release_year, ROUND(AVG(budget), 2)
FROM films;
That would round the budgets of the release years to 2 decimals. You can also round to negative values which would just be round on the other side of the decimal. ROUND(AVG(budget), -2) would change the result from $111.11 to $100 (I used very simple numbers here just for the sake of keeping it simple). Instead of the ‘,2’ round to the second decimal, the ‘,-2’ rounded on the left side of the decimal to the nearest 100, in this case $100.
Sorting and Grouping went over the ORDER BY and GROUP BY clauses. ‘ORDER BY budget ASC/DESC’ in a call would order the budgets from smallest to largest or largest to smallest. One thing to call out here - you don’t have to put your ORDER BY field in your SELECT field. So, you can order by budgets but the result may only return the titles of the movies event though they are in order of budgets. Unsure why someone would want to do that, but that is an option. Also, if you ORDER BY 2 fields, think of the second field you call as a tie-breaker.
The GROUP BY clause will return groups of values. For example, you call the number of movies in each rating category (G, PG, PG-13, R, etc.). One thing to note - you will get an error if you don’t use your SELECT statements in the GROUP BY statements unless a SELECT state is an aggregate like COUNT( ). Example:
SELECT rating, COUNT(title)
FROM films
GROUP BY rating;
This will return a table of each rating and the number of movies in each rating.
Filtering the GROUP BY statements is also a bit different than other filtering methods like WHERE statements. For GROUP BY statements, you will need to use a ‘HAVING’ statement. This would look something like this in a line of code: ‘HAVING AVG(budget) > 1000000’. This will filter the group to only return budgets of greater than $1,000,000.
Onto the next…