...
Code Block | ||
---|---|---|
| ||
SELECT * FROM articles ORDER BY first_author DESC; |
...
ASC
is the default.
We can also sort on several fields at once. To truly be alphabetical, we might want to order by genus then species.
...
Code Block | ||
---|---|---|
| ||
SELECT authors, title FROM articles WHERE issns = '2067-2764|2247-6202' ORDER BY date ASC, first_author ASC; |
...
We can do this because sorting occurs earlier in the computational pipeline than field selection.
...
Clauses are written in a fixed order: SELECT
, FROM
, WHERE
, then ORDER BY
. It is possible to write a query as a single line, but for readability, we recommend to put each clause on its own line.
Aggregation
...
Overview
Exercises: 30 min
...
to understand how to aggregate records in SQL
COUNT
and GROUP BY
Aggregation allows us to combine results by grouping records based on value and calculating combined values in groups.
Let’s go to the articles table and find out how many entries there are. Using the wildcard simply counts the number of records (rows)
Code Block | ||
---|---|---|
| ||
SELECT COUNT(*)
FROM articles; |
We can also find out how many authors have participated in these articles.
Code Block | ||
---|---|---|
| ||
SELECT COUNT(*), SUM(author_count)
FROM articles; |
There are many other aggregate functions included in SQL including MAX
, MIN
, and AVG
.
...
Now, let’s see how many articles were published in each journal. We do this using a GROUP BY
clause
Code Block | ||
---|---|---|
| ||
SELECT issns, COUNT( * )
FROM articles
GROUP BY issns; |
GROUP BY
tells SQL what field or fields we want to use to aggregate the data. If we want to group by multiple fields, we give GROUP BY
a comma separated list.
...
The HAVING
keyword
In the previous lesson, we have seen the keywords WHERE
, allowing to filter the results according to some criteria. SQL offers a mechanism to filter the results based on aggregate functions, through the HAVING
keyword.
For example, we can adapt the last request we wrote to only return information about articles with a 10 or more published articles:
Code Block | ||
---|---|---|
| ||
SELECT issns, COUNT( * )
FROM articles
GROUP BY issns
HAVING COUNT( * ) >= 10; |
The HAVING
keyword works exactly like the WHERE
keyword, but uses aggregate functions instead of database fields.
If you use AS
in your query to rename a column, HAVING
can use this information to make the query more readable. For example, in the above query, we can call the COUNT(*)
by another name, like occurrences
. This can be written this way:
Code Block | ||
---|---|---|
| ||
SELECT issns, COUNT( * ) AS occurrences
FROM articles
GROUP BY issns
HAVING occurrences >= 10; |
...
Note that in both queries, HAVING
comes after GROUP BY
. One way to think about this is: the data are retrieved (SELECT
), can be filtered (WHERE
), then joined in groups (GROUP BY
); finally, we only select some of these groups (HAVING
).
...
Ordering aggregated results
...
We can order the results of our aggregation by a specific column, including the aggregated column. Let’s count the number of articles published in each journal, ordered by the count
Code Block | ||
---|---|---|
| ||
SELECT issns, COUNT( * )
FROM articles
GROUP BY issns
ORDER BY COUNT( * ) DESC; |
Saving queries for future use
...
Creating a view from a query requires to add CREATE VIEW viewname AS
before the query itself. For example, if we want to save the query giving the number of journals in a view, we can write
Code Block | ||
---|---|---|
| ||
CREATE VIEW journal_counts AS
SELECT issns, COUNT(*)
FROM articles
GROUP BY issns; |
Now, we will be able to access these results with a much shorter notation:
SELECTCode Block | ||
---|---|---|
| ||
SELECT * FROM journal_counts; |
...
Assuming we do not need this view anymore, we can remove it from the database almost as we would a table:
Code Block | ||
---|---|---|
| ||
DROP VIEW journal_counts; |
...
You can also add a view using Create View in the View menu and see the results in the Views tab just like a table.
Challenge
Key Points
SQL is ideal for aggregating database records
...
Exercise
- Write a query that returns the number of articles published in each journal on each month, sorted from most popular journal to the ones with least publications each month starting from the most recent records. Save this query as a
VIEW
.