Elevate Your SQL Skills With These Commands

As a software engineer, I write code daily with Python or Ruby. In the past, it was Java, C# and C++. And this is how the majority of software developers spend their professional days.

With expanding the seniority of the engineers the area of responsibility also grows. It involves efficient communication with the stakeholders, cross-team collaboration, and covering adjacent domains.

The adjacent domains can be anything: machine learning models, marketing technologies, and various service integrations.

One of the most undervalued skills of every engineer is navigating through data. SQL queries become the best friend in this case.

Surprisingly, only a few developers feel comfortable with SQL queries, especially regarding something slightly more complex than SELECT and WHERE commands.

GROUP BY

This SQL statement helps make a different kind of summaries. Consider counting something by a specific value or applying a different mathematical function. The GROUP BY statement is usually applied with aggregation functions such as SUM(), AVG(), and COUNT().

The simplest example would be to count employees by department.

SELECT department, COUNT(*)
FROM employees
GROUP BY department;

Or we can count an average salary by the department.

SELECT department, AVG(salary)
FROM employees
GROUP BY department;

But let’s make a more complex example that is applied quite often.

But let’s make a more complex example that is applied quite often.

We want to get all employees grouped by year and month they joined the company. This involves some extra actions on the timestamp. We will apply the function EXTRACT. It fetches a specific part of the date, in our case, it is the month and the year.

SELECT 
EXTRACT(YEAR FROM start_date) AS start_year,
EXTRACT(MONTH FROM start_date) AS start_month,
COUNT(*) AS num_employees
FROM
employees
GROUP BY
EXTRACT(YEAR FROM start_date),
EXTRACT(MONTH FROM start_date)
ORDER BY
start_year, start_month;

This is what the result looks like.

The result of GROUP BY with EXTRACT

By combining aggregation functions with GROUP BY, we can build simple reports without external help.

JOIN

To append multiple tables together we use the statement JOIN. This statement allows us to expand the data overview. We can collect the missing data on the normalized tables.

The statement is simple once you try it.

Let’s get back to our table with employees. We have additionally the table Departments that contains the information about each department such as name, number of people, and location.

The table Employees and the table Departments have a relation 1-to-N. One employee can be a part of a single department and each department can have multiple employees.

Now, we want to have a single query to fetch the department’s name for each employee. We are going to use a JOIN statement to connect both tables.

SELECT e.*, d.name
FROM employees e
INNER JOIN departments d ON e.department_id = d.id;

We will receive extended results for every employee record with the department name.

The statement JOIN works similarly to the operations on sets such as union and intersection. You can decide which results of the join you want to have.

Source: https://learnsql.de/blog/wie-man-sql-joins-lernt/

Nested Query

What if we need to make multiple queries and pass the results of one query to another? SQL statements can become too complex and maintaining them could be hard.

Yet we can nest queries and chain the results. We can build a simple example where we want to extract all employees with the salary higher than average.

First, we calculate the average salary with the function AVG(). Later, we run a query to compare the salary with the average one.

SELECT *
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees
);

Nested queries allow us to make even more complex results. They are helpful for aggregations, conditional logic, and updating rows.

SELECT *
FROM employees
WHERE department_id IN (
SELECT id
FROM departments
GROUP BY name
HAVING COUNT(*) > 5
);

In the query above we are looking for all employees who work in the departments with more than five employees.

The nested query fetches all departments that satisfy the condition. Then the result is used in the outer query to select the employees from a different table.

A nested query is similar to chaining functions in programming. The result of the one query is the input for the other one.

However, nested queries can downgrade the performance because they require more lookup requests. In certain cases, they can be replaced with the JOIN statement. That can bring a better performance due to the optimized nature of the JOIN statement.


SQL as a language is especially powerful on big data. It enables us to generate insightful reports and make data-driven decisions. That’s why it goes beyond the responsibilities of the data scientists. Everybody with good practical knowledge of SQL will be a valuable addition to any team. That is a rising requirement in the modern world of data.


Looking for how to grow as a software developer?

Do you want to learn the essential principles of a successful engineer?
Are you curious about how to achieve the next level in your career?

My book Unlock the Code offers a comprehensive list of steps to boost
your professional life. Get your copy now!