SQL Syntactic Sugar: The COALESCE Function

0
(0)

If you work with SQL, you’ve likely encountered the COALESCE function. If you’re unfamiliar with it, now’s the time to learn it—it’s a very cool feature that will come in handy in solving a wide range of problems.

So, let’s start with the simplest thing – what is COALESCE?

Definition and description of the COALESCE function

COALESCE is a special function that evaluates each of its arguments in order and returns the value of the first argument that is not NULL.

Example:

SELECT COALESCE(NULL, NULL, 1, 2, NULL, 3)
# 1

This query will return 1 because the first two arguments are NULL, and the third argument takes a value different from NULL. Accordingly, the expression COALESCEwon’t even look further—the important thing is that the third argument isn’t NULL.

Other examples:

SELECT COALESCE(1, NULL, 2, NULL)
# 1

or

SELECT COALESCE(NULL, NULL)
# NULL

You probably get the gist of it. If it’s possible to return not NULL, the first non-empty value is returned. If it’s not possible, it’s returned NULL.

Here and below, we consider examples of using COALESCE in PostgreSQL, but it is worth noting that this function is used absolutely identically in all other popular DBMS: MS SQL Server, Oracle, and MySQL.

Let’s now dig a little deeper and see how this function works.

Analogy of COALESCE with CASE

We’ve figured out how the expression COALESCEworks, but can we write it from scratch? Let’s say we’re given the following task:

Without using COALESCE, write a SELECT query that is similar to SELECT COALESCE(expression1, expression2, expression3, expression4)

Can you solve this problem right away? Actually, the solution is quite simple: COALESCEIt’s just a convenient wrapper for the construct CASE. It’s convenient for processing values NULL. That’s why a function COALESCEin SQL can be considered a good example of syntactic sugar.

In programming, syntactic sugar is a term used to describe convenient language constructs that make code more readable and concise, but don’t add any new features. These are simplified ways of expressing what could otherwise be written more cumbersomely, without changing the semantics of the code.

Considering that handling gaps is a typical task when writing SQL queries, we decided to separate this construct into a separate function COALESCEso that poor developers wouldn’t have to suffer every time, and wouldn’t have to clutter up huge CASEexpressions.

Let’s solve this problem together. The answer will be:

SELECT
CASE
WHEN (expression1 IS NOT NULL) THEN expression1
WHEN (expression2 IS NOT NULL) THEN expression2
WHEN (expression3 IS NOT NULL) THEN expression3
ELSE expression4
END

That’s basically all the intricacies. You’ll agree—it’s not difficult. But it’s much more convenient to write COALESCEthan to write out such a lengthy construction with each time CASE.

Combat mission

Let’s use a real-life example to illustrate that when using COALESCE isn’t just a fun activity, but a real necessity.

You can say:

I’ve been writing SQL queries for so long and I’ve never used COALESCE!

Well, maybe you were just lucky… But there’s a high probability that some of your queries will start to return incorrect results if the tables start to fill with “ugly” data.

Let’s consider a simple problem. Let’s say we have an accounting spreadsheet containing information about an employee’s name and their monthly bonus. At the end of the year, we’re faced with a seemingly trivial task: calculating the total earnings of each employee.

+--------+----------+-----------+--------------+
| ID | name | bonus | date |
+--------+----------+-----------+--------------+
| 1 | Ivan | NULL | 2020-01-01 |
| 1 | Maya | 3500 | 2020-01-03 |
| 1 | Dora | 4500 | 2020-01-02 |
| 1 | Petr | 5750 | 2020-02-01 |
| 1 | Ivan | 3220 | 2020-03-05 |
| ... |
+--------+----------+-----------+--------------+

It so happened that an employee’s IvanJanuary bonus wasn’t loaded, and the table shows the value NULL. Accordingly, the query below will return the value 3220for this employee.

SELECT name, SUM(bonus)
FROM table_name
GROUP BY name

Not only is the result incorrect, but we also won’t even know it. And if there are many records, various calculations and aggregations will constantly lead to hidden errors (as proven in practice).

To avoid this situation, you can replace NULL values ​​with a very large number, for example. Then, we’ll end up with a multi-billion dollar bonus, and we’ll know something’s wrong.

Naturally, the techniques may vary depending on the task: this example illustrates the manual processing of a bonus table.

In our case, the solution can be written as follows:

SELECT name, SUM(COALESCE(bonus, 1000000))
FROM table_name
GROUP BY name

More SQL query examples using the COALESCE function

Replacing NULL with a label

Let’s consider this simple situation: a table usershas a field phone. If the phone number is missing, the value should be replaced with “Unknown.”

SELECT user_id,
COALESCE(phone, 'Unknown') AS phone_status
FROM users;

As a result, we get the following table:

+----------+--------------+
| user_id | phone_status |
+----------+--------------+
| 1 | +79112223344 |
| 2 | Unknown |
| 3 | +79211234567 |
| 4 | Unknown |
+----------+--------------+

Using COALESCE with multiple arguments

Let’s look at a table orderswith order data, which contains the order ID, the date it was created, the payment date, and the dispatch date.

+--------+----------------+----------------+----------------+
| id | order_date | payment_date | shipped_date |
+--------+----------------+----------------+----------------+
| 1 | 2024-01-01 | NULL | NULL |
| 2 | 2024-01-01 | 2024-01-02 | NULL |
| 3 | 2024-01-02 | 2024-01-02 | 2024-01-04 |
| 4 | 2024-01-02 | 2024-01-05 | NULL |
| 5 | 2024-01-03 | 2024-01-03 | 2024-03-05 |
| ... |
+--------+----------------+----------------+----------------+

We see that not all orders have been paid, and of those that have been paid, not all have been shipped. Let’s say we want to display the last activity date for each order. To do this quickly and elegantly, we can use the COALESCE function:

SELECT id,
COALESCE(shipped_date, payment_date, order_date) AS last_date
FROM orders;

As a result, we obtain the following table:

+--------+--------------+
| ID | last_date |
+--------+--------------+
| 1 | 2024-01-01 |
| 2 | 2024-01-02 |
| 3 | 2024-01-04 |
| 4 | 2024-01-05 |
| 5 | 2024-01-05 |
| ... |
+--------+--------------+

Using COALESCE when concatenating NULL and strings

Let’s consider a scenario: a table productscontains product names (product_name) and categories (category). Some products don’t have a name or category. We need to display the name and category in a single field, using the “-” separator:

SELECT
product_id,
COALESCE(product_name, 'Unnamed Product') || ' - ' || COALESCE(category, 'No Category') AS product_info
FROM products;

As a result, we get the following table:

+------------+-------------------------------+
| product_id | product_info |
+------------+-------------------------------+
| 1 | Laptop - Electronics |
| 2 | Table - Furniture |
| 3 | Sofa - No Category |
| 4 | Unnamed Product - Electronics |
+------------+-------------------------------+

Using COALESCE to replace NULL with a calculated value

Let’s imagine that the table orderscontains information about the cost and discount of orders (columns discount and price).

+----------+-------+----------+
| order_id | price | discount |
+----------+-------+----------+
| 1 | 1 000 | NULL |
| 2 | 2 000 | 100 |
| 3 | 1 500 | NULL |
| 4 | 4 000 | 500 |
+----------+-------+----------+

If the order discount is not specified, it should be calculated as 10% of the price. We use the following query:

SELECT
order_id,
price,
COALESCE(discount, price * 0.1) AS final_discount
FROM orders;

As a result, we get the following table:

+----------+-------+----------------+
| order_id | price | final_discount |
+----------+-------+----------------+
| 1 | 1 000 | 100 |
| 2 | 2 000 | 100 |
| 3 | 1 500 | 150 |
| 4 | 4 000 | 500 |
+----------+-------+----------------+

Conclusion

We’ve covered this feature  COALESCEand demonstrated its importance. Take a look at the queries you write every day—are there any errors? Could you, too, encounter situations where the calculation is incorrect, and you can’t even see it? To avoid such situations, use COALESCE!

How useful was this post?

Click on a star to rate it!

Average rating 0 / 5. Vote count: 0

No votes so far! Be the first to rate this post.

As you found this post useful...

Follow us on social media!

We are sorry that this post was not useful for you!

Let us improve this post!

Tell us how we can improve this post?


Explore More IT Terms


Share this term: Facebook X LinkedIn WhatsApp Email

Leave a Reply

Your email address will not be published. Required fields are marked *