GNIITSolution: SQL Subqueries

Sunday, 31 January 2016

SQL Subqueries

Introduction

Before we start discussing what a subquery is, we need to have a sample database schema.

Diagram was created with Vertabelo

What is a subquery?

A subquery is a SELECT statement nested inside another SQL statement, like in the example below:

select *
from product
where id in
           (select product_id
            from provider_offer
            where provider_id = 156);

Subqueries are classified as either correlated or nested (non-correlated). They are usually written to return:

a table:

select max(average.average_price)
from (select product_category, avg(price) as average_price
      from product
      group by product_category) average;

or a value:

select id
from purchase
where value > (select avg(value)
               from purchase);
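
To make the "returns a value" case concrete, here is a minimal sketch that runs the query above against an in-memory SQLite database through Python's sqlite3 module. The table definition and the three sample rows are assumptions made for illustration; only the column names id and value come from the query itself (the date column is borrowed from the later examples).

```python
import sqlite3

# Hypothetical sample data: only the column names come from the article's queries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table purchase (id integer primary key, date text, value real);
    insert into purchase (id, date, value) values
        (1, '2013-07-14', 10.0),
        (2, '2013-07-16', 20.0),
        (3, '2013-07-16', 60.0);
""")

# The inner query on its own yields a single value: the overall average.
average_value = conn.execute("select avg(value) from purchase").fetchone()[0]

# The outer query compares every row against that single value.
above_average_ids = [row[0] for row in conn.execute("""
    select id
    from purchase
    where value > (select avg(value)
                   from purchase)
    order by id
""")]

print(average_value, above_average_ids)
```

With this sample data the average is 30.0, so only the purchase with id 3 passes the filter.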

Nested Subqueries

Nested subqueries are subqueries that don't rely on the outer query. In other words, the inner query can be run on its own as a separate query, and its result is the same for every row of the outer query.

This type of subquery could be used almost everywhere, but it usually takes one of these formats:

SELECT ...
FROM ...
WHERE expression [NOT] IN (subquery)

For example:

select *
from client
where city in (select city
               from provider);

The example query returns all clients located in a city where at least one provider is also located.
The IN operator checks whether a value appears in the set of values returned by the subquery, and the outer query keeps the rows for which it does.
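
The behaviour of IN and NOT IN with a nested subquery can be sketched as follows, again using Python's sqlite3 module with an in-memory database. The id and name columns and all sample rows are hypothetical; the article's query only tells us that client and provider each have a city column.

```python
import sqlite3

# Hypothetical tables and rows: only the city columns appear in the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table client (id integer primary key, name text, city text);
    create table provider (id integer primary key, name text, city text);
    insert into client (id, name, city) values
        (1, 'Anna', 'Warsaw'), (2, 'Ben', 'Berlin'), (3, 'Cara', 'Paris');
    insert into provider (id, name, city) values
        (10, 'Acme', 'Warsaw'), (11, 'Bolt', 'Paris'), (12, 'Core', 'Paris');
""")

# A nested subquery can be run by itself: it does not reference the outer query.
provider_cities = sorted({row[0] for row in conn.execute("select city from provider")})

# Clients whose city appears in the subquery's result.
clients_in = [row[0] for row in conn.execute("""
    select name
    from client
    where city in (select city
                   from provider)
    order by id
""")]

# Clients whose city does not appear in the subquery's result.
clients_not_in = [row[0] for row in conn.execute("""
    select name
    from client
    where city not in (select city
                       from provider)
    order by id
""")]

print(provider_cities, clients_in, clients_not_in)
```

One caveat worth keeping in mind: if the subquery can return NULL, NOT IN matches no rows at all, so it is safest when the compared column is known to be non-null, as it is in this sample data.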

Correlated Subqueries

A subquery is correlated when the inner query depends on the outer query, that is, when the inner query references a column of the row currently being processed by the outer query. The inner query therefore cannot be run on its own; conceptually, it is re-evaluated for every row of the outer query. Readers who know programming concepts may compare it to a nested loop structure.

Let's start with a simple example. For each purchase in the outer query, the inner query calculates the average value of all purchases made on the same date and returns it. In the outer query's where clause, we keep only those purchases which have a value greater than the inner query's returned value.

Subquery Correlated in Where Clause

select id
from purchase p1
where date > '2013-07-15'
and value > (select avg(value)
             from purchase p2
             where p1.date = p2.date);

The query returns purchases made after 2013-07-15 whose value is greater than the average value of the purchases made on the same day.
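
The nested-loop comparison can be made explicit with a short sketch. It runs the correlated query above in an in-memory SQLite database and then repeats the same logic with two Python loops. The table definition, the use of ISO-formatted text dates, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data; dates are stored as ISO text so that they compare correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table purchase (id integer primary key, date text, value real);
    insert into purchase (id, date, value) values
        (1, '2013-07-10', 100.0),
        (2, '2013-07-16', 10.0),
        (3, '2013-07-16', 30.0),
        (4, '2013-07-17', 50.0),
        (5, '2013-07-17', 50.0),
        (6, '2013-07-18', 5.0),
        (7, '2013-07-18', 7.0),
        (8, '2013-07-18', 9.0);
""")

# The correlated subquery: the inner query refers to p1, the current outer row.
sql_ids = [row[0] for row in conn.execute("""
    select id
    from purchase p1
    where date > '2013-07-15'
    and value > (select avg(value)
                 from purchase p2
                 where p1.date = p2.date)
    order by id
""")]

# The same logic as a nested loop: the inner pass is repeated for each outer row.
rows = conn.execute("select id, date, value from purchase order by id").fetchall()
loop_ids = []
for p1_id, p1_date, p1_value in rows:
    if p1_date <= '2013-07-15':
        continue
    same_day_values = [value for _, date, value in rows if date == p1_date]
    if p1_value > sum(same_day_values) / len(same_day_values):
        loop_ids.append(p1_id)

print(sql_ids, loop_ids)
```

The loop only illustrates the semantics. A real database engine is free to evaluate the correlated subquery in a smarter way, for example by caching the average per date.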

The same result can be obtained by joining the table to itself instead of using a subquery:

select p1.id
from purchase p1
join purchase p2 on p1.date = p2.date
where p1.date > '2013-07-15'
-- p1.value is listed in group by so that it can be used in having
group by p1.id, p1.value
having p1.value > avg(p2.value);

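This example can also be written as a select statement with a subquery in the from clause. Strictly speaking, this subquery is not correlated, because it does not reference the outer query; it is a derived table that is computed once.

The subquery returns a table that contains the average purchase value for each day after 2013-07-15. We join this result with the purchase table on the date column and keep the purchases whose value is greater than the average for their day.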

select id
from purchase, (select date, avg(value) as average_value
                from purchase
                where date > '2013-07-15'
                group by date) average
where purchase.date = average.date
and purchase.date > '2013-07-15'
and purchase.value > average.average_value;
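
As a rough check that the from-clause form and the correlated form agree, the sketch below runs both against the same in-memory SQLite table and also prints the intermediate table of daily averages. The table definition and sample rows are assumptions, and the from-clause query is written with an explicit join, which is equivalent to the comma syntax used above.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the article's queries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table purchase (id integer primary key, date text, value real);
    insert into purchase (id, date, value) values
        (1, '2013-07-10', 100.0),
        (2, '2013-07-16', 10.0),
        (3, '2013-07-16', 30.0),
        (4, '2013-07-17', 50.0);
""")

# The subquery on its own: one row per day after the cutoff date.
daily_averages = conn.execute("""
    select date, avg(value) as average_value
    from purchase
    where date > '2013-07-15'
    group by date
    order by date
""").fetchall()

# Subquery in the from clause, joined back to purchase on the date column.
derived_ids = [row[0] for row in conn.execute("""
    select purchase.id
    from purchase
    join (select date, avg(value) as average_value
          from purchase
          where date > '2013-07-15'
          group by date) average
      on purchase.date = average.date
    where purchase.value > average.average_value
    order by purchase.id
""")]

# The correlated form from earlier in the article, for comparison.
correlated_ids = [row[0] for row in conn.execute("""
    select id
    from purchase p1
    where date > '2013-07-15'
    and value > (select avg(value)
                 from purchase p2
                 where p1.date = p2.date)
    order by id
""")]

print(daily_averages, derived_ids, correlated_ids)
```

The purchase with id 4 is excluded by both forms because its value equals, rather than exceeds, the average for its day.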

This kind of subquery is often best avoided on large tables: some database engines materialize the result of the subquery as a temporary table that has no indexes, so the join against it cannot use one.

When a subquery is used, the query optimizer may have to perform additional steps before the results of the subquery can be used. If a query that contains a subquery can be rewritten as a join, the join is usually the better choice, because joins generally give the query optimizer more freedom to retrieve the data efficiently.

You can find an extended article as well as more examples here.
