SQL Queries

📄 Table of Contents

▬▬▬▬▬▬▬▬▬▬▬▬▬▬ ✦ ✦ ✦ ▬▬▬▬▬▬▬▬▬▬▬▬▬▬

SQL: Understanding Joins

SQL helps us handle data with relational databases. When it comes to having different tables in a database, we need to find a way to connect the tables’ data, and this is where joins come in to save the day!

There a couple of different types of joins:

  • (Inner) join
  • Left (OUTER) join
  • Right (OUTER) join
  • Full (OUTER) join

Note: In some databases RIGHT JOIN is called RIGHT OUTER JOIN. and so on.

See the image below to give you a visualization of those different joins:

Different types of JOINS

Let’s explore the relationship between these two tables using different types of JOINS available:

◉ Inner Join

An inner join will allow us to return data records that have a matching value in both tables. In mathematical terms, an inner join is the intersection of the two tables.

We can create an INNER JOIN between the two tables via the shared ArtistId value, see the syntax of using inner join to connect the tables data here:

**using INNER JOIN keyword is the same as using the JOIN keyword

◉ Left (OUTER) Join

A left join keyword will return all data records from the left table and the data records that meet a condition from the right table.

◉ Right (OUTER) Join

The right join is actually quite similar to the left join! With the right join, we get the flip side of results.

The RIGHT JOIN will return all rows from the right table and all matched rows from the left table.

◉ Full (OUTER) Join

A full join will go ahead and return all the rows from the joined tables whether they are matched or not.

NOTE:- SQLite does NOT support the use of a full join sometimes referred to as a full outer join

Bonus: Cross Join

Not mentioned earlier, but something worth knowing in your backpocket is the cross join!

A cross join creates a product of rows from the two tables, without a join condition. Syntax for a cross join:

*selection of first 6 output results from that query about

You can see there are some duplicate data results, a use case for a cross join would be to create an initial data set that would allow for further queries or analysis.

Time/Date functions in SQL

◉ SQL Server ROW_NUMBER() function

source: https://www.sqlservertutorial.net/sql-server-window-functions/sql-server-row_number-function/

The ROW_NUMBER() is a window function that assigns a sequential integer to each row within the partition of a result set. The row number starts with 1 for the first row in each partition.

The following shows the syntax of the ROW_NUMBER() function:

row_number() OVER (partition by dts.aDate , nomligne ORDER BY entrydate desc) as rownumber

Using SQL Server ROW_NUMBER() over partitions example

The following example uses the ROW_NUMBER() function to assign a sequential integer to each customer. It resets the number when the city changes:

SELECT 
first_name,
last_name,
city,
ROW_NUMBER() OVER (
PARTITION BY city
ORDER BY first_name
) row_num
FROM
sales.customers
ORDER BY
city;
Code language: SQL (Structured Query Language) (sql)

The following picture shows the partial output:

In this example, we used the PARTITION BY clause to divide the customers into partitions by city. The row number was reinitialized when the city changed.

use Case of using ROW_NUMBER()

One of the excellent use cases of using ROW_NUMBER() is to delete all duplicated rows except unique row

◉ CAST Function

A CAST function converts the selected data type into another. Pretty straight forward. It changes one type into your preferred type. The syntax is below.

CAST(expression AS datatype)

Below is an example of how it can be applied to date and time.

SELECT
NOW(),
CAST(NOW() AS TIMESTAMP),
CAST(NOW() AS DATE),
CAST(NOW() AS TIME),
CURRENT_DATE,
CURRENT_TIME

In this query, we expect to have 6 outputs. If you are unfamiliar with NOW(), CURRENT_DATE, CURRENT_TIME they are SQL functions that retrieve the current time or date. Below are all the outputs from the query in order. (note — you will get different numbers since the functions call the exact time or date)

  • 2020–03–28 23:18:20.261879+00:00 # returns timestamp plus timezone
  • 2020–03–28 23:18:20.261879 # returns only timestamp (sans timezone)
  • 2020–03–28
  • 23:18:20.261879
  • 2020–03–28
  • 23:18:20.261879+00:00

We can see how the CAST function works with time, and the last two outputs using CURRENT_DATE and CURRENT_TIME are just there for you to see the comparison of results.

More Examples — Not with timestamps

SQL also allows CAST() functions to be used with types that are not timestamped.

SELECT 
CAST(1.34 AS INT),
CAST(1 AS BOOLEAN),
CAST(2.65 AS DEC(3,0))

The results from this query are,

  • 1 → Since an integer can’t have decimals, it will round to the nearest integer value
  • true → 1 as a boolean is true, and 0 is false
  • 3 → using the DEC() we can also do the reverse of our first integer CAST.

for more read @ https://towardsdatascience.com/date-time-functions-in-sql-1885e2cbdc1a

◉ EXTRACT()

Next, we can look at pulling a specific format out of a timestamp. The goal is to extract a portion out of a timestamp. For example, if we want just the month from the date 12/10/2018, we would get December (12).

Let’s take a look at EXTRACT syntax

EXTRACT(part FROM date)

We state the type of extraction we want as part and then the source to be extracted date. EXTRACTis an import tool in time series data analysis. It helps you isolate groups in your timestamps to aggregate data based on precise timings.

Let’s assume we run NOW() and our timestamp is 2020–03–29 00:27:51.677318+00:00, we can use EXTRACTto get the following.

SELECT
EXTRACT(MINUTE FROM NOW()) AS MINUTE,
EXTRACT(HOUR FROM NOW()) AS HOUR,
EXTRACT(DAY FROM NOW()) AS DAY,
EXTRACT(WEEK FROM NOW()) AS WEEK,
EXTRACT(MONTH FROM NOW()) AS MONTH,
EXTRACT(YEAR FROM NOW()) AS YEAR,
EXTRACT(DOW FROM NOW()) AS DAY_OF_WEEK,
EXTRACT(DOY FROM NOW()) AS DAY_OF_YEAR,
EXTRACT(QUARTER FROM NOW()) AS QUARTER,
EXTRACT(TIMEZONE FROM NOW()) AS TIMEZONE
Output

We see that we can go into great detail in how we want to extract information out of our timestamps.

Note — DOW — The day of the week is Sunday(0) to Saturday(6).

◉ DATE_TRUNC

Truncate — to shorten by or as if by cutting off

The purpose of truncating a date in SQL is to return an interval at an absolute precision. The precision values are a subset of the field identifiers that can be used with the EXTRACT. DATE_TRUNC will return an interval or timestamp rather than a number.

The syntax for DATE_TRUNC, time_column is the database column that contains the timestamp you'd like to round, and ‘[interval]’ dictates your desired precision level.

DATE_TRUNC(‘[interval]’, time_column)

Let’s assume our NOW() returns the same 2020–03–29 00:27:51.677318+00:00, we can use date_part to get the following.

SELECT
CAST(DATE_TRUNC('DAY', NOW()) AS DATE) AS DAY,
CAST(DATE_TRUNC('WEEK', NOW()) AS DATE) AS WEEK,
CAST(DATE_TRUNC('MONTH', NOW()) AS DATE) AS MONTH,
CAST(DATE_TRUNC('YEAR', NOW()) AS DATE) AS YEAR
Output

Think of using DATE_TRUNC as getting the interval of where it is currently, and each level of an interval is how the date gets trimmed. Note — we did not need to use CAST in this example. I used it to ensure the format was clean for analysis.

We can use DATE_TRUNC on the rental car scenario from earlier, and try to find which day in the year regardless of time is more popular for rentals.

SELECT
CAST(DATE_TRUNC('DAY', RENTAL_DATE) AS DATE) AS RENTAL_DAY,
COUNT(*) AS RENTALS
FROM RENTAL
GROUP BY
RENTAL_DAY
ORDER BY RENTALS DESC
Output

generate_series in Postgres

generate_series as the name implies allows you to generate a set of data starting at some point, ending at another point, and optionally set the incrementing value. generate_series works on two datatypes:

  • integers
  • timestamps

Let’s get started with the most basic example:

SELECT * 
FROM generate_series(1, 5);
generate_series
-----------------
1
2
3
4
5
(5 rows)

So generate_series pretty straight-forward, but what interesting ways can it be used?

Generating fake data

By putting our generate_series inside a CTE we can easily now generate a set of numbers and then perform some operation against each value. If we want to generate some fake number we can use random() which generates a random number between 0.0 and 1.0.

WITH numbers AS (
SELECT *
FROM generate_series(1, 5)
)
SELECT generate_series * random()
FROM numbers;
?column?
-------------------
0.87764338683337
0.345125129446387
2.10317854676396
0.937898803502321
1.72822773223743
(5 rows)

for more read @ https://www.citusdata.com/blog/2018/03/14/fun-with-sql-generate-sql/

SQL aggregate function like COUNT, AVG, and SUM have something in common: they all aggregate across the entire table. But what if you want to aggregate only part of a table? For example, you might want to count the number of entries for each year.

In situations like this, you’d need to use the GROUP BY clause. GROUP BY allows you to separate data into groups, which can be aggregated independently of one another.

The ORDER BY clause allows you to reorder your results based on the data in one or more columns.

Ascending order, and it’s SQL’s default. If you order a numerical column in ascending order, it will start with smaller (or most negative) numbers, with each successive row having a higher numerical value than the previous.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Anil Kumar

Anil Kumar

Experience with Front-end Technologies and MERN / MEAN Stack. Working on all Major UI Frameworks like React, Angular.