Product Manager Fellowship
SQL Handybook
SQL
SQL, or Structured Query Language, is a way we talk to databases.
It's like a tool that helps us ask for specific information and put
together detailed reports. It's used everywhere in technology when
we're dealing with data. For Product Managers, SQL skills are a
must-have as they will regularly be working with data.
SELECT
Fetch the id and name columns from the product table:
SELECT id, name
FROM product;
Fetch names of products with prices above 15:
SELECT name
FROM product
WHERE price > 15;
Fetch names of products with prices between 50 and 150:
SELECT name
FROM product
WHERE price BETWEEN 50 AND 150;
Fetch names of products that are not watches:
SELECT name
FROM product
WHERE name != 'watch';
Fetch names of products that start with 'P' or end with 's':
SELECT name
FROM product
WHERE name LIKE 'P%' OR name LIKE '%s';
Fetch names of products that start with any letter followed by 'rain' (like 'train' or 'grain'):
SELECT name
FROM product
WHERE name LIKE '_rain';
Fetch names of products with non-null prices:
SELECT name
FROM product
WHERE price IS NOT NULL;
GROUP BY
AGGREGATE FUNCTIONS
Count the number of products:
SELECT COUNT(*)
FROM product;
Count the number of products with non-null prices:
SELECT COUNT(price)
FROM product;
Count the number of unique category values:
SELECT COUNT(DISTINCT category)
FROM product;
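The three COUNT variants above can return different numbers on the same table. The sketch below is one way to see that, assuming Python's built-in sqlite3 module and a small made-up product table (the rows and column types are hypothetical, not from this handbook).

```python
import sqlite3

# Hypothetical sample data; the table layout is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER, name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        (1, "Pot", "Kitchen", 30.0),
        (2, "Knife", "Kitchen", 12.5),
        (3, "Doll", "Toys", None),  # price not set, so COUNT(price) skips this row
        (4, "Car", "Toys", 8.0),
    ],
)

all_rows, priced_rows, categories = conn.execute(
    "SELECT COUNT(*), COUNT(price), COUNT(DISTINCT category) FROM product"
).fetchone()

print(all_rows, priced_rows, categories)  # 4 3 2
```

COUNT(*) counts every row, COUNT(price) skips the row whose price is NULL, and COUNT(DISTINCT category) counts each category once.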
Get the lowest and the highest product price:
SELECT MIN(price), MAX(price)
FROM product;
Find the total price of products for each category:
SELECT category, SUM(price)
FROM product
GROUP BY category;
Find the average price of products for each category whose average is above 3.0:
SELECT category, AVG(price)
FROM product
GROUP BY category
HAVING AVG(price) > 3.0;
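To see what HAVING adds on top of GROUP BY, it may help to run the same grouping with and without the filter. This is a minimal sketch using Python's built-in sqlite3 module; the four products and their prices are invented for the example.

```python
import sqlite3

# Hypothetical rows: Kitchen averages 4.0, Toys averages 2.0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [
        ("Pot", "Kitchen", 6.0),
        ("Spoon", "Kitchen", 2.0),
        ("Doll", "Toys", 3.0),
        ("Ball", "Toys", 1.0),
    ],
)

# Every category with its average price.
all_groups = conn.execute(
    "SELECT category, AVG(price) FROM product GROUP BY category ORDER BY category"
).fetchall()

# HAVING filters the groups after the averages are computed.
filtered = conn.execute(
    "SELECT category, AVG(price) FROM product "
    "GROUP BY category HAVING AVG(price) > 3.0 ORDER BY category"
).fetchall()

print(all_groups)  # [('Kitchen', 4.0), ('Toys', 2.0)]
print(filtered)    # [('Kitchen', 4.0)]
```

WHERE filters individual rows before grouping, while HAVING filters whole groups afterwards, which is why the condition on AVG(price) has to go in HAVING.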
ORDER BY
Fetch product names sorted by the price column in the default ASCending order:
SELECT name
FROM product
ORDER BY price [ASC];
Fetch product names sorted by the price column in DESCending order:
SELECT name
FROM product
ORDER BY price DESC;
COMPUTATIONS
Use +, -, *, / to do basic math. To get the number of seconds in a week:
SELECT 60 * 60 * 24 * 7;
-- result: 604800
ROUNDING NUMBERS
Round a number to its nearest integer:
SELECT ROUND(1234.56789);
-- result: 1235
Round a number to two decimal places:
SELECT ROUND(AVG(price), 2)
FROM product
WHERE category_id = 21;
-- result: 124.56
TROUBLESHOOTING
INTEGER DIVISION
In PostgreSQL and SQL Server, the / operator performs integer division for integer arguments. If you do not see the number of decimal places you expect, it is because you are dividing one integer by another. Cast one of them to decimal:
123 / 2 -- result: 61
CAST(123 AS decimal) / 2 -- result: 61.5
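The same pitfall can be reproduced locally. The sketch below uses Python's built-in sqlite3 module as a stand-in for the databases named above; SQLite also truncates when both operands are integers. One assumption to note: the cast targets REAL, because SQLite has no true decimal type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integer / integer truncates; casting one operand to REAL keeps the fraction.
int_result, real_result = conn.execute(
    "SELECT 123 / 2, CAST(123 AS REAL) / 2"
).fetchone()

print(int_result)   # 61
print(real_result)  # 61.5
```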
DIVISION BY 0
To avoid this error, make sure the denominator is not 0. You may use the NULLIF() function to replace 0 with a NULL, which results in a NULL for the entire expression:
count / NULLIF(count_all, 0)
JOIN
JOIN is used to fetch data from multiple tables. To get the names of products purchased in each order, use:
SELECT orders.order_date, product.name AS product, amount
FROM orders
JOIN product
  ON product.id = orders.product_id;
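A small end-to-end run can make the matching rule concrete. This sketch uses Python's built-in sqlite3 module; the two tables follow the column names in the query above, but their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (order_date TEXT, product_id INTEGER, amount INTEGER)")

# Hypothetical data: the Mixer exists as a product but was never ordered.
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(1, "Laptop"), (2, "Mouse"), (3, "Mixer")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("2023-07-22", 1, 2), ("2023-07-23", 2, 3), ("2023-07-24", 1, 1)],
)

rows = conn.execute("""
    SELECT orders.order_date, product.name AS product, amount
    FROM orders
    JOIN product
      ON product.id = orders.product_id
    ORDER BY orders.order_date
""").fetchall()

for row in rows:
    print(row)
# ('2023-07-22', 'Laptop', 2)
# ('2023-07-23', 'Mouse', 3)
# ('2023-07-24', 'Laptop', 1)
```

Each order row is paired with the product whose id equals its product_id. The Mixer never appears because a plain JOIN keeps only rows that have a match in both tables.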
INSERT
To insert data into a table, use the INSERT command:
INSERT INTO category
VALUES
  (1, 'Home and Kitchen'),
  (2, 'Clothing and Apparel');
You may specify the columns to which the data is added. The remaining columns are filled with predefined default values or NULLs.
INSERT INTO category (name)
VALUES ('Electronics');
UPDATE
To update the data in a table, use the UPDATE command:
UPDATE category
SET
  is_active = true,
  name = 'Office'
WHERE name = 'Ofice';
DELETE
To delete data from a table, use the DELETE command:
DELETE FROM category
WHERE name IS NULL;
DATE AND TIME
There are 3 main time-related types: date, time, and timestamp.
Time is expressed using a 24-hour clock, and it can be as vague as just hour and minutes (e.g., 15:30 – 3:30 p.m.) or as precise as microseconds and time zone (as shown below):
2021-12-31 14:39:53.662522-05
(date: 2021-12-31; time: 14:39:53.662522-05; timestamp: the whole value)
YYYY-mm-dd HH:MM:SS.ssssss±TZ
14:39:53.662522-05 is almost 2:40 p.m. CDT (e.g., in Chicago; in UTC it'd be 7:40 p.m.). The letters in the above example represent:
In the date part:
YYYY – the 4-digit year.
mm – the zero-padded month (01 for January through 12 for December).
dd – the zero-padded day.
In the time part:
HH – the zero-padded hour in a 24-hour clock.
MM – the minutes.
SS – the seconds. Omissible.
ssssss – the smaller parts of a second; they can be expressed using 1 to 6 digits. Omissible.
±TZ – the timezone. It must start with either + or -, and use two digits relative to UTC. Omissible.
CURRENT DATE AND TIME
Find out what time it is:
SELECT CURRENT_TIME;
Get today's date:
SELECT CURRENT_DATE;
In SQL Server:
SELECT GETDATE();
Get the timestamp with the current date and time:
SELECT CURRENT_TIMESTAMP;
CREATING DATE AND TIME VALUES
To create a date, time, or timestamp, write the value as a string and cast it to the proper type.
SELECT CAST('2021-12-31' AS date);
SELECT CAST('15:31' AS time);
SELECT CAST('2021-12-31 23:59:29+02' AS timestamp);
SELECT CAST('15:31.124769' AS time);
Be careful with the last example – it is interpreted as 15 minutes 31 seconds and 124769 microseconds! It is always a good idea to write 00 for hours explicitly: '00:15:31.124769'.
SORTING CHRONOLOGICALLY
Using ORDER BY on date and time columns sorts rows chronologically from the oldest to the most recent:
SELECT order_date, product, quantity
FROM sales
ORDER BY order_date;
order_date  product   quantity
2023-07-22  Laptop    2
2023-07-23  Mouse     3
2023-07-24  Sneakers  10
2023-07-24  Jeans     3
2023-07-25  Mixer     2
Use the DESCending order to sort from the most recent to the oldest:
SELECT order_date, product, quantity
FROM sales
ORDER BY order_date DESC;
COMPARING DATE AND TIME VALUES
You may use the comparison operators <, <=, >, >=, and = to compare date and time values. Earlier dates are less than later ones. For example, 2023-07-05 is "less" than 2023-08-05.
Find sales made in July 2023:
SELECT order_date, product_name, quantity
FROM sales
WHERE order_date >= '2023-07-01'
  AND order_date < '2023-08-01';
Find customers who registered in July 2023:
SELECT registration_timestamp, email
FROM customer
WHERE registration_timestamp >= '2023-07-01'
  AND registration_timestamp < '2023-08-01';
Note: Pay attention to the end date in the query. The upper bound '2023-08-01' is not included in the range. The timestamp '2023-08-01' is actually the timestamp '2023-08-01 00:00:00.0'. The comparison operator < is used to ensure the selection is made for all timestamps less than '2023-08-01 00:00:00.0', that is, all timestamps in July 2023, even those close to the midnight of August 1, 2023.
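The effect of the exclusive upper bound is easiest to see next to a tempting alternative, BETWEEN '2023-07-01' AND '2023-07-31'. The sketch below uses Python's built-in sqlite3 module with four invented customers. It assumes timestamps stored as ISO-8601 text, which is how SQLite usually holds them; databases with a native timestamp type treat '2023-07-31' as midnight at the start of that day, so the same row is missed there as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (email TEXT, registration_timestamp TEXT)")

# Hypothetical registrations around the edges of July 2023.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [
        ("a@example.com", "2023-06-30 23:59:59"),
        ("b@example.com", "2023-07-01 00:00:00"),
        ("c@example.com", "2023-07-31 23:59:59"),
        ("d@example.com", "2023-08-01 00:00:00"),
    ],
)

# Inclusive lower bound, exclusive upper bound: covers all of July.
half_open = [row[0] for row in conn.execute("""
    SELECT email FROM customer
    WHERE registration_timestamp >= '2023-07-01'
      AND registration_timestamp < '2023-08-01'
    ORDER BY email
""")]

# BETWEEN with the last day of the month stops at the start of July 31.
between = [row[0] for row in conn.execute("""
    SELECT email FROM customer
    WHERE registration_timestamp BETWEEN '2023-07-01' AND '2023-07-31'
    ORDER BY email
""")]

print(half_open)  # ['b@example.com', 'c@example.com']
print(between)    # ['b@example.com']
```

The customer who registered one second before midnight on July 31 is caught only by the version with the exclusive upper bound.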
EXTRACTING PARTS OF DATES
The standard SQL syntax to get a part of a date is
SELECT EXTRACT(YEAR FROM order_date)
FROM sales;
You may extract the following fields: YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND.
The standard syntax does not work in SQL Server. Use the DATEPART(part, date) function instead.
SELECT DATEPART(YEAR, order_date)
FROM sales;
GROUPING BY YEAR AND MONTH
Find the count of sales by month:
SELECT
  EXTRACT(YEAR FROM order_date) AS year,
  EXTRACT(MONTH FROM order_date) AS month,
  COUNT(*) AS count
FROM sales
GROUP BY year, month
ORDER BY year, month;
year  month  count
2022  8      51
2022  9      58
2022  10     62
2022  11     76
2022  12     85
2023  1      71
2023  2      69
Note that you must group by both the year and the month. EXTRACT(MONTH FROM order_date) only extracts the month number (1, 2, ..., 12). To distinguish between months from different years, you must also group by year.
CASE WHEN
CASE WHEN lets you pass conditions (as in the WHERE clause), evaluates them in order, then returns the value for the first condition met.
SELECT
  name,
  CASE
    WHEN price > 150 THEN 'Premium'
    WHEN price > 100 THEN 'Mid-range'
    ELSE 'Standard'
  END AS price_category
FROM product;
Here, all products with prices above 150 get the Premium label, those with prices above 100 and up to 150 get the Mid-range label, and the rest receive the Standard label.
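Because the conditions are checked in order, prices that sit exactly on a threshold are worth testing. The sketch below runs the query above through Python's built-in sqlite3 module on five made-up products, two of which are priced exactly at 150 and 100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")

# Hypothetical products, including the boundary prices 150 and 100.
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("Watch", 200), ("Bag", 150), ("Belt", 120), ("Cap", 100), ("Sock", 5)],
)

labels = dict(conn.execute("""
    SELECT
      name,
      CASE
        WHEN price > 150 THEN 'Premium'
        WHEN price > 100 THEN 'Mid-range'
        ELSE 'Standard'
      END AS price_category
    FROM product
""").fetchall())

for name, label in labels.items():
    print(name, label)
# Watch Premium
# Bag Mid-range
# Belt Mid-range
# Cap Standard
# Sock Standard
```

A price of exactly 150 fails the first test (it is not above 150) and falls through to Mid-range; a price of exactly 100 fails both tests and ends up as Standard.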
CASE WHEN and GROUP BY
You may combine CASE WHEN and GROUP BY to compute object statistics in the categories you define.
SELECT
  CASE
    WHEN price > 150 THEN 'Premium'
    WHEN price > 100 THEN 'Mid-range'
    ELSE 'Standard'
  END AS price_category,
  COUNT(*) AS products
FROM product
GROUP BY price_category;
Count the number of large orders for each customer using CASE WHEN and SUM():
SELECT
  customer_id,
  SUM(CASE WHEN quantity > 10 THEN 1 ELSE 0 END) AS large_orders
FROM sales
GROUP BY customer_id;
... or using CASE WHEN and COUNT():
SELECT
  customer_id,
  COUNT(CASE WHEN quantity > 10 THEN order_id END) AS large_orders
FROM sales
GROUP BY customer_id;
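Both formulations should give the same counts. As a rough check, the sketch below computes them side by side with Python's built-in sqlite3 module on a few invented sales rows (the order_id, customer_id, and quantity values are hypothetical).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id INTEGER, customer_id INTEGER, quantity INTEGER)")

# Hypothetical orders: customer 1 has two large orders, customer 2 none, customer 3 one.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 1, 12), (2, 1, 3), (3, 1, 20), (4, 2, 5), (5, 3, 11)],
)

rows = conn.execute("""
    SELECT
      customer_id,
      SUM(CASE WHEN quantity > 10 THEN 1 ELSE 0 END) AS large_orders_sum,
      COUNT(CASE WHEN quantity > 10 THEN order_id END) AS large_orders_count
    FROM sales
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()

print(rows)  # [(1, 2, 2), (2, 0, 0), (3, 1, 1)]
```

The SUM version adds 1 for every large order and 0 otherwise. The COUNT version works because a CASE without ELSE returns NULL for small orders, and COUNT ignores NULLs.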
WINDOW FUNCTIONS
Window functions compute their results based on a sliding window frame, a set of rows related to the current row. Unlike aggregate functions, window functions do not collapse rows.
COMPUTING THE PERCENT OF TOTAL WITHIN A GROUP
SELECT
  product, brand, profit,
  (100.0 * profit / SUM(profit) OVER(PARTITION BY brand)) AS perc
FROM sales;
product brand profit perc
Knife Culina 1000 25
Pot Culina 3000 75
Doll Toyze 2000 40
Car Toyze 3000 60
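The result table above can be reproduced with a short script. This sketch loads the same four rows into an in-memory database through Python's built-in sqlite3 module; it assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added. The final ORDER BY is an addition so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, brand TEXT, profit INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Knife", "Culina", 1000),
        ("Pot", "Culina", 3000),
        ("Doll", "Toyze", 2000),
        ("Car", "Toyze", 3000),
    ],
)

rows = conn.execute("""
    SELECT
      product, brand, profit,
      (100.0 * profit / SUM(profit) OVER(PARTITION BY brand)) AS perc
    FROM sales
    ORDER BY brand, profit
""").fetchall()

for row in rows:
    print(row)
# ('Knife', 'Culina', 1000, 25.0)
# ('Pot', 'Culina', 3000, 75.0)
# ('Doll', 'Toyze', 2000, 40.0)
# ('Car', 'Toyze', 3000, 60.0)
```

PARTITION BY brand makes the SUM restart for each brand, so every row is divided by its own brand's total while all four rows stay in the output.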
RANKING
Rank products by price:
SELECT
  RANK() OVER(ORDER BY price),
  name
FROM product;
RANKING FUNCTIONS
RANK – gives the same rank for tied values, leaves gaps.
DENSE_RANK – gives the same rank for tied values without gaps.
ROW_NUMBER – gives consecutive numbers without gaps.
name      rank  dense_rank  row_number
Jeans     1     1           1
Leggings  2     2           2
Leggings  2     2           3
Sneakers  4     3           4
Sneakers  4     3           5
Sneakers  4     3           6
T-Shirt   7     4           7
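The three ranking functions can be compared in a single query. The sketch below is one way to do it with Python's built-in sqlite3 module (SQLite 3.25 or newer is assumed for window function support). The prices are hypothetical, chosen only so that the ties match the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price INTEGER)")

# Hypothetical prices: two Leggings tie at 20, three Sneakers tie at 30.
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [
        ("Jeans", 10),
        ("Leggings", 20), ("Leggings", 20),
        ("Sneakers", 30), ("Sneakers", 30), ("Sneakers", 30),
        ("T-Shirt", 40),
    ],
)

rows = conn.execute("""
    SELECT
      name,
      RANK() OVER(ORDER BY price) AS rnk,
      DENSE_RANK() OVER(ORDER BY price) AS dense_rnk,
      ROW_NUMBER() OVER(ORDER BY price) AS row_num
    FROM product
    ORDER BY row_num
""").fetchall()

for row in rows:
    print(row)
# ('Jeans', 1, 1, 1)
# ('Leggings', 2, 2, 2)
# ('Leggings', 2, 2, 3)
# ('Sneakers', 4, 3, 4)
# ('Sneakers', 4, 3, 5)
# ('Sneakers', 4, 3, 6)
# ('T-Shirt', 7, 4, 7)
```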
RUNNING TOTAL
A running total is the cumulative sum of a given value and all preceding values in a column.
SELECT
  date, amount,
  SUM(amount) OVER(ORDER BY date) AS running_total
FROM sales;
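For a concrete picture of how the total accumulates, here is a small sketch using Python's built-in sqlite3 module (SQLite 3.25 or newer assumed). The three daily amounts are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (date TEXT, amount INTEGER)")

# Hypothetical daily amounts.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2023-07-01", 100), ("2023-07-02", 50), ("2023-07-03", 25)],
)

rows = conn.execute("""
    SELECT
      date, amount,
      SUM(amount) OVER(ORDER BY date) AS running_total
    FROM sales
    ORDER BY date
""").fetchall()

for row in rows:
    print(row)
# ('2023-07-01', 100, 100)
# ('2023-07-02', 50, 150)
# ('2023-07-03', 25, 175)
```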
MOVING AVERAGE
A moving average (a.k.a. rolling average, running average) is a technique for analyzing trends in time series data. It is the average of the current value and a specified number of preceding values.
SELECT
  date, price,
  AVG(price) OVER(
    ORDER BY date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS moving_average
FROM stock_prices;
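The frame ROWS BETWEEN 2 PRECEDING AND CURRENT ROW covers at most three rows, and fewer at the start of the series. The sketch below shows this on four made-up prices, using Python's built-in sqlite3 module (SQLite 3.25 or newer assumed).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock_prices (date TEXT, price REAL)")

# Hypothetical closing prices for four consecutive days.
conn.executemany(
    "INSERT INTO stock_prices VALUES (?, ?)",
    [
        ("2023-07-03", 10.0),
        ("2023-07-04", 20.0),
        ("2023-07-05", 30.0),
        ("2023-07-06", 40.0),
    ],
)

averages = [row[0] for row in conn.execute("""
    SELECT
      AVG(price) OVER(
        ORDER BY date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
      ) AS moving_average
    FROM stock_prices
    ORDER BY date
""")]

print(averages)  # [10.0, 15.0, 20.0, 30.0]
```

The first value averages one row, the second averages two, and from the third row on each value averages the current price and the two before it.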
DIFFERENCE BETWEEN TWO ROWS (DELTA)
SELECT
  year, revenue,
  LAG(revenue) OVER(ORDER BY year) AS revenue_prev_year,
  revenue - LAG(revenue) OVER(ORDER BY year) AS yoy_difference
FROM yearly_metrics;
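LAG(revenue) returns the revenue from the previous row in the given order, so the first year has nothing to compare against. The sketch below runs the query on three invented years with Python's built-in sqlite3 module (SQLite 3.25 or newer assumed); NULL shows up as None in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yearly_metrics (year INTEGER, revenue INTEGER)")

# Hypothetical yearly revenue figures.
conn.executemany(
    "INSERT INTO yearly_metrics VALUES (?, ?)",
    [(2021, 100), (2022, 150), (2023, 120)],
)

rows = conn.execute("""
    SELECT
      year, revenue,
      LAG(revenue) OVER(ORDER BY year) AS revenue_prev_year,
      revenue - LAG(revenue) OVER(ORDER BY year) AS yoy_difference
    FROM yearly_metrics
    ORDER BY year
""").fetchall()

for row in rows:
    print(row)
# (2021, 100, None, None)
# (2022, 150, 100, 50)
# (2023, 120, 150, -30)
```

The first row has no previous year, so both derived columns are NULL; a negative yoy_difference means revenue fell compared with the year before.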