================================================================================
SQL ADVANCED JOINS & COMPLEX QUERIES - INTERMEDIATE DEVELOPER INTERVIEW QUESTIONS
================================================================================
Q1. MULTIPLE TABLE JOINS WITH AGGREGATION
-----------------------------------------
Question: Write a query to find the top 5 departments by total salary expenditure,
including the average salary and employee count for each department.
Tables: employees (id, name, department_id, salary), departments (id, name, location)
Solution:
```sql
SELECT
d.name as department_name,
d.location,
COUNT(e.id) as employee_count,
COALESCE(SUM(e.salary), 0) as total_salary, -- 0 instead of NULL for departments with no employees
AVG(e.salary) as avg_salary,
ROUND(AVG(e.salary), 2) as avg_salary_rounded
FROM departments d
LEFT JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.name, d.location
ORDER BY total_salary DESC
LIMIT 5;
```
Key Points:
- Use LEFT JOIN to include departments with no employees; COUNT(e.id) returns 0 for them, while SUM and AVG return NULL unless wrapped in COALESCE
- GROUP BY includes all non-aggregated columns
- Use ROUND() for better presentation
- ORDER BY total_salary DESC for top departments
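The effect of the LEFT JOIN on each aggregate can be checked on a tiny data set. The sketch below assumes MySQL 8+ (it should also run on PostgreSQL and SQLite) and inlines hypothetical sample rows as CTEs named after the real tables, so it runs without creating anything.

```sql
-- Hypothetical sample data: Research has no employees
WITH departments AS (
    SELECT 1 AS id, 'Engineering' AS name
    UNION ALL SELECT 2, 'Research'
),
employees AS (
    SELECT 10 AS id, 1 AS department_id, 90000 AS salary
    UNION ALL SELECT 11, 1, 70000
)
SELECT
    d.name AS department_name,
    COUNT(e.id) AS employee_count,                 -- counts matched employees only: 0 for Research
    COUNT(*) AS row_count,                         -- also counts the NULL-extended row: 1 for Research
    SUM(e.salary) AS total_salary,                 -- NULL for Research
    COALESCE(SUM(e.salary), 0) AS total_salary_or_zero
FROM departments d
LEFT JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.name
ORDER BY total_salary_or_zero DESC;
```

Engineering comes back with 2 employees and a total of 160000. Research comes back with employee_count 0, row_count 1, a NULL total_salary and 0 for total_salary_or_zero, which is why COUNT(e.id) is preferred over COUNT(*) here.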
Q2. SELF-JOIN FOR HIERARCHICAL DATA
-----------------------------------
Question: Find all employees and their managers, including the manager's manager
(2-level hierarchy).
Table: employees (id, name, manager_id, salary, department)
Solution:
```sql
SELECT
e1.name as employee_name,
e1.salary as employee_salary,
e1.department as employee_dept,
e2.name as manager_name,
e2.salary as manager_salary,
e3.name as manager_manager_name
FROM employees e1
LEFT JOIN employees e2 ON e1.manager_id = e2.id
LEFT JOIN employees e3 ON e2.manager_id = e3.id
ORDER BY e1.name;
```
Alternative with CTE:
```sql
WITH employee_hierarchy AS (
SELECT
e1.id,
e1.name,
e1.manager_id,
e1.salary,
e1.department,
e2.name as manager_name,
e2.manager_id as manager_manager_id
FROM employees e1
LEFT JOIN employees e2 ON e1.manager_id = e2.id
)
SELECT
eh.name as employee_name,
eh.salary as employee_salary,
eh.department as employee_dept,
eh.manager_name,
e3.name as manager_manager_name
FROM employee_hierarchy eh
LEFT JOIN employees e3 ON eh.manager_manager_id = e3.id
ORDER BY eh.name;
```
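Both solutions above stop at a fixed depth of two levels. When the depth is not known in advance, a recursive CTE can walk the whole chain. This is a minimal sketch assuming MySQL 8+ or PostgreSQL; the sample rows and the names chain and depth are hypothetical.

```sql
WITH RECURSIVE employees AS (
    -- Hypothetical sample data: Alice -> Bob -> Carol -> Dave
    SELECT 1 AS id, 'Alice' AS name, NULL AS manager_id
    UNION ALL SELECT 2, 'Bob', 1
    UNION ALL SELECT 3, 'Carol', 2
    UNION ALL SELECT 4, 'Dave', 3
),
chain AS (
    -- Anchor: employees with no manager sit at depth 0
    SELECT id, name, manager_id, 0 AS depth
    FROM employees
    WHERE manager_id IS NULL
    UNION ALL
    -- Recursive step: attach each employee to the row already produced for their manager
    SELECT e.id, e.name, e.manager_id, c.depth + 1
    FROM employees e
    JOIN chain c ON e.manager_id = c.id
)
SELECT id, name, manager_id, depth
FROM chain
ORDER BY depth, name;
```

With this data the query returns Alice at depth 0, Bob at 1, Carol at 2 and Dave at 3. A cycle in manager_id would make the recursion loop until the engine's recursion limit is hit, so real data may need a depth cap.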
Q3. CROSS JOIN WITH CONDITIONAL LOGIC
-------------------------------------
Question: Create a matrix showing all possible combinations of departments and
salary ranges, with employee counts for each combination.
Solution:
```sql
WITH salary_ranges AS (
SELECT 'Low' as range_name, 0 as min_salary, 50000 as max_salary
UNION ALL
SELECT 'Medium', 50000, 80000
UNION ALL
SELECT 'High', 80000, 999999
),
department_employees AS (
-- One row per employee, tagged with its salary range
SELECT
d.id as department_id,
e.salary,
CASE
WHEN e.salary < 50000 THEN 'Low'
WHEN e.salary BETWEEN 50000 AND 80000 THEN 'Medium'
ELSE 'High'
END as salary_range
FROM departments d
JOIN employees e ON d.id = e.department_id
WHERE e.salary IS NOT NULL -- a NULL salary would otherwise fall through to 'High'
)
SELECT
sr.range_name as salary_range,
d.name as department_name, -- taken from d so empty combinations still show the department
COUNT(de.salary) as employee_count
FROM salary_ranges sr
CROSS JOIN departments d
LEFT JOIN department_employees de ON d.id = de.department_id
AND sr.range_name = de.salary_range
GROUP BY sr.range_name, sr.min_salary, d.id, d.name
ORDER BY sr.min_salary, d.name; -- Low, Medium, High rather than alphabetical
```
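The same matrix can be laid out with one row per department and one column per salary range by using conditional aggregation. The sketch below assumes MySQL 8+ (it should also run on PostgreSQL and SQLite), reuses the range boundaries from the CASE expression above, and runs on hypothetical sample rows.

```sql
WITH departments AS (
    -- Hypothetical sample data: Research has no employees
    SELECT 1 AS id, 'Engineering' AS name
    UNION ALL SELECT 2, 'Research'
),
employees AS (
    SELECT 10 AS id, 1 AS department_id, 45000 AS salary
    UNION ALL SELECT 11, 1, 60000
    UNION ALL SELECT 12, 1, 80000
    UNION ALL SELECT 13, 1, 95000
)
SELECT
    d.name AS department_name,
    -- A CASE without ELSE yields NULL, and COUNT ignores NULLs
    COUNT(CASE WHEN e.salary < 50000 THEN 1 END) AS low_count,
    COUNT(CASE WHEN e.salary BETWEEN 50000 AND 80000 THEN 1 END) AS medium_count,
    COUNT(CASE WHEN e.salary > 80000 THEN 1 END) AS high_count
FROM departments d
LEFT JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.name
ORDER BY d.name;
```

Engineering returns 1, 2, 1 and Research returns 0, 0, 0. The trade-off is that the ranges are fixed in the SELECT list, whereas the CROSS JOIN version picks up new ranges from the salary_ranges CTE.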
Q4. COMPLEX JOIN WITH SUBQUERIES
--------------------------------
Question: Find employees who earn more than the average salary in their department
and also more than the overall company average.
Solution:
```sql
WITH department_averages AS (
SELECT
department_id,
AVG(salary) as dept_avg_salary
FROM employees
GROUP BY department_id
),
company_average AS (
SELECT AVG(salary) as company_avg_salary
FROM employees
)
SELECT
e.name,
e.salary,
d.name as department,
da.dept_avg_salary,
ca.company_avg_salary,
ROUND((e.salary - da.dept_avg_salary) / da.dept_avg_salary * 100, 2) as dept_percent_diff,
ROUND((e.salary - ca.company_avg_salary) / ca.company_avg_salary * 100, 2) as company_percent_diff
FROM employees e
JOIN departments d ON e.department_id = d.id
JOIN department_averages da ON e.department_id = da.department_id
CROSS JOIN company_average ca
WHERE e.salary > da.dept_avg_salary
AND e.salary > ca.company_avg_salary
ORDER BY e.salary DESC;
```
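The same filter can also be written with subqueries directly in the WHERE clause, which is the form interviewers often ask for first. A minimal sketch on hypothetical sample rows, assuming MySQL 8+ (it should also run on PostgreSQL and SQLite):

```sql
WITH employees AS (
    -- Hypothetical sample data: department 1 averages 100000, department 2 averages 40000
    SELECT 1 AS id, 'Ann' AS name, 1 AS department_id, 120000 AS salary
    UNION ALL SELECT 2, 'Ben', 1, 80000
    UNION ALL SELECT 3, 'Cy', 2, 50000
    UNION ALL SELECT 4, 'Di', 2, 30000
)
SELECT e.name, e.salary
FROM employees e
WHERE e.salary > (
        -- Correlated subquery: re-evaluated for each outer row's department
        SELECT AVG(e2.salary)
        FROM employees e2
        WHERE e2.department_id = e.department_id
      )
  AND e.salary > (
        -- Uncorrelated subquery: the company-wide average (70000 here)
        SELECT AVG(e3.salary)
        FROM employees e3
      )
ORDER BY e.salary DESC;
```

Only Ann is returned. Cy beats the department 2 average but not the company average, and Ben beats the company average but not the department 1 average. The CTE version is usually easier to extend when the averages also need to appear in the output.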
Q5. JOIN WITH WINDOW FUNCTIONS
------------------------------
Question: Find employees with their salary rank within their department and overall
company rank.
Solution:
```sql
SELECT
e.name,
e.salary,
d.name as department,
-- RANK() gives tied salaries the same rank; ROW_NUMBER() would order ties arbitrarily
RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary DESC) as dept_rank,
RANK() OVER (ORDER BY e.salary DESC) as company_rank,
ROUND(
PERCENT_RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary) * 100,
2
) as dept_percentile,
ROUND(
PERCENT_RANK() OVER (ORDER BY e.salary) * 100, 2
) as company_percentile
FROM employees e
JOIN departments d ON e.department_id = d.id
ORDER BY e.department_id, e.salary DESC;
```
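The three ranking functions differ only in how they treat ties, which is a common follow-up question. The sketch below runs them side by side on hypothetical sample rows, assuming MySQL 8+ (it should also run on PostgreSQL and recent SQLite).

```sql
WITH employees AS (
    -- Hypothetical sample data: Ann and Ben are tied on salary
    SELECT 'Ann' AS name, 90000 AS salary
    UNION ALL SELECT 'Ben', 90000
    UNION ALL SELECT 'Cy', 70000
)
SELECT
    name,
    salary,
    -- name is added as a tie-breaker so the numbering is deterministic
    ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS row_num,
    RANK() OVER (ORDER BY salary DESC) AS salary_rank,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_dense_rank
FROM employees
ORDER BY salary DESC, name;
```

Ann gets 1, 1, 1, Ben gets 2, 1, 1 and Cy gets 3, 3, 2. RANK leaves a gap after a tie, DENSE_RANK does not, and ROW_NUMBER never repeats a value.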
Q6. MULTIPLE JOINS WITH DATE RANGES
-----------------------------------
Question: Find all employees who were hired in the same month as their manager,
including the hiring month and year.
Solution:
```sql
SELECT
e1.name as employee_name,
e1.hire_date as employee_hire_date,
e2.name as manager_name,
e2.hire_date as manager_hire_date,
DATE_FORMAT(e1.hire_date, '%Y-%m') as hire_month_year
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.id
WHERE DATE_FORMAT(e1.hire_date, '%Y-%m') = DATE_FORMAT(e2.hire_date, '%Y-%m')
ORDER BY hire_month_year, e1.name;
```
Alternative using YEAR() and MONTH():
```sql
SELECT
e1.name as employee_name,
e1.hire_date as employee_hire_date,
e2.name as manager_name,
e2.hire_date as manager_hire_date,
CONCAT(YEAR(e1.hire_date), '-', LPAD(MONTH(e1.hire_date), 2, '0')) as hire_month_year
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.id
WHERE YEAR(e1.hire_date) = YEAR(e2.hire_date)
AND MONTH(e1.hire_date) = MONTH(e2.hire_date)
ORDER BY hire_month_year, e1.name;
```
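DATE_FORMAT, YEAR() and MONTH() are MySQL functions. The year-and-month comparison can also be written with the standard EXTRACT(field FROM date) syntax, which MySQL and PostgreSQL both accept. A minimal sketch on hypothetical sample rows:

```sql
WITH employees AS (
    -- Hypothetical sample data: Mia manages Noah and Omar
    SELECT 1 AS id, 'Mia' AS name, NULL AS manager_id, DATE '2021-03-02' AS hire_date
    UNION ALL SELECT 2, 'Noah', 1, DATE '2021-03-28'
    UNION ALL SELECT 3, 'Omar', 1, DATE '2022-03-10'
)
SELECT
    e1.name AS employee_name,
    e2.name AS manager_name,
    EXTRACT(YEAR FROM e1.hire_date) AS hire_year,
    EXTRACT(MONTH FROM e1.hire_date) AS hire_month
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.id
-- Compare year and month separately so March 2021 does not match March 2022
WHERE EXTRACT(YEAR FROM e1.hire_date) = EXTRACT(YEAR FROM e2.hire_date)
  AND EXTRACT(MONTH FROM e1.hire_date) = EXTRACT(MONTH FROM e2.hire_date)
ORDER BY hire_year, hire_month, e1.name;
```

Only the Noah and Mia pair is returned. Omar was also hired in March, but in a different year, which is the case a month-only comparison would get wrong.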
Q7. JOIN WITH AGGREGATION AND HAVING
------------------------------------
Question: Find departments where the highest-paid employee earns more than 3 times
the department's average salary.
Solution:
```sql
WITH department_stats AS (
SELECT
department_id,
AVG(salary) as avg_salary,
MAX(salary) as max_salary
FROM employees
GROUP BY department_id
)
SELECT
d.name as department_name,
ds.avg_salary,
ds.max_salary,
ROUND(ds.max_salary / ds.avg_salary, 2) as salary_ratio,
e.name as highest_paid_employee
FROM department_stats ds
JOIN departments d ON ds.department_id = d.id
JOIN employees e ON ds.department_id = e.department_id
AND ds.max_salary = e.salary
WHERE ds.max_salary > ds.avg_salary * 3
ORDER BY salary_ratio DESC;
```
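The solution above filters pre-aggregated rows with WHERE. The same condition can be applied with HAVING directly on the grouped query when the name of the highest-paid employee is not needed. A minimal sketch on hypothetical sample rows, assuming MySQL 8+ (it should also run on PostgreSQL and SQLite):

```sql
WITH departments AS (
    SELECT 1 AS id, 'Sales' AS name
    UNION ALL SELECT 2, 'Support'
),
employees AS (
    -- Hypothetical sample data: Sales averages 65000, Support averages 50000
    SELECT 1 AS id, 1 AS department_id, 200000 AS salary
    UNION ALL SELECT 2, 1, 20000
    UNION ALL SELECT 3, 1, 20000
    UNION ALL SELECT 4, 1, 20000
    UNION ALL SELECT 5, 2, 60000
    UNION ALL SELECT 6, 2, 40000
)
SELECT
    d.name AS department_name,
    AVG(e.salary) AS avg_salary,
    MAX(e.salary) AS max_salary
FROM departments d
JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.name
-- HAVING runs after grouping, so it can compare aggregates; WHERE cannot
HAVING MAX(e.salary) > 3 * AVG(e.salary);
```

Only Sales is returned, because 200000 exceeds 3 x 65000 = 195000, while Support's maximum of 60000 does not exceed 150000.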
Q8. COMPLEX JOIN WITH MULTIPLE CONDITIONS
-----------------------------------------
Question: Find all pairs of employees who work in the same department, have similar
salaries (within 10% difference), and were hired within 6 months of each other.
Solution:
```sql
SELECT
e1.name as employee1_name,
e1.salary as employee1_salary,
e1.hire_date as employee1_hire_date,
e2.name as employee2_name,
e2.salary as employee2_salary,
e2.hire_date as employee2_hire_date,
d.name as department,
ROUND(ABS(e1.salary - e2.salary) / GREATEST(e1.salary, e2.salary) * 100, 2) as salary_diff_percent,
DATEDIFF(e1.hire_date, e2.hire_date) as days_between_hires
FROM employees e1
JOIN employees e2 ON e1.department_id = e2.department_id
JOIN departments d ON e1.department_id = d.id
WHERE e1.id < e2.id -- Avoid self pairs and duplicate (A, B) / (B, A) pairs
AND ABS(e1.salary - e2.salary) / GREATEST(e1.salary, e2.salary) <= 0.10 -- Within 10%
AND ABS(DATEDIFF(e1.hire_date, e2.hire_date)) <= 180 -- Within 6 months, approximated as 180 days
ORDER BY d.name, salary_diff_percent;
```
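A fixed count of 180 days is only an approximation of 6 months. If calendar months matter, MySQL's DATE_ADD and DATE_SUB with INTERVAL 6 MONTH can express the window exactly. The sketch below is MySQL-specific and compares the two rules on one hypothetical pair of hire dates.

```sql
-- Hypothetical hire dates: 2020-01-01 and 2020-06-30 (181 days apart in a leap year)
SELECT
    DATEDIFF(DATE '2020-06-30', DATE '2020-01-01') AS days_apart,
    -- 180-day rule: the pair just misses the cut-off
    ABS(DATEDIFF(DATE '2020-06-30', DATE '2020-01-01')) <= 180 AS within_180_days,
    -- Calendar rule: the window runs from 2019-07-01 to 2020-07-01, so the pair qualifies
    DATE '2020-06-30' BETWEEN DATE_SUB(DATE '2020-01-01', INTERVAL 6 MONTH)
                          AND DATE_ADD(DATE '2020-01-01', INTERVAL 6 MONTH) AS within_6_months;
```

MySQL returns 181, 0 and 1 for the three columns, so the two rules disagree on this pair. Which one is right depends on how the business defines "6 months".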
PRACTICAL EXERCISES:
====================
Exercise 1: E-commerce Analytics
-------------------------------
```sql
-- Tables: orders, order_items, products, customers, categories
-- Find customers who purchased products from at least 3 different categories
SELECT
c.name as customer_name,
COUNT(DISTINCT cat.id) as category_count,
GROUP_CONCAT(DISTINCT cat.name) as categories_purchased
FROM customers c
JOIN orders o ON c.id = o.customer_id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
JOIN categories cat ON p.category_id = cat.id
GROUP BY c.id, c.name
HAVING COUNT(DISTINCT cat.id) >= 3
ORDER BY category_count DESC;
```
Exercise 2: Employee Performance Analysis
----------------------------------------
```sql
-- Find employees who earn above their department's average salary
-- (the schema has no performance metric, so salary stands in for one)
WITH employee_performance AS (
SELECT
e.id,
e.name,
e.department_id,
e.salary,
AVG(e.salary) OVER (PARTITION BY e.department_id) as dept_avg_salary,
COUNT(*) OVER (PARTITION BY e.department_id) as dept_employee_count
FROM employees e
)
SELECT
ep.name,
ep.salary,
d.name as department,
ep.dept_avg_salary,
ROUND((ep.salary - ep.dept_avg_salary) / ep.dept_avg_salary * 100, 2) as pct_above_dept_avg
FROM employee_performance ep
JOIN departments d ON ep.department_id = d.id
WHERE ep.salary > ep.dept_avg_salary
ORDER BY pct_above_dept_avg DESC;
```
COMMON MISTAKES TO AVOID:
=========================
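1. **Cartesian Products**: Give every JOIN an ON condition unless a CROSS JOIN is intended
2. **Missing GROUP BY**: List every non-aggregated SELECT column in GROUP BY
3. **Incorrect JOIN Types**: INNER JOIN drops unmatched rows; LEFT and RIGHT JOIN keep them with NULLs
4. **Performance Issues**: Index the columns used in JOIN conditions
5. **NULL Handling**: NULL never equals NULL, so rows with NULL join keys do not match, and NOT IN returns nothing if the list contains a NULL
6. **Alias Confusion**: Use short, meaningful table aliases and qualify every column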
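The NULL handling point is the one that most often produces silently wrong results. The sketch below counts departments without employees in two ways on hypothetical sample rows, assuming MySQL 8+ (it should also run on PostgreSQL and SQLite).

```sql
WITH departments AS (
    SELECT 1 AS id, 'Engineering' AS name
    UNION ALL SELECT 2, 'Research'
),
employees AS (
    -- Hypothetical sample data: employee 11 has no department assigned
    SELECT 10 AS id, 1 AS department_id
    UNION ALL SELECT 11, NULL
)
-- NOT IN compares against NULL, the result is UNKNOWN, and every row is filtered out
SELECT 'NOT IN' AS method, COUNT(*) AS empty_departments
FROM departments d
WHERE d.id NOT IN (SELECT e.department_id FROM employees e)
UNION ALL
-- NOT EXISTS only asks whether a matching row exists, so NULLs are harmless
SELECT 'NOT EXISTS', COUNT(*)
FROM departments d
WHERE NOT EXISTS (SELECT 1 FROM employees e WHERE e.department_id = d.id);
```

The NOT IN branch reports 0 empty departments and the NOT EXISTS branch reports 1 (Research). Adding WHERE e.department_id IS NOT NULL inside the NOT IN subquery would also fix it.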
PERFORMANCE OPTIMIZATION TIPS:
==============================
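1. **Index Strategy**: Create indexes on the columns used in JOIN conditions and WHERE filters
2. **Join Order**: Filter early so the smaller row set drives the join; most optimizers reorder inner joins themselves, so confirm the order with EXPLAIN
3. **Avoid "SELECT *"**: Specify only the columns you need
4. **Use EXISTS**: Prefer EXISTS over a JOIN when you only need to check that a matching row exists
5. **Limit Results**: Use LIMIT, together with ORDER BY, for large result sets
6. **Analyze Queries**: Use EXPLAIN to understand execution plans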
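The EXISTS tip is about correctness as well as speed: a JOIN used as an existence check returns one row per match. The sketch below shows the difference on hypothetical sample rows, assuming MySQL 8+ (it should also run on PostgreSQL and SQLite).

```sql
WITH customers AS (
    SELECT 1 AS id, 'Ada' AS name
    UNION ALL SELECT 2, 'Bo'
),
orders AS (
    -- Hypothetical sample data: Ada has two orders, Bo has none
    SELECT 100 AS id, 1 AS customer_id
    UNION ALL SELECT 101, 1
)
-- JOIN: Ada appears once per order
SELECT 'JOIN' AS method, COUNT(*) AS rows_returned
FROM customers c
JOIN orders o ON o.customer_id = c.id
UNION ALL
-- EXISTS: Ada appears once, however many orders she has
SELECT 'EXISTS', COUNT(*)
FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);
```

The JOIN branch returns 2 rows and the EXISTS branch returns 1. On real tables, prefixing either SELECT with EXPLAIN shows whether an index on orders.customer_id is being used.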
ADVANCED TECHNIQUES:
====================
1. **Recursive CTEs**: For hierarchical data traversal
2. **Window Functions**: For ranking and running totals
3. **Pivot Operations**: Using CASE expressions inside aggregate functions
4. **Dynamic SQL**: For flexible query generation
5. **Materialized Views**: For complex aggregations
6. **Partitioning**: For large table performance
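As an illustration of the window function item, a running total needs only SUM with an ordered window frame. A minimal sketch on hypothetical sample rows, assuming MySQL 8+ (it should also run on PostgreSQL and recent SQLite):

```sql
WITH orders AS (
    -- Hypothetical sample data: one order per day
    SELECT DATE '2024-01-01' AS order_date, 100 AS amount
    UNION ALL SELECT DATE '2024-01-02', 250
    UNION ALL SELECT DATE '2024-01-03', 50
)
SELECT
    order_date,
    amount,
    -- The frame covers every row from the first one up to the current row
    SUM(amount) OVER (
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM orders
ORDER BY order_date;
```

The running_total column comes back as 100, 350 and 400. Adding PARTITION BY in the OVER clause restarts the total for each group, for example per customer.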