Case Study

SQL Business Insights Analysis

Analyzed transactional sales data using SQL to identify top-performing products, high-value, customers, and revenue concentration patterns.

Advanced SQL Analysis
Role: Date Analyst Tools: SQL Dataset: Sales Transactions Records: 50,000+

Project Overview

The goal of this project was to analyze a relational sales dataset to uncover revenue drivers, customer purchasing behavior, and product performance. Using SQL joins, aggregations, and window functions, the analysis identifies high-value customers, top-performing products, and revenue concentration patterns.

Project Architecture

End-to-end data flow used in this analytics project.

Dataset

Raw sales CSV dataset

PostgreSQL

Relational database

SQL Analysis

Aggregation & window functions

Insights

Business KPIs & metrics

Dashboard

Power BI visualization

Database Structure

The sales dataset is organized into four relational tables. These tables capture customer information, order transactions, product catalog data, and individual order line items.

Sales Database Schema

Customers

Contains transaction-level data including order dates and customer relationships.

Order Details

Capture individual line items within each order including product references and quantity purchased.

Products

Maintains product catalog data including category, cost price, and selling price.

SQL Techniques Used

Joins

Connectedorders, customers, and products tables to reconstruct complete sales transactions.

Aggregations

Used SUM and COUNT to calculate revenue, order frequency, and customer metrics.

Window Functions

Applied RANK() and NTILE() to analyze product performance and revenue distribution.

Customer Segmentation

Identified high-value customers using Customer Lifetime Value (CLV) calculations.

SQL Query Examples

Sample SQL queries used to analysis revenue, customers, and performance.

Total Revenue

SQL

SELECT
    SUM(od.quantity * p.selling_price) AS total_revenue
FROM
    order_details od
    JOIN products p ON od.product_id = p.product_id;
        

Calculates total revenue generated across all product sales.

Top 5 Revenue Generating Products

SQL

SELECT
    p.product_name,
    SUM(od.quantity * p.selling_price) AS revenue
FROM
    order_details od
    JOIN products p ON od.product_id = p.product_id
GROUP BY
    p.product_name
ORDER BY
    revenue DESC
LIMIT
    5;
        

Identifies the highest revenue generating products.

Monthly Revenue Trend

SQL

SELECT
    DATE_TRUNC ('month', o.order_date) AS month,
    SUM(od.quantity * p.selling_price) AS revenue
FROM
    orders o
    JOIN order_details od ON o.order_id = od.order_id
    JOIN products p ON od.product_id = p.product_id
GROUP BY
    month
ORDER BY
    month;
        

Tracks revenue growth over time to identify monthly trends.

Customer Lifetime Value (CLV)

SQL

SELECT
    c.customer_id,
    c.customer_name,
    SUM(od.quantity * p.selling_price) AS customer_lifetime_value
FROM
    customers c
    JOIN orders o ON c.customer_id = o.customer_id
    JOIN order_details od ON o.order_id = od.order_id
    JOIN products p ON od.product_id = p.product_id
GROUP BY
    c.customer_id,
    c.customer_name
ORDER BY
    customer_lifetime_value DESC;
        

Calculates the lifetime revenue generated by each customer.

Pareto Analysis (Top 10% Customers)

SQL

WITH
  customer_revenue AS (
    SELECT
      c.customer_id,
      SUM(od.quantity * p.selling_price) AS revenue
    FROM
      customers c
      JOIN orders o ON c.customer_id = o.customer_id
      JOIN order_details od ON o.order_id = od.order_id
      JOIN products p ON od.product_id = p.product_id
    GROUP BY
      c.customer_id
  ),
  ranked_customers AS (
    SELECT
      customer_id,
      revenue,
      NTILE (10) OVER (
        ORDER BY
          revenue DESC
      ) AS decile
    FROM
      customer_revenue
  )
SELECT
  SUM(revenue) AS top_10_percent_revenue
FROM
  ranked_customers
WHERE
  decile = 1;
        

Identifies the revenue contribution of the top 10% of customers.

SQL Optimization Techniques

Strategies used to improve query performance and ensure efficient data analysis.

Efficient Table Joins

Used normalized relational structure with optimized joins between customers, orders, order_details, and products to accurately compute revenue and customer metrics.

Common Table Expressions (CTEs)

Implemented CTEs to simplify complex calculations such as customer segmentation and revenue contribution analysis, improving query readability and maintainability.

Window Functions

Used RANK(), DENSE_RANK(), and NTILE() to analyze product performance and perform Pareto analysis for top revenue-generating customers.

Aggregation Optimization

Used GROUP BY with aggregated metrics to efficiently calculate revenue, profit, and customer lifetime value across multiple tables.

Key Insights

$10.36M

Total Revenue

Total revenue generated across all transactions within the dataset.

$5.11M

Top Product Revenue

Laptop Pro 15 generated the highest revenue among all products.

$892K

Highest Customer CLV

Top customers generated nearly $900K in lifetime purchase value.

$1.68M

Top 10% Customers

A small group of customers contributes a large share of revenue.

Business Recommendations

Insights from the analysis translated into potential business actions.

Focus on High-Value Customers

Pareto analysis shows that a small percentage of customers contribute a large portion of total revenue. Implement loyalty programs and personalized promotions to retain these high-value customers.

Promote Top-Performing Products

Top revenue-generating products drive a significant share of sales. Marketing campaigns and bundled offers could further increase their sales performance.

Encourage Repeat Purchases

Repeat customers contribute more revenue than one-time buyers. Offering discounts, memberships, or rewards programs can help increase customer retention and lifetime value.

Monitor Monthly Sales Trends

Monthly revenue analysis helps indentity seasonal trends and demand patterns. Businesses can adjust inventory planning and marketing strategies accordingly.

Analysis Results

Product Revenue Ranking

Revenue ranking of products calculated using SQL window functions.

Customer Order Frequency

Customer purchase frequency showing repeat buyers.

Customer Lifetime Value

Customer lifetime value (CLV) calculated from aggregated revenue.

Revenue Contribution

Pareto analysis showing top 10% customers revenue contribution.

Business Impact

Potential Business Value

Revenue Drivers

Identified top-performing products contributing the majority of revenue, enabling businesses to prioritize inventory and marketing strategies around high-performance items.

High Value Customers

Customer lifetime value analysis highlights high-value customers that could benefit from loyalty programs and targeted retention campaigns.

Sales Concentration

Pareto analysis revealed that a small group of customers generate a large share of revenue, emphasizing the importance of customer segmentation strategies.

Data Driven Decisions

The insights derived from SQL analysis provide a foundation for improving product strategy, pricing, and customer engagement.

Key Takeaways

Advanced SQL Techniques

Strengthened understanding of window functions, ranking methods, and complex aggregations used in analytical SQL workflows.

Data-to-Insight Thinking

Focused on translating raw transactional data into meaningful business insights that support data-driven decision making.

Structured Analytical Process

Applied a structured approach to analysis including data exploration, transformation, aggregation, and insight generation.

Project Resources

Access the full project files including SQL scripts and dataset.

GitHub Repository

Explore the full project including database schema, data inserts, and advanced SQL queries used in the analysis.

View Project on GitHub

Dataset

The dataset simulates an e-commerce business including customers, orders, products, and transaction datails.

SQL Scripts

Includes queries for revenue analysis, product performance, customer lifetime value, and Pareto analysis.

Analysis Techniques

Uses joins, aggregations, window functions, and CTEs to extract meaningful insights from relational data.

Back to Portfolia

Return to the main portfolio to explore additional analytics and dashboard projects.