SQL-Projects

Welcome to my SQL Projects repository! This repository demonstrates my expertise in SQL through a variety of projects that tackle real-world data challenges, including performance optimization, data cleaning, advanced querying, and analytics.

📦 Project Structure

📦 SQL-Projects
├── 📁 Data
│   └── Raw and cleaned datasets
├── 📁 Scripts
│   ├── Queries.sql
│   ├── DataCleaning.sql
│   ├── ETL.sql
├── 📁 Reports
│   └── Insights and visualizations
└── README.md

Data Warehouse & Analytics Project

Welcome to the Data Warehouse and Analytics Project repository!
This project demonstrates a comprehensive data warehousing and analytics solution, from building a data warehouse to generating actionable insights. Designed as a portfolio project, it highlights industry best practices in data engineering and analytics.


Project Requirements

Building the Data Warehouse (Data Engineering)

Objective

Develop a modern data warehouse using SQL Server to consolidate sales data, enabling analytical reporting and informed decision-making.

Specifications

BI: Analytics & Reporting (Data Analytics)

Objective

Develop SQL-based analytics to deliver detailed insights into:


Level Up: SQL Challenges

Key Insights

Tools Used

How to Use This Repository

  1. Clone the repository.
  2. Open the SQL files in your preferred SQL editor or GitHub Codespaces.
  3. Run the queries against your database to see the results.

License

This project is licensed under the MIT License. See the LICENSE file for details.


🚀 Streamlining Restaurant Data Management with SQL | Video

🚀 Key Features

1️⃣ Customer Relationship Management

2️⃣ Menu and Ordering Insights

3️⃣ Reservation Optimization

4️⃣ Engaging Top Customers

5️⃣ Customization & Favorites

####💡 Insights

####🔑 What I Learned

📈 Results

🛠️ Technologies Used

🎯 How to Use

  1. Clone the repository.
  2. Import the SQL scripts into your database system.
  3. Explore the reports and adapt them to your restaurant’s needs.

This project is part of my portfolio, demonstrating my:


SQL Stored Procedure: Customer Summary

Overview

This script provides a stored procedure to calculate customer statistics and sales data for a specified country, with additional features for data cleanup and error handling.

Features

  1. Core Functionality
    • Calculates the total number of customers and their average score for a specified country.
    • Allows dynamic country selection using parameters (@Country), with a default value set to 'USA'.
  2. Data Cleanup
    • Updates null scores in the Score column to 0 for accurate aggregations.
  3. Advanced Features
    • Performs multiple operations, including aggregating customer data and calculating total orders and sales.
    • Utilizes variables to store intermediate results for enhanced output control.
    • Implements control flow (IF EXISTS) to handle conditional data processing.
  4. Error Handling
    • Uses TRY...CATCH blocks to gracefully handle errors.
    • Provides detailed error messages, including the error number, message, line, and procedure name.
  5. Execution Flexibility
    • Accepts an optional parameter for the country name:
      EXEC GetCustomerSummary; -- Default country: USA
      EXEC GetCustomerSummary @Country = 'Germany'; -- Specify another country
      
  6. Best Practices
    • Avoids code repetition by using parameters and dynamic inputs.
    • Ensures robust handling of null values and potential errors.

Usage Examples

– Execute the procedure with the default country EXEC GetCustomerSummary;

– Execute the procedure for a specific country EXEC GetCustomerSummary @Country = ‘Germany’; EXEC GetCustomerSummary @Country = ‘USA’;

SQL Query Performance Best Practices

Link to Project

This guide provides best practices for optimizing SQL queries to improve performance, readability, and maintainability.

Best Practices

1. Column Selection

2. Minimize Expensive Operations

3. Indexing

4. Efficient Filtering

5. Query Structure

6. Join Optimization

7. Query Simplification

8. Data Types

9. Performance Enhancements

Example

– Avoid SELECT * SELECT first_name, last_name, email FROM customers WHERE country = ‘USA’;

– Use Indexed Columns CREATE NONCLUSTERED INDEX idx_customers_country ON customers (country);

– Use UNION ALL Instead of UNION SELECT product_name FROM products_a UNION ALL SELECT product_name FROM products_b;


SQL Value Window Functions | LEAD, LAG, FIRST_VALUE, LAST_VALUE

1. Analyze Month-over-Month (MoM) Performance

2. Analyze Customer Loyalty

3. Find Lowest and Highest Sales

4. Compare to Extremes


These queries aim to perform various analyses such as calculating totals, averages, rankings, and detecting duplicates or deviations, making them valuable tools for data analysis and reporting.

Insights

Sales Analysis

Customer Analysis

Data Quality Checks

Employee Analysis


Insights

Ranking Analysis

Product Performance

Customer Analysis

Data Management


SQL Views

Insights

Sales Analysis

Data Updates

Combined Data Views

Regional Analysis


Comprehensive Guide to SQL Server Indexing and Optimization Techniques

Index Types and Use Cases

Index Management

Creating, Monitoring, and Dropping Indexes

Query Performance Enhancement

Execution Plan Analysis

SQL Hints

Maintenance and Optimization

Statistics Updates

Fragmentation Management

Advanced Scenarios

Columnstore Index Benefits

Join Optimization

Advanced Common Table Expression(CTE)

Tasks :


Handle Nulls with Null Function

Tasks

The Difference Among Null, Empty String, and Blank Spaces

Find the Average Scores of the Customers

Display the Full Name of Customers in a Single Field and Add 10 Bonus Points to Each Customer’s Score

Sort the Customers from Lowest to Highest Scores, with Nulls Appearing Last

Find the Sales Price for Each Order by Dividing the Sales by Quantity

Identify the Customers Who Have No Scores

List All Customers Who Have Scores

List All Details for Customers Who Have Not Placed Any Orders


Advanced Sub Query

Tasks

Find the Products That Have a Price Higher Than the Average Price of All Products

Rank the Customers Based on Their Total Amount of Sales

Show the Product IDs, Names, Prices, and Total Number of Orders

Show All Customer Details and Find the Total Orders for Each Customer

Show the Details of Orders Made by Customers in Germany

Show the Details of Orders Made by Customers Who Are Not From Germany

Find Female Employees Whose Salaries Are Greater Than the Salaries of Any Male Employees

Find Female Employees Whose Salaries Are Greater Than the Salaries of All Male Employees

Show All Customer Details and Find the Total Orders for Each Customer

Show the Details of Orders Made by Customers in Germany


Case When Statement

Task

Generate a Report Showing the Total Sales for Each Category

Retrieve Employee Details with Gender Displayed as Full Text

Find Average Scores of Customers and Treat Nulls as 0, and Provide Details Such as CustomerID and Lastname

Count How Many Times Each Customer Has Made an Order with Sales Greater Than 30


Using Partition Function

This project showcases my ability to implement table partitioning in SQL Server to optimize query performance and manage large datasets effectively.

Steps

1. Defined Partition Function

2. Created Filegroups

3. Added Secondary .ndf Files

4. Set Up Partition Scheme

5. Verified Data Storage in Partitions

6. Compared Performance


Union Use Cases

Tasks:


Date Use Case

Tasks:

1. Combine the Data from Employees and Customers into One Table

2. Combine the Data from Employees and Customers into One Table Including Duplicates

3. Find Employees Who Are Not Customers at the Same Time

4. Find Employees Who Are Also Customers

5. Combine All Orders into One Report Without Duplicates


KPIs Objective: Analyze employee demographics and salary data to provide insights into gender distribution, departmental employment trends, and salary comparisons.

Key Responsibilities:

1. Employee Gender Breakdown (1990-Present)

2. Departmental Gender Comparison

3. Average Salary Analysis by Gender

4. Salary Range Analysis (50,000 - 90,000)

Technologies Used:

Outcome:


This SQL script generates key insights and KPIs to analyze a food delivery system’s performance. The analysis focuses on orders, drivers, customer behavior, and data quality.

Key Insights and KPIs:

1. Total Rolls Ordered

2. Unique Customers

3. Driver Performance

4. Roll Type Popularity

5. Order Volume per Customer

6. Max Rolls in a Single Order

7. Data Quality and Cleaning

8. Database and Table Creation

9. Order Success Rate

10. Customer Preferences and Customization

Technologies and Tools Used:

Outcome:

This script provides key performance indicators for the food delivery system, helping businesses understand demand, customer behavior, driver performance, and service reliability. By analyzing these insights, businesses can make data-driven decisions to optimize operations, improve customer experience, and boost performance.

### Data Cleaning and Transformation on NashvilleHouse Table

NashvilleHouse Data Cleaning and Transformation

This document outlines the steps performed to clean and transform the NashvilleHouse table, ensuring the data is standardized, missing values are handled, and unnecessary columns are removed for optimization.

Key Data Cleaning and Transformation Tasks

1. Standardizing SaleDate Format

2. Handling Missing Property Address Values

3. Splitting Address Components

4. Updating SoldAsVacant Column

5. Removing Duplicate Rows

6. Dropping Unnecessary Columns

Outcome

These transformations resulted in a cleaner, more standardized dataset that is easier to work with, ensuring improved accuracy for analysis and reporting. The changes optimize the table structure by removing redundant and unnecessary data, making it more efficient for further processing.

Technologies Used:


COVID Data Analysis: Key Insights and Transformations

This document outlines the steps I performed to analyze COVID-19 data, including data transformation, calculation of key metrics, and insights on infection rates, death rates, and vaccination coverage across different countries and continents.

Key Analysis Tasks

1. Data Transformation

2. Percentage of Population Infected

3. Identification of High Infection and Death Rates

4. Global Aggregation

5. Population and Vaccination Analysis

6. Vaccination Rate Calculation

7. View Creation for Future Visualizations

Outcome

The analysis provided valuable insights into the global COVID-19 situation, highlighting regions with high infection and death rates, as well as areas with lower vaccination coverage. By transforming the data and calculating key metrics, I was able to offer a comprehensive view of the pandemic’s impact on various countries and continents.

Technologies Used:

Future Work: