SALES PERFORMANCE ANALYSIS
S.NO CONTENT
01 ABSTRACT
02 INTRODUCTION
03 DATA COLLECTION
04 DATA PROCESSING
05 VISUALIZATION
06 CONCLUSION
1.ABSTRACT:
3.1.DATA COLLECTION
Data collection for sales performance analysis typically involves gathering
various types of data that are crucial for understanding how a business is
performing. Here's an outline of the main categories of data you'll need to
collect:
Product Details: Information about the product(s) sold, such as product ID,
name, category, and price.
Revenue: The total revenue generated from the transaction (e.g., price *
quantity sold).
Discounts: Any discounts applied during the transaction.
Sales Channel: The platform through which the sale occurred (e.g., online,
in-store).
Sales Targets: The targets set for each salesperson, which can be
compared to actual sales to gauge performance.
Sales by Region: Data on how sales differ across various regions, cities, or
countries.
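As a rough sketch of how these fields fit together, the snippet below derives revenue per transaction and compares each salesperson's total with a target. The column names (price, quantity, discount_pct, salesperson), the sample values, and the choice to subtract the discount from price * quantity are illustrative assumptions, not part of the dataset used in this report.

```python
import pandas as pd

# Hypothetical transactions; all values are illustrative only
transactions = pd.DataFrame({
    "salesperson": ["A", "A", "B"],
    "price": [100.0, 50.0, 200.0],
    "quantity": [2, 4, 1],
    "discount_pct": [10.0, 0.0, 25.0],
})

# Gross revenue = price * quantity sold
transactions["gross_revenue"] = transactions["price"] * transactions["quantity"]

# Net revenue after the discount (assumed to be a percentage of gross revenue)
transactions["net_revenue"] = transactions["gross_revenue"] * (
    1 - transactions["discount_pct"] / 100
)

# Hypothetical sales targets per salesperson
targets = pd.Series({"A": 400.0, "B": 200.0}, name="target")

# Actual sales per salesperson and attainment as a percentage of target
actual = transactions.groupby("salesperson")["net_revenue"].sum()
attainment = (actual / targets * 100).round(1)
print(attainment)
```

An attainment below 100 means the salesperson fell short of the target for the period.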
Data Cleaning:
Missing values: Identify and handle missing data through imputation (mean,
median, or mode imputation) or by removing incomplete rows.
Outliers: Detect and handle outliers, which could skew the analysis (using
Z-scores or IQR methods).
Duplicate entries: Identify and remove duplicate records that might distort
the analysis.
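The cleaning steps above can be sketched as follows. This is a minimal example on made-up data, assuming median imputation and the IQR rule (one of several reasonable choices); the frame name raw and its values are hypothetical.

```python
import pandas as pd

# Hypothetical raw data with a duplicate row, a missing value, and an outlier
raw = pd.DataFrame({
    "Product_ID": ["P1", "P2", "P2", "P3", "P4", "P5", "P6"],
    "Sales_Amount": [100.0, 120.0, 120.0, None, 110.0, 105.0, 5000.0],
})

# 1. Remove exact duplicate records
clean = raw.drop_duplicates().copy()

# 2. Impute missing sales amounts with the median of the observed values
median_sales = clean["Sales_Amount"].median()
clean["Sales_Amount"] = clean["Sales_Amount"].fillna(median_sales)

# 3. Flag outliers with the IQR rule (outside 1.5 * IQR from the quartiles)
q1 = clean["Sales_Amount"].quantile(0.25)
q3 = clean["Sales_Amount"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
clean["is_outlier"] = ~clean["Sales_Amount"].between(lower, upper)

print(clean)
```

Flagging outliers in a separate column, rather than dropping them immediately, keeps the decision reviewable before any rows are removed.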
Data Transformation:
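import pandas as pd
import matplotlib.pyplot as plt

# Load dataset
df = pd.read_csv("sales_data.csv")
print(df.head())

# Count missing values per column
print(df.isnull().sum())

# Summary statistics
print(df.describe())

# Distinct categories and regions
print(df['Category'].unique())
print(df['Region'].unique())

# Parse dates and derive a monthly period for trend analysis
df['Date'] = pd.to_datetime(df['Date'])
df['Month'] = df['Date'].dt.to_period('M')

# Monthly sales trend (line plot)
monthly_sales = df.groupby('Month')['Sales_Amount'].sum()
plt.figure(figsize=(12, 6))
plt.plot(monthly_sales.index.astype(str), monthly_sales.values, marker='o')
plt.xlabel('Month')
plt.ylabel('Total Sales Amount')
plt.grid()
plt.show()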
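# Total sales per region
regional_sales = df.groupby('Region')['Sales_Amount'].sum().reset_index()
# Bar plot
plt.figure(figsize=(10, 5))
plt.bar(regional_sales['Region'], regional_sales['Sales_Amount'])
plt.title('Sales by Region')
plt.xlabel('Region')
plt.ylabel('Total Sales Amount')
plt.show()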
Insights
Top-Selling Products
top_products = df.groupby('Product_ID')['Sales_Amount'].sum().nlargest(10)
# Bar plot
plt.figure(figsize=(10, 5))
top_products.plot(kind='bar', color='green')
plt.xlabel('Product ID')
plt.show()
Insights
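# Scatter plot: discount applied (x-axis) vs. sales amount (y-axis)
plt.figure(figsize=(10, 5))
plt.scatter(df['Discount_Applied'], df['Sales_Amount'])
plt.show()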
Insights
customer_sales = df.groupby('Customer_Type')['Sales_Amount'].sum()
# Pie chart
plt.figure(figsize=(7, 7))
customer_sales.plot(kind='pie', autopct='%1.1f%%')
plt.ylabel('')
plt.show()
OUTPUT:
1. Data Preview (df.head())
2. Missing Values (df.isnull().sum())
Date 0
Product_ID 0
Category 0
Region 0
Sales_Amount 0
Units_Sold 0
Discount_Applied 5
Customer_Type 0
dtype: int64
There are 5 missing values in the Discount_Applied column, which are filled
with 0%.
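A minimal sketch of that fill step is shown below, assuming a missing discount means no discount was applied. The three-row frame named sample is hypothetical and stands in for the full dataset.

```python
import pandas as pd

# Hypothetical slice of the dataset with one missing discount
sample = pd.DataFrame({
    "Product_ID": ["P101", "P202", "P303"],
    "Discount_Applied": [10.0, None, 5.0],
})

# Count missing discounts before the fill
missing_before = int(sample["Discount_Applied"].isnull().sum())

# Treat a missing discount as 0 (assumption: no discount was applied)
sample["Discount_Applied"] = sample["Discount_Applied"].fillna(0)

# Confirm nothing is missing afterwards
missing_after = int(sample["Discount_Applied"].isnull().sum())
print(missing_before, missing_after)
```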
3. Summary Statistics (df.describe())
Insights:
Region    Sales_Amount
----------------------------------
North     50,000
South     65,000
East      40,000
West      55,000
6. Best-Selling Products
Product_ID    Sales_Amount
----------------------------------
P303          12,000
P404          10,500
P101          9,750
P202          8,900
P505          8,200
A scatter plot with sales amount on the y-axis and discount applied on the
x-axis.
Insights:
Example data:
------------------------------------
A pie chart shows that new customers account for the larger share of sales
(60%).
9. Business Recommendations
A dip in sales was observed during specific periods, which may be due to
off-season effects or low customer demand.
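One way to locate such dips is to compute the month-over-month percentage change and flag large drops. The sketch below uses made-up monthly totals, and the -10% threshold is an arbitrary assumption chosen only for illustration.

```python
import pandas as pd

# Hypothetical monthly sales totals (not taken from the dataset)
monthly_sales = pd.Series(
    [20000, 22000, 15000, 21000],
    index=pd.period_range("2024-01", periods=4, freq="M"),
    name="Sales_Amount",
)

# Month-over-month change in percent; the first month has no predecessor
mom_change = monthly_sales.pct_change() * 100

# Flag months where sales fell by more than 10% (assumed threshold)
dips = mom_change[mom_change < -10]
print(dips)
```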
The South region recorded the highest sales, while the East region had the
lowest performance.
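The regional ranking can be checked directly from the regional totals reported in the output section. The sketch below re-enters those four figures by hand and also computes each region's share of total sales; the variable names are illustrative.

```python
import pandas as pd

# Regional totals as reported in the output section
regional_sales = pd.Series(
    {"North": 50000, "South": 65000, "East": 40000, "West": 55000},
    name="Sales_Amount",
)

# Share of total sales per region, in percent
share = (regional_sales / regional_sales.sum() * 100).round(1)

# Highest- and lowest-performing regions
best = regional_sales.idxmax()
worst = regional_sales.idxmin()
print(best, worst)
print(share)
```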
3. Best-Selling Products
Moderate discounts (5-15%) had a positive effect on sales, but very high
discounts (>25%) did not significantly increase sales.
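A comparison like this can be made by grouping transactions into discount bands and averaging sales per band. The sketch below is illustrative only: the data is invented, and the band edges (0-5%, 5-15%, 15-25%, >25%) are assumptions chosen to match the ranges mentioned above.

```python
import pandas as pd

# Hypothetical transactions; values are illustrative only
sales = pd.DataFrame({
    "Discount_Applied": [0, 8, 10, 12, 20, 30, 35],
    "Sales_Amount": [100, 150, 160, 155, 140, 120, 125],
})

# Assumed discount bands in percent; intervals are closed on the right
bins = [0, 5, 15, 25, 100]
labels = ["0-5%", "5-15%", "15-25%", ">25%"]
sales["Discount_Band"] = pd.cut(
    sales["Discount_Applied"], bins=bins, labels=labels, include_lowest=True
)

# Average sales amount per discount band
band_avg = sales.groupby("Discount_Band", observed=True)["Sales_Amount"].mean()
print(band_avg)
```

If the average for the highest band is no larger than for the moderate band, deeper discounts are not buying additional sales in that sample.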