K Means Clustering

This notebook segments mall customers with k-means clustering. It loads the customer data, checks for missing values, encodes the Genre column numerically, fits k-means for k = 1 to 9, and compares the sum of squared distances (SSD) across k. The SSD falls by about 60% from k = 1 to k = 2, about 30% from k = 2 to k = 3, about 28% from k = 3 to k = 4, and about 19% from k = 4 to k = 5, so most of the improvement comes from the first few clusters. The scatter plots at the end are coloured by a 5-cluster fit.

Uploaded by

Walid Sassi


9/1/23, 2:11 PM Mall_kmean - Jupyter Notebook

In [1]:

import pandas as pd

In [5]:

ml = pd.read_csv("mall_kmeans.csv")

In [6]:

ml.head()

Out[6]:

   CustomerID   Genre  Age  Annual Income (k$)  Spending Score (1-100)
0           1    Male   19                  15                      39
1           2    Male   21                  15                      81
2           3  Female   20                  16                       6
3           4  Female   23                  16                      77
4           5  Female   31                  17                      40

In [8]:

ml.isnull().sum()

Out[8]:

CustomerID 0
Genre 0
Age 0
Annual Income (k$) 0
Spending Score (1-100) 0
dtype: int64

In [9]:

ml.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   CustomerID              200 non-null    int64
 1   Genre                   200 non-null    object
 2   Age                     200 non-null    int64
 3   Annual Income (k$)      200 non-null    int64
 4   Spending Score (1-100)  200 non-null    int64
dtypes: int64(4), object(1)
memory usage: 7.9+ KB

localhost:8888/notebooks/Desktop/ML/Mall_kmeans/Mall_kmean.ipynb 1/10

In [10]:

ml.Genre.value_counts()

Out[10]:

Female 112
Male 88
Name: Genre, dtype: int64

In [11]:

# Encode Genre as 0/1; assigning the result back avoids an in-place edit through attribute access
ml['Genre'] = ml['Genre'].map({'Female': 0, 'Male': 1})

In [14]:

ml.select_dtypes(include='object').columns

Out[14]:

Index([], dtype='object')

In [15]:

from sklearn.cluster import KMeans

In [111]:

kmeans_ml = KMeans(n_clusters=5)

In [112]:

kmeans_ml.fit(ml)

Out[112]:

KMeans(n_clusters=5)
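
The fit above uses every column of `ml`, including `CustomerID`, and sets no random seed, so the labels can change from run to run. The following is only a sketch of one alternative, assuming the identifier should not influence the clusters and that the remaining features should contribute on a comparable scale. The toy values are hypothetical, not the notebook's data.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame reusing the notebook's column names (values are made up).
toy = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5, 6],
    "Age": [19, 21, 20, 45, 50, 48],
    "Annual Income (k$)": [15, 15, 16, 80, 85, 90],
    "Spending Score (1-100)": [39, 81, 6, 20, 15, 25],
})

# Drop the identifier so that row order does not drive the distances.
features = toy.drop(columns=["CustomerID"])

# Standardise each column to zero mean and unit variance (an assumption, not the notebook's choice).
scaled = StandardScaler().fit_transform(features)

# Fixing random_state makes the labels reproducible across runs.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(scaled)
print(labels)
```

Whether to scale, and which columns to keep, depends on what the segments are meant to capture; the notebook's own results below come from the unscaled five-column fit.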

In [113]:

kmeans_ml.labels_

Out[113]:

array([2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4,
2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4,
2, 4, 2, 2, 2, 2, 2, 4, 2, 2, 2, 2, 2, 2, 0, 2, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 3, 0, 3, 1, 3, 1, 3,
1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3,
1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3,
1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3,
1, 3])


In [114]:

set(kmeans_ml.labels_)

Out[114]:

{0, 1, 2, 3, 4}

In [115]:

kmeans_ml.cluster_centers_

Out[115]:

array([[ 92.53030303,   0.42424242,  42.72727273,  57.75757576,
         49.46969697],
       [164.        ,   0.52777778,  40.80555556,  87.91666667,
         17.88888889],
       [ 33.34285714,   0.37142857,  45.31428571,  31.8       ,
         30.31428571],
       [162.        ,   0.46153846,  32.69230769,  86.53846154,
         82.12820513],
       [ 25.16666667,   0.41666667,  25.83333333,  26.95833333,
         77.79166667]])

In [116]:

len(kmeans_ml.cluster_centers_)

Out[116]:

5

In [117]:

centroid_df = pd.DataFrame(kmeans_ml.cluster_centers_)

In [118]:

centroid_df.columns = ml.columns

In [119]:

centroid_df

Out[119]:

   CustomerID     Genre        Age  Annual Income (k$)  Spending Score (1-100)
0   92.530303  0.424242  42.727273           57.757576               49.469697
1  164.000000  0.527778  40.805556           87.916667               17.888889
2   33.342857  0.371429  45.314286           31.800000               30.314286
3  162.000000  0.461538  32.692308           86.538462               82.128205
4   25.166667  0.416667  25.833333           26.958333               77.791667


In [120]:

kmeans_ml.score(ml)

Out[120]:

-157141.33959373957
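
The score is negative because `KMeans.score` returns the opposite of the k-means objective, that is, minus the sum of squared distances from each point to its nearest centroid. A small sketch on made-up points (the array `X` is an assumption for illustration) checks this against a manual computation and against the fitted `inertia_` attribute:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D points, used only to illustrate the relationship.
X = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0],
              [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Centroid assigned to each point, then the sum of squared distances (SSD).
nearest = model.cluster_centers_[model.labels_]
ssd = float(((X - nearest) ** 2).sum())

score = model.score(X)

print(np.isclose(ssd, model.inertia_))  # SSD matches inertia_ on the training data
print(np.isclose(score, -ssd))          # score is the negated SSD
```

On the training data, `abs(model.score(X))` and `model.inertia_` therefore carry the same information.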

In [94]:

lst = []
for k in range(1,10):
    kmeans_ml = KMeans(n_clusters=k)
    kmeans_ml.fit(ml)
    # score() is the negative sum of squared distances (SSD) for this k
    score = kmeans_ml.score(ml)
    lst.append(score)
    print("cluster over are",k, "cluster left are",len(range(1,10))-k)
    print("____________________")

C:\Users\MR.GODHADE\anaconda3\lib\site-packages\sklearn\cluster\_kmeans.py:1036: UserWarning: KMeans is known to have a memory leak on Windows with MKL, when there are less chunks than available threads. You can avoid it by setting the environment variable OMP_NUM_THREADS=1.
  warnings.warn(

cluster over are 1 cluster left are 8

____________________
cluster over are 2 cluster left are 7
____________________
cluster over are 3 cluster left are 6
____________________
cluster over are 4 cluster left are 5
____________________
cluster over are 5 cluster left are 4
____________________
cluster over are 6 cluster left are 3
____________________
cluster over are 7 cluster left are 2
____________________
cluster over are 8 cluster left are 1
____________________
cluster over are 9 cluster left are 0
____________________
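
The same sweep over k can be written without the sign flip by reading `inertia_`, and with a fixed seed so that repeated runs give the same curve. This is a sketch on synthetic data from `make_blobs` (a stand-in chosen for illustration, not the mall data set):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the customer table (an assumption for illustration).
X, _ = make_blobs(n_samples=200, centers=4, n_features=3, random_state=42)

ssd = []
for k in range(1, 10):
    # random_state fixes the initialisation; n_init=10 keeps the best of ten starts.
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    ssd.append(model.inertia_)  # SSD on the training data, already positive

ssd = np.array(ssd)
print(np.round(ssd))
```

Without a fixed `random_state`, the SSD for a given k can differ slightly between runs, which matters when values are copied by hand into later cells.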

In [121]:

import numpy as np

In [122]:

# score() values are negative, so the absolute value gives the SSD per k
lst = np.round(np.abs(lst))

In [123]:

cluster_num = list(range(1,10))


In [124]:

import matplotlib.pyplot as plt

In [125]:

plt.plot(cluster_num,lst, marker ="*")
plt.grid()

In [126]:

lst

Out[126]:

array([975512., 387066., 271385., 195401., 157621., 122608., 103233.,
        86004.,  77299.])
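
The relative drop between consecutive values of k can be computed for the whole list at once rather than one expression at a time. A minimal sketch using the SSD values printed in Out[126] (the names `ssd` and `drops` are introduced here for illustration):

```python
# SSD values for k = 1..9, copied from Out[126] above.
ssd = [975512, 387066, 271385, 195401, 157621, 122608, 103233, 86004, 77299]

# Percentage drop when moving from k to k + 1, relative to the SSD at k.
drops = [(prev - cur) * 100 / prev for prev, cur in zip(ssd, ssd[1:])]

for k, d in enumerate(drops, start=1):
    print(f"k={k} -> k={k + 1}: {d:.1f}% drop")
```

With these values the first drop is about 60% and the second about 30%, after which the reductions are smaller.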

In [127]:

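# Some SSD values typed here differ slightly from Out[126] (e.g. 271397 vs 271385), presumably from a different run.
(975512 - 387066)*100/975512  # ~60% drop in SSD when k changes from 1 to 2
(387066 - 271397)*100/387066  # ~30% drop in SSD when k changes from 2 to 3
(271397 - 195401)*100/271397  # ~28% drop in SSD when k changes from 3 to 4
(195401 - 157506)*100/195401  # ~19% drop in SSD when k changes from 4 to 5
# Only the last expression is displayed. Its denominator is the k=4 SSD (195401),
# so the ~18% is relative to k=4; relative to the k=5 SSD (157506) the drop is ~22%.
(157506 - 122630)*100/195401  # drop in SSD when k changes from 5 to 6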

Out[127]:

17.848424521880645


In [128]:

(387066 - 271397)*100/387066

Out[128]:

29.88353407429224

In [129]:

(271397 - 195401)*100/271397

Out[129]:

28.001783365328283

In [130]:

(195401 - 157506)*100/195401

Out[130]:

19.393452438830916

In [131]:

colormap = np.array(['Red','Green','Blue','Yellow','Black'])

In [140]:

kmeans_ml.labels_

Out[140]:


In [139]:

colormap[kmeans_ml.labels_]

Out[139]:

array(['Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black',
'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black',
'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black',
'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black',
'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black',
'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Black', 'Blue', 'Blue',
'Blue', 'Blue', 'Blue', 'Black', 'Blue', 'Blue', 'Blue', 'Blue',
'Blue', 'Blue', 'Red', 'Blue', 'Red', 'Red', 'Red', 'Red', 'Red',
'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red',
'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red',
'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red',
'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red',
'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red',
'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red', 'Red',
'Red', 'Red', 'Red', 'Red', 'Yellow', 'Red', 'Yellow', 'Red',
'Yellow', 'Green', 'Yellow', 'Green', 'Yellow', 'Green', 'Yellow',
'Green', 'Yellow', 'Green', 'Yellow', 'Green', 'Yellow', 'Green',
'Yellow', 'Green', 'Yellow', 'Green', 'Yellow', 'Green', 'Yellow',
'Green', 'Yellow', 'Green', 'Yellow', 'Green', 'Yellow', 'Green',
'Yellow', 'Green', 'Yellow', 'Green', 'Yellow', 'Green', 'Yellow',
'Green', 'Yellow', 'Green', 'Yellow', 'Green', 'Yellow', 'Green',
'Yellow', 'Green', 'Yellow', 'Green', 'Yellow', 'Green', 'Yellow',
'Green', 'Yellow', 'Green', 'Yellow', 'Green', 'Yellow', 'Green',
'Yellow', 'Green', 'Yellow', 'Green', 'Yellow', 'Green', 'Yellow',
'Green', 'Yellow', 'Green', 'Yellow', 'Green', 'Yellow', 'Green',
'Yellow', 'Green', 'Yellow'], dtype='<U6')
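
The lookup above relies on NumPy integer-array indexing: indexing `colormap` with an array of labels returns one colour name per label, in the same order. A tiny sketch with hypothetical labels:

```python
import numpy as np

colormap = np.array(['Red', 'Green', 'Blue', 'Yellow', 'Black'])

# Hypothetical cluster labels; each value selects the colour at that position.
labels = np.array([2, 4, 2, 0, 3, 1])

colors = colormap[labels]
print(colors.tolist())  # ['Blue', 'Black', 'Blue', 'Red', 'Yellow', 'Green']
```

The colour array needs at least as many entries as there are clusters, which is why five colours are defined for the 5-cluster fit.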

In [133]:

plt.scatter(ml['Age'],ml['Annual Income (k$)'], c = colormap[kmeans_ml.labels_])

Out[133]:

<matplotlib.collections.PathCollection at 0x1c7ec6cefa0>


In [134]:

ml

Out[134]:

     CustomerID  Genre  Age  Annual Income (k$)  Spending Score (1-100)
0             1      1   19                  15                      39
1             2      1   21                  15                      81
2             3      0   20                  16                       6
3             4      0   23                  16                      77
4             5      0   31                  17                      40
...         ...    ...  ...                 ...                     ...
195         196      0   35                 120                      79
196         197      0   45                 126                      28
197         198      1   32                 126                      74
198         199      1   32                 137                      18
199         200      1   30                 137                      83

200 rows × 5 columns


In [136]:

plt.scatter(ml['Age'],ml['Spending Score (1-100)'], c = colormap[kmeans_ml.labels_])
plt.xlabel('Age')
plt.ylabel('Spending Score')

Out[136]:

Text(0, 0.5, 'Spending Score')


In [137]:

plt.scatter(ml['Annual Income (k$)'],ml['Spending Score (1-100)'], c = colormap[kmea
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score')

Out[137]:

Text(0, 0.5, 'Spending Score')
