
WELCOME

to the presentation of "Group-2"

PRESENTATION TOPIC: Clustering

COURSE CODE: STAT-309
COURSE TITLE: Data Mining
COURSE TEACHER: AHSANUL HAQUE, Lecturer, Department of Statistics, University of Barishal
Our Team

SMINA AHMED
FARZANA TABASU SHORMY
SHARMISTHA BISWAS SWARNA
ESHITA AKTER MIM
MONIRUL ISLAM RONI
Clustering

Grouping a particular set of objects based on their characteristics, aggregating them according to their similarities.
Huge Dataset

Find common attributes: all data in the same group have similar attributes.

Clustering

Examine the data to form clusters.

Entities in the real world are very complex:

• Products sold on an e-commerce site
• Users of a social media platform
• Readers of an online newspaper
Defining Characteristics Using Numbers

E-commerce products:
• Ratings
• Review sentiment (1 - positive, 0 - negative)
• Category (1 - electronics, 2 - fashion, ...)
• Dimensions (size, height, weight)
• Color

Social media users:
• Score posts, comments, likes, shares
• Score every post by topic (music lovers, sports lovers)
• Activity score (100 - most active, 0 - not active at all)
• Number of connections
• % profile complete
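Once each entity is described by numbers, it becomes a feature vector, and similarity between two entities can be measured with a distance function. A minimal sketch (the products and their feature values below are hypothetical, chosen only to illustrate the idea):

```python
import math

# Hypothetical feature vectors for two e-commerce products:
# [rating, review sentiment (1 positive / 0 negative), category code, weight in kg]
product_a = [4.5, 1, 1, 0.3]  # e.g. an electronics item
product_b = [3.8, 0, 2, 0.5]  # e.g. a fashion item

def euclidean(p, q):
    """Euclidean distance between two feature vectors; smaller means more similar."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(round(euclidean(product_a, product_b), 3))  # → 1.591
```

This distance is what clustering algorithms use to decide which entities belong together.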
REAL LIFE EXAMPLES

Sports Science: professional basketball teams may collect the following information about players:
 Points per game
 Assists per game
 Steals per game

Health Insurance: an actuary may collect the following information about households:
 Total number of doctor visits per year
 Total household size
 Total number of chronic conditions per household
 Average age of household members

Email Marketing: a business may collect the following information about consumers:
 Percentage of emails opened
 Number of clicks per email
 Time spent viewing email
Basic Features:
• The number of clusters is not known.
• There may not be any a priori knowledge concerning the clusters.
• Cluster results are dynamic.
HIERARCHICAL AGGLOMERATIVE CLUSTERING (HAC)

HAC can be represented using three techniques:
• Single: nearest distance or single linkage.
• Complete: farthest distance or complete linkage.
• Average: average distance or average linkage.
Linkage Method: Merits and Demerits

Single
 Merits: can separate non-elliptical shapes as long as the gap between two clusters is not small.
 Demerits: cannot separate the clusters properly if there is noise between clusters.

Complete
 Merits: does well in separating clusters if there is noise between clusters.
 Demerits: biased towards equal variance clusters.

Average
 Merits: balances compactness and connectivity.
 Demerits: computationally intensive.
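To make the three linkage criteria concrete, here is a minimal sketch of agglomerative clustering on 1-D points: start with every point in its own cluster and repeatedly merge the closest pair of clusters under the chosen linkage. The data values are made up for illustration:

```python
def hac(points, linkage, n_clusters):
    """Agglomerative clustering sketch on 1-D points with a chosen linkage."""
    clusters = [[p] for p in points]  # start: every point is its own cluster
    dist = {
        "single":   lambda a, b: min(abs(x - y) for x in a for y in b),
        "complete": lambda a, b: max(abs(x - y) for x in a for y in b),
        "average":  lambda a, b: sum(abs(x - y) for x in a for y in b) / (len(a) * len(b)),
    }[linkage]
    while len(clusters) > n_clusters:
        # find the closest pair of clusters under the linkage and merge them
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] += clusters.pop(j)
    return clusters

print(hac([1, 2, 9, 10, 25], "single", 3))  # → [[1, 2], [9, 10], [25]]
```

In practice, libraries such as SciPy (`scipy.cluster.hierarchy.linkage`) implement these linkage methods far more efficiently.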
K-Means Clustering

• K-Means clustering is an unsupervised iterative clustering technique.
• It partitions the given data set into k predefined distinct clusters.
• A cluster is defined as a collection of data points exhibiting certain similarities.
IT PARTITIONS THE DATA SET SUCH THAT:

• Each data point belongs to the cluster with the nearest mean.
• Data points belonging to one cluster have a high degree of similarity.
• Data points belonging to different clusters have a high degree of dissimilarity.
K-Means Clustering Algorithm involves the following steps:

Step-01:
Choose the number of clusters K.

Step-02:
Randomly select any K data points as cluster centers.
 Select cluster centers in such a way that they are as far as possible from each other.

Step-03:
Calculate the distance between each data point and each cluster center.
The distance may be calculated either by using a given distance function or by using the Euclidean distance formula.

Step-04:
Assign each data point to some cluster.
A data point is assigned to the cluster whose center is nearest to that data point.

Step-05:
Re-compute the centers of the newly formed clusters.
The center of a cluster is computed by taking the mean of all the data points contained in that cluster.

Step-06:
Keep repeating the procedure from Step-03 to Step-05 until any of the following stopping criteria is met:
 Centers of newly formed clusters do not change
 Data points remain in the same cluster
 The maximum number of iterations is reached
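The steps above can be sketched in a few lines of Python. This is a minimal illustration on 1-D data with made-up values, not a production implementation:

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Minimal k-means on 1-D data, following Steps 1-6 above (a sketch)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)                 # Step-02: random initial centers
    for _ in range(max_iter):                       # Step-06: bound on iterations
        clusters = [[] for _ in range(k)]
        for p in points:                            # Steps 03-04: assign each point
            i = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[i].append(p)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]  # Step-05: recompute means
        if new_centers == centers:                  # stop: centers unchanged
            break
        centers = new_centers
    return centers, clusters

centers, clusters = kmeans([1.0, 2.0, 9.0, 10.0], k=2)
print(sorted(centers))  # → [1.5, 9.5]
```

On this toy data the algorithm converges to the two obvious group means regardless of which points are picked as initial centers; on harder data, k-means can converge to a local optimum, which is why it is often run several times with different seeds.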
Advantages of k-means:
 Relatively simple to implement.
 Scales to large data sets.
 Guarantees convergence.
 Can warm-start the positions of centroids.
 Easily adapts to new examples.
 Generalizes to clusters of different shapes and sizes, such as elliptical clusters.

Disadvantages:
• It requires specifying the number of clusters (k) in advance.
• It cannot handle noisy data and outliers.
• It is not suitable for identifying clusters with non-convex shapes.
