Clustering

This presentation provides an overview of clustering, including its definition and various types such as partitioning, hierarchical, density-based, and model-based clustering. It details specific algorithms like K-means, Fuzzy C-Means, Agglomerative, and DBSCAN, explaining their processes, advantages, and disadvantages. The document emphasizes the importance of clustering in detecting patterns and organizing data points into similar groups.

Uploaded by

Sumita Gupta

THIS PRESENTATION IS ABOUT

• Introduction to Clustering
• Types of Clustering
• Partitioning-based Clustering
    • K-means Algorithm
• Fuzzy Clustering
    • Fuzzy C-Means Algorithm
• Hierarchical Clustering
    • Agglomerative Algorithm
• Density-based Clustering
    • DBSCAN Algorithm
• Model-based Clustering


CLUSTERING

CLUSTERING: INTRODUCTION
Clustering is the task of dividing a population or set of data
points into groups such that the data points in the same group
are more similar to one another than to those in other groups.
The aim is to segregate groups with similar traits and assign them
to clusters.
• Unsupervised learning: requires data, but no labels.
• Detects patterns, for example:
    • Groups of emails or search results
    • Customer shopping patterns
    • Regions of images
TYPES OF CLUSTERING

CLUSTERING: TYPES
• Partitioning methods:
It is simply a division of the set of data objects into
non-overlapping clusters such that each object is in exactly one
subset. Example: K-means

• Hierarchical clustering:
Also known as 'nested clustering', as it allows clusters to exist
within bigger clusters to form a tree. Example:
Agglomerative Clustering
CLUSTERING: TYPES
• Density-based clustering:
This model searches the data space for regions of differing
data-point density: dense regions form clusters, separated by
sparser regions. Example: DBSCAN

• Model-based clustering:
It provides a framework for incorporating our
knowledge about a domain.
PARTITIONING CLUSTERING

PARTITION BASED CLUSTERING
EXAMPLE: K-MEANS
K-MEANS CLUSTERING
An iterative clustering algorithm
• Partition-based clustering
• Each cluster is associated with a centroid
• Each point is assigned to the cluster with the closest
centroid
• The number of clusters, K, must be specified.
K-MEANS CLUSTERING
1. Initial centroids are often chosen randomly.
    • Clusters produced vary from one run to another.
2. The centroid is (typically) the mean of the points in the
cluster.
3. "Closeness" is measured by Euclidean distance, cosine
similarity, correlation, etc.
4. K-means will converge for the common similarity measures
mentioned above.
5. Most of the convergence happens in the first few
iterations.
    • Often the stopping condition is changed to "Until relatively
few points change clusters".
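The points above can be put together in a short sketch. This is a minimal illustration in plain Python, not the presenter's code: the function name `kmeans`, its parameters (`max_iter`, `min_changes`, `seed`), and the choice of Euclidean distance are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, min_changes=0, seed=None):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Returns (centroids, labels). Stops once at most `min_changes`
    points switch clusters in a pass, or after `max_iter` passes.
    """
    rng = random.Random(seed)
    # Point 1: pick k data points at random as the initial centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)

    for _ in range(max_iter):
        # Assignment: each point joins the cluster with the closest centroid
        # (point 3, using Euclidean distance here).
        changes = 0
        for i, p in enumerate(points):
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            if nearest != labels[i]:
                labels[i] = nearest
                changes += 1

        # Update: each centroid becomes the mean of its points (point 2).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))

        # Point 5: stop when (relatively) few points changed clusters.
        if changes <= min_changes:
            break

    return centroids, labels


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
    print(kmeans(data, k=2, seed=0))
```

With `min_changes=0` the loop runs until no point changes cluster; a small positive value gives the relaxed stopping rule from point 5. Because the initial centroids are random, different `seed` values can produce different clusters.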
K-MEANS CLUSTERING: EXAMPLE
[Figures: K-means clustering example]
K-MEANS: ADVANTAGES
• Relatively simple to implement.
• Scales to large data sets.
• Guarantees convergence (to a local optimum, not necessarily the global one).
• Can warm-start the positions of centroids.
• Easily adapts to new examples.
• Can be generalized to clusters of different
shapes and sizes, such as elliptical clusters.
K-MEANS: DISADVANTAGES
• Choosing k manually.
• Being dependent on initial values.
For a low k, you can mitigate this dependence by running k-means
several times with different initial values and picking the best result
(for example, the run with the lowest total squared distance from
points to their centroids).
• Clustering data of varying sizes and density.
k-means has trouble clustering data where clusters are of varying sizes
and density.
• Clustering outliers.
Centroids can be dragged by outliers, or outliers might get their own
cluster instead of being ignored. Consider removing or clipping outliers
before clustering.
• Scaling with number of dimensions.
As the number of dimensions increases, a distance-based similarity
measure converges to a constant value between any given examples,
so distances become less effective at telling examples apart.
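The last point can be checked numerically. The sketch below is an illustration under assumed settings (points drawn uniformly from the unit hypercube, Euclidean distance, and a hypothetical helper named `relative_contrast`): it measures how much farther the farthest point is than the nearest one, relative to the nearest.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest distance from one random
    query point to n_points random points in the unit hypercube."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


# Compare the contrast in low and high dimensions.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

The printed contrast should shrink as `dim` grows, meaning the nearest and farthest points end up at nearly the same distance from the query.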
HIERARCHICAL CLUSTERING
EXAMPLE: AGGLOMERATIVE CLUSTERING
[Figures: agglomerative clustering example]
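A minimal sketch of the bottom-up (agglomerative) idea, assuming single linkage and Euclidean distance. The linkage used in the slides is not recoverable from the text, and the name `agglomerative` and its return format are hypothetical. Every point starts as its own cluster, and the two closest clusters are merged repeatedly.

```python
import math


def agglomerative(points, n_clusters):
    """Merge clusters bottom-up until n_clusters remain.

    Returns (clusters, merges): clusters are lists of point indices,
    and merges records each step as (cluster_a, cluster_b, distance).
    """
    clusters = [[i] for i in range(len(points))]
    merges = []

    def linkage(a, b):
        # Single linkage (assumed): distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        d, ia, ib = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        merges.append((list(clusters[ia]), list(clusters[ib]), d))
        # Merge cluster ib into ia; ib > ia, so deleting ib keeps ia valid.
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]

    return clusters, merges


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
    final, history = agglomerative(data, n_clusters=2)
    print(final)
    for step in history:
        print(step)
```

The `merges` list records the order and distance of each merge, which is the information a dendrogram (the tree of nested clusters) is drawn from.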
DENSITY BASED CLUSTERING

K-MEANS VS DENSITY BASED CLUSTERING
[Figure: K-means compared with density based clustering]

DENSITY BASED CLUSTERING
EXAMPLE: DBSCAN
DBSCAN
DBSCAN Algorithm Steps
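A minimal sketch of the standard DBSCAN procedure in plain Python, offered as an illustration rather than the presenter's exact steps. The parameter names `eps` (neighbourhood radius) and `min_pts` (minimum number of neighbours, counting the point itself, for a core point) follow common usage; the function name `dbscan` and the label convention (-1 for noise) are assumptions.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    n = len(points)

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    labels = [None] * n  # None means "not visited yet"
    cluster_id = -1

    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue

        # i is a core point: start a new cluster and grow it.
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is also a core point
                queue.extend(j_neighbors)

    return labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(dbscan(data, eps=1.5, min_pts=3))
```

Unlike K-means, the number of clusters is not given in advance: it follows from `eps` and `min_pts`, and points in sparse regions are left as noise.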
DBSCAN: EXAMPLE
[Figures: DBSCAN example]
DBSCAN: ADVANTAGES & DISADVANTAGES
