
Naive Bayes Algorithm Code Explanation

The document describes the generation and visualization of 2D data points with cluster labels using Gaussian Naive Bayes. It explains the creation of noisy blobs of points, the scaling and shifting of random points, and the prediction of class labels for these points. Additionally, it details the plotting of both training data and predicted labels, including the use of color maps and marker sizes to represent class distributions.

Uploaded by

ghugekrish824

 X → shape (300, 2) → the coordinates of the points in 2D (each row is one point).

 y → shape (300,) → the cluster labels (0 or 1, since centers=2).

 cluster_std=1.5 makes the blobs fuzzier, so they overlap more (less separable).

 The scatter plot uses the first feature (X[:, 0]) on the x-axis and the second (X[:, 1]) on the y-axis.

 Colors (c=y) correspond to the cluster labels — two shades from cmap='summer', a green-to-yellow colormap.

 Marker size = 50 (s=50).

 The ; at the end of the plotting line just suppresses the extra text output in Jupyter notebooks.

👉 The result: two noisy blobs of points in 2D, each colored differently.
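The steps above can be condensed into one sketch. The variable names X and y follow the text; the exact random_state value is an assumption, since the original code is not shown:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripted runs
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

# 300 points in 2D, two overlapping clusters (random_state=2 is assumed)
X, y = make_blobs(n_samples=300, centers=2, cluster_std=1.5, random_state=2)

# first feature on x-axis, second on y-axis; color by cluster label
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='summer');
```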


Creates a random number generator with a fixed seed (0) → ensures reproducibility.

 rng.rand(2000, 2) → generates 2000 random points in the unit square [0, 1) × [0, 1).
 Multiplying by [14, 18] → scales the x-coordinates by 14 and the y-coordinates by 18.
 Adding [-6, -14] → shifts the points left and down.

👉 So, Xnew is a (2000, 2) array of points sampled in a rectangular region spanning:

 x range: [-6, 8]
 y range: [-14, 4]
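A minimal sketch of this step, assuming the rng and Xnew names from the text:

```python
import numpy as np

rng = np.random.RandomState(0)                    # fixed seed → reproducible
# scale the unit square by [14, 18], then shift by [-6, -14]
Xnew = [-6, -14] + [14, 18] * rng.rand(2000, 2)   # x in [-6, 8), y in [-14, 4)
```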

 Uses your trained GaussianNB model (model_GNB) to predict the class label (0 or 1) for each of these 2000 points.

 ynew.shape = (2000,).
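The text assumes model_GNB was already fitted; a self-contained sketch (the blob parameters and random_state=2 are assumptions) looks like:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.naive_bayes import GaussianNB

# recreate the training data and fit the model (assumed setup)
X, y = make_blobs(n_samples=300, centers=2, cluster_std=1.5, random_state=2)
model_GNB = GaussianNB().fit(X, y)

# the random test points from the previous step
rng = np.random.RandomState(0)
Xnew = [-6, -14] + [14, 18] * rng.rand(2000, 2)

ynew = model_GNB.predict(Xnew)  # one label (0 or 1) per test point
```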

 Plots the training data (X) in 2D.

 Colors (c=y) represent their true class labels.

 Marker size = 50.

 Colormap = 'summer'.

 Stores the current axis limits (xmin, xmax, ymin, ymax).

 Plots the predicted labels (ynew) for the random test points (Xnew).

 Uses the same colormap (summer) so colors match class labels.


 Markers are smaller (s=20) and transparent (alpha=0.1).

 This gives a faded background color effect, showing the regions where the model predicts each class.

 Restores the axis limits from before plotting Xnew.

 Prevents matplotlib from autoscaling to include the farthest Xnew points.

 Keeps the focus on the region around your training data.
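The save-limits / overlay / restore-limits pattern described above can be sketched as follows (the setup code is an assumed reconstruction, as before):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripted runs
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.naive_bayes import GaussianNB

# assumed setup: training data, fitted model, random test points
X, y = make_blobs(n_samples=300, centers=2, cluster_std=1.5, random_state=2)
model_GNB = GaussianNB().fit(X, y)
rng = np.random.RandomState(0)
Xnew = [-6, -14] + [14, 18] * rng.rand(2000, 2)
ynew = model_GNB.predict(Xnew)

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='summer')
lim = plt.axis()  # remember the limits around the training data
plt.scatter(Xnew[:, 0], Xnew[:, 1], c=ynew, s=20, cmap='summer', alpha=0.1)
plt.axis(lim);    # undo the autoscaling triggered by the wider Xnew points
```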

 Returns an array of shape (2000, n_classes) since you passed Xnew with 2000 samples.

 Each row contains the probability distribution over the classes for that sample.

 In your case with 2 classes, yprob.shape = (2000, 2).

 Each row sums to 1.0.

 Takes the last 10 rows (the last 10 test points).

 Rounds each probability to 3 decimals for readability.
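Putting the probability step together (again with the assumed reconstruction of the earlier setup):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.naive_bayes import GaussianNB

# assumed setup from the earlier steps
X, y = make_blobs(n_samples=300, centers=2, cluster_std=1.5, random_state=2)
model_GNB = GaussianNB().fit(X, y)
rng = np.random.RandomState(0)
Xnew = [-6, -14] + [14, 18] * rng.rand(2000, 2)

yprob = model_GNB.predict_proba(Xnew)  # per-class probabilities, one row per point
last10 = yprob[-10:].round(3)          # last 10 test points, 3-decimal display
```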
