X → shape (300, 2) → the coordinates of the points in 2D.
y → shape (300,) → the cluster labels (0 or 1 since centers=2).
cluster_std=1.5 makes the blobs overlap more (less separable).
Plots the 2D points.
Colors (c=y) correspond to cluster labels (0 or 1).
Marker size = 50.
cmap='summer' gives a green-to-yellow colormap.
👉 The result: two noisy blobs of points in 2D, each colored differently.
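A minimal sketch of the data-generation call these notes describe, assuming the parameters mentioned above (n_samples=300, centers=2, cluster_std=1.5); the random_state value is an added assumption for reproducibility:

```python
from sklearn.datasets import make_blobs

# Assumed parameters from the description above; random_state=2 is only a guess for reproducibility.
X, y = make_blobs(n_samples=300, centers=2, cluster_std=1.5, random_state=2)

print(X.shape)  # (300, 2) -> each row is one 2D point
print(y.shape)  # (300,)   -> cluster label (0 or 1) for each point
```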
X.shape = (300, 2) → each row is a point in 2D.
y.shape = (300,) → labels (0 or 1 for each point).
cluster_std=1.5 makes the blobs fuzzier so they overlap more.
Plots points using the first feature (X[:,0]) on the x-axis and the second (X[:,1]) on the y-axis.
Colors (c=y) show class labels (two shades from the summer colormap).
Marker size = 50.
The trailing ; just suppresses the extra text output in Jupyter notebooks.
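A sketch of that scatter call, assuming the same X and y from above and the parameters listed (s=50, cmap='summer'):

```python
import matplotlib.pyplot as plt

# First feature on the x-axis, second on the y-axis, colored by class label.
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='summer');
# The trailing ; suppresses the printed repr of the returned artist in a notebook.
```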
Creates a random number generator with fixed seed (0) → ensures reproducibility.
rng.rand(2000, 2) → generates 2000 random points in the unit square [0,1) × [0,1).
Multiplying by [14, 18] → scales the x-coordinates by 14 and y-coordinates by 18.
Adding [-6, -14] → shifts the points left and down.
👉 So, Xnew is a (2000, 2) array of points sampled in a rectangular region spanning:
x range: [-6, 8)
y range: [-14, 4)
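A minimal sketch of that sampling step, using NumPy's legacy RandomState interface as described:

```python
import numpy as np

rng = np.random.RandomState(0)                    # fixed seed -> reproducible samples
Xnew = [-6, -14] + [14, 18] * rng.rand(2000, 2)   # scale the unit square, then shift it

print(Xnew.shape)        # (2000, 2)
print(Xnew.min(axis=0))  # close to [-6, -14]
print(Xnew.max(axis=0))  # just under [8, 4] (rand() excludes 1.0)
```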
Uses your trained GaussianNB model (model_GNB) to predict the class label (0 or 1) for each of these 2000 points.
ynew.shape = (2000,).
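A sketch of the prediction step; the fit call is included only as an assumption so the snippet is self-contained (the notes assume model_GNB was already trained):

```python
from sklearn.naive_bayes import GaussianNB

model_GNB = GaussianNB().fit(X, y)   # assumed to have happened earlier in the notebook

ynew = model_GNB.predict(Xnew)       # predicted class (0 or 1) for each of the 2000 points
print(ynew.shape)                    # (2000,)
```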
Plots the training data (X) in 2D.
Colors (c=y) represent their true class labels.
Marker size = 50.
Colormap = 'summer'.
Stores the current axis limits (xmin, xmax, ymin, ymax).
Plots the predicted labels (ynew) for the random test points (Xnew).
Uses the same colormap (summer) so colors match class labels.
Markers are smaller (s=20) and transparent (alpha=0.1).
This gives a faded background color effect, showing the regions where the model predicts each class.
Restores the axis limits from before plotting Xnew.
Prevents matplotlib from autoscaling to include the farthest Xnew points.
Keeps the focus on the region around your training data.
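Putting those plotting steps together, a sketch of the overlay described above (same colormap and marker sizes assumed):

```python
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='summer')   # training points, true labels
lim = plt.axis()                                          # remember (xmin, xmax, ymin, ymax)
plt.scatter(Xnew[:, 0], Xnew[:, 1], c=ynew, s=20,
            cmap='summer', alpha=0.1)                     # faded predicted labels as background
plt.axis(lim);                                            # restore limits around the training data
```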
Returns an array of shape (2000, n_classes) since you passed Xnew with 2000 samples.
Each row contains the probability distribution over the classes for that sample.
In your case with 2 classes (from centers=2), yprob.shape = (2000, 2).
Each row sums to 1.0.
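A sketch of the probability call, assuming the same fitted model_GNB:

```python
yprob = model_GNB.predict_proba(Xnew)

print(yprob.shape)         # (2000, 2): one column of probabilities per class
print(yprob.sum(axis=1))   # each row sums to 1.0
```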
Takes the last 10 rows (the last 10 test points).
Rounds each probability to 3 decimals for readability.
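For example, inspecting the last ten points:

```python
print(yprob[-10:].round(3))   # last 10 rows, rounded to 3 decimal places
```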