@@ -683,41 +683,43 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):
683683
684684 Parameters
685685 ----------
686- n_clusters : int or None, optional (default=2)
686+ n_clusters : int or None, default=2
687687 The number of clusters to find. It must be ``None`` if
688688 ``distance_threshold`` is not ``None``.
689689
690- affinity : string or callable, default: "euclidean"
690+ affinity : str or callable, default='euclidean'
691691 Metric used to compute the linkage. Can be "euclidean", "l1", "l2",
692692 "manhattan", "cosine", or "precomputed".
693693 If linkage is "ward", only "euclidean" is accepted.
694694 If "precomputed", a distance matrix (instead of a similarity matrix)
695695 is needed as input for the fit method.
696696
697- memory : None, str or object with the joblib.Memory interface, optional
697+ memory : str or object with the joblib.Memory interface, default=None
698698 Used to cache the output of the computation of the tree.
699699 By default, no caching is done. If a string is given, it is the
700700 path to the caching directory.
701701
702- connectivity : array-like or callable, optional
702+ connectivity : array-like or callable, default=None
703703 Connectivity matrix. Defines for each sample the neighboring
704704 samples following a given structure of the data.
705705 This can be a connectivity matrix itself or a callable that transforms
706706 the data into a connectivity matrix, such as derived from
707707 kneighbors_graph. Default is None, i.e., the
708708 hierarchical clustering algorithm is unstructured.
709709
710- compute_full_tree : bool or 'auto' (optional)
711- Stop early the construction of the tree at n_clusters. This is
712- useful to decrease computation time if the number of clusters is
713- not small compared to the number of samples. This option is
714- useful only when specifying a connectivity matrix. Note also that
715- when varying the number of clusters and using caching, it may
716- be advantageous to compute the full tree. It must be ``True`` if
717- ``distance_threshold`` is not ``None``.
718-
719- linkage : {"ward", "complete", "average", "single"}, optional \
720- (default="ward")
710+ compute_full_tree : 'auto' or bool, default='auto'
711+ Stop the construction of the tree early at n_clusters. This is useful
712+ to decrease computation time if the number of clusters is not small
713+ compared to the number of samples. This option is useful only when
714+ specifying a connectivity matrix. Note also that when varying the
715+ number of clusters and using caching, it may be advantageous to compute
716+ the full tree. It must be ``True`` if ``distance_threshold`` is not
717+ ``None``. By default `compute_full_tree` is "auto", which is equivalent
718+ to `True` when `distance_threshold` is not `None` or when `n_clusters`
719+ is less than the maximum of 100 and `0.02 * n_samples`. Otherwise,
720+ "auto" is equivalent to `False`.
721+
722+ linkage : {"ward", "complete", "average", "single"}, default="ward"
721723 Which linkage criterion to use. The linkage criterion determines which
722724 distance to use between sets of observations. The algorithm will merge
723725 the pairs of clusters that minimize this criterion.
@@ -730,7 +732,7 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):
730732 - single uses the minimum of the distances between all observations
731733 of the two sets.
732734
733- distance_threshold : float, optional (default=None)
735+ distance_threshold : float, default=None
734736 The linkage distance threshold above which clusters will not be
735737 merged. If not ``None``, ``n_clusters`` must be ``None`` and
736738 ``compute_full_tree`` must be ``True``.
@@ -744,7 +746,7 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):
744746 ``distance_threshold=None``, it will be equal to the given
745747 ``n_clusters``.
746748
747- labels_ : array [n_samples]
749+ labels_ : ndarray of shape (n_samples)
748750 cluster labels for each point
749751
750752 n_leaves_ : int
@@ -753,7 +755,7 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):
753755 n_connected_components_ : int
754756 The estimated number of connected components in the graph.
755757
756- children_ : array-like, shape (n_samples-1, 2)
758+ children_ : array-like of shape (n_samples-1, 2)
757759 The children of each non-leaf node. Values less than `n_samples`
758760 correspond to leaves of the tree which are the original samples.
759761 A node `i` greater than or equal to `n_samples` is a non-leaf
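
The new `compute_full_tree` text describes how "auto" resolves to a boolean. A minimal sketch of that rule follows, assuming the threshold is the larger of 100 and `0.02 * n_samples`. The helper name `resolve_compute_full_tree` is hypothetical and only illustrates the documented behaviour; it is not part of the estimator's API.

```python
def resolve_compute_full_tree(compute_full_tree, n_clusters,
                              distance_threshold, n_samples):
    """Hypothetical helper: resolve 'auto' as the docstring describes."""
    if compute_full_tree != "auto":
        # An explicit bool is used as given, but the docstring requires
        # True whenever distance_threshold is set.
        if distance_threshold is not None and not compute_full_tree:
            raise ValueError(
                "compute_full_tree must be True if distance_threshold is set"
            )
        return bool(compute_full_tree)

    if distance_threshold is not None:
        # A distance threshold always needs the full tree.
        return True

    # Assumed reading of the rule: build the full tree when n_clusters is
    # below the larger of 100 and 2% of the samples.
    return n_clusters < max(100, 0.02 * n_samples)


if __name__ == "__main__":
    print(resolve_compute_full_tree("auto", 2, None, 1000))
    print(resolve_compute_full_tree("auto", 150, None, 1000))
    print(resolve_compute_full_tree("auto", None, 0.5, 1000))
```

Under this reading, early stopping (`False`) is only chosen when many clusters are requested relative to the sample count, which is the case where the docstring says it saves computation time.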