Table 1

Overview of the clustering methods evaluated in this study

| Method | Environment and availability | Short description | Reference |
| --- | --- | --- | --- |
| kmeans | R package (stats) | Iterative partitioning algorithm; used with or without good initial values (centroids). | [10] |
| cmeans | R package (e1071) | Iterative soft partitioning algorithm; used with or without good initial values. | [11] |
| DBSCAN | R package (dbscan) | Density-based clustering algorithm; the minimum number of neighbours and the radius have to be pre-specified. | [12] |
| dpcp | R code from GitHub; R shiny app | Two-step approach; DBSCAN is used to identify the locations of first-order clusters, then cmeans is applied. | [13] |
| flowSOM | R package from Bioconductor | Self-organizing map to find winning nodes, followed by hierarchical clustering on those representatives. | [14] |
| flowPeaks | R package from Bioconductor | Based on finite Gaussian mixture models; starts with kmeans to compute a smooth density function empirically, then merging is performed; no need to specify the number of clusters. | [15] |
| flowClust | R package from Bioconductor | Based on t mixture models with Box-Cox transformation; model parameters are inferred using an Expectation-Maximization algorithm; the number of clusters can be pre-specified or automatically chosen by the Bayesian information criterion (BIC). | [16] |
| flowMerge | R package from Bioconductor | Extension of flowClust intended to solve the issue of flowClust producing too many clusters in the automatic mode; the best model is selected by the change point in entropy. | [17] |
| SamSPECTRAL | R package from Bioconductor | Graph-based method; starts with data reduction because of the computational cost, then computes the similarity matrix, followed by spectral clustering, then kmeans; the number of clusters can be pre-specified or automatically determined by the knee point of the eigenvalue plot. | [18] |
| calico | R code from GitHub; R shiny app | Based on gridding and kmeans; starts with sample space gridding to reduce the differences in density, then a first round of kmeans is run on the gridded data to obtain centroids for the second round of kmeans. | [19] |
| ddPCRclust | R package from Bioconductor | Ensemble-based approach that combines the outcomes of flowDensity, SamSPECTRAL and flowPeaks. | [20] |

Additional details including software package versions used for each clustering method are included in Appendix Table S2.
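
The kmeans entry in Table 1 notes that the algorithm can be run with or without good initial centroids. As a rough illustration of what that iterative partitioning involves, the following pure-Python sketch alternates an assignment step and a centroid-update step, starting from user-supplied initial values. It is not the R `stats` implementation evaluated in the study; the function name `kmeans`, the toy points and the starting centroids are all assumptions made for this example.

```python
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Lloyd-style kmeans on 2-D points, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return labels, centroids


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.8), (5.0, 5.2), (5.2, 4.8), (4.8, 5.0)]

# Hypothetical "good" initial centroids, one placed near each group.
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (6.0, 6.0)])
```

With starting centroids placed near each group, the sketch converges within a couple of iterations. Poorly placed starting values can lead the same procedure to a different partition, which is why the table distinguishes runs with and without good initial values.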

