Machine learning and statistical methods for clustering single-cell RNA-sequencing data
