Abstract

The recent expansion of single-cell technologies has enabled simultaneous genome-wide measurements of multiple modalities in the same single cell. The potential to jointly profile such modalities as gene expression, chromatin accessibility, protein epitopes, or multiple histone modifications at single-cell resolution represents a compelling opportunity to study developmental processes at multiple layers of gene regulation. Here, we present Ocelli, a lightweight Python package implemented in Ray for scalable visualization and analysis of developmental multimodal single-cell data. The core functionality of Ocelli focuses on diffusion-based modeling of biological processes involving cell state transitions. Ocelli addresses common tasks in single-cell data analysis, such as visualization of cells on a low-dimensional embedding that preserves the continuity of the developmental progression of cells, identification of rare and transient cell states, integration with trajectory inference algorithms, and imputation of undetected feature counts. Extensive benchmarking shows that Ocelli outperforms existing methods regarding computational time and quality of the reconstructed low-dimensional representation of developmental data.

Introduction

Multimodal data comprises distinct feature sets (called modalities or views) acquired by different detectors describing the same object [1, 2]. For example, a video includes modalities such as a video recording, an audio track, and often text captions. Genome-wide single-cell technologies [3] can nowadays simultaneously measure multiple modalities from the same cell, including (i) transcriptome and chromatin accessibility [4–7]; (ii) transcriptome and DNA methylation [8]; (iii) transcriptome and chromatin proteins or histone modifications [9, 10]; (iv) transcriptome and protein epitope [11]; (v) transcriptome, chromatin accessibility, and protein epitope [12]; (vi) multiple histone modifications and DNA-bound proteins [13–17]; and (vii) open and closed chromatin [18].

Each single-cell data modality is high dimensional. The number of measured features spans from hundreds in the case of protein epitopes to hundreds of thousands for chromatin-accessible sites. The multiple modalities profiled from the same single cell correspond to consecutive stages of gene expression, from its regulation by modifying chromatin architecture and engaging transcription-initiation proteins to the synthesis of mRNA and protein molecules. For example, chromatin becomes accessible before gene expression commences, resulting in temporal coherence between modalities. Thus, all modalities need to be modeled simultaneously to visualize the developmental progression or activation of cells and lineage branching accurately.

The inherent feature of data generated by single-cell technologies is a high level of technical noise resulting in data sparsity due to under-sampling of features from the same cell [19]. Additionally, different modalities have profoundly distinct statistical properties, which makes it challenging to find a joint representative data manifold. Modeling developmental processes is an additional challenge as cellular differentiation is a continuous and nonlinear process [20]. The origin of non-linearity stems from the regulation of gene expression, where even the most straightforward regulation by a single transcription factor in an open-loop scenario results in nonlinear transcriptional kinetics that generates non-trivial gene expression patterns [21].

Multiple methods have been developed to deal with high-throughput high-dimensional unimodal single-cell data to study developmental processes. For example, Monocle [22] leverages a cellular minimum spanning tree (or DDRTree [23] in Monocle 2 and 3 [24]) to find dimensionally reduced gene expression space; Velocyto [25] and scVelo [26] model cellular transitions from transcriptional kinetics (named RNA velocity), leveraging the relationship between unspliced and spliced mRNA molecules within cells; MultiVelo [27] models epigenome–transcriptome interactions to predict cell fates; Waddington-OT [28] employs optimal transport theory to find temporal couplings of differentiating cells; diffusion maps [20, 29–32] embed cells onto a low-dimensional manifold by eigendecomposing the diffusion operator. The significant advantages of diffusion maps are their robustness to noise, high sensitivity to finding rare and transient cell states [28], and ability to preserve nonlinear local and global structures, together with a continuum of developmental cell transitions. The structure and continuum are preserved because diffusion maps utilize a distance metric (diffusion time) relevant to differentiation processes [20, 31].

Recently, several methods have been developed to generate low-dimensional joint embeddings from multimodal single-cell data. The methods can generally be divided into deep learning and probabilistic categories. Deep learning [33] is the dominant branch of machine learning that gained popularity due to the increasing computational capabilities of modern computers [34]. While the variety of deep learning models is enormous, autoencoder architecture [35] has become the leading tool for visualizing multimodal single-cell data [36–39]. Autoencoder is a neural network that encodes and compresses complex, high-dimensional data. Its objective is to learn and perform two functions: an encoder that transforms the input data and a decoder that reconstructs the input data from the encoded representation. The performance of an autoencoder is measured by comparing input and reconstructed data - with the goal of them being similar. Autoencoders proved to be a potent model architecture for problems such as dimension reduction [33], drug discovery [40], and image processing [41, 42]. The advantage of deep learning methods is their theoretical ability to learn very complex functions [43]. However, in practice, they are prone to underfitting or overfitting. Additionally, they are computationally expensive and require extensive data sizes to converge. Probabilistic methods for finding joint multimodal latent spaces from single-cell data include Weighted Nearest Neighbors [44] (WNN), and Multi-Omics Factor Analysis v2 [45] (MOFA+). WNN integrates modalities by constructing a weighted nearest-neighbor graph based on a weighted average of modality similarities. The graph is then visualized with UMAP [46]. MOFA+ [45] factorizes multimodal single-cell data to latent factors and corresponding feature weights. It uses stochastic variational inference [47, 48] for scalability of the parameter inference. Other approaches based on integrative methods exist, such as MOJITOO [49] that uses canonical correlation analysis to find low-dimensional multimodal data representation.

Here, we introduce Ocelli, a new multi-part diffusion-based analysis and visualization strategy for developmental multimodal data. Ocelli leverages topic modeling and novel multimodal diffusion maps (MDM) to reduce dimensionality and to preserve the ordering, continuity, and branching of developmental trajectories. Explicit incorporation of modality-specific transition probabilities allows us to generate more informative developmental data visualization. Moreover, multimodal diffusion-based modeling provides an efficient solution to deal with such caveats of single-cell genomics data as sparsity of measured features. We show that Ocelli facilitates the extraction and analysis of developmental subtrajectories from single-cell atlases. In addition, Ocelli can be leveraged to investigate a broad spectrum of biological processes involving cell state transitions, such as disease onset and progression or cellular responses to perturbations. Notably, Ocelli was designed to work on an arbitrary number and type of modalities and optimized to handle the scalability of current single-cell profiling technologies. To ensure scalable performance, Ocelli’s core functionalities are implemented using Ray [50]. Ocelli is available as an open-source software package at https://github.com/TabakaLab/ocelli along with tutorials and documentation at https://ocelli.readthedocs.io.

Materials and methods

Basic notation

We use the following notation. Matrices and vectors are written in bold. Superscript refers to modality, and the subscript to coordinates. As a result, |$\mathcal {M} = \mathbf {M}^{1:M}$| represents multimodal data consisting of M modalities, where modality |$\mathbf {M}^m \in \mathbb {R}^{C, F^m}$| comprises C cells and Fm features. |$\mathbf {M}^m_{c, f}$| is an activity value of feature f from cell c in modality m.

Multimodal weights

We model cell differentiation as a process of cellular movement in the latent space of single-cell features. According to such an interpretation, expanding distances in the cell’s neighborhood indicate a developmental process. Inspired by previous work on WNN [44], we quantify this expansion using multimodal weights, i.e. cell-specific distributions over modalities, assessing the likelihood of a developmental process occurring in modalities. The values of the weights are computed by comparing distances from a cell to its cross-modality nearest neighbors.

The computation of multimodal weights for cell c proceeds as follows. Let |$\mathbf {d}_c^{m,l}$| be a vector of distances between cell c and its nearest neighbors from modality l in the feature space of modality m. Intuitively, |$\mathbf {d}_c^{m,l}$| encodes cross-modality behavior of cell c neighborhood from modality l in modality m. We normalize cross-modality distances using modality-specific empirical cumulative distribution functions (ECDFs) of distances between cells. ECDF of modality m, denoted ECDFm, scales distances to a [0,1] range according to the distribution of distances between cells in modality m. We estimate ECDFs by uniformly sampling P pairs of cell embeddings from each modality and calculating distances between them. The value of P depends on the total number of cells; however, usually, P = 1000 is enough to generate robust results. Scaled distances, denoted |$\text{ECDF}^m(\mathbf {d}_c^{m,l})$|⁠, measure the expansion of the neighborhood of cell c from modality l in modality m, with respect to the global structure of modality m. Next, we score each modality as a sum over the remaining modalities of ECDF-scaled median distances. |$s_c^m$| is a score for modality m in cell c.

(1)

Then, we smoothen the |$s_c^m$| scores by averaging them over the nearest neighbors of cell c from the highest-scoring modality. Lastly, we normalize updated scores using a softmax function with the parameter α. The value of α should be greater or equal to 1; we used α = 10. Normalized scores produce cell-specific multimodal weights |$\mathbf {w}_c^m$|⁠, which sum to 1 for each cell.

(2)

Algorithm description

The algorithm is a multimodal generalization of the diffusion maps [20, 29]. The underlying idea is that high-affinity cells, i.e. cells close to each other in the single-cell feature space, are phenotypically closely related, and the direction in this space describes the differentiation trajectory. Diffusion maps model the resulting probabilistic diffusion process as a Markov chain, which is then eigendecomposed into a low-dimensional embedding. Multimodal diffusion maps (MDM) create a multimodal Markov chain (MMC) that extends this approach to multimodal data.

MDM models multimodal data |$\mathcal {M}$| as a diffusion process. Firstly, MDM separately computes the affinities between cells and their nearest neighbors for each modality M. This step is performed using a density-adjusted symmetric Gaussian kernel κ

(3)

Equation (3) shows a formula for cell affinity κ between cells c1 and c2 in modality M. d( ·, ·) is the Euclidean metric and ϵc denotes a distance from cell c to its Nth neighbor to account for local data density (usually N = 20 is enough for robust results).

MDM models each modality as an unimodal Markov chain. The diffusion process of modality m is represented by a square matrix |$\mathbf {K}^m \in \mathbb {R}^{C, C}$|⁠. |$\mathbf {K}^m_{c_1, c_2}$| is a weighted affinity between cells c1 and c2.

(4)

Next, MDM constructs the multimodal diffusion process K by element-wise summing weighted and row-normalized matrices K1: M. The normalization is conducted using diagonal matrices Dm such that |$\mathbf {D}^m_{c, c} = (\sum _{i=1:C} \mathbf {K}^m_{c, i})^{-1}$|⁠.

(5)

For efficient eigendecomposition to eigenvectors and eigenvalues, MDM converts K to a matrix |$\widehat{\mathbf {K}}$| that has the following properties: (i) symmetric (⁠|$\widehat{\mathbf {K}}= \widehat{\mathbf {K}}^T$|⁠), (ii) positive semi-definite (⁠|$\forall _{\mathbf {v}\in \mathbb {R}^C} \mathbf {v}^T \widehat{\mathbf {K}}\mathbf {v}\ge 0$|⁠), and (iii) non-negative (⁠|$\widehat{\mathbf {K}}\ge 0$|⁠). These properties guarantee |$\widehat{\mathbf {K}}$| to be a Hermitian matrix with real and non-negative eigenvalues. MDM row-normalizes K (using a diagonal row-normalization matrix D such that Dc, c = (∑i = 1: CKc, i)−1) and then symmetrizes it along the diagonal to meet the conditions mentioned above.

(6)

MDM eigendecomposes |$\widehat{\mathbf {K}}$| to create the low-dimensional embedding of multimodal data |$\mathcal {M}$|⁠. Eigendecomposition factorizes |$\widehat{\mathbf {K}}$| to a product of the eigenvector matrix Q (where Q:, i is the ith eigenvector), the eigenvalue vector Λ (where Λi is the ith eigenvalue), and Q−1.

(7)

Let |$\widehat{\mathbf {\Lambda }}$| be the sorted eigenvalue vector with a decreasing order and |$\widehat{\mathbf {Q}}$| the eigenvector matrix with columns sorted accordingly. The N-dimensional MDM embedding of multimodal data |$\mathcal {M}$|⁠, denoted |$\text{MDM}(\mathcal {M})$|⁠, is defined as follows.

(8)

Ocelli employs scikit-learn [51], an ARPACK [52] wrapper, to compute the top N + E + 1 eigenvalues and corresponding eigenvectors. The quality of the computed eigenvectors may degrade as eigenvalues decrease; hence, E more than necessary eigenvectors are computed and discarded; we use E = 10. MDM ignores |$\widehat{\mathbf {\Lambda }}_1$|⁠, the largest eigenvalue, as it is non-informative in diffusion maps to recover developmental processes [20].

Joint visualization of MDM components

MDM-generated low-dimensional embedding of multimodal data can be visualized in 2 or 3 dimensions using any dimension reduction algorithm. Ocelli has built-in wrappers for UMAP (umap-learn v0.5.3) [46] and ForceAtlas2 (v1.0.3) [53], which are recommended for well-clustered (e.g., postmitotic cells) or well-connected data (e.g., differentiating cells), respectively.

ForceAtlas2 is an algorithm for force-directed graph visualization. Ocelli provides three methods for converting MDM embeddings into graphs: (i) the nearest neighbors-based graph connects the nearest neighbors in the MDM latent space; (ii) the transitions-based graph connects a cell to its nearest neighbors in the MDM latent space with the highest developmental transition probabilities, e.g., from RNA velocity [25, 26], MultiVelo [27], or Waddington-OT [28]; (iii) in the transitions-based graph, if additional information about cell timestamps is available, the nearest neighbors in the MDM latent space are selected only among cells from the subsequent timestamp (except cells from a terminal timestamp, when nearest neighbors are found in the same timestamp). MDM latent space captures information about the global structure of the data. Involving transitions between cells orders them locally in the visualization. A constructed graph has a predefined number of edges coming out of each node that is constant across all cells. If fewer cells have non-zero transitions in a cell’s MDM neighborhood, the graph edges link to the unconnected MDM nearest neighbors.

Single-modality data exploration

Ocelli uses topic modeling to employ the multimodal analysis workflow to explore unimodal single-cell data. Ocelli splits data into latent single-cell modalities using latent Dirichlet allocation (LDA) [54], a probabilistic generative algorithm for learning relationships (called topics) between features. LDA models each cell as a multinomial mixture of topics, represented as distributions over features. Ocelli groups single-cell features according to their most probable topics and leaves only the top features according to LDA’s feature-topic distribution. The constructed groups form topic-based latent modalities.

Ocelli can visualize unimodal single-cell data by (i) constructing latent modalities using topic modeling and (ii) training MDM on latent modalities with LDA’s cell-topic multinomial distribution used as weights. This procedure reduces the inherent noise of single-cell data as we train MDM only on topic-specific features, and the impact of each latent modality is cell-specific using LDA’s cell-topic distribution for MDM weights.

Diffusion-based multimodal imputation

Diffusion-based models can impute values of sparse single-cell features [19] by iteratively applying a diffusion operator to a feature count matrix G. For example, MDM produces a multimodal diffusion operator |$\widehat{\mathbf {K}}$|⁠. The t-step imputation proceeds as follows.

(9)

Eigendecomposition of |$\widehat{\mathbf {K}}$| from equation (7) greatly simplifies running multiple diffusion steps.

(10)

The only difficulty with equation (10), the computation of the inverse matrix Q−1, can be evaded. |$\widehat{\mathbf {K}}$| is symmetric, and ARPACK [52] produces unit-norm eigenvectors, making the eigenvector matrix Q orthonormal. Since the inverse of an orthonormal matrix equals its transpose, we derived an efficient imputation procedure with a constant complexity as the number of steps t varies.

(11)

Ocelli imputes values of sparse multimodal single-cell features with an approximated version of the abovementioned procedure. It uses MDM-generated eigenvectors and eigenvalues in place of Q and Λ, respectively.

We investigated the computational time of Ocelli’s imputation on a 16-CPU machine in two experiments with t = 1 imputation steps (complexity is constant as t changes). (i) We fixed the number of features to 1000 and generated sparse matrices with 10 000, 50 000, 100 000, 500 000, and 1 000 000 cells. (ii) We fixed the number of cells to 50 000 and generated sparse matrices with 1, 10, 100, 1000, and 10 000 features. Values of matrices were sampled from a binomial distribution that samples 1 with probability p = 0.2 and 0 otherwise. We generated 50 eigenvalues and corresponding eigenvectors from a uniform distribution defined on a unit segment [0, 1].

Benchmarking simulated data

We created and investigated two simulated developmental datasets. The binary tree dataset has three modalities and 6000 observations grouped into six types (A-F). Modalities 1, 2, and 3 introduce developmental lineages A–B, C–D, and E–F. We sampled observations from three-dimensional (3D) Gaussian distributions and then downsampled them to 1000 observations per type with probabilities accounting for local densities. Equation (12) defines the distribution behind the downsampling procedure. P(x) is a probability of observation x being downsampled and NNN(x) is the Nth nearest neighbor of observation x. We used N = 25.

(12)

The developmental process was introduced to the dataset by assigning order to observations (pseudotime) based on the distance to manually selected developmental starting points. Then, we added noise by permuting every 100 cells along the pseudotime. During the multimodal analysis with Ocelli, we computed an MDM embedding (parameters: 10 dimensions, 20 nearest neighbors) and visualized it with ForceAtlas2 as a 30-edge-per-node nearest neighbor graph.

The rare transitions dataset has 4500 observations grouped into nine types (A-I). Modalities 1 and 2 introduce developmental lineages A–C and D–I, respectively. Observations A-C in the first modality were sampled uniformly from 3D segments. The remaining observations originated from Gaussian distributions as in the binary tree dataset but resulted in 500 observations per type after downsampling. Pseudotime and noise were introduced analogically, as in the binary tree dataset. During the multimodal analysis with Ocelli, we computed an MDM embedding (parameters: 20 dimensions, 60 nearest neighbors) and visualized it with ForceAtlas2 as a 60-edge-per-node nearest neighbor graph.

We evaluated the quality of multimodal weights computed with Ocelli and WNN [44]. Weights from both methods were calculated using 60 nearest neighbors and compared to target weights using the mean squared error (MSE). Target modality weights equal 1 for lineages introduced in the modality and 0 for the remaining lineages.

Benchmarking scalability

Scalability is a significant challenge for emerging computational methods in high-throughput single-cell genomics. We tested the scalability of Cobolt (v0.0.1) [36], Matilda (the only, unnumbered release) [38], MIRA (mira-multiome v2.1.1) [55], MOFA+ (mofapy2 v0.7.0 and mofax v0.3.6) [45], MOJITOO (v1.0) [49], MultiVI (scvi-tools v0.19.0) [39], Ocelli (v0.1.1), scMM (v1.0.0) [37], and WNN (Seurat v5.0.3) [44] methods.

Benchmarking was conducted on matrices generated from the hair follicle SHARE-seq dataset [6] in the following steps. (i) We sampled 10 000, 50 000, and 100 000 cells from the dataset, (ii) we selected 1000 highly variable RNA-seq genes, (iii) we selected 1000 highly variable ATAC-seq genes from the gene activity matrix (generated using Signac [56] by mapping chromatin accessibility fragments to gene regions, including the region 2kb upstream from the promoter), (iv) we selected 1000 highly variable ATAC-seq peaks, (v) we modified counts of each cell by randomly adding single counts to 10 of its features per modality to avoid cells with the same levels of features. We used a peak count matrix for methods that require or recommend peaks instead of gene activities. Data has been preprocessed using Scanpy (v1.9.6) [57]. We ran all methods on a 16-CPU machine, with an additional NVIDIA Tesla T4 GPU if the method requires GPU-based computation. We measured only the training time of each method, excluding data preprocessing and downstream visualization. We employed the following training procedures: (Cobolt) 1000 RNA-seq genes and 1000 ATAC-seq peaks trained on GPU for 100 epochs; (Matilda) 1000 RNA-seq genes and 1000 ATAC-seq gene activities trained on GPU for 100 epochs; (MIRA) 1000 RNA-seq genes and 1000 ATAC-seq peaks trained on GPU; (MOFA+) 1000 RNA-seq genes and 1000 ATAC-seq gene activities trained on GPU in the most accurate convergence mode; (MOJITOO) 1000 RNA-seq genes and 1000 ATAC-seq gene activities trained without GPU; (MultiVI) 1000 RNA-seq genes and 1000 ATAC-seq peaks trained on GPU for 100 epochs; (Ocelli) 1000 RNA-seq genes and 1000 ATAC-seq gene activities trained without GPU on exact (ENNs, computed with scikit-learn v1.0.2 [51]) and approximate (ANNs, computed with nmslib v2.1.1 [58] nearest neighbors; (scMM) 1000 RNA-seq genes and 1000 ATAC-seq peaks trained on GPU for 100 epochs; (WNN) 1000 RNA-seq genes and 1000 ATAC-seq gene activities trained without GPU. We specified the dimensionality of the resulting latent space to 20 if applicable.

The reported times include only the execution time of the algorithm, excluding preprocessing steps. However, we acknowledge that preprocessing also contributes to overall runtime. Here, we report topic modeling training times. We trained 20 LDA topics on the scalability benchmark training data using 16 parallel jobs for 30 iterations (Ocelli’s default setting, while scikit-learn’s default number of iterations is 10). The approximate preprocessing times, rounded to the nearest second, are as follows. For 10 000 cells, the LDA training time was 6 seconds for 1000 RNA-seq genes, 5 s for 1000 ATAC-seq gene activities, and 5 s for 1000 ATAC-seq peaks. For 50 000 cells, the LDA training time was 19 seconds for 1000 RNA-seq genes, 18 s for 1000 ATAC-seq gene activities, and 22 s for 1000 ATAC-seq peaks. For 100 000 cells, the LDA training time was 39 s for 1000 RNA-seq genes, 33 s for 1000 ATAC-seq gene activities, and 34 s for 1000 ATAC-seq peaks.

To provide a more accurate estimate of preprocessing time, we also report LDA training times for the analyzed datasets. Here, we trained 20 LDA topics for 30 iterations using 50 parallel jobs unless otherwise specified. For the hair follicle SHARE-seq dataset, which includes 7160 cells, training took 15 s for 6731 RNA genes and 1 min 36 s for 17 495 ATAC gene activities. In the human bone marrow ASAP-seq dataset of 10 927 cells, the training time was 1 min 35 s for 3000 ATAC gene activities and 14 s for 238 proteins. For the human bone marrow NTT-seq dataset with 5236 cells, the process took 33 s for 43,170 H3K27me3 peaks and 26 s for 71 253 H3K27ac peaks. The peripheral blood mononuclear cells (PBMCs) DOGMA-seq dataset, containing 7516 cells, required 3 min 18 s for 84 846 ATAC peaks, 1 min 23 s for 36 601 RNA genes, and 7 s for 210 proteins. In the pancreatic endocrinogenesis RNA-seq dataset of 3606 cells, the training time for 5465 genes was 36 s. For larger datasets, the BMMCs SHARE-seq dataset, which includes 842 440 cells and was processed using 100 jobs, took 44 min 2 s for 33 721 RNA genes and 49 min 57 s for 189 259 ATAC peaks. Similarly, in the iPSCs reprogramming dataset with 68 703 cells, also run on 100 jobs, training required 12 min 18 s for 16 817 RNA genes.

Benchmarking cohesiveness of signatures

RNA-seq signatures and ATAC-seq motifs (called together ‘signatures’ in this section) are a powerful aid in understanding the biology behind single-cell data. This section describes a benchmark evaluating the coherency of embedded cells with similar signatures in each modality. We define the activity of a signature in a cell as a mean z-score of gene expression for RNA-seq signatures or a motif variability score for ATAC-seq motifs. The cohesiveness benchmark is based on two criteria, as follows. (i) Close neighborhoods of highly signature-active cells should also be signature-active. We select 5 % of cells with the highest activity and compute a score ψc for each selected cell c defined as a mean signature activity score among its n = 3 nearest neighbors. (ii) Signature-active cells should be in a cohesive embedding region; disconnected high-activity clusters may indicate a disorganized embedding. We employ a density-based clustering algorithm DBSCAN [59] to establish a connective structure among the 5% of most active cells. DBSCAN creates a graph |$\mathcal {G}$| connecting cells within ϵ distance from each other. We set ϵ equal to the embedding’s estimated median distance between cells based on 1000 random pairs of cells. We consider the median distance a fair threshold of proximity between cells.

The cohesiveness score of a signature for an embedding is calculated as follows. We compute a score ΨG for each connected component |$G \in \mathcal {G}$| of the connectivity graph. A lower ΨG value indicates a better cohesiveness within G.

(13)

Then, we compute a total cohesiveness score Φ of an embedding over all connected graph components |$\mathcal {G}$|⁠.

(14)

Φ penalizes graphs with multiple connected components. Note that the scale is inverted; higher values indicate a better-connected and locally more consistent signature portrayal.

We examined the cohesiveness scores on the hair follicle dataset [6]. For ATAC-seq, we used the 150 most-variable motifs computed using chromVAR v0.3 [60] from the peak count matrix. For RNA-seq, we ran a community-finding Infomap [61, 62] algorithm using the igraph v1.5.0 [63] package and kept clusters with more than 10 genes, resulting in nine signatures. Cohesiveness scores were summed over all RNA-seq gene signatures for each evaluated embedding and divided by a maximum score so that the highest-scoring embedding scored 1. We analogically summed and scaled ATAC-seq motif cohesiveness scores. The final multimodal cohesiveness score is a sum of scaled RNA-seq and ATAC-seq scores with a maximum possible score of 2.

Benchmarking RNA velocity vector field alignment

The RNA velocity alignment benchmark evaluates whether low-dimensional embeddings preserve RNA velocity dynamics. In the hair follicle dataset, we scored each cell as median cosine similarity between the cell’s projected developmental direction and vectors directed towards n = 3 cells with the highest RNA transition probabilities from the selected cell. Cosine similarities show high values if an embedding orders cells along the RNA velocity vector field. The score of an embedding is a median score across all cells. We computed projected developmental directions and RNA velocity transitions with scVelo [26] on log-normalized 1000 highly variable genes.

Benchmarking RNA velocity-based terminal state separation

The terminal state separation benchmark quantifies the separation of developmental lineage fates in low-dimensional embeddings. It uses absorption probabilities, which estimate how likely each cell is to transition into each identified terminal state based on RNA velocity. In the hair follicle dataset, we identified four terminal states and their absorption probabilities with CellRank v1.5.1 [64]. Let |$\mathcal {T}$| be a set of terminal states. For each terminal state |$t\in \mathcal {T}$|⁠, we set its location as mean coordinates of 1 % of cells with the highest corresponding absorption probabilities. We score the separation of terminal state t as:

(15)

where ap(n, t1, t2) is a mean absorption probability of terminal state t1 among n nearest neighbors of the terminal state t2. score(n, t) rewards embeddings where absorption probabilities of terminal state t dominate its neighborhood. High absorption probabilities of remaining terminal states in t’s neighborhood penalize the score. We computed scores for n ∈ {20, 40, …, 980, 1000}.

Benchmarking multimodal imputation

We tested the performance of multimodal imputation using Ocelli and MultiVI [39] on the hair follicle dataset [6]. The training data comprised 5000 highly variable RNA-seq genes and 28 501 ATAC-seq peaks (selected by removing 5% of most active peaks from 30 000 highly variable peaks). We measured imputation performance on raw and downsampled data by randomly removing p = 10%, …, 80% of counts from count matrices. Each test was performed in two variants: (i) by training a separate model for each downsampling rate p and (ii) by training a single model on raw data and using it to impute downsampled data.

To investigate the ability of multimodal imputation to recover cell–cell correlations, we devised a benchmark that measures how well the correlation matrix distinguishes between same-type and different-type cells. We score a cell–cell correlation matrix as a difference between the mean correlation of same-type cells and the mean correlation of different-type cells. The correlation matrix showing a clear cell type structure will score high, while a correlation matrix that does not distinguish between cell types will score low. When computing cell–cell correlations, we used 5000 highly variable RNA-seq genes and 10 000 highly variable ATAC-seq peaks.

To test the robustness of multimodal imputation, we tested the similarity of imputed data against non-zero raw data counts. We computed MSE between raw data non-zero counts and corresponding imputed values for 5000 highly variable RNA-seq genes and 10 000 highly variable ATAC-seq peaks. We present results as boxplots showing the distribution of MSE over features.

(16)

Multimodal imputation can also reconstruct feature-feature correlations. We computed correlation matrices using imputed data of 5000 highly variable RNA-seq genes and 10 000 highly variable ATAC-seq peaks. We clustered features of imputed raw data into 8 hierarchical Ward clusters for each modality using the scikit-learn package.

Results

Overview of Ocelli

Ocelli is an explainable multimodal framework (Fig. 1A and “Materials and methods”) to learn a low-dimensional representation of developmental trajectories. In the data preprocessing step, we find modality-specific programs with topic modeling using LDA. LDA topics provide a highly compressed yet sensitive representation [65]. The training is fully unsupervised and flexible to any single-cell modality. Topic modeling captures each modality’s underlying thematic structure by identifying which features tend to occur together across cells. Also, topic modeling is well-suited to developmental data, as topic proportions are continuous distributions, allowing for gradual changes in modality-specific programs.

Ocelli overview. (A) Schematic view of Ocelli workflow. The Ocelli core comprises multimodal dimension reduction with MDMs and data visualization functionalities. Preprocessed modalities, modeled individually as diffusion processes, are merged into a weighted MMC and then decomposed into the MDM latent space. Ocelli can visualize MDM embeddings using a force-directed layout that employs cellular transition probabilities or any other dimension reduction algorithm, e.g., UMAP. (B-C) Application of Ocelli to simulated data: the binary tree dataset with three modalities and six lineages A–F (B) and the rare transitions dataset with two modalities and nine lineages A–I (C). From top: a diagram of lineage trajectories (top), projections of modalities with lineage annotations and distribution of inferred multimodal weights (middle), an Ocelli embedding visualized using ForceAtlas2 with lineage annotations, and a developmental pseudotime progression (bottom).
Figure 1.

Ocelli overview. (A) Schematic view of Ocelli workflow. The Ocelli core comprises multimodal dimension reduction with MDMs and data visualization functionalities. Preprocessed modalities, modeled individually as diffusion processes, are merged into a weighted MMC and then decomposed into the MDM latent space. Ocelli can visualize MDM embeddings using a force-directed layout that employs cellular transition probabilities or any other dimension reduction algorithm, e.g., UMAP. (B-C) Application of Ocelli to simulated data: the binary tree dataset with three modalities and six lineages A–F (B) and the rare transitions dataset with two modalities and nine lineages A–I (C). From top: a diagram of lineage trajectories (top), projections of modalities with lineage annotations and distribution of inferred multimodal weights (middle), an Ocelli embedding visualized using ForceAtlas2 with lineage annotations, and a developmental pseudotime progression (bottom).

In the second part of the model, we compute cell-specific modality weights that determine the importance of each modality for each cell from the topic latent representations. This approach resembles the WNN [44] method. However, we compute cell-cell affinities using ECDFs of distances between cells. Their values encode distances with respect to the global structure of the topic latent space. Having multimodal weights, we construct the MMC as a weighted sum of the unimodal affinities between cells. Multimodal weights quantify how informative modalities’ affinities are about the underlying developmental process. Next, we determine the latent space of MDM by factoring the MMC into eigenvectors and eigenvalues. This step further reduces the dimensionality of features, denoises the data, and allows us to find underlying developmental processes and cell ordering from the data. Then, we build the nearest neighbor cell-cell graph in the space of MDM components.

Finally, to obtain a latent space that reflects all modalities and additional information about cell fates, we leverage modality-inferred or -measured transition probabilities between cells to correct the cell-cell graph created from MDM. The transition probabilities between cells can be inferred from, e.g., RNA velocity [25, 26], optimal transport [28] algorithm, or determined from lineage-tracing experiments [66]. We modify the MDM nearest neighbors graph by rewiring edges to connect cells with the highest transition probabilities (see Methods for details). This step is used only for cells with determined transitions. The corrected cell-cell graph is visualized in 2D or 3D as a Force-directed Layout Embedding (FLE) [28] with ForceAtlas2 [53, 67] or Uniform Manifold Approximation and Projection (UMAP) [46].

Modeling differentiation with Ocelli

First, we validated the analysis strategy of Ocelli in inferring developmental progression and lineage branching of cells using simulated data. We modeled the developmental process as a binary tree composed of three modalities with two lineages branching in each modality (Fig. 1B). Ocelli identified the importance of each modality, reconstructed the branching points, and preserved the continuity of developmental trajectories with high accuracy. Next, we considered a developmental process with transient developmental transitions in the first modality (Fig. 1C, lineages A–C) followed by lineage branching in the second modality (lineages D-I). Ocelli displayed remarkable sensitivity in reconstructing transient developmental transitions and lineage branching (Fig. 1C). Finally, we compared the modality weights distributions and visualizations generated from simulated multimodal data between Ocelli and WNN, a direct competitor graph-based method (Supplementary Fig. S1). Comparison of distributions of modality weights between the methods (Supplementary Fig. S1A) shows that Ocelli accurately infers the distributions (Supplementary Fig. S1B), minimizing the MSEs between ground truth and inferred modality weights. The gains in accuracy of Ocelli modality weights are further confirmed by visualizing MDM latent spaces computed using modality weights from both Ocelli and WNN (Supplementary Fig. S1C). Ocelli generally generates more coherent and compact visualizations of trajectories while preserving their continuity and recovering lineage branching (Supplementary Fig. S1D).

Having demonstrated its performance on simulated data, we then used Ocelli to infer lineage dynamics of the regenerative subset of the hair follicle (Fig. 2A) from a dataset generated from simultaneous profiling of chromatin accessibility and gene expression by the SHARE-seq method [6]. Ocelli deconvoluted lineages building the hair shaft (HS) and inner root sheath (IRS) from transit-amplifying cells (TACs) (Fig. 2B), correctly identifying the higher weighting of chromatin accessibility in progenitor cells (TACs) and of RNA in differentiated cells (Fig. 2C and D). The obtained visualization shows two populations of progenitor cells and four populations of terminal differentiated cells. To annotate these cell populations, we computed gene signature activities determined from Smart-seq2 dataset [68]. Ocelli clearly distinguished HS cells (medulla (HS-Me), cortex (HS-Co)) and IRS cells (Huxley’s layer (IRS-Hu), Henle’s layer (IRS-He)) (Fig. 2E). In addition, CellRank [64] analysis confirms four terminally differentiated states that overlap with these gene signature activities (Fig. 2F).

Reconstruction of developmental pathways in the hair follicle SHARE-seq dataset, comprising chromatin accessibility and transcriptome modalities. (A) Schematic depicting the differentiation pathways in the regenerative part of hair follicle. (B) Ocelli’s visualization of the single-cell SHARE-seq data of the regenerative part of hair follicle using scVelo [26] RNA velocities. (C) Inferred chromatin accessibility weights show higher levels for progenitor TACs. In contrast, the transcriptome weights are elevated for differentiated cells. (D) A distribution of 10% of cells with the highest weights of each modality. (E) Gene signature activity [68], computed as gene expression mean z-scores, of HS: medulla (HS-Me) and cortex (HS-Co); and IRS: Huxley’s layer (IRS-Hu) and Henle’s layer (IRS-He). (F) CellRank [64] analysis detects four terminal differentiation states. Mean absorption probability shows a variety within TAC. IRS cells are more likely to develop from TAC-1 and HS cells from TAC-2.
Figure 2.

Reconstruction of developmental pathways in the hair follicle SHARE-seq dataset, comprising chromatin accessibility and transcriptome modalities. (A) Schematic depicting the differentiation pathways in the regenerative part of hair follicle. (B) Ocelli’s visualization of the single-cell SHARE-seq data of the regenerative part of hair follicle using scVelo [26] RNA velocities. (C) Inferred chromatin accessibility weights show higher levels for progenitor TACs. In contrast, the transcriptome weights are elevated for differentiated cells. (D) A distribution of 10% of cells with the highest weights of each modality. (E) Gene signature activity [68], computed as gene expression mean z-scores, of HS: medulla (HS-Me) and cortex (HS-Co); and IRS: Huxley’s layer (IRS-Hu) and Henle’s layer (IRS-He). (F) CellRank [64] analysis detects four terminal differentiation states. Mean absorption probability shows a variety within TAC. IRS cells are more likely to develop from TAC-1 and HS cells from TAC-2.

Diffusion-based multimodal imputation

The single-cell profiling of cells inevitably generates data of high sparsity levels, resulting in 0-inflated feature values that obstruct the inference of relationships between features or cells of the same or different types. The data’s sparsity originates from inefficient feature detection in single cells ranging from 10% to 45% of expressed RNA molecules in scRNA-seq and 1–10% of accessible sites in scATAC-seq [69]. Many unimodal imputation methods have been developed in recent years to address the sparsity in single-cell data [70]. In general, the imputation methods can be categorized into three categories: (i) probabilistic ones that directly model the data sparsity (Saver [71], scImpute [72]), (ii) smooth-based methods that smooth or diffuse feature values among nearby cells (MAGIC [19], kNN-smoothing [73]), and (iii) deep learning methods that first identify latent spaces and then reconstruct the feature matrices (AutoImpute [74], scVI [75]). The extensive benchmarking [70] of scRNA-seq data imputation methods revealed that smooth-based approaches, e.g., MAGIC [19] and kNN-smoothing [73] are the top unimodal imputation methods. MAGIC uses the diffusion operator to first learn data manifold [29] and then restores the missing expression values by exponentiation of the Markov matrix to a power t and multiplying the result by original expression matrix. While the unimodal imputation methods can be bluntly applied to each modality separately, multimodal imputation has the potential to resolve more nuanced differences between cells along the differentiation axis and, thereby, improve imputations.

Here, we demonstrate several key properties of Ocelli that help impute high-dimensional signals in multimodal data in a highly efficient way. First, instead of exponentiation of the multimodal Markov matrix, which is time-consuming, we eigendecompose it (Fig. 3A). Then, the exponentiation of the Markov matrix is equivalent to the exponentiation of eigenvalues, which is of constant complexity as the number of diffusion steps t varies. It results in highly scalable and efficient imputation of unobserved data for a large number of cells (Fig. 3B, upper panel) and features (Fig. 3B, lower panel). The salient feature of imputations is the recovery of gene–gene relationships and regulatory interactions [19]. First, we tested the imputations on two canonical markers of Henle’s lineage, Krt71 and Krt73. Ocelli’s multimodal imputations recovered the correlations between the expression of the marker genes (Fig. 3C) and the chromatin accessibilities at their promoters (Fig. 3D). To further test the usability of Ocelli’s multimodal imputations, we explored the recovery of correlations between features from different modalities or among cells. In the first case, we imputed gene expression, promoter, and distal element accessibility for 50 top gene markers of clusters identified earlier as terminal states. Ocelli restored the coactivity of regulatory elements found using GREAT [76] and genes within differentiated cells (Fig. 3E). The second case involved imputing the dataset’s expression values of all genes and regulatory elements. Cell-to-cell correlation heatmaps of Ocelli-imputed RNA estimates (t = 1) (Fig. 3F) and Ocelli-imputed promoter and distal elements accessibility estimates (t = 1) (Fig. 3G) showed that multimodal imputations help to reconstruct similarities between closely related cells.

Ocelli enables the reconstruction of highly sparse features of each modality using multimodal imputations. (A) The exponentiation of the multimodal affinity matrix is equivalent to eigendecomposition with exponentiated eigenvalues, significantly improving the computational time of imputations. (B) Ocelli’s runtimes for multimodal imputation for the increasing number of cells (top) and features being imputed (bottom). (C) Gene expression levels (top) and scatterplots of gene–gene relationships (bottom) for two canonical markers of the IRS Henle’s layer (Krt71 and Krt73) with different amounts of diffusion. (D) Chromatin accessibility of promoters for the markers in (C). (E) Correlations of gene expression to chromatin accessibility of promoters and distal elements before and after Ocelli imputations (t = 1) for differentiated cells. (F and G) Cell–cell correlations computed on gene expression (F) and chromatin accessibility (G) levels before and after imputation (t = 1) with Ocelli.
Figure 3.

Ocelli enables the reconstruction of highly sparse features of each modality using multimodal imputations. (A) The exponentiation of the multimodal affinity matrix is equivalent to eigendecomposition with exponentiated eigenvalues, significantly improving the computational time of imputations. (B) Ocelli’s runtimes for multimodal imputation for the increasing number of cells (top) and features being imputed (bottom). (C) Gene expression levels (top) and scatterplots of gene–gene relationships (bottom) for two canonical markers of the IRS Henle’s layer (Krt71 and Krt73) with different amounts of diffusion. (D) Chromatin accessibility of promoters for the markers in (C). (E) Correlations of gene expression to chromatin accessibility of promoters and distal elements before and after Ocelli imputations (t = 1) for differentiated cells. (F and G) Cell–cell correlations computed on gene expression (F) and chromatin accessibility (G) levels before and after imputation (t = 1) with Ocelli.

We examined the accuracy and robustness of multimodal imputation with Ocelli and MultiVI [39], a deep learning-based method, on the hair follicle dataset. MultiVI [39] is a leading multimodal imputation probabilistic model that leverages neural network encoders to find modality-specific latent representations that are aligned and merged into a multimodal representation. We employed multiple data sparsity levels by downsampling RNA and ATAC count matrices by up to 80% to test model sensitivity. Furthermore, each test was performed in two variants. In the first one, denoted separate models, we trained a separate model for each downsampling rate. In the second one, denoted single model, we trained a model on raw count matrices and used it to impute downsampled data. The separate model variant measures the method’s resistance to a significant increase in sparsity, while the single model variant checks the model robustness for degrading data quality. We tested the reconstruction of cell–cell relationships by examining post-imputation correlations between cells (Supplementary Fig. S2A and B). Cell–cell relationships are well-distinguished if correlations of same-type cells are noticeably higher than those of different-type cells. In Supplementary Fig. S2A, we show mean cell–cell correlations between same- and different-type cells and their differences. Ocelli showed higher differences, which translates to better recovery of cell–cell relationships. Additionally, we explored correlations between modality features (Supplementary Fig. S2C and D). Ocelli showed high robustness when comparing non-zero counts from raw data and imputed values using MSE (Supplementary Fig. S2C). As a result, Ocelli’s multimodal imputation is a powerful tool for reducing count sparsity, preserving pre-imputation signal, and reconstructing cell–cell and feature–feature relationships.

Benchmarking analysis

We further performed extensive benchmarking of Ocelli against other recently developed methods for single-cell multimodal data analysis and visualization regarding computational runtime and quality of the generated data embeddings. For the latter benchmarking analyses, we wanted to explore the quality of the visualization of developmental single-cell data focusing on the regenerative part of hair follicle (Fig. 2). While some benchmarking strategies have been developed for assessing analysis and visualization of well-clustered data, such as silhouette score or adjusted Rand index score [49], we designed new benchmarks related to the visualization of developmental multimodal data and transitions between cells (Fig. 4, Supplementary Figs S3, S4), “Materials and methods”). Ocelli was benchmarked against (i) graph-based methods—WNN [44], MIRA [55]; (ii) statistical framework for multimodal factor analysis, MOFA+ [45]; (iii) deep learning-based methods - Cobolt [36], scMM [37], Matilda [38], MultiVI [39]; and (iv) canonical correlation analysis method MOJITOO [49] (Supplementary Fig. S3A, “Materials and methods”). Ocelli is flexible and can accommodate transition probabilties generated using any method. Here, we explored two velocity dynamics strategies by running Ocelli with transition probabilities from scVelo [26] and MultiVelo [27].

Benchmarking Ocelli against competing multimodal dimensionality reduction methods on a SHARE-seq dataset of the regenerative part of hair follicle. (A) Ocelli improves computation time compared to state-of-the-art methods. (B) The cohesiveness score quantifies the tendency of grouping cells with similar transcriptomic and chromatin accessibility signatures. Bar plots compile cohesiveness scores computed over multiple parameters for each method. The left bar plot shows the highest scores achieved by each method. The right bar plot shows the variability of each method’s scores across tested parameter values. (C) The vector field score measures the alignment of an embedding and the RNA velocity vector field. The score is the median cosine similarity between the cell’s projected developmental direction and directions toward cells with the highest RNA velocity transitions. Bar plots compile vector field scores computed over multiple parameters for each method. The left bar plot shows the highest scores achieved by each method. The right bar plot shows the variability of each method’s scores across tested parameter values. (D) Net absorption evaluates the separation of developmental terminal states in embeddings. Net absorption of a terminal state is its mean neighborhood absorption after subtracting mean absorptions of remaining terminal states. High net absorption translates to a well-separated terminal state. Plots show best-performing models over hyperparameter search for each method. (E) Compilation of benchmark scores.
Figure 4.

Benchmarking Ocelli against competing multimodal dimensionality reduction methods on a SHARE-seq dataset of the regenerative part of hair follicle. (A) Ocelli improves computation time compared to state-of-the-art methods. (B) The cohesiveness score quantifies the tendency of grouping cells with similar transcriptomic and chromatin accessibility signatures. Bar plots compile cohesiveness scores computed over multiple parameters for each method. The left bar plot shows the highest scores achieved by each method. The right bar plot shows the variability of each method’s scores across tested parameter values. (C) The vector field score measures the alignment of an embedding and the RNA velocity vector field. The score is the median cosine similarity between the cell’s projected developmental direction and directions toward cells with the highest RNA velocity transitions. Bar plots compile vector field scores computed over multiple parameters for each method. The left bar plot shows the highest scores achieved by each method. The right bar plot shows the variability of each method’s scores across tested parameter values. (D) Net absorption evaluates the separation of developmental terminal states in embeddings. Net absorption of a terminal state is its mean neighborhood absorption after subtracting mean absorptions of remaining terminal states. High net absorption translates to a well-separated terminal state. Plots show best-performing models over hyperparameter search for each method. (E) Compilation of benchmark scores.

First, Ocelli exhibits improved computational runtimes compared to other competing multimodal methods (Fig. 4A). For example, reconstructing a joint embedding of 100 000 cells took Ocelli less than 3 min, outperforming all competing multimodal methods. We used GPU-accelerated computations for MOFA+ [45], MIRA [55] and deep learning methods and CPUs for others for these speed tests. We excluded preprocessing steps and used the same data dimensions to perform fair comparisons between the methods (“Materials and methods”). For Ocelli, we determined the runtimes for exact (ENN) and approximate (ANN) approaches of the computation of nearest neighbors.

Next, we assessed how the aggregate signal of related features in each modality is displaced on the embedding. We used gene signature scores for the RNA modality. In the case of the ATAC modality, we leveraged transcription factor motif scores (“Materials and methods”). A high-quality, developmental data embedding should group cells of similar scores together. A failed embedding can have high signature scores, for example, related to the same cell identity separated, which would suggest alternative lineages. To quantify this, we introduced the signature cohesiveness score (“Materials and methods”). The evaluated embedding is dissected into a connectivity graph using the DBSCAN [59] algorithm. Then, each connected graph component is scored based on the presence of the most signature-active cells and the signature activity in their neighborhoods. The cohesiveness score favors an embedding in which most signature-active cells are grouped in a single connected graph component and if neighborhoods of highly signature-active cells are also highly signature-active. We considered different scenarios for data visualization (“Materials and methods”): (i) optimal preprocessing steps for each method suggested by developers of each method; (ii) usage of all features or initial reduction of data dimensionality by selecting highly-variable features; (iii) final data embedding on two-dimensional (2D) or in 3D using UMAP or FLE. The cumulative cohesiveness score for RNA and ATAC modality is the highest for Ocelli and 3D FLE (Fig. 4B). Ocelli generally generates the most robust visualizations in this benchmark across all methods (Supplementary Fig. S3C).

Finally, we developed two benchmarks to compare the accuracy of visualizing developmental trajectories across the methods. The first benchmark quantifies the reconstruction of temporal cell ordering and differentiation direction on the embedding in relation to the inferred transitions by the RNA velocity-based method [26]. The output of the method, RNA velocity vector field, is obtained from relative abundances of immature (unspliced) and mature (spliced) RNA molecules. Usually, it is visualized on data embeddings by arrows, depicting the cell’s projected developmental direction. Descendant cells with the highest transition probabilities should be aligned near the projected developmental direction (Fig. 4C) in case of high-quality data embedding. Across all cells, we quantified this by computing cosine similarities between the projected developmental direction and the direction toward cells with the highest transition probabilities. 2D and 3D data embeddings generated by Ocelli using FLE show the highest scores in the ordering of cells according to RNA velocity vector fields across all tested methods (Fig. 4C, Supplementary Fig. S4A). The single-cell data visualizations that scored the highest vector field score for each method (Supplementary Fig. S3) show that Ocelli generated a clear trajectory to Huxley’s layer cells. Other methods identify Huxley’s layer as a transient state to Medulla (MIRA [55], MOJITOO [49], to some degree MultiVI [39]), Cortex (MOFA+ [45], scMM [37], WNN [44]), or mix the Huxley’s layer cells with TAC-1 (Matilda [38], Cobolt [36]). While MultiVI [39] visualized Huxley’s layer cells as the final state, it failed to predict transitions to Medulla from TACs.

As a second quantitative benchmark, we determined the separability of cell lineages on generated data embeddings. Using CellRank’s [64] determined terminal states based on RNA velocity results, we tested how the cell neighborhoods of terminal states are displaced on the embedding relatively to each other (Fig. 4D). To do so, first, we determined cells’ absorption probabilities to terminal states using CellRank [64] algorithm. Then, we systematically increased the number of nearest neighbors of one terminal state and computed, among the nearest neighbors, the difference between mean absorption probabilities to this terminal state and to others (Fig. 4D, “Materials and methods”). This net absorption score is high if cell trajectories of different lineages are well separated on the embedding. We computed the net absorption scores to four terminal states of hair follicle for all tested methods (Fig. 4D, Supplementary Fig. S4B and C). Regardless of the final data embedding strategy (UMAP or FLE), Ocelli outperformed all other methods in the discussed benchmarks (Fig. 4E).

Finally, we compared the quality of the embeddings generated by Ocelli boosted with transition probabilities obtained from scVelo’ stochastic model (Fig. 2B) and Ocelli embeddings generated using MultiVelo stochastic and dynamical models (Supplementary Fig. S3B). While MultiVelo’s [27] stochastic model produced reasonable results (Fig. 4, Supplementary Figs S3, S4), scVelo [26] produced a better velocity stream, capturing all trajectories to terminal states. MultiVelo’s velocity analysis failed to capture the TAC-to-IRS Huxley’s layer transition (Supplementary Fig. S3B) despite its clear delineation by Ocelli’s diffusion maps. These results do not indicate a universal preference for scVelo [26] over MultiVelo [27]. Rather, they underscore the necessity of robust multimodal embeddings and highlight that, although often effective, current velocity algorithms may yield misleading outcomes. Consequently, users must rigorously evaluate transition probability methods to ensure accurate data interpretation. Ocelli remains fully adaptable to transitions from any method, allowing users complete flexibility and customization to analyze their dataset.

Diffusion-based multimodal exploration of subtrajectories

Building developmental single-cell atlases involves refined analysis of developmental subtrajectories from data consisting of multiple distinct lineages and cell types unrelated directly to a subtrajectory of interest [77]. Currently, such analysis entails clustering all cells, manual annotation of cell clusters, and supervised extraction of cell clusters that may belong to the studied subtrajectory. Here we show that data diffusion can facilitate the analysis of subtrajectories by unsupervised extraction of phenotypically related cells.

To illustrate multimodal diffusion-based extraction and refined reconstruction of a subtrajectory, we analyzed ASAP-seq human native hematopoietic differentiation data [12], obtained from simultaneous measurement of chromatin accessibility and protein epitopes (Fig. 5A) for 10,927 human bone marrow cells. First, consistent with the analysis results for the hair follicle dataset, Ocelli reconstructed the hematopoietic cell lineages with high accuracy, correctly assigning the highest chromatin accessibility weights to hematopoietic stem cells (HSCs), gradually decreasing with more differentiated states of cells for which protein epitope modality becomes more prevalent (Fig. 5B,C). The vast majority of cells present are CD14+ monocytes, B, and T cells, which obstructs visualization of the hematopoietic hierarchy of cells. We sought to refine the analysis of the HSCs differentiation trajectories. We selected a CD34+ positive cluster of 351 cells and assigned the uniform distribution equal to 1 for the selected cells and 0 otherwise (Fig. 5D). Using the same data diffusion strategy as in imputations (Fig. 3A), we performed diffusion to expand the initial cell set to similar cells along the multimodal affinity-based graph structure. Finally, for diffusion time t = 20, the expanded set of 1464 cells has been used as an input in Ocelli for multimodal visualization of the cells. The continuous trajectories from HSCs are readily apparent in the refined visualization of cells (Fig. 5E), which accurately reconstructed the early hematopoietic hierarchy [78, 79]. Moreover, the refined visualization shows distinct pathways of basophilic/mast cells, B cell progenitors, and cDCs that were buried in a coarse visualization of human bone marrow cells (Fig. 5A).

Cell selection strategy for refined analysis of trajectories. (A) The multimodal visualization of ASAP-seq human bone marrow data with coarse annotation of hematopoietic cell lineages. (B) Violin plots show a distribution of multimodal weights in each cluster from panel (A). (C) A distribution of 15 % of cells with the highest weights. (D) Cells from the HSCs cluster were assigned 1 s (t = 0), and the diffusion was performed for different diffusion times t = 1, 10, 20. (E) Cells selected at t = 20 were reanalyzed with Ocelli. The FLE plots show refined hematopoietic trajectories with cells colored by pseudotime or epitope level of lineage-specific protein markers. HSCs, hematopoietic stem cells; E, erythroid progenitors; pDCs plasmacytoid dendritic cells; Ba, basophilic/mast cell progenitors; B, B cell progenitors; cDCs, classical dendritic cells; M, myeloid progenitors.
Figure 5.

Cell selection strategy for refined analysis of trajectories. (A) The multimodal visualization of ASAP-seq human bone marrow data with coarse annotation of hematopoietic cell lineages. (B) Violin plots show a distribution of multimodal weights in each cluster from panel (A). (C) A distribution of 15 % of cells with the highest weights. (D) Cells from the HSCs cluster were assigned 1 s (t = 0), and the diffusion was performed for different diffusion times t = 1, 10, 20. (E) Cells selected at t = 20 were reanalyzed with Ocelli. The FLE plots show refined hematopoietic trajectories with cells colored by pseudotime or epitope level of lineage-specific protein markers. HSCs, hematopoietic stem cells; E, erythroid progenitors; pDCs plasmacytoid dendritic cells; Ba, basophilic/mast cell progenitors; B, B cell progenitors; cDCs, classical dendritic cells; M, myeloid progenitors.

Ocelli performance on datasets with distinct sets of modalities

To showcase how Ocelli deals with single-cell data generated by methods profiling distinct modalities in visualizing the same developmental trajectory, we applied Ocelli to multimodal human native hematopoietic differentiation data. We have already analyzed ASAP-seq data (chromatin accessibility and protein epitopes, Fig 5A). In this section, we leverage Ocelli to visualize multimodal data generated by (i) SHARE-seq [80] (chromatin accessibility and transcriptome, Supplementary Fig. S5A); and (ii) NTT-seq [15] (histone modifications H3K27ac and H3K27me3, Supplementary Fig. S5B). The SHARE-seq data involves 842,440 cells and it is the largest single-cell multimodal data that have been generated so far. NTT-seq data consists of 5236 cells. Ocelli-generated single-cell visualizations depict the same hematopoietic cell lineages and correctly reconstruct hematopoietic trajectories for each multimodal method, showing its high robustness to the type of profiled genome-wide features and the number of cells studied. When comparing the computed weights for different cells along the hematopoietic developmental trajectories (Supplementary Fig. S5C), the most notable feature is that HSCs have the strongest weights for the chromatin accessibility modality in the ASAP-seq and SHARE-seq datasets; while the H3K27me3 modality in the NTT-seq dataset. H3K27me3 is a repressive mark found in stem cells to be pivotal for proper cell differentiation or to remain unspecified [81, 82]. In particular, H3K27me3 marks are associated with gene repression for cell type-specific genes [83], and Ocelli correctly predicts the importance of this modification in HSCs. Classical and plasmacytoid dendritic cells (cDCs and pDCs) show a strong signal for chromatin accessibility in both ASAP-seq and SHARE-seq datasets and remarkable change in acetylation of histone H3 lysine 27 (H3K27ac), which mark genomic regions that indicate active both proximal and distal regions of transcription start sites. Interestingly, during dendritic cell development, chromatin structure undergoes global reorganization [84], where thousands of enhancers are activated in pDCs and cDCs [85–88].

Finally, we tested Ocelli on a trimodal dataset of PBMCs generated by Dogma-seq [12] (chromatin accessibility, transcriptome, protein epitopes, 7516 cells, Supplementary Fig. S5D). Ocelli reconstructed with high-resolution different states of T cells (CD4 and CD8 Naive, Memory, Effector, along with MAIT and gdT cells). To chart the contributions of each modality, we showed the modality weights (Supplementary Fig. S5E), which depict the strong contribution of chromatin accessibility in naive CD4 Naive T cells, RNA on regulatory T cells, and protein on Effector T, MAIT, gdT, B, and NK cells.

Utilizing Ocelli for single-modality data visualization

Gene expression programs (GEPs) determine cell identity and activity. GEPs are modules of coregulated genes [89–91]. GEPs maintain specific cell types and perform complex cellular activities such as proliferation, apoptosis, metabolism, differentiation, or responses to environmental cues. Each cell is a mixture of GEPs, and their relative contributions change continuously throughout cellular differentiation. Usually, GEPs are computationally determined in an unsupervised manner from scRNA-seq data by LDA [92] or non-negative matrix factorization [91]. We hypothesized that GEPs, inferred from unimodal scRNA-seq, could be leveraged directly as latent modalities to generate visualizations of scRNA-seq data with Ocelli. In this context, inference of GEPs in the first step would denoise single-cell data, and the multimodal analysis strategy implemented in Ocelli would accurately separate cell lineages and their states of different activities.

Here, we tested multimodal analysis framework on two developmental scRNA-seq datasets: (i) pancreatic endocrinogenesis [93], and (ii) cellular reprogramming to induced Pluripotent Stem Cells (iPSCs) [28]. First, we inferred GEPs using LDA analysis, where each LDA’s topic groups features into programs. Ocelli interprets each program as a modality, using the cell-topic distribution as modality weights. Then, we computed a low-dimensional MDM embedding on log-normalized topic-based latent modalities. Finally, we created a visualization of single cells with ForceAtlas2, leveraging inferred transition probabilities between cells. In the case of pancreatic endocrinogenesis data visualization, we used RNA velocity-derived transition probabilities. Ocelli accurately reconstructed complex developmental lineages (Fig. 6A), rare and transient cellular transitions (Fig. 6B), and different proliferative cell states (Fig. 6CE). In detail, four hormone-producing endocrine cells, Alpha, Beta, Epsilon, and Delta, arise from the endocrine progenitor cells originating from cycling Ductal cells. The first progenitor branching point leads to the development of Epsilon cells and an intermediate cell population that generates independently Alpha cells [94]. Subsequently, progenitors further develop into Alpha, Beta, and Delta lineages. Delta cells are produced primarily after E15.5, with only a few cells detected at E15.5 [94]. Notably, Ocelli visualized (i) both differentiation pathways to Alpha cells: through intermediate cell population with Epsilon cells and then directly from endocrine progenitor cells; (ii) a distinct pathway to Delta cells despite its scarcity at the embryonic day E15.5; and (iii) a cell cycle of Ductal cells. Ocelli’s visualization shows concise trajectories as compared to competitive approaches [26].

Sensitivity of Ocelli in reconstructing rare cell transitions. (A–E) Exploration of the pancreatic endocrinogenesis unimodal dataset with Ocelli. (A) The visualization of the cell trajectories generated by Ocelli. (B) scVelo’s [26] RNA velocity vector field projected onto developmental trajectories of endocrine cells reconstructed with Ocelli. (C–E) Reconstruction of the cell cycle of Ductal cells. (C) scVelo’s [26] RNA velocity vector field showing inferred cell cycle trajectory of Ductal cells. (D) A schematic view of cell cycle phases. (E) Scores of cell cycle phases calculated using the list of cell cycle signature genes [100]. (F–L) Reconstruction of cell trajectories from the cell reprogramming dataset [28] with Ocelli. (F) Three scatter plots show projections of a 3D embedding constructed with Ocelli. The arrow highlights a rare developmental trajectory of iPSCs. The table specifies the fraction of iPSCs concerning all cells from a given timestamp. (G–L) Ocelli captures multiple developmental trajectories on a single joint embedding. Plots show the activity of gene signatures related to cell identities [28]: pluripotency (G); keratinocytes at differentiation stages from basal to terminally differentiated cells, where each cell is colored by its dominant differentiation phase based on the highest z-score (H); XEN stem cells (I); spongiotrophoblasts (J); spiral artery trophoblast giant cells (K); trophoblast progenitor cells (L).
Figure 6.

Sensitivity of Ocelli in reconstructing rare cell transitions. (AE) Exploration of the pancreatic endocrinogenesis unimodal dataset with Ocelli. (A) The visualization of the cell trajectories generated by Ocelli. (B) scVelo’s [26] RNA velocity vector field projected onto developmental trajectories of endocrine cells reconstructed with Ocelli. (C–E) Reconstruction of the cell cycle of Ductal cells. (C) scVelo’s [26] RNA velocity vector field showing inferred cell cycle trajectory of Ductal cells. (D) A schematic view of cell cycle phases. (E) Scores of cell cycle phases calculated using the list of cell cycle signature genes [100]. (F–L) Reconstruction of cell trajectories from the cell reprogramming dataset [28] with Ocelli. (F) Three scatter plots show projections of a 3D embedding constructed with Ocelli. The arrow highlights a rare developmental trajectory of iPSCs. The table specifies the fraction of iPSCs concerning all cells from a given timestamp. (G–L) Ocelli captures multiple developmental trajectories on a single joint embedding. Plots show the activity of gene signatures related to cell identities [28]: pluripotency (G); keratinocytes at differentiation stages from basal to terminally differentiated cells, where each cell is colored by its dominant differentiation phase based on the highest z-score (H); XEN stem cells (I); spongiotrophoblasts (J); spiral artery trophoblast giant cells (K); trophoblast progenitor cells (L).

In the second example, reprogramming of cells to iPSCs, we leveraged Waddington-OT inferred cell-cell transition probabilities. The transition probabilities were obtained by the optimal transport modeling of the cell fates between two consecutive time points at which the cells were collected for single-cell profiling. Similarly to previous examples, Ocelli showed high sensitivity to capture precisely developmental branching points and to reconstruct rare cell transitions (Fig. 6F). For example, Ocelli visualized a transient developmental trajectory to iPSCs in serum condition (Fig. 6F and G), comprising as little as 0.45% of all cells at day 11. In addition to all previously annotated cell types [28], Ocelli was able to visualize a distinct population of keratinocytes depicting an asynchronous multiple-stage interfollicular differentiation [95] (Fig. 6H). Moreover, rare populations of extraembryonic endoderm (XEN) and trophoblast-like cells with their differentiation trajectories are clearly distinguishable (Fig. 6IL). Earlier analyses [28] could not capture such precise trajectories and cell subtypes in a single, joint visualization. Taken together, Ocelli produces highly sensitive and accurate embeddings capable of showing rare cell transitions and multiple differentiation trajectories.

Discussion

Ocelli is an open-source visualization framework that uses new MDMs to enhance multimodal single-cell sequencing data analysis and exploration. The framework was specifically designed to resolve developmental processes and diminish caveats of single-cell genomics data, such as data sparsity. Ocelli generates informative visualizations of developmental multimodal data and outperforms state-of-the-art multimodal algorithms regarding the computational runtime and the quality of reconstructed developmental trajectories. To our knowledge, Ocelli is the only method that learns low-dimensional embedding of the data coalescing information from both highly dimensional multimodal features and modality-specific transition probabilities between cells. As a diffusion-based method, Ocelli enables scalable multimodal imputations of missing feature activity levels and has the potential to facilitate the exploration of trajectories from large datasets, e.g. mammalian organogenesis [77], which typically consist of many unrelated differentiation subtrajectories or in which some of the ancestral states are missing. Finally, Ocelli can be exploited on unimodal single-cell (e.g., scRNA-seq) data to generate more informative visualizations with the prospective to better resolve cell identities, activities, and spare transitions along cell’s differentiation trajectories.

The ability of multimodal algorithms to deconvolute and interpret multi-faceted data, termed explainability, is one of the critical factors that govern their applicability to resolve biological processes. In other words, a method producing highly interpretable results can be preferred over a slightly more accurate but less interpretable method. The existing methods for analysis of single-cell multimodal data vary in terms of explainability. Deep learning models tend to be less explainable. The field of explainable deep learning is rapidly developing, with many attempts at constructing explainable models [96]. However, learned parameter values of neural networks are often impossible to interpret. They represent meta properties of data, understandable by the computer, creating black-box models. New developments are needed for reliable large-scale single-cell applications. The explainability of a model can be increased with the use of interpretable parameters. For example, WNN [44] produces cell-specific modality weights that inform which modality is most informative for each cell. At the same time, MOFA+ [45] computes a distribution of modality features for each latent dimension of the reduced embedding. Applying such methods to multimodal single-cell data creates a joint low-dimensional embedding together with additional interpretable background information about cells or features, increasing the explainability of the output. Ocelli does increase explainability, integrating the advantages of the abovementioned methods to explore and visualize developmental multimodal single-cell data. In applying Ocelli, we preprocess data using probabilistic topic modeling that denoises the data while preserving its continuous structure. We adapt diffusion maps to a multimodal setting that allows us to include local and global information from each modality. Moreover, we directly include transition probabilities, e.g. from RNA velocity or Waddington-OT, when generating low-dimensional embeddings, which improves visualization of cell lineages and branching decisions. Lastly, we reconstruct highly sparse features using multimodal imputations, which allows us to recover feature-to-feature and cell-to-cell correlations.

Methods modeling RNA dynamics such as Velocyto [25] or scVelo [26] model the transcriptional content of cells and predict future states. Based on the relative kinetics of unspliced and spliced RNA, the methods compute a network of transition probabilities between cells and build a ‘velocity’ vector field that portrays cellular trajectories. In the extended velocity framework, MultiVelo [27] incorporates epigenomic information into the computation of transition probabilities. The vector field of mentioned methods was routinely visualized on the UMAP [46] or FLE [53] that were built without including cell transition probabilities. Recently, the VeloViz [97] method utilized cellular transitions to visualize unimodal single-cell data. VeloViz obtains each cell’s direction toward its predicted transcriptional state. Then, it calculates the proximity between all cell pairs by comparing their spatial direction to the predicted transcriptional one. Cells with the smallest proximities produce a nearest neighbor graph that is visualized. However, the drawback of VeloViz is that it requires coordinates of current and projected future transcription states for each cell, which limits a spectrum of applicable methods that produce transitions between cells, e.g. it is unable to leverage Optimal Transport inferred transition probabilities [28]. In addition, VeloViz applies to the visualization of scRNA-seq data only.

MDMs implemented in Ocelli allow us to impute the genome-wide features’ activity in single cells and recover feature-to-feature relationships or modality-specific cell-to-cell correlations. Scalable implementation of the imputation algorithm enables real-time imputations of feature sets of interest for large multimodal datasets that contain a million cells. We envision the single-cell multimodal diffusion-based imputation can enhance the analysis of highly sparse features originating from chromatin state studies, e.g., profiling of chromatin accessibility, histone modifications, or DNA-bound proteins. Single-cell multimodal profiling technologies that leverage scCUT&TAG-based approaches [13–16] use Tn5 and/or Tn5-conjugated proteins (e.g. pA-Tn5 or nano-Tn5) to cut genomic DNA at open chromatin regions or close to sites of histone modifications. In the case of multimodal profiling of chromatin states, only one or two modalities can be captured for the same region of genomic DNA from the same cell. In the case of more modalities studied, multimodal imputations are inevitable to infer the chromatin states at single-nucleosome and single-cell resolution. Moreover, the proposed multimodal imputations can enhance our ability to identify distal elements, e.g., enhancer-promoter connections, on a genome-wide basis from single-cell profiling studies [98] and facilitate the building of single-cell enhancer databases [99]. One apparent shortcoming of diffusion-based imputation is that smoothing feature activity levels results in real numbers of the imputed levels instead of counts. The imputed levels include “averaging” of multiple processes, both technical and biological, such as inefficient capturing of feature activity levels in single-cell sequencing technologies, as well as stochastic events during gene expression related to the dynamic switching of chromatin states at promoters and enhancers and RNA/protein production in bursts. The stochastic nature of the dynamics of chromatin states, transcription, and translation makes it impossible to infer genuine feature activity levels in single cells. Imputations have to involve the ‘averaging’ of multiple stochastic processes.

Overall, we have found that Ocelli employs powerful tools to study and visualize extensive developmental multimodal single-cell data to yield critical insights into the behavior of differentiating cells. We have shown, that Ocelli is a flexible diffusion-based method harnessing information from any number and type of modalities, along with prior knowledge of cell-to-cell transitions. It generates rich low-dimensional representation of continuous high-dimensional single- and mmultimodal data, which facilitates detection of relevant features from genome-wide measurements.

Acknowledgements

We thank members of the Computational Genomics Group for their discussions.

Author contributions: M.T. conceived and supervised the project. M.T. and P.R. designed and developed Ocelli. P.R. implemented Ocelli and wrote its documentation. M.T. and P.R. conducted the multimodal data analysis. M.T. and P.R. wrote the manuscript.

Supplementary data

Supplementary data is available at NAR Genomics & Bioinformatics online.

Conflict of interest

M.T. and P.R. are named inventors on European Patent application EP23169340.9 relating to the work of this manuscript.

Funding

P.R. and M.T. are supported by the International Centre for Translational Eye Research (FENG.02.01-IP.05-T005/23) project, which is carried out within the International Research Agendas programme of the Foundation for Polish Science co-financed by the European Union under the European Funds for Smart Economy 2021-2027 (FENG); and grants funded by National Science Center, Poland: the Sonata Bis 12 grant 2022/46/E/NZ2/00370 (M.T.), and Preludium 21 2022/45/N/NZ2/02311 (P.R.).

Code availability

Ocelli is an open-source resource, available under a BSD-Clause 2 license at https://github.com/TabakaLab/ocelli. The documentation and tutorials are available at https://ocelli.readthedocs.io.

Data availability

Single-cell data are available at the Gene Expression Omnibus under accession numbers: hair follicle data (GSE140203), human bone marrow data (ASAP-seq - GSE156478, NTT-seq - GSE212588, SHARE-seq - GSE216464), peripheral blood mononuclear cells data (GSE166188), pancreatic endocrinogenesis data (GSE132188), cell reprogramming data (GSE115943). Simulated data are available at https://doi.org/10.6084/m9.figshare.22700020.v1. Ocelli v1.0.0 source code is available at https://doi.org/10.5281/zenodo.15078053.

References

1.

Sun
 
S
 
A survey of multi-view machine learning
.
Neural Comput Appl
.
2013
;
23
:
2031
8
..

2.

Xu
 
C
,
Tao
 
D
,
Xu
 
C
 
A survey on multi-view learning
.
arXiv
20 April 2013, preprint: not peer reviewed
https://arxiv.org/abs/1304.5634.

3.

Stuart
 
T
,
Satija
 
R
 
Integrative single-cell analysis
.
Nat Rev Genet
.
2019
;
20
:
257
72
..

4.

Cao
 
J
,
Cusanovich
 
DA
,
Ramani
 
V
 et al. .  
Joint profiling of chromatin accessibility and gene expression in thousands of single cells
.
Science
.
2018
;
361
:
1380
5
..

5.

Zhu
 
C
,
Yu
 
M
,
Huang
 
H
 et al. .  
An ultra high-throughput method for single-cell joint analysis of open chromatin and transcriptome
.
Nat Struct Mol Biol
.
2019
;
26
:
1063
70
..

6.

Ma
 
S
,
Zhang
 
B
,
LaFave
 
LM
 et al. .  
Chromatin potential identified by shared single-cell profiling of RNA and chromatin
.
Cell
.
2020
;
183
:
1103
16
..

7.

Chen
 
S
,
Lake
 
BB
,
Zhang
 
K
 
High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell
.
Nat Biotechnol
.
2019
;
37
:
1452
7
..

8.

Hunt
 
KV
,
Burnard
 
SM
,
Roper
 
EA
 et al. .  
scTEM-seq: Single-cell analysis of transposable element methylation to link global epigenetic heterogeneity with transcriptional programs
.
Sci Rep
.
2022
;
12
:
5776
.

9.

Zhu
 
C
,
Zhang
 
Y
,
Li
 
YE
 et al. .  
Joint profiling of histone modifications and transcriptome in single cells from mouse brain
.
Nat Methods
.
2021
;
18
:
283
92
..

10.

Pan
 
L
,
Ku
 
WL
,
Tang
 
Q
 et al. .  
scPCOR-seq enables co-profiling of chromatin occupancy and RNAs in single cells
.
Commun Biol
.
2022
;
5
:
678
.

11.

Stoeckius
 
M
,
Hafemeister
 
C
,
Stephenson
 
W
 et al. .  
Simultaneous epitope and transcriptome measurement in single cells
.
Nat Methods
.
2017
;
14
:
865
8
..

12.

Mimitou
 
EP
,
Lareau
 
CA
,
Chen
 
KY
 et al. .  
Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells
.
Nat Biotechnol
.
2021
;
39
:
1246
58
..

13.

Bartosovic
 
M
,
Kabbe
 
M
,
Castelo-Branco
 
G
 
Single-cell CUT&Tag profiles histone modifications and transcription factors in complex tissues
.
Nat Biotechnol
.
2021
;
39
:
825
35
..

14.

Gopalan
 
S
,
Wang
 
Y
,
Harper
 
NW
 et al. .  
Simultaneous profiling of multiple chromatin proteins in the same cells
.
Mol Cell
.
2021
;
81
:
4736
46
..

15.

Stuart
 
T
,
Hao
 
S
,
Zhang
 
B
 et al. .  
Nanobody-tethered transposition enables multifactorial chromatin profiling at single-cell resolution
.
Nat Biotechnol
.
2023
;
41
:
806
12
..

16.

Bartosovic
 
M
,
Castelo-Branco
 
G
 
Multimodal chromatin profiling using nanobody-based single-cell CUT&Tag
.
Nat Biotechnol
.
2023
;
41
:
794
805
..

17.

Yeung
 
J
,
Florescu
 
M
,
Zeller
 
P
 et al. .  
scChIX-seq infers dynamic relationships between histone modifications in single cells
.
Nat Biotechnol
.
2023
;
41
:
813
23
..

18.

Tedesco
 
M
,
Giannese
 
F
,
Lazarević
 
D
 et al. .  
Chromatin velocity reveals epigenetic dynamics by single-cell profiling of heterochromatin and euchromatin
.
Nat Biotechnol
.
2022
;
40
:
235
44
..

19.

Van Dijk
 
D
,
Sharma
 
R
,
Nainys
 
J
 et al. .  
Recovering gene interactions from single-cell data using data diffusion
.
Cell
.
2018
;
174
:
716
29
..

20.

Haghverdi
 
L
,
Buettner
 
F
,
Theis
 
FJ
 
Diffusion maps for high-dimensional single-cell analysis of differentiation data
.
Bioinformatics
.
2015
;
31
:
2989
98
..

21.

Ochab-Marcinek
 
A
,
Tabaka
 
M
 
Bimodal gene expression in noncooperative regulatory systems
.
Proc Natl Acad Sci USA
.
2010
;
107
:
22096
101
..

22.

Trapnell
 
C
,
Cacchiarelli
 
D
,
Grimsby
 
J
 et al. .  
The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells
.
Nat Biotechnol
.
2014
;
32
:
381
6
..

23.

Mao
 
Q
,
Wang
 
L
,
Goodison
 
S
 et al. .  
Dimensionality reduction via graph structure learning
.
Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining
.
2015
;
Sydney, Australia
765
74
..

24.

Qiu
 
X
,
Mao
 
Q
,
Tang
 
Y
 et al. .  
Reversed graph embedding resolves complex single-cell trajectories
.
Nat Methods
.
2017
;
14
:
979
82
..

25.

La
 
Manno G
,
Soldatov
 
R
,
Zeisel
 
A
 et al. .  
RNA velocity of single cells
.
Nature
.
2018
;
560
:
494
8
..

26.

Bergen
 
V
,
Lange
 
M
,
Peidli
 
S
 et al. .  
Generalizing RNA velocity to transient cell states through dynamical modeling
.
Nat Biotechnol
.
2020
;
38
:
1408
14
..

27.

Li
 
C
,
Virgilio
 
MC
,
Collins
 
KL
 et al. .  
Multi-omic single-cell velocity models epigenome–transcriptome interactions and improves cell fate prediction
.
Nat Biotechnol
.
2023
;
41
:
387
98
..

28.

Schiebinger
 
G
,
Shu
 
J
,
Tabaka
 
M
 et al. .  
Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming
.
Cell
.
2019
;
176
:
928
43
..

29.

Coifman
 
RR
,
Lafon
 
S
 
Diffusion Maps
.
Appl Comput Harmon Anal
.
2006
;
21
:
5
30
..

30.

Angerer
 
P
,
Haghverdi
 
L
,
Büttner
 
M
 et al. .  
Destiny: diffusion maps for large-scale single-cell data in R
.
Bioinformatics
.
2016
;
32
:
1241
3
..

31.

Haghverdi
 
L
,
Büttner
 
M
,
Wolf
 
FA
 et al. .  
Diffusion pseudotime robustly reconstructs lineage branching
.
Nat Methods
.
2016
;
13
:
845
8
..

32.

Li
 
B
,
Gould
 
J
,
Yang
 
Y
 et al. .  
Cumulus provides cloud-based data analysis for large-scale single-cell and single-nucleus RNA-seq
.
Nat Methods
.
2020
;
17
:
793
8
..

33.

LeCun
 
Y
,
Bengio
 
Y
,
Hinton
 
G
 
Deep learning
.
Nature
.
2015
;
521
:
436
44
..

34.

Krizhevsky
 
A
,
Sutskever
 
I
,
Hinton
 
GE
 
Imagenet classification with deep convolutional neural networks
.
Commun ACM
.
2017
;
60
:
84
90
..

35.

Kramer
 
MA
 
Nonlinear principal component analysis using autoassociative neural networks
.
AICHE J
.
1991
;
37
:
233
43
..

36.

Gong
 
B
,
Zhou
 
Y
,
Purdom
 
E
 
Cobolt: integrative analysis of multimodal single-cell sequencing data
.
Genome Biol
.
2021
;
22
:
1
21
..

37.

Minoura
 
K
,
Abe
 
K
,
Nam
 
H
 et al. .  
A mixture-of-experts deep generative model for integrated analysis of single-cell multiomics data
.
Cell Rep Meth
.
2021
;
1
:
100071
.

38.

Liu
 
C
,
Huang
 
H
,
Yang
 
P
 
Multi-task learning from multimodal single-cell omics with Matilda
.
Nucleic Acids Res
.
2023
;
51
:
e45
.

39.

Ashuach
 
T
,
Gabitto
 
MI
,
Koodli
 
RV
 et al. .  
MultiVI: deep generative model for the integration of multimodal data
.
Nat Methods
.
2023
;
20
:
1222
31
..

40.

Zhavoronkov
 
A
,
Ivanenkov
 
YA
,
Aliper
 
A
 et al. .  
Deep learning enables rapid identification of potent DDR1 kinase inhibitors
.
Nat Biotechnol
.
2019
;
37
:
1038
40
..

41.

Ballé
 
J
,
Laparra
 
V
,
Simoncelli
 
EP
 
End-to-end optimized image compression
.
arXiv
3 March 2017, preprint: not peer reviewed
.

42.

Cho
 
K
 
Simple sparsification improves sparse denoising autoencoders in denoising highly corrupted images
.
Int Conf Machine Learn
.
2013
;
28
:
432
40
.

43.

Hornik
 
K
,
Stinchcombe
 
M
,
White
 
H
 
Multilayer feedforward networks are universal approximators
.
Neural Netw
.
1989
;
2
:
359
66
.

44.

Hao
 
Y
,
Hao
 
S
,
Andersen-Nissen
 
E
 et al. .  
Integrated analysis of multimodal single-cell data
.
Cell
.
2021
;
184
:
3573
87
.

45.

Argelaguet
 
R
,
Arnol
 
D
,
Bredikhin
 
D
 et al. .  
MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data
.
Genome Biol
.
2020
;
21
:
111
.

46.

McInnes
 
L
,
Healy
 
J
,
Saul
 
N
 et al. .  
UMAP: uniform manifold approximation and projection
.
J Open Source Softw
.
2018
;
3
:
861
.

47.

Hoffman
 
MD
,
Blei
 
DM
,
Wang
 
C
 et al. .  
Stochastic variational inference
.
J Mach Learn Res
.
2013
;
14
:
1303
47
.

48.

Blei
 
DM
,
Kucukelbir
 
A
,
McAuliffe
 
JD
 
Variational inference: a review for statisticians
.
J Am Stat Assoc
.
2017
;
112
:
859
77
.

49.

Cheng
 
M
,
Li
 
Z
,
Costa
 
IG
 
MOJITOO: a fast and universal method for integration of multimodal single-cell data
.
Bioinformatics
.
2022
;
38
:
i282
9
.

50.

Moritz
 
P
,
Nishihara
 
R
,
Wang
 
S
 et al. .  
Ray: a distributed framework for emerging {AI} applications
.
13th USENIX symposium on operating systems design and implementation (OSDI 18)
.
2018
;
Carlsbad, CA, USA
561
77
.

51.

Pedregosa
 
F
,
Varoquaux
 
G
,
Gramfort
 
A
 et al. .  
Scikit-learn: machine learning in Python
.
J Mach Learn Res
.
2011
;
12
:
2825
30
.

52.

Lehoucq
 
RB
,
Sorensen
 
DC
,
Yang
 
C
 
ARPACK Users’ Guide: Solution of Large-scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods
.
1998
;
Philadelphia, PA, USA
SIAM
.

53.

Jacomy
 
M
,
Venturini
 
T
,
Heymann
 
S
 et al. .  
ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software
.
PloS One
.
2014
;
9
:
e98679
.

54.

Blei
 
DM
,
Ng
 
AY
,
Jordan
 
MI
 
Latent dirichlet allocation
.
J Mach Learn Res
.
2003
;
3
:
993
1022
.

55.

Lynch
 
AW
,
Theodoris
 
CV
,
Long
 
HW
 et al. .  
MIRA: joint regulatory modeling of multimodal expression and chromatin accessibility in single cells
.
Nat Methods
.
2022
;
19
:
1097
108
..

56.

Stuart
 
T
,
Srivastava
 
A
,
Madad
 
S
 et al. .  
Single-cell chromatin state analysis with Signac
.
Nat Methods
.
2021
;
18
:
1333
41
..

57.

Wolf
 
FA
,
Angerer
 
P
,
Theis
 
FJ
 
SCANPY: large-scale single-cell gene expression data analysis
.
Genome Biol
.
2018
;
19
:
15
.

58.

Malkov
 
Y
,
Yashunin
 
D
 
Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
.
IEEE T Pattern Anal Mach Intell
.
2018
;
42
:
824
36
.

59.

Ester
 
M
,
Kriegel
 
HP
,
Sander
 
J
 et al. .  
A density-based algorithm for discovering clusters in large spatial databases with noise
.
Proceedings of the Second International Conference on Knowledge Discovery and Data Mining
.
1996
;
Portland, OR, USA
226
31
.

60.

Schep
 
AN
,
Wu
 
B
,
Buenrostro
 
JD
 et al. .  
chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data
.
Nat Methods
.
2017
;
14
:
975
8
..

61.

Rosvall
 
M
,
Bergstrom
 
CT
 
Maps of random walks on complex networks reveal community structure
.
Proc Natl Acad Sci USA
.
2008
;
105
:
1118
23
..

62.

Rosvall
 
M
,
Axelsson
 
D
,
Bergstrom
 
CT
 
The map equation
.
Eur Phys J: Spec Top
.
2009
;
178
:
13
23
..

63.

Csardi
 
G
,
Nepusz
 
T
 
The igraph software package for complex network research
.
Int J Complex Syst
.
2006
;
1695
:
1
9
.

64.

Lange
 
M
,
Bergen
 
V
,
Klein
 
M
 et al. .  
CellRank for directed single-cell fate mapping
.
Nat Methods
.
2022
;
19
:
159
70
..

65.

Duan
 
B
,
Zhou
 
C
,
Zhu
 
C
 et al. .  
Model-based understanding of single-cell CRISPR screening
.
Nat Commun
.
2019
;
10
:
2233
.

66.

McKenna
 
A
,
Findlay
 
GM
,
Gagnon
 
JA
 et al. .  
Whole-organism lineage tracing by combinatorial and cumulative genome editing
.
Science
.
2016
;
353
:
aaf7907
.

67.

Tabaka
 
M
,
Gould
 
J
,
Regev
 
A
 
scSVA: an interactive tool for big data visualization and exploration in single-cell omics
.
bioRxiv
6 January 2019, preprint: not peer reviewed
.

68.

Yang
 
H
,
Adam
 
RC
,
Ge
 
Y
 et al. .  
Epithelial-mesenchymal micro-niches govern stem cell lineage choices
.
Cell
.
2017
;
169
:
483
96
..

69.

Chen
 
H
,
Lareau
 
C
,
Andreani
 
T
 et al. .  
Assessment of computational methods for the analysis of single-cell ATAC-seq data
.
Genome Biol
.
2019
;
20
:
241
.

70.

Hou
 
W
,
Ji
 
Z
,
Ji
 
H
 et al. .  
A systematic evaluation of single-cell RNA-sequencing imputation methods
.
Genome Biol
.
2020
;
21
:
218
.

71.

Huang
 
M
,
Wang
 
J
,
Torre
 
E
 et al. .  
SAVER: gene expression recovery for single-cell RNA sequencing
.
Nat Methods
.
2018
;
15
:
539
42
..

72.

Li
 
WV
,
Li
 
JJ
 
An accurate and robust imputation method scImpute for single-cell RNA-seq data
.
Nat Commun
.
2018
;
9
:
997
.

73.

Wagner
 
F
,
Yan
 
Y
,
Yanai
 
I
 
K-nearest neighbor smoothing for high-throughput single-cell RNA-Seq data
.
bioRxiv
9 April 2018, preprint: not peer reviewed
.

74.

Talwar
 
D
,
Mongia
 
A
,
Sengupta
 
D
 et al. .  
AutoImpute: autoencoder based imputation of single-cell RNA-seq data
.
Sci Rep
.
2018
;
8
:
16329
.

75.

Lopez
 
R
,
Regier
 
J
,
Cole
 
MB
 et al. .  
Deep generative modeling for single-cell transcriptomics
.
Nat Methods
.
2018
;
15
:
1053
58
..

76.

McLean
 
CY
,
Bristor
 
D
,
Hiller
 
M
 et al. .  
GREAT improves functional interpretation of cis-regulatory regions
.
Nat Biotechnol
.
2010
;
28
:
495
501
..

77.

Cao
 
J
,
Spielmann
 
M
,
Qiu
 
X
 et al. .  
The single-cell transcriptional landscape of mammalian organogenesis
.
Nature
.
2019
;
566
:
496
502
..

78.

Tusi
 
BK
,
Wolock
 
SL
,
Weinreb
 
C
 et al. .  
Population snapshots predict early haematopoietic and erythroid hierarchies
.
Nature
.
2018
;
555
:
54
60
..

79.

Watcham
 
S
,
Kucinski
 
I
,
Gottgens
 
B
 
New insights into hematopoietic differentiation landscapes from single-cell RNA sequencing
.
Blood
.
2019
;
133
:
1415
26
.

80.

Hu
 
Y
,
Ma
 
S
,
Kartha
 
VK
 et al. .  
Single-cell multi-scale footprinting reveals the modular organization of DNA regulatory elements
.
bioRxiv
29 March 2023, preprint: not peer reviewed
.

81.

Bernstein
 
BE
,
Mikkelsen
 
TS
,
Xie
 
X
 et al. .  
A bivalent chromatin structure marks key developmental genes in embryonic stem cells
.
Cell
.
2006
;
125
:
315
26
..

82.

Abraham
 
BJ
,
Cui
 
K
,
Tang
 
Q
 et al. .  
Dynamic regulation of epigenomic landscapes during hematopoiesis
.
BMC Genom
.
2013
;
14
:
1
15
.

83.

Cai
 
Y
,
Zhang
 
Y
,
Loh
 
YP
 et al. .  
H3K27me3-rich genomic regions can function as silencers to repress gene expression via chromatin interactions
.
Nat Commun
.
2021
;
12
:
719
.

84.

Kurotaki
 
D
,
Kikuchi
 
K
,
Cui
 
K
 et al. .  
Chromatin structure undergoes global and local reorganization during murine dendritic cell development and activation
.
Proc Natl Acad Sci
.
2022
;
119
:
e2207009119
.

85.

Tian
 
Y
,
Meng
 
L
,
Zhang
 
Y
 
Epigenetic regulation of dendritic cell development and function
.
Cancer J
.
2017
;
23
:
302
7
..

86.

Paul
 
F
,
Amit
 
I
 
Plasticity in the transcriptional and epigenetic circuits regulating dendritic cell lineage specification and function
.
Curr Opin Immunol
.
2014
;
30
:
1
8
..

87.

Paul
 
F
,
Arkin
 
Y
,
Giladi
 
A
 et al. .  
Transcriptional heterogeneity and lineage commitment in myeloid progenitors
.
Cell
.
2015
;
163
:
1663
77
..

88.

Lin
 
Q
,
Chauvistré
 
H
,
Costa
 
IG
 et al. .  
Epigenetic program and transcription factor circuitry of dendritic cell development
.
Nucleic Acids Res
.
2015
;
43
:
9680
93
.

89.

Eisen
 
MB
,
Spellman
 
PT
,
Brown
 
PO
 et al. .  
Cluster analysis and display of genome-wide expression patterns
.
Proc Natl Acad Sci USA
.
1998
;
95
:
14863
8
..

90.

Segal
 
E
,
Shapira
 
M
,
Regev
 
A
 et al. .  
Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data
.
Nat Genet
.
2003
;
34
:
166
76
..

91.

Kotliar
 
D
,
Veres
 
A
,
Nagy
 
MA
 et al. .  
Identifying gene expression programs of cell-type identity and cellular activity with single-cell RNA-Seq
.
Elife
.
2019
;
8
:
e43803
.

92.

Wallrapp
 
A
,
Riesenfeld
 
SJ
,
Burkett
 
PR
 et al. .  
The neuropeptide NMU amplifies ILC2-driven allergic lung inflammation
.
Nature
.
2017
;
549
:
351
6
..

93.

Bastidas-Ponce
 
A
,
Tritschler
 
S
,
Dony
 
L
 et al. .  
Massive single-cell mRNA profiling reveals a detailed roadmap for pancreatic endocrinogenesis
.
Development
.
2019
;
146
:
dev.173849
.

94.

Yu
 
XX
,
Qiu
 
WL
,
Yang
 
L
 et al. .  
Sequential progenitor states mark the generation of pancreatic endocrine lineages in mice and humans
.
Cell Res
.
2021
;
31
:
886
903
..

95.

Joost
 
S
,
Zeisel
 
A
,
Jacob
 
T
 et al. .  
Single-cell transcriptomics reveals that differentiation and spatial signatures shape epidermal and hair follicle heterogeneity
.
Cell Syst
.
2016
;
3
:
221
37
..

96.

Ras
 
G
,
Xie
 
N
,
Van Gerven
 
M
 et al. .  
Explainable deep learning: a field guide for the uninitiated
.
J Artif Intell Res
.
2022
;
73
:
329
97
..

97.

Atta
 
L
,
Sahoo
 
A
,
Fan
 
J
 
VeloViz: RNA velocity-informed embeddings for visualizing cellular trajectories
.
Bioinformatics
.
2022
;
38
:
391
6
..

98.

Pliner
 
HA
,
Packer
 
JS
,
McFaline-Figueroa
 
JL
 et al. .  
Cicero predicts cis-regulatory DNA interactions from single-cell chromatin accessibility data
.
Mol Cell
.
2018
;
71
:
858
71
..

99.

Gao
 
T
,
Zheng
 
Z
,
Pan
 
Y
 et al. .  
scEnhancer: a single-cell enhancer resource with annotation across hundreds of tissue/cell types in three species
.
Nucleic Acids Res
.
2022
;
50
:
D371
9
..

100.

Tirosh
 
I
,
Izar
 
B
,
Prakadan
 
SM
 et al. .  
Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq
.
Science
.
2016
;
352
:
189
96
..

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Supplementary data

Comments

0 Comments
Submit a comment
You have entered an invalid code
Thank you for submitting a comment on this article. Your comment will be reviewed and published at the journal's discretion. Please check for further notifications by email.