Publications

Accurate and Efficient Integrative Reference-Informed Spatial Domain Detection for Spatial Transcriptomics

Spatially resolved transcriptomics (SRT) studies are becoming increasingly common and increasingly large, offering unprecedented opportunities to characterize the spatial and functional organization of complex tissues. Here, we introduce a computational method, IRIS, that characterizes the spatial organization of complex tissues through accurate and efficient detection of spatial domains. IRIS uniquely leverages the widespread availability of single-cell RNA-seq data for reference-informed spatial domain detection, integrates multiple SRT tissue slices jointly while explicitly considering correlation both within and across slices, produces biologically interpretable spatial domains, and benefits from multiple algorithmic innovations for highly scalable computation. We demonstrate the advantages of IRIS through in-depth analysis of six SRT datasets from different technologies across various tissues, species, and spatial resolutions. In these applications, IRIS attains an unprecedented 39% to 1,083% accuracy gain over existing methods in the gold standard dataset with known ground truth. Furthermore, IRIS is 4.6 to 666.0 times faster than existing methods in moderate-sized datasets and is the only method effective and applicable to large-scale SRT datasets, including the very recent Stereo-seq and 10x Xenium. As a result, IRIS uncovers the fine-scale structures of brain regions, reveals the spatial heterogeneity of distinct tumor microenvironments, and characterizes the structural changes of the seminiferous tubes in the testis associated with diabetes, all at a speed and accuracy unachievable by existing approaches.

Ying Ma and Xiang Zhou

Spatially informed cell type deconvolution for spatial transcriptomics

Many spatially resolved transcriptomic technologies do not have single-cell resolution but measure the average gene expression for each spot from a mixture of cells of potentially heterogeneous cell types. Here, we introduce a deconvolution method, conditional autoregressive-based deconvolution (CARD), that combines cell-type-specific expression information from single-cell RNA sequencing (scRNA-seq) with correlation in cell-type composition across tissue locations. Modeling spatial correlation allows us to borrow the cell-type composition information across locations, improving accuracy of deconvolution even with a mismatched scRNA-seq reference. CARD can also impute cell-type compositions and gene expression levels at unmeasured tissue locations to enable the construction of a refined spatial tissue map with a resolution arbitrarily higher than that measured in the original study and can perform deconvolution without an scRNA-seq reference. Applications to four datasets, including a pancreatic cancer dataset, identified multiple cell types and molecular markers with distinct spatial localization that define the progression, heterogeneity and compartmentalization of pancreatic cancer.

Ying Ma and Xiang Zhou

Integrative differential expression and gene set enrichment analysis using summary statistics for scRNA-seq studies

Differential expression (DE) analysis and gene set enrichment (GSE) analysis are commonly applied in single-cell RNA sequencing (scRNA-seq) studies. Here, we develop an integrative and scalable computational method, iDEA, to perform joint DE and GSE analysis through a hierarchical Bayesian framework. By integrating DE and GSE analyses, iDEA can improve the power and consistency of DE analysis and the accuracy of GSE analysis. Importantly, iDEA uses only DE summary statistics as input, enabling effective data modeling through complementing and pairing with various existing DE methods. We illustrate the benefits of iDEA with extensive simulations. We also apply iDEA to analyze three scRNA-seq data sets, where iDEA achieves up to five-fold power gain over existing GSE methods and up to 64% power gain over existing DE methods.

Ying Ma, Shiquan Sun, Xuequn Shang, Evan T Keller, Mengjie Chen, Xiang Zhou

List of publications

Accurate and Efficient Integrative Reference-Informed Spatial Domain Detection for Spatial Transcriptomics
Ying Ma and Xiang Zhou
Nature Methods 2024

ExPRSweb: An online repository with polygenic risk scores for common health-related exposures
Ying Ma, Snehal Patil, Xiang Zhou, Bhramar Mukherjee, Lars G Fritsche
The American Journal of Human Genetics 2022

Modeling zero inflation is not necessary for spatial transcriptomics
Peiyao Zhao, Jiaqiang Zhu, Ying Ma, Xiang Zhou
Genome Biology 2022

Spatially informed cell type deconvolution for spatial transcriptomics
Ying Ma and Xiang Zhou
Nature Biotechnology 2022

Genetic prediction of complex traits with polygenic scores: a statistical review
Ying Ma and Xiang Zhou
Trends in Genetics 2021

On cross-ancestry cancer polygenic risk scores
Lars G Fritsche, Ying Ma, Daiwei Zhang, Maxwell Salvatore, Seunggeun Lee, Xiang Zhou, Bhramar Mukherjee
PLoS Genetics 2021

Cancer PRSweb: an online repository with polygenic risk scores for major cancer traits and their evaluation in two independent biobanks
Lars G Fritsche, Snehal Patil, Lauren J Beesley, Peter VandeHaar, Maxwell Salvatore, Ying Ma, Robert B Peng, Daniel Taliun, Xiang Zhou, Bhramar Mukherjee
The American Journal of Human Genetics 2021

Integrative differential expression and gene set enrichment analysis using summary statistics for scRNA-seq studies
Ying Ma, Shiquan Sun, Xuequn Shang, Evan T Keller, Mengjie Chen, Xiang Zhou
Nature Communications 2020

Accuracy, robustness and scalability of dimensionality reduction methods for single-cell RNA-seq analysis
Shiquan Sun, Jiaqiang Zhu, Ying Ma, Xiang Zhou
Genome Biology 2019