Yale Only

Biostatistics Seminar

“Meta-clustering of Gene Expression Data”

ABSTRACT

Traditional meta-analyses pool effect sizes across studies to improve statistical power. Likewise, there is growing interest in joint clustering across datasets, both to identify disease subtypes from bulk gene expression data and to discover cell types from single-cell RNA-sequencing (scRNA-seq) data. Unfortunately, because technical batch effects are prevalent, directly clustering samples from multiple gene expression datasets can lead to wrong results. Consequently, the integration of multiple gene expression datasets has been a very active research area over the past several years. However, there has been little discussion of when multiple gene expression datasets can be integrated for joint clustering. Obviously, if different subtypes are assayed in distinct batches, then meta-clustering is impossible no matter which machine learning or statistical methods are used.

In this talk, I will present our Batch-effects-correction-with-Unknown-Subtypes (BUS) framework. BUS explicitly adjusts for batch effects, groups samples that share similar characteristics into subtypes, and identifies genes that distinguish subtypes, all with linear-order computational complexity. The BUS framework can be adapted to perform meta-clustering for bulk gene expression data, for scRNA-seq data collected from a single biological condition, and for scRNA-seq data collected from multiple biological conditions. The identifiability proofs for the corresponding models provide insight into when multiple gene expression datasets can be integrated for meta-clustering, as well as guidelines for experimental design. Simulation studies and real data analyses show the advantages of our proposed models over state-of-the-art methods, especially when performing differential inference for scRNA-seq data collected from multiple conditions.

Speaker

  • The Chinese University of Hong Kong

    Yingying Wei
    Associate Professor

Admission

Free

Tag

Lectures and Seminars
Wednesday, Apr 30, 2025