Research & Publications
My current research focuses on two areas: (1) developing novel statistical and computational models to analyze large-scale omics and drug perturbation data, in order to better understand disease pathogenesis and advance precision medicine, and (2) understanding the heterogeneity and pathogenesis of pulmonary diseases, such as asthma, idiopathic pulmonary fibrosis (IPF), sarcoidosis, and pediatric cystic fibrosis, by tailoring statistical and computational methods to existing biological knowledge of each disease.
My team has been involved in multiple transcriptomic studies of asthma, IPF, sarcoidosis, cystic fibrosis, and lung injury in pediatric patients undergoing cardiopulmonary bypass procedures. These studies generated various types of large-scale data, including microarray gene expression data, bulk RNA sequencing data, single-cell RNA sequencing data, T cell receptor repertoire data, and 16S rRNA sequencing data. For each study, we tailored our computational and statistical analysis to existing biological knowledge of the corresponding disease or condition. These analyses have yielded discoveries about the heterogeneity of asthma pathogenesis, cell-type-specific changes in asthma patients, the heterogeneity and molecular biomarkers of sarcoidosis, rare cell populations specific to IPF and chronic obstructive pulmonary disease (COPD), and potential antigen-specific T cell clones for SARS-CoV-2 infection (COVID-19) in adults, among others. My team is currently working with physicians and basic scientists to make further, more translational discoveries for these and other pulmonary diseases.
Through extensive analyses of the various types of omics data generated by our collaborators, my team also identifies unmet computational and statistical needs and develops novel methods to address them. The tools we have developed cover imputation of single-cell RNA sequencing (scRNA-seq) data (G2S3), identification of differentially expressed genes from scRNA-seq data with multiple subjects (iDESC), and cell type deconvolution of spatial transcriptomic data (SDePER), among others. The development of these tools has further strengthened our capacity to analyze different types of omics data to better understand disease pathogenesis.
Extensive Research Description
Analysis of one time-point microarray and longitudinal bulk RNA sequencing data from asthma patients;
Analysis of longitudinal microbiome sequencing data from children with cystic fibrosis;
Bulk RNA sequencing of IPF, alpha-1 antitrypsin deficiency (A1AT) and sarcoidosis (SARC) patients using Ion Torrent technology;
Single cell RNA sequencing data analysis;
Spatial single cell RNA sequencing data analysis.
Genetics; Lung Diseases; Respiratory Hypersensitivity; Computational Biology; Genomics; Biostatistics; Molecular Medicine
Public Health Interests
Bioinformatics; Biomarkers; Genetics, Genomics, Epigenetics; Microbial Ecology; Modeling