Shuangge Steven Ma, PhD
Department Chair and Professor of Biostatistics
Education
University of Wisconsin (2004)
University of California at Los Angeles (2000)
Training
University of Washington (2006)
About
Titles
Department Chair and Professor of Biostatistics
Biography
Dr. Ma received his Ph.D. in statistics from the University of Wisconsin in 2004. Before arriving at Yale, he was a Senior Fellow in the Collaborative Health Studies Coordinating Center (CHSCC) and the Department of Biostatistics at the University of Washington. He develops novel statistical and bioinformatics methodologies for the analysis of cancer (NHL, breast cancer, melanoma, lung cancer), mental disorders, and cardiovascular diseases. He is also involved in health economics research, with a special interest in health insurance in developing countries.
Appointments
- Biostatistics: Chair (Dual)
- Biostatistics: Professor (Primary)
Other Departments & Organizations
- Biostatistics
- Cancer Prevention and Control
- Center for Infection and Immunity
- Computational Biology and Biomedical Informatics
- Ma Lab
- SPORE in Lung Cancer
- SPORE in Skin Cancer
- Yale Cancer Center
- Yale Combined Program in the Biological and Biomedical Sciences (BBS)
- Yale Institute for Global Health
- Yale School of Public Health
- Yale Ventures
- Yale-BI Biomedical Data Science Fellowship
- YSPH Global Health Concentration
Education & Training
- Postdoctoral Associate
- University of Washington (2006)
- PhD
- University of Wisconsin (2004)
- MS
- University of California at Los Angeles (2000)
Research
Overview
Develop novel statistical methodologies for complex data;
Study epidemiology and pathogenesis of multiple cancers, including breast cancer, NHL, melanoma and lung cancer;
Conduct survey studies, investigating health insurance utilization and impact;
Provide statistical support to multiple biomedical studies.
ORCID
0000-0001-9001-4999
Publications
2026
Hierarchical structure-guided high-dimensional multi-view clustering
Jiang J, Fang K, Ma S, Zhang Q. Hierarchical structure-guided high-dimensional multi-view clustering. Journal of Multivariate Analysis 2026, 211: 105488. DOI: 10.1016/j.jmva.2025.105488.
Peer-Reviewed Original Research
Concepts: Alternating Direction Method; Hierarchical clustering structure; Multi-view clustering approaches; Alternating direction method of multipliers algorithm; Non-convex problem; Structure of data; Histopathological image data; Cluster structure; Data clusters; Multipliers algorithm; Data types; Clustering approach; Simulation results; Image data; Hierarchical structure; Structural information; Statistical properties; Information; Granularity; Algorithm; Data; Superiority; Alternative approach; Method
2025
Network-based hierarchical heterogeneity analysis and applications to cancer omics data
Wang R, Zhang S, Ma S. Network-based hierarchical heterogeneity analysis and applications to cancer omics data. Science China Mathematics 2025, 1-14. DOI: 10.1007/s11425-024-2435-2.
Peer-Reviewed Original Research
Joint identification of spatially variable genes via a network-assisted Bayesian regularization approach
Wu M, Li Y, Ma S, Wu M. Joint identification of spatially variable genes via a network-assisted Bayesian regularization approach. The Annals of Applied Statistics 2025, 19: 2705-2723. DOI: 10.1214/25-aoas2097.
Peer-Reviewed Original Research
Concepts: Spatially variable genes; Variable genes; Spatial transcriptomics data; Transcriptome data; Confounding variation; Multiple genes; Mechanistic function; Biological processes; Genes; Real data analyses; Cellular distribution; Zero-inflated negative binomial distribution; Regularization approach; Biological understanding; Joint identification; Count nature; Results of simulation studies; Negative binomial distribution; Zero inflation; MCMC algorithm; Bayesian regularization approach; Cellular composition; Spatial patterns; Binomial distribution; Simulation study
Conditional Graphical Models With A Hierarchical Sparse Estimation
Li R, Zhang Q, Ma S. Conditional Graphical Models With A Hierarchical Sparse Estimation. Statistics and Computing 2025, 36: 30. DOI: 10.1007/s11222-025-10783-8.
Peer-Reviewed Original Research
Ordinal Sparse Neural Networks for Modeling Gene‐ and Imaging‐Environment Interactions
Xue J, Xu Y, Li J, Ma S, Fang K. Ordinal Sparse Neural Networks for Modeling Gene‐ and Imaging‐Environment Interactions. Statistics in Medicine 2025, 44: e70302. PMID: 41105049, DOI: 10.1002/sim.70302.
Peer-Reviewed Original Research
Analysis of cross-platform health communication with a network approach
Fan X, Liu M, Ma S. Analysis of cross-platform health communication with a network approach. Biometrics 2025, 81: ujaf154. PMID: 41273214, DOI: 10.1093/biomtc/ujaf154.
Peer-Reviewed Original Research
Concepts: Online health communities; Word frequency vectors; Health communication; Online platforms; Communication model; Twitter topics; Twitter; Co-occurrence; Network approach; Frequency vector; Information sources; Medical information; Structure content; Communication; Early analysis; Word frequency; Network; Co-occurrence network; Communication analysis; Complex medical information; Numerical properties; Platform
GE-IA-NAM: gene–environment interaction analysis via imaging-assisted neural additive model
Li J, Xu Y, Ma S, Fang K. GE-IA-NAM: gene–environment interaction analysis via imaging-assisted neural additive model. Bioinformatics 2025, 41: btaf481. PMID: 40880282, PMCID: PMC12452269, DOI: 10.1093/bioinformatics/btaf481.
Peer-Reviewed Original Research
Concepts: Gene-environment; Neural additive models; Gene-environment model; Gene-environment analysis; Gene-environment interaction analysis; Environmental factors; Cancer Genome Atlas; Pathological images; Skin cancer dataset; Genome Atlas; Cancer datasets; Network architecture; Competitive performance; Genetic factors; Python code; Cancer outcomes; Interaction analysis; Data patterns; Cancer research; Additive model; Interaction method; Environmental data; Joint analysis; Cancer models; Regression-based
Network-based modeling of emotional expressions for multiple cancers via a linguistic analysis of an online health community
Fan X, Liu M, Ma S. Network-based modeling of emotional expressions for multiple cancers via a linguistic analysis of an online health community. The Annals of Applied Statistics 2025, 19: 2218-2236. PMID: 41104371, PMCID: PMC12525517, DOI: 10.1214/25-aoas2047.
Peer-Reviewed Original Research
Concepts: Online health communities; Network-based model; Model of emotional expressions; Low-rank matrix; American Cancer Society Cancer Survivors Network; Health community; Fear of judgment; Semantic network; Network analysis techniques; Computational properties; Survivors Network; Simulation results; Network; Adverse emotions; Linguistic analysis; Cluster structure; Single disease; Cancer patients; Penalization approach; Cancer types; Emotional expression; Cancer; Analysis techniques; Computer; Data analysis
Subgroup Testing in the Change‐Plane Cox Model
Zhang X, Ren P, Shi X, Ma S, Liu X. Subgroup Testing in the Change‐Plane Cox Model. Statistics in Medicine 2025, 44: e70179. PMID: 40662752, DOI: 10.1002/sim.70179.
Peer-Reviewed Original Research
Concepts: Finite sample performance; Analysis of survival data; Likelihood ratio test; Asymptotic distribution; Sample performance; Lung cancer data; Score test; Simulation study; Ratio test; Survival data; Cancer data; Cox model; Immune checkpoint blockade therapy; Checkpoint blockade therapy; Solid tumor patients; Tumor mutational burden; Subgroup tests; Treatment effects; Covariates; Blockade therapy; Mutational burden; Subgroups
Robust Transfer Learning for High‐Dimensional GLM Using γ‐Divergence With Applications to Cancer Genomics
Xu F, Ma S, Zhang Q, Xu Y. Robust Transfer Learning for High‐Dimensional GLM Using γ‐Divergence With Applications to Cancer Genomics. Statistics in Medicine 2025, 44: e70170. PMID: 40662636, PMCID: PMC12313224, DOI: 10.1002/sim.70170.
Peer-Reviewed Original Research
Concepts: Transfer learning; Real world biomedical data; Risk of negative transfer; Proximal gradient descent; Transfer learning method; Transfer learning approach; High-dimensional data; High-dimensional settings; Gradient descent; Competitive performance; Learning methods; Estimation error bounds; Biomedical data; Efficient algorithm; Learning approach; Detection scheme; Negative transfer; Analysis of complex diseases; Debiasing step; Method's effectiveness; Cancer genomic data; Data contamination; Error bounds; High-dimensional profiling data; Outliers
Clinical Trials
Current Trials
Molecular Markers of UV Exposure and Cancer Risk in Skin
HIC ID: 2000024848
Role: Sub Investigator
Primary Completion Date: 03/31/2024
Recruiting Participants
News
- July 23, 2025
Digging into data science at the 38th Annual New England Statistical Symposium
- May 06, 2025
Genetic Test Underused in Cancer Care
- January 07, 2025
Leadership Appointments Underscore Yale Biostatistics’ Global Strength in Research and Innovation
- October 24, 2024
New Analytics Center for Cardiovascular Medicine
Get In Touch
Contacts
Locations
Academic Office
300 George Street, Ste 501
New Haven, CT 06511