Featured Publications
Exploiting Gene-Environment Independence for Analysis of Case–Control Studies: An Empirical Bayes-Type Shrinkage Estimator to Trade-Off Between Bias and Efficiency
Mukherjee B, Chatterjee N. Exploiting Gene-Environment Independence for Analysis of Case–Control Studies: An Empirical Bayes-Type Shrinkage Estimator to Trade-Off Between Bias and Efficiency. Biometrics 2007, 64: 685-694. PMID: 18162111, DOI: 10.1111/j.1541-0420.2007.00953.x.
Peer-Reviewed Original Research
Concepts: Gene-environment independence; Shrinkage estimators; Log odds ratio parameters; Case-control data; Gene-environment independence assumption; Odds ratio parameters; Case-control estimators; Data-adaptive fashion; Data example; Prospective logistic regression analysis; Binary exposure; Gene-environment associations; Independence assumption; Logistic regression analysis; Case-only; Maximum likelihood framework; Estimation; Sample size; Binary genes; Regression analysis; Chatterjee; Examples; Weighted average; Assumptions
2021
Maternal lipidomic signatures in relation to spontaneous preterm birth and large-for-gestational age neonates
Aung M, Ashrap P, Watkins D, Mukherjee B, Rosario Z, Vélez-Vega C, Alshawabkeh A, Cordero J, Meeker J. Maternal lipidomic signatures in relation to spontaneous preterm birth and large-for-gestational age neonates. Scientific Reports 2021, 11: 8115. PMID: 33854141, PMCID: PMC8046995, DOI: 10.1038/s41598-021-87472-9.
Peer-Reviewed Original Research
Concepts: Spontaneous preterm birth; Biomarkers of pregnancy outcomes; Gestational age neonates; Preterm birth; Age neonates; Pregnancy outcomes; Degree of hydrocarbon chain saturation; Increased risk; Neonatal anthropometric parameters; Associated with increased risk; Plasmenyl phosphatidylethanolamine; Maternal lipidome; Weeks gestation; Gestational age; Lipidomic signature; Anthropometric parameters; Liquid chromatography tandem mass spectrometry; Lipidomic profiles; Lipid metabolites; Hydrocarbon chain saturation; Plasma samples; Birth; Logistic regression; High-performance liquid chromatography tandem mass spectrometry; Neonates
2020
A Fast and Accurate Method for Genome-Wide Time-to-Event Data Analysis and Its Application to UK Biobank
Bi W, Fritsche L, Mukherjee B, Kim S, Lee S. A Fast and Accurate Method for Genome-Wide Time-to-Event Data Analysis and Its Application to UK Biobank. American Journal Of Human Genetics 2020, 107: 222-233. PMID: 32589924, PMCID: PMC7413891, DOI: 10.1016/j.ajhg.2020.06.003.
Peer-Reviewed Original Research
Concepts: Controlled type I error rates; Time-to-event data analysis; Type I error rate; Genetic studies of human diseases; Genome-wide significance level; Time-to-event phenotypes; Saddlepoint approximation; Genome-wide analysis; European ancestry samples; Minor allele frequency; Study of human disease; Electronic health records; Cox PH regression model; Regression models; Standard Wald test; Proportional hazards; Binary phenotypes; Data analysis; Ancestry samples; Genetic studies; Health records; UK Biobank; Allele frequencies; Inpatient data; Cox proportional hazards
2019
A Fast and Accurate Method for Genome-wide Scale Phenome-wide G × E Analysis and Its Application to UK Biobank
Bi W, Zhao Z, Dey R, Fritsche L, Mukherjee B, Lee S. A Fast and Accurate Method for Genome-wide Scale Phenome-wide G × E Analysis and Its Application to UK Biobank. American Journal Of Human Genetics 2019, 105: 1182-1192. PMID: 31735295, PMCID: PMC6904814, DOI: 10.1016/j.ajhg.2019.10.008.
Peer-Reviewed Original Research
Concepts: Case-control ratio; Genome-wide significance level; Measures of environmental exposure; Genome-wide analysis; European ancestry samples; Genetic association studies; Saddlepoint approximation; Case-control imbalance; Analysis of phenotypes; Gene-environment interactions; Population-based biobanks; Controlled type I error rates; Association studies; G x E effects; UK Biobank; Type I error rate; Genetic variants; E analysis; SPAGE; Complex diseases; Environmental exposures; Test statistics; E study; Simulation study; Wald test
Associations between childhood maltreatment latent classes and eating disorder symptoms in a nationally representative sample of young adults in the United States
Hazzard V, Bauer K, Mukherjee B, Miller A, Sonneville K. Associations between childhood maltreatment latent classes and eating disorder symptoms in a nationally representative sample of young adults in the United States. Child Abuse & Neglect 2019, 98: 104171. PMID: 31546098, PMCID: PMC6885127, DOI: 10.1016/j.chiabu.2019.104171.
Peer-Reviewed Original Research
Concepts: Eating disorder symptoms; Fasting/skipping meals; Disorder symptoms; Eating-related concerns; Childhood maltreatment; Physical abuse; Physical neglect; Associated with eating disorder symptoms; Latent classes; Sexual abuse; Associated with eating disorders; Childhood physical neglect; Young adults; Nationally representative sample; Multi-type maltreatment; Maltreatment profiles; Logistic regression models; Eating disorders; Latent class analysis; Compensatory behaviors; U.S. young adults; High risk; Study of Adolescent; Longitudinal Study of Adolescent; Representative sample
2017
Meta‐analysis of gene‐environment interaction exploiting gene‐environment independence across multiple case‐control studies
Estes J, Rice J, Li S, Stringham H, Boehnke M, Mukherjee B. Meta‐analysis of gene‐environment interaction exploiting gene‐environment independence across multiple case‐control studies. Statistics In Medicine 2017, 36: 3895-3909. PMID: 28744888, PMCID: PMC5624850, DOI: 10.1002/sim.7398.
Peer-Reviewed Original Research
MeSH Keywords: Age Factors; Alpha-Ketoglutarate-Dependent Dioxygenase FTO; Bayes Theorem; Bias; Biometry; Body Mass Index; Case-Control Studies; Computer Simulation; Diabetes Mellitus, Type 2; Gene-Environment Interaction; Humans; Logistic Models; Meta-Analysis as Topic; Models, Genetic; Models, Statistical; Polymorphism, Single Nucleotide; Retrospective Studies
Concepts: Gene-environment independence; Gene-environment; Empirical Bayes estimators; Gene-environment interactions; Case-control study; Meta-analysis setting; Bayes estimators; Retrospective likelihood framework; Shrinkage estimators; Meta-analysis; Testing gene-environment interactions; Combination of estimates; Factors body mass index; Simulation study; Body mass index; Unconstrained model; Likelihood framework; Inverse variance; Meta-analysis framework; FTO gene; Mass index; Genetic markers; Estimation; Standard alternative; Chatterjee
Update on the State of the Science for Analytical Methods for Gene-Environment Interactions
Gauderman W, Mukherjee B, Aschard H, Hsu L, Lewinger J, Patel C, Witte J, Amos C, Tai C, Conti D, Torgerson D, Lee S, Chatterjee N. Update on the State of the Science for Analytical Methods for Gene-Environment Interactions. American Journal Of Epidemiology 2017, 186: 762-770. PMID: 28978192, PMCID: PMC5859988, DOI: 10.1093/aje/kwx228.
Peer-Reviewed Original Research
Concepts: Genome-wide association studies; G x E; Gene-environment interactions; Association studies; Analysis of gene-environment interactions; Quantitative trait studies; Complex traits; Genetic data; Gene sets; Trait studies; Gene-environment; Case-control; Environmental data; Consortium setting; Formation of consortia; Genes; Consortium; Analytical challenges; Traits; Sets; Study; Interaction; Statistical approach; Data
Perceptions of measles, pneumonia, and meningitis vaccines among caregivers in Shanghai, China, and the health belief model: a cross-sectional study
Wagner A, Boulton M, Sun X, Mukherjee B, Huang Z, Harmsen I, Ren J, Zikmund-Fisher B. Perceptions of measles, pneumonia, and meningitis vaccines among caregivers in Shanghai, China, and the health belief model: a cross-sectional study. BMC Pediatrics 2017, 17: 143. PMID: 28606106, PMCID: PMC5468991, DOI: 10.1186/s12887-017-0900-2.
Peer-Reviewed Original Research
Concepts: Pneumococcal vaccine uptake; Health Belief Model; Belief Model; Health Belief Model constructs; Vaccine uptake; Models of health behavior; Vaccine necessity; Health care workers; Cross-sectional study; Logistic regression models; Chinese caregivers; Caregiver perceptions; Health behaviors; Caregivers; Care workers; Years of age; Pneumococcal vaccine; Written survey; BackgroundIn China; Health; Perceived safety; Regression models; Young children; Children; Measles vaccine
2016
Microsatellite Alterations With Allelic Loss at 9p24.2 Signify Less-Aggressive Colorectal Cancer Metastasis
Koi M, Garcia M, Choi C, Kim H, Koike J, Hemmi H, Nagasaka T, Okugawa Y, Toiyama Y, Kitajima T, Imaoka H, Kusunoki M, Chen Y, Mukherjee B, Boland C, Carethers J. Microsatellite Alterations With Allelic Loss at 9p24.2 Signify Less-Aggressive Colorectal Cancer Metastasis. Gastroenterology 2016, 150: 944-955. PMID: 26752111, PMCID: PMC4808397, DOI: 10.1053/j.gastro.2015.12.032.
Peer-Reviewed Original Research
MeSH Keywords: Biomarkers, Tumor; Chi-Square Distribution; Chromosome Aberrations; Chromosomes, Human, Pair 9; Colorectal Neoplasms; Disease Progression; Disease-Free Survival; Female; Genetic Predisposition to Disease; Humans; Japan; Kaplan-Meier Estimate; Liver Neoplasms; Logistic Models; Loss of Heterozygosity; Male; Microsatellite Repeats; Middle Aged; Neoplasm Recurrence, Local; Neoplasm Staging; Odds Ratio; Phenotype; Proportional Hazards Models; Proto-Oncogene Proteins B-raf; Proto-Oncogene Proteins p21(ras); Republic of Korea; Risk Factors; Time Factors; Treatment Outcome
Concepts: Primary colorectal tumors; Loss of heterozygosity; Liver metastases; Colorectal cancer; Colorectal tumors; Elevated microsatellite alterations; Microsatellite alterations; Stage II; Curative treatment of patients; Stage III colorectal cancer; Overall survival of patients; Survival of patients; III colorectal cancer; Tumor to liver; Colorectal cancer recurrence; Treatment of patients; Matched liver metastases; Cancer cell nuclei; Matched metastases; Disease recurrence; Overall survival; Prognostic factors; Allelic loss; No significant difference; Curative treatment
2013
Propensity score‐based diagnostics for categorical response regression models
Boonstra P, Bondarenko I, Park S, Vokonas P, Mukherjee B. Propensity score‐based diagnostics for categorical response regression models. Statistics In Medicine 2013, 33: 455-469. PMID: 23934948, PMCID: PMC3911784, DOI: 10.1002/sim.5940.
Peer-Reviewed Original Research
Concepts: Retrospective sampling designs; Chi-square distribution; Categorical response models; Goodness-of-fit statistics; Predicted response probabilities; Response regression models; Conditional distribution; Proportional odds model; Assess model adequacy; Data examples; Simulation study; VA Normative Aging Study; Normative Aging Study; Propensity score; Cumulative lead exposure; Odds model; Model diagnostics; Case-control study; Associated with diabetes; Balance scores; Response probability; Model adequacy; Cohort study; Aging Study; Numerical summaries
Environmental Confounding in Gene-Environment Interaction Studies
Vanderweele T, Ko Y, Mukherjee B. Environmental Confounding in Gene-Environment Interaction Studies. American Journal Of Epidemiology 2013, 178: 144-152. PMID: 23821317, PMCID: PMC3698991, DOI: 10.1093/aje/kws439.
Peer-Reviewed Original Research
Concepts: Gene-environment independence; Gene-environment interaction studies; Gene-environment interactions; Environmental confounders; Genetic factors; Joint test; Gene-environment; Genetic effects; Environmental factors; Confounding variables; Confounding; Interaction studies; Simulation study; Joint null; Sample size; Bias estimates; Factors; Independence; Study; Test
2012
Air pollution and respiratory symptoms among children with asthma: Vulnerability by corticosteroid use and residence area
Lewis T, Robins T, Mentz G, Zhang X, Mukherjee B, Lin X, Keeler G, Dvonch J, Yip F, O'Neill M, Parker E, Israel B, Max P, Reyes A, Committee C. Air pollution and respiratory symptoms among children with asthma: Vulnerability by corticosteroid use and residence area. The Science Of The Total Environment 2012, 448: 48-55. PMID: 23273373, PMCID: PMC4327853, DOI: 10.1016/j.scitotenv.2012.11.070.
Peer-Reviewed Original Research
Concepts: Particulate matter; Air pollution; Associated with negative health impacts; Ambient pollutant concentrations; Air quality standards; Ambient air pollution; Ambient particulate matter; Effective risk reduction interventions; Odds of respiratory symptoms; Asthmatic children; Factors associated with heterogeneity; Associated with increased odds; Observed health effects; Respiratory symptoms; Risk reduction interventions; Population of asthmatic children; Respiratory symptom diaries; Outdoor PM; Daily concentrations; Pollutant concentrations; Monitoring sites; Generalized estimating equations; Pollution model; Logistic regression models; Aerodynamic diameter
Point source modeling of matched case–control data with multiple disease subtypes
Li S, Mukherjee B, Batterman S. Point source modeling of matched case–control data with multiple disease subtypes. Statistics In Medicine 2012, 31: 3617-3637. PMID: 22826092, PMCID: PMC4331356, DOI: 10.1002/sim.5388.
Peer-Reviewed Original Research
Concepts: Adjacent-category logit model; Markov chain Monte Carlo techniques; Evaluate maximum likelihood; Extensive simulation study; Profile likelihood; Hierarchical Bayesian approach; Case-control data; Simulation study; Bayesian approach; Monte Carlo technique; Bayesian methods; Maximum likelihood; Multiple disease subtypes; Categorical outcomes; Covariate adjustment; Nonlinear model; Estimation stability; Medicaid claims data; Case-control design; Pediatric asthma population; Asthma population; Elevated odds; Markov; Logit model; Covariates
Likelihood‐based methods for regression analysis with binary exposure status assessed by pooling
Lyles R, Tang L, Lin J, Zhang Z, Mukherjee B. Likelihood‐based methods for regression analysis with binary exposure status assessed by pooling. Statistics In Medicine 2012, 31: 2485-2497. PMID: 22415630, PMCID: PMC3528351, DOI: 10.1002/sim.4426.
Peer-Reviewed Original Research
Concepts: Population-based case-control study of colorectal cancer; Case-control study of colorectal cancer; Population-based case-control study; Study of colorectal cancer; Exposure status; Binary outcomes; Regression models; Case-control sample; Logistic regression models; Gene-disease associations; Observed binary outcome; Study design; Epidemiological studies; Colorectal cancer; Assess exposure; Maximum likelihood analysis; Regression analysis; Likelihood-based methods; Exposure assessment; Maximum likelihood approach; Likelihood approach; Cross-section; Simulation study; Outcomes; Likelihood analysis
2011
Logistic regression analysis of biomarker data subject to pooling and dichotomization
Zhang Z, Liu A, Lyles R, Mukherjee B. Logistic regression analysis of biomarker data subject to pooling and dichotomization. Statistics In Medicine 2011, 31: 2473-2484. PMID: 21953741, DOI: 10.1002/sim.4367.
Peer-Reviewed Original Research
Concepts: Population-based case-control study of colorectal cancer; Case-control study of colorectal cancer; Prospective logistic regression model; Population-based case-control study; Study of colorectal cancer; Epidemiological studies; Logistic regression models; Analysis of epidemiological data; Logistic regression analysis; Binary exposure; Pooled measure; Colorectal cancer; Regression models; Epidemiological data; Regression analysis; Analysis of biomarker data; Disease status; Exposed subjects; Biomarker data; Choice of design; Subjects; Estimated parameters; Status; Recommendations
Asthma exacerbation and proximity of residence to major roads: a population-based matched case-control study among the pediatric Medicaid population in Detroit, Michigan
Li S, Batterman S, Wasilevich E, Elasaad H, Wahl R, Mukherjee B. Asthma exacerbation and proximity of residence to major roads: a population-based matched case-control study among the pediatric Medicaid population in Detroit, Michigan. Environmental Health 2011, 10: 34. PMID: 21513554, PMCID: PMC3224543, DOI: 10.1186/1476-069x-10-34.
Peer-Reviewed Original Research
Concepts: Conditional logistic regression; Population-based matched case-control study; Pediatric Medicaid population; Case-control study; Medicaid population; Proximity of residence; Logistic regression; Conditional logistic regression models; Elevated risk of asthma; Risk of asthma; Emergency department visits; Case-control analysis; Logistic regression models; High-risk population; Traffic-related pollutants; Asthma outcomes; Department visits; Asthma events; Asthma claim; Asthma-related events; Asthma cases; Ecological bias; Odds ratio; Investigate associations; MethodThis study
2008
Inference of the Haplotype Effect in a Matched Case-Control Study Using Unphased Genotype Data
Sinha S, Gruber S, Mukherjee B, Rennert G. Inference of the Haplotype Effect in a Matched Case-Control Study Using Unphased Genotype Data. The International Journal Of Biostatistics 2008, 4: article 6. PMID: 20231916, PMCID: PMC2835450, DOI: 10.2202/1557-4679.1079.
Peer-Reviewed Original Research
Concepts: Case-control study; Unphased genotype data; Hardy-Weinberg equilibrium; Locus-specific genotype data; Genotype data; Beta-Carotene Cancer Prevention Study; Cancer Prevention Study; Case-control study design; Study of breast cancer patients; Matched case-control study; Case-control design; Phasing of haplotypes; Disease risk models; Breast cancer patients; Prevention Study; Haplotype effects; Study design; Gametic phase; Polymorphic loci; Haplotype frequencies; Cancer patients; Loci; Conditional likelihood approach; Association; Haplotypes
Variability in prescription drug expenditures explained by adjusted clinical groups (ACG) case-mix: A cross-sectional study of patient electronic records in primary care
Aguado A, Guinó E, Mukherjee B, Sicras A, Serrat J, Acedo M, Ferro J, Moreno V. Variability in prescription drug expenditures explained by adjusted clinical groups (ACG) case-mix: A cross-sectional study of patient electronic records in primary care. BMC Health Services Research 2008, 8: 53. PMID: 18318912, PMCID: PMC2292169, DOI: 10.1186/1472-6963-8-53.
Peer-Reviewed Original Research
Concepts: Prescription quality index; Patient electronic records; Case-mix; Physician-centered; Electronic records; Pharmacy costs; Primary care centers; Episode of care; Case-mix adjustment; Quality of prescriptions; Cost of prescriptions; Cross-sectional study; Quality Index; Increased prescription costs; Two-part model; Drug expenditures; Family physicians; Profile physicians; Linear mixed models; Prescription costs; Proportion of variance; Prescription drug expenditures; Physician expenditures; Logistic regression; Multilevel structure of data
2007
Analysis of matched case–control data with multiple ordered disease states: possible choices and comparisons
Mukherjee B, Liu I, Sinha S. Analysis of matched case–control data with multiple ordered disease states: possible choices and comparisons. Statistics In Medicine 2007, 26: 3240-3257. PMID: 17206600, DOI: 10.1002/sim.2790.
Peer-Reviewed Original Research
Concepts: Conditional logistic regression; Stratum-specific nuisance parameters; Case-control data; Adjacent-category logit model; Case-control study; Ordered categorical data; Conditional-likelihood approach; Likelihood-based approach; Nuisance parameters; Proportional-odds model; Cumulative logits; Simulation study; Analyse such data; Mantel-Haenszel approach; Cumulative logit model; Natural order; Potential risk factors; Stages of cancer; Reference category; Categorical data; Logistic regression; Ordinal nature; Effect of potential risk factors; Low birthweight; Risk factors
2004
Bayesian Semiparametric Modeling for Matched Case–Control Studies with Multiple Disease States
Sinha S, Mukherjee B, Ghosh M. Bayesian Semiparametric Modeling for Matched Case–Control Studies with Multiple Disease States. Biometrics 2004, 60: 41-49. PMID: 15032772, DOI: 10.1111/j.0006-341x.2004.00169.x.
Peer-Reviewed Original Research
Concepts: Semiparametric Bayesian framework; Bayesian semiparametric model; Semiparametric model; Dirichlet process; Stratum effects; Conditional likelihood; Probability of disease development; Bayesian approach; Numerical integration scheme; Bayesian framework; Sample size; Dirichlet; Actual estimation; MLE; Missingness; Markov; Integration scheme; Exposure distribution; Bayesian; Estimation; Regression models; Multiple disease states; Distribution; Probability; Disease states