Tutorial:
GLM with NumXL
In this tutorial, we will use sample data gathered during a clinical trial of a new chemical/pesticide on tobacco budworms. The subjects (i.e. budworms) are grouped into batches of 20 and exposed to different doses of the chemical. The results are summarized below:

Batch        1       2       3       4       5       6       7       8       9      10      11      12
Dose         1       2       4       8      16      32       1       2       4       8      16      32
Gender       0       0       0       0       0       0       1       1       1       1       1       1
Death        1       4       9      13      18      20       0       2       6      10      12      16
Data preparation
Our objective here is to model (and forecast) the effectiveness of the new chemical at different dosages, and to explain, to some extent, any variation based on the gender of the budworm. Furthermore, we want to express the results in terms of the worm mortality rates (i.e. probability).

Batch        1       2       3       4       5       6       7       8       9      10      11      12
Dose         1       2       4       8      16      32       1       2       4       8      16      32
Gender       0       0       0       0       0       0       1       1       1       1       1       1
Death        1       4       9      13      18      20       0       2       6      10      12      16
Rate      5.0%   20.0%   45.0%   65.0%   90.0%  100.0%    0.0%   10.0%   30.0%   50.0%   60.0%   80.0%

Spider Financial Corp, 2012
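The Rate row is simply the number of deaths divided by the batch size of 20. A minimal Python sketch of that arithmetic, using the figures from the table (the variable names are illustrative and not part of NumXL):

```python
# Deaths per batch, copied from the table; every batch holds 20 budworms.
BATCH_SIZE = 20
deaths = [1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16]

# Mortality rate = deaths / batch size.
rates = [d / BATCH_SIZE for d in deaths]

for batch, rate in enumerate(rates, start=1):
    print(f"Batch {batch:2d}: {rate:.1%}")
```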
Tutorial GLM (Binomial Distribution)
We plot the data as two separate curves: males and females. It is apparent that the mortality rate is affected by two factors: gender and dosage.

We will make two assumptions:
(1) The results for each trial (i.e. batch) are drawn from a binomially distributed population; we would like to estimate p, the probability of success (i.e. worm death). The probability (p) is allowed to vary across different trials (batches).
(2) The probability of success is affected by two factors: the gender of the subject and the administered dosage of the drug.

Based on these two assumptions, we would model this relationship:
$$P = f(X, Y) = E[\,p \mid X, Y\,]$$
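To make the relationship concrete, here is one way it could be written out for a logit link, which is the link chosen later in this tutorial. This is a sketch under assumptions: X and Y are taken to denote the dose and the gender code, D_i is the death count of batch i, and the coefficient symbols \beta_0, \beta_1, \beta_2 are introduced here for illustration only. The tutorial does not state whether the dose enters the model directly or through a transform.

```latex
\begin{aligned}
D_i &\sim \mathrm{Binomial}(n, p_i), \qquad n = 20 \\
\eta_i &= \beta_0 + \beta_1 X_i + \beta_2 Y_i \\
p_i &= E[\,p \mid X_i, Y_i\,] = \frac{1}{1 + e^{-\eta_i}}
\end{aligned}
```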
Modeling
We are now ready to propose a statistical model: a generalized linear model with residuals following the Binomial distribution.

Using the NumXL toolbar, click on the GLM icon.
The GLM wizard will pop up. Initially, all the controls are disabled until we specify a valid range for the response and explanatory variables. The number of rows in the two cell ranges must match.
For now, we choose Logit as our link (transform) function, specify the trial or batch size (20), and instruct the Wizard to calibrate (i.e. compute optimal values for the coefficients). Leave the Goodness of fit and Residual diagnosis options checked.
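For readers who want to see what calibrating a binomial model with a logit link involves outside Excel, the following is a minimal, standard-library Python sketch. It is not NumXL's procedure (the tutorial runs the calibration through the Excel Solver); it uses plain Newton-Raphson on the binomial log-likelihood instead. It also assumes that the raw dose and the gender code are used directly as the explanatory variables, which the tutorial does not state. All function names are hypothetical.

```python
import math

# Data from the tutorial's table: 12 batches of 20 budworms each.
BATCH_SIZE = 20
dose = [1, 2, 4, 8, 16, 32, 1, 2, 4, 8, 16, 32]
gender = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
deaths = [1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16]


def solve(a, b):
    """Solve a small linear system a*x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logit_glm(rows, successes, trials, iterations=25, tol=1e-10):
    """Newton-Raphson for a binomial GLM with a logit link (assumed method)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, y in zip(rows, successes):
            p = inv_logit(sum(b * xi for b, xi in zip(beta, x)))
            w = trials * p * (1.0 - p)  # binomial variance of the count
            for i in range(k):
                grad[i] += x[i] * (y - trials * p)
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Design matrix columns: intercept, dose, gender (an assumed specification).
design = [[1.0, float(d), float(g)] for d, g in zip(dose, gender)]
beta = fit_logit_glm(design, deaths, BATCH_SIZE)
fitted = [inv_logit(sum(b * xi for b, xi in zip(beta, x))) for x in design]
print("coefficients (intercept, dose, gender):", [round(b, 4) for b in beta])
```

The fitted values are mortality probabilities per batch; multiplying them by 20 gives expected death counts that can be compared with the Death row of the table.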
Calibration
In this case, the GLM Wizard has already calibrated the model's coefficients, so we can skip this step.

But, in the event we wish to experiment with different link functions (LOGIT, PROBIT or LOGLOG), we need to recalibrate the model. To do so, we can either:
(1) Create a new model with the wizard, or,
(2) Change the Lvk parameter in an existing model table, and run the calibration using the NumXL toolbar.
Step 1: Select the cell that acts as a header for the model table
Step 2: Click on the calibration icon/menu (Excel 2003)
Step 3: Click on the Solve button in the Solver window
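The three link functions named above differ only in how they map the linear predictor to a probability. The Python sketch below shows the inverse of each link using common textbook definitions. These are assumptions rather than NumXL's documented formulas: in particular, the log-log link is taken here as g(p) = -log(-log(p)), and some software uses the complementary form log(-log(1 - p)) instead.

```python
import math


def inv_logit(eta):
    """Inverse logit: p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Inverse probit: p = standard normal CDF evaluated at eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_loglog(eta):
    """Inverse log-log, assuming g(p) = -log(-log(p)): p = exp(-exp(-eta))."""
    return math.exp(-math.exp(-eta))


# Compare the three mappings at a few values of the linear predictor.
for eta in (-2.0, 0.0, 2.0):
    print(eta, round(inv_logit(eta), 4), round(inv_probit(eta), 4), round(inv_loglog(eta), 4))
```

Logit and probit are symmetric around p = 0.5, while the log-log form is not, which is one reason a different link can change the calibrated coefficients and the fit.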
Forecast
Once the model is calibrated, and we are happy with the residuals, we can use it to construct our forecast mean (and a confidence interval around it).

Using the NumXL function GLM_FORE, we can compute the mean. Using GLM_FORECI, we can compute the upper and lower limits of the confidence interval.
We plot the data again (actual) versus the model values.
The dots represent the sample data, while the center line is the forecast mean. The shaded regions in the graphs are the 95% confidence intervals.

Notes:
1. The forecast error decreases as we increase the dosage (the C.I. gets tighter). This is evident in both the male and female batches.
2. The logarithmic relation detected when we plotted the raw data may be merely a data anomaly; the GLM shows more of a quadratic type of relationship.
3. The mean is not exactly at the center of the confidence interval due to the discrete nature of the underlying binomial distribution and the small batch/trial size.
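Note 3 can be illustrated with a small Python sketch that builds a 95% interval from binomial quantiles for a batch of 20. This is one plausible construction that produces an off-center interval; it is not claimed to be how GLM_FORECI computes its limits. The probability p = 0.9 is a hypothetical fitted value chosen for illustration, and the helper names are made up for this example.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def binomial_quantile(q, n, p):
    """Smallest count k such that P(X <= k) >= q."""
    for k in range(n + 1):
        if binomial_cdf(k, n, p) >= q:
            return k
    return n


n, p = 20, 0.9  # batch size from the tutorial; p is a hypothetical fitted value
mean = n * p
lower = binomial_quantile(0.025, n, p)
upper = binomial_quantile(0.975, n, p)
print(f"mean = {mean:.1f}, 95% interval = [{lower}, {upper}]")
```

With these inputs the mean is 18 deaths while the interval runs from 15 to 20, so the mean sits closer to the upper limit than to the lower one. The count cannot exceed 20 and can only take integer values, which is what makes the interval asymmetric for small batches.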