Statistics Assessment Task
Question 1. Use the surveynew dataset to conduct the following regression analysis of the possible association of various lifestyle variables with being overweight.
(a) Use Proc Means to obtain descriptive data for AGE, SEX, ALCGRAMS, CIGSDAY and EXERCISE. Use Proc Logistic with a backward variable elimination strategy (main effects and interactions including sex only; no squared terms) to identify which of these variables are significantly (at the 0.05 level) and independently associated with being overweight.
(b) For each variable in your final model from (a), describe its relationship with being overweight by calculating and interpreting appropriate estimated odds ratios.
(c) Using your final model from (a), calculate the estimated probability that each individual is overweight and provide histograms of the estimated probabilities for individuals who are actually overweight and individuals who are not.
What proportion would be correctly predicted if you ignored the model and simply predicted everyone to be overweight?
What proportion would be correctly predicted if you ignored the model and simply predicted everyone to not be overweight?
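The two "ignore the model" rules above are simple baselines, and a small sketch may make the comparison concrete. The outcome values and estimated probabilities below are hypothetical (they are not taken from surveynew), and the 0.5 cut-off used to turn probabilities into predictions is an assumption rather than something the task specifies.

```python
def proportion_correct(actual, predicted):
    """Fraction of individuals whose predicted class equals the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


# Hypothetical outcomes: 1 = overweight, 0 = not overweight.
actual = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
# Hypothetical estimated probabilities, standing in for a fitted model's output.
est_prob = [0.71, 0.32, 0.55, 0.48, 0.20, 0.66, 0.41, 0.15, 0.58, 0.62]

# Baseline 1: predict everyone to be overweight.
acc_all_over = proportion_correct(actual, [1] * len(actual))
# Baseline 2: predict everyone to not be overweight.
acc_all_not = proportion_correct(actual, [0] * len(actual))
# Model-based rule (assumed cut-off of 0.5).
model_pred = [1 if p >= 0.5 else 0 for p in est_prob]
acc_model = proportion_correct(actual, model_pred)

print(acc_all_over, acc_all_not, acc_model)
```

The first baseline always equals the observed proportion overweight and the second equals one minus that proportion, so a model is only useful for classification if it beats the larger of the two.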
(d) Based on your results in (b) and (c), do you think your final model is good at identifying which individuals are overweight? Explain your answer.
(e) Perform a stepwise (backward) search for predictors of SOB from among the following list of potential predictors: sex, age, alcgrams, angina, asthma, bmi, bronch, chol, cigsday, dbp, diabetes, exercise, fev, fvc, hayfever, myocard, rxhyper, sbp, weight, yearsmok. In your search for predictors, consider main effects only, ignore squares of quantitative variables and interactions, and use the p = 0.05 criterion for dropping variables. Provide the output that shows the order in which terms were dropped and the output showing the fitted final model (i.e. its estimated coefficients).
Which of the categorical variables (i.e. sex, angina, asthma, bronch, diabetes, hayfever, myocard, rxhyper) in your final model from (a) has the largest effect on SOB?
Which of the quantitative variables (i.e. age, alcgrams, bmi, chol, cigsday, dbp, exercise, fev, fvc, sbp, weight, yearsmok) in your final model has the largest effect on SOB?
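Comparing effect sizes from a logistic model usually means converting coefficients to odds ratios, since OR = exp(coefficient). The sketch below uses hypothetical coefficients and standard deviations (none come from the dataset). For quantitative variables it rescales each coefficient to a one standard deviation change so that variables measured in different units can be compared; that rescaling is one common convention and an assumption here, not a method the task prescribes.

```python
import math


def odds_ratio(beta, units=1.0):
    """Odds ratio for a change of `units` in a predictor with log-odds coefficient beta."""
    return math.exp(beta * units)


# Hypothetical coefficients for 0/1 categorical predictors.
categorical = {"asthma": 0.90, "sex": -0.35}
# Hypothetical (coefficient, standard deviation) pairs for quantitative predictors.
quantitative = {"age": (0.02, 15.0), "fev": (-0.60, 0.8)}

# Strength is judged by distance from OR = 1 on the log scale, so that
# protective (OR < 1) and harmful (OR > 1) effects are comparable.
cat_strength = {name: abs(beta) for name, beta in categorical.items()}
quant_strength = {name: abs(beta * sd) for name, (beta, sd) in quantitative.items()}

largest_cat = max(cat_strength, key=cat_strength.get)
largest_quant = max(quant_strength, key=quant_strength.get)

for name, beta in categorical.items():
    print(name, "OR (1 vs 0):", round(odds_ratio(beta), 3))
for name, (beta, sd) in quantitative.items():
    print(name, "OR per SD:", round(odds_ratio(beta, sd), 3))
print("largest categorical:", largest_cat, "| largest quantitative:", largest_quant)
```

With raw coefficients alone, a variable measured in small units (such as fev in litres) can look far more influential than one measured in large units (such as age in years), which is why some common scale is needed before ranking.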
Question 2. The data set below arises from an age-stratified case-control study of the association between alcohol consumption and oesophageal cancer in a region of France. The age strata were 40 – 49, 50 – 59 and 60 – 69 years. All people in each age group newly diagnosed with oesophageal cancer were the cases. For each age group, the same number of controls (people without cancer) were randomly selected from the electoral register. The usual alcohol consumption over the last 5 years was obtained for each case and control and categorised as < 80 g/day or 80+ g/day.
(a) Create a SAS dataset for these data suitable for logistic regression analysis. Use Proc Logistic to fit a single conditional logistic regression model that performs a test of whether the odds ratios (that compare odds of cancer in 80 + vs < 80 alcohol groups) are significantly different across the age groups and produces estimates (and 95% CI) of the odds ratio for each age group. Highlight the location of the relevant test p-value and the odds ratios (and their 95% CI) in your output. State your conclusion about whether the odds ratios differ significantly across age groups.
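As a rough hand check on stratum-specific output, each age group's odds ratio can be approximated from its 2x2 table using the cross-product ratio with a Woolf (log-scale) 95% confidence interval. The counts below are hypothetical, since the study's table is not reproduced here, and this estimator is a stand-in: the conditional maximum likelihood estimates from a conditional logistic model will generally be close to, but not identical to, these values.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Cross-product odds ratio with a Woolf confidence interval.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls (all must be > 0).
    """
    or_hat = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper


# Hypothetical counts per age stratum: (a, b, c, d), with "exposed" = 80+ g/day.
# Cases and controls are equal in number within each stratum, as in the design.
strata = {
    "40-49": (10, 20, 5, 25),
    "50-59": (20, 20, 8, 32),
    "60-69": (15, 30, 6, 39),
}

results = {age: odds_ratio_ci(*counts) for age, counts in strata.items()}
for age, (or_hat, lower, upper) in results.items():
    print(f"{age}: OR = {or_hat:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

Widely overlapping intervals across strata would be informally consistent with a non-significant test of heterogeneity, although the formal conclusion should rest on the interaction test in the fitted model.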
(b) Use Proc Logistic to fit a single conditional logistic regression model that produces the age-adjusted estimate (and 95% CI) of the odds ratio that compares odds of cancer in 80 + vs < 80 alcohol groups and performs a test of whether the age-adjusted odds ratio estimate is significantly different from one. Highlight the location of the odds ratio, its 95% CI and the relevant test p-value in your output. State your conclusion about whether the age-adjusted odds ratio differs significantly from one.
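One way to sanity-check an age-adjusted odds ratio is the Mantel-Haenszel pooled estimate, which combines the stratum tables as OR_MH = sum(a*d/n) / sum(b*c/n). This is a different estimator from conditional logistic regression, swapped in here only as a cross-check, and the counts are hypothetical placeholders rather than the study data. Only the point estimate is computed; no confidence interval or test is attempted.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across strata.

    Each table is (a, b, c, d):
    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d  # stratum total
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined when every b*c is zero")
    return numerator / denominator


# Hypothetical counts for the three age strata, with "exposed" = 80+ g/day.
tables = [
    (10, 20, 5, 25),   # 40-49 years
    (20, 20, 8, 32),   # 50-59 years
    (15, 30, 6, 39),   # 60-69 years
]

or_mh = mantel_haenszel_or(tables)
print(f"Mantel-Haenszel OR = {or_mh:.3f}")
```

A pooled estimate like this is only a sensible summary when the stratum-specific odds ratios are reasonably homogeneous, which is what the test in part (a) addresses.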
(c) Compare your results from (a) and (b). Which results do you think are better and why? Use the better results and, as you would for the abstract of a presentation or journal manuscript, write a brief description of your Statistical Methods and a brief summary of your main Results (including an estimate, 95% CI and p-value).