PUBHLTH 2007: Epidemiology for Health and Medical Sciences Assignment 2


Assignment 2

Background Summary

As summarised in the Abstract of the publication by Hammersley et al. (2025), given the known short- and long-term effects of smoking during pregnancy on both the mother and infant, various studies have introduced incentives with varying success. “Despite public health policies and initiatives to reduce smoking, smoking in pregnancy remains unacceptably high in Australia, particularly among populations of high disadvantage.”

Given some promising results from the use of financial incentives in studies overseas and the need for higher quality randomised controlled trials that also include measures of cost effectiveness and acceptability, a recently proposed Australian study (including four researchers affiliated with the School of Public Health at the University of Adelaide!) wishes to investigate the impact of financial incentives in the Australian context.

The proposed randomised controlled trial will include pregnant women attending antenatal care in the Northern Adelaide Local Health Network (NALHN) area who smoke, are ≥18 years of age and are ≤20 weeks pregnant at the first antenatal appointment. Women will be randomly assigned to one of two groups where they will receive:

Intervention: standard care and monitoring to see if they have stopped smoking at 4 and 12 weeks after their first antenatal visit and at 37 weeks’ gestation using a carbon monoxide breath test, with increasing financial rewards if they can show they have stopped smoking.

Control: standard care and monitoring to see if they have stopped smoking at 4 and 12 weeks after their first antenatal visit and at 37 weeks’ gestation using a carbon monoxide breath test, with no financial rewards.

The pregnant women will be monitored to determine if they have been abstinent (i.e. have no evidence of smoking) at 4 and 12 weeks after their first antenatal visit and at 37 weeks’ gestation.

Assignment Questions:

The overall aim of the trial is to examine the effectiveness of the Intervention, relative to the Control in assisting pregnant women to stop smoking.

The primary outcome of the trial (for each individual) is abstinence (i.e. no smoking) over all three time points or not (yes/no). Note: ‘no’ means that there is evidence of smoking for at least one of the time points.

1. State (with reason) whether the primary outcome is continuous or categorical. Briefly explain why a Chi-square test statistic would be helpful to assess the level of evidence that abstinence is associated with the intervention.

2. Carefully state the Null and Alternative Hypotheses for the study. [Note: The statistics lectures and practicals will provide VERY useful guidance here. Please do not look at other resources as terminology can vary elsewhere.]

3. All participants must sign an “informed consent” sheet before participating in the trial. This means that they are aware of many details of the study including the nature of the intervention and control. As a result, participants will not be blinded as to whether they are receiving the intervention or the control. Name one consequence that could arise from this lack of blinding that could create bias. Provide a brief explanation of how it could impact the comparison of the primary outcome in the two groups.

4. Let’s assume the study is now complete and we can view a contingency table for Treatment group versus Abstinence for all participants who completed the entire study:

[Image: contingency table of Treatment group versus Abstinence (observed counts); not reproduced here]

Without performing any statistical tests, use the values in the table to determine if there is any evidence that the Intervention may be better than the Control and explain your conclusion.

5. Calculate the expected values (E) in each cell of the contingency table (to 2 decimal places), if the Null Hypothesis is true. Show your working. 

[Image: contingency table template for the expected values (E); not reproduced here]

6. We can use the information in Q5 to calculate that the χ² statistic for the observed data is 2.07 (which you do not need to do). One important part of this process is to calculate O – E for each cell in the contingency table. Briefly explain why these differences are useful.

7. How many degrees of freedom are associated with the χ² test statistic? Explain your answer.

The Stata dataset Smoking in Pregnancy.dta contains fictitious data that have been simulated based on the design of this trial. The dataset was meant to contain information on fake patients, of whom 230 were randomised to the Intervention and 230 to the Control. Unfortunately, we discover that 40 (fake) participants in the Control group and 6 (fake) participants in the Intervention group withdrew from the study, leaving 414 participants for whom complete data could be collected. Note: The dataset contains the following variables:

  • ID: Patient Identification code
  • Age: Patient age at enrolment in study
  • Treatment: Group randomised to receive; Control or Intervention
  • Abstinence: Abstinence from smoking over entire study; Yes or No
  • Acceptability: Patient rating of the acceptability of financial incentives to stop smoking; continuous range: 0-10

8. Baseline characteristics of participants will be recorded and may include variables like age, socio-demographic and obstetric variables.

a) Explain why it is important to determine these characteristics and compare their balance between the groups (Intervention versus Control).

b) When reporting baseline characteristics for each group, a table is created which includes summary statistics like means or medians for comparison. Use an appropriate Stata command to check the distribution of Age at enrolment in this dataset and comment on which summary measure of central tendency would be most appropriate and why. [NB: Include the Stata commands and output in your assignment submission as a picture or copy-and-pasted text.]

9. Use the tab command in Stata to re-create the 2×2 contingency table seen in Q4 to display the relationship between the Treatment and the Abstinence outcome. [NB: Include the Stata command used to produce the necessary output AND the output itself in your assignment submission as a picture or copy-and-pasted text].

10. Use Stata to perform a χ² test for association between Treatment and Abstinence. [NB: Include the Stata command used to produce the necessary output AND the output itself in your assignment submission as a picture or copy-and-pasted text]. Include your output and highlight the parts of the output that identify the χ² test statistic and the associated P-value. Check and report whether your results are consistent with the information in Q6 and Q7.

11. In this simulated study we were told that 46 participants did not have complete data for the three time points. What are two likely reasons for which participants may drop out or become lost to follow-up in the context of this trial?

12. What is the difference in mean Acceptability scores between the two treatment groups?

13. Calculate the value of the test statistic (using formulae from lectures) for the difference in mean Acceptability scores. Show your working.

14. Let’s imagine that the trial actually intended to measure each woman’s Acceptability score at the start of the trial and again at the end. State the appropriate test statistic to compare the mean of these two measurements within the same individuals with zero and provide your reasoning. [Note: No calculations are required here.]

Brief Summary of Assessment Requirements

You must demonstrate understanding of the statistical design and analysis of a two-arm randomized controlled trial that compares a financial-incentive intervention versus control for stopping smoking in pregnancy. Deliverables include conceptual answers (theory and hypothesis formulation), basic contingency-table inference (χ²), interpretation of missing data and bias, exploration of baseline balance, and simple comparisons of a continuous acceptability score, plus Stata output screenshots/commands.

Key pointers to cover

  • Identify the primary outcome type (categorical vs continuous) and justify it.
  • Explain why a Chi-square test is appropriate for testing association between treatment and binary outcome.
  • State Null and Alternative hypotheses in the trial context.
  • Explain consequences of lack of blinding and potential bias.
  • Inspect the provided contingency table to form a reasoned conclusion (no test needed).
  • Show how to compute expected cell counts (E = row total × column total / grand total), interpret O–E differences, and report degrees of freedom.
  • Use Stata to recreate contingency table and run χ² (include commands + output).
  • Discuss likely reasons for dropout / loss to follow-up and implications (attrition bias, ITT vs per-protocol).
  • Compare baseline characteristics (e.g., Age): check distribution and pick appropriate summary (mean or median), showing Stata commands and output.
  • Compute and test difference in mean Acceptability between groups (point estimate, SE, test statistic), and identify the correct test for paired before/after measurements (paired t-test).

How the Academic Mentor guided the student step-by-step

1) Scope & conceptual framing

  • Mentor emphasised the decision problem: Does the incentive increase abstinence across all three time points?
  • Agreed key statistical concepts to demonstrate: outcome type, hypothesis statements, χ² mechanics, degrees of freedom, and basic continuous-data testing.

2) Question 1: outcome type & χ² rationale

  • Mentor clarified: Primary outcome is categorical (binary: abstinent over all three timepoints = Yes/No).
  • Explained: χ² test examines whether the proportion abstinent differs between Intervention and Control, hence suitable for two categorical variables (Treatment × Abstinence).

3) Question 2: hypotheses

  • Mentor coached the student to write concise hypotheses:

    H₀: Probability(abstinent | Intervention) = Probability(abstinent | Control)
    H₁: Probability(abstinent | Intervention) ≠ Probability(abstinent | Control)
    (Or directional H₁ if the study prespecifies Intervention > Control.)

4) Question 3: lack of blinding (bias)

  • Identified likely biases and impacts:
    • Performance bias (participants change behaviour because they know group assignment), which may inflate effect estimates in the Intervention group.
    • Reporting/social-desirability bias (self-report outcomes more favourable in Intervention).
  • Mentor advised describing how this could distort the measured abstinence difference and suggested objective measures (CO breath test) mitigate—but not fully eliminate—bias.

5) Questions 4–7: contingency table, expected counts, O–E and df

  • Mentor walked through visual inspection of the 2×2 table to see whether the proportion abstinent looks higher in the Intervention group than in the Control group (qualitative conclusion).
  • Taught the formula for expected cell counts:

    E_{cell} = (row total × column total) / grand total

    and why O−E is useful: it shows where the observed data deviate from what independence would predict, and each difference, squared and divided by E, gives that cell’s contribution to χ².
  • Degrees of freedom for a 2×2 = (2−1)×(2−1) = 1. Mentor advised stating this and why (rows & columns).
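The expected-count and degrees-of-freedom steps above can be written out as a short sketch. The symbols R_i (row total), C_j (column total), N (grand total) and O_ij (observed count) are notation assumed here for illustration, not taken from the lecture material.

```latex
% Expected count for the cell in row i, column j if the null hypothesis is true
E_{ij} = \frac{R_i \, C_j}{N}

% Pearson chi-square statistic, summed over all four cells of the 2x2 table
\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}

% Degrees of freedom for a table with r rows and c columns
\mathrm{df} = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1
```

In a 2×2 table with fixed row and column totals, the four O−E values all have the same magnitude and differ only in sign, so fixing one cell determines the other three. This is one way to see why there is a single degree of freedom.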

6) Questions 8–10: baseline balance & Stata usage

  • Why compare baseline characteristics: to verify randomisation produced comparable groups and to detect possible imbalances that could confound results.
  • Checking Age distribution (Stata guidance): mentor asked student to check histogram and normality then choose mean or median accordingly. Example Stata commands the mentor recommended:

    summarize Age, detail
    histogram Age, normal
    swilk Age

    • If Age is approximately symmetric / normal → report mean ± SD.
    • If skewed → report median (IQR).
  • Creating 2×2 table and χ² in Stata (mentor asked student to capture and include output):
    • The output shows the contingency table, χ² statistic and p-value; mentor advised highlighting these in the write-up and checking consistency with manual χ² calculation.
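The table and test described above could be produced with something like the following. This is a minimal sketch, assuming the dataset file sits in Stata's current working directory and that the variables are named Treatment and Abstinence exactly as in the assignment's variable list.

```stata
* Load the simulated trial data (file name as given in the assignment)
use "Smoking in Pregnancy.dta", clear

* 2x2 table of Treatment by Abstinence with row percentages,
* expected counts under independence, and the Pearson chi-square test
tabulate Treatment Abstinence, row expected chi2
```

The `expected` option prints the expected count beneath each observed count, which gives a direct check on the manual E values from Q5, and `chi2` reports the Pearson statistic with its P-value for comparison with Q6 and Q7.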

7) Question 11: attrition reasons & consequences

  • Mentor suggested likely reasons in this trial: miscarriage/medical complications, relocation or inability to attend follow ups, loss of interest, adverse events, or competing priorities.
  • Emphasised implications: attrition bias, reduced power, possible differential loss (more dropouts in one arm), and recommended intention-to-treat (ITT) as primary analysis with sensitivity analyses (per-protocol, best/worst case, multiple imputation).
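The point about differential loss can be illustrated with the withdrawal figures stated in the assignment (40 of 230 in the Control group, 6 of 230 in the Intervention group). The arithmetic below is a sketch using only those numbers.

```latex
% Proportion withdrawn and number completing, by arm
\text{Control: } \frac{40}{230} \approx 0.174 \ (17.4\%), \qquad 230 - 40 = 190 \text{ completers}

\text{Intervention: } \frac{6}{230} \approx 0.026 \ (2.6\%), \qquad 230 - 6 = 224 \text{ completers}

% Total with complete data
190 + 224 = 414
```

Withdrawal is far more common in the Control arm, so an analysis restricted to the 414 completers may not compare like with like.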

8) Questions 12–13: Acceptability score & difference in means

  • Mentor showed how to compute group means and difference, and the test statistic (two-sample t):
    Difference in means: $\bar{x}_1 - \bar{x}_2$
    $SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
    $t = \dfrac{\bar{x}_1 - \bar{x}_2}{SE}$
    • Mentor required showing all workings (means, SDs, n) and the manual calculation of t to match Stata output.
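The inputs for the manual calculation, and a Stata result to compare it against, might be obtained as sketched below. The sketch assumes the dataset is already loaded and that Acceptability and Treatment are named as in the assignment's variable list. Which variance assumption the lecture formulae use is not stated in the source, so both versions are shown.

```stata
* Group-wise mean, standard deviation and n of Acceptability,
* which are the inputs needed for the manual calculation
tabulate Treatment, summarize(Acceptability)

* Two-sample t-test; by default Stata pools the two group variances
ttest Acceptability, by(Treatment)

* Unequal-variance version, whose standard error is
* sqrt(s1^2/n1 + s2^2/n2), the form of SE shown above
ttest Acceptability, by(Treatment) unequal
```

If the manual t uses the unpooled SE, it should be compared with the output from the `unequal` version. The default pooled test can give a slightly different statistic when the group SDs or sizes differ.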

9) Question 14: paired measurements

  • Mentor explained the correct test is a paired t-test (dependent samples t) when comparing before and after Acceptability within the same individuals — because measurements are correlated.
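The paired t statistic can be sketched as follows. The notation (d_i for each woman's end-minus-start difference, n for the number of women with both measurements) is assumed here for illustration.

```latex
% Within-person difference for woman i
d_i = x_{i,\text{end}} - x_{i,\text{start}}

% Mean and standard deviation of the n differences
\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i, \qquad
s_d = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (d_i - \bar{d})^2}

% Paired t statistic comparing the mean difference with zero
t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}, \qquad \mathrm{df} = n - 1
```

Working with the differences reduces the problem to a one-sample t-test against zero, which is how the correlation between the two measurements on the same woman is taken into account.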

10) Write-up, presentation and Stata screenshots

  • Mentor insisted on including: commands, console output screenshots, annotated tables (expected counts), and a clear written interpretation (statistic, df, p-value, conclusion about evidence strength).

How the outcome was achieved

  • Student produced a complete submission that included:
    • Clear statement that primary outcome is binary and justification for χ².
    • Null/alternative hypotheses in correct statistical language.
    • Discussion of bias from lack of blinding and proposed mitigations.
    • Manual expected counts (E), O−E explanation, and df = 1.
    • Stata 2×2 table and χ² test output (commands + screenshots).
    • Discussion of dropout reasons and recommended ITT/sensitivity analyses.
    • Age distribution checks and justified choice of mean/median.
    • Comparison of Acceptability means, manual t calculation and Stata ttest output.
    • Correct identification of paired t-test for pre/post within-person comparisons.

Learning objectives covered

  • Distinguish between categorical and continuous outcomes and select appropriate tests.
  • Formulate null and alternative hypotheses for a clinical trial.
  • Understand the chi-square test mechanics, compute expected counts, interpret O−E contributions, and degrees of freedom.
  • Recognise and explain biases from lack of blinding and the role of objective measures.
  • Use Stata for descriptive checks, contingency tables and χ² tests and for mean comparisons.
  • Handle missing data issues conceptually (attrition bias, ITT, sensitivity analyses).
  • Compute and interpret difference in means and select the paired t-test for repeated measures.
  • Present statistical results clearly: commands, output, manual workings, and plain-language interpretations for non-statistical readers.
