RStudio Report Writing Assignment:
Task:
Overview
The goal of this assignment is simple.
You must determine whether one of the body measurements fits a normal distribution. To do this, you will use the “Body” dataset (bdims.csv), which is located under Assignment 2.
Body Measurements Dataset Description
Body girth measurements and skeletal diameter measurements, as well as age, weight, height, and gender, are given for 507 physically active individuals - 247 men and 260 women.
Data Source: Heinz G, Peterson LJ, Johnson RW, Kerk CJ. 2003. Exploring Relationships in Body Dimensions. Journal of Statistics Education 11(2).
Variables in the dataset: Nine skeletal measurements (diameter measurements) and twelve girth (or circumference) measurements, as well as age, weight, height, and gender, are available in this dataset. Variable names and short descriptions are given below:
- bia.di: Respondent's biacromial diameter in centimeters.
- bii.di: Respondent's biiliac diameter (pelvic breadth) in centimeters.
- bit.di: Respondent's bitrochanteric diameter in centimeters.
- che.de: Respondent's chest depth in centimeters, measured between spine and sternum at nipple level, mid-expiration.
- che.di: Respondent's chest diameter in centimeters, measured at nipple level, mid-expiration.
- elb.di: Respondent's elbow diameter in centimeters, measured as sum of two elbows.
- wri.di: Respondent's wrist diameter in centimeters, measured as sum of two wrists.
- kne.di: Respondent's knee diameter in centimeters, measured as sum of two knees.
- ank.di: Respondent's ankle diameter in centimeters, measured as sum of two ankles.
- sho.gi: Respondent's shoulder girth in centimeters, measured over deltoid muscles.
- che.gi: Respondent's chest girth in centimeters, measured at nipple line in males and just above breast tissue in females, mid-expiration.
- wai.gi: Respondent's waist girth in centimeters, measured at the narrowest part of torso below the rib cage as average of contracted and relaxed position.
- nav.gi: Respondent's navel (abdominal) girth in centimeters, measured at umbilicus and iliac crest using iliac crest as a landmark.
- hip.gi: Respondent's hip girth in centimeters, measured at the level of bitrochanteric diameter.
- thi.gi: Respondent's thigh girth in centimeters, measured below gluteal fold as the average of right and left girths.
- bic.gi: Respondent's bicep girth in centimeters, measured when flexed as the average of right and left girths.
- for.gi: Respondent's forearm girth in centimeters, measured when extended, palm up as the average of right and left girths.
- kne.gi: Respondent's knee girth in centimeters, measured as the average of right and left girths.
- cal.gi: Respondent's calf maximum girth in centimeters, measured as the average of right and left girths.
- ank.gi: Respondent's ankle minimum girth in centimeters, measured as the average of right and left girths.
- wri.gi: Respondent's wrist minimum girth in centimeters, measured as the average of right and left girths.
- age: Respondent's age in years.
- wgt: Respondent's weight in kilograms.
- hgt: Respondent's height in centimeters.
- sex: Respondent’s gender, 1 if the respondent is male, 0 if female.
Assignment Instructions
1- You are required to select ONLY ONE MEASUREMENT from the dataset for this investigation. You must decide which measurement to work with. You don’t need to include all variables.
2- Since males and females tend to have different body dimensions, you are required to investigate the normality assumption of the selected variable separately in men and women. For example, if you select biacromial diameter as your variable of interest, you should investigate whether this measurement fits a normal distribution in men and in women separately. Keep in mind that in some cases the men's distribution may fit a normal distribution whereas the women's distribution may not, or vice versa.
3- You need to import this dataset into RStudio and tidy it up (e.g., you may need to define the variable sex as a factor and define labels for it) using R functions.
4- You need to give summary statistics (i.e., mean, median, standard deviation, first and third quartile, interquartile range, minimum and maximum values) for your variable of interest separately in men and in women using R functions.
5- Then you will use R to summarise the empirical distribution of the selected body measurement separately in men and women and compare it to a normal distribution. You need to do this visually by plotting a histogram with a normal distribution overlay for each group.
6- You will end by discussing the extent to which the theoretical normal distribution fits the empirical data and by making recommendations regarding the modeling of this body measurement.
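As a rough starting point, steps 3 to 5 might be sketched in base R along the following lines. This is only one possible approach, not a required solution. It assumes that bdims.csv sits in the working directory and has the columns described above, and it uses bia.di purely as the example variable from instruction 2. The function names tidy_body, summarise_by_sex and plot_normal_overlay are hypothetical helpers introduced here for illustration.

```r
# Sketch only. Assumptions: bdims.csv is in the working directory and
# sex is coded 1 = male, 0 = female, as in the variable list above.
# All function names below are hypothetical helpers, not part of any package.

# Step 3: recode sex as a labelled factor.
tidy_body <- function(bdims) {
  bdims$sex <- factor(bdims$sex, levels = c(1, 0), labels = c("Male", "Female"))
  bdims
}

# Step 4: summary statistics for one variable, one row per sex.
summarise_by_sex <- function(bdims, variable) {
  stats <- function(v) {
    c(min    = min(v, na.rm = TRUE),
      q1     = unname(quantile(v, 0.25, na.rm = TRUE)),
      median = median(v, na.rm = TRUE),
      mean   = mean(v, na.rm = TRUE),
      q3     = unname(quantile(v, 0.75, na.rm = TRUE)),
      max    = max(v, na.rm = TRUE),
      sd     = sd(v, na.rm = TRUE),
      iqr    = IQR(v, na.rm = TRUE))
  }
  # split() groups the values by sex; t() puts one group on each row.
  t(sapply(split(bdims[[variable]], bdims$sex), stats))
}

# Step 5: density-scaled histogram with a normal curve that uses the
# sample mean and standard deviation of the plotted values.
plot_normal_overlay <- function(values, title) {
  m <- mean(values, na.rm = TRUE)
  s <- sd(values, na.rm = TRUE)
  h <- hist(values, plot = FALSE)
  # Make the y-axis tall enough for both the bars and the curve peak.
  top <- max(h$density, dnorm(m, mean = m, sd = s))
  hist(values, freq = FALSE, ylim = c(0, top),
       main = title, xlab = "Measurement (cm)")
  curve(dnorm(x, mean = m, sd = s), add = TRUE, lwd = 2)
}

# Example usage, run only if the data file is actually present.
if (file.exists("bdims.csv")) {
  bdims <- tidy_body(read.csv("bdims.csv"))
  print(summarise_by_sex(bdims, "bia.di"))
  for (level in levels(bdims$sex)) {
    plot_normal_overlay(bdims$bia.di[bdims$sex == level],
                        paste("bia.di:", level))
  }
}
```

The histogram is drawn with freq = FALSE so that its bars are on the density scale and can be compared directly with the dnorm curve. How closely the bars follow the curve in each group is what step 6 asks you to discuss.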