BUS5DWR: Data Wrangling and R

Assignment Task:

Overview
Over the past few weeks, you have learned how to use R to wrangle business data. This assignment gives you an opportunity to demonstrate your R data-wrangling skills.
Using the tidyverse package is recommended but not compulsory. Before starting, please read the entire assignment carefully to make sure you understand the requirements, the submission format and the marking rubrics.

Specific Requirements
Part 1  

The online hospitality company Airbnb has made publicly available a number of datasets. This part of the assignment makes use of a subset of the Melbourne dataset. The dataset is given in the AirBnBMelb.tsv file.
 
Write R code to answer each question in one code chunk:
1.1. Load the dataset from the given file into a new dataframe called bnb. Keep only these eight columns: id, type, location, transport, price, nbr_review, last_review, and review_score. Convert last_review to the Date type. Print the summary of bnb.
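
The loading and date-conversion steps in 1.1 can be sketched on a toy example. This is a minimal illustration only: the inline text stands in for AirBnBMelb.tsv, the values and the `extra` column are invented, and the ISO date format is an assumption about how last_review is stored.

```r
# Toy tab-separated text standing in for the real file (all values are invented).
tsv_text <- paste(
  "id\ttype\tlast_review\textra",
  "1\tApartment\t2019-03-14\tx",
  "2\tHouse\t\ty",
  sep = "\n"
)

# read.delim() reads tab-separated data; for a real file, pass its path instead of text =.
bnb_toy <- read.delim(text = tsv_text, stringsAsFactors = FALSE)

# Keep only the wanted columns, selected by name.
bnb_toy <- bnb_toy[, c("id", "type", "last_review")]

# Convert the character column to Date (the "%Y-%m-%d" format is an assumption).
# Empty strings become NA, which marks listings with no review date.
bnb_toy$last_review <- as.Date(bnb_toy$last_review, format = "%Y-%m-%d")

summary(bnb_toy)
```

With the tidyverse, `readr::read_tsv()` followed by `dplyr::select()` would play the same roles.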

1.2. Display the total number of listings and the percentage of listings that have not been reviewed.

1.3. How many listings were last reviewed in 2019 and have transport information that mentions both ‘bus’ and ‘shop’? (The words can be in lower, upper or mixed case, can appear in any order, and can be part of a longer word.)
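
One way to express the matching rule in 1.3 is case-insensitive substring matching, sketched below on invented vectors. The data are hypothetical, and treating a missing transport description as "no match" is an assumption.

```r
# Invented transport descriptions and review dates for four toy listings.
transport <- c(
  "Bus stop nearby, SHOPS around",
  "Tram only",
  "Workshop next door; minibus",
  NA
)
last_review <- as.Date(c("2019-05-01", "2019-07-09", "2018-11-30", "2019-01-15"))

# grepl() with ignore.case = TRUE matches any case and also matches inside
# longer words ("SHOPS", "Workshop", "minibus"). It returns FALSE for NA.
has_bus  <- grepl("bus",  transport, ignore.case = TRUE)
has_shop <- grepl("shop", transport, ignore.case = TRUE)

# Extract the year from the Date column and compare it as text.
in_2019 <- format(last_review, "%Y") == "2019"

# Count the listings that satisfy all three conditions.
n_match <- sum(has_bus & has_shop & in_2019)
n_match
```

Testing the two words separately, rather than with one combined pattern, is what makes the order of the words irrelevant.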

1.4. What are the three locations with the highest number of listings (display location and number of listings)?  

1.5. Display the average price and average review score of each listing type in South Yarra, Australia, considering only listings that have more than 100 reviews. Show the cheapest type first.
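
The filter, group and sort pattern behind 1.5 could look like the following base R sketch. The data frame is invented, and the exact location string "South Yarra, Australia" is taken from the task wording rather than checked against the dataset.

```r
# Invented listings; column names follow the assignment, values do not.
toy <- data.frame(
  type         = c("Apartment", "Apartment", "House", "House", "Loft"),
  location     = c("South Yarra, Australia", "South Yarra, Australia",
                   "South Yarra, Australia", "Melbourne, Australia",
                   "South Yarra, Australia"),
  price        = c(120, 100, 90, 300, 200),
  nbr_review   = c(150, 120, 101, 500, 50),
  review_score = c(95, 97, 90, 99, 80),
  stringsAsFactors = FALSE
)

# Keep South Yarra listings with more than 100 reviews.
sel <- toy[toy$location == "South Yarra, Australia" & toy$nbr_review > 100, ]

# Average price and score per listing type (rows with NA are dropped by default).
avg <- aggregate(cbind(price, review_score) ~ type, data = sel, FUN = mean)

# Sort ascending by average price so the cheapest type comes first.
avg <- avg[order(avg$price), ]
avg
```

The dplyr equivalent would chain `filter()`, `group_by()`, `summarise()` and `arrange()`.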

1.6. Analyse the distribution of review scores in South Yarra. What is the difference before and after removing outliers? (Hint: draw a boxplot and a histogram, and write a short paragraph of less than 100 words to describe your insight.)
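
For 1.6, the task does not say how outliers should be defined. A common choice, assumed here, is the boxplot rule (points more than 1.5 times the interquartile range beyond the hinges). The scores below are invented.

```r
# Invented review scores with one obviously low value.
scores <- c(20, 88, 90, 92, 93, 95, 96, 97, 98, 100)

# boxplot.stats() returns the points a boxplot would draw as outliers
# (by default, beyond 1.5 * IQR from the hinges).
out <- boxplot.stats(scores)$out

# Remove those points to get the cleaned distribution.
cleaned <- scores[!scores %in% out]

# Compare simple summaries before and after removal.
mean_before <- mean(scores)
mean_after  <- mean(cleaned)
c(before = mean_before, after = mean_after)
```

Calling `boxplot()` and `hist()` on `scores` and then on `cleaned` gives the before-and-after pictures the hint asks for.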

 

Part 2 
The given Excel file, obesity.xlsx, records the prevalence of obesity among adults in each country of the world. It contains two worksheets. The Data worksheet records, for each country in 2011 and 2016, the average obesity rate and its 95% confidence interval for females, males and both sexes. The second worksheet, Continent, records the countries in each continent.
You will see that the data is far from being ready for analysis and needs to be ‘wrangled’. Additionally, a few errors have been deliberately introduced so these will need to be corrected by applying your R code.
Please write R code to perform the following steps; each step will work on the result obtained from the previous step.

2.1. Load the data from the Data worksheet into a dataframe named obes. Keep only three columns, namely country (column A), the obesity rate of both sexes in 2011 (column B) and 2016 (column E).
Rename the columns to Country, 2011, and 2016. Quickly show the structure of obes with the glimpse function.
 
2.2. You can see that the column headers 2011 and 2016 contain the year information. Use pivot_longer to transform obes into three columns, namely Country, Year, and Rate. Are there missing values in the Rate column after the transformation? If so, display them and change them into NA. Then display the number of NA values in this column.
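
The reshaping in 2.2 can be sketched with tidyr on a two-country toy table. The country names, the rate strings and the "No data" placeholder are all assumptions made for illustration; the real worksheet may mark missing values differently.

```r
library(tidyr)

# Toy wide table; check.names = FALSE keeps "2011" and "2016" as column names.
obes_toy <- data.frame(
  Country = c("A-land", "B-land"),
  `2011`  = c("19.5 [16.2-23.1]", "No data"),
  `2016`  = c("22.3 [18.4-26.5]", "25.0 [21.0-29.2]"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Move the two year columns into Year (from the header) and Rate (from the cell).
long <- pivot_longer(
  obes_toy,
  cols      = c(`2011`, `2016`),
  names_to  = "Year",
  values_to = "Rate"
)

# Replace the assumed missing-value placeholder with a real NA, then count NAs.
long$Rate[long$Rate == "No data"] <- NA
n_missing <- sum(is.na(long$Rate))
n_missing
```

Note that Year comes out as character because it is built from column names.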

2.3. Split the Rate column into three columns named Rate, MinCI, and MaxCI. After splitting, what are the data types of Rate, MinCI and MaxCI? How does the average obesity rate of the world change from 2011 to 2016?
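
A possible approach to the split in 2.3 is shown below in base R. It assumes each cell looks like "19.5 [16.2-23.1]" (rate, then the interval in square brackets), which is a guess at the worksheet's layout and should be checked against the actual data.

```r
# Invented rate strings in the assumed "rate [min-max]" layout, plus one NA.
rate_raw <- c("19.5 [16.2-23.1]", "22.3 [18.4-26.5]", NA)

# One capture group per number: rate, lower bound, upper bound.
pattern <- "^([0-9.]+) \\[([0-9.]+)-([0-9.]+)\\]$"

# sub() keeps only the chosen capture group; NA input stays NA.
# The pieces are text after splitting, so as.numeric() is needed before averaging.
Rate  <- as.numeric(sub(pattern, "\\1", rate_raw))
MinCI <- as.numeric(sub(pattern, "\\2", rate_raw))
MaxCI <- as.numeric(sub(pattern, "\\3", rate_raw))

split_toy <- data.frame(Rate, MinCI, MaxCI)
str(split_toy)

# Averages must skip the missing values explicitly.
mean(split_toy$Rate, na.rm = TRUE)
```

`tidyr::separate()` is the tidyverse counterpart; it returns character columns unless `convert = TRUE` is set.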

2.4. Display the name and rate of the five countries with the highest obesity rates in 2016. Write this result to a CSV file too.  

2.5. Load the data from the Continent worksheet into a new dataframe, keeping only two columns, “Country or area” and “Region1”. How many unique regions are there?

2.6. Display the 2016 obesity information of each country in Southern Europe. 

2.7. Display the top three regions with the highest average obesity rate in 2011.  
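
Steps 2.6 to 2.8 all rely on attaching a region to each country and then summarising by region. A minimal base R sketch of that join-and-rank pattern follows; the countries, regions and rates are invented, and only the column names “Country or area” and “Region1” come from the task.

```r
# Invented country-level rates for a single year.
rates <- data.frame(
  Country = c("A", "B", "C", "D"),
  Rate    = c(20, 30, 10, 40),
  stringsAsFactors = FALSE
)

# Invented lookup table; check.names = FALSE keeps the space in the column name.
regions <- data.frame(
  `Country or area` = c("A", "B", "C", "D"),
  Region1           = c("North", "North", "South", "East"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Join on differently named key columns.
joined <- merge(rates, regions, by.x = "Country", by.y = "Country or area")

# Average rate per region, sorted from highest to lowest.
region_avg <- aggregate(Rate ~ Region1, data = joined, FUN = mean)
region_avg <- region_avg[order(-region_avg$Rate), ]

# Top three regions.
head(region_avg, 3)
```

Country names that are spelled differently in the two worksheets would silently drop out of an inner join like this one, so comparing row counts before and after the merge is a sensible check.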

2.8. Draw a column chart to compare the average obesity rate of each region. Order the bars from highest to lowest. Write a short paragraph (less than 100 words) to describe your insight.

 
