MATH2349: Data Wrangling - Build your Data Analytics Portfolio

Assignment Task:

Purpose

The purpose of this final assignment is to put to work the tools and knowledge you have gained throughout this course. It benefits you in several ways.

  • It will provide you with more experience using data preprocessing tools on real life data sets.

  • It helps you to self-direct your learning and interests to find unique and creative ways to wrangle your data.

  • It starts to build your data analytics portfolio. Portfolios (or e-portfolios) are a great way to show potential employers what you are capable of.

 

Overview

This assignment requires you to find some open data and use the knowledge and skills gained during the course to preprocess it. You will create a report using R Markdown that explains the steps you took to perform the data preprocessing tasks. You will also publish this report online (on RPubs), which gives you the opportunity to build your data analytics portfolio. This is a great way of showing potential employers what you are capable of. The more clearly you demonstrate your skills, the more marks you will be awarded.

 

Assessment criteria and weighting

Course Learning outcomes

This assessment is linked to the following course learning outcomes: 

  1. Accurately, logically and ethically combine data from multiple sources to make it suitable for statistical analysis and draw valid interpretations.

  2. Articulate how data meets the best practice standards (e.g. tidy data principles).

  3. Select, perform and justify data validation processes for raw datasets.

  4. Use leading open source software (e.g. R) for reproducible, automated data processing.

 

Minimum Requirements for the Data sets

Considering this is a data preprocessing class, I expect your data set to meet certain requirements so that you can demonstrate your knowledge of data preprocessing. The following are the minimum requirements for the data sets that I will look for:

  1. At least two data sets should be merged to create your assignment data (for example you can take crime statistics for the cities/states in Australia and merge this data set with cities/states’ per capita income data).

  2. Your data set should include multiple data types (numerics, characters, factors, etc).

  3. Your data set should include variables suitable for data type conversions, so that you can apply the required conversions (e.g., character -> factor, character -> date, numeric -> factor).

  4. Your data set should include at least one factor variable that needs to be labelled and/or ordered. 

  5. At least one of the data sets that you use should be untidy. You need to explain why the data set(s) you used are untidy, then apply the required steps to reshape your data into a tidy format.

  6. At least one variable needs to be created/mutated from the existing ones (e.g. the data may contain income and expense variables and you may create a savings variable out of the income and expense variables).

  7. You are expected to scan all variables for missing values, special values and obvious errors (i.e. inconsistencies). If there are missing values, deal with them using any of the suitable techniques outlined in Module 5, and justify and document your approach properly. If there are no missing values in the data, scan all variables for special values and obvious errors, deal with them using any of the suitable techniques outlined in Module 5, and justify and document your approach properly.
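
As a rough illustration of how the requirements above might fit together in R, here is a minimal sketch. It assumes the `dplyr` and `tidyr` packages are installed. The data frames, column names and values are all hypothetical, invented only for this example; they are not part of the assignment and not real statistics.

```r
# Minimal sketch only: all data below is made up for illustration.
library(dplyr)
library(tidyr)

# Hypothetical untidy data set: one column per year, so the variable
# "year" is stored in the column headers instead of in its own column.
crime_wide <- data.frame(
  state  = c("NSW", "VIC", "QLD"),
  `2019` = c(120, 95, NA),
  `2020` = c(110, 90, 70),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical second data set, with everything stored as character.
income <- data.frame(
  state             = c("NSW", "VIC", "QLD"),
  income_per_capita = c("52000", "50000", "48000"),
  income_band       = c("high", "medium", "low"),
  recorded_on       = c("2020-06-30", "2020-06-30", "2020-06-30"),
  stringsAsFactors  = FALSE
)

# Requirement 5: reshape the untidy data so each row is one state-year.
crime_long <- crime_wide %>%
  pivot_longer(cols = -state, names_to = "year", values_to = "offences")

# Requirement 1: merge the two data sets on their shared key.
merged <- left_join(crime_long, income, by = "state")

# Requirements 2, 3, 4 and 6: convert types, label and order a factor,
# and create a new variable from existing ones.
merged <- merged %>%
  mutate(
    year              = as.integer(year),
    income_per_capita = as.numeric(income_per_capita),
    recorded_on       = as.Date(recorded_on),
    income_band       = factor(income_band,
                               levels  = c("low", "medium", "high"),
                               labels  = c("Low", "Medium", "High"),
                               ordered = TRUE),
    offences_per_1000_income = offences / (income_per_capita / 1000)
  )

# Requirement 7: count missing values in every column.
missing_counts <- colSums(is.na(merged))
print(missing_counts)
```

The derived variable here is an arbitrary ratio chosen only to show `mutate()`. How to treat the missing `offences` value (remove, impute, or flag) is deliberately left open, since the brief asks you to choose and justify a technique from Module 5.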

 
