Aims
The aim of this assignment is to introduce a practical application of Big Data and Cloud Computing using a realistic big data problem. Students will implement a solution using an industry leading Cloud computing provider together with the distributed processing environment Apache Spark. This will involve the selection of problem appropriate Machine Learning algorithms and methods.
Learning Outcomes Assessed:
Knowledge & Understanding
LO 1. Apply big data analytic algorithms, including those for visualization and cloud computing techniques to multi-terabyte datasets.
LO 2. Critically assess data analytic and machine learning algorithms to identify those that satisfy given big data problem requirements.
Intellectual / Professional skills & abilities:
LO 3. Critically evaluate and select appropriate big data analytic algorithms to solve a given problem, considering the processing time available and other aspects of the problem.
LO 4. Design and develop advanced big data applications that integrate with third-party cloud computing services.
Personal Values Attributes (Global / Cultural awareness, Ethics, Curiosity) (PVA):
LO 5. Critically assess the relationship between knowledge and the ethical and social interpretation of primary research using big data.
Definitions
Portfolio Assignment: A collection of pieces of work
Individual Work: Work carried out by one person only
Group Work: Work carried out collaboratively seeking to improve each other’s elements
Peer Review: Critical analysis and subsequent grading of a social equal’s work
Semi-Formative: Training tasks assigned course credit to reward and ensure engagement.
Training Task 1: Peer Reviewed Task (24%)
The objective of this task is to ensure that students have mastered the skills required for the final module assessment:
1. Process a data set using the recommended software environment for the module.
2. Explain the logical reasoning behind your code.
This work will be peer assessed, as recommended by the British Computer Society. That is, you will critically assess the work of fellow students (your peers) and THEY will assess yours.
In detail:
1. You will create a Jupyter notebook based on the scenario below (which is derived from weekly worksheets 1-4), explaining your code using notebook-embedded Markdown (i.e. formatted text, not just comments).
2. You will post your notebook to the module discussion board on Blackboard.
3. You will then mark (i.e. peer review) the submission preceding yours on the discussion board, and the one following it, using the marking scheme below, and post these mark sheets.
4. Your mark for this task will be the average of your peer marks.
Scenario:
Suppose you are a police department with a limited budget. You plan to reduce road-traffic accidents by a one-month targeted advertising campaign.
Using the given dataset, which gender, age group, and month would be the largest target group as indicated by positive breath tests?
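One way to approach the scenario is to count positive breath tests for each combination of gender, age group, and month, and then take the largest count. The sketch below uses plain Python on a few made-up rows so that it runs anywhere; the field names (`gender`, `age_band`, `month`, `result`) and their values are assumptions, not the real dataset's schema. In the module's Spark environment the same step would typically be a filter followed by a grouped count.

```python
from collections import Counter

# Hypothetical sample rows; the real dataset's column names and values may differ.
rows = [
    {"gender": "Male", "age_band": "20-24", "month": "Dec", "result": "positive"},
    {"gender": "Male", "age_band": "20-24", "month": "Dec", "result": "positive"},
    {"gender": "Female", "age_band": "30-39", "month": "Jun", "result": "positive"},
    {"gender": "Male", "age_band": "20-24", "month": "Jun", "result": "negative"},
    {"gender": "Female", "age_band": "30-39", "month": "Dec", "result": "negative"},
]


def largest_target_group(records):
    """Return the (gender, age_band, month) with the most positive tests, and its count."""
    counts = Counter(
        (r["gender"], r["age_band"], r["month"])
        for r in records
        if r["result"] == "positive"
    )
    if not counts:
        return None, 0
    # most_common(1) yields a single (key, count) pair.
    return counts.most_common(1)[0]


group, n_positive = largest_target_group(rows)
print(group, n_positive)
```

Counting the three fields jointly is only one reading of the question; counting each field separately is another, and the two readings can give different answers.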
Training Task 2: Group work participation Task (6%)
The objective of this task is to derive background study materials for the big data product, to be used by the whole class. These may include (but are not limited to) reviews of the literature on crime and big data, published work on violent crime and its causes, technical approaches to crime and big data, relevant statistics, and other computational methods. That is, the task is to research the topic in general.
Working in teams of up to four students, each group will produce at least 2000 original words, plus 10 references to scientific conference or journal papers.
Since this is a group training task, your participation is assessed rather than your content (students will be able to receive staff feedback on content during taught sessions).
Group work participation Task Marking Scheme
Each group will score 6% of the module mark, reduced proportionally by the percentage of copied work as determined by Turnitin (threshold 10%), by the number of words below 2000, and by the number of references below 10.
Examples:
Group A submit a total of 2100 words, plus 15 references which have a Turnitin similarity score of 8% (due to random matches). Each group member will score 6%.
Group B submit a total of 1500 words, plus 5 references, which have a Turnitin similarity score of 20% (due to material copied from the Internet). Each group member will score:
Big Data Product: Weapons and Drugs (Individual Work 70%)
In the television documentary “Ross Kemp and the Armed Police”, broadcast by ITV on 6th September 2018, multiple claims were made regarding violent crime in the UK.
These claims were:
1. Violent Crime is increasing
2. There are more firearms incidents per head in Birmingham than anywhere else in the UK
3. Crimes involving firearms are closely associated with drugs offences
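Claim 2 is stated per head, so raw incident counts need to be divided by population before areas are compared. The following is a minimal sketch of that normalisation; the area names, incident counts, and populations are made up for illustration, and the helper `rate_per_100k` is a hypothetical name rather than part of any supplied code.

```python
def rate_per_100k(incidents, population):
    """Incidents per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return incidents * 100_000 / population


# Made-up figures for illustration only; these are not real statistics.
areas = {
    "Area A": {"incidents": 50, "population": 1_000_000},
    "Area B": {"incidents": 30, "population": 200_000},
}

rates = {
    name: rate_per_100k(a["incidents"], a["population"])
    for name, a in areas.items()
}
highest = max(rates, key=rates.get)
print(highest, rates[highest])
```

In this toy data Area A has more incidents in total, but Area B has the higher rate per head, which is the quantity the claim refers to.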
In this assignment you will investigate these claims using real, publicly available data sets that will be made available to you and placed in Amazon S3. These include, but are not limited to:
1. Street Level Crime Data published by the UK Home Office. This dataset contains 19 million data rows, each giving a crime type together with its location as a latitude and longitude.
2. Land Registry Price Paid Data: This gives the postcode of a property, the property type from an enumeration of D (Detached), S (Semi-Detached), T (Terraced), F (Flats/Maisonettes) and the price paid.
3. Postcode Data: This data set is based on material provided by the Ordnance Survey. It gives a latitude and longitude for every postcode. This is useful because it links the postcodes in the Land Registry Price Paid dataset to the latitude/longitude locations in the original crime dataset.
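The linking role of the postcode data can be sketched as a lookup join: each Price Paid row gains a latitude and longitude via its postcode, which puts property sales in the same coordinate system as the crime records. The example below is plain Python on made-up rows; the postcodes, coordinates, prices, field names, and the helper `add_coordinates` are all assumptions for illustration, and in Spark this would more likely be a DataFrame join on the postcode column.

```python
# Hypothetical miniature versions of the datasets; real column layouts may differ.
postcodes = {
    "AB1 2CD": (54.977, -1.607),
    "EF3 4GH": (52.479, -1.903),
}
price_paid = [
    {"postcode": "AB1 2CD", "property_type": "F", "price": 150000},
    {"postcode": "EF3 4GH", "property_type": "T", "price": 210000},
    {"postcode": "ZZ9 9ZZ", "property_type": "D", "price": 300000},  # not in lookup
]


def add_coordinates(sales, lookup):
    """Attach latitude/longitude to each sale whose postcode is in the lookup."""
    located = []
    for sale in sales:
        coords = lookup.get(sale["postcode"])
        if coords is None:
            continue  # drop sales with unknown postcodes (inner-join behaviour)
        lat, lon = coords
        located.append({**sale, "latitude": lat, "longitude": lon})
    return located


located_sales = add_coordinates(price_paid, postcodes)
for sale in located_sales:
    print(sale)
```

Dropping unmatched postcodes mirrors an inner join; keeping them with empty coordinates (a left join) is an equally valid choice, depending on how missing data should be handled.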