1 Objectives
You are required to identify and carry out a series of analyses (i.e., at least two) of a large dataset or a collection of large datasets utilising appropriate programming languages and programming environments.
Your project must incorporate the following elements:
1. A MapReduce-style processing environment must be used for some part of the analysis.
2. Source dataset(s) should be stored in an appropriate database(s) prior to processing.
3. Post-MapReduce processing dataset(s) should be stored in an appropriate database(s).
4. The source data must be programmatically accessed by the code that processes it.
5. The output of the MapReduce processes must be programmatically stored.
6. Follow-up analysis must be carried out on the MapReduce output data.
For example, you may initially utilise HBase, MongoDB, Cassandra, Riak or some other NoSQL database to store the raw datasets, then the processing would utilise this database as an input source. After processing the data you could store the data in MySQL or PostgreSQL.
Following that, you may use Python with NumPy/Pandas/Matplotlib or R with ggplot/plotly to conduct further analysis of the output data (e.g., statistical analysis and/or data visualisation).
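The flow described above (source database, MapReduce-style processing, output database, follow-up analysis) can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed solution: it uses the standard-library sqlite3 module as a stand-in for both the source and output databases, runs the map, shuffle and reduce steps in a single process rather than on a cluster, and the table names (raw_events, category_totals), column names and sample rows are all hypothetical.

```python
import sqlite3
from collections import defaultdict


def map_record(record):
    """Map step: emit one (key, value) pair per input row."""
    category, amount = record
    yield category, amount


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce step: collapse one key's values into a (key, count, sum) row."""
    return key, len(values), sum(values)


def run_pipeline(conn):
    # Element 4: the source data is read programmatically from the database.
    rows = conn.execute("SELECT category, amount FROM raw_events").fetchall()
    pairs = (pair for row in rows for pair in map_record(row))
    results = [reduce_group(k, v) for k, v in sorted(shuffle(pairs).items())]
    # Element 5: the MapReduce output is stored programmatically.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS category_totals "
        "(category TEXT PRIMARY KEY, n INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO category_totals VALUES (?, ?, ?)", results
    )
    conn.commit()
    return results


def demo():
    # Hypothetical in-memory database with made-up sample rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_events (category TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO raw_events VALUES (?, ?)",
        [("a", 1.0), ("b", 2.0), ("a", 3.0)],
    )
    run_pipeline(conn)
    # Element 6: follow-up analysis on the stored output (mean per category).
    return conn.execute(
        "SELECT category, total / n FROM category_totals ORDER BY category"
    ).fetchall()
```

In a real project the map and reduce functions would typically run inside a framework such as Hadoop or Spark, and the two sqlite3 connections would be replaced by clients for whichever databases are chosen; the separation into read, map, shuffle, reduce, store and analyse stages is the part this sketch is meant to show.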
2 Deliverables
The results of the analysis must be presented in the form of a project report. This report should discuss the programming and data handling challenges that you encountered and the techniques you used to overcome these challenges. The report should be 3000 ± 300 words in length (excluding references) and must follow the IEEE format, employing appropriate referencing methods and an academic writing style. The report should include the following:
1. A description of the source dataset(s), including data types, ranges, cardinality and number of missing values;
2. A description of the objective of the analysis. Note that the analysis should attempt to answer a novel question;
3. Details of the data processing activities carried out, including preparation of the data and processing the data in a MapReduce-type environment. This should include all MapReduce patterns employed;
4. A discussion of the rationale and justification for the choices you have made in terms of data processing, programming language choice, and algorithms that you have implemented; and
5. Presentation of results by making appropriate use of figures, tables, etc.
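The dataset description asked for in item 1 (data types, ranges, cardinality and number of missing values) can be produced programmatically. The following is a minimal sketch using only the Python standard library; it assumes the dataset has been loaded as a list of dictionaries, treats None and the empty string as missing, and the function names profile_column and profile_rows are hypothetical helpers rather than part of any library.

```python
def profile_column(values):
    """Summarise one column: types, numeric range, cardinality, missing count."""
    # Assumption: None and "" both count as missing values.
    present = [v for v in values if v is not None and v != ""]
    types = sorted({type(v).__name__ for v in present})
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    return {
        "types": types,
        "min": min(numeric) if numeric else None,
        "max": max(numeric) if numeric else None,
        "cardinality": len(set(present)),
        "missing": len(values) - len(present),
    }


def profile_rows(rows):
    """Profile every column that appears in a list of dict-shaped records."""
    columns = sorted({name for row in rows for name in row})
    return {
        name: profile_column([row.get(name) for row in rows])
        for name in columns
    }
```

For a large dataset the same summary would more likely be computed inside the database or with Pandas; the sketch only shows which quantities the description needs to report.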
3 Project Report
The report should be structured as follows:
1. Abstract
• A summary of the objectives and results of the analysis. Note: look at abstracts in your literature review to get an idea of what makes a good or bad abstract
2. Introduction
• The statement of the objectives of the analysis
• A motivation for the problem
• A discussion of the relevance of the chosen topic
• The elicitation of an appropriately formed research question
3. Related Work
• Here you should present an analysis of relevant (academic) works that addressed similar problems or guided your decisions
• This should be a critical evaluation (i.e., it should go beyond being a mere summary of the referenced works)
4. Methodology
• Provide a description of datasets and your justification of choosing them
• Discuss the data processing activities carried out and the justifications for using them
• Provide a rationale for the choice of technologies used (i.e., programming languages, databases, frameworks, etc.)
• Describe the design patterns you have implemented and give your reasons for choosing them
5. Results
• Present your results, making appropriate use of figures, tables, etc.
• Describe how the objectives of the project were met
6. Conclusions and Future Work
• Discussion of research findings as well as their implications and limitations
• Detail options for further work that could be explored