ITNPBD7 - Review Sentiment Analysis - Science Assignment Help

Assignment Task

 

Review Sentiment Analysis
Your task in completing this assignment is to analyse a range of reviews for the most common words that appear for both positive and negative sentiments. The data are contained in a file called sentiments.txt, which you can download from the module assignment page on Canvas (a shorter version called shortsent.txt is also available for testing purposes). The file contains the type of item being reviewed (Restaurant, Movie, Product) followed by the review text and then a sentiment value (1 for positive, 0 for negative). Each review is on a single line of the file with the different fields separated by a tab character, as shown in the following example:

Restaurant	I swung in to give them a try but was disappointed.	0
Restaurant	I had a pretty satisfying experience.	1
Movie	Some applause should be given to the "prelude".	1
Product	A must study for anyone interested poor design.	0
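As a minimal sketch of how one line of this file could be read, the Python below splits a record on the tab character into its three fields. The names `Review` and `parse_review` are hypothetical and not part of the assignment, and the sketch assumes the review text itself contains no tab characters.

```python
from typing import NamedTuple


class Review(NamedTuple):
    """One record from the file (hypothetical container, for illustration)."""
    category: str   # Restaurant, Movie or Product
    text: str       # the review text
    sentiment: int  # 1 for positive, 0 for negative


def parse_review(line: str) -> Review:
    """Split one tab-separated line into its three fields."""
    # Assumes exactly three fields, i.e. no tab inside the review text.
    category, text, sentiment = line.rstrip("\n").split("\t")
    return Review(category, text, int(sentiment))


sample = "Restaurant\tI had a pretty satisfying experience.\t1"
review = parse_review(sample)
print(review.category, review.sentiment)
```

A line with a missing or extra field raises a `ValueError` here, which is a simple way to spot malformed records while testing against shortsent.txt.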

Step 1, Use of HDFS –
Before you write any code, you will need to copy the data onto your own space in HDFS. In your report, give details of how HDFS stores the data used in this assignment and how it is processed as a map/reduce task (for the purpose of your description, assume the file is much bigger than it really is). This section should be around half a page long and should include all the HDFS commands you used to create a directory for the data, load it onto HDFS, submit the task and examine the results. Explain why the data must be uploaded and processed in this way and what each command does to complete this process. Ensure everything you write here is your own description of what happens with your data and code - do not just copy other sources on HDFS and change the wording.
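To make the "much bigger file" assumption concrete, the arithmetic can be sketched as below. The 128 MB block size and replication factor of 3 are common HDFS defaults but are assumptions here, since the cluster you use may be configured differently. The function name `hdfs_layout` is hypothetical, and the sketch also assumes one map task per block.

```python
import math

BLOCK_SIZE_MB = 128  # assumed block size; check your cluster's configuration
REPLICATION = 3      # assumed replication factor; check your cluster's configuration


def hdfs_layout(file_size_mb: float) -> dict:
    """Estimate how a file of the given size would be laid out on HDFS."""
    # The file is cut into fixed-size blocks; the last block may be smaller.
    blocks = math.ceil(file_size_mb / BLOCK_SIZE_MB)
    return {
        "blocks": blocks,
        # Every block is copied to several DataNodes for fault tolerance.
        "block_replicas": blocks * REPLICATION,
        # Assumes one input split, and so one map task, per block.
        "map_tasks": blocks,
    }


print(hdfs_layout(1000))
```

Under these assumptions a 1000 MB file occupies 8 blocks, so the work can be spread over up to 8 map tasks, each ideally running on a node that already holds a copy of its block.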

Step 2, Design –
Now consider the Map/Reduce design you will implement. You know there are only six different results that must be produced and a larger (but unknown) number of different words used in those reviews. In your report, consider and compare developing two solutions: one without a Combiner and one with a Combiner. Discuss what keys and values the mapper will emit compared with the Combiner or Reducer and how this affects the efficiency of your solution. Some points to consider are how much data will be moved across the network and how many different reducers will be used in your design. You should also address whether or not your solution is optimal for the task and whether you think it could be improved. If you think it could be improved but could not work out how to code it, discuss what changes would be needed to make your solution more effective and why they would improve performance.
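The effect of a Combiner on the amount of data shuffled can be illustrated with a small in-memory simulation. This is only one possible design, not the required solution: the key layout (category, sentiment, word), the function names, and the punctuation handling are all assumptions made for illustration. The sketch also assumes every key for a given category and sentiment reaches the same reducer, which on a real cluster would depend on how keys are partitioned.

```python
from collections import Counter


def mapper(line):
    """Emit ((category, sentiment, word), 1) for every word in one review."""
    category, text, sentiment = line.rstrip("\n").split("\t")
    for raw in text.lower().split():
        word = raw.strip('.,!?"')  # crude punctuation removal (assumption)
        if word:
            yield (category, sentiment, word), 1


def combine(pairs):
    """Sum counts per key locally, before anything crosses the network."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def reduce_top_words(pairs):
    """Sum counts per key, then keep the most common word per group."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    best = {}
    for (category, sentiment, word), count in totals.items():
        group = (category, sentiment)
        if group not in best or count > best[group][1]:
            best[group] = (word, count)
    return best


# Two made-up reviews standing in for one input split.
split = ["Movie\tgood good plot\t1", "Movie\tgood acting\t1"]
mapped = [pair for line in split for pair in mapper(line)]
combined = combine(mapped)
print(len(mapped), len(combined))
print(reduce_top_words(combined))
```

In this toy split the mapper emits five pairs and the Combiner reduces them to three before the shuffle, while the reducer's result is the same either way. That saving grows with the number of repeated words per split, which is the comparison the report is asked to discuss.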

 
