U30289 : Statistics and Regressions Data Modelling - Statistics Assignment Help

Assignment Task:


The Intended Learning Outcomes of the Assignment are:
• To introduce the student to the fundamental concepts that can be used to understand and improve the process of decision-making.
• To use appropriate statistical methods to analyse a variety of financial decisions.
• To improve literature review skills.

Section A – Statistics and Regressions

The file “Part_A.xlsx”, uploaded on Moodle, contains data from a sample of 100 anonymised customers of a large supermarket chain. This is a sub-sample from a larger data set of the UK supermarket chain 'Tesco', originally shared on the machine learning competition platform Kaggle.com. In particular, the data contain information on: customers' unique identifier (“customer.id”), the average number of their yearly transactions in Tesco’s Express stores or Superstores (“express.no.transactions” and “superstore.no.transactions” respectively), as well as the total spending (in £) in the last five years (“express.total.spend” and “superstore.total.spend” respectively). Moreover, you have qualitative information on customers' gender, their economic purchasing power in an ordinal form (“affluency”), their precise income and the region of residence. You are required to make use of this data set and answer the following questions.

Summary Statistics
Provide the summary statistics (all measures of location and spread) of the following variables:
• Income (in £)
• Express store – Total Spending (in £)
• Express store – No. of transactions
• Superstore – Total Spending (in £)
• Superstore – No. of transactions
• Females (create a new dummy variable from the ‘Gender’ variable)
• London (create a new dummy variable from the ‘Region’ variable)
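
As a rough illustration of the measures requested above, the following minimal sketch computes location and spread statistics and builds the two dummy variables using only Python's standard library. The records, the field names ("income", "gender", "region") and the helper `summarise` are hypothetical stand-ins, not the contents of Part_A.xlsx, and the quartile convention ("inclusive") is one of several in common use.

```python
import statistics as st

# Hypothetical records standing in for rows of Part_A.xlsx (values invented).
rows = [
    {"income": 21000.0, "gender": "Female", "region": "London"},
    {"income": 35000.0, "gender": "Male", "region": "Wales"},
    {"income": 42000.0, "gender": "Female", "region": "Scotland"},
    {"income": 58000.0, "gender": "Male", "region": "London"},
    {"income": 27000.0, "gender": "Female", "region": "London"},
]

def summarise(values):
    """Return common measures of location and spread for a numeric list."""
    q1, _, q3 = st.quantiles(values, n=4, method="inclusive")
    return {
        "mean": st.mean(values),
        "median": st.median(values),
        "q1": q1,
        "q3": q3,
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": st.variance(values),  # sample variance (divides by n - 1)
        "std_dev": st.stdev(values),      # sample standard deviation
    }

# Dummy variables: 1 if the condition holds, 0 otherwise.
female = [1 if r["gender"] == "Female" else 0 for r in rows]
london = [1 if r["region"] == "London" else 0 for r in rows]

income_stats = summarise([r["income"] for r in rows])
female_share = st.mean(female)  # the mean of a 0/1 dummy is the sample proportion
london_share = st.mean(london)
```

Because the mean of a 0/1 dummy equals the proportion of ones, the same summary routine can report the percentage of females and of Londoners directly.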

 

Based on the summary statistics, comment on the following:
(a) Report all quartiles of the variable ‘income’ and discuss their meaning.
(b) Is there a different spending pattern in superstores compared to express stores? If so, do you think that it makes sense?
(c) What % of the sample consists of females and what % of the sample consists of consumers from London? Calculate and report the average total spending of females and of males in superstores, and the average total spending of Londoners and of non-Londoners in express stores. Discuss your findings.
Uncovering patterns from the data
(d) Calculate the Pearson’s correlation coefficient between income and the total spending in express and super stores. Report the coefficients and comment on their conceptual meaning. Is Pearson’s correlation a good metric to use on this occasion? What is the alternative and what is their difference?
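
To make the contrast in (d) concrete, here is a minimal sketch of Pearson's coefficient next to one common alternative, Spearman's rank correlation (Pearson's formula applied to ranks). The data are invented, and the helper names `ranks`, `pearson` and `spearman` are illustrative only.

```python
import statistics as st

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson's r: covariance scaled by both standard deviations."""
    mx, my = st.mean(x), st.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: spending rises with income, but not linearly.
income = [20000, 30000, 40000, 50000, 60000]
spend = [100, 200, 400, 800, 6400]

r_pearson = pearson(income, spend)
r_spearman = spearman(income, spend)
```

In this made-up example the relationship is perfectly monotone, so the rank-based measure reaches 1 while Pearson's r is pulled below 1 by the non-linearity and the single large value.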

 

Run two Simple Linear Regression models (one for express and one for super stores) in order to quantify the relationship between customers’ income and their last five years of spending in each store accordingly.
(e) Conduct hypothesis testing for both models at a 5% level of significance and comment on those hypotheses based on the regression output. Also state the point estimate and its conceptual meaning.
(f) Suppose that we randomly select an out-of-sample customer with an income of £42,586.96. What is his/her estimated combined spending over the last five years?
(g) What is the goodness of fit of both models? State two ways to potentially improve the goodness of fit.
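
The quantities referred to in (e) to (g) can be sketched as follows: the slope (point estimate), its t-statistic for the null hypothesis of a zero slope, R squared, and an out-of-sample prediction. This is a minimal ordinary least squares sketch on invented data; the function `simple_ols` and the sample values are assumptions for illustration, not the output expected from the assignment data.

```python
import statistics as st

def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mx, my = st.mean(x), st.mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    fitted = [intercept + slope * a for a in x]
    sse = sum((b - f) ** 2 for b, f in zip(y, fitted))  # residual sum of squares
    sst = sum((b - my) ** 2 for b in y)                 # total sum of squares
    se_slope = (sse / (n - 2) / sxx) ** 0.5             # standard error of the slope
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - sse / sst,
        "t_slope": slope / se_slope,  # test statistic for H0: slope = 0
    }

# Hypothetical sample: income (£) and five-year spend (£) in one store type.
income = [20000, 30000, 40000, 50000, 60000]
spend = [2100, 2900, 4200, 4800, 6000]

model = simple_ols(income, spend)

# Point prediction for the out-of-sample income quoted in (f).
predicted_spend = model["intercept"] + model["slope"] * 42586.96
```

For (f), the same prediction would be made from each of the two fitted models and the two figures added. The t-statistic is compared with the two-sided 5% critical value of the t distribution with n - 2 degrees of freedom (about 1.98 for a sample of 100).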
Now run two Multiple Linear Regression models (one for express and one for super stores, as in the case of the Simple Linear Regression models) with any additional variables of your choice provided in the dataset.
(h) Conduct hypothesis testing for both models at a 5% level of significance and comment on those hypotheses based on the regression output. Compare your results with the results obtained from the Simple Linear Regression models.
(i) Discuss the benefits and difficulties of using multiple regression (you can refer to the results obtained in this example if you wish).

 

Section B – Monte Carlo Simulations

AJS Ltd is a manufacturing company that performs contract work for a wide variety of firms. It primarily manufactures and assembles metal items, and so most of its equipment is designed for precision machining tasks. The executives of AJS are currently trying to decide between three product proposals (from three different contractors) for manufacturing next year. They consider two criteria in their decision: expected profits and risk. AJS expects a payment of a flat £8 per unit to manufacture the new product (i.e. each contractor is expected to pay AJS £8 per unit if their product is selected for manufacturing).
Demand
Demand for each of the three products is unknown. These three product demands are modelled as discrete random variables denoted D1, D2, and D3 (for the first, second, and third product respectively, in number of units) with the following probability distributions:

Demand      D1        P(D1)    D2        P(D2)    D3        P(D3)
Light       11,000    0.2      8,000     0.2      4,000     0.1
Moderate    16,000    0.6      19,000    0.4      21,000    0.5
Heavy       21,000    0.2      27,000    0.4      37,000    0.4
(a) Estimate the Expected Monetary Value (EMV) for each product.
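
An expected value of this kind is a probability-weighted sum over the demand outcomes. The sketch below assumes, since no costs have been introduced at this point in the brief, that the EMV in (a) is the expected payment at £8 per unit; other readings are possible. The names `demand`, `expected_demand` and `emv` are illustrative.

```python
PRICE_PER_UNIT = 8  # £ per unit, as stated in the brief

# (units, probability) pairs taken from the demand table.
demand = {
    "Product 1": [(11_000, 0.2), (16_000, 0.6), (21_000, 0.2)],
    "Product 2": [(8_000, 0.2), (19_000, 0.4), (27_000, 0.4)],
    "Product 3": [(4_000, 0.1), (21_000, 0.5), (37_000, 0.4)],
}

def expected_demand(outcomes):
    """Probability-weighted average number of units."""
    return sum(units * prob for units, prob in outcomes)

# Assumption: EMV is interpreted as expected revenue, price times expected demand.
emv = {name: PRICE_PER_UNIT * expected_demand(o) for name, o in demand.items()}
```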
They now decide to do a more detailed analysis. The following information is available for the evaluation of the three product proposals:
Variable cost
Variable cost per unit changes each year, depending on the costs for material and labour. Let V1, V2, and V3 represent the three variable costs. The uncertainty surrounding each variable is represented by a normal distribution with mean £4 and standard deviation of £0.40.

 

Machine failure
Each year, AJS’s machines fail occasionally, but obviously it is impossible to predict when or how many failures will occur during the year. Each time a machine fails, it costs the firm £8,000. Let Z1, Z2, and Z3 represent the number of machine failures in each of the three products, and assume that each is a Poisson random variable with parameter λ=4.
Fixed Costs
Each year a fixed cost of £12,000 is incurred.
(b) For the situation described above, run a Monte Carlo simulation and present the results in relevant diagrams (you should at least have three graphs, each showing the simulated distribution of the contribution for each proposed product).
(c) Discuss the recommendations you would make to the management on the basis of these diagrams.
(d) Discuss the advantages of using Monte Carlo simulation instead of a deterministic analysis.
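
One way to set up the simulation in (b) is sketched below using only the standard library. It assumes that the contribution in each trial is (price - variable cost) × demand - failure cost × number of failures - fixed cost, and that all demanded units are produced and sold; both are interpretations of the brief rather than something it states outright. The Poisson draw uses Knuth's multiplication method because the `random` module has no Poisson sampler. Trial count and seeds are arbitrary.

```python
import math
import random
import statistics as st

PRICE = 8.0             # £ per unit received by AJS
FIXED_COST = 12_000.0   # £ per year
FAILURE_COST = 8_000.0  # £ per machine failure

DEMAND = {
    "Product 1": [(11_000, 0.2), (16_000, 0.6), (21_000, 0.2)],
    "Product 2": [(8_000, 0.2), (19_000, 0.4), (27_000, 0.4)],
    "Product 3": [(4_000, 0.1), (21_000, 0.5), (37_000, 0.4)],
}

def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate(outcomes, trials=10_000, seed=0):
    """Return a list of simulated yearly contributions (£) for one product."""
    rng = random.Random(seed)
    units = [u for u, _ in outcomes]
    probs = [p for _, p in outcomes]
    results = []
    for _ in range(trials):
        d = rng.choices(units, weights=probs)[0]  # discrete demand
        v = rng.gauss(4.0, 0.40)                  # variable cost per unit
        z = poisson(rng, 4.0)                     # number of machine failures
        results.append((PRICE - v) * d - FAILURE_COST * z - FIXED_COST)
    return results

runs = {name: simulate(o, seed=i) for i, (name, o) in enumerate(DEMAND.items())}
summary = {
    name: {
        "mean": st.mean(r),
        "std_dev": st.stdev(r),
        "prob_loss": sum(1 for c in r if c < 0) / len(r),
    }
    for name, r in runs.items()
}
```

Each list in `runs` can be passed to a plotting tool to draw the histogram requested in (b); the mean, standard deviation and probability of a loss give a numerical view of the expected-profit and risk criteria.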

 

Section C – Multicriteria Decision Analysis

You are acting as a consultant for an estate agency in data analysis tasks that enhance decision making. A client of the agency would like to invest in a buy-to-let property in the area of Hampshire. Four properties have been selected based on filtering that matches both the profile of potential tenants and the needs of the client. In particular, the criteria based on which the four properties will be evaluated are as follows:
• Price (in £): the overall price paid by the client incl. any fees, land tax etc;
• Size (in square meters): the overall size of the interior, living space;
• Rooms: the number of rooms in the property;
• Age (in years): the number of years since the property was first constructed;
• Area’s projected price growth in the next 5 years: figures based on the agency’s estimates;
• Nearby amenities: the number of amenities within a 100-yard radius;
• Noise during peak time (in dB): noise measured in the interior of the house during the peak-hour window of 15:00-18:00.

 

Required
You are required to make use of the PROMETHEE and AHP methods, in order to answer the following two questions and sub-questions:
Q1. Having surveyed your client, you find that her preferences are as follows:

Making use of AHP (through the Expert Choice software):
a. Find the weights that correspond to those preferences and report them;
b. Is the client consistent in her choices?
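
For orientation, the calculation behind AHP weights and the consistency check can be sketched as below: weights are taken as the normalised principal eigenvector of the pairwise comparison matrix (approximated here by power iteration), and the consistency ratio is CR = CI / RI with CI = (λmax - n) / (n - 1). The 3×3 matrix is hypothetical, since the client's actual preferences are not reproduced here, and the exact procedure used inside Expert Choice may differ. The random index values are Saaty's commonly quoted ones.

```python
# Saaty's commonly quoted random consistency index, keyed by matrix size.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}

def ahp_weights(matrix, iterations=100):
    """Approximate the normalised principal eigenvector by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [v / total for v in nxt]
    return w

def consistency_ratio(matrix, weights):
    """CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]

# Hypothetical pairwise comparisons for three criteria (e.g. price, size, noise):
# entry [i][j] says how strongly criterion i is preferred to criterion j.
comparisons = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights = ahp_weights(comparisons)
cr = consistency_ratio(comparisons, weights)
```

A consistency ratio below roughly 0.10 is conventionally read as acceptably consistent judgement; the same functions apply to a 7×7 matrix once the seven criteria have been compared.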

Q2. Using the obtained weights from Q1 and the data from Table 1, in the Smart Picker Pro software, set the parameters according to the objective (Maximise/Minimise). Then answer the following:
a. Which property offers the best performance?
b. Which property offers the least regret?
c. Taking both performance and regret aspects combined, which property is overall performing the best taking into account all conflicting criteria?
d. Is there a significant difference compared to the second-best property?

How could we decompose the overall performance back to the elementary criteria?
e. Display and comment on the GAIA plane.
f. If the construction company fits an extra layer of sound-proofing materials, the noise levels in Property B can be lowered to 60 dB. How would that change your analysis, and what can you infer from this change as to the robustness of your evaluation?
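
The flows behind Q2 can be sketched as follows, reading "performance" as the positive outranking flow, "regret" as the negative flow, and the combined ranking as the net flow (PROMETHEE II). This sketch uses the simplest ("usual") preference function, which gives full preference to any favourable difference; the properties, criterion values and weights are invented, and the function `promethee_flows` is illustrative rather than a reproduction of what Smart Picker Pro computes.

```python
def promethee_flows(alternatives, weights, maximise):
    """Return {name: (phi_plus, phi_minus, phi_net)} under the usual criterion.

    alternatives: dict mapping a name to its list of criterion values.
    weights:      criterion weights summing to 1.
    maximise:     one bool per criterion (False means lower is better).
    """
    names = list(alternatives)
    n = len(names)

    def preference(a, b):
        """Weighted share of criteria on which a is strictly better than b."""
        total = 0.0
        for k, w in enumerate(weights):
            d = alternatives[a][k] - alternatives[b][k]
            if not maximise[k]:
                d = -d
            if d > 0:
                total += w
        return total

    flows = {}
    for a in names:
        plus = sum(preference(a, b) for b in names if b != a) / (n - 1)
        minus = sum(preference(b, a) for b in names if b != a) / (n - 1)
        flows[a] = (plus, minus, plus - minus)
    return flows

# Hypothetical data: two criteria only, price (minimise) and size (maximise).
properties = {
    "A": [300_000, 90],
    "B": [250_000, 80],
    "C": [280_000, 100],
}
flows = promethee_flows(properties, weights=[0.6, 0.4], maximise=[False, True])
best = max(flows, key=lambda name: flows[name][2])
```

For a robustness check such as the one in (f), the changed value would be substituted into the data and the flows recomputed; if the ranking by net flow is unchanged, the evaluation is stable with respect to that change.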

 
