AFIN8015 - Financial Data Science and Data Analysis Assignment

Assignment Task

Scenario

You are an intern data scientist at Shootingformars Corp. with data analysis and machine learning skills, particularly in the financial services sector. As it happens, your mentor Ms Fowler has just been approached by Mr Musk, a new client who has recently started investing in the share market and has been using past information to make his daily trading decisions. Mr Musk has also been researching in his spare time and has heard that modern Data Science methods such as Machine Learning can be used to predict the price direction of stocks and other financial assets. Unfortunately, he has a limited understanding of the Data Science process and limited programming skills, but he does have some background in statistics.

After the initial meeting with Mr Musk, Ms Fowler has decided to treat this as an educational/proof-of-concept project and has brought you on board to conduct the analysis and prepare the project documentation. Ms Fowler has assigned a publicly traded stock listed in and given you a set of tasks, as listed in Part-I and Part-II of this document. Part-I aims to give Mr Musk a better understanding of Data Science and descriptive statistics through statistics and visualisation. Part-II uses the stock assigned to you to conduct a classification exercise for demonstration. The task requires you to create a professional-standard document to be presented to the client. You may either follow a traditional workflow, writing a Word document and coding the methods separately in R before bringing everything together in one document, or use a reproducible method with an RMarkdown file.

Data Science Concepts & Descriptive Analysis

A. Explain the concept of Data Science, and outline and explain the Life Cycle of a Data Science Project. Use a research-informed example from the financial services sector to explain the Life Cycle of a Financial Data Science project.

B.1. Use FACTSET and download the daily Open, High, Low and Close (OHLC) Prices and Trading Volume for the company stock assigned to you from 01-July-2020 to 17-March-2023. (2 marks)

2. Use the closing prices (the price column from the downloaded data) and the percentage logarithmic returns of the closing prices to generate descriptive statistics (including Skewness, Kurtosis and a Test for Normal Distribution). Present the statistics in the document and briefly discuss the range, distribution and tail behaviour of the price and return series. Keep the discussion brief and to the point; remember that your client has some statistical background and understanding of the stock market.
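The brief asks for the analysis in R; purely to illustrate the arithmetic behind these statistics, the following is a minimal Python sketch. It assumes percentage log returns are defined as 100 * ln(P_t / P_{t-1}) and uses the Jarque-Bera statistic as one possible normality test (the brief does not name a specific test). The function names and the price list are made up for illustration.

```python
import math

def log_returns_pct(prices):
    """Percentage logarithmic returns: 100 * ln(P_t / P_{t-1})."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def describe(x):
    """Moment-based descriptive statistics plus a Jarque-Bera normality test."""
    n = len(x)
    mean = sum(x) / n
    # Central moments, population form (divisor n)
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Under normality JB is asymptotically chi-squared with 2 degrees of
    # freedom, whose survival function is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return {
        "n": n,
        "min": min(x),
        "max": max(x),
        "mean": mean,
        "sd": math.sqrt(m2 * n / (n - 1)),  # sample standard deviation
        "skewness": skew,
        "kurtosis": kurt,
        "jarque_bera": jb,
        "p_value": p_value,
    }

# Hypothetical closing prices, for illustration only
prices = [100.0, 101.5, 99.8, 102.3, 103.1, 101.9, 104.4]
stats = describe(log_returns_pct(prices))
```

A kurtosis well above 3 would point to fat tails, and a small Jarque-Bera p-value would argue against normality. On a short sample the asymptotic p-value is only a rough guide.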

C.1. Use the OHLC prices and the Volume data to plot the following charts for the last six (6) months of the data:

(a) Line Chart

(b) Candlestick Chart

(c) Add the following technical indicators to the candlestick chart:

  • 5 Day Exponential Moving Average
  • 5 Day Momentum
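To make the two indicators concrete, here is a minimal Python sketch of how they can be computed before plotting. It rests on two assumptions that the brief does not state: the EMA uses the smoothing factor 2 / (n + 1) and is seeded with the simple average of the first n prices, and momentum is the price change P_t - P_{t-n}. Charting packages may use different conventions, so the values should be checked against whichever tool produces the chart.

```python
def ema(prices, n=5):
    """n-day exponential moving average, seeded with the mean of the first n prices."""
    if len(prices) < n:
        raise ValueError("need at least n prices")
    alpha = 2.0 / (n + 1)
    out = [None] * (n - 1)          # not defined until n prices are available
    value = sum(prices[:n]) / n     # seed value (assumption)
    out.append(value)
    for p in prices[n:]:
        value = alpha * p + (1 - alpha) * value
        out.append(value)
    return out

def momentum(prices, n=5):
    """n-day momentum as the price difference P_t - P_{t-n} (assumption)."""
    return [None] * n + [prices[i] - prices[i - n] for i in range(n, len(prices))]

# Hypothetical closing prices, for illustration only
closes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ema5 = ema(closes, n=5)
mom5 = momentum(closes, n=5)
```

Both series are aligned with the input, with None marking the days on which the indicator is not yet defined.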

2. Comment on the trend and price direction based on the plots generated in 1 above.

Classification Models & Application

D.1. As Mr Musk has limited exposure to Machine Learning (ML) and the various methods in ML, you are tasked to conduct a short review of ML and ML methods with a focus on Classification models. Your review should also include the following:

(a) An overview of Machine Learning and the ML process.

(b) Discussion on Supervised and Unsupervised Machine Learning and two classification models, at least one of which hasn’t been discussed in the class notes.

(c) As the modelling task requires you to conduct a price direction forecast exercise, the review should also include at least one example of previous research using ML for stock price movement/direction prediction.

Your final task is to conduct a proof of concept comparative analysis of two classification methods to demonstrate classification and predictive ability of ML methods in modelling and predicting the price direction based on various technical indicators. Specifically, the task should conduct the following:

E.1. Select the closing prices from the OHLC stock price data downloaded from FACTSET (same as in Task 2) and create the one-period lags of the following technical indicators:

(a) Simple Moving Average: 10 day moving average.

(b) Log returns

(c) Simple returns

(d) Exponential Moving Average: 10 day

(e) Momentum:
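The point of the one-period lag is that the predictors for day t use only information available at the close of day t-1. A minimal Python sketch of the simple moving average, the two return definitions and the lag follows; the helper names are hypothetical and the prices are invented for illustration.

```python
import math

def sma(prices, n=10):
    """n-day simple moving average, aligned with the input series."""
    out = [None] * (n - 1)
    for i in range(n - 1, len(prices)):
        out.append(sum(prices[i - n + 1:i + 1]) / n)
    return out

def simple_returns(prices):
    """Simple returns P_t / P_{t-1} - 1."""
    return [None] + [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]

def log_returns(prices):
    """Log returns ln(P_t / P_{t-1})."""
    return [None] + [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def lag(series, k=1):
    """Shift a series so that position t holds the value from t - k."""
    return [None] * k + series[:len(series) - k]

# Hypothetical closing prices, for illustration only
closes = [100.0, 110.0, 121.0, 115.0]
features = {
    "sma_lag1": lag(sma(closes, n=2)),      # n=2 only to suit the short toy series
    "logret_lag1": lag(log_returns(closes)),
    "ret_lag1": lag(simple_returns(closes)),
}
```

Rows that still contain None after lagging would need to be dropped before modelling.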

2. Create a dichotomous price direction indicator output variable based on the 4-day lagged price.
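The brief does not spell out the rule, so the sketch below assumes one plausible reading: the direction is 1 ("up") when today's close is above the close four trading days earlier, and 0 ("down") otherwise. The function name and prices are hypothetical.

```python
def price_direction(prices, lag=4):
    """1 if P_t > P_{t-lag}, else 0; None while no lagged price exists (assumed rule)."""
    return [None] * lag + [
        1 if prices[i] > prices[i - lag] else 0
        for i in range(lag, len(prices))
    ]

# Hypothetical closing prices, for illustration only
closes = [10.0, 11.0, 12.0, 13.0, 9.0, 12.0, 12.0, 14.0]
direction = price_direction(closes)
```

Under this reading an unchanged price counts as "down"; a different tie rule would only require changing the comparison.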

3. Combine the indicators in a data frame and visualise the data using:

(a) A time series plot, and

(b) Box plots of indicators categorised by price direction.

Judging from the box plots in (b), is there any technical indicator whose values differ across the outcome variable and which may therefore be able to model the outcomes? Provide a short comment.

4. Create a 70:30 training and testing sample from the dataset and conduct a classification exercise using Logistic Regression. The analysis should include the following:

(a) Training on the training sample using ‘timeslice’ sampling. Use a fixed window of at least 250 days and a prediction horizon of 20 days.

(b) Data pre-processing to standardise the data.

(c) Prediction on the test set and corresponding confusion matrix.

(d) Brief discussion on the accuracy of the prediction based on the confusion matrix.
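The mechanics around the model in steps (a) to (d) can be sketched independently of the classifier. The Python below is a minimal illustration under stated assumptions: the 70:30 split is chronological, the time slices roll forward one day at a time with a fixed 250-day training window and a 20-day horizon, and standardisation uses the mean and standard deviation of the training data only. All function names are hypothetical, and the logistic regression itself is not shown.

```python
import math

def chronological_split(rows, train_pct=70):
    """Split in time order; the first train_pct percent of rows form the training set."""
    cut = len(rows) * train_pct // 100
    return rows[:cut], rows[cut:]

def time_slices(n_obs, window=250, horizon=20):
    """Fixed-width rolling windows: fit on `window` rows, assess on the next `horizon`."""
    slices = []
    start = 0
    while start + window + horizon <= n_obs:
        fit_idx = range(start, start + window)
        assess_idx = range(start + window, start + window + horizon)
        slices.append((fit_idx, assess_idx))
        start += 1  # roll forward one day (assumption)
    return slices

def fit_scaler(column):
    """Mean and sample standard deviation, estimated on training data only."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (len(column) - 1))
    return mean, sd

def standardise(column, mean, sd):
    """Apply a previously fitted scaler to any column (training or test)."""
    return [(v - mean) / sd for v in column]

def confusion_matrix(actual, predicted):
    """Counts for a binary outcome where 1 = up and 0 = down."""
    cm = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            cm["TP"] += 1
        elif a == 0 and p == 0:
            cm["TN"] += 1
        elif a == 0 and p == 1:
            cm["FP"] += 1
        else:
            cm["FN"] += 1
    return cm

def accuracy(cm):
    """Share of correct predictions: (TP + TN) / total."""
    return (cm["TP"] + cm["TN"]) / sum(cm.values())

# Illustration with made-up labels
cm = confusion_matrix([1, 0, 1, 0], [1, 1, 0, 0])
acc = accuracy(cm)
```

Fitting the scaler on the training sample and reusing it on the test sample keeps test-period information out of the model. Accuracy should also be read against the share of the majority class, since a classifier that always predicts "up" can look accurate in a rising market.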

5. Conduct the classification exercise (in 4 above) using k-Nearest Neighbours algorithm. The analysis should include the following:

(a) An odd-number grid search for the ‘k’ parameter from 1 to 40.

(b) Prediction on the test set and corresponding confusion matrix.

(c) Brief discussion on the accuracy of the prediction based on the confusion matrix.
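As an illustration of the odd-number grid search, here is a minimal pure-Python k-NN sketch. It assumes Euclidean distance on already standardised features, majority voting, and selection of k by accuracy on a held-out validation set (in the assignment the time-slice resamples would play that role). The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def odd_k_grid(k_max=40):
    """Odd candidate values 1, 3, ..., up to k_max."""
    return list(range(1, k_max + 1, 2))

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def knn_accuracy(train_X, train_y, val_X, val_y, k):
    """Share of validation points classified correctly for a given k."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(val_X, val_y))
    return hits / len(val_y)

def best_odd_k(train_X, train_y, val_X, val_y, k_max=40):
    """Evaluate every odd k that fits the training size; prefer the smallest k on ties."""
    grid = [k for k in odd_k_grid(k_max) if k <= len(train_X)]
    scores = {k: knn_accuracy(train_X, train_y, val_X, val_y, k) for k in grid}
    best = max(grid, key=lambda k: (scores[k], -k))
    return best, scores

# Toy one-feature data, for illustration only
train_X = [(0.0,), (0.1,), (0.2,), (1.0,), (1.1,), (1.2,)]
train_y = [0, 0, 0, 1, 1, 1]
val_X = [(0.15,), (0.95,)]
val_y = [0, 1]
best_k, scores = best_odd_k(train_X, train_y, val_X, val_y)
```

Restricting k to odd values avoids tied votes when the outcome has two classes, which is presumably why the brief asks for an odd-number grid.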

6. Compare the performance of the Logistic Regression Model and k-NN model based on their accuracy (based on the confusion matrix from the two models) and provide a recommendation for Mr Musk.
