Customer Churn Prediction Analysis Assignment


Assignment Task

A telephone company wants to determine which customer characteristics are useful for predicting churn, that is, which customers will leave its service. Your task is to uncover patterns in the customer data that will help the company identify which types of customers are most (or least) likely to churn.

Some tasks might have already been covered earlier. However, you may not be able to complete this project in a day or two. I recommend that you "move in" with this project and get really into it as early as you can. I encourage you to go beyond the project, but only after you have covered all the required stuff.


Data Preparation and Exploration

1. Perform data preparation on the data set, if needed. Give evidence that there are no problems with data quality or missing data. Which variables show anomalous behavior? How shall we deal with this? Which field acts as a surrogate for the ID field and will therefore yield no usable statistical or graphical information?
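One way to gather evidence for task 1 is to count missing and distinct values per field. The sketch below assumes the records have already been loaded as a list of dictionaries; the field names (`phone`, `state`, `day_mins`) and all values are made up for illustration and are not taken from the assignment data set.

```python
def profile_columns(rows):
    """Return {column: {"missing": n, "distinct": n}} for a list of dict rows."""
    profile = {}
    for column in rows[0]:
        values = [row.get(column) for row in rows]
        present = [v for v in values if v not in (None, "")]
        profile[column] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return profile


def id_like_columns(rows):
    """Columns with no missing values and one distinct value per row."""
    n = len(rows)
    return [
        column
        for column, stats in profile_columns(rows).items()
        if stats["missing"] == 0 and stats["distinct"] == n
    ]


# Made-up records with hypothetical field names, for illustration only.
rows = [
    {"phone": "555-0101", "state": "KS", "day_mins": "265.1"},
    {"phone": "555-0102", "state": "OH", "day_mins": ""},
    {"phone": "555-0103", "state": "KS", "day_mins": "243.4"},
    {"phone": "555-0104", "state": "NJ", "day_mins": "299.4"},
]

for column, stats in profile_columns(rows).items():
    print(column, stats)
print("ID-like fields:", id_like_columns(rows))
```

A field flagged as ID-like is only a candidate: a continuous numeric variable can also have all-distinct values, so the flag should be checked against what the field actually means.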

2. Examine the variables graphically.

  • For two interesting categorical variables, construct a distribution of the variable. Comment on each.
  • Examine the distribution of all numeric variables, using histograms. (Need not include in report.) Make a little table listing (in alphabetic order) the variables which are not normally distributed, along with the transformation function needed to induce normality (e.g., log transformation). For those variables, perform the transformation to induce approximate normality. Provide before/after histograms for all such variables.
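As a rough aid for choosing transformations in task 2, skewness can be compared before and after a candidate transformation. The sketch below uses the population skewness and `log(1 + x)`; both are assumptions chosen for illustration, and the sample values are made up rather than drawn from the data set.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean cubed z-score (near 0 when symmetric)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def log_transform(values):
    """Apply log(1 + x) so that zeros are allowed; requires x >= 0."""
    return [math.log1p(v) for v in values]


# Made-up right-skewed values, for illustration only.
toy = [0, 1, 1, 2, 2, 3, 4, 6, 10, 25]

before = skewness(toy)
after = skewness(log_transform(toy))
print(f"skewness before: {before:.2f}, after: {after:.2f}")
```

A skewness closer to zero suggests the transformed variable is more symmetric, but it does not prove normality; the before/after histograms requested above remain the primary evidence.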

3. Examine the variables statistically.

  • For all the numeric variables, find the mean, median, standard deviation, minimum, and maximum. Put the results in a table, with the variables in alphabetical order.
  • Normalize all the numeric variables, using either

(i) z-scores, or

(ii) min-max normalization [(value - min) / range].
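The summary table and the two normalization options in task 3 can be sketched as follows. The column names and values are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption; use whichever convention your tool reports.

```python
import statistics


def summarize(values):
    """Mean, median, sample standard deviation, minimum, and maximum."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def z_scores(values):
    """(value - mean) / standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def min_max(values):
    """(value - min) / range, which maps the data onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Made-up columns with hypothetical names, for illustration only.
columns = {
    "eve_mins": [197.4, 195.5, 121.2, 61.9, 148.3],
    "day_mins": [265.1, 161.6, 243.4, 299.4, 166.7],
}

# Print the summary table with the variables in alphabetical order.
for name in sorted(columns):
    stats = summarize(columns[name])
    print(name, {key: round(value, 2) for key, value in stats.items()})

scaled = min_max(columns["day_mins"])
standardized = z_scores(columns["day_mins"])
```

Min-max normalization is undefined for a constant column (zero range), and z-scores are undefined when the standard deviation is zero, so such columns need separate handling.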

4. Relationships between variables.

  • Plot Day Mins vs. Day Charge. Comment. How shall we deal with this?
  • Construct a scatter plot between any two numeric variables that you find interesting (other than the pair in the previous bullet).
  • Using the statistics node, report any high correlations between any two variables. What would be the effect of keeping two highly correlated variables in the model? What should be done?
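If you want to cross-check the correlations outside the statistics node, a Pearson correlation can be computed directly. In the sketch below, the column names, the flat per-minute rate, the 0.9 threshold, and all values are assumptions made for illustration. A charge that is a fixed rate times the minutes will correlate perfectly with the minutes, which is why one variable of such a pair is usually dropped.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def high_correlations(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Made-up values; the 0.10 per-minute rate is a hypothetical flat tariff.
minutes = [120.0, 180.5, 95.2, 240.0, 310.7]
columns = {
    "day_mins": minutes,
    "day_charge": [0.10 * m for m in minutes],
    "night_calls": [5, 3, 4, 3, 5],
}

flagged = high_correlations(columns)
for a, b, r in flagged:
    print(f"{a} vs {b}: r = {r:.3f}")
```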

5. Data Manipulation

  • Apart from churners/non-churners, are there any interesting subsets of records to be selected for special attention? Selecting and analyzing them may increase the precision of your analysis for important subsets of customers. Why do you find this subset of data interesting and useful? Provide graphics and descriptive statistics that describe the behavior of your subset.
  • Discretize (make categorical) a relevant numeric variable which you think will be explanatory of churn. The cut points can be chosen by inspecting the variable's histogram.
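Once cut points have been read off a histogram, the discretization in task 5 reduces to a lookup. The sketch below is one possible approach; the cut points, labels, and values are hypothetical and should be replaced by ones justified by your own histogram.

```python
from bisect import bisect_right


def discretize(values, cut_points, labels):
    """Map each value to a label; cut_points must be sorted ascending.

    With cut points [c1, c2] the bins are (-inf, c1), [c1, c2), [c2, +inf).
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return [labels[bisect_right(cut_points, v)] for v in values]


# Made-up counts and hypothetical cut points, for illustration only.
values = [0, 1, 3, 4, 7]
binned = discretize(values, cut_points=[1, 4], labels=["low", "medium", "high"])
print(binned)
```

A value equal to a cut point falls into the higher bin here; state whichever boundary convention you use in the report.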

6. With a view to uncovering customer churn patterns, investigate how each relevant variable is associated with Churn.

  • For each relevant categorical variable, construct a distribution of the variable with a churn overlay. You may wish to normalize to increase contrast.
  • For each relevant numerical variable, construct a histogram of the variable with a churn overlay. You may wish to normalize to increase contrast.
  • For the subset of records you selected in the first bullet of task 5, compare the proportion of churners in this subset with the proportion of churners not in the subset. (Note: Don't compare against the entire data set, which includes your subset.)
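The subset comparison in task 6 can be sketched as below. The key point is that the second group is the complement of the subset, not the whole data set. The `plan` field, the `churn` flag, and all records are made up for illustration.

```python
def churn_rate(records):
    """Proportion of records whose 'churn' flag is True."""
    if not records:
        raise ValueError("cannot compute a rate for an empty group")
    return sum(1 for r in records if r["churn"]) / len(records)


def compare_subset(records, in_subset):
    """Return (churn rate inside the subset, churn rate in the complement)."""
    subset = [r for r in records if in_subset(r)]
    rest = [r for r in records if not in_subset(r)]
    return churn_rate(subset), churn_rate(rest)


# Made-up records with a hypothetical 'plan' flag, for illustration only.
records = (
    [{"plan": True, "churn": c} for c in (True, True, False, False)]
    + [{"plan": False, "churn": c} for c in (True, False, False, False, False, False)]
)

subset_rate, rest_rate = compare_subset(records, lambda r: r["plan"])
print(f"subset: {subset_rate:.1%}, rest: {rest_rate:.1%}")
```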

7. Find a pair of numeric variables which are interesting with respect to churn. That is, for a pair of variables, construct a scatter plot with a churn overlay. If things look uniform, then this is not particularly interesting. We are looking for differences within the scatter plot (churn vs. non-churn), which can help us understand the relationship between the two variables with churn. Now, if there seems to be a horizontal or vertical differentiation, then this is not interesting, as the churn behavior is altering only along one of the axes. We want to find churn behavior changing simultaneously along both axes.

Model Building

8. Which variables are you including in your models to predict Churn? Choose carefully to balance accuracy, generality, and interpretability.

Important: Provide a table of ALL the variables in the original data set, ranked by your judgment of their importance in predicting churn based on your work so far (show most important to least important). Also, provide a brief justification for either including or discarding the variable for your working models.

9. Develop a model of your choice (e.g., K-NN or Decision Tree) for predicting Churn. Use cross validation to measure the performance of the model. Explain the measures of validation in the confusion matrix.
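To make the validation measures in task 9 concrete, the sketch below runs k-fold cross validation with a small k-NN classifier and accumulates a confusion matrix across folds. It is an illustration only, not a required method: the data are two made-up, well-separated clusters, so the perfect score it produces says nothing about the real churn data.

```python
import random
from collections import Counter


def k_fold_indices(n, folds, seed=0):
    """Shuffle the indices 0..n-1 and deal them into `folds` groups."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::folds] for i in range(folds)]


def knn_predict(train_x, train_y, point, k=3):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    def distance(i):
        return sum((a - b) ** 2 for a, b in zip(train_x[i], point))
    nearest = sorted(range(len(train_x)), key=distance)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cross_validate(xs, ys, folds=5, k=3):
    """Accumulate confusion-matrix counts over all folds (churn = positive)."""
    cm = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for test_idx in k_fold_indices(len(xs), folds):
        held_out = set(test_idx)
        train = [i for i in range(len(xs)) if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in test_idx:
            predicted = knn_predict(train_x, train_y, xs[i], k)
            if predicted:
                cm["tp" if ys[i] else "fp"] += 1
            else:
                cm["fn" if ys[i] else "tn"] += 1
    return cm


def metrics(cm):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = sum(cm.values())
    predicted_pos = cm["tp"] + cm["fp"]
    actual_pos = cm["tp"] + cm["fn"]
    return {
        "accuracy": (cm["tp"] + cm["tn"]) / total,
        "precision": cm["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": cm["tp"] / actual_pos if actual_pos else 0.0,
    }


# Made-up, clearly separable toy data: 10 non-churners and 10 churners.
xs = [(i, i) for i in range(10)] + [(100 + i, 100 + i) for i in range(10)]
ys = [False] * 10 + [True] * 10

cm = cross_validate(xs, ys)
scores = metrics(cm)
print(cm, scores)
```

Here precision is the share of predicted churners who actually churned, and recall is the share of actual churners the model caught. On imbalanced churn data, these two are usually more informative than accuracy alone.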

10. Report your findings in an Executive Summary, along with supporting evidence and a list of Recommendations (or reflections) for the company executives.
