Task:
Once you have completed this assignment, you will upload two files into Blackboard: The .ipynb file that you create in Jupyter Notebook, and an .html file that was generated from your .ipynb file. If you run into any trouble with submitting the .html file to Blackboard, you can submit it as a PDF instead.
For any question that asks you to perform some particular task, you just need to show your input and output in Jupyter Notebook. Tasks will always be written in regular, non-italicized font.
For any question that asks you to include interpretation, write your answer in a Markdown cell in Jupyter Notebook. Any homework question that needs interpretation will be written in italicized font. Do not simply write your answer in a code cell as a comment, but use a Markdown cell instead.
Remember to be resourceful! There are many helpful resources available to you, including the video library, the class slides, the recitation sessions, the Zoom office hours sessions, and the web.
Part I: DecisionTreeRegressor
This homework will have a somewhat different feel than the other ones that we have done this semester -- instead of using a step-by-step set of directions, it will be a bit more general. A dataset description for marketing_data will be posted to Blackboard. Our goal with this model will be to use a decision tree to predict how much a customer is likely to spend on fish products.
1. Read the file marketing_data.csv into your environment.
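A minimal sketch of this step, assuming pandas is available in your notebook. The helper name `load_marketing_data` and the column names in the in-memory sample are made up for illustration; the real column names come from the dataset description on Blackboard.

```python
import io

import pandas as pd


def load_marketing_data(source):
    """Read a CSV into a DataFrame; `source` may be a file path or a file-like object."""
    return pd.read_csv(source)


# In the notebook this would typically be:
#     df = load_marketing_data("marketing_data.csv")
# A tiny in-memory stand-in (hypothetical columns) keeps this sketch runnable
# without the real file.
sample_csv = io.StringIO(
    "customer_id,income,education,fish_spend\n"
    "1,52000,Graduate,40\n"
    "2,,PhD,15\n"
    "3,61000,Graduate,75\n"
)
df = load_marketing_data(sample_csv)
print(df.shape)
```

An empty field in the CSV, such as the missing income in the second row, is read as NaN, which matters later when you decide which variables need handling.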
2. Visualizations
- Use any 5 visualizations to learn more about your dataset. For each visualization that you generate, write about it in a sentence or two. You can use any type of visualization from class material, or anywhere else -- but use an appropriate visualization type (e.g. scatterplot for a relationship between two continuous numeric values, a histogram for the distribution of a single numeric value, a barplot to show averages or counts among categories, etc.)
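As a rough illustration of matching plot type to variable type, the sketch below draws a histogram, a scatterplot, and a barplot with matplotlib. The DataFrame and its column names are invented stand-ins, not the real marketing_data columns, and the `Agg` backend line is only there so the sketch runs outside a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for this sketch; leave this out in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data; substitute the real DataFrame and column names.
df = pd.DataFrame({
    "income": [52000, 48000, 61000, 38000, 75000, 43000],
    "education": ["Graduate", "PhD", "Graduate", "Graduate", "PhD", "PhD"],
    "fish_spend": [40, 15, 75, 5, 120, 10],
})

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

# Histogram: distribution of a single numeric variable.
axes[0].hist(df["fish_spend"], bins=5)
axes[0].set_title("Distribution of fish spending")

# Scatterplot: relationship between two continuous numeric variables.
axes[1].scatter(df["income"], df["fish_spend"])
axes[1].set_title("Income vs. fish spending")

# Barplot: an average compared across categories.
means = df.groupby("education")["fish_spend"].mean()
axes[2].bar(means.index, means.values)
axes[2].set_title("Mean fish spending by education")

fig.tight_layout()
```

Each panel would still need its own one- or two-sentence write-up in a Markdown cell.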
3. Data Preparation
- Explore your dataset visually (with the head() function) and also with the .info() function.
- Which variables here look like they’re ready for modeling? (No NaN values, already in 0/1 format if categorical, etc.?)
- Which variables appear to require some additional handling steps in order to prepare them for use in a model?
- Are there any variables here that will not have any predictive power in a model?
  - If so, you should remove them.
- For any variables that require further handling, perform the needed steps.
- For each variable that requires some sort of data handling work, describe the process that you used for dealing with this variable in a sentence or two.
- Partition your data, using a 60/40 train/test split.
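The preparation steps above might look roughly like the following sketch, assuming pandas and scikit-learn. Everything specific here is an assumption for illustration: the stand-in data, the column names, the choice of median imputation, the `max_depth` value, and the random seed. Your own handling decisions should follow from what `.info()` shows for the real dataset.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Hypothetical stand-in data; real column names come from the dataset description.
df = pd.DataFrame({
    "customer_id": range(1, 11),
    "income": [52000, None, 61000, 38000, 75000, 43000, None, 58000, 67000, 49000],
    "education": ["Graduate", "PhD", "Graduate", "Basic", "PhD",
                  "Basic", "Graduate", "PhD", "Graduate", "Basic"],
    "fish_spend": [40, 15, 75, 5, 120, 10, 30, 60, 90, 20],
})

# Inspect the first rows, then dtypes and non-null counts.
print(df.head())
df.info()

# An arbitrary identifier has no predictive power, so drop it.
prepared = df.drop(columns=["customer_id"])

# Convert the categorical column to 0/1 dummy columns.
prepared = pd.get_dummies(prepared, columns=["education"], drop_first=True, dtype=int)

X = prepared.drop(columns=["fish_spend"])
y = prepared["fish_spend"]

# 60/40 train/test split; the seed is arbitrary and only makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=654
)

# Fill missing income using the training median only (one option among several),
# so no information from the test rows leaks into the training data.
income_median = X_train["income"].median()
X_train = X_train.fillna({"income": income_median})
X_test = X_test.fillna({"income": income_median})

# Fit a shallow regression tree on the training partition.
tree = DecisionTreeRegressor(max_depth=3, random_state=654)
tree.fit(X_train, y_train)
print(tree.score(X_test, y_test))  # R^2 on the held-out 40%
```

Computing the imputation value after the split, rather than on the full dataset, is a design choice that keeps the test partition untouched until evaluation.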