Check the attachments

Please read the instructions and questions in the “Assignment_4_2023_Fall.pdf” file carefully and use “Auto.csv” to finish the assignment. You should submit both 1) your R code and 2) a PDF report with your answers through the link “Submit Assignment 4 Here”.

Guidelines:

· Use only R for this assignment

· Submit both R code and Report on findings

· Work is to be done individually for this assignment

Fitting a Classification Tree

1. This problem involves the OJ data set, which is part of the ISLR package. (Hint: the first three lines of code should be library(tree), library(ISLR), attach(OJ).)

1.1 Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations. Take a screenshot of your code. (Hint: set.seed (2), train=sample())

1.2 Fit a tree to the training data, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics about the tree. Take a screenshot of the summary statistics. How many terminal nodes does the tree have? What is the training misclassification error rate?

1.3 Plot the tree and take a screenshot of the tree (Hint: plot() and text())

1.4 Predict the response on the test data, and produce a confusion matrix comparing the test labels to the predicted test labels. What is the accuracy rate?
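
As a rough illustration of steps 1.1 through 1.4, the sketch below splits OJ, fits a classification tree, and scores it on the test set. It assumes the tree and ISLR packages are installed. The object names (train, OJ.test, tree.oj, conf.mat, accuracy) are illustrative choices, not names required by the assignment, and the data = argument is used in place of attach(OJ).

```r
# Sketch for 1.1-1.4 (object names are assumptions, not prescribed by the assignment)
library(tree)   # tree() and its predict/plot methods
library(ISLR)   # provides the OJ data set

set.seed(2)
train <- sample(1:nrow(OJ), 800)   # row indices of the 800 training observations
OJ.test <- OJ[-train, ]            # all remaining rows form the test set

# Purchase is the response; every other column is a predictor
tree.oj <- tree(Purchase ~ ., data = OJ, subset = train)
print(summary(tree.oj))            # terminal nodes and training misclassification rate

# Draw the tree and label its splits
plot(tree.oj)
text(tree.oj, pretty = 0)

# Predicted class labels for the held-out rows
tree.pred <- predict(tree.oj, OJ.test, type = "class")

# Confusion matrix: rows are predicted labels, columns are true labels
conf.mat <- table(Predicted = tree.pred, Actual = OJ.test$Purchase)
print(conf.mat)

# Accuracy = correctly classified test observations / all test observations
accuracy <- sum(diag(conf.mat)) / sum(conf.mat)
print(accuracy)
```

The diagonal of the confusion matrix holds the correctly classified observations, so accuracy is the diagonal sum divided by the table total.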

1.5 Apply the cv.tree() function to the training set in order to determine the optimal tree size. (Use set.seed(7)). Print the results (Hint: the results should contain the size, k, method etc).

1.6 Produce a plot with tree size (i.e. size) on the x-axis and cross-validated classification error rate (i.e. dev) on the y-axis.

1.7 Which tree size corresponds to the lowest cross-validated classification error rate (i.e. dev)?

1.8 Produce a pruned tree corresponding to the optimal tree size obtained using cross-validation. Take a screenshot of a pruned tree. What is the accuracy rate for the pruned tree? Is it improved compared to the accuracy rate in (1.4)?

1.9 If cross-validation does not lead to selection of a pruned tree (i.e. the accuracy rate produced in (1.8) is lower than the one in (1.4)), then create a pruned tree with five terminal nodes. What is the accuracy rate now?
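
Steps 1.5 through 1.9 could be sketched as follows, again assuming the tree and ISLR packages. The helper test_accuracy() and all object names are hypothetical, introduced only for this example. Note that with FUN = prune.misclass the dev component holds the cross-validated number of misclassifications rather than a proportion.

```r
# Sketch for 1.5-1.9 (helper and object names are assumptions)
library(tree)
library(ISLR)

# Rebuild the split and the unpruned tree so the block runs on its own
set.seed(2)
train <- sample(1:nrow(OJ), 800)
OJ.test <- OJ[-train, ]
tree.oj <- tree(Purchase ~ ., data = OJ, subset = train)

# Hypothetical helper: share of test observations a tree classifies correctly
test_accuracy <- function(fit) {
  pred <- predict(fit, OJ.test, type = "class")
  mean(pred == OJ.test$Purchase)
}

# Cross-validation guided by misclassification rather than deviance
set.seed(7)
cv.oj <- cv.tree(tree.oj, FUN = prune.misclass)
print(cv.oj)   # components include size, dev, k and method

# Tree size against cross-validated misclassifications
plot(cv.oj$size, cv.oj$dev, type = "b",
     xlab = "Tree size (terminal nodes)",
     ylab = "Cross-validated misclassifications (dev)")

# Size with the smallest dev; on ties which.min() keeps the first entry listed
best.size <- cv.oj$size[which.min(cv.oj$dev)]

# Prune to the selected size and compare with the unpruned tree
prune.oj <- prune.misclass(tree.oj, best = best.size)
plot(prune.oj)
text(prune.oj, pretty = 0)
acc.full <- test_accuracy(tree.oj)
acc.pruned <- test_accuracy(prune.oj)

# Fallback from 1.9: a tree pruned to five terminal nodes
prune5.oj <- prune.misclass(tree.oj, best = 5)
acc.five <- test_accuracy(prune5.oj)

print(c(full = acc.full, pruned = acc.pruned, five = acc.five))
```

Comparing acc.pruned with acc.full answers 1.8; acc.five is only needed if the cross-validated size does not help.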


Fitting a Regression Tree

2. In the lab, a classification tree was applied to the Carseats data set after converting Sales into a qualitative response variable. Now we will seek to predict Sales using regression trees and related approaches, treating the response as a quantitative variable.

2.1 Use the validation-set approach to split the data set into a training set and a test set. (Hint: use set.seed(2); in the validation-set approach, half of the observations are selected as the training set and the other half are treated as the test set.) Take a screenshot of your code.

2.2 Fit a regression tree to the training set.

a) Use summary() to print out the results. How many terminal nodes do you get? What is the residual mean deviance (RMD)?

b) Plot the tree and take a screenshot of the tree;

c) What test MSE do you obtain?
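
One possible way to approach 2.1 and 2.2 is sketched below, assuming the tree and ISLR packages. The object names (train, Carseats.test, tree.carseats, test.mse) are illustrative only.

```r
# Sketch for 2.1-2.2 (object names are assumptions)
library(tree)
library(ISLR)   # provides the Carseats data set

# Validation-set approach: half of the rows for training, the rest for testing
set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats) / 2)
Carseats.test <- Carseats[-train, ]

# Sales stays quantitative, so tree() grows a regression tree
tree.carseats <- tree(Sales ~ ., data = Carseats, subset = train)
print(summary(tree.carseats))   # terminal nodes and residual mean deviance

# Draw the tree and label its splits
plot(tree.carseats)
text(tree.carseats, pretty = 0)

# Test MSE: mean squared difference between predicted and observed Sales
yhat <- predict(tree.carseats, newdata = Carseats.test)
test.mse <- mean((yhat - Carseats.test$Sales)^2)
print(test.mse)
```

The square root of the test MSE is in the same units as Sales, which can make the result easier to interpret in the report.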

2.3 Use cross-validation in order to determine the optimal level of tree complexity (use set.seed(2)).

a) Produce a plot with tree size on the x-axis and cross-validated error (i.e. dev) on the y-axis.

b) What is the optimal level of tree complexity?

c) Prune the tree to the optimal size. Does pruning the tree improve the test MSE?
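
A minimal sketch of 2.3 follows, assuming the tree and ISLR packages and illustrative object names. For a regression tree, cv.tree() uses deviance-based pruning by default, so dev here is a cross-validated deviance rather than a classification error.

```r
# Sketch for 2.3 (object names are assumptions)
library(tree)
library(ISLR)

# Rebuild the split and the unpruned regression tree so the block runs on its own
set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats) / 2)
Carseats.test <- Carseats[-train, ]
tree.carseats <- tree(Sales ~ ., data = Carseats, subset = train)

# Cross-validation over the cost-complexity pruning sequence
set.seed(2)
cv.carseats <- cv.tree(tree.carseats)

# Tree size against cross-validated deviance
plot(cv.carseats$size, cv.carseats$dev, type = "b",
     xlab = "Tree size (terminal nodes)",
     ylab = "Cross-validated deviance (dev)")

# Size with the smallest cross-validated deviance
best.size <- cv.carseats$size[which.min(cv.carseats$dev)]
print(best.size)

# Prune to that size and recompute the test MSE
prune.carseats <- prune.tree(tree.carseats, best = best.size)
mse.full <- mean((predict(tree.carseats, newdata = Carseats.test) - Carseats.test$Sales)^2)
mse.pruned <- mean((predict(prune.carseats, newdata = Carseats.test) - Carseats.test$Sales)^2)
print(c(full = mse.full, pruned = mse.pruned))
```

If best.size equals the size of the unpruned tree, the two MSE values coincide and pruning gives no improvement.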

2.4 Use the bagging approach in order to analyze this data. Take a screenshot of the results. What test MSE do you obtain? (Hint: use set.seed(1); mtry=10 since we have 10 predictors in the Carseats data set and we use all of the predictors in the bagging approach.)
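
Bagging can be sketched with the randomForest package by setting mtry to the full number of predictors. The sketch assumes randomForest and ISLR are installed and reuses the split from 2.1. Object names are illustrative.

```r
# Sketch for 2.4 (object names are assumptions)
library(randomForest)
library(ISLR)

# Same validation-set split as in 2.1
set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats) / 2)
Carseats.test <- Carseats[-train, ]

# Number of predictors: every column except the response Sales
p <- ncol(Carseats) - 1

# Bagging is a random forest that considers all p predictors at each split
set.seed(1)
bag.carseats <- randomForest(Sales ~ ., data = Carseats, subset = train,
                             mtry = p, importance = TRUE)
print(bag.carseats)

# Test MSE of the bagged model
yhat.bag <- predict(bag.carseats, newdata = Carseats.test)
bag.mse <- mean((yhat.bag - Carseats.test$Sales)^2)
print(bag.mse)
```

Computing p from the data rather than hard-coding 10 keeps the call correct if the column set changes.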

2.5 Use random forests to analyze this data.

a) What test MSE do you obtain? (Hint: use set.seed(1); mtry=10/3 since we usually use 1/3 of the predictors when building a random forest of regression trees.)

b) Use the importance() function to determine which variables are most important. Take a screenshot of your results.

c) Plots of these importance measures can be produced using the varImpPlot() function. Take a screenshot of your output.

d) So which variables are most important?
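
The random forest steps in 2.5 might look like the sketch below, assuming the randomForest and ISLR packages. Because mtry must be a whole number, the sketch passes floor(10/3) = 3 explicitly. That rounding is a choice made for this example, and the object names are illustrative.

```r
# Sketch for 2.5 (object names and the explicit floor() are assumptions)
library(randomForest)
library(ISLR)

# Same validation-set split as in 2.1
set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats) / 2)
Carseats.test <- Carseats[-train, ]

# About one third of the 10 predictors are tried at each split
set.seed(1)
rf.carseats <- randomForest(Sales ~ ., data = Carseats, subset = train,
                            mtry = floor(10 / 3), importance = TRUE)

# Test MSE of the random forest
yhat.rf <- predict(rf.carseats, newdata = Carseats.test)
rf.mse <- mean((yhat.rf - Carseats.test$Sales)^2)
print(rf.mse)

# Importance measures: %IncMSE and IncNodePurity for each predictor
imp <- importance(rf.carseats)
print(imp)

# Predictors ranked by %IncMSE, largest first
imp.sorted <- imp[order(imp[, "%IncMSE"], decreasing = TRUE), ]
print(imp.sorted)

# Dot charts of both importance measures
varImpPlot(rf.carseats)
```

The predictors at the top of imp.sorted and of the varImpPlot() panels are the candidates to name in 2.5 d).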

What to submit:

1. R code.

· Should include all the code to accomplish the tasks.

· Clear and concise comments to indicate what part of the assignment each code chunk pertains to.

· Code should be easily readable.

· Filename should be in the format of: LastnameFirstname_A4.R

2. Report.

· Take screenshots of your outputs in R Studio and answer all the questions. Submit in PDF format.

· Answers questions clearly and concisely.

· Includes appropriate plots. Make sure the plots are properly labeled.

The assignment will be graded on the correctness of the answers, comprehensiveness of the analysis, clarity of results’ presentation and neatness of the report.
