Check the attachments.

Please read the instructions and questions carefully in the “Assignment_4_2023_Fall.pdf” file and use “Auto.csv” to finish the assignment. You should submit both 1) your R code and 2) a PDF report with answers through the link “Submit Assignment 4 Here”.

Guidelines:

· Use only R for this assignment

· Submit both R code and Report on findings

· Work is to be done individually for this assignment

Fitting a Classification Tree

1. This problem involves the OJ data set, which is part of the ISLR package. (Hint: the first three lines of code should be: library(tree), library(ISLR), attach(OJ).)

1.1 Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations. Take a screenshot of your code. (Hint: set.seed(2), train = sample().)
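The split in 1.1 could be sketched as follows. This is a minimal illustration, not a model answer: the object names OJ.train and OJ.test are assumptions, and the sampled rows depend on your R version.

```r
library(ISLR)                         # provides the OJ data set

set.seed(2)                           # seed given in the hint
train <- sample(1:nrow(OJ), 800)      # row indices of 800 training observations

OJ.train <- OJ[train, ]               # hypothetical name for the training set
OJ.test  <- OJ[-train, ]              # hypothetical name: all remaining rows

dim(OJ.train)
dim(OJ.test)
```

Indexing with -train drops exactly the sampled rows, so the two sets do not overlap.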

1.2 Fit a tree to the training data, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics about the tree. Take a screenshot of the summary statistics. How many terminal nodes does the tree have? What is the training misclassification error rate?

1.3 Plot the tree and take a screenshot of the tree (Hint: plot() and text())

1.4 Predict the response on the test data, and produce a confusion matrix comparing the test labels to the predicted test labels. What is the accuracy rate?
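Steps 1.2 to 1.4 could look roughly like the sketch below. The names tree.oj, tree.pred and conf.mat are assumptions chosen for illustration; the numbers you report should come from your own run.

```r
library(tree)
library(ISLR)

set.seed(2)
train   <- sample(1:nrow(OJ), 800)
OJ.test <- OJ[-train, ]

# 1.2: classification tree on the training rows only
tree.oj <- tree(Purchase ~ ., data = OJ, subset = train)
summary(tree.oj)                      # reports terminal nodes and training error

# 1.3: draw the tree and label the splits
plot(tree.oj)
text(tree.oj, pretty = 0)

# 1.4: predicted classes on the test set, then the confusion matrix
tree.pred <- predict(tree.oj, OJ.test, type = "class")
conf.mat  <- table(Predicted = tree.pred, Actual = OJ.test$Purchase)
conf.mat

# accuracy = correctly classified (diagonal) / all test observations
accuracy <- sum(diag(conf.mat)) / sum(conf.mat)
accuracy
```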

1.5 Apply the cv.tree() function to the training set in order to determine the optimal tree size. (Use set.seed(7)). Print the results (Hint: the results should contain the size, k, method etc).

1.6 Produce a plot with tree size (i.e. size) on the x-axis and cross-validated classification error rate (i.e. dev) on the y-axis.

1.7 Which tree size corresponds to the lowest cross-validated classification error rate (i.e. dev)?

1.8 Produce a pruned tree corresponding to the optimal tree size obtained using cross-validation. Take a screenshot of a pruned tree. What is the accuracy rate for the pruned tree? Is it improved compared to the accuracy rate in (1.4)?

1.9 If cross-validation does not lead to selection of a pruned tree (i.e. the accuracy rate produced in (1.8) is lower than the one in (1.4)), then create a pruned tree with five terminal nodes. What is the accuracy rate now?
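One possible way to carry out 1.5 to 1.9 is sketched here. The helper test.accuracy() and the names cv.oj, best.size, prune.cv and prune.five are hypothetical. Note that with FUN = prune.misclass, the dev component holds the number of cross-validated misclassifications rather than a rate.

```r
library(tree)
library(ISLR)

set.seed(2)
train   <- sample(1:nrow(OJ), 800)
OJ.test <- OJ[-train, ]
tree.oj <- tree(Purchase ~ ., data = OJ, subset = train)

# 1.5: cross-validation guided by misclassification
set.seed(7)
cv.oj <- cv.tree(tree.oj, FUN = prune.misclass)
cv.oj                                 # prints size, dev, k and method

# 1.6: tree size against cross-validated error
plot(cv.oj$size, cv.oj$dev, type = "b",
     xlab = "Tree size", ylab = "CV misclassifications (dev)")

# 1.7: size with the lowest dev (which.min returns the first one if tied)
best.size <- cv.oj$size[which.min(cv.oj$dev)]
best.size

# hypothetical helper: test-set accuracy of a fitted tree
test.accuracy <- function(fit) {
  pred <- predict(fit, OJ.test, type = "class")
  mean(pred == OJ.test$Purchase)
}

# 1.8: prune to the cross-validated size (assumes best.size > 1)
prune.cv <- prune.misclass(tree.oj, best = best.size)
plot(prune.cv)
text(prune.cv, pretty = 0)
acc.cv <- test.accuracy(prune.cv)

# 1.9: fallback tree with five terminal nodes
prune.five <- prune.misclass(tree.oj, best = 5)
acc.five   <- test.accuracy(prune.five)

c(unpruned = test.accuracy(tree.oj), cv = acc.cv, five = acc.five)
```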

Fitting a Regression Tree

2. In the lab, a classification tree was applied to the Carseats data set after converting Sales into a qualitative response variable. Now we will seek to predict Sales using regression trees and related approaches, treating the response as a quantitative variable.

2.1 Use the validation-set approach to split the data set into a training set and a test set. (Hint: use set.seed(2). In the validation-set approach, half of the observations are selected as the training set and the other half are used as the test set.) Take a screenshot of your code.

2.2 Fit a regression tree to the training set.

a) Use summary() to print out the results. How many terminal nodes do you get? What is the residual mean deviance (RMD)?

b) Plot the tree and take a screenshot of the tree;

c) What test MSE do you obtain?
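A sketch of 2.1 and 2.2, assuming the Carseats data set from the ISLR package; tree.car, yhat and test.mse are illustrative names only.

```r
library(tree)
library(ISLR)

# 2.1: validation-set approach, half of the rows for training
set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats) / 2)

# 2.2 a): regression tree with Sales as a quantitative response
tree.car <- tree(Sales ~ ., data = Carseats, subset = train)
summary(tree.car)                     # terminal nodes and residual mean deviance

# 2.2 b): plot the tree
plot(tree.car)
text(tree.car, pretty = 0)

# 2.2 c): test MSE = mean squared difference on the held-out half
yhat     <- predict(tree.car, newdata = Carseats[-train, ])
test.mse <- mean((yhat - Carseats$Sales[-train])^2)
test.mse
```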

2.3 Use cross-validation in order to determine the optimal level of tree complexity (use set.seed(2)).

a) Produce a plot with tree size on the x-axis and cross-validated error (i.e. dev) on the y-axis.

b) What is the optimal level of tree complexity?

c) Prune the tree to the optimal size. Does pruning the tree improve the test MSE?
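Question 2.3 might be approached as in the following sketch. For a regression tree, cv.tree() defaults to deviance-based pruning, so dev here is a cross-validated deviance. The names cv.car, best.size and prune.car are assumptions.

```r
library(tree)
library(ISLR)

set.seed(2)
train    <- sample(1:nrow(Carseats), nrow(Carseats) / 2)
tree.car <- tree(Sales ~ ., data = Carseats, subset = train)

# cross-validate the tree (default FUN is prune.tree)
set.seed(2)
cv.car <- cv.tree(tree.car)

# a) tree size against cross-validated deviance
plot(cv.car$size, cv.car$dev, type = "b",
     xlab = "Tree size", ylab = "CV deviance (dev)")

# b) size with the lowest cross-validated deviance
best.size <- cv.car$size[which.min(cv.car$dev)]
best.size

# c) prune to that size (assumes best.size > 1) and compare test MSEs
prune.car <- prune.tree(tree.car, best = best.size)
test.y    <- Carseats$Sales[-train]
mse.full  <- mean((predict(tree.car,  newdata = Carseats[-train, ]) - test.y)^2)
mse.prune <- mean((predict(prune.car, newdata = Carseats[-train, ]) - test.y)^2)
c(unpruned = mse.full, pruned = mse.prune)
```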

2.4 Use the bagging approach in order to analyze this data. Take a screenshot of the results. What test MSE do you obtain? (Hint: use set.seed(1); mtry = 10, since we have 10 predictors in the Carseats data set and we use all of the predictors in the bagging approach.)
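Bagging can be run through the randomForest package by letting every split consider all 10 predictors. The sketch below assumes that package is installed; bag.car and mse.bag are illustrative names.

```r
library(randomForest)
library(ISLR)

set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats) / 2)

# bagging = random forest with mtry equal to the number of predictors
set.seed(1)
bag.car <- randomForest(Sales ~ ., data = Carseats, subset = train,
                        mtry = 10, importance = TRUE)
bag.car                               # printed summary for the screenshot

# test MSE on the held-out half
yhat.bag <- predict(bag.car, newdata = Carseats[-train, ])
mse.bag  <- mean((yhat.bag - Carseats$Sales[-train])^2)
mse.bag
```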

2.5 Use random forests to analyze this data.

a) What test MSE do you obtain? (Hint: use set.seed(1); mtry = 10/3, since we usually use 1/3 of the predictors when building a random forest of regression trees.)

b) Use the importance() function to determine which variables are most important. Take a screenshot of your results.

c) Plots of these importance measures can be produced using the varImpPlot() function. Take a screenshot of your output.

d) So which variables are most important?
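The random forest questions in 2.5 could be sketched as below. Because 10/3 is not a whole number, this sketch passes mtry = 3 explicitly, which is an assumption about how to read the hint; rf.car, mse.rf and imp are illustrative names.

```r
library(randomForest)
library(ISLR)

set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats) / 2)

# a) random forest considering about a third of the predictors per split
set.seed(1)
rf.car <- randomForest(Sales ~ ., data = Carseats, subset = train,
                       mtry = 3, importance = TRUE)
yhat.rf <- predict(rf.car, newdata = Carseats[-train, ])
mse.rf  <- mean((yhat.rf - Carseats$Sales[-train])^2)
mse.rf

# b) importance measures: %IncMSE and IncNodePurity for each predictor
imp <- importance(rf.car)
imp

# c) plot of the same measures
varImpPlot(rf.car)

# d) predictors ranked by %IncMSE, largest first
imp[order(imp[, "%IncMSE"], decreasing = TRUE), ]
```

Predictors at the top of the ranking are the ones whose permutation increases the out-of-bag error the most.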

What to submit:

1. R code.

a. Should include all the code to accomplish the tasks.

b. Clear and concise comments to indicate what part of the assignment each code chunk pertains to.

c. Code should be easily readable.

d. Filename should be in the format of: LastnameFirstname_A4.R

2. Report.

a. Take screenshots of your outputs in RStudio and answer all the questions.

b. Submit in PDF format.

c. Answers questions clearly and concisely.

d. Includes appropriate plots.

e. Make sure the plots are properly labeled.

The assignment will be graded on the correctness of the answers, comprehensiveness of the analysis, clarity of results’ presentation and neatness of the report.
