
# Check the attachments Please read the instructions and questions carefully in ” Assignment_3_ 2024.pdf” file and use “Auto.csv” to


Please read the instructions and questions carefully in the “Assignment_3_ 2024.pdf” file and use “Auto.csv” to finish the assignment. You should submit both 1) your R code and 2) a PDF report with answers through the link “Submit Assignment 3 Here”.

## Guidelines:

· Use R only for Part 2 of this assignment

· Submit both R code and Report on findings

· Work is to be done individually for this assignment

1. Suppose we collect data for a group of students in a statistics class with variables $X_1$ = hours studied, $X_2$ = undergrad GPA, and $Y$ = receive an A. We fit a logistic regression and produce estimated coefficients $\hat{\beta}_0 = -7$, $\hat{\beta}_1 = 0.06$, $\hat{\beta}_2 = 1$. (You do not need R code to solve this question.)

(1) Estimate the probability that a student who studies for 50 hours and has an undergrad GPA of 3.5 gets an A in the class. (Hint: For logistic regression, $p(x) = \dfrac{e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2}}{1 + e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2}}$.)
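Although no R code is required for this question, the hint's formula can be checked numerically. The following is a minimal sketch, assuming the estimates stated in question 1; the function name `logistic_prob` is a hypothetical helper, not part of the assignment.

```r
# Logistic function from the hint: p = e^eta / (1 + e^eta),
# where eta = b0 + b1 * x1 + b2 * x2 is the linear predictor (log-odds).
logistic_prob <- function(b0, b1, b2, x1, x2) {
  eta <- b0 + b1 * x1 + b2 * x2
  exp(eta) / (1 + exp(eta))
}

# Estimates from the question: b0 = -7, b1 = 0.06, b2 = 1;
# student with 50 hours studied and a GPA of 3.5.
p_hat <- logistic_prob(b0 = -7, b1 = 0.06, b2 = 1, x1 = 50, x2 = 3.5)
print(round(p_hat, 4))  # linear predictor is -0.5, so p is about 0.3775
```

Base R's `plogis()` computes the same inverse-logit, so `plogis(-0.5)` is an equivalent one-line check.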

(2) How many hours would a student with GPA 3.4 need to study to have a 50% chance of getting an A in the class? (Hint: We can use the equation $\log\left(\dfrac{p(x)}{1 - p(x)}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$.)
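The log-odds equation in the hint can be rearranged to solve for hours studied: $X_1 = \left(\log\frac{p}{1-p} - \beta_0 - \beta_2 X_2\right) / \beta_1$. A minimal sketch of that rearrangement, assuming the estimates from question 1 (`hours_for_prob` is a hypothetical helper name):

```r
# Solve the log-odds equation for x1 (hours studied), given a target
# probability p and a GPA value.
hours_for_prob <- function(p, b0, b1, b2, gpa) {
  log_odds <- log(p / (1 - p))      # equals 0 when p = 0.5
  (log_odds - b0 - b2 * gpa) / b1
}

hours_needed <- hours_for_prob(p = 0.5, b0 = -7, b1 = 0.06, b2 = 1, gpa = 3.4)
print(hours_needed)  # (7 - 3.4) / 0.06, approximately 60 hours
```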

2. The following questions (3) to (8) should be answered using the Weekly data set, which is part of the ISLR package. This data is similar in nature to the Smarket data from this chapter’s lab, except that it contains 1089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010.

(3) Use require(ISLR) or library(ISLR) to load the ISLR package.

a) Use the summary() function to produce some numerical summaries of the Weekly data.

b) Use the pairs() function to produce a scatterplot matrix of the variables in the data.

c) Do you see a relationship between Year and Volume? What is the pairwise correlation between Year and Volume?

d) Is the relationship positive or negative?
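One way to approach parts a) to c) is sketched below. It assumes the ISLR package is installed; the variable names `num_cols` and `cor_mat` are illustrative choices. Direction is a factor, so it is excluded before calling `cor()`, which accepts only numeric columns.

```r
library(ISLR)  # provides the Weekly data set

# a) Numerical summaries of every column
summary(Weekly)

# b) Scatterplot matrix of all variables
pairs(Weekly)

# c) Pairwise correlations among the numeric columns only
num_cols <- sapply(Weekly, is.numeric)
cor_mat <- cor(Weekly[, num_cols])
print(round(cor_mat, 3))

# Correlation between Year and Volume; its sign answers part d)
print(cor_mat["Year", "Volume"])
```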

(4) Use the full dataset to perform a logistic regression with Direction as the dependent variable and Lag1, Lag2, Lag3, Lag4 and Volume as independent variables (i.e. predictors). Use the summary() function to print the results. Do any of the predictors appear to be statistically significant? If so, which ones? Take a screenshot of your outputs and then answer the questions.
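A minimal sketch of the model fit described in (4), assuming the ISLR package is installed. The object names `glm_full` and `p_values` are illustrative. With a factor response, `glm()` models the probability of the second factor level, which for Direction is "Up".

```r
library(ISLR)

# Logistic regression on the full Weekly data set
glm_full <- glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Volume,
                data = Weekly, family = binomial)
summary(glm_full)

# Extract the p-values to see which predictors fall below a chosen
# significance level (0.05 here is an assumed convention)
p_values <- coef(summary(glm_full))[, "Pr(>|z|)"]
print(p_values[p_values < 0.05])
```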

(5) Based on the results of (4), compute the confusion matrix and the overall fraction of correct predictions (Hint: refer to the code from the Chapter 4 lab session in the textbook; we use 0.5 as the predicted probability cut-off for the classifier). What is the precision rate? What is the recall rate? Take a screenshot of your output and then answer the questions.
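The confusion matrix and the derived rates could be computed as in the sketch below. It assumes the ISLR package is installed and treats "Up" as the positive class for precision and recall, which is an assumption the assignment does not state explicitly. The model is refit here so the block runs on its own.

```r
library(ISLR)

glm_full <- glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Volume,
                data = Weekly, family = binomial)

# Fitted probabilities of Direction == "Up" on the training data
probs <- predict(glm_full, type = "response")

# Classify with the 0.5 cut-off; fixing the levels keeps both rows in
# the table even if one class is never predicted
pred <- factor(ifelse(probs > 0.5, "Up", "Down"), levels = c("Down", "Up"))

conf_mat <- table(Predicted = pred, Actual = Weekly$Direction)
print(conf_mat)

accuracy  <- mean(pred == Weekly$Direction)               # fraction correct
precision <- conf_mat["Up", "Up"] / sum(conf_mat["Up", ]) # TP / predicted Up
recall    <- conf_mat["Up", "Up"] / sum(conf_mat[, "Up"]) # TP / actual Up
print(c(accuracy = accuracy, precision = precision, recall = recall))
```

Because these predictions are made on the same data used to fit the model, the accuracy is a training-set figure and tends to be optimistic.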

(6) Now fit the logistic regression model using a training data period from 1990 to 2009 with Lag2 as the only predictor. Compute the confusion matrix and the overall fraction of correct predictions for the held-out data (i.e. test data), which is the data from 2010. In addition, please calculate the precision rate and recall rate. (Hint: refer to the code from the Chapter 4 lab session in the textbook; we use 0.5 as the predicted probability cut-off for the classifier). Take a screenshot of your output and then answer the questions.
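The train/test split in (6) can be sketched as follows, assuming the ISLR package is installed and again treating "Up" as the positive class (an assumption). The names `train`, `test_set`, and `glm_lag2` are illustrative.

```r
library(ISLR)

# Logical index: TRUE for 1990-2009 (training), FALSE for 2010 (test)
train    <- Weekly$Year <= 2009
test_set <- Weekly[!train, ]

# Fit on the training period only, with Lag2 as the single predictor
glm_lag2 <- glm(Direction ~ Lag2, data = Weekly,
                family = binomial, subset = train)

# Predict P(Up) for the held-out 2010 weeks and apply the 0.5 cut-off
probs <- predict(glm_lag2, newdata = test_set, type = "response")
pred  <- factor(ifelse(probs > 0.5, "Up", "Down"), levels = c("Down", "Up"))

conf_mat <- table(Predicted = pred, Actual = test_set$Direction)
print(conf_mat)

accuracy  <- mean(pred == test_set$Direction)
precision <- conf_mat["Up", "Up"] / sum(conf_mat["Up", ])
recall    <- conf_mat["Up", "Up"] / sum(conf_mat[, "Up"])
print(c(accuracy = accuracy, precision = precision, recall = recall))
```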

(7) Repeat (6) using KNN with K=1. Compute the confusion matrix and the overall fraction of correct predictions for the held-out data. In addition, please calculate the precision rate and recall rate. (Hint: refer to the code from the Chapter 4 lab session in the textbook. If you encounter an error such as “dims of ‘test’ and ‘train’ differ”, try using knn(data.frame(train.X), …).) Use set.seed(1).

(8) Repeat (6) using KNN with K=10. Compute the confusion matrix and the overall fraction of correct predictions for the held-out data. In addition, please calculate the precision rate and recall rate.
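Questions (7) and (8) differ only in the value of K, so both can be handled by one loop. The sketch below assumes the ISLR and class packages are installed and treats "Up" as the positive class (an assumption). Wrapping the single predictor in `data.frame()` follows the hint in (7), since `knn()` expects matrix-like inputs with matching columns; `results` is an illustrative container.

```r
library(ISLR)
library(class)  # provides knn()

train   <- Weekly$Year <= 2009
train_X <- data.frame(Lag2 = Weekly$Lag2[train])
test_X  <- data.frame(Lag2 = Weekly$Lag2[!train])
train_y <- Weekly$Direction[train]
test_y  <- Weekly$Direction[!train]

results <- list()
for (k in c(1, 10)) {
  set.seed(1)  # knn() breaks ties at random, so fix the seed
  pred <- knn(train_X, test_X, train_y, k = k)

  conf_mat <- table(Predicted = pred, Actual = test_y)
  results[[as.character(k)]] <- list(
    conf_mat  = conf_mat,
    accuracy  = mean(pred == test_y),
    precision = conf_mat["Up", "Up"] / sum(conf_mat["Up", ]),
    recall    = conf_mat["Up", "Up"] / sum(conf_mat[, "Up"])
  )
}

print(results[["1"]])   # K = 1
print(results[["10"]])  # K = 10
```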

3. The quantity $\dfrac{p(X)}{1 - p(X)}$ is called the odds. Please answer the following questions (You do not need R code to solve this question):

(9) On average, what fraction of people with an odds of 0.35 of defaulting on their credit card payment will in fact default?

(10) Suppose that an individual has a 15% chance of defaulting on her credit card payment. What are the odds that she will default?
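Both questions follow from the definition of odds: if odds $= p / (1 - p)$, then $p = \text{odds} / (1 + \text{odds})$. A minimal numerical check, using hypothetical helper names `odds_to_prob` and `prob_to_odds`:

```r
# Convert between odds and probability
odds_to_prob <- function(odds) odds / (1 + odds)
prob_to_odds <- function(p) p / (1 - p)

frac_default <- odds_to_prob(0.35)  # question (9): 0.35 / 1.35
odds_default <- prob_to_odds(0.15)  # question (10): 0.15 / 0.85

print(round(frac_default, 3))  # about 0.259
print(round(odds_default, 3))  # about 0.176
```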

4. The logistic regression model that results from predicting the probability of default from student status can be seen in the following table. We create a dummy variable that takes on a value of 1 for students and 0 for non-students. Please answer the following questions (You do not need R code for these questions).

(11) How do you interpret the coefficient on Student[Yes]?

(12) For a non-student, what are the estimated odds of default? Is the probability of default less than the probability of not defaulting?
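The table referenced in question 4 is not reproduced here. As an illustration only, the sketch below assumes it comes from the Default data set in the ISLR package and refits that model; if the assignment's table uses different data, the numbers will differ. Exponentiating the intercept gives the odds for the baseline group (non-students), and exponentiating the student coefficient gives the odds ratio of students relative to non-students.

```r
library(ISLR)

# Assumed source of the table: default status regressed on student status
fit <- glm(default ~ student, data = Default, family = binomial)
b   <- coef(fit)
print(b)

# Odds of default for a non-student (student dummy = 0)
odds_non_student <- exp(b[["(Intercept)"]])

# Multiplicative change in the odds when moving from non-student to student
odds_ratio <- exp(b[["studentYes"]])

# Odds below 1 imply a default probability below 0.5
p_non_student <- odds_non_student / (1 + odds_non_student)
print(c(odds_non_student = odds_non_student,
        odds_ratio = odds_ratio,
        p_non_student = p_non_student))
```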

What to submit:

1. R code.

a. Should include all the code to accomplish the tasks.

b. Clear and concise comments to indicate what part of the assignment each code chunk pertains to.

c. Filename should be in the format of: LastnameFirstname_A3.R

2. Report.

a. Take screenshots of your outputs in R Studio and answer all the questions. Submit in PDF format.

b. Includes appropriate plots. Make sure the plots are properly labeled.

The assignment will be graded on the correctness of the answers, comprehensiveness of the analysis, clarity of results’ presentation and neatness of the report.

