Check the attachments

Please read the instructions and questions carefully in the “Assignment_5_2024_Fall.pdf” file and use “Auto.csv” to finish the assignment. You should submit both 1) an R script and 2) a PDF report with answers through the link “Submit Assignment 5 Here”.

Guidelines:

· Use only R for this assignment

· Submit both R code and Report on findings

· Work is to be done individually for this assignment

1. In this problem, you will generate simulated data, and then perform K-means clustering on the data.

1.1 Generate a simulated data set with 30 observations in each of two classes (i.e. 60 observations in total), and 2 variables.

Code Hint: The first four lines of code should be:

set.seed(2)
x=matrix(rnorm(60*2), ncol=2)
x[1:30,1]=x[1:30,1]+3
x[1:30,2]=x[1:30,2]-4

1.2 Perform K-means clustering of the observations with K = 2 (use nstart=20). Plot the data with each observation colored according to its cluster assignment. Take a screenshot of your plot. What is the total within-cluster sum of squares?
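
A minimal sketch of one way to approach 1.2, assuming the data are generated exactly as in the code hint. The object name km2 and the plot styling (colors, point size, axis labels) are illustrative choices, not part of the assignment.

```r
# Recreate the simulated data from the code hint in 1.1
set.seed(2)
x <- matrix(rnorm(60 * 2), ncol = 2)
x[1:30, 1] <- x[1:30, 1] + 3
x[1:30, 2] <- x[1:30, 2] - 4

# K-means with K = 2 and 20 random starts (nstart = 20)
km2 <- kmeans(x, centers = 2, nstart = 20)

# Color each observation by its cluster assignment
plot(x, col = km2$cluster + 1, pch = 20, cex = 2,
     main = "K-means clustering, K = 2", xlab = "X1", ylab = "X2")

# Total within-cluster sum of squares
km2$tot.withinss
```

The tot.withinss component is the sum of the per-cluster values stored in withinss.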

1.3 Perform K-means clustering with K = 3 (use nstart=20). Plot the data with each observation colored according to its cluster assignment. Take a screenshot of your plot. What is the total within-cluster sum of squares?

1.4 Now perform K-means clustering with K = 4 (use nstart=20). Plot the data with each observation colored according to its cluster assignment. Take a screenshot of your plot. What is the total within-cluster sum of squares?
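
Since 1.3 and 1.4 repeat the same steps with a different K, they could be handled with a small helper. The sketch below assumes the same simulated data; fit_and_plot is a hypothetical helper name introduced here for illustration.

```r
# Recreate the simulated data from the code hint in 1.1
set.seed(2)
x <- matrix(rnorm(60 * 2), ncol = 2)
x[1:30, 1] <- x[1:30, 1] + 3
x[1:30, 2] <- x[1:30, 2] - 4

# Hypothetical helper: fit K-means, plot the clusters, and
# return the total within-cluster sum of squares
fit_and_plot <- function(data, k) {
  km <- kmeans(data, centers = k, nstart = 20)
  plot(data, col = km$cluster + 1, pch = 20, cex = 2,
       main = paste("K-means clustering, K =", k),
       xlab = "X1", ylab = "X2")
  km$tot.withinss
}

# Run for K = 3 and K = 4 and collect the results
wss <- sapply(3:4, function(k) fit_and_plot(x, k))
names(wss) <- paste0("K=", 3:4)
wss
```

Each call draws a separate plot, so in RStudio the two figures appear one after the other in the Plots pane.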

1.5 Using the scale() function, perform K-means clustering with K = 2 on the data after scaling each variable to have standard deviation one. Take a screenshot of your plot. What is the total within-cluster sum of squares now? How do these results compare to those obtained in 1.2?
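
A possible sketch for 1.5, again assuming the simulated data from the code hint. The names x_scaled and km2_scaled are illustrative. With its default arguments, scale() centers each column and divides it by its standard deviation.

```r
# Recreate the simulated data from the code hint in 1.1
set.seed(2)
x <- matrix(rnorm(60 * 2), ncol = 2)
x[1:30, 1] <- x[1:30, 1] + 3
x[1:30, 2] <- x[1:30, 2] - 4

# Scale each variable to mean zero and standard deviation one
x_scaled <- scale(x)
apply(x_scaled, 2, sd)  # check: both columns have sd 1

# K-means with K = 2 on the scaled data
km2_scaled <- kmeans(x_scaled, centers = 2, nstart = 20)

plot(x_scaled, col = km2_scaled$cluster + 1, pch = 20, cex = 2,
     main = "K-means clustering on scaled data, K = 2",
     xlab = "X1 (scaled)", ylab = "X2 (scaled)")

# Total within-cluster sum of squares, in standardized units
km2_scaled$tot.withinss
```

Because the scaled variables are in standardized units, this sum of squares is not on the same scale as the one computed from the raw data. Keep that in mind when comparing the two numbers.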

2. Consider the USArrests data. We will now perform hierarchical clustering on the states. The USArrests dataset ships with base R (in the datasets package), so you do not need to load any libraries.

2.1 Plot the hierarchical clustering dendrogram using complete linkage clustering with Euclidean distance as the dissimilarity measure. Take a screenshot of your plot.
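
One way 2.1 might be sketched, using dist() and hclust() from base R. The object names and plot labels are illustrative.

```r
# Euclidean distances between states (rows of USArrests)
d <- dist(USArrests, method = "euclidean")

# Hierarchical clustering with complete linkage
hc_complete <- hclust(d, method = "complete")

# Dendrogram; cex shrinks the state labels so they stay readable
plot(hc_complete, main = "Complete linkage, Euclidean distance",
     xlab = "", sub = "", cex = 0.7)
```

Euclidean distance is already the default for dist(), so the method argument is spelled out only for clarity.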

2.2 Cut the dendrogram at a height that results in three distinct clusters. Which states belong to which clusters? You need to provide state names for each cluster (e.g. Cluster 1 has Alabama, Alaska,…).
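
A sketch of how the cluster memberships in 2.2 could be listed. It uses cutree() with k = 3, which selects the cut that yields three clusters; passing a specific height through the h argument is an alternative. The name states_by_cluster is illustrative.

```r
# Complete-linkage clustering on the unscaled data, as in 2.1
hc_complete <- hclust(dist(USArrests), method = "complete")

# Cut the tree so that exactly three clusters result
clusters <- cutree(hc_complete, k = 3)

# List the state names belonging to each cluster
states_by_cluster <- split(names(clusters), clusters)
states_by_cluster

# Number of states per cluster
table(clusters)
```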

2.3 Hierarchically cluster the states using complete linkage and Euclidean distance, after scaling the variables to have standard deviation one.

a) Take a screenshot of your plot.

b) What effect does scaling the variables have on the hierarchical clustering obtained?

c) In your opinion, should the variables be scaled before the inter-observation dissimilarities are computed? Provide a justification for your answer.
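
The following sketch shows one way to set up 2.3 and to gather evidence for parts b) and c). Object names are illustrative. The cross-tabulation and the variance check are suggestions for supporting your own answer, not a required method.

```r
# Scale each variable to standard deviation one
usa_scaled <- scale(USArrests)

# Complete linkage on unscaled and scaled data
hc_unscaled <- hclust(dist(USArrests), method = "complete")
hc_scaled <- hclust(dist(usa_scaled), method = "complete")

plot(hc_scaled, main = "Complete linkage, scaled variables",
     xlab = "", sub = "", cex = 0.7)

# Cross-tabulate the three-cluster solutions to see how
# scaling changes the cluster memberships
comparison <- table(unscaled = cutree(hc_unscaled, k = 3),
                    scaled = cutree(hc_scaled, k = 3))
comparison

# Variance of each raw variable; a variable with a much larger
# variance dominates unscaled Euclidean distances
apply(USArrests, 2, var)
```

The variables in USArrests are not all in the same units: Murder, Assault, and Rape are arrests per 100,000 residents, while UrbanPop is a percentage. That difference is relevant to the justification asked for in c).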

2.4 After scaling the variables to have standard deviation one, plot the hierarchical clustering dendrogram using average linkage clustering with Euclidean distance as the dissimilarity measure. Take a screenshot of your plot.

2.5 After scaling the variables to have standard deviation one, plot the hierarchical clustering dendrogram using single linkage clustering with Euclidean distance as the dissimilarity measure. Take a screenshot of your plot.
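
Tasks 2.4 and 2.5 differ from 2.3 only in the linkage method, so they could be sketched together as below. Object names and plot titles are illustrative. For the report you may prefer to draw each dendrogram in its own plot so that the screenshots are easier to read.

```r
# Scale the variables and compute Euclidean distances once
usa_scaled <- scale(USArrests)
d_scaled <- dist(usa_scaled)

# Average and single linkage on the same distance matrix
hc_average <- hclust(d_scaled, method = "average")
hc_single <- hclust(d_scaled, method = "single")

# Draw the two dendrograms side by side
par(mfrow = c(1, 2))
plot(hc_average, main = "Average linkage, scaled variables",
     xlab = "", sub = "", cex = 0.6)
plot(hc_single, main = "Single linkage, scaled variables",
     xlab = "", sub = "", cex = 0.6)
par(mfrow = c(1, 1))  # restore the default layout
```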

What to submit:

1. R code.

a. Should include all the code to accomplish the tasks.

b. Clear and concise comments to indicate what part of the assignment each code chunk pertains to.

c. Code should be easily readable.

d. Filename should be in the format of: LastnameFirstname_A5.R

2. Report.

a. Take screenshots of your outputs in RStudio and answer all the questions.

b. Submit in PDF format.

c. Answer questions clearly and concisely.

d. Include appropriate plots. Make sure the plots are properly labeled.

e. The assignment will be graded on the correctness of the answers, comprehensiveness of the analysis, clarity of results’ presentation and neatness of the report.
