Module 2: Project Data Wrangling

For this portion of the project, you will examine your dataset for incorrect data. Any incorrect data should be removed, corrected, or imputed. Follow these steps:

  • Remove irrelevant data. If you are unsure whether something is irrelevant, keep it.
  • Remove duplicate records.
  • Make sure numbers are interpreted as numerical data types.
  • Fix typos.
  • Standardize inconsistent values (for example, capitalization, units, and date formats).
  • Investigate outliers.
  • Check and manage missing values.
  • Format and normalize data if needed.
  • Change categorical values into numbers if needed.

Once you have completed this, you will need to provide a Word document summarizing the pre-processing steps performed on your dataset.
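
The steps above could be sketched in pandas roughly as follows. This is only an illustration under stated assumptions: the toy data, the column names (`id`, `city`, `price`, `notes`), the choice of the 1.5 * IQR rule for outliers, median imputation, and min-max scaling are all hypothetical and should be replaced with whatever suits your own dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for your own dataset.
raw = pd.DataFrame({
    "id": [1, 2, 2, 3, 4, 5],
    "city": ["Boston", "boston ", "boston ", "Bostn", "Denver", "Denver"],
    "price": ["100", "250", "250", "n/a", "300", "9000"],
    "notes": ["a", "b", "b", "c", "d", "e"],  # assumed irrelevant for this example
})

# Remove irrelevant data, then remove duplicate records.
df = raw.drop(columns=["notes"])
df = df.drop_duplicates().reset_index(drop=True)

# Interpret numbers as numeric types; unparseable entries become NaN.
df["price"] = pd.to_numeric(df["price"], errors="coerce")

# Standardize text (whitespace, capitalization), then fix a known typo.
df["city"] = df["city"].str.strip().str.title().replace({"Bostn": "Boston"})

# Investigate outliers with the 1.5 * IQR rule. Values are flagged, not deleted.
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
df["price_outlier"] = (df["price"] < q1 - 1.5 * iqr) | (df["price"] > q3 + 1.5 * iqr)

# Manage missing values: impute with the median (one option among several).
df["price"] = df["price"].fillna(df["price"].median())

# Normalize to the range [0, 1] with min-max scaling.
price_range = df["price"].max() - df["price"].min()
df["price_scaled"] = (df["price"] - df["price"].min()) / price_range

# Change categorical values into numbers with one-hot encoding.
df = pd.get_dummies(df, columns=["city"], dtype=int)

print(df)
```

Flagging outliers in a separate column, rather than dropping them, keeps the decision visible so it can be justified in the summary document.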

Module 3: Project Exploratory Analysis

In this assignment, you will perform an exploratory analysis that will allow you to get a feel for the data and start exploring potential relationships. This may include:

  • Descriptive statistics
  • Histograms
  • Bar charts
  • Heat maps
  • Line graphs
  • Box plots
  • Frequency tables

Once your analysis is complete, you will need to provide a Word document showing and describing the results of your exploratory analysis.
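
As a starting point, a few of the items listed above can be produced with pandas as in the sketch below. The toy data and the column names (`hours`, `score`, `group`) are hypothetical, and `plot_overview` is an illustrative helper, not part of any library. It assumes matplotlib is installed and is only needed when you call it.

```python
import pandas as pd

# Hypothetical toy data standing in for your cleaned dataset.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6],
    "score": [52, 55, 61, 64, 70, 73],
    "group": ["A", "B", "A", "B", "A", "A"],
})

summary = df.describe()                # descriptive statistics for numeric columns
freq = df["group"].value_counts()      # frequency table for a categorical column
corr = df[["hours", "score"]].corr()   # correlation matrix, the numbers behind a heat map

print(summary)
print(freq)
print(corr)


def plot_overview(frame):
    """Draw a histogram, a bar chart, and a box plot side by side."""
    import matplotlib.pyplot as plt  # imported here so the tables above work without it

    fig, axes = plt.subplots(1, 3, figsize=(12, 3))
    frame["score"].plot.hist(ax=axes[0], title="Histogram of score")
    frame["group"].value_counts().plot.bar(ax=axes[1], title="Rows per group")
    frame.boxplot(column="score", by="group", ax=axes[2])
    fig.tight_layout()
    return fig
```

In a Jupyter notebook, calling `plot_overview(df)` in a cell displays the three charts inline.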

  1. Using your chosen dataset, reevaluate the heat map from the last module.
  2. Consider ways to perform a visual check to see if there is a relationship between fields.
  3. With this insight, develop a model using either linear regression or multiple linear regression.
  4. Report the intercept, slope(s), model accuracy, a comparison of actual versus predicted output, and a scatterplot with a line portraying the model.

Once you complete these steps, you will need to provide a Word document showing and explaining the results of your model development.
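
For a single predictor, the reporting items in step 4 could be computed along these lines. This is a minimal sketch using NumPy's least-squares `polyfit`. The data are hypothetical, R-squared is used here as one possible measure of model accuracy, and `plot_fit` is an illustrative helper that assumes matplotlib is installed.

```python
import numpy as np

# Hypothetical toy data: x is the predictor field, y is the field being predicted.
x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([52, 55, 61, 64, 70, 73], dtype=float)

# Fit y = slope * x + intercept by ordinary least squares.
slope, intercept = np.polyfit(x, y, deg=1)
predicted = slope * x + intercept

# R-squared: the share of variance in y explained by the fitted line.
ss_res = np.sum((y - predicted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r_squared:.3f}")

# Output to predicted comparison.
for actual, pred in zip(y, predicted):
    print(f"actual={actual:.1f}  predicted={pred:.1f}")


def plot_fit(x, y, predicted):
    """Scatterplot of the observations with the fitted line drawn over it."""
    import matplotlib.pyplot as plt  # only required when the plot is requested

    fig, ax = plt.subplots()
    ax.scatter(x, y, label="observed")
    ax.plot(x, predicted, color="red", label="fitted line")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
    return fig
```

For multiple linear regression, the same quantities apply, with one slope per predictor.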

After finishing the proposal, create a final report of 5-6 pages.

Use Python and Jupyter, and show the visuals of the data analysis along with an introduction and a conclusion.