jscriptcoder/Predicting-Boston-Housing-Prices

Machine Learning Engineer Nanodegree

Model Evaluation and Validation

Project: Predicting Boston Housing Prices

See project report

Project Overview

The Boston housing market is highly competitive, and you want to be the best real estate agent in the area. To compete with your peers, you decide to leverage a few basic machine learning concepts to assist you and a client with finding the best selling price for their home. Luckily, you’ve come across the Boston Housing dataset which contains aggregated data on various features for houses in Greater Boston communities, including the median value of homes for each of those areas. Your task is to build an optimal model based on a statistical analysis with the tools available. This model will then be used to estimate the best selling price for your clients' homes.

Project Highlights

This project is designed to acquaint you with the many techniques available in sklearn for training, testing, evaluating, and optimizing models.

Things you will learn by completing this project:

  • How to explore data and observe features.
  • How to train and test models.
  • How to identify potential problems, such as errors due to bias or variance.
  • How to apply techniques to improve the model, such as cross-validation and grid search.
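As a rough illustration of the last two points, here is a minimal sketch of grid search combined with cross-validation in sklearn. It is not the notebook's solution: the synthetic data, the choice of DecisionTreeRegressor, the max_depth range, and the split sizes are all assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, ShuffleSplit, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the housing dataset): 200 samples, 3 features.
rng = np.random.RandomState(0)
X = rng.uniform(0.0, 10.0, size=(200, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 1.0, size=200)

# Hold out 20% of the data for a final check of the tuned model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Cross-validation scheme: 10 random 80/20 splits of the training set.
cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)

# Grid search: fit one model per max_depth value and keep the one with the
# best mean cross-validated score (R^2, the default score for regressors).
grid = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"max_depth": list(range(1, 11))},
    cv=cv,
)
grid.fit(X_train, y_train)

best_depth = grid.best_params_["max_depth"]
test_score = grid.score(X_test, y_test)  # R^2 on the held-out test set
print("best max_depth:", best_depth)
print("held-out R^2:", round(test_score, 3))
```

A very shallow tree tends to underfit (high bias) and a very deep one tends to overfit (high variance), which is why the depth is picked by cross-validated score rather than by training score.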

Install

This project requires Python and the following Python libraries installed:

You will also need software installed to run and execute a Jupyter Notebook.

If you do not have Python installed yet, it is highly recommended that you install the Anaconda distribution of Python, which already has the above packages and more included.

Code

Template code is provided in the boston_housing.ipynb notebook file. You will also be required to use the included visuals.py Python file and the housing.csv dataset file to complete your work. While some code has already been implemented to get you started, you will need to implement additional functionality when requested to successfully complete the project. Note that the code included in visuals.py is meant to be used out-of-the-box and not intended for students to manipulate. If you are interested in how the visualizations are created in the notebook, please feel free to explore this Python file.

Run

In a terminal or command window, navigate to the top-level project directory boston_housing/ (the one that contains this README) and run one of the following commands:

ipython notebook boston_housing.ipynb

or

jupyter notebook boston_housing.ipynb

This will open the Jupyter Notebook software and project file in your browser.

Data

The modified Boston housing dataset consists of 489 data points, each with 3 features. It is a modified version of the Boston Housing dataset found on the UCI Machine Learning Repository.

Features

  1. RM: average number of rooms per dwelling
  2. LSTAT: percentage of population considered lower status
  3. PTRATIO: pupil-teacher ratio by town

Target Variable

  4. MEDV: median value of owner-occupied homes
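The split between features and target can be sketched with the standard library alone. This assumes that housing.csv has a header row using the column names listed above. The helper name load_housing is hypothetical, and the two sample rows are placeholders, not values from the dataset.

```python
import csv
import io

FEATURES = ["RM", "LSTAT", "PTRATIO"]
TARGET = "MEDV"


def load_housing(source):
    """Read CSV rows from an open text file; return (features, prices)."""
    reader = csv.DictReader(source)
    features, prices = [], []
    for row in reader:
        # One list of three floats per data point, in FEATURES order.
        features.append([float(row[name]) for name in FEATURES])
        prices.append(float(row[TARGET]))
    return features, prices


# Placeholder rows for illustration only; NOT taken from housing.csv.
sample = io.StringIO(
    "RM,LSTAT,PTRATIO,MEDV\n"
    "6.0,5.0,15.0,500000.0\n"
    "5.5,12.0,18.0,400000.0\n"
)
features, prices = load_housing(sample)
print(len(features), "rows,", len(features[0]), "features each")
```

With the real file, pass an open handle instead, for example open("housing.csv", newline=""). If the header assumption holds, this should return 489 rows.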
