How to Run Linear Regression in Python
Scikit-learn
Scikit-learn is a powerful Python module for machine learning. It contains functions for regression, classification, clustering, model selection and dimensionality reduction. Today, I will explore the sklearn.linear_model module, which contains “methods intended for regression in which the target value is expected to be a linear combination of the input variables”.
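To make the "linear combination" idea concrete, here is a minimal sketch using LinearRegression from sklearn.linear_model. The data is synthetic and noise-free, which is my own assumption for illustration and not part of the Boston data set: the target is built as 1 + 2*x1 + 3*x2, so the fitted model should recover those weights.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic inputs (assumption for illustration): 100 rows, 2 features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))

# Target is an exact linear combination of the inputs plus an intercept
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Fit ordinary least squares
model = LinearRegression()
model.fit(X, y)

print(model.coef_)       # estimated weights, close to 2 and 3
print(model.intercept_)  # estimated intercept, close to 1
```

With real data the relationship is never exactly linear, so the coefficients are only estimates of how each feature contributes to the target.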
Exploring Boston Housing Data Set
The first step is to import the required Python libraries into the IPython Notebook.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as stats
import sklearn
This data set is available in the sklearn Python module, so I will access it through scikit-learn. I am going to import the Boston data set into the IPython Notebook and store it in a variable called boston.
# Note: load_boston was removed in scikit-learn 1.2, so this needs an older version.
from sklearn.datasets import load_boston
boston = load_boston()
I will look at the description of this data set to learn more about it. The data set has 506 instances (rows) and 13 attributes, or parameters (columns). The goal of this exercise is to predict housing prices in the Boston region using the given features.
I am going to convert boston.data into a pandas data frame.
bos = pd.DataFrame(boston.data)
bos.head()
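A data frame built from a bare array gets numbered columns (0 to 12), which are hard to read. The sketch below shows one way to label them by passing column names to pd.DataFrame. To keep it runnable without the data set, the array here is a random stand-in with the same 506 x 13 shape, which is an assumption of mine and not the real Boston values. The names are the 13 feature names of the Boston housing data set.

```python
import numpy as np
import pandas as pd

# The 13 feature names of the Boston housing data set
feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# Stand-in for boston.data (assumption): random values, same shape as the real array
data = np.random.default_rng(0).random((506, 13))

# Build the data frame with labelled columns
bos = pd.DataFrame(data, columns=feature_names)

print(bos.shape)   # number of rows and columns
print(bos.head())  # first five rows
```

With the real data set loaded, the same call would take boston.data in place of the stand-in array.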