Link: Linear regression Python

Fit LM model on training data

Check coefficient of X

import pandas as pd
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
lm.fit(x_train, y_train)
pd.DataFrame(lm.coef_, x.columns, columns=['Coefficient'])
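The fit-and-inspect step above can be sketched end to end on made-up data. Everything about the data here is an assumption for illustration: the column names x1 and x2, the true relationship y = 3*x1 - 2*x2 + 5, the noise level, and the 70/30 split.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative assumption): y = 3*x1 - 2*x2 + 5 + small noise
rng = np.random.default_rng(0)
x = pd.DataFrame(rng.normal(size=(200, 2)), columns=["x1", "x2"])
y = 3 * x["x1"] - 2 * x["x2"] + 5 + rng.normal(scale=0.1, size=200)

# Hold out 30% of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

# Fit on the training rows only
lm = LinearRegression()
lm.fit(x_train, y_train)

# One row per feature; the fitted values should land close to 3 and -2
coef_table = pd.DataFrame(lm.coef_, x.columns, columns=["Coefficient"])
print(coef_table)
print("Intercept:", lm.intercept_)
```

Each coefficient reads as the expected change in Y for a one-unit increase in that feature, holding the other features fixed.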

Real data (Boston housing)

# Note: load_boston was removed in scikit-learn 1.2, so this needs an older version
from sklearn.datasets import load_boston
boston = load_boston()
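On scikit-learn versions where load_boston is not available, another bundled regression dataset can stand in for practice. The sketch below uses load_diabetes, which is a substitution and not the Boston data these notes refer to; the variable names x and y are chosen to match the snippets above.

```python
import pandas as pd
from sklearn.datasets import load_diabetes

# load_diabetes is a stand-in dataset (assumption), not the Boston housing data
data = load_diabetes()

# Features as a DataFrame so that x.columns works in the coefficient table
x = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target, name="target")

print(x.shape)
print(list(x.columns))
```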

Predict on test data

Predict Y using the test data

predictions = lm.predict(x_test)

Quickly visualize the predictions vs the actual: scatterplot

import matplotlib.pyplot as plt
plt.scatter(y_test, predictions)

Use a distribution plot to visualize the residuals. Residuals that look roughly normal and centered on zero are ideal - they suggest a linear model is a reasonable choice for the data.

import seaborn as sns
sns.distplot((y_test - predictions), bins=50)

Evaluation Metrics

  • Mean Absolute Error (MAE)
  • Mean Squared Error (MSE)
  • Root Mean Squared Error (RMSE)

import numpy as np
from sklearn import metrics
# MAE
metrics.mean_absolute_error(y_test, predictions)
# MSE
metrics.mean_squared_error(y_test, predictions)
# RMSE
np.sqrt(metrics.mean_squared_error(y_test, predictions))
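To make the three metrics concrete, here is a small hand computation in plain Python. The four actual and predicted values are made up for illustration; the formulas are the standard definitions of MAE, MSE and RMSE.

```python
import math

# Made-up values for illustration
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

# Residuals: actual minus predicted -> 1, 0, -2, -1
errors = [t - p for t, p in zip(y_true, y_pred)]

# MAE: average of the absolute errors -> (1 + 0 + 2 + 1) / 4
mae = sum(abs(e) for e in errors) / len(errors)

# MSE: average of the squared errors -> (1 + 0 + 4 + 1) / 4
mse = sum(e * e for e in errors) / len(errors)

# RMSE: square root of the MSE, back in the units of Y
rmse = math.sqrt(mse)

print(mae, mse, rmse)
```

MAE is 1.0 and MSE is 1.5 here. MSE penalizes the single error of size 2 more heavily than MAE does, and RMSE (about 1.22) brings that back to the same units as Y.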

Evaluate residuals

Evaluate the residuals (the differences between the actual Y and the predicted Y) using a histogram.

sns.displot((y_test - predictions), bins=50)
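The same check can be done numerically, without a plot. This is a minimal sketch on made-up numbers: the arrays y_test and predictions below are synthetic stand-ins, not output from the model above, and np.histogram is used in place of the seaborn plot.

```python
import numpy as np

# Synthetic stand-ins (assumption): actual values and noisy predictions
rng = np.random.default_rng(0)
y_test = rng.normal(loc=20.0, scale=5.0, size=500)
predictions = y_test + rng.normal(scale=2.0, size=500)

# Residuals: actual minus predicted
residuals = y_test - predictions

# Same 50 bins as the plot, returned as counts and bin edges
counts, edges = np.histogram(residuals, bins=50)

print("mean residual:", residuals.mean())
print("std of residuals:", residuals.std())
print("most populated bin starts at:", edges[counts.argmax()])
```

A mean residual near zero and counts that fall off on both sides of the central bins are what the histogram would show visually.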