Link: Linear regression Python
Fit training data in LM model
Check coefficient of X
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
lm.fit(x_train, y_train)
pd.DataFrame(lm.coef_, x.columns, columns=['Coefficient'])
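As a minimal, self-contained sketch of the fit-and-inspect step above: the data here is synthetic (generated from known coefficients 3 and -2 plus an intercept of 5), and the column names `feature_a` and `feature_b` are made up for illustration, so the recovered coefficients can be compared against the true ones.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 3*a - 2*b + 5 + small noise
rng = np.random.default_rng(0)
x = pd.DataFrame(rng.normal(size=(200, 2)), columns=["feature_a", "feature_b"])
y = 3 * x["feature_a"] - 2 * x["feature_b"] + 5 + rng.normal(scale=0.1, size=200)

lm = LinearRegression()
lm.fit(x, y)

# One row per feature, showing the fitted coefficient
coef_table = pd.DataFrame(lm.coef_, index=x.columns, columns=["Coefficient"])
print(coef_table)
print("Intercept:", lm.intercept_)
```

Each coefficient is the expected change in Y for a one-unit increase in that feature, holding the other features fixed.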
Real data (Boston housing)
from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; needs an older version
boston = load_boston()
Predict on test data
Predict Y using the test data
predictions = lm.predict(x_test)
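These notes use `x_train`, `x_test`, `y_train` and `y_test` without showing where they come from. One common way to produce them is `train_test_split`. The sketch below assumes synthetic data and a 70/30 split, neither of which comes from the original notes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (assumed for illustration)
rng = np.random.default_rng(42)
x = rng.normal(size=(100, 3))
y = x @ np.array([1.5, -0.5, 2.0]) + rng.normal(scale=0.2, size=100)

# Hold out 30% of the rows for testing; the split ratio is an assumption
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=101
)

lm = LinearRegression()
lm.fit(x_train, y_train)          # fit on the training rows only

predictions = lm.predict(x_test)  # predict on rows the model has not seen
print(predictions.shape)
```

Fitting on one subset and predicting on the other gives an estimate of how the model behaves on unseen data.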
Quickly visualize the predictions vs the actual: scatterplot
plt.scatter(y_test, predictions)
Use a distribution plot to visualize the residuals. Residuals that are roughly normally distributed and centered on zero are the ideal - they suggest a linear model is a reasonable choice for the data.
sns.distplot((y_test - predictions), bins=50)  # deprecated since seaborn 0.11; sns.histplot is the replacement
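To illustrate the residual check without relying on a plot, the sketch below bins the residuals with `np.histogram` and prints summary statistics. The data is synthetic with Gaussian noise (an assumption), so the residuals should be centered near zero and roughly symmetric. Real data will generally be less tidy.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic linear data with Gaussian noise (assumed for illustration)
rng = np.random.default_rng(1)
x = rng.normal(size=(1000, 2))
y = x @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=1000)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)
lm = LinearRegression().fit(x_train, y_train)

residuals = y_test - lm.predict(x_test)

# The same kind of binning a histogram plot would use, as raw counts
counts, bin_edges = np.histogram(residuals, bins=50)

print("mean residual:", residuals.mean())
print("std of residuals:", residuals.std())
print("tallest bin starts at:", bin_edges[counts.argmax()])
```

A histogram with a long tail on one side, or with several peaks, would hint that the linear model is missing some structure in the data.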
Evaluation Metrics
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
from sklearn import metrics
# MAE
metrics.mean_absolute_error(y_test, predictions)
# MSE
metrics.mean_squared_error(y_test, predictions)
# RMSE
np.sqrt(metrics.mean_squared_error(y_test, predictions))
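To make the three metrics concrete, the sketch below computes them by hand with NumPy on a small pair of made-up arrays (the values are illustrative, not from any dataset) and prints the `sklearn.metrics` results alongside for comparison.

```python
import numpy as np
from sklearn import metrics

# Made-up values for illustration
y_test = np.array([3.0, 5.0, 2.5, 7.0])
predictions = np.array([2.5, 5.0, 4.0, 8.0])

errors = y_test - predictions

mae = np.mean(np.abs(errors))   # average absolute error
mse = np.mean(errors ** 2)      # average squared error; penalizes large errors more
rmse = np.sqrt(mse)             # square root of MSE; same units as Y

print("by hand:", mae, mse, rmse)
print("sklearn:",
      metrics.mean_absolute_error(y_test, predictions),
      metrics.mean_squared_error(y_test, predictions),
      np.sqrt(metrics.mean_squared_error(y_test, predictions)))
```

All three are loss measures, so lower is better. RMSE is often the easiest to interpret because it is expressed in the units of Y.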
Evaluate residuals
Evaluate the residuals (the differences between the actual Y and the predicted Y) using a histogram.
sns.displot((y_test - predictions), bins=50)