1 Linear Regression
1.1 A simple linear regression
To get us started, we run Example 2.4 from Wooldridge (2015), where we find the relationship between wage (wage) and years of education (educ).
Our model is, then:
\[ \text{wage} = \beta_0 + \beta_1 \times \text{educ} + \epsilon \]
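For reference, the ordinary least squares estimates of a model with a single regressor have a standard closed form. In the sketch below, \(x_i\) and \(y_i\) stand for the observed values of educ and wage, \(\bar{x}\) and \(\bar{y}\) for their sample means, and \(n\) for the number of observations; this notation is introduced here for illustration and does not appear in the original example.

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]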
We run the statsmodels code below to estimate \(\beta_0\) and \(\beta_1\).
import pandas as pd
import statsmodels.formula.api as smf
from markdownify import markdownify as md
from IPython.display import display, Markdown

# Load the data
df_wage = pd.read_csv("data/wage1.csv")

# Create an OLS model using the R-style formula syntax - assumes an intercept
mod = smf.ols(formula="wage ~ educ", data=df_wage)

# Fit the model
res = mod.fit()

# Show the coefficient table (the second table of the summary)
display(Markdown(md(res.summary().tables[1].as_html())))
 | coef | std err | t | P>|t| | [0.025 | 0.975] |
Intercept | -0.9049 | 0.685 | -1.321 | 0.187 | -2.250 | 0.441 |
educ | 0.5414 | 0.053 | 10.167 | 0.000 | 0.437 | 0.646 |
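To read the table, it can help to evaluate the fitted line by hand. The sketch below hard-codes the two coefficient estimates reported above and computes the fitted wage at a few education levels. The function name `predict_wage` and the education values are illustrative assumptions made for this sketch, not part of the original example.

```python
# Coefficient estimates copied from the regression table above
INTERCEPT = -0.9049
SLOPE_EDUC = 0.5414


def predict_wage(educ_years: float) -> float:
    """Return the fitted wage for a given number of years of education."""
    return INTERCEPT + SLOPE_EDUC * educ_years


# Illustrative education levels (chosen for this sketch only)
for years in (8, 12, 16):
    print(f"educ = {years:2d} -> fitted wage = {predict_wage(years):.2f}")

# One extra year of education changes the fitted wage by the slope
one_year_effect = predict_wage(13) - predict_wage(12)
print(f"change for one extra year of education: {one_year_effect:.4f}")
```

The slope is the quantity of interest here: each additional year of education is associated with a fitted wage that is higher by about 0.54. The intercept alone implies a negative fitted wage at zero years of education, so it is best read as the anchor of the line, not as a prediction.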