Learning Econometrics with Python
Preface
This is not a real book, at least not yet. This is my way of studying Econometrics with Python: I’m writing about it as if it were a book. Maybe, through writing it openly, I’m going to figure out that there’s really enough material for a real book here. For now, there isn’t - this is just me, learning. To streamline my learning, I’m writing this using the Quarto publishing platform, and attempting to learn two three things at once: getting better at Econometrics, Python and Quarto.
Motivation
For most of my life, I have done applied Econometrics using Stata and R. These are great options. In particular, they have a great support community (R more than Stata), and good quality control (Stata more than R).
Doing econometric analysis in Stata is extremely natural (for an Economist, at least). Really complex models are just a quick command away. In 2010, this was not the case for R. The main textbook to learn how to use R for econometric analysis was “Applied Econometrics with R” (Kleiber and Zeileis 2008), and this was before the tidyverse. There were not that many packages, and sometimes the results of the packages didn’t match the results of established commercial products, like Stata, SAS and EViews.
Since then, the R ecosystem evolved a lot, and my guess is that now most Economists start their careers with R. The commonly used Econometrics book “Introductory Econometrics: A Moderrn Aproach | 7th Edition” (Wooldridge 2015) comes with files in many formats: Eviews, Stata, Excel, Minitab, and recently, R. Of those, R has the tremendous advantage to be free.
What I couldn’t find yet was a good text teaching how to do applied econometric analysis using Python.
Why do Econometrics with Python
The reality is that you don’t usually need Python to do econometric analysis. The existing toolset (SAS, R, Stata) is already sufficient. Most Economists will rarely need to write general-purpose code or train machine learning models, and to the extent that they do, the existin toolset can probably support them.
There’s a specific class of practicioners that do econometric analysis, write general-purpose code and train machine learning models: financial analysts. Most people don’t even know that the original use case for pandas
was doing finance at AQR capital. Although there doesn’t seem to be a book teaching how to do applied econometrics using Python, such practitioners can easily make do by becoming accomplished in multiple languages and platforms, for example doing their econometric analysis in R and training a machine learning model in Python. That is, however, mentally expensive. In addition, similar to the environment for R in the mid-2000s, packages to do econometrics for Python were few and far between until recently. That’s not the case anymore.
What this “book” is, and what this isn’t
I’m writing this book “in the open”. This work isn’t (yet) going to teach you any econometric theory, and for several months (or perhaps years), this may be no more than a collection of demonstrations of how to solve econometric problems that can be found in textbooks, but using Python.
This isn’t intended to teach anyone else other than me (at least, not yet).
However, who knows what the future holds?