Lucas A. Meyer
July 7, 2022
Following the spirit of “learning out loud”, I created several artifacts while I was learning about Quarto.
I created one main content file (source) that imports lots of other topic-based files, to create:
I also reused some of the content above in another Quarto file (source) to create:
The content below tells why I chose Quarto and how I use it with Python.
Content stuck in my computer is nearly worthless.
Literate programming can help create high-quality, reproducible, documented code.
Donald E. Knuth proposed literate programming in a 1984 article.
Jupyter implements the literate programming paradigm, but generating high-quality mass-consumable output (articles, websites) requires additional tools.
\usepackage{ifthen}, @for, @while
Quarto® is an open-source scientific and technical publishing system built on Pandoc.
The name “quarto” comes from the format of a book or pamphlet printed with eight pages of text, four to a side, then folded twice to produce four leaves.
The earliest known European printed book, the Sibyllenbuch (Gutenberg, c. 1452), was done in the quarto format. Shakespeare’s plays, too!
With Quarto, you can:
You can install Quarto on Linux, Windows and Mac.
About 75% of data scientists use Python through Jupyter notebooks.
With some scripting, you can use Pandoc on .ipynb files to generate papers, HTML, PowerPoint, etc.
You just need to learn Pandoc and shell scripting.
graph TD
    A[.ipynb] --> B(("Pandoc"))
    B ----> E[.doc]
    B ----> H[.pptx]
    B --> C[.md]
    B --> D[.tex]
    D --> F((Xetex))
    C --> I((Hugo))
    F --> G[.pdf]
    I --> J[.html]
    style B fill:#FF6655AA
    style F fill:#88ffFF
    style I fill:#88ffFF
All you need to use Quarto is to add some YAML (mostly simplified Pandoc configurations) to a .qmd file.
.ipynb + YAML = .qmd.
This keeps the configuration and content in the same file. You can then render the outputs using quarto render <file.qmd> in the command line.
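If you would rather trigger the render from a Python script than type the command yourself, something like the sketch below should work. It assumes the quarto executable is on your PATH. The helper names and the file name talk.qmd are made up for illustration; --to is the command-line option that selects a single output format.

```python
import shutil
import subprocess


def build_render_command(qmd_path, output_format=None):
    """Return the argument list for `quarto render`, optionally for one format."""
    command = ["quarto", "render", qmd_path]
    if output_format is not None:
        command += ["--to", output_format]
    return command


def render(qmd_path, output_format=None):
    """Run `quarto render`; raises CalledProcessError if the render fails."""
    if shutil.which("quarto") is None:
        raise RuntimeError("quarto executable not found on PATH")
    subprocess.run(build_render_command(qmd_path, output_format), check=True)


# Hypothetical file name, used only to show the resulting command line.
print(" ".join(build_render_command("talk.qmd", "pptx")))
```

Calling render("talk.qmd") without a format renders every format listed in the file's YAML.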
graph TD
    Q[.qmd] --> A
    subgraph Quarto
        A[.ipynb] --> B(("Pandoc"))
        B --> C[.md]
        B --> D[.tex]
        D --> F((Xetex))
        C --> I((Hugo))
        style B fill:#FF6655AA
        style F fill:#88ffFF
        style I fill:#88ffFF
    end
    B ----> E[.doc]
    B ----> H[.pptx]
    F --> G[.pdf]
    I --> J[.html]
Quarto .qmd files always start with a YAML front-matter.
The YAML configuration determines the output format of your document. A few popular output options are html, pptx, docx, and pdf.
You can use a single source file to generate multiple output types.
For example, the YAML on the right will generate a PowerPoint file and a Revealjs presentation.
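As a rough sketch of front-matter that requests both formats, the snippet below builds a tiny .qmd in Python and writes it to a temporary folder. The helper name, the file name, and the slide content are hypothetical; the value default simply asks Quarto for each format's default options.

```python
import tempfile
from pathlib import Path

# Front-matter listing two output formats under the same `format` key.
FRONT_MATTER = """\
---
title: "Quarto with Python"
format:
  pptx: default
  revealjs: default
---
"""


def build_qmd(body):
    """Return the text of a .qmd file: YAML front-matter first, Markdown after."""
    return FRONT_MATTER + "\n" + body


# Hypothetical slide content.
document = build_qmd("## Hello\n\n- One source, two outputs\n")

with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "two_formats.qmd"
    target.write_text(document, encoding="utf-8")
    print(target.read_text(encoding="utf-8"))
```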
Write content in Markdown.
Quarto’s Markdown supports figures, tables, bibliography, etc.
It also supports lots of extra features, like diagrams with mermaid and GraphViz, and even LaTeX equations:
\[ E = mc^2 \]
The best thing about Quarto is that you can use it to run any code that you would be able to run in a Python notebook.
You can use mermaid to create diagrams.
The diagrams in this and previous sections were created with mermaid.
flowchart TD
A[Hard] -->|Text| B(Round)
B --> C{Decision}
C -->|One| D[Result 1]
C -->|Two| E[Result 2]
This code runs the first simple regression in Wooldridge’s Econometrics book.
\(\text{wage} = \alpha + \beta_1 \text{educ} + \epsilon\)
| var | coef | s.e. | t | p-val |
|---|---|---|---|---|
| int | -0.9 | 0.68 | -1.32 | 0.19 |
| educ | 0.54 | 0.05 | 10.17 | 0 |
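import io

import pandas as pd
import statsmodels.formula.api as smf
from IPython.display import Markdown, display

# Load the data
df_wage = pd.read_csv("data/wage1.csv")

# Create an OLS model using R-style formula syntax
mod = smf.ols(formula="wage ~ educ", data=df_wage)

# Fit the model
res = mod.fit()

# Show the coefficient table (the second table of the summary) as Markdown.
# Wrapping the HTML in StringIO avoids the literal-HTML deprecation in newer pandas.
coef_html = res.summary().tables[1].as_html()
reg_table = pd.read_html(io.StringIO(coef_html), header=0)[0]
display(Markdown(reg_table.to_markdown(index=False)))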
To create slides, you create sections with #, titles with ##, and bullets with -.
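To make that structure concrete, here is a small sketch that turns a nested outline into slide Markdown. The function name and the sample outline are my own illustration, not part of Quarto.

```python
def outline_to_slides(outline):
    """Turn {section: {slide title: [bullets]}} into slide Markdown."""
    lines = []
    for section, slides in outline.items():
        lines.append(f"# {section}")  # "#" starts a new section
        lines.append("")
        for title, bullets in slides.items():
            lines.append(f"## {title}")  # "##" starts a new slide
            lines.append("")
            lines.extend(f"- {bullet}" for bullet in bullets)
            lines.append("")
    return "\n".join(lines)


# Hypothetical outline with one section and one slide.
slides_md = outline_to_slides({"Why Quarto": {"Outputs": ["html", "pptx"]}})
print(slides_md)
```

The resulting text can be pasted below the YAML front-matter of a .qmd file.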
To generate a presentation from a .qmd file, add format: pptx to the YAML front-matter.
Quarto will use the Pandoc PowerPoint rules to render the content from the .qmd into .pptx.
The “Pandoc rules” limit the flexibility to create PowerPoint presentations. Quarto has better presentation support for revealjs and beamer.
The rules are available at:
https://pandoc.org/MANUAL.html#powerpoint-layout-choice
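Pandoc picks the layout of each slide from its content. The title slide is built from the title and author in the YAML front-matter. A section header slide is created for each top-level heading (#). A two-column slide is used when the .qmd source contains :::: {.columns} and only text content. The previous slide is an example.
By adding a reference-doc entry to your YAML, you can tell Quarto (and Pandoc) to use a file as a template for the format of your presentation.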
The “Slide Master” needs to contain layouts named as per the previous slide (e.g. “Comparison”).
This gives you a lot of flexibility in the design of your slide deck, even if only for the small number of layouts listed in the previous slide.
You can control fonts, add background images, page numbering, etc.
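Before pointing reference-doc at a template, it can help to check that the file really contains the layouts Pandoc looks for. The sketch below uses only the standard library and relies on two assumptions: that a .pptx is a zip archive with one XML file per layout under ppt/slideLayouts/, and that the required names are the seven listed in the Pandoc manual. The helper names and template.pptx are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
import zipfile

# Layout names taken from the Pandoc manual (treat this list as an assumption).
REQUIRED_LAYOUTS = {
    "Title Slide", "Title and Content", "Section Header",
    "Two Content", "Comparison", "Content with Caption", "Blank",
}

# XML namespace used by PresentationML parts.
P_NS = "{http://schemas.openxmlformats.org/presentationml/2006/main}"
LAYOUT_PART = re.compile(r"ppt/slideLayouts/slideLayout\d+\.xml")


def layout_names(pptx):
    """Return the layout names found in a .pptx (path or binary file object)."""
    names = set()
    with zipfile.ZipFile(pptx) as archive:
        for member in archive.namelist():
            if LAYOUT_PART.fullmatch(member):
                root = ET.fromstring(archive.read(member))
                c_sld = root.find(f"{P_NS}cSld")  # the layout name lives here
                if c_sld is not None and c_sld.get("name"):
                    names.add(c_sld.get("name"))
    return names


def missing_layouts(pptx):
    """Return the required layout names that the template does not define."""
    return REQUIRED_LAYOUTS - layout_names(pptx)


# Usage with a hypothetical template file:
# print(missing_layouts("template.pptx"))
```

An empty set means the template defines every layout name in the list.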
Let’s say you’re presenting a project about population dynamics but you don’t know which world leaders are coming to the conference.
On the presentation day, you learn that Italy, China, Brazil, India, Japan and Nigeria are attending.
You can use Python or R to automatically generate slides.
The next slides/sections were generated using the code below:
import pandas as pd
from IPython.display import Markdown, display

# plot_dependency_ratio and plot_population_pyramid_series are plotting
# helpers that are not shown here.
df_dr = pd.read_csv("data/dr.csv.gz", compression="gzip")
df_pop = pd.read_csv("data/pop_brackets.csv.gz", compression="gzip")
years = [2000, 2025, 2050, 2075, 2100]
regions = ["Italy", "China", "Brazil", "India", "Japan", "Nigeria"]

for name in regions:
    # One slide per country: a title, then two side-by-side columns
    display(Markdown(f"## Age and Population Pyramids for {name}"))
    display(Markdown('<div class="columns">'))
    display(Markdown('<div class="column">'))
    plot_dependency_ratio(df_dr[df_dr.Location == name])
    display(Markdown('</div>'))
    display(Markdown('<div class="column">'))
    plot_population_pyramid_series(df_pop[df_pop["Location"] == name], years)
    display(Markdown('</div>'))
    display(Markdown('</div>'))
---
title: "Quarto with Python"
format: html
#  revealjs:
#    incremental: false
#    theme: [simple, revealjs-customizations.scss]
#    title-slide-attributes:
#      data-background-image: images/data-viz-bg.jpg
#      data-background-size: contain
#      data-background-position: right
author: Lucas A. Meyer
date: 2022-07-14
---
Adding or changing the format to html will create a website.
I reused some of the content of this presentation to create two scholarly-looking articles.
The purpose of the articles is just to show how easy it is to generate them with Quarto; they don’t contain original research.
Quarto adds cross-references, citations, and bibliography support to Markdown.
The relevant files are:
Citations don’t work in presentations, but are easy to add to articles.
You need to reference a BibTeX file in the YAML front-matter with bibliography: references.bib. Quarto supports any of the 8000+ Citation Style Language (CSL) styles and will generate the “References” section automatically.
You can cite by using [@citation-name] in your text. Please check the article .qmd source and the PDF and DOCX outputs.
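A cited key that is missing from the .bib file is easy to overlook, so a quick check can be useful. The sketch below is a rough approximation using regular expressions. It does not implement Pandoc's full citation grammar, and the function names, the citation keys, and the sample text are invented for the example.

```python
import re

# Bracketed spans that contain at least one "@", e.g. [@key; see also @other].
BRACKETED = re.compile(r"\[([^\[\]]*@[^\[\]]*)\]")
# A simplified citation key: letters, digits, "_" plus a few inner characters.
KEY = re.compile(r"@([A-Za-z0-9_][A-Za-z0-9_:.-]*)")
# The key of a BibTeX entry: "@type{key,".
BIB_ENTRY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def cited_keys(markdown_text):
    """Collect citation keys used inside square brackets."""
    keys = set()
    for span in BRACKETED.findall(markdown_text):
        keys.update(key.rstrip(".:-") for key in KEY.findall(span))
    return keys


def bib_keys(bibtex_text):
    """Collect the entry keys defined in a BibTeX file."""
    return set(BIB_ENTRY.findall(bibtex_text))


# Hypothetical article text and bibliography.
text = "OLS is standard [@wooldridge2019; see also @knuth1984, p. 3]."
bib = "@article{knuth1984,\n  title = {Literate Programming},\n  year = {1984}\n}\n"

missing = cited_keys(text) - bib_keys(bib)
print(sorted(missing))
```

Any key printed at the end is cited in the text but not defined in the bibliography.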
Generating footnotes is also easy. Using [^ref] links to a footnote, and [^ref]: content of the footnote generates its content. [1]
The Quarto guide has a great section on cross-references. I cover only the main points.
To create a cross-referenceable figure, section or equation, you need to tag it with its corresponding prefix, respectively “fig”, “sec” and “eq”.
To tag it, use the following syntax: #prefix-name.
You can also write books with Quarto. From the same collection of .qmd files, Quarto can generate:
Two recent examples are:
This is a free book, and you can see the Quarto (source) that generated it.
This is another free book, and you can see the Quarto (source) that generated it.
I think Quarto® is more helpful for a team that already uses Git with Python notebooks or LaTeX to write articles.
Microsoft Word collaboration through SharePoint and Teams is easier than Git and Quarto… but it’s not reproducible.
Quarto adds features to Python notebooks without taking anything away. You just need a few YAML lines.
Quarto allowed me to have a scriptable, Python-based blog. I wrote code to post new articles to Twitter and LinkedIn.
Great for RevealJS. For PPT, render process -> long edit cycle. Useful for:
.qmd files.
Quarto is under active development, and quickly approaching v1.0. While creating this content, I had to use some workarounds.
self-contained: true to YAML front-matter
I also created an R source file (copied directly from the Quarto tutorial) and added it to this project. Quarto does not allow the same file to run Python and R, but allows two different files. The result of the R source is here.
[1] You can use footnotes in presentations and websites, too. In PowerPoint, they appear in the appendix.