About This Notebook

My notebook is built on top of Github using Jekyll, a static site generator built by Tom Preston-Warner and others in the open source community. I have a few requirements for any system that will be my notebook:

  • The ability to write offline in Markdown, but publish the material in any format I need.
  • An ability to host the production version of the notebook on the web, open for anyone to read and reference.
  • A way to easily embed bibliographic references.
  • The ability to link to articles and posts within the notebook with persistent URLs and bibliographic references, since the notebook is a record of work before it appears in publication elsewhere.
  • The ability to render software code in a pleasing manner.

I've long been intrigued by Carl Boettiger's open science notebook, as well as Caleb McDaniel's open history notebook. And I'm also inspired by Lincoln Mullen's notebook and Shawn Graham's notebook. So, welcome to my notebook. Here's how things work.

Overview

My original notebook ran on Gitit, but remained local to my laptop rather than open on the web. I still use that while I'm working on my manuscript, though some of that material will be moved to this notebook as well -- especially as I start revising my dissertation into a book manuscript. While Gitit mostly hit the above requirements, there were a few thing about it I didn't like. I don't write in Haskell, and don't have any desire to learn the language, so I wanted a system built in a language I already knew so I could extend the platform as I needed. Since I've long been a user of Ruby and Jekyll, the platform made much more sense to me. I also never liked Gitit's default .page extension, preferring to keep my files as .md.

There are a few drawbacks to Jekyll. It doesn't have the same ease of wiki functionality built into Gitit, so internal linking isn't as clean as I'd like. I also cannot edit these files on the web as cleanly as I could with Gitit, although web editing on Github circumvents that to some degree.

Notes, Essays, Categories, Tags

Everything in the notebook is a _note, apart from some of the descriptive pages I've put together (like this page). Everything here is a Markdown file contained in _note.

At a conceptual level, I distinguish from "essays" and "notes." A note is any page in a notebook for recording something useful---reading notes, archive notes, primary sources notes, and so on, but a note cannot stand alone without extra context. So, context for notes live inside of a project that those notes relate to. Each of this metadata is contained in category in Jekyll's YAML, providing me a way to break down notes into units. Notes often appear in more than one category. For example, a note might look like:

---
layout: post
title: "white1995misfortune"
tags: [american west, dissertation, survey]
categories:
- Readings
---

Toward the bottom of the YAML header, the note belongs to the categories Readings, relating the note to categories that I can then sort by and parse. These categories are used in the notebook itself to build lists from project pages.

Essays, in contrast, are related to specific projects or collections of notes, meant for me to make things more comprehensible or extend the context around historical information. For example, an essay might include the following metadata:

---
layout: essay
title: "Urbanization and the American West"
tags: [urban history, urbanization, suburbanization, american west, dissertation]
categories:
- Essays
---

The essay isn't related to a specific project necessarily. The category Essays will put it in a list of essays and general writings in the notebook. I also give the essay a different layout, which has a few minor UI tweaks from the post format.

Each post also has a series of tags which I often use for grouping together by topic or theme. Categories are a higher level than tags, meant for grouping posts together more by their genre than their thematic or topical similarity. Categories are a controlled vocabulary and consist of a smaller set of words than tagging. Right now, the available categories are:

  • Readings
  • Essays
  • Archive
  • Administrative

Documentation

There are a few scripts to make it easy to create new notes and essays. Most of these are contained in the Rakefile. The task newnote copies a template Markdown file with proper YAML header information and date stamp to the _md directory. The task newessay does the same thing. The task prep copies over the _md files into _note with the appropriate date appended to the filename.

Additionally, since I'm using the R language more and more for visualization and data manipulation/tidying, I've started using a script called KnitPost that reads files inside the _Rmd directory and converts them into Jekyll posts. This also preserves all of the R markup, figures, charts, and tables that come along with the document.

These scripts help keep categories, tags, and layout correct, and automate most of the maintenance I would otherwise be doing by hand.

Deployment

When I'm ready to push notes or edits to the live site, my Rakefile does the following:

  • Files in _md and _Rmd are copied into _note with the appropriate filenames.
  • Changes to the local source are checked into git.
  • The local sources are pushed to Github to maintain the record of each change.
  • Jekyll builds a new version of the static website from the local source into a local _site directory, which contains the notebook.
  • I use rsync on the _site directory to then deploy the notebook to the web, building the open access version of the notebook.

This takes around sixty seconds for the entire process to happen. It's a little slow at times, but by and large doesn't hamper my workflow.

Details

The notebook uses Jekyll at its most recent release. I only modify Jekyll by adding plugins. In particular, it's important to make sure the Jekyll Scholar dependencies are updated along with any Jekyll system updates. I also use Bootstrap to handle most of the CSS, icons, and layout of the notebook, with some heavy tweaks inside the main.css file. These are pre-processed sass files.

I take advantage of several plugins:

  • markdown: I want to occasionally include Markdown files from the _includes directory, but using liquid's include expects .html. The plugin adds a new liquid controller called markdown that allows me to pass these files along and have them markdownified. I use this in particular for the front page.
  • Jekyll-Scholar: I use this to take advantage of BibTeX files for referencing articles, books, and other materials.
  • git_modified: A plugin borrowed from Carl Boettiger for looking at git modification time for a file, and using that to set a Liquid variable of last modification time. I use this in the metadata of posts and essays for indicating difference between when a file was originally written and last modified.

Pandoc

I have replaced Jekyll's default Markdown parser with Pandoc, because Pandoc is awesome. It's a document utility written in Haskell for translating Markdown.

I have templates for writing in RMarkdown with embedded R code, using Knitr to produce plain Markdown with source code annotations. These are useful as reproducable research within the notebook and easy to transition into print publications. I blame Lincoln for all this R stuff.

Summary

Everything here is open source. The source for the notebook is on Github. You're welcome to use any of this stuff. Take the notebook and modify it to your needs; feel free to use the research material contained within; or cite material in the notebook itself in your own work. You can probably get this notebook up and running for yourself in about an hour. I'd love a link or credit if you do, but it's not necessary.