July 20, 2020

Learn AWK with Emacs


This post describes my workflow for learning the AWK programming language using Emacs, Org mode and Org-drill. The workflow may work for other programming languages as well. The post won't cover the details of how to use Emacs, Org mode or Org-drill. Its intention is to provide a general overview of the workflow.

A git repo is available with flash cards, note book and some text files here.


I just read a few chapters of the book "The AWK Programming Language" and found AWK to be awesome, and I want to learn more of the language. I began to think about my approach for learning AWK. My goal is to be efficient with AWK in smaller scripts and one-liners. AWK is Turing complete and can be used for larger applications, but if I write larger applications my weapon of choice is a functional Lisp with immutable data structures. I see AWK as a natural tool in my toolbox and the perfect language for filtering logs, searching for patterns in files, reducing a couple of files, transforming my accounting data or doing other things related to text with some kind of structure. And in combination with programs like find, sed, grep and sort, AWK can be super powerful and efficient. This is where AWK shines for me.

One reason why I started to think about my learning process is that I feel I don't have the time to learn a new language in one go and keep the knowledge fresh. I wanna progress slowly and over time. If I forget something or am kept away from AWK for a while, I wanna get back on track quickly without needing to read a book again or consult too much documentation. This is of course not completely possible, but if I can keep the most essential parts of AWK in my second brain I am more than happy.

What I came up with is a workflow that combines Org-drill and Org mode source blocks with a set of provided text files. Org-drill is an Emacs mode that uses a spaced repetition algorithm together with flash cards: it shows you a question and then the answer. The questions and answers are written by me and are most often based around "famous" text files like an Nginx access log, /etc/passwd or .ledger files.

A typical workflow for filling my source with questions and answers is to read a chapter in a book or a documentation page. If I find something that I think is useful, I write one or more flash cards. This way I can grow my knowledge source, and when I have some spare time I can run the learning workflow.

The learning workflow looks like this:

  1. Open 2 Emacs sessions
  2. Open the file containing the flash cards in one session
  3. Open a note book file in the other
  4. Start org-drill in the flash cards session
  5. Write your answers in the note book session

Together with the note book file and the flash cards come a couple of text files. The text files are examples of the "famous" files. The note book file contains an Org mode source block for each of the files, and each question states which source block to use for that particular question.

A source block inside the note book file looks like this.

#+BEGIN_SRC awk :results output code :in-file ./text-files/access.log
  { print }
#+END_SRC

It says that the language to be evaluated is AWK and that the results should be inserted in the note book document just beneath the source block. The file that your AWK program reads as input is given in the :in-file argument. If C-c C-c is executed on this source block, it inserts the entire content of ./text-files/access.log into the note book document. If an error occurs, the error message is inserted into the document.
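
Outside Emacs, evaluating that block is roughly equivalent to running awk with the program as its first argument and the :in-file as its input. Here is a minimal sketch, assuming a hypothetical two-line sample log written to a temporary file in place of ./text-files/access.log:

```shell
# Hypothetical sample standing in for ./text-files/access.log
log=$(mktemp)
printf '%s\n' \
  '192.0.2.1 - - [20/Jul/2020:10:00:00 +0000] "GET / HTTP/1.1" 200 612' \
  '192.0.2.2 - - [20/Jul/2020:10:00:01 +0000] "GET /about HTTP/1.1" 200 512' > "$log"

# { print } has no pattern, so the action runs for every input line
# and prints that line unchanged.
output=$(awk '{ print }' "$log")
printf '%s\n' "$output"
rm -f "$log"
```

The two sample lines come back unchanged, which mirrors what C-c C-c inserts beneath the source block.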

Below is an example of how it can look when you develop your answer. You write your AWK answer and use C-c C-c to see the result. When you are satisfied and think you have the correct answer, tell that to the org-drill session and it will show you the correct answer. You grade yourself by telling org-drill how well you did on a scale from 0 to 5. Depending on how you rate your answer, org-drill sets a date for the next time the question should appear.

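#+BEGIN_SRC awk :results output code :in-file ./text-files/access.log
  # Count requests per client IP (first field of each log line)
  { ips[$1] += 1 }
  # Print each IP with its count, sorted numerically by count, highest first
  END { for (i in ips) { print i, ips[i] | "sort -k 2,2nr" } }
#+END_SRC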

#+RESULTS:
#+begin_src awk
8
7
1
1
1
#+end_src
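
The program above can also be tried directly from a shell. The sketch below assumes a hypothetical six-line sample log (only the first field, the client IP, matters) and pipes awk's output to sort from the shell instead of from inside the END block. The `n` flag makes sort compare the counts as numbers, so a count of 10 sorts above a count of 8.

```shell
# Hypothetical sample log; the IPs are documentation addresses, not real data
log=$(mktemp)
printf '%s\n' \
  '192.0.2.1 - - "GET / HTTP/1.1" 200' \
  '192.0.2.2 - - "GET /a HTTP/1.1" 200' \
  '192.0.2.1 - - "GET /b HTTP/1.1" 200' \
  '192.0.2.1 - - "GET /c HTTP/1.1" 404' \
  '192.0.2.3 - - "GET / HTTP/1.1" 200' \
  '192.0.2.2 - - "GET /d HTTP/1.1" 200' > "$log"

# Tally requests per IP in an associative array, print "ip count" pairs,
# then sort on the second field numerically in descending order.
counts=$(awk '
  { ips[$1] += 1 }
  END { for (i in ips) print i, ips[i] }
' "$log" | sort -k 2,2nr)
printf '%s\n' "$counts"
rm -f "$log"
```

With this sample the output lists 192.0.2.1 with 3 requests, then 192.0.2.2 with 2, then 192.0.2.3 with 1.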

Below is an example of how a flash card looks. If you know the basics of Org mode, this structure will feel very familiar. One great benefit is that it's just text. No database or weird stuff. Just plain text that is easy to search and easy to version control.

** Print line 7 to 10 of file                                     :awk:drill:

   How do you print line 7 to 10 with a range pattern?

   :in-file ./text-files/ledger.ledger

*** Answer

    #+BEGIN_SRC awk :results output code :in-file ./text-files/ledger.ledger

    #+END_SRC

    #+RESULTS:
    #+begin_src awk
    2020-04-17 * Dustin. Köp av tangentbord.
       ;; ~/company/incomming-invoices/2020/2020-04-17-lenovo-keyboard-swedbank-transaction.pdf
       2:4:4:0                                   667.00 SEK
       2:0:1:8                                  -667.00 SEK
    #+end_src
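
An AWK range pattern has the form `start, end`: it begins matching at the first line where `start` is true and keeps matching through the line where `end` is true. One way to answer the card's question, offered as a sketch and not necessarily the card's own answer, is to test the built-in line counter NR at both ends. The input here is a hypothetical 12-line file generated on the spot, not ledger.ledger.

```shell
# Hypothetical input: twelve lines reading "line 1" through "line 12"
input=$(mktemp)
awk 'BEGIN { for (n = 1; n <= 12; n++) print "line " n }' > "$input"

# Range pattern: start at the line where NR == 7, end at the line where NR == 10.
# No action is given, so awk falls back to printing each matching line.
selected=$(awk 'NR == 7, NR == 10' "$input")
printf '%s\n' "$selected"
rm -f "$input"
```

This prints "line 7" through "line 10", four lines in total, matching the four-line ledger entry the card shows as its result.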

This kind of setup for learning languages feels very appealing to me. I think AWK fits especially well, as it's possible to have predefined text files as input and AWK programs can be short and efficient. As I write this post I have just finished the workflow and got all of the pieces together. Now it's time to fill my second brain with good questions and answers and continue my path to becoming a happy AWK user. If you wanna help me optimize any of the workflows or contribute good flash cards, I am more than happy to receive pull requests via GitHub or feedback via Twitter.

Here are a couple of resources related to the post:

Here is a gif showing the learning workflow.


An MP4 video with better quality can be found here: http://jherrlin.github.io/learn-awk.mp4 (11M).
