Codex
-
1. Introduction
-
2. Problems
- 3. Tests
-
4. Literate Programming Build System-
4.1. Weaving (generating the docs)- 4.1.1. Use
lilac.theme
file -
4.1.2. Custom CSS and HTML <head> content
-
4.1.3. Ignore woven HTML fromgit diff
- 4.1.4. gitignore
- 4.1.1. Use
- 4.2. Tangling (generating the source code)
-
4.3. Linting
-
4.4. Development environment (Nix shell)
- 4.5. Elisp
-
- 5. References
1. Introduction
This project aims to capture my notes for studying various programming problems
(data structures and algorithms). Have a look at the problems in Section 2.
For those curious about how these notes were created, check out Section 4.
As for why this project is named Codex, I just think it's a cool word that has
the word "code" in it as a substring.
1.1. How to read this document
Please do not use the GitHub-rendered view of this file, as many things like
links and citations simply do not work. Instead go to https://funloop.org/codex
for the best experience, on a desktop or laptop screen (mobile devices don't
work very well, not to mention the missing Table of Contents sidebar).
2. Problems
Every problem gets its own README.org
file in its own subfolder. The solutions
are all in Python. All solutions are "standalone" in that none of them use any
libraries other than what's provided by Python's standard libraries.
The problems are drawn mainly from [1]. Other reference materials are
cited where applicable. Below is a table of every problem, with tags that give a
brief description of each one, and references.
Name
Tags
References
Parity
bitwise
[1, p. 27], [2, p. 96]
Rectangle overlap
geometry
[1, p. 39]
Rearrange list
array, partitioning
[1, p. 41]
Dutch national flag
array, partitioning
[1, p. 43], [3, p. 200], [4, pp. 296β301], [5, p. 123], [6, p. 186], [7, pp. 125β126]
Buy and sell stock once
array
[1, p. 51], [6, p. 69]
Buy and sell stock twice
array
[1, p. 53]
Interconvert strings and integers
array, string
[1, p. 75]
Merge sorted linked lists
linked list
[1, p. 92]
Height-balanced binary trees
binary tree
[1, p. 124]
Merge monotonically increasing streams
priority queue
[1, p. 144]
Find first occurrence from sorted array
binary search
[1, p. 155]
Anonymous message from magazine
hash table
[1, p. 175]
Intersection of two sorted arrays
sorting
[1, p. 194]
Validate binary tree as BST
binary tree
[1, p. 213]
Tower of Hanoi
recursion, binary tree
[1, p. 233]
Maximum Subarray
divide and conquer, dynamic programming
[1, p. 250], [5, p. 77]
Making Change
dynamic programming
[1, p. 253], [8, Sec. 1.2.9]]
2.1. Data structures
Some problems can only be solved in an elegant way if we use particular data
structures. See below for introductory discussions about some of these.
Name
Tags
References
Linked list
linked list
[1, p. 91]
Stack (with "max" method)
stack
[1, p. 106]
Binary tree
tree
[1, p. 123]
Binary search tree
tree
[4, p. 396]
Heap
tree, heap, priority queue
[4, p. 308]
2.2. Appendix
- Python tricks
- There are some Python-language-specific tricks available for
programming problems. You might want to skim over this if your Python skills
are rusty.
3. Tests
Dependency
Why
Ruff
for linting
Mypy
for enforcing type hints
Hypothesis
for property-based tests
All solutions to the problems are implemented in Python, and tested with basic
unit tests and the Hyothesis property-based testing framework. Each problem's
discussion comes with its own test suite. All source code samples are linted as
well with ruff and mypy. Testing has been extremely valuable in checking the
correctness of the puzzle solutions collected in this work.
4. Literate Programming Build System
We use Lilac for literate programming. To build everything just run make
to drop inside the development environment shell then run
shellmake
inside
it.
4.1. Weaving (generating the docs)
PROJ_ROOT := $(shell git rev-parse --show-toplevel)
LILAC_ROOT := $(PROJ_ROOT)/deps/elisp/lilac
PROCS := $(shell nproc)
define run_emacs
LILAC_ROOT=$(LILAC_ROOT) emacs $(2) --quick --batch --kill \
--load $(LILAC_ROOT)/lilac.el \
--load $(PROJ_ROOT)/codex.el \
--eval="$(1)"
endef
define run_emacs_nobatch
LILAC_ROOT=$(LILAC_ROOT) emacs $(2) --quick --kill \
--load $(LILAC_ROOT)/lilac.el \
--load $(PROJ_ROOT)/codex.el \
--eval="$(1)"
endef
src_problem = $(shell find problem/ -type f -name 'README.org')
woven_html = \
$(patsubst \
problem/%/README.org, \
problem/%/README.html, \
$(src_problem)) \
appendix/mathematics/README.html \
appendix/python_tricks/README.html \
README.html
problem_dirs = $(shell find problem -maxdepth 1 -type d | sort | tail -n+2)
problem_dirs_without_prefix = $(subst problem/,,$(problem_dirs))
test_dirs = $(shell find . -type f -name '__init__.py' \
| sed 's|/__init__.py||')
# All problem dirs with Tikz-generated images.
image_dirs = $(shell find . -type f -name 'images.org' \
| sed -e 's|/[^/]*$$||' -e 's|^problem/||' | sort -u)
all_tests_verified = $(patsubst %.py, %.py.verified, \
$(shell find . -type f -name 'test.py'))define weave_org
$(1): $(2) build-literate.org \
$(shell find $(3) -type f -name 'images.org' \
| xargs grep 'img.pdf' \
| sed \
-e 's|images.org:#+header: :file ||' \
-e 's|.img.pdf|.svg|')
@echo weaving $(2)
$(call run_emacs,(lilac-publish),$(2))endef
all: check.verified weave
.PHONY: allweave: $(woven_html)
touch weaveREADME.html: build-literate.org README.org citations.bib
$(call run_emacs,(lilac-publish),README.org)$(foreach d,$(problem_dirs_without_prefix),\
$(eval $(call weave_org,\
problem/$(d)/README.html,\
problem/$(d)/README.org,problem)))appendix/mathematics/README.html: appendix/mathematics/README.org \
appendix/mathematics/twos-complement.org
$(call run_emacs,(lilac-publish),appendix/mathematics/README.org)appendix/python_tricks/README.html: appendix/python_tricks/README.org
$(call run_emacs,(lilac-publish),appendix/python_tricks/README.org)check.verified: lint.verified test
touch check.verifiedtest: $(all_tests_verified)
touch testlint.verified: \
mypy.verified \
ruff.verified \
spellcheck.verified \
linelength.verified \
linkcheck.verified
touch lint.verifiedmypy.verified: $(all_tests_verified)
mypy problem appendix
touch mypy.verifiedruff.verified: $(all_tests_verified)
ruff problem appendix
touch ruff.verified# Enter development environment.
shell:
nix-shell --pure
4.1.1. Use lilac.theme
file
First we track Lilac as a submodule within this project, and then symlink to the
CSS, JS, and theme files to the project root. Symlinking adds a level of
indirection such that if we ever decide to move around the submodule location to
somewhere else, we won't have to update all of our Org files and can instead
just update these symlinks.
Then in all of our published Org files, we do
to get the CSS/JS that comes with Lilac.
4.1.2. Custom CSS and HTML <head> content
We tweak Lilac's default CSS a bit.
Make all HTML files we generate try to pull in a local file called
codex.css
, as well as a custom font.
; See https://stackoverflow.com/a/27285582/437583.
(setq lilac-html-head
(concat
"<link rel=\"stylesheet\" type=\"text/css\" href=\"codex.css\" />\n"
"<link rel=\"stylesheet\" href="
"\"https://fonts.googleapis.com/css2"
"?family=Bungee+Shade:wght@400"
"\">"
))
4.1.2.1. Main CSS file
The default codex.css
file has some miscellaneous customizations.
4.1.2.1.1. Title font
This makes the title font bigger and uses a more ornate font for it.
h1.title {
font-family: "Bungee Shade", serif;
font-size: 100pt;
}
@media (any-pointer: coarse) {
h1.title {
font-family: "Bungee Shade", serif;
font-size: 60pt;
}
}
4.1.2.1.2. Tables
table.monospace-except-header {
font-family: monospace;
}table.monospace-except-header thead {
font-family: var(--font-serif);
}
4.1.2.2. Secondary CSS file
This is a second CSS file where we customize the title text of the pages for the
problems. We just (manually) create a symlink from the problem page's
codex.css
to this one (because we don't want to use the regular codex.css
from above).
4.1.3. Ignore woven HTML from git diff
Typically we only need to look at the rendered HTML output in a web browser as
the raw HTML diff output is extremely difficult to parse as a human. So by
default we ask Git to exclude it from git diff
by treating them as binary
data.
* -diff
**/*.org diff
**/.gitattributes diff
**/.gitmodules diff
**/.gitignore diff
package/nix/sources.json diff
COPYRIGHT diff
LICENSE diff
In order to still show the HTML textual diff, we can run git diff --text
.
4.1.3.1. git add -p
Note that the above setting to treat HTML files as binary data prevents them
from being considered for git add -p
. In order to add them, use git add -u
instead.
4.1.4. gitignore
**/__pycache__
**/*.auctex-auto
**/*.hypothesis
**/*.log
**/*.py.verified
check.verified
lint.verified
linkcheck.verified
linelength.verified
mypy.verified
ruff.verified
spellcheck.verified
test
update-deps
weave
4.2. Tangling (generating the source code)
Tangling is simply the act of collecting the #+begin_src ... #+end_src
blocks
and arranging them into the various target (source code) files. Every source
code block is given a unique name.
We simply tangle all major *.org
files in the toplevel Makefile.
build_literate_tangled = \
.gitattributes \
.gitignore \
_typos.toml \
codex.css \
codex.problem.css \
codex.el \
Makefile \
shell.nix
$(build_literate_tangled) &: build-literate.org
$(call run_emacs,(org-babel-tangle),build-literate.org)
touch $(build_literate_tangled)citations.bib: README.org
$(call run_emacs,(org-babel-tangle),README.org)
touch citations.bibdefine generate_img_pdfs
$(shell grep 'img.pdf' $(1) \
| sed \
-e 's|^|$(1):|' \
-e 's|images.org:#+header: :file ||') &: $(1)
@echo generating images from $(1)
$(call run_emacs,(org-html-export-to-html),$(1))
rm -f $(patsubst %/images.org, %/images.html, $(1))
endefdefine generate_img_svgs
$(1:.img.pdf=.svg): $(1)
@echo generating svg from $(1)
@echo $(shell pwd)
pdf2svg $(1) $(1).uncropped.svg
inkscape \
--export-plain-svg \
--export-margin=5 \
--export-filename=$(1:.img.pdf=.svg) \
--export-area-drawing \
$(1).uncropped.svg
rm $(1).uncropped.svg
endef$(foreach p,$(image_dirs),\
$(eval $(call generate_img_pdfs,\
$(p)/images.org)))all_img_pdfs = $(shell find . -type f -name 'images.org' \
| xargs grep 'img.pdf' \
| sed 's|images.org:#+header: :file ||')
$(foreach img,$(all_img_pdfs),\
$(eval $(call generate_img_svgs,\
$(img))))define tangle_tests
$(1)/__init__.py $(1)/test.py &: $(1)/README.org
@echo tangling $(1)/README.org
$(call run_emacs,(org-babel-tangle),$(1)/README.org)
find $(1) -type f -name '*.py' \
-execdir sed -i 's/[[:blank:]]*$$$$//' {} +
endef# See https://stackoverflow.com/a/9694782/437583.
$(foreach d,$(test_dirs),\
$(eval $(call tangle_tests,$(d))))define verify_tests
$(1)/test.py.verified: $(1)/test.py
python -m unittest discover \
--failfast --start-directory $(1) \
--top $(shell echo $(1) | sed -e 's|./||' -e 's|/.\+||')
touch $(1)/test.py.verified
endef$(foreach d,$(test_dirs),\
$(eval $(call verify_tests,$(d))))
4.3. Linting
4.3.1. Spell checker
We use typos-cli to check for spelling errors. Below we configure it to only
check the original source material β Org files.
Here we have the Makefile rules for linting, which include this spellchecker.
ORG_FILES = $(shell find . -type f -name '*.org')spellcheck.verified: $(ORG_FILES)
typos
touch spellcheck.verified
4.3.2. Detect long lines
For code we tangle, we want lines to be roughly 80 characters. This limit is
actually a bit difficult to enforce because sometimes the source code blocks we
edit get placed into an indented location, and from the source code block itself
we cannot tell how much this indentation is exactly. So set the maximum line
length for tangled text to be 90 characters.
We have to wrap the find ...
invocation with || true
because xargs
will
exit with 123
if the last grep
call can't find a match. That is, in our case
not finding a match is a good thing but the inner grep
doesn't know that.
This code detects which files to look at by looking at lines in Org files that
start with #+begin_src
and end with :tangle foo
, where foo
is the last
word in the line.
linelength.verified: $(ORG_FILES)
`find . -type f -name '*.org' \
| grep -v '^./deps' \
| xargs grep '^#+begin_src.\+ :tangle ' \
| sed 's,[^/]\+.org.\+:tangle ,,' \
| grep -v citations.bib \
| xargs grep -n '^.\{90\}' > linelength_offenders.log` || true
test `wc --bytes linelength_offenders.log | cut -d\ -f1` -eq 0
touch linelength.verified
4.3.3. Link checker
HTML_FILES = $(shell find . -type f -name '*.html' | grep -v '^./deps')linkcheck.verified: $(HTML_FILES)
lychee --offline $(HTML_FILES)
touch linkcheck.verified
4.4. Development environment (Nix shell)
This is taken from https://github.com/tweag/haskell-stack-nix-example/blob/b9383e35416a2b0e21fbc97ed079538f9f395b6a/shell.nix#L1.
This is the main development shell and brings in all of our dependencies to
build all of our code. It's great for development and testing things out (such
as running "make" to re-run any Python tests that have been updated when adding
new problems).
let
# Nixpkgs snapshot.
sources = import ./package/nix/sources.nix;
# The final "pkgs" attribute with all the bells and whistles of our overlays.
pkgs = import sources.nixpkgs {};
# This minimalist latex setup is adapted from https://nixos.wiki/wiki/TexLive.
tex_for_orgmode = (pkgs.texlive.combine {
# Start with scheme-basic.
inherit (pkgs.texlive) scheme-basic
# Add in additional TeX packages (think CTAN package names).
wrapfig amsmath ulem hyperref capt-of# TikZ.
pgf
xkeyval
fontspec
tikz-qtree# Source Sans Pro font.
sourcesanspro
# Source Code Pro font.
sourcecodepro
;
});in
# This is our development shell.
pkgs.mkShell ({
buildInputs = [
# Tangling and weaving for Literate Programming.
pkgs.emacs29-nox# Diagrams.
pkgs.inkscape
pkgs.pdf2svg
tex_for_orgmode# Misc
pkgs.git
pkgs.less# Update deps (bootstrap).
pkgs.niv
pkgs.nix
pkgs.cacert# Spell checking.
pkgs.typos# Link checker.
pkgs.lychee# Python testing and linting.
pkgs.python3Packages.hypothesis
pkgs.python3Packages.mypy
pkgs.ruff
];
})
4.4.1. Update Nix dependencies
This is based on Lilac's own code for updating Nix dependencies with niv
.
nixpkgs_stable_channel := nixos-23.11
update-deps: package/nix/sources.json package/nix/sources.nix
cd package && niv update nixpkgs --branch $(nixpkgs_stable_channel)
cd package && niv update
touch update-deps
4.5. Elisp
(setq org-cite-csl-styles-dir
(concat (getenv "LILAC_ROOT") "/deps/styles/"))(setq org-latex-pdf-process
'("lualatex --shell-escape --interaction nonstopmode --output-directory=%o %f"))
5. References
.csl-left-margin{float: left; padding-right: 0em;}
.csl-right-inline{margin: 0 0 0 2em;}