https://github.com/archlinux/archmanweb

Codebase for the Arch manual pages repository (read-only mirror)

# Arch manual pages

## Git submodules

Make sure that git submodules are initialized after cloning the repository:

    git submodule update --init --recursive

Or initialize the submodules while cloning:

    git clone --recurse-submodules ssh://git@gitlab.archlinux.org:222/archlinux/archmanweb.git

## Dependencies

    pacman -S pyalpm python-chardet python-django python-django-csp python-psycopg2 python-requests python-xtarfile

## Installation

1. Copy `local_settings.py.example` to `local_settings.py`, then set `DEBUG = True` and fill in the `SECRET_KEY` variable.

2. Configure a connection to a [PostgreSQL](https://wiki.archlinux.org/index.php/PostgreSQL) database
in the [Django database settings](https://docs.djangoproject.com/en/3.1/ref/settings/#databases)
in the `local_settings.py` file.

3. Make sure that the [pg_trgm](https://www.postgresql.org/docs/current/pgtrgm.html)
extension is [created](https://www.postgresql.org/docs/current/sql-createextension.html)
in the database. For example:

       psql --username=<username> --dbname=<dbname> --command "create extension if not exists pg_trgm;"

4. Make migrations.

       ./manage.py makemigrations

5. Migrate changes.

       ./manage.py migrate

6. Build the [archlinux-common-style](https://gitlab.archlinux.org/archlinux/archlinux-common-style)
submodule.

   A SASS compiler is needed. For example, install [sassc](https://archlinux.org/packages/community/x86_64/sassc/)
   and run:

       cd archlinux-common-style
       make SASS=sassc

7. Start the development web server with `./manage.py runserver`. The site
should be available at http://localhost:8000, saying that there are 0 man
pages and 0 packages (because they were not imported yet). The server will
automatically reload when you make changes to the webapp code or templates.

8. Run the `update.py` script to import some man pages. Note that the full
import requires downloading about 7.5 GiB of packages from a mirror of the
Arch repos, and the extraction then takes about 20-30 minutes. (The volume of
all man pages is less than 300 MiB, though.) If you don't need all man pages
for development, you can run e.g. `update.py --only-repos core` to import only
man pages from the core repository (the smallest one, download size is about
160 MiB) or even `update.py --only-packages coreutils man-pages`.
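
As a rough illustration of steps 1 and 2, a `local_settings.py` might look like the sketch below. The database name, user, host and the secret key value are placeholders invented for this example, and the shipped `local_settings.py.example` may be organized differently; only the `DEBUG`, `SECRET_KEY` and `DATABASES` setting names and the PostgreSQL engine path come from Django itself.

```python
# local_settings.py: a minimal sketch, not the project's actual file.

# Development only; never enable DEBUG on a public server.
DEBUG = True

# Placeholder value (assumption): replace it with your own long random string.
SECRET_KEY = "replace-me-with-a-long-random-string"

# Connection to a local PostgreSQL database.
# NAME, USER, PASSWORD and HOST are hypothetical and must match your setup.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "archmanweb",
        "USER": "archmanweb",
        "PASSWORD": "",
        "HOST": "localhost",
    }
}
```

The same database name and user would then be passed to the `psql` command in step 3.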

## About

This website was created for the [man template](https://wiki.archlinux.org/index.php/Template:Man)
on the Arch wiki. Originally, the template replaced plain text, unclickable
references to man pages with links to [man7.org](https://man7.org/linux/man-pages/),
which contains a handful of manuals taken directly from upstream. Later, we
considered switching to another site providing more manuals. Since we did not
find a suitable external site, we decided to build a new service to satisfy all
our requirements:

1. All man pages from official Arch packages are available. Old versions and
permalinks are not necessary.
2. Functionality does not require JavaScript.
3. Pages are addressable by their name and section, both occurring exactly once
in the URL to avoid problems with pages such as
[ar(1)](https://man.archlinux.org/man/ar.1) and
[ar(1p)](https://man.archlinux.org/man/ar.1p).
4. The URLs used by the _man_ template should not redirect to permalinks,
otherwise users would start copy-pasting them to the wiki and it would be
hard to check if they are the same as the canonical URLs.
5. Human-readable subsection anchors.
6. The page should clearly indicate the Arch package version containing the
page.

See the [original discussion](https://wiki.archlinux.org/index.php/Template_talk:Man#Sources)
for details.
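
To make the addressing scheme from requirement 3 concrete, here is a small sketch that builds such URLs. The base URL and the `name.section` form are taken from the `ar(1)` and `ar(1p)` links above, and the optional language suffix and package prefix follow the test cases later in this README. The function `man_url` itself is a hypothetical helper written for this example, not part of the archmanweb code.

```python
from urllib.parse import quote

BASE_URL = "https://man.archlinux.org/man/"


def man_url(name, section=None, lang=None, package=None):
    """Build a URL in which the name and section each occur exactly once."""
    page = name
    if section:
        page += "." + section
    if lang:
        page += "." + lang
    if package:
        page = package + "/" + page
    # Percent-encode unusual characters in page names; "/" is left as-is.
    return BASE_URL + quote(page)


print(man_url("ar", "1"))
print(man_url("ar", "1p"))
print(man_url("nvidia-smi", lang="en", package="nvidia-utils"))
```

Because the section is part of the page address, `ar(1)` and `ar(1p)` get distinct URLs that differ only in the section suffix.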

We used a dynamic approach instead of building a website consisting of
completely static pages. The main building blocks are the
[Django web framework](https://www.djangoproject.com/), the
[PostgreSQL](https://www.postgresql.org/) database server, the `mandoc` tool
from the [mandoc toolset](http://mdocml.bsd.lv/) for the conversion to HTML and
the [pyalpm](https://github.com/archlinux/pyalpm) library for data extraction
from the Arch repositories. The code is available in the
[archmanweb](https://gitlab.archlinux.org/archlinux/archmanweb) repository at
GitLab.

Overall, this approach allows us to provide the following features without
rebuilding the whole website from scratch:

- Listings with custom filters and orderings.
- Links to other versions of the same manual provided by different packages.
- Links to similar manuals available in other sections or languages.
- Searching in the names and descriptions of packages and manuals, similarly to
[apropos(1)](https://man.archlinux.org/man/apropos.1).
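
The installation steps require the `pg_trgm` extension, but this README does not say how the search uses it. Assuming it is used for fuzzy matching of names and descriptions, the following pure-Python sketch shows the general idea behind trigram similarity. It is a simplified approximation (ASCII letters and digits only) and not the PostgreSQL implementation; `trigrams` and `similarity` are names chosen for this example.

```python
import re


def trigrams(text):
    """Return the set of 3-character substrings of each word in the text.

    Each word is padded with two spaces in front and one space behind,
    which mirrors the padding described in the pg_trgm documentation.
    """
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Rank some manual names against a misspelled query.
names = ["apropos", "mount", "approx"]
query = "apropo"
for name in sorted(names, key=lambda n: similarity(query, n), reverse=True):
    print(name, round(similarity(query, name), 2))
```

A query with a typo still shares most of its trigrams with the intended name, so the intended name ranks first.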

### Similar projects

Some similar projects, each using a different approach, are:

- [manned.org](https://manned.org/) ([code](https://g.blicky.net/manned.git/),
[Arch BBS thread](https://bbs.archlinux.org/viewtopic.php?id=145382))
- [man7.org](http://man7.org/linux/man-pages/) (no idea about website scripts)
- [manpages.debian.org](https://manpages.debian.org/)
([source](https://github.com/Debian/debiman/))
- [man.openbsd.org](http://man.openbsd.org/) (runs with the mandoc CGI script)

## Test cases

These links serve as test cases to ensure that all features still work; they
are not useful to regular users.

### URLs with dots

- intro
- intro.1
- intro.1.en
- intro.en
- systemd.service
- systemd.service.5
- systemd.service.5.en
- systemd.service.en
- gimp-2.8
- gimp-2.8.1
- gimp-2.8.1.en
- gimp-2.8.en
- CA.pl
- CA.pl.1ssl
- CA.pl.1ssl.en
- CA.pl.en
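
These cases are tricky because a dot may be part of the page name, or it may separate the name from the section or the language. The sketch below enumerates the possible readings of a requested path and picks the first one that matches a known page. It only illustrates the ambiguity and is not archmanweb's actual parser: `candidate_splits`, `resolve`, the order of the candidates and the sample page list are all assumptions made for this example.

```python
def candidate_splits(page):
    """List the (name, section, lang) readings of a dotted page path."""
    parts = page.split(".")
    candidates = [(page, None, None)]  # the whole string is the name
    if len(parts) >= 2:
        head, last = ".".join(parts[:-1]), parts[-1]
        candidates.append((head, last, None))  # name.section
        candidates.append((head, None, last))  # name.lang
    if len(parts) >= 3:
        head = ".".join(parts[:-2])
        candidates.append((head, parts[-2], parts[-1]))  # name.section.lang
    return candidates


def resolve(page, known_pages):
    """Return the first known (name, section, lang) matching any reading."""
    for name, section, lang in candidate_splits(page):
        for known in known_pages:
            if (known[0] == name
                    and section in (None, known[1])
                    and lang in (None, known[2])):
                return known
    return None


# Hypothetical stand-in for the database of imported pages.
KNOWN = [
    ("intro", "1", "en"),
    ("systemd.service", "5", "en"),
    ("gimp-2.8", "1", "en"),
    ("CA.pl", "1ssl", "en"),
]

for path in ["intro.en", "systemd.service", "gimp-2.8.1", "CA.pl.1ssl.en"]:
    print(path, "->", resolve(path, KNOWN))
```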

### Best match lookup

Ambiguous cases are ordered by section, package repository and package version,
then the first manual is selected.

- mount redirects to mount.8 (not mount.2)
- gv redirects to gv.1 (not gv.3guile, gv.3lua etc.)
- graphviz/gv redirects to graphviz/gv.3guile (not graphviz/gv.3lua etc.)
- gv.3 redirects to gv.3guile (not gv.1, gv.3lua etc.)
- aliases.5 displays extra/postfix/aliases.5
  (not community/opensmtpd/aliases.5)
- mysqld.8 displays extra/mariadb/mysqld.8
  (not community/percona-server/mysqld.8)
- mailx and mailx.1 redirect to mail.1.en as a symbolic link (not mailx.1p)
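
The ordering rule can be sketched as a sort followed by taking the first element. Everything specific in the block below is an assumption made so that the examples above come out as described: the section order, the repository order, the choice of "newest version first", and the representation of versions as already-parsed tuples (real Arch version comparison is more involved). `Candidate` and `best_match` are hypothetical names, and the package names in the sample data are illustrative only.

```python
from dataclasses import dataclass

# Assumed preference orders, not taken from the archmanweb code.
SECTION_ORDER = ["1", "8", "3", "2", "5", "4", "9", "6", "7"]
REPO_ORDER = ["core", "extra", "community", "multilib"]


@dataclass(frozen=True)
class Candidate:
    repo: str
    package: str
    name: str
    section: str
    version: tuple = (0,)  # simplified, pre-parsed version


def rank(order, value):
    """Position of value in order; unknown values sort last."""
    return order.index(value) if value in order else len(order)


def section_key(section):
    # Rank by the leading section digit, then by the full section string,
    # so that "3guile" sorts before "3lua".
    return (rank(SECTION_ORDER, section[:1]), section)


def best_match(candidates):
    """Order by section, repository and version, then take the first."""
    # Python's sort is stable: sort by the least significant key first.
    newest_first = sorted(candidates, key=lambda c: c.version, reverse=True)
    ordered = sorted(
        newest_first,
        key=lambda c: (section_key(c.section), rank(REPO_ORDER, c.repo)),
    )
    return ordered[0] if ordered else None


gv = [
    Candidate("extra", "graphviz", "gv", "3lua"),
    Candidate("extra", "graphviz", "gv", "3guile"),
    Candidate("extra", "gv", "gv", "1"),
]
print(best_match(gv).section)
```

Restricting the candidates first (for example to one package, or to sections starting with `3`) and then applying the same ordering would reproduce the `graphviz/gv` and `gv.3` cases.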

### Language fallback

- nvidia-smi.cs → nvidia-smi.en → nvidia-smi.1.en
  (maybe we should try harder and avoid the double redirect)
- nvidia-smi.1.cs → nvidia-smi.1.en
- nvidia-smi.foo → 404
- nvidia-smi.1.foo → 404
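
One way to read these cases: a request for a known language that this page is not available in falls back to English, while an unknown suffix yields a 404. The sketch below encodes that reading. It is an interpretation of the test cases rather than the project's code; `resolve_language`, its return values and the set of known languages are hypothetical.

```python
def resolve_language(requested, available, known_languages, default="en"):
    """Decide how to answer a request for a page in a given language.

    requested:       language code taken from the URL
    available:       languages in which this particular page exists
    known_languages: languages that exist anywhere on the site (assumed)
    """
    if requested in available:
        return ("ok", requested)
    if requested in known_languages and default in available:
        return ("redirect", default)
    return ("not found", None)


known = {"en", "cs", "de"}
print(resolve_language("cs", {"en"}, known))
print(resolve_language("foo", {"en"}, known))
```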

### Package filter

- nvidia-utils/nvidia-smi.en
- nvidia-340xx-utils/nvidia-smi.en
- nvidia-utils/nvidia-smi.cs → nvidia-utils/nvidia-smi.en
- nvidia-340xx-utils/nvidia-smi.cs → nvidia-340xx-utils/nvidia-smi.en
- foo/nvidia-smi.cs → 404
- foo/nvidia-smi.en → 404

### .so macros

There is a groff(1) extension for the man(7) and mdoc(7) languages to include
contents of other files using the `.so` macro. In normal operation where
manuals are stored as files on a file system, the soelim(1) pre-processor
handles the inclusion. Our system is based on a database rather than a file
system, so we need a custom `soelim` as well.
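
To illustrate what a database-backed `soelim` has to do, the sketch below expands `.so` lines using a plain dictionary in place of the database. It is a simplified stand-in and not the project's implementation: the function name, the way pages are keyed by the path written after `.so`, and the sample contents are assumptions made for this example.

```python
def soelim(path, pages, _stack=()):
    """Return the content of a page with `.so` inclusions expanded.

    pages maps a path (as written after `.so`) to the raw page source.
    Lines whose target is unknown are kept unchanged.
    """
    if path in _stack:
        raise ValueError(f"circular .so inclusion: {path}")
    output = []
    for line in pages[path].splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2 and parts[0] == ".so":
            target = parts[1].strip()
            if target in pages:
                # Recurse so that nested inclusions are expanded as well.
                output.append(soelim(target, pages, _stack + (path,)))
                continue
        output.append(line)
    return "\n".join(output)


# Made-up sample data standing in for database rows.
pages = {
    "man1/alias.1": ".so man1/real.1",
    "man1/real.1": ".TH REAL 1\n.SH NAME\nreal \\- an example page",
    "man8/broken.8": ".TH BROKEN 8\n.so ./00DIALECTS",
}

print(soelim("man1/alias.1", pages))
print(soelim("man8/broken.8", pages))
```

Keeping an unresolved `.so` line unchanged is one possible policy for targets that do not exist; a real implementation could also log or drop such lines.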

Some pages which contain the `.so` macro:

- [.1.zh_CN
- pwunconv(8)
- pam(8)
- url(7)
- xorg.conf.d(5)
- glibc(7)
- systemd-logind(8)
- shorewall6.conf(5) points to a page contained in a different package
  (`shorewall` instead of `shorewall6`)
- lsof(8) (not a "hardlink", includes an invalid file `./00DIALECTS`)