Make sure that git submodules are initialized after cloning the repository:
git submodule update --init --recursive
Or initialize the submodules while cloning:
git clone --recurse-submodules ssh://git@gitlab.archlinux.org:222/archlinux/archmanweb.git
Install the dependencies:
pacman -S pyalpm python-chardet python-django python-django-csp python-psycopg2 python-requests python-xtarfile
- Copy local_settings.py.example to local_settings.py, then set DEBUG = True and edit the SECRET_KEY variable.
- Configure a connection to a PostgreSQL database in the Django database settings in the local_settings.py file.
- Make sure that the pg_trgm extension is created in the database. For example:
psql --username=<username> --dbname=<dbname> --command "create extension if not exists pg_trgm;"
- Make migrations.
./manage.py makemigrations
- Migrate changes.
./manage.py migrate
- Build the archlinux-common-style submodule. A SASS compiler is needed. For example, install sassc and run
cd archlinux-common-style
make SASS=sassc
- Start the development web server with
./manage.py runserver
The site should be available at http://localhost:8000, saying that there are 0 man pages and 0 packages (because they have not been imported yet). The server will automatically reload when you make changes to the webapp code or templates.
- Run the update.py script to import some man pages. Note, however, that the full import requires downloading about 7.5 GiB of packages from a mirror of the Arch repos, and the extraction then takes about 20-30 minutes. (The volume of all man pages is less than 300 MiB, though.) If you do not need all man pages for development, you can import only the man pages from the core repository (the smallest one; the download size is about 160 MiB) with
update.py --only-repos core
or even only the man pages from selected packages with
update.py --only-packages coreutils man-pages
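For orientation, a local_settings.py following the configuration steps above might look like the sketch below. The database name, user, password, host and port are placeholders and not values from this repository; the authoritative list of settings is whatever local_settings.py.example contains.

```python
# local_settings.py: illustrative sketch only; start from local_settings.py.example.

# Enable Django's debug mode for local development.
DEBUG = True

# Placeholder: replace with a long, random, private string.
SECRET_KEY = "replace-me-with-a-random-string"

# Standard Django database settings for PostgreSQL.
# Every value below is an assumption about a local setup.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "archmanweb",
        "USER": "archmanweb",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "5432",
    }
}
```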
This website was created for the man template on the Arch wiki. Originally, the template replaced plain text, unclickable references to man pages with links to man7.org, which contains a handful of manuals taken directly from upstream. Later, we considered switching to another site providing more manuals. Since we did not find a suitable external site, we decided to build a new service to satisfy all our requirements:
- All man pages from official Arch packages are available. Old versions and permalinks are not necessary.
- Functionality does not require JavaScript.
- Pages are addressable by their name and section, both occurring exactly once in the URL to avoid problems with pages such as ar(1) and ar(1p).
- The URLs used by the man template should not redirect to permalinks; otherwise, users would start copy-pasting them to the wiki, and it would be hard to check whether they are the same as the canonical URLs.
- Human-readable subsection anchors.
- The page should clearly indicate the Arch package version containing the page.
See the original discussion for details.
We used a dynamic approach instead of building a website consisting of completely static pages. The main building blocks are the Django web framework, the PostgreSQL database server, the mandoc tool from the mandoc toolset for the conversion to HTML, and the pyalpm library for data extraction from the Arch repositories. The code is available in the archmanweb repository at GitLab.
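As a rough illustration of the HTML conversion step, the sketch below pipes roff source through the mandoc command-line tool. It only shows the general idea: the function names are hypothetical, and the way archmanweb actually invokes mandoc (and with which options) may differ.

```python
import subprocess


def mandoc_command(fragment=True):
    """Build the argument list for converting a manual to HTML with mandoc."""
    cmd = ["mandoc", "-T", "html"]
    if fragment:
        # "-O fragment" omits the surrounding <html>/<head>/<body> elements,
        # so the result can be embedded into a page template.
        cmd += ["-O", "fragment"]
    return cmd


def convert_to_html(source, fragment=True):
    """Feed roff source to mandoc on stdin and return the HTML it prints."""
    result = subprocess.run(
        mandoc_command(fragment),
        input=source,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if mandoc fails
    )
    return result.stdout
```

Calling convert_to_html() requires mandoc to be installed; mandoc_command() can be inspected without it.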
Overall, this approach allows us to provide the following features without rebuilding the whole website from scratch:
- Listings with custom filters and orderings.
- Links to other versions of the same manual provided by different packages.
- Links to similar manuals available in other sections or languages.
- Searching in the names and descriptions of packages and manuals, similarly to apropos(1).
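The setup instructions require the pg_trgm extension, which provides trigram similarity in PostgreSQL. Assuming the search feature relies on it (an assumption, not something stated here), the following pure-Python sketch shows the idea behind trigram matching of names and descriptions. The word splitting is simplified to ASCII letters and digits, and the search() helper and its data layout are hypothetical.

```python
import re


def trigrams(text):
    """Return the set of trigrams of a string, padding words as pg_trgm does."""
    result = set()
    # Simplification: only ASCII letters and digits count as word characters.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "  # two spaces before, one after each word
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def search(query, manuals, threshold=0.3):
    """Rank (name, description) pairs by their best similarity to the query."""
    scored = []
    for name, description in manuals:
        score = max(similarity(query, name), similarity(query, description))
        if score >= threshold:
            scored.append((score, name))
    # Best match first; ties are broken alphabetically by name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]
```

In a Django application, the same comparison would normally be delegated to the database rather than computed in Python.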
Some similar projects, each using a different approach, are:
- manned.org (code, Arch BBS thread)
- man7.org (no idea about website scripts)
- manpages.debian.org (source)
- man.openbsd.org (runs with the mandoc CGI script)
These links serve as test cases to ensure that all features still work; they are not useful to regular users.
- intro
- intro.1
- intro.1.en
- intro.en
- systemd.service
- systemd.service.5
- systemd.service.5.en
- systemd.service.en
- gimp-2.8
- gimp-2.8.1
- gimp-2.8.1.en
- gimp-2.8.en
- CA.pl
- CA.pl.1ssl
- CA.pl.1ssl.en
- CA.pl.en
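The names above show why a URL cannot simply be split on dots: the page name itself may contain dots (systemd.service, gimp-2.8, CA.pl), and the section and language suffixes are both optional. One way to handle this, sketched below, is to enumerate every possible reading and check each against the known pages. The function names and the in-memory list standing in for the database are hypothetical, and this is not necessarily how archmanweb resolves URLs.

```python
def candidate_interpretations(url_name):
    """Return the possible (name, section, lang) readings of a dotted page URL."""
    candidates = [(url_name, None, None)]  # the whole string is the name
    parts = url_name.split(".")
    if len(parts) >= 2:
        head, last = ".".join(parts[:-1]), parts[-1]
        if head:
            candidates.append((head, last, None))  # name.section
            candidates.append((head, None, last))  # name.lang
    if len(parts) >= 3:
        head = ".".join(parts[:-2])
        if head:
            candidates.append((head, parts[-2], parts[-1]))  # name.section.lang
    return candidates


def resolve(url_name, known_pages):
    """Return the first known (name, section, lang) page matching any reading."""
    for name, section, lang in candidate_interpretations(url_name):
        for page in known_pages:
            if (page[0] == name
                    and section in (None, page[1])
                    and lang in (None, page[2])):
                return page
    return None  # no reading matches: the caller would answer with a 404
```

For example, with a page ("gimp-2.8", "1", "en") in known_pages, both resolve("gimp-2.8") and resolve("gimp-2.8.1.en") find it, because the reading that treats "8" as a section matches nothing.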
Ambiguous cases are ordered by section, package repository and package version, then the first manual is selected.
- mount redirects to mount.8 (not mount.2)
- gv redirects to gv.1 (not gv.3guile, gv.3lua etc.)
- graphviz/gv redirects to graphviz/gv.3guile (not graphviz/gv.3lua etc.)
- gv.3 redirects to gv.3guile (not gv.1, gv.3lua etc.)
- aliases.5 displays extra/postfix/aliases.5 (not community/opensmtpd/aliases.5)
- mysqld.8 displays extra/mariadb/mysqld.8 (not community/percona-server/mysqld.8)
- mailx and mailx.1 redirect to mail.1.en as a symbolic link (not mailx.1p)
- nvidia-smi.cs → nvidia-smi.en → nvidia-smi.1.en (maybe we should try harder and avoid the double redirect)
- nvidia-smi.1.cs → nvidia-smi.1.en
- nvidia-smi.foo → 404
- nvidia-smi.1.foo → 404
- nvidia-utils/nvidia-smi.en
- nvidia-340xx-utils/nvidia-smi.en
- nvidia-utils/nvidia-smi.cs → nvidia-utils/nvidia-smi.en
- nvidia-340xx-utils/nvidia-smi.cs → nvidia-340xx-utils/nvidia-smi.en
- foo/nvidia-smi.cs → 404
- foo/nvidia-smi.en → 404
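The rule for ambiguous cases (order by section, then package repository, then package version, and take the first manual) can be sketched as a sort key. The two priority lists below are hypothetical and were chosen only so that the examples listed above come out as stated; the real ordering used by the site may differ. The version tie-break is left out because it needs a proper package version comparison.

```python
# Hypothetical priorities; unknown values are ranked last.
SECTION_ORDER = ["1", "n", "l", "8", "3", "0", "2", "5", "4", "9", "6", "7"]
REPO_ORDER = ["core", "extra", "community", "multilib"]


def _rank(value, order):
    """Position of value in order, or len(order) if it is not listed."""
    return order.index(value) if value in order else len(order)


def sort_key(page):
    """Order pages by section first, then by package repository."""
    section, repo = page["section"], page["repo"]
    return (
        _rank(section[:1], SECTION_ORDER),  # "3guile" ranks like "3"
        section,                            # "3guile" before "3lua", "1" before "1p"
        _rank(repo, REPO_ORDER),
    )


def select_default(candidates):
    """Pick the manual that an ambiguous URL should resolve to."""
    return min(candidates, key=sort_key)
```

With these assumed priorities, a request for mount picks section 8 over section 2, and aliases.5 picks the page from extra over the one from community.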
There is a groff(1) extension for the man(7) and mdoc(7) languages to include contents of other files using the .so macro. In normal operation where manuals are stored as files on a file system, the soelim(1) pre-processor handles the inclusion. Our system is based on a database rather than a file system, so we need a custom soelim as well.
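A database-backed soelim can be sketched as follows. This is a minimal illustration and not the project's implementation: the lookup callable stands in for a database query and is hypothetical, as is the recursion limit that guards against pages including each other in a loop.

```python
import re

# A ".so" request on a line of its own, e.g. ".so man1/ls.1".
SO_REQUEST = re.compile(r"^\.so\s+(\S+)\s*$")


def soelim(content, lookup, max_depth=5):
    """Expand .so requests, using lookup(path) -> source text or None."""
    if max_depth <= 0:
        return content  # give up on deeply nested or circular includes
    lines = []
    for line in content.splitlines():
        match = SO_REQUEST.match(line)
        included = lookup(match.group(1)) if match else None
        if included is None:
            # Not a .so request, or the target is unknown: keep the line as-is.
            lines.append(line)
        else:
            # Included content may itself contain further .so requests.
            lines.append(soelim(included, lookup, max_depth - 1))
    return "\n".join(lines)
```

For a quick experiment, a dictionary's get method can serve as lookup, for example soelim(".so man8/pwconv.8", pages.get) with pages mapping paths to roff source.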
Some pages that contain the .so macro:
- [.1.zh_CN
- pwunconv(8)
- pam(8)
- url(7)
- xorg.conf.d(5)
- glibc(7)
- systemd-logind(8)
- shorewall6.conf(5) points to a page contained in a different package (shorewall instead of shorewall6)
- lsof(8) (not a "hardlink", includes an invalid file ./00DIALECTS)