2023 version of git course in "Data Toolbox for Reproducible Research in Ecology"

rdatatoolbox/course-git-iago

Here are the sources to generate the pdf/slideshow support for a git training course.

This particular slideshow was designed with rather high expectations regarding:

  1. Rendering quality (so it's nice, well constructed and portable).
  2. Animation quality (so it's very clear what git is doing during the course).

To address 1., pgf/tikz was chosen, embedded within a vanilla standalone LaTeX document.
LaTeX/beamer alone were not sophisticated/comfortable enough to address 2. As a consequence, python scripts were used instead.

These two languages are not exactly designed to work together, so the project has become unexpectedly big.
Obviously, the first course to be given with this slideshow had a fixed date, and I was running late, so the project has also become unexpectedly messy.
So we end up here with a big and messy project whose only purpose is to produce one pdf/slideshow. Hopefully it also makes it possible, in the future, to edit the slideshow and make it evolve in a (rather) flexible way.

Generate the document

With python 3.11 and lualatex, just run:

$ python main.py

Wait ≈5min for the compilation to finish, then find your result in the newly created ./res.pdf file.

Current slideshow content

  • Introduction to git (from scratch).
  • Git clients.
  • Commit, push commits to a remote.
  • Clone, fetch commits, collaborate.
  • Solve conflicts.
  • Merge or Rebase.

In particular, there is no mention made of:

  • GUI client interfaces
  • submodules
  • advanced git utilities like cherry-pick, interactive rebase, hooks, etc.

yet.

Current implementation

I am not quite happy with the current implementation and I do have a strong desire to rewrite all of it. Unfortunately (or fortunately maybe?) there is little chance that this'll happen anytime soon.

The major reason why this has become so sophisticated is that the animation of every slide can be written as a python script, in files like the following:

  • ./pizzas.py
  • ./staging.py
  • ./remotes.py
  • ./conflicts.py
  • etc.

There is one such file per "slide pattern", and the purpose of all these files is to parse one particular section of the current stub file found at..

  • ./tex/main.tex

.. and to introduce slight modifications in it on every animation step, in a programmable fashion.

On every "step", a new section is generated into the resulting..

  • ./tex/generated_steps.tex

.. which is the one eventually compiled when ./main.py is run. Every step is a new page in the resulting file.
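
The step mechanism described above can be pictured with a minimal sketch. None of the names below come from the project: `STUB`, `render_step` and `generate_steps` are hypothetical, and the sketch assumes a setup where each `tikzpicture` ends up on its own page. Only the idea (one stub, slightly modified at every step, one page per step) is taken from the description.

```python
# Hypothetical sketch: none of these names come from the project.
STUB = r"\node[{style}] at ({x}, 0) {{{label}}};"


def render_step(style, x, label):
    """Fill the stub with the values of one animation step."""
    picture = STUB.format(style=style, x=x, label=label)
    return "\\begin{tikzpicture}\n" + picture + "\n\\end{tikzpicture}"


def generate_steps(n_steps):
    """Slide a node to the right, one step (and one picture) at a time."""
    steps = []
    for i in range(n_steps):
        # Only the last step is highlighted: a "slight modification" per step.
        style = "red" if i == n_steps - 1 else "black"
        steps.append(render_step(style, i, "commit"))
    return "\n".join(steps)


print(generate_steps(3))
```

The printed text contains three pictures, one per step, which is roughly the role that ./tex/generated_steps.tex plays in the real project.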

The other files within the ./tex folder, like..

  • ./tex/files.tex
  • ./tex/repo.tex
  • ./tex/diff.tex
  • etc.

.. provide all high-level LaTeX commands used to produce file trees, chains of commits, diffed files, etc.

The other python files like..

  • ./filetree.py
  • ./repo.py
  • ./diffs.py
  • etc.

.. mirror them, translating these commands into objects that python can manipulate.

The major pattern at play here is the one found in the core file:

  • ./modifiers.py

The abstract class TextModifier is the parent of most of the useful objects within the python scripts. In a nutshell, a TextModifier value is an object bound to a particular piece of LaTeX code, with special attributes and a .render() method. When .render() is called, the current attribute values are read to produce one adequate version of the code, with e.g. correctly updated style, positioning, textual content etc. TextModifier objects can contain each other as attributes, and they are rendered recursively. The root of the modifiers tree is the Document value whose definition can be found at:

  • ./document.py
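
As an illustration of this pattern, here is a minimal sketch, not the actual content of ./modifiers.py. Only the TextModifier name and the .render() method come from the description above. The `Label` and `Node` classes, their attributes and the LaTeX they emit are assumptions made up for the example.

```python
from abc import ABC, abstractmethod


class TextModifier(ABC):
    """Object bound to a piece of LaTeX code, rendered from its attributes."""

    @abstractmethod
    def render(self) -> str:
        ...


class Label(TextModifier):  # hypothetical leaf modifier
    def __init__(self, text, color="black"):
        self.text = text
        self.color = color

    def render(self) -> str:
        return rf"\textcolor{{{self.color}}}{{{self.text}}}"


class Node(TextModifier):  # hypothetical modifier containing another one
    def __init__(self, x, y, label):
        self.x, self.y, self.label = x, y, label

    def render(self) -> str:
        # The contained modifier is rendered recursively.
        return rf"\node at ({self.x}, {self.y}) {{{self.label.render()}}};"


node = Node(0, 0, Label("HEAD"))
first = node.render()
# Change a few attributes between two animation steps, then render again.
node.label.color = "red"
node.x = 2
second = node.render()
```

Here `first` and `second` are two versions of the same piece of LaTeX code, differing only by the attributes modified in between.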

TextModifier objects are constructed by another category of objects called *Builders. There are two ways of producing a new TextModifier value:

  • Either by parsing an existing piece of LaTeX code found in the stub ./tex/main.tex file, with a .parse() method.
  • Or from a python-originated set of attributes, with a .new() method.
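
The two construction paths can be sketched as follows. This is a hedged illustration, not the project's code: the `Commit` modifier, the `CommitBuilder` class and the `\Commit` LaTeX command are hypothetical, and only the .parse()/.new() split comes from the description above.

```python
import re
from dataclasses import dataclass


@dataclass
class Commit:  # hypothetical modifier
    sha: str
    message: str

    def render(self) -> str:
        return rf"\Commit{{{self.sha}}}{{{self.message}}}"


class CommitBuilder:  # hypothetical "*Builder"
    # Naive, purely lexical pattern: no real understanding of LaTeX.
    _pattern = re.compile(r"\\Commit\{(\w+)\}\{([^}]*)\}")

    def parse(self, tex: str) -> Commit:
        """Build a Commit from an existing piece of LaTeX code."""
        m = self._pattern.search(tex)
        assert m is not None, "no \\Commit command found in input"
        return Commit(sha=m.group(1), message=m.group(2))

    def new(self, sha: str, message: str) -> Commit:
        """Build a Commit from python-originated attributes."""
        return Commit(sha=sha, message=message)


builder = CommitBuilder()
parsed = builder.parse(r"  \Commit{a1b2c3}{First commit} % from the stub")
created = builder.new("a1b2c3", "First commit")
```

Both paths end with the same kind of value, so the rest of the script does not need to know where a modifier came from.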

There is much fragility here, because the parse methods are all very naive and purely lexical, with no deep understanding of the LaTeX language implemented. As a consequence, they are very sensitive to the actual, lexical content of the *.tex files, and you can break the whole process by just changing a comment line or even whitespace within the *.tex files. Hopefully, either python or the LaTeX compilation process will crash in such a situation, so you can tell that something is wrong.
If it's python, it's because safety assert guards have been introduced here and there, and you can navigate to the failed assertion to try to figure out which assumption/invariant has not been met.
If it's LaTeX, the message is rarely helpful, but you can open the ./tex/generated_steps.tex file and prune it until you figure out what went wrong during the rendering stage.
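
To give a feeling for this kind of fragility, here is a small sketch of a naive lexical parser protected by an assert guard. The `parse_position` function and the line format it expects are hypothetical, chosen only to show how a whitespace change can trip an assertion.

```python
def parse_position(line):
    """Naive lexical parse of a line like '\\node at (1, 2) {x};'."""
    prefix = r"\node at ("
    # Guard: the exact prefix, whitespace included, is an assumed invariant.
    assert line.startswith(prefix), f"unexpected line start: {line!r}"
    coords, _ = line[len(prefix):].split(")", 1)
    x, y = coords.split(", ")
    return float(x), float(y)


ok = parse_position(r"\node at (1, 2) {x};")

try:
    # One extra space after '\node' is enough to break the assumption.
    parse_position(r"\node  at (1, 2) {x};")
    failed = False
except AssertionError:
    failed = True
```

The failed assertion points at the violated assumption, which is usually more helpful than a LaTeX error appearing much later.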

There is also fragility in the sense that there is no guarantee whatsoever that the LaTeX code eventually generated by a TextModifier object could even be "parsed back" by the corresponding builder into a similar value. So there is no "rendering-parsing" loop, and "rendering" is not a kind of serialization.

Since compilation is long, you can choose to generate only particular sections of the slideshow by passing the corresponding slide names and/or numbers as special arguments to doc.generate_tex() in main.py. For example, doc.generate_tex(5, 9) will only generate pages 5 to 9 (inclusive), doc.generate_tex("Conflicts") will only generate pages for the "Conflicts" slide, and doc.generate_tex("Conflicts", 5, 9) will only generate the 5th to the 9th pages within the "Conflicts" slide.
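
The selection semantics described above can be sketched as follows. This is not the project's implementation: `select_pages` and the `slides` mapping are hypothetical, and the sketch only mirrors the three cases listed (page range, slide name, page range within a slide), assuming 1-based, inclusive bounds.

```python
def select_pages(slides, *args):
    """Select pages from `slides`, a dict mapping slide names to page lists.

    Accepts an optional slide name and/or an optional (first, last) pair.
    """
    name = None
    bounds = []
    for arg in args:
        if isinstance(arg, str):
            name = arg
        else:
            bounds.append(arg)
    if name is not None:
        pages = list(slides[name])
    else:
        # No name given: consider all pages, in document order.
        pages = [page for group in slides.values() for page in group]
    if bounds:
        first, last = bounds
        pages = pages[first - 1:last]  # 1-based, inclusive on both ends
    return pages


slides = {
    "Intro": ["i1", "i2", "i3"],
    "Conflicts": ["c1", "c2", "c3", "c4"],
}
by_range = select_pages(slides, 2, 4)
by_name = select_pages(slides, "Conflicts")
by_both = select_pages(slides, "Conflicts", 2, 3)
```

With the toy data above, the bounds count pages over the whole document when no name is given, and within the named slide otherwise.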


This is it, happy hacking if you ever even wish to get in there, and I hope that the future is bright with (better) constructive, open-source slideshow animation software..

.. hm. I actually have a few ideas about that.. stay tuned.. maybe? I guess?


This work is licensed under Attribution-NonCommercial-ShareAlike 4.0 International
