GitHub Pages site (#35)
* Make  a mixin and simplify

* Mkdocs documentation

* Automated site deployment

* Publish docs on push to main

* Fix block access
dominictarro authored Jan 31, 2024
1 parent d24c4a7 commit 5ed35bc
Showing 24 changed files with 2,257 additions and 646 deletions.
47 changes: 47 additions & 0 deletions .github/workflows/publish-docs.yml
@@ -0,0 +1,47 @@
name: Publish docs

on:
  workflow_dispatch:
  push:
    branches:
      - main

jobs:
  build-and-publish-docs:
    name: Build and publish docs
    runs-on: ubuntu-latest

    steps:
      - name: Check out repository code
        uses: actions/checkout@v4

      - name: Setup Python
        id: setup-python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"
          cache: "pipenv"

      - name: Install pipenv
        run: |
          python -m pip install --upgrade pipenv wheel
      - name: Install dependencies
        if: steps.setup-python.outputs.cache-hit != 'true'
        run: |
          pipenv sync --dev
          pipenv run pip install -e .
      - name: Update dataset files
        run: |
          pipenv run dataset-docs
      - name: Build docs
        run: |
          pipenv run mkdocs build
      - name: Publish docs
        uses: JamesIves/github-pages-deploy-action@v4.4.2
        with:
          branch: docs
          folder: site
5 changes: 5 additions & 0 deletions Pipfile
@@ -29,9 +29,14 @@ pytest = "*"
click = "*"
prefect-github = ">=0.2.2"
moto = {extras = ["s3"], version = "<5.0,>=3.1.16"}
mkdocs = "*"
mkdocs-material = "*"
tabulate = "*"
neoteroi-mkdocs = "*"

[requires]
python_version = "3.11"

[scripts]
tests = "pytest tests/"
dataset-docs = "python src/scripts/docs.py"
911 changes: 498 additions & 413 deletions Pipfile.lock

Large diffs are not rendered by default.

35 changes: 35 additions & 0 deletions docs/Datasets/Media Inventory.md
@@ -0,0 +1,35 @@
# Media Inventory

!!! warning

    **This dataset is currently private.**

The Media Inventory is a collection of evidence files that were extracted from the [Oryx](./Oryx.md) dataset.

## Sources

- [Postimages](https://postimg.cc/)

## Schema

<!-- BEGIN SCHEMA SECTION -->

| Name | Type | Description |
|:----------------|:---------|:--------------------------------|
| url_hash | string | A SHA-256 hash of the `url`. |
| url | string | The URL to the evidence. |
| evidence_source | string | The source of the evidence. |
| media_key | string | The S3 Object Key to the media. |
| file_type | string | The file type/extension. |
| media_type | string | The media classification. |
| as_of_date | datetime | The date the row was generated. |

<!-- END SCHEMA SECTION -->
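
The `BEGIN`/`END SCHEMA SECTION` markers suggest that the table between them is regenerated rather than edited by hand, presumably by the `dataset-docs` script that the workflow runs before `mkdocs build`. That script (`src/scripts/docs.py`) is not shown in this diff, so the following is only a minimal sketch of how such a marker-delimited section could be rewritten. The function name `replace_schema_section` and the sample page are assumptions made for illustration.

```python
import re

BEGIN = "<!-- BEGIN SCHEMA SECTION -->"
END = "<!-- END SCHEMA SECTION -->"


def replace_schema_section(markdown: str, table: str) -> str:
    """Return `markdown` with the text between the markers replaced by `table`.

    Hypothetical helper: the project's real docs script may work differently.
    """
    pattern = re.compile(
        re.escape(BEGIN) + r".*?" + re.escape(END),
        flags=re.DOTALL,
    )
    replacement = f"{BEGIN}\n\n{table.strip()}\n\n{END}"
    # A function replacement avoids backslash-escape processing of `table`.
    new_text, count = pattern.subn(lambda _: replacement, markdown, count=1)
    if count == 0:
        raise ValueError("schema section markers not found")
    return new_text


# Sample page and table, invented for the example.
page = f"## Schema\n\n{BEGIN}\n\nold table\n\n{END}\n\n## Examples\n"
table = "| Name | Type |\n|:-----|:-------|\n| url  | string |"
updated = replace_schema_section(page, table)
print(updated)
```

Everything outside the markers is left untouched, so hand-written sections such as Sources and Examples would survive regeneration.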

## Examples

Evidence is largely composed of JPEGs and PNGs.

| JPEG | PNG |
| --- | --- |
| ![Example of an image in Oryx media dataset.](../_static/example-oryx-media.jpeg) | ![Example of an image in Oryx media dataset.](../_static/example-oryx-media.png) |
74 changes: 74 additions & 0 deletions docs/Datasets/Oryx.md
@@ -0,0 +1,74 @@
# Oryx

The Oryx dataset is a complete collection of the equipment losses in the Oryx database. The loss cases have been cleaned and transformed into JSON objects.

## Sources

- [Attack On Europe: Documenting Ukrainian Equipment Losses During The 2022 Russian Invasion Of Ukraine](https://www.oryxspioenkop.com/2022/02/attack-on-europe-documenting-ukrainian.html)
- [Attack On Europe: Documenting Russian Equipment Losses During The 2022 Russian Invasion Of Ukraine](https://www.oryxspioenkop.com/2022/02/attack-on-europe-documenting-equipment.html)
- [List Of Naval Losses During The Russian Invasion Of Ukraine](https://www.oryxspioenkop.com/2022/03/list-of-naval-losses-during-2022.html)
- [List Of Aircraft Losses During The Russian Invasion Of Ukraine](https://www.oryxspioenkop.com/2022/03/list-of-aircraft-losses-during-2022.html)

## Schema

<!-- BEGIN SCHEMA SECTION -->

| Name | Type | Description |
|:-------------------------------|:-------------|:-------------------------------------------------------------------------------------------------------------------------|
| country | string | The country that suffered the equipment loss. |
| category | string | The equipment category. |
| model | string | The equipment model. |
| url_hash | string | A SHA-256 hash of the `evidence_url`. |
| case_id | numeric | A special ID for discriminating equipment losses when their `country`, `category`, `model`, and `url_hash` are the same. |
| status | list(string) | The statuses of the equipment loss. |
| evidence_url | string | The URL to the evidence of the equipment loss. |
| country_of_production | string | The ISO Alpha-3 code of the country that produces the `model`. |
| country_of_production_flag_url | string | The URL to the flag of the country that produces the `model`. |
| evidence_source | string | The source of the evidence. |
| description | string | The Oryx description the equipment loss was extracted from. |
| id_ | numeric | The Oryx ID the equipment loss was labeled with. |
| as_of_date | datetime | The date the row was generated. |

<!-- END SCHEMA SECTION -->
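
The `case_id` description implies a per-group counter over losses that share the same `country`, `category`, `model`, and `url_hash`. As a rough sketch of that idea (not the project's actual implementation), the function below numbers such duplicates in the order they appear. Starting the counter at 1 is an assumption based on the example records, and `assign_case_ids` and the shortened hash values are hypothetical.

```python
from collections import defaultdict


def assign_case_ids(losses):
    """Number losses that share country, category, model, and url_hash.

    Hypothetical helper: IDs start at 1 and follow input order (assumption).
    """
    counters = defaultdict(int)
    labeled = []
    for loss in losses:
        key = (loss["country"], loss["category"], loss["model"], loss["url_hash"])
        counters[key] += 1
        # Copy the record so the caller's input is not mutated.
        labeled.append({**loss, "case_id": counters[key]})
    return labeled


# Two losses documented by the same image, plus one unrelated loss.
# The hash values are shortened placeholders, not real SHA-256 digests.
losses = [
    {"country": "Russia", "category": "Tanks", "model": "T-62 Obr. 1967", "url_hash": "aaa"},
    {"country": "Russia", "category": "Tanks", "model": "T-62 Obr. 1967", "url_hash": "aaa"},
    {"country": "Russia", "category": "Aircraft", "model": "Beriev A-50", "url_hash": "bbb"},
]
labeled = assign_case_ids(losses)
print([loss["case_id"] for loss in labeled])
```

Under these assumptions, the tuple (`country`, `category`, `model`, `url_hash`, `case_id`) identifies each loss uniquely.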

The `url_hash` is used as a file name and identifier for downloaded media from the [Media Inventory](./Media%20Inventory.md) dataset.
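
A minimal sketch of deriving such a hash with Python's standard library follows. The schema only states that `url_hash` is a SHA-256 hash of the URL, so the UTF-8 encoding, the hexadecimal output, and the `media_file_name` helper are assumptions. This sketch is not guaranteed to reproduce the hashes stored in the dataset.

```python
import hashlib


def url_hash(url: str) -> str:
    """Return the SHA-256 hex digest of `url` (UTF-8 encoding is an assumption)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def media_file_name(url: str, file_type: str) -> str:
    """Hypothetical file name built from the hash and the file extension."""
    return f"{url_hash(url)}.{file_type}"


evidence_url = "https://i.postimg.cc/yxw0SFD6/1001-T-62-Obr-1967-capt.jpg"
digest = url_hash(evidence_url)
print(digest)
print(media_file_name(evidence_url, "jpg"))
```

Because the digest depends only on the URL, the same evidence link always maps to the same identifier, which is what makes it usable as a join key between the two datasets.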

## Examples

```json
{
    "as_of_date": "2023-05-05T06:27:55.585531+00:00",
    "case_id": 1,
    "category": "Tanks",
    "country": "Russia",
    "country_of_production": "SUN",
    "country_of_production_flag_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/a/a9/Flag_of_the_Soviet_Union.svg/23px-Flag_of_the_Soviet_Union.svg.png",
    "description": "1, captured",
    "evidence_source": "postimg",
    "evidence_url": "https://i.postimg.cc/yxw0SFD6/1001-T-62-Obr-1967-capt.jpg",
    "id_": 1,
    "model": "T-62 Obr. 1967",
    "status": [
        "captured"
    ],
    "url_hash": "e32852f22ee32db27b3733229e1e518a67443adf4c6fc40ce60690f1ac6f3b6a"
}
```

The [Kaggle](https://www.kaggle.com/dominictarro/borderlands) dataset contains only the essential fields.

```json
{
    "country": "Russia",
    "category": "Aircraft",
    "model": "Beriev A-50",
    "url_hash": "b77d72f5bef846998d2b2be6226865a16cf3741f7fb5d6992dd77f93d31130bc",
    "case_id": 1,
    "status": [
        "destroyed"
    ],
    "evidence_url": "https://i.postimg.cc/g2Xg4dgw/1032-a50-awacs-destr-14-01-24.jpg",
    "country_of_production": "SUN",
    "evidence_source": "postimg"
}
```
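
The relationship between the two record shapes can be sketched as a simple projection. The field list below is inferred from the Kaggle example above rather than taken from the project's code, and `ESSENTIAL_FIELDS` and `to_essential` are hypothetical names.

```python
# Fields seen in the Kaggle example, in the same order (inferred, not authoritative).
ESSENTIAL_FIELDS = (
    "country",
    "category",
    "model",
    "url_hash",
    "case_id",
    "status",
    "evidence_url",
    "country_of_production",
    "evidence_source",
)


def to_essential(record: dict) -> dict:
    """Keep only the essential fields, skipping any that are absent."""
    return {field: record[field] for field in ESSENTIAL_FIELDS if field in record}


# A trimmed version of the full record shown earlier.
full_record = {
    "as_of_date": "2023-05-05T06:27:55.585531+00:00",
    "case_id": 1,
    "category": "Tanks",
    "country": "Russia",
    "country_of_production": "SUN",
    "description": "1, captured",
    "evidence_source": "postimg",
    "id_": 1,
    "model": "T-62 Obr. 1967",
    "status": ["captured"],
}
slim = to_essential(full_record)
print(list(slim))
```

Fields such as `as_of_date`, `description`, and `id_` are dropped by this projection, matching what the Kaggle example omits.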
23 changes: 23 additions & 0 deletions docs/License.md
@@ -0,0 +1,23 @@
# License

MIT License

Copyright (c) 2023 dominictarro

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
40 changes: 0 additions & 40 deletions docs/Media Inventory.md

This file was deleted.

62 changes: 0 additions & 62 deletions docs/Oryx.md

This file was deleted.

Binary file added docs/_static/bracket-curly.png
Binary file added docs/_static/calendar.png
Binary file added docs/_static/cloud-computing.png
File renamed without changes
File renamed without changes
Binary file added docs/_static/gallery.png
Binary file added docs/_static/twitter.png
25 changes: 25 additions & 0 deletions docs/about.md
@@ -0,0 +1,25 @@
# About

This dataset was curated to provide quality data to observers of the Russo-Ukrainian War. It is free for the public and voluntarily maintained. Collaborators are welcome to contribute to the project.

## Development Plan

::cards::

- title: Media dates
  content: Image dates extracted with OCR, Postimg web pages, and X.
  image: _static/calendar.png
- title: Media publication
  content: Publish the media dataset to Kaggle.
  image: _static/cloud-computing.png
- title: X media source
  content: Download media from X.
  image: _static/twitter.png

::/cards::

## Funding

The project's expenses are shouldered by the author. If you would like to support Borderlands's maintenance and development, please consider donating through my Patreon.

[Help fund the project :fontawesome-solid-hand-holding-dollar:](https://patreon.com/tarrodot){ .md-button }