Improve docu #127

Merged · 32 commits · Jun 30, 2024
Commits
6d1c4ba
improve nb
pmayd Jun 25, 2024
a892f7b
update table guide
pmayd Jun 25, 2024
64c88b4
Add preliminary "Kreise" shape files
MarcoHuebner Jun 26, 2024
d64cc5f
Add description for last call
MarcoHuebner Jun 26, 2024
7afeaf7
Add descriptions and beautify plots
MarcoHuebner Jun 26, 2024
27f6a1d
Add maps for regional depth
MarcoHuebner Jun 27, 2024
abd93f2
Update notebook with more markdown descriptions, remove intermediate …
MarcoHuebner Jun 27, 2024
ad27f0c
Added minor additional comments to table nb.
Jun 28, 2024
f1f42db
Minor tweaks to README.
Jun 28, 2024
44df05d
Fix caching for jobs (content type json)
MarcoHuebner Jun 29, 2024
82c22f2
Clean up and finalize geo visualization notebook
MarcoHuebner Jun 29, 2024
a67f5e3
Merge branch 'improve-docu' of https://github.com/CorrelAid/pystatis …
MarcoHuebner Jun 29, 2024
de20f8d
Merge branch 'dev' into improve-docu
MarcoHuebner Jun 29, 2024
52c9368
wip
pmayd Jun 27, 2024
fdb4e56
fix error when trying to load job data from cache
pmayd Jun 29, 2024
715b0b4
Add source for geoshapes and fix some typos.
Jun 29, 2024
b0c8cd6
Remove notebooks that are either empty or not required.
Jun 29, 2024
8949062
Merge branch 'dev' into improve-docu
bergnerjonas Jun 29, 2024
a79dfa5
Move profile notebook into setup notebook.
Jun 29, 2024
1143055
Merge branch 'improve-docu' of https://github.com/CorrelAid/pystatis …
Jun 29, 2024
49ac148
Remove python file generated from presentation notebook.
Jun 29, 2024
974d69a
Rename other notebooks for consistency.
Jun 29, 2024
109d850
Remove notebook artifacts from renaming.
Jun 29, 2024
41e5a11
Merge branch 'improve-docu' of https://github.com/CorrelAid/pystatis …
Jun 29, 2024
8639113
Translate Jobs notebook to EN.
Jun 29, 2024
2732662
Update find notebook and fix notebook names.
Jun 29, 2024
230b3f3
Merge branch 'dev' into improve-docu
Jun 30, 2024
62f6d04
finish nb 01
pmayd Jun 30, 2024
6d3b033
update nb02
pmayd Jun 30, 2024
854d091
update nb04
pmayd Jun 30, 2024
15d4c77
update docs
pmayd Jun 30, 2024
cd768db
README
pmayd Jun 30, 2024
47 changes: 24 additions & 23 deletions README.md
@@ -16,12 +16,14 @@ The main features are:
- **Simplified access** to all supported API. No more need to write cumbersome API calls or switch between databases.
- **Credential management** removes the need to manually add credentials. We handle all your credentials for you.
- **Database management** handles different databases and lets you switch easily between them.
- **Integrated workflow** enables an end-to-end process from finding the relevant data to download it.
- **Integrated workflow** enables an end-to-end process from finding the relevant data to downloading it.
- **Pandas support** instead of manually parsing results.
- **Caching** to enable productive work despite strict query limits.
- **Starting and handling background jobs** for datasets that are to big to be downloaded directly from GENESIS.
- **Starting and handling background jobs** for datasets that are too big to be downloaded directly from GENESIS.

To learn more about GENESIS refer to the official documentation [here](https://www.destatis.de/EN/Service/OpenData/api-webservice.html).
To learn more about GENESIS, please refer to the official documentation [here](https://www.destatis.de/EN/Service/OpenData/api-webservice.html).

The full documentation of the main and dev branches is hosted via [GitHub Pages (main)](https://correlaid.github.io/pystatis/) and [GitHub Pages (dev)](https://correlaid.github.io/pystatis/dev/).

## Installation

@@ -41,7 +43,7 @@ print("Version:", pystatis.__version__)

## Getting started

To be able to use the web service/API of either GENESIS-Online, Regionaldatenbank or Zensus, you have to be a registered user. You can create your user [here](https://www-genesis.destatis.de/genesis/online?Menu=Anmeldung), [here](https://www.regionalstatistik.de/genesis/online?Menu=Registrierung#abreadcrumb), or [here](https://ergebnisse2011.zensus2022.de/datenbank/online?Menu=Registrierung#abreadcrumb).
To be able to use the web service/API of either GENESIS-Online, Regionaldatenbank or Zensus, you have to be a registered user of the respective database. You can create your user [here](https://www-genesis.destatis.de/genesis/online?Menu=Anmeldung), [here](https://www.regionalstatistik.de/genesis/online?Menu=Registrierung#abreadcrumb), or [here](https://ergebnisse2011.zensus2022.de/datenbank/online?Menu=Registrierung#abreadcrumb).

Once you have a registered user, you can use your username and password as credentials for authentication against the web service/API.

@@ -74,21 +76,21 @@ This package currently supports retrieving the following data types:

### Find the right data

`pystatis` offers the `Find` class to search for any piece of information with GENESIS. Behind the scene it's using the `find` endpoint.
`pystatis` offers the `Find` class to search for any piece of information within each database. Behind the scenes, it uses the `find` endpoint.

Example:

```python
from pystatis import Find

results = Find("Rohöl") # Initiates object that contains all variables, statistics, tables and cubes
results = Find("Rohöl", "genesis") # Initiates object that contains all variables, statistics, tables and cubes
results.run() # Runs the query
results.tables.df # Results for tables
results.tables.get_code([1,2,3]) # Gets the table codes, e.g. for downloading the table
results.tables.get_metadata([1,2]) # Gets the metadata for the table
```

A complete overview of all use cases is provided in the example notebook for [find](https://github.com/CorrelAid/pystatis/blob/main/nb/find.ipynb).
A complete overview of all use cases is provided in the example notebook for [find](https://github.com/CorrelAid/pystatis/blob/main/nb/03_find.ipynb).

### Download data

@@ -101,10 +103,10 @@ from pystatis import Table

t = Table(name="21311-0001") # data is not yet downloaded
t.get_data() # only now is the data either fetched from GENESIS or loaded from cache. Downloaded data is also cached, so the next call loads it from cache. The default language of the data is German, but it can be set to either German (de) or English (en) using the language parameter of get_data().
t.data # prettified data stored as pandas data frame
t.data # prettified data stored as pandas DataFrame
```

For more details, please study the provided sample notebook for [tables](https://github.com/CorrelAid/pystatis/blob/main/nb/table.ipynb).
For more details, please study the provided sample notebook for [tables](https://github.com/CorrelAid/pystatis/blob/main/nb/01_table.ipynb).

### Clear Cache

@@ -117,18 +119,6 @@ clear_cache("21311-0001") # only deletes the data for the object with the speci
clear_cache() # deletes the complete cache
```

### Full documentation

The full documentation of the main and dev branches are hosted via [GitHub Pages (main)](https://correlaid.github.io/pystatis/) and [GitHub Pages (dev)](https://correlaid.github.io/pystatis/dev/). It can also be built locally by running

```bash
cd docs && make clean && make html
```

from the project root directory. Besides providing parsed docstrings of the individual package modules, the full documentation currently mirrors most of the readme, like installation and usage. The mirroring crucially relies on the names of the section headers in the ReadMe, so change them with care!

More information on how to use sphinx is provided [here](https://docs.readthedocs.io/en/stable/intro/getting-started-with-sphinx.html).

## License

Distributed under the MIT License. See `LICENSE.txt` for more information.
@@ -139,11 +129,10 @@ A few ideas we should implement in the maybe-near future:

- Mechanism to download data that is newer than the cached version. Right now, once data is cached, it is always retrieved from cache no matter if there is a newer version online. However, this could be quite challenging as the GENESIS API is really bad in providing a good and consistent field for the last update datetime.
- Improve Table metadata so the user can look up the variables contained in the dataset and for each variable the values that this variable can have.
- Understand and support time series.

## How to contribute?

Contributions to this project are highly appreciated! You can either contact the maintainers or directly create a pull request for your proposed changes:
Contributions to this project are highly appreciated! You can either contact the maintainers, create an issue or directly create a pull request for your proposed changes:

1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/<descriptive-name>`)
@@ -180,3 +169,15 @@ To contribute to this project, please follow these steps:
11. Create a new PR, always against `dev` as target.

To learn more about `poetry`, see [Dependency Management With Python Poetry](https://realpython.com/dependency-management-python-poetry/#command-reference) by realpython.com.

### Documentation process

Documentation can also be built locally by running

```bash
cd docs && make clean && make html
```

from the project root directory. Besides providing parsed docstrings of the individual package modules, the full documentation currently mirrors most of the README, such as installation and usage. The mirroring crucially relies on the names of the section headers in the README, so change them with care!

More information on how to use sphinx is provided [here](https://docs.readthedocs.io/en/stable/intro/getting-started-with-sphinx.html).
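
The README text above describes `Table.get_data()` as fetching data only when called, caching whatever was downloaded, and serving later calls from the cache, with `clear_cache` removing either one entry or everything. The following is a minimal, generic sketch of that pattern, assuming one JSON file per object in a cache directory. `CachedResource`, its `fetch` callback, and the `clear_cache(cache_dir, name)` signature shown here are hypothetical illustrations, not pystatis's actual implementation.

```python
import json
from pathlib import Path
from typing import Callable, Optional


class CachedResource:
    """Hypothetical stand-in for an object whose data is loaded lazily and cached on disk."""

    def __init__(self, name: str, cache_dir: Path, fetch: Callable[[str], dict]):
        self.name = name
        self.cache_dir = cache_dir
        self._fetch = fetch  # assumed download function, called only on a cache miss
        self.data: Optional[dict] = None  # nothing is loaded at construction time

    @property
    def _cache_file(self) -> Path:
        return self.cache_dir / f"{self.name}.json"

    def get_data(self) -> dict:
        """Load from the cache if present; otherwise fetch and store the result."""
        if self._cache_file.exists():
            self.data = json.loads(self._cache_file.read_text(encoding="utf-8"))
        else:
            self.data = self._fetch(self.name)
            self.cache_dir.mkdir(parents=True, exist_ok=True)
            self._cache_file.write_text(json.dumps(self.data), encoding="utf-8")
        return self.data


def clear_cache(cache_dir: Path, name: Optional[str] = None) -> None:
    """Delete the cached entry for `name`, or every entry when no name is given."""
    pattern = f"{name}.json" if name else "*.json"
    for path in cache_dir.glob(pattern):
        path.unlink()
```

In this sketch a cached entry is never refreshed, so stale data is returned until the entry is cleared. That matches the limitation the README lists among its open ideas (no mechanism yet to download data newer than the cached version).
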
2 changes: 1 addition & 1 deletion docs/source/pystatis.rst
@@ -22,7 +22,7 @@ pystatis.config module
pystatis.custom\_exceptions module
----------------------------------

.. automodule:: pystatis.custom_exceptions
.. automodule:: pystatis.exception
:members:
:undoc-members:
:show-inheritance:
23 changes: 23 additions & 0 deletions nb/00_Setup.ipynb
@@ -112,6 +112,29 @@
"with open(Path.home() / \".pystatis\" / \"config.ini\") as f:\n",
" print(f.read())"
]
},
{
"cell_type": "markdown",
"id": "26b7c286",
"metadata": {},
"source": [
"The `profile` module allows you to change your password (`change_password`) and use the (unavailable) `remove_result` functionality, listed in the [documentation under 2.7.2](https://www-genesis.destatis.de/genesis/misc/GENESIS-Webservices_Einfuehrung.pdf)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d80712ce",
"metadata": {},
"outputs": [],
"source": [
"# change your password\n",
"pystatis.profile.change_password(db_name=\"genesis\", new_password=\"DoNotUseThisAccidentally\")\n",
"\n",
"# use remove_result functionality\n",
"destatis_name_code = \"42131-0001\"\n",
"pystatis.profile.remove_result(name=destatis_name_code)"
]
}
],
"metadata": {