Re-generate CLDF with notes.
gederajeg committed Jul 6, 2024
1 parent 1e036fa commit 5b2e2bd
Showing 11 changed files with 343 additions and 335 deletions.
608 changes: 304 additions & 304 deletions .Rhistory

Large diffs are not rendered by default.

14 changes: 6 additions & 8 deletions .zenodo.json
@@ -7,20 +7,18 @@
],
"creators": [
{
-"name": "Carl Benjamin Hermann von Rosenberg"
-}
-],
-"contributors": [
-{
-"name": "Gede Primahadi W. Rajeg",
-"type": "Other"
+"name": "Gede Primahadi W. Rajeg"
}
],
+"contributors": [],
"communities": [
{
"identifier": "lexibank"
}
],
"upload_type": "dataset",
-"description": "<p>Cite the source of the dataset as:</p>\n\n<blockquote>\n<p>Rosenberg, Carl Benjamin Hermann von. 1853. De Mentawei-Eilanden en Hunne Bewoners. Tijdschrift voor Indische Taal-, Land- en Volkenkunde 1. 403\u2013440.</p>\n</blockquote>"
+"description": "<p>Cite the source of the dataset as:</p>\n\n<blockquote>\n<p>Rosenberg, Carl Benjamin Hermann von. 1853. De Mentawei-Eilanden en Hunne Bewoners. Tijdschrift voor Indische Taal-, Land- en Volkenkunde 1. 403\u2013440.</p>\n</blockquote>",
+"license": {
+"id": "CC-BY-NC-SA-4.0"
+}
}
3 changes: 1 addition & 2 deletions CONTRIBUTORS.md
@@ -2,5 +2,4 @@

Name | GitHub user | Description | Role
--- | --- | --- | ---
-Carl Benjamin Hermann von Rosenberg | | | Author
-Gede Primahadi W. Rajeg | @gederajeg | maintainer, CLDF conversion, Concepticon mapping, Orthography profiling | Other
+Gede Primahadi W. Rajeg | @gederajeg | digitisation, code, CLDF conversion, Concepticon mapping, Orthography profiling | Maintainer
7 changes: 4 additions & 3 deletions LICENSE
@@ -33,7 +33,7 @@ exhaustive, and do not form part of our licenses.
material not subject to the license. This includes other CC-
licensed material, or material used under an exception or
limitation to copyright. More considerations for licensors:
-wiki.creativecommons.org/Considerations_for_licensors
+wiki.creativecommons.org/Considerations_for_licensors

Considerations for the public: By using one of our public
licenses, a licensor grants the public permission to use the
@@ -49,8 +49,8 @@ exhaustive, and do not form part of our licenses.
such as asking that all changes be marked or described.
Although not required by our licenses, you are encouraged to
respect those requests where reasonable. More considerations
-for the public:
-wiki.creativecommons.org/Considerations_for_licensees
+for the public:
+wiki.creativecommons.org/Considerations_for_licensees

=======================================================================

@@ -435,3 +435,4 @@ the avoidance of doubt, this paragraph does not form part of the
public licenses.

Creative Commons may be contacted at creativecommons.org.

16 changes: 13 additions & 3 deletions README.md
@@ -1,5 +1,9 @@
# CLDF dataset derived from von Rosenberg's "De Mentawei-Eilanden en Hunne Bewoners" from 1853

+<!-- badges: start -->
+[![CLDF validation](https://github.com/complexico/mentawai-word-list-1853/workflows/CLDF-validation/badge.svg)](https://github.com/complexico/mentawai-word-list-1853/actions?query=workflow%3ACLDF-validation)
+<!-- badges: end -->
+
## How to cite

If you use these data please cite
@@ -10,10 +14,17 @@ If you use these data please cite
## Description


-This dataset is licensed under a CC-BY-NC-SA 4.0 license
+This dataset is licensed under a https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en license

Available online at https://www.digitale-sammlungen.de/en/view/bsb10433845?page=450,451

+## Notes
+
+Based on the [Rights Statement](https://www.digitale-sammlungen.de/en/details/bsb10433845) (presented down below in that page), this digitised journal has a [No Copyright-Non-commercial use only](https://rightsstatements.org/page/NoC-NC/1.0/?language=en) condition.
+
+Before the CLDF conversion using Python, the materials in this repository (inside the [data](https://github.com/complexico/mentawai-word-list-1853/tree/main/data) directory) were processed using R as an RStudio project (the R scripts are in the [codes](https://github.com/complexico/mentawai-word-list-1853/tree/main/codes) directory). The English gloss of the Dutch was generated via the DeepL translator using the [`deeplr` R package](https://cran.r-project.org/package=deeplr).
+
+
## Statistics


@@ -37,8 +48,7 @@ Available online at https://www.digitale-sammlungen.de/en/view/bsb10433845?page=

Name | GitHub user | Description | Role
--- | --- | --- | ---
-Carl Benjamin Hermann von Rosenberg | | | Author
-Gede Primahadi W. Rajeg | @gederajeg | maintainer, CLDF conversion, Concepticon mapping, Orthography profiling | Other
+Gede Primahadi W. Rajeg | @gederajeg | digitisation, code, CLDF conversion, Concepticon mapping, Orthography profiling | Maintainer



16 changes: 8 additions & 8 deletions cldf/.transcription-report.json
@@ -2,9 +2,9 @@
"by_language": {
"Mentawai": {
"bipa_errors": [
+"t\u0361\u0292",
"<<\u00eb>>",
-"<<->>",
-"t\u0361\u0292"
+"<<->>"
],
"general_errors": 18,
"replacements": {
@@ -100,9 +100,9 @@
]
},
"sclass_errors": [
+"t\u0361\u0292",
"<<\u00eb>>",
-"<<->>",
-"t\u0361\u0292"
+"<<->>"
],
"segments": {
"+": 1,
@@ -251,9 +251,9 @@
],
"bad_words_count": 15,
"bipa_errors": [
+"t\u0361\u0292",
"<<\u00eb>>",
-"<<->>",
-"t\u0361\u0292"
+"<<->>"
],
"general_errors": 18,
"invalid_words": [],
@@ -352,9 +352,9 @@
]
},
"sclass_errors": [
+"t\u0361\u0292",
"<<\u00eb>>",
-"<<->>",
-"t\u0361\u0292"
+"<<->>"
],
"segments": {
"+": 1,
4 changes: 2 additions & 2 deletions cldf/README.md
@@ -11,9 +11,9 @@ property | value
[dc:bibliographicCitation](http://purl.org/dc/terms/bibliographicCitation) | Rosenberg, Carl Benjamin Hermann von. 1853. De Mentawei-Eilanden en Hunne Bewoners. Tijdschrift voor Indische Taal-, Land- en Volkenkunde 1. 403–440.
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF Wordlist](http://cldf.clld.org/v1.0/terms.rdf#Wordlist)
[dc:identifier](http://purl.org/dc/terms/identifier) | https://www.digitale-sammlungen.de/en/view/bsb10433845?page=450,451
-[dc:license](http://purl.org/dc/terms/license) | CC-BY-NC-SA 4.0
+[dc:license](http://purl.org/dc/terms/license) | https://creativecommons.org/licenses/by-nc-sa/4.0/
[dcat:accessURL](http://www.w3.org/ns/dcat#accessURL) | git@github.com:complexico/mentawai-word-list-1853
-[prov:wasDerivedFrom](http://www.w3.org/ns/prov#wasDerivedFrom) | <ol><li><a href="git@github.com:complexico/mentawai-word-list-1853/tree/774ff3f">git@github.com:complexico/mentawai-word-list-1853 774ff3f</a></li><li><a href="glottolog-glottolog-d9da5e2">Glottolog glottolog-glottolog-d9da5e2</a></li><li><a href="https://github.com/concepticon/concepticon-data/tree/7c0b6ae3">Concepticon v3.1.0-19-g7c0b6ae3</a></li><li><a href="cldf-clts-clts-6dc73af">CLTS cldf-clts-clts-6dc73af</a></li></ol>
+[prov:wasDerivedFrom](http://www.w3.org/ns/prov#wasDerivedFrom) | <ol><li><a href="git@github.com:complexico/mentawai-word-list-1853/tree/1e036fa">git@github.com:complexico/mentawai-word-list-1853 1e036fa</a></li><li><a href="glottolog-glottolog-d9da5e2">Glottolog glottolog-glottolog-d9da5e2</a></li><li><a href="https://github.com/concepticon/concepticon-data/tree/7c0b6ae3">Concepticon v3.1.0-19-g7c0b6ae3</a></li><li><a href="cldf-clts-clts-6dc73af">CLTS cldf-clts-clts-6dc73af</a></li></ol>
[prov:wasGeneratedBy](http://www.w3.org/ns/prov#wasGeneratedBy) | <ol><li><strong>lingpy-rcParams</strong>: <a href="./lingpy-rcParams.json">lingpy-rcParams.json</a></li><li><strong>python</strong>: 3.9.6</li><li><strong>python-packages</strong>: <a href="./requirements.txt">requirements.txt</a></li></ol>
[rdf:ID](http://www.w3.org/1999/02/22-rdf-syntax-ns#ID) | barrier-islands-mentawai-wlist1853
[rdf:type](http://www.w3.org/1999/02/22-rdf-syntax-ns#type) | http://www.w3.org/ns/dcat#Distribution
4 changes: 2 additions & 2 deletions cldf/cldf-metadata.json
@@ -5,7 +5,7 @@
"dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#Wordlist",
"dc:identifier": "https://www.digitale-sammlungen.de/en/view/bsb10433845?page=450,451",
"dc:isVersionOf": null,
-"dc:license": "CC-BY-NC-SA 4.0",
+"dc:license": "https://creativecommons.org/licenses/by-nc-sa/4.0/",
"dc:related": null,
"dc:source": "sources.bib",
"dc:title": "CLDF dataset derived from von Rosenberg's \"De Mentawei-Eilanden en Hunne Bewoners\" from 1853",
@@ -14,7 +14,7 @@
{
"rdf:about": "git@github.com:complexico/mentawai-word-list-1853",
"rdf:type": "prov:Entity",
-"dc:created": "774ff3f",
+"dc:created": "1e036fa",
"dc:title": "Repository"
},
{
2 changes: 1 addition & 1 deletion cldf/lingpy-rcParams.json
@@ -123,7 +123,7 @@
"scorer": {},
"sonar": true,
"stress": "\u02c8\u02cc'",
-"timestamp": "2024-07-06 11:25",
+"timestamp": "2024-07-06 14:58",
"tones": "\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079\u2070\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089\u20800123456789\u02e5\u02e6\u02e7\u02e8\u02e9\u02ea\u02eb-\ua708-\ua709-\ua70a-\ua70b-\ua70c-\ua70d-\ua70e-\ua70f-\ua710-\ua711-\ua712-\ua713-\ua714-\ua715-\ua716-\ua717-\ua718-\ua719-\ua71a-\ua700-\ua701-\ua702-\ua703-\ua704-\ua705-\ua706-\ua707",
"tree_calc": "neighbor",
"unique_sequences": true,
2 changes: 1 addition & 1 deletion metadata.json
@@ -2,7 +2,7 @@
"id": "barrier-islands-mentawai-wlist1853",
"title": "CLDF dataset derived from von Rosenberg's \"De Mentawei-Eilanden en Hunne Bewoners\" from 1853",
"description": null,
-"license": "CC-BY-NC-SA 4.0",
+"license": "https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en",
"url": "https://www.digitale-sammlungen.de/en/view/bsb10433845?page=450,451",
"citation": "Rosenberg, Carl Benjamin Hermann von. 1853. De Mentawei-Eilanden en Hunne Bewoners. Tijdschrift voor Indische Taal-, Land- en Volkenkunde 1. 403\u2013440."
}
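
Reviewer note: after this commit the license URL is spelled two ways, with a trailing `deed.en` in `metadata.json` and without it in `cldf/cldf-metadata.json`. A minimal Python sketch of how the two spellings could be compared is given here; the helper name `normalize_cc_license` is hypothetical and is not part of cldfbench or pylexibank, and the rule of dropping a trailing `deed` segment is an assumption about how Creative Commons deed URLs are laid out.

```python
from urllib.parse import urlparse


def normalize_cc_license(value: str) -> str:
    """Return a comparable form of a Creative Commons license URL.

    Hypothetical helper: drops a trailing 'deed' or 'deed.<lang>' path
    segment and ends the URL with a slash. Strings that are not http(s)
    URLs (for example 'CC-BY-NC-SA 4.0') are returned unchanged.
    """
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https"):
        return value
    segments = [s for s in parsed.path.split("/") if s]
    if segments and segments[-1].split(".")[0] == "deed":
        segments.pop()
    path = "/".join(segments)
    return f"https://{parsed.netloc}/{path}/" if path else f"https://{parsed.netloc}/"


# Values copied from the two metadata files in this commit.
metadata_json = "https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en"
cldf_metadata_json = "https://creativecommons.org/licenses/by-nc-sa/4.0/"

same_license = normalize_cc_license(metadata_json) == normalize_cc_license(cldf_metadata_json)
print(same_license)
```

Under these assumptions the two values refer to the same license; the older plain string `CC-BY-NC-SA 4.0` passes through unchanged and would need a separate mapping.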
2 changes: 1 addition & 1 deletion tutorial-step-notes-to-create-the-cldf.sh
@@ -59,7 +59,7 @@ tree -v --charset utf-8
cldfbench lexibank.makecldf cldfbench_barrier-islands-mentawai-wlist1853.py --glottolog "/Users/Primahadi/Documents/cldf_project/glottolog-glottolog-d9da5e2" --concepticon "/Users/Primahadi/Documents/cldf_project/concepticon/concepticon-data" --clts "/Users/Primahadi/Documents/cldf_project/cldf-clts-clts-6dc73af"

# to create an orthography profile (with a guess to possible IPA form/phoneme) from the Form col. in cldf/forms.csv using pylexibank (cf. List (2021: section 6)): https://calc.hypotheses.org/2954
-cldfbench lexibank.init_profile cldfbench_barrier-islands-mentawai-wlist1853.py --clts "/Users/Primahadi/Documents/cldf_project/cldf-clts-clts-6dc73af"
+# cldfbench lexibank.init_profile cldfbench_barrier-islands-mentawai-wlist1853.py --clts "/Users/Primahadi/Documents/cldf_project/cldf-clts-clts-6dc73af"

## note on orthography workflow
- # we could add an orthography profile file (orthography.tsv) in `etc` directory that we previously created using qlcData and manually edited (## ensure we already have the IPA match of the grapheme as well!)
