CLDF Metadata: StructureDataset-metadata.json
Sources: sources.bib
The Carneiro dataset (6th edition) describes 618 cultural practices for 72 societies that are globally distributed and encompass a wide range of cultural complexity. The data was collected by Robert Carneiro and his team in the 1960s and 1970s for the Scale Analysis project. The original notes are deposited at the American Museum of Natural History.
property | value |
---|---|
dc:conformsTo | CLDF StructureDataset |
dc:license | https://creativecommons.org/licenses/by/4.0/ |
dcat:accessURL | https://github.com/d-place/dplace-dataset-carneiro |
prov:wasDerivedFrom | |
prov:wasGeneratedBy | |
rdf:ID | dplace-dataset-carneiro |
rdf:type | http://www.w3.org/ns/dcat#Distribution |
Table data.csv
Values are coded datapoints, i.e. measurements of a variable for a society.
Note: Missing data is signaled by an empty Value column.
property | value |
---|---|
dc:conformsTo | CLDF ValueTable |
dc:extent | 44496 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Soc_ID | string | References societies.csv::ID |
Var_ID | string | References variables.csv::ID |
Value | string | Values for categorical and ordinal variables reference the corresponding code via the Code_ID column. Values for continuous variables have the measured number in the Value column and an empty Code_ID. |
Code_ID | string | References codes.csv::ID |
Comment | string | |
Source | list of string (separated by ; ) | References sources.bib::BibTeX-key |
sub_case | string | More specific description of the population the data refer to in terms of society or area. |
year | string Regex: -?[0-9]{1,4}(-[0-9]{4})? | Focal year, i.e. the time period to which the data refer. |
source_coded_data | string | The source of the coded data, which was aggregated in this dataset. |
admin_comment | string | |
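The rules for Value, Code_ID and year can be sketched in a few lines of Python. This is a minimal sketch, not part of the dataset tooling: the function names `parse_year` and `decode_value` are illustrative, rows are assumed to be dicts keyed by column name (as produced by `csv.DictReader`), and the IDs in the usage lines are made up.

```python
import re

# Pattern copied from the `year` column: a year, optionally followed by "-YYYY".
YEAR_RE = re.compile(r"-?[0-9]{1,4}(-[0-9]{4})?")


def parse_year(text):
    """Return (start, end) as integers, or None for an empty or invalid value."""
    if not text or not YEAR_RE.fullmatch(text):
        return None
    # Split on the range hyphen only; a leading "-" marks a negative year.
    head, sep, tail = text[1:].partition("-")
    start = int(text[0] + head)
    end = int(tail) if sep else start
    return start, end


def decode_value(row, codes_by_id):
    """Interpret one data.csv row (a dict keyed by column name)."""
    if row["Value"] == "":
        return None  # missing data is signaled by an empty Value
    if row["Code_ID"]:
        # Categorical or ordinal: report the name of the referenced code.
        return codes_by_id[row["Code_ID"]]["Name"]
    # Continuous: the measured number is stored as text in Value.
    return float(row["Value"])


# Usage with made-up rows (the IDs are hypothetical, not taken from the dataset).
codes_by_id = {"var1-1": {"ID": "var1-1", "Var_ID": "var1", "Name": "present"}}
coded_row = {"Value": "1", "Code_ID": "var1-1", "year": "1920-1930"}
coded_result = decode_value(coded_row, codes_by_id)
year_range = parse_year(coded_row["year"])
```

Returning `None` for both missing values and unparseable years keeps the caller's handling uniform; a stricter reader might prefer to raise on malformed input instead.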
Table societies.csv
We use the term “society” to refer to cultural groups. In most cases, a society can be understood to represent a group of people at a focal location with a shared language that differs from that of their neighbors. However, in some cases multiple societies share a language.
property | value |
---|---|
dc:conformsTo | CLDF LanguageTable |
dc:extent | 72 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Name | string | |
Latitude | decimal ≥ -90 ≤ 90 | |
Longitude | decimal ≥ -180 ≤ 180 | |
Glottocode | string Regex: [a-z0-9]{4}[1-9][0-9]{3} | |
Name_and_ID_in_source | string | Society names identified as pejorative have been replaced with a preferred, English-language ethnonym. The name (and ID) as given in the source dataset is kept in this field. |
xd_id | string | “cross-data-set” identifier, used to link societies present in different datasets, if they share a focal location. Note: If this field is empty, other fields such as Name, Glottocode, focal year and location may be used to identify societies across datasets if appropriate. |
alt_names_by_society | list of string (separated by ; ) | A list of ‘alternate’ names for the society; includes, where available, one or more autonyms in the society’s own language, as well as other commonly encountered ethnonyms. |
main_focal_year | integer | Focal year specifying the time period to which the data refer, given as number of years BCE - if negative - or CE. |
HRAF_name_ID | string | Name(s) and ID(s) of the corresponding society in HRAF (the Human Relations Area Files) |
HRAF_ID | string | ID of the corresponding society in HRAF |
origLat | decimal ≥ -90 ≤ 90 | Uncorrected latitude as given in the source. |
origLong | decimal ≥ -270 ≤ 180 | Uncorrected longitude as given in the source. |
comment | string | |
glottocode_comment | string | Comment on the Glottocode assignment. |
region | string | World Geographical Scheme for Recording Plant Distributions level2 region |
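The datatype constraints listed for Latitude, Longitude and Glottocode can be checked row by row. The following is a minimal sketch under stated assumptions: `society_problems` is an illustrative name, rows are dicts of strings keyed by column name, empty cells are treated as allowed, and the sample row is invented.

```python
import re

# Pattern copied from the Glottocode column of societies.csv.
GLOTTOCODE_RE = re.compile(r"[a-z0-9]{4}[1-9][0-9]{3}")


def society_problems(row):
    """Return a list of constraint violations for one societies.csv row."""
    problems = []
    for column, low, high in (("Latitude", -90, 90), ("Longitude", -180, 180)):
        text = row.get(column, "")
        if text == "":
            continue  # assumption: empty cells are permitted
        try:
            number = float(text)
        except ValueError:
            problems.append(f"{column} is not a number: {text!r}")
            continue
        if not low <= number <= high:
            problems.append(f"{column} out of range: {text}")
    glottocode = row.get("Glottocode", "")
    if glottocode and not GLOTTOCODE_RE.fullmatch(glottocode):
        problems.append(f"malformed Glottocode: {glottocode!r}")
    return problems


# Usage with an invented row: the latitude is outside the allowed range.
sample = {"Latitude": "95", "Longitude": "10", "Glottocode": "abcd1234"}
sample_problems = society_problems(sample)
```

Collecting all violations in a list, rather than stopping at the first, makes it easier to report every issue for a row in one pass.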
Table variables.csv
Variables are cultural features or practices, or environmental descriptors.
property | value |
---|---|
dc:conformsTo | CLDF ParameterTable |
dc:extent | 618 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [A-Za-z.0-9_]+([0-9]+)? | Primary key |
Name | string | |
Description | string | |
ColumnSpec | json | |
category | list of string (separated by , ) | |
type | string Valid choices: Continuous, Categorical, Ordinal | Variables may be categorical, in which case they must be accompanied by a list of possible ‘codes’, i.e. rows in CodeTable. Variables can also be continuous (e.g. population size) or ordinal. Ordinal variables, like categorical ones, are accompanied by a list of codes; the order of the codes is given by the ord column in CodeTable. |
unit | string | The unit of measurement |
source_comment | string | A note about the source of this variable. |
changes | string | Notes about how a variable may have been derived from the source. |
comment | string | |
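The requirement that categorical and ordinal variables come with codes can be expressed as a simple consistency check. This is a sketch only: `variables_without_codes` is an illustrative name, both arguments are assumed to be lists of dicts keyed by column name, and the sample rows are invented.

```python
def variables_without_codes(variables, codes):
    """Return IDs of Categorical/Ordinal variables that no code row refers to."""
    coded = {code["Var_ID"] for code in codes}
    return [
        var["ID"]
        for var in variables
        if var["type"] in ("Categorical", "Ordinal") and var["ID"] not in coded
    ]


# Usage with invented rows: "var2" is ordinal but has no codes.
variables = [
    {"ID": "var1", "type": "Categorical"},
    {"ID": "var2", "type": "Ordinal"},
    {"ID": "var3", "type": "Continuous"},
]
codes = [{"ID": "var1-1", "Var_ID": "var1", "Name": "present", "ord": "1"}]
uncoded = variables_without_codes(variables, codes)
```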
Table codes.csv
property | value |
---|---|
dc:conformsTo | CLDF CodeTable |
dc:extent | 1236 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Var_ID | string | The parameter or variable the code belongs to. References variables.csv::ID |
Name | string | |
Description | string | |
ord | integer | |
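Since the order of codes for an ordinal variable is carried by the ord column, the codes can be grouped per variable and sorted on it. The sketch below makes a few assumptions: `ordered_codes` is an illustrative name, rows are dicts of strings keyed by column name, rows with an empty ord are placed last, and the sample codes are invented.

```python
from collections import defaultdict


def ordered_codes(codes):
    """Group code rows by Var_ID and sort each group by the integer ord column."""
    grouped = defaultdict(list)
    for code in codes:
        grouped[code["Var_ID"]].append(code)
    return {
        var_id: sorted(
            rows,
            # Assumption: rows with an empty ord sort after all numbered rows.
            key=lambda row: (row["ord"] == "", int(row["ord"] or 0)),
        )
        for var_id, rows in grouped.items()
    }


# Usage with invented codes listed out of order.
sample_codes = [
    {"ID": "var1-b", "Var_ID": "var1", "Name": "high", "ord": "2"},
    {"ID": "var1-a", "Var_ID": "var1", "Name": "low", "ord": "1"},
]
names_in_order = [code["Name"] for code in ordered_codes(sample_codes)["var1"]]
```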