StructureDataset D-PLACE dataset derived from Robert L. Carneiro's Dataset (6th edition)

CLDF Metadata: StructureDataset-metadata.json

Sources: sources.bib

The Carneiro dataset (6th edition) describes 618 cultural practices for 72 societies that are globally distributed and encompass a wide range of cultural complexity. The data was collected by Robert Carneiro and his team in the 1960s and 1970s for the Scale Analysis project. The original notes are deposited at the American Museum of Natural History.

| property | value |
| --- | --- |
| dc:conformsTo | CLDF StructureDataset |
| dc:license | https://creativecommons.org/licenses/by/4.0/ |
| dcat:accessURL | https://github.com/d-place/dplace-dataset-carneiro |
| prov:wasDerivedFrom | 1. d-place/dplace-dataset-carneiro f8650bc; 2. Glottolog v5.0 |
| prov:wasGeneratedBy | 1. python: 3.10.12; 2. python-packages: requirements.txt |
| rdf:ID | dplace-dataset-carneiro |
| rdf:type | http://www.w3.org/ns/dcat#Distribution |
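
As a quick orientation, the row counts declared for each table can be read from the metadata file. The sketch below is a minimal illustration, assuming the metadata follows the usual CSVW layout with a top-level `tables` list whose entries carry `url` and `dc:extent`. The inline excerpt is hypothetical and only mirrors the extents listed in this README; `summarize_tables` is an illustrative helper, not part of any library.

```python
import json


def summarize_tables(metadata: dict) -> dict:
    """Map each table's file name to the row count declared in dc:extent."""
    return {
        table["url"]: table.get("dc:extent")
        for table in metadata.get("tables", [])
    }


# Hypothetical excerpt; the real file also carries full column schemas.
sample_metadata = json.loads("""
{
  "tables": [
    {"url": "data.csv", "dc:extent": 44496},
    {"url": "societies.csv", "dc:extent": 72},
    {"url": "variables.csv", "dc:extent": 618},
    {"url": "codes.csv", "dc:extent": 1236}
  ]
}
""")

extents = summarize_tables(sample_metadata)
for name, rows in extents.items():
    print(f"{name}: {rows} rows")
```

With a local copy of the dataset, the same function could be applied to the result of `json.load` on StructureDataset-metadata.json.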

Table data.csv

Values are coded datapoints, i.e. measurements of a variable for a society.

Note: Missing data is signaled by an empty Value column.

| property | value |
| --- | --- |
| dc:conformsTo | CLDF ValueTable |
| dc:extent | 44496 |
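
The declared extent equals 618 variables × 72 societies, which suggests one row per society and variable, with unobserved pairs kept as rows whose Value is empty. A minimal sketch of separating coded from missing datapoints follows; the sample rows are made up for illustration and `split_missing` is a hypothetical helper.

```python
import csv
import io

# Made-up rows for illustration; IDs and values are not taken from the dataset.
SAMPLE = """ID,Soc_ID,Var_ID,Value,Code_ID
1,soc1,var1,1,var1-1
2,soc1,var2,,
3,soc2,var1,0,var1-0
"""


def split_missing(rows):
    """Separate coded datapoints from rows whose Value column is empty."""
    coded, missing = [], []
    for row in rows:
        (coded if row["Value"].strip() else missing).append(row)
    return coded, missing


coded, missing = split_missing(csv.DictReader(io.StringIO(SAMPLE)))
print(len(coded), "coded,", len(missing), "missing")
```

Note that a Value of `0` is a coded datapoint, not a missing one, so the check tests for an empty string, not for truthiness of a number.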

Columns

| Name/Property | Datatype | Description |
| --- | --- | --- |
| ID | string; regex: `[a-zA-Z0-9_\-]+` | Primary key |
| Soc_ID | string | References societies.csv::ID |
| Var_ID | string | References variables.csv::ID |
| Value | string | Values for categorical and ordinal variables reference the corresponding code via the Code_ID column. Values for continuous variables have the measured number in the Value column and an empty Code_ID. |
| Code_ID | string | References codes.csv::ID |
| Comment | string | |
| Source | list of string (separated by ;) | References sources.bib::BibTeX-key |
| sub_case | string | More specific description of the population the data refer to in terms of society or area. |
| year | string; regex: `-?[0-9]{1,4}(-[0-9]{4})?` | Focal year, i.e. the time period to which the data refer. |
| source_coded_data | string | The source of the coded data, which was aggregated in this dataset. |
| admin_comment | string | |
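
The year column is a string constrained by a regular expression, so it needs parsing before it can be used numerically. The sketch below assumes that an optional trailing `-YYYY` marks the end of a year range and that a leading minus sign marks a year BCE; neither reading is stated for this column, so treat both as assumptions. `parse_year` is a hypothetical helper.

```python
import re

# Same language as the declared pattern -?[0-9]{1,4}(-[0-9]{4})?,
# with capture groups added for the two parts.
YEAR_PATTERN = re.compile(r"(-?[0-9]{1,4})(?:-([0-9]{4}))?")


def parse_year(text):
    """Return (start, end) for a year string, or None for an empty cell."""
    if not text:
        return None
    match = YEAR_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid year value: {text!r}")
    start = int(match.group(1))
    # Assumption: a second four-digit part is the end of a range.
    end = int(match.group(2)) if match.group(2) else start
    return start, end


for value in ("1900", "1850-1900", "-500", ""):
    print(repr(value), "->", parse_year(value))
```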

Table societies.csv

We use the term “society” to refer to cultural groups. In most cases, a society can be understood to represent a group of people at a focal location with a shared language that differs from that of their neighbors. However, in some cases multiple societies share a language.

| property | value |
| --- | --- |
| dc:conformsTo | CLDF LanguageTable |
| dc:extent | 72 |

Columns

| Name/Property | Datatype | Description |
| --- | --- | --- |
| ID | string; regex: `[a-zA-Z0-9_\-]+` | Primary key |
| Name | string | |
| Latitude | decimal; ≥ -90, ≤ 90 | |
| Longitude | decimal; ≥ -180, ≤ 180 | |
| Glottocode | string; regex: `[a-z0-9]{4}[1-9][0-9]{3}` | |
| Name_and_ID_in_source | string | Society names identified as pejorative have been replaced with a preferred, English-language ethnonym. The name (and ID) as given in the source dataset is kept in this field. |
| xd_id | string | “Cross-data-set” identifier, used to link societies present in different datasets if they share a focal location. Note: If this field is empty, other fields such as Name, Glottocode, focal year and location may be used to identify societies across datasets if appropriate. |
| alt_names_by_society | list of string (separated by ; ) | A list of ‘alternate’ names for the society; includes, where available, one or more autonyms in the society’s own language, as well as other commonly encountered ethnonyms. |
| main_focal_year | integer | Focal year specifying the time period to which the data refer, given as a year CE or, if negative, as a number of years BCE. |
| HRAF_name_ID | string | Name(s) and ID(s) of the corresponding society in HRAF (the Human Relations Area Files) |
| HRAF_ID | string | ID of the corresponding society in HRAF |
| origLat | decimal; ≥ -90, ≤ 90 | Uncorrected latitude as given in the source. |
| origLong | decimal; ≥ -270, ≤ 180 | Uncorrected longitude as given in the source. |
| comment | string | |
| glottocode_comment | string | Comment on the Glottocode assignment. |
| region | string | Level 2 region of the World Geographical Scheme for Recording Plant Distributions |
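
The constraints declared for the society columns can be checked with a few lines of code before further processing. This is a minimal sketch, assuming rows arrive as dictionaries of strings (as from `csv.DictReader`). `check_society` and `split_alt_names` are hypothetical helpers, and the example rows are made up.

```python
import re

GLOTTOCODE = re.compile(r"[a-z0-9]{4}[1-9][0-9]{3}")
RANGES = (("Latitude", -90, 90), ("Longitude", -180, 180))


def check_society(row):
    """Return the names of columns that violate their declared constraints."""
    problems = []
    code = row.get("Glottocode")
    if code and not GLOTTOCODE.fullmatch(code):
        problems.append("Glottocode")
    for column, low, high in RANGES:
        value = row.get(column)
        if value and not low <= float(value) <= high:
            problems.append(column)
    return problems


def split_alt_names(cell):
    """Split the semicolon-separated alt_names_by_society cell into a list."""
    return [name.strip() for name in cell.split(";") if name.strip()]


# Made-up rows, not taken from societies.csv.
good = {"Glottocode": "abcd1234", "Latitude": "12.5", "Longitude": "-60"}
bad = {"Glottocode": "abcd0234", "Latitude": "95", "Longitude": "0"}
print(check_society(good), check_society(bad))
```

Empty cells are skipped instead of being flagged, since none of these columns is declared as required.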

Table variables.csv

Variables are cultural features or practices, or environmental descriptors.

| property | value |
| --- | --- |
| dc:conformsTo | CLDF ParameterTable |
| dc:extent | 618 |

Columns

| Name/Property | Datatype | Description |
| --- | --- | --- |
| ID | string; regex: `[A-Za-z.0-9_]+([0-9]+)?` | Primary key |
| Name | string | |
| Description | string | |
| ColumnSpec | json | |
| category | list of string (separated by , ) | |
| type | string; valid choices: Continuous, Categorical, Ordinal | Variables may be categorical, in which case they must be accompanied by a list of possible ‘codes’, i.e. rows in CodeTable. Variables can also be continuous (e.g. population size) or ordinal. Ordinal variables, like categorical ones, are accompanied by a list of codes; the order of the codes is given by the ord column in CodeTable. |
| unit | string | The unit of measurement |
| source_comment | string | A note about the source of this variable. |
| changes | string | Notes about how a variable may have been derived from the source. |
| comment | string | |
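
Because the order of an ordinal variable's codes lives in the ord column of the code table, the codes need sorting before they are used as an ordered scale. The sketch below groups codes by variable and sorts each group by ord. The sample rows are made up, `codes_by_variable` is a hypothetical helper, and the code assumes ord is filled for every row it sorts.

```python
import csv
import io
from collections import defaultdict

# Made-up codes for illustration, deliberately out of order.
CODES = """ID,Var_ID,Name,ord
v1-b,v1,Medium,2
v1-a,v1,Low,1
v1-c,v1,High,3
v2-x,v2,Absent,1
"""


def codes_by_variable(rows):
    """Group code rows by Var_ID and sort each group by its ord value."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Var_ID"]].append(row)
    for codes in grouped.values():
        codes.sort(key=lambda code: int(code["ord"]))
    return dict(grouped)


ordered = codes_by_variable(csv.DictReader(io.StringIO(CODES)))
print([code["Name"] for code in ordered["v1"]])
```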

Table codes.csv

| property | value |
| --- | --- |
| dc:conformsTo | CLDF CodeTable |
| dc:extent | 1236 |

Columns

| Name/Property | Datatype | Description |
| --- | --- | --- |
| ID | string; regex: `[a-zA-Z0-9_\-]+` | Primary key |
| Var_ID | string | The parameter or variable the code belongs to. References variables.csv::ID |
| Name | string | |
| Description | string | |
| ord | integer | |
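
Putting the tables together, a datapoint from data.csv can be turned into a readable value by following its Code_ID into codes.csv, or by reading Value as a number when Code_ID is empty. This is a minimal sketch under those rules as described for the Value column. `resolve_value` is a hypothetical helper and the rows are made up.

```python
def resolve_value(datapoint, code_names):
    """Return a code name, a float, or None for a data.csv row.

    code_names maps codes.csv IDs to their Name column.
    """
    if not datapoint["Value"]:
        return None  # missing data: empty Value column
    if datapoint["Code_ID"]:
        return code_names[datapoint["Code_ID"]]  # categorical or ordinal
    return float(datapoint["Value"])  # continuous: number stored in Value


# Made-up lookup and rows, not taken from the dataset.
code_names = {"v1-0": "Absent", "v1-1": "Present"}
rows = [
    {"Var_ID": "v1", "Value": "1", "Code_ID": "v1-1"},
    {"Var_ID": "v9", "Value": "2500", "Code_ID": ""},
    {"Var_ID": "v1", "Value": "", "Code_ID": ""},
]
resolved = [resolve_value(row, code_names) for row in rows]
print(resolved)
```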