Update README.md and ./data/README.md
feger committed Jun 12, 2023
1 parent 710bc45 commit 4862784
Showing 2 changed files with 31 additions and 26 deletions.
12 changes: 6 additions & 6 deletions README.md
@@ -17,17 +17,17 @@ This repository contains the annotation framework, dataset and code used for the
## Repository Layout

1. [data](./data)
- 1. [README.md](./data/README.md): A data specific README for TACO.
- 2. [annotation_framework.pdf](./data/annotation_framework.pdf): The annotation framework for TACO.
- 3. [conversations.csv](./data/conversations.csv): Having stored the structure of conversations.
- 4. [majority_votes.csv](./data/majority_votes.csv): All the majority votes, which serve as the labeled ground truth.
- 5. [worker_decisions.csv](./data/worker_decisions.csv): All individual expert decisions.
+ 1. [README.md](./data/README.md): A data-specific README for TACO and its annotation process.
+ 2. [annotation_framework.pdf](./data/annotation_framework.pdf): The annotation framework for TACO.
+ 3. [conversations.csv](./data/conversations.csv): Stores the structure of all collected conversations.
+ 4. [majority_votes.csv](./data/majority_votes.csv): All the majority votes, which serve as the labeled ground truth.
+ 5. [worker_decisions.csv](./data/worker_decisions.csv): All individual expert decisions.
2. [notebooks](./notebooks)
1. [dataset_statistics.ipynb](./notebooks/dataset_statistics.ipynb): For comparing the dataset statistics as specified in Sections 2.2 - 2.4
of the paper.
2. [classifier_cv.ipynb](./notebooks/classifier_cv.ipynb): For training and evaluating the baseline model as in Section 3 of the paper.
3. [outputs](./outputs)
1. [bertweet_cv_predictions.csv](./outputs/bertweet_cv_predictions.csv): The ground truth and cross-validation results of the baseline model.

## Findings

Expand Down
45 changes: 25 additions & 20 deletions data/README.md
@@ -1,12 +1,14 @@
# :taco: TACO -- Twitter Arguments from COnversations

In this folder, you can find the annotation framework and information about the data used in the resource paper: "TACO - Twitter Arguments from
Conversations".

## Sensitive Data

- The contents of this folder comprise all data that can be shared with the public. This includes reduced versions of tweets that only contain their
- tweet_id. Additionally, we offer the [dataset_statistics.ipynb](../notebooks/dataset_statistics.ipynb) file, which we utilized to generate our ground
+ The contents of this folder comprise all data that can be shared with the public according
+ to [Twitter's developer policy](https://developer.twitter.com/en/developer-terms/policy).
+ This includes reduced versions of tweets that only contain their tweet_id. Additionally, we offer
+ the [dataset_statistics.ipynb](../notebooks/dataset_statistics.ipynb) file, which we utilized to generate our ground
truth data and gain preliminary insights. Since we cannot release all data, such as the text of tweets,
the [dataset_statistics.ipynb](../notebooks/dataset_statistics.ipynb) file is only provided for comparison purposes. Accessing it would
necessitate the use of the following files:
@@ -41,43 +43,44 @@ in [majority_votes.csv](./majority_votes.csv). The [worker_decisions.csv](./work
1. **tweet_id**: The unique identifier of a tweet in Twitter's database.
2. **information**: A binary value indicating the presence (1) or absence (0) of information in the tweet.
3. **inference**: A binary value indicating the presence (1) or absence (0) of inference in the tweet.
- 4. **confidence**: A value indicating the annotator's task confidence, ranging from easy (1) to hard (3).
+ 4. **confidence**: A value indicating the annotator's task confidence, ranging from easy (1) to hard (3); this value was not used in the paper.
5. **worker**: The identifier of the annotator (A and E both belong to the author Marc Feger).
6. **topic**: The conversation's topic that was assigned for sampling purposes.
7. **phase**: The phase in which the tweet was annotated.
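For a quick look at these columns, the file can be loaded with pandas. This is a minimal sketch, not part of the released notebooks; it assumes the CSV has a header row matching the column names above and is read from within this folder:

```python
import pandas as pd

# Individual expert decisions; column names follow the list above.
decisions = pd.read_csv("worker_decisions.csv")

# information and inference are binary flags; confidence ranges from 1 (easy) to 3 (hard).
print(decisions[["tweet_id", "worker", "information", "inference", "confidence"]].head())

# Number of decisions per annotation phase and annotator.
print(decisions.groupby(["phase", "worker"]).size())
```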

### Annotation Phases

- Our six experts provided individual decisions from different annotation phases. During the initial annotation stage, experts A, B, C, and D annotated
- 600 conversation-starting tweets. This first annotation step comprised two phases:
+ Our six experts provided individual decisions from two annotation steps. During the initial annotation stage, experts A, B, C, and D
+ annotated 600 conversation-starting tweets (300 randomly selected for each of #Abortion and #Brexit) to evaluate and refine the framework. This first
+ annotation step comprised two phases:

- 1. **training_1 - 2**: Two successive training phases, each involving 100 tweets, were conducted for the annotators. These were followed by a
- debriefing session.
- 2. **extension_1 - 4**: The first through fourth extensions, each comprising 100 tweets, were conducted after the annotators had completed their
- training.
+ 1. **training_1 - 2**: Two successive training phases, each involving 100 tweets drawn from #Abortion and #Brexit, were conducted for the annotators.
+ These were followed by a debriefing session.
+ 2. **extension_1 - 4**: The four extension steps, each comprising 100 tweets, were conducted after the annotators had completed their deliberation.

- In the second annotation step, three additional annotators, namely A (E), F, and G, annotated the tweets in 200 conversations. To this end, 100
- conversation-starting tweets from the first step were selected for training the new annotators on the tweets and their subsequent conversations. This
- was followed by another 100 conversations. In total, the second annotation step comprised the following phases:
+ In the second annotation step, three additional annotators, namely A (E), F, and G, annotated the tweets of 200 conversations. To this end, 100
+ conversation-starting tweets from the first step (with perfect agreement among A-D) were randomly selected for training the new annotators on the
+ tweets and their subsequent conversations. This was followed by another 100 conversations (started by 25 randomly selected conversation-starting
+ tweets for each of #GOT, #SquidGame, #TwitterTakeover, and #LOTRROP). In total, the second annotation step comprised the following phases:

1. **training_3 - 4**: Two training phases involving 100 conversation-starting tweets from the first annotation step (conducted with 100%
- agreement among annotators A, B, C, and D) along with their entire conversations.
- 2. **extension_5 - 8**: The subsequent four annotation steps covered 25 new conversations each.
+ agreement among annotators A, B, C, and D) along with their entire conversations for #Abortion and #Brexit.
+ 2. **extension_5 - 8**: The following four annotation steps each included 25 new conversations for either #GOT, #SquidGame, #TwitterTakeover, or
+ #LOTRROP.

- The individual annotation phases are detailed in [dataset_statistics.ipynb](../notebooks/dataset_statistics.ipynb).
+ The individual annotation phases (including the inter-annotator agreement) are detailed
+ in [dataset_statistics.ipynb](../notebooks/dataset_statistics.ipynb).

### Majority Votes

- Once the annotation phases were complete, the ground truth labels were assigned using a hard majority vote. The resulting ground truth data
- is saved in [majority_votes.csv](./majority_votes.csv), which contains the following columns:
+ Once the annotation phases were complete, the ground truth labels were assigned using a hard majority vote (more than 50% of all experts had to
+ agree on one class). The resulting ground truth data is saved in [majority_votes.csv](./majority_votes.csv), which contains the following columns:

1. **tweet_id**: A unique identifier for each tweet in Twitter's database.
2. **topic**: The topic of the conversation that was assigned for sampling purposes.
- 3. **category**: The category assigned to each tweet based on the majority vote of the annotators, as specified
+ 3. **class**: The class assigned to each tweet based on the majority vote of the annotators, as specified
in [annotation_framework.pdf](./annotation_framework.pdf).
- 4. **confidence**: The proportion of annotators who voted for the final category. For example, if A, B, and C all voted for the same category, the
+ 4. **confidence**: The proportion of annotators who voted for the final class. For example, if A, B, and C all voted for the same class, the
confidence value would be 3/4.
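To illustrate how the vote and the confidence column relate to the expert decisions, the following sketch recomputes a hard majority vote with pandas. It is not the authors' original script: the mapping from the two binary flags to a class label is actually defined in [annotation_framework.pdf](./annotation_framework.pdf), so the labels returned by `to_class` here are placeholders:

```python
import pandas as pd

decisions = pd.read_csv("worker_decisions.csv")

# Placeholder mapping from the binary flags to a class label; the real
# class definitions live in annotation_framework.pdf.
def to_class(row):
    if row["inference"] and row["information"]:
        return "both"
    if row["inference"]:
        return "inference_only"
    if row["information"]:
        return "information_only"
    return "neither"

decisions["class"] = decisions.apply(to_class, axis=1)

# Hard majority vote: a class becomes ground truth only if more than 50%
# of the experts who labeled the tweet agree on it.
def majority(group):
    counts = group["class"].value_counts()  # sorted descending
    share = counts.iloc[0] / len(group)
    if share > 0.5:
        return pd.Series({"class": counts.index[0], "confidence": share})
    return pd.Series({"class": None, "confidence": None})

majority_votes = decisions.groupby("tweet_id").apply(majority).dropna()
print(majority_votes.head())  # e.g. confidence 0.75 when 3 of 4 experts agree
```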

## Contact
