CLEF - EXIST2024 - MEDUSA

RoBEXedda: Enhancing Sexism Detection in Tweets for the EXIST 2024 Challenge

Group Members

Project description

Sexism remains a significant barrier to women's advancement, and it is particularly evident online, where women frequently encounter abuse and threats. This work addresses the "EXIST 2024" challenge, which aims to detect and categorize sexist content on social media: the task is to identify sexist tweets and classify them into predefined categories. Using a dataset of over 10,000 tweets in English and Spanish, the study trained neural networks with "Binary Relevance" and "Classifier Chain" architectures. The best-performing model from this study, named "RoBEXedda," will represent the team in the challenge.
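To make the difference between the two multi-label strategies concrete, here is a minimal sketch. It is not the code in this repository: the function names (`fit_centroid`, `fit_binary_relevance`, `fit_classifier_chain`) are hypothetical, and a toy nearest-centroid learner stands in for the neural networks. Binary Relevance fits one independent classifier per label, while a Classifier Chain also feeds each classifier the labels that come earlier in the chain.

```python
from typing import Callable, List

Vector = List[float]
Predictor = Callable[[Vector], int]


def fit_centroid(X: List[Vector], y: List[int]) -> Predictor:
    """Toy binary learner (assumption: stand-in for a neural network)."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    if not pos or not neg:
        # Only one class seen during training: always predict it.
        constant = 1 if pos else 0
        return lambda x: constant

    def mean(rows: List[Vector]) -> Vector:
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist(a: Vector, b: Vector) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))

    c_pos, c_neg = mean(pos), mean(neg)
    return lambda x: 1 if dist(x, c_pos) < dist(x, c_neg) else 0


def fit_binary_relevance(X: List[Vector], Y: List[List[int]]):
    """One independent classifier per label; labels never see each other."""
    n_labels = len(Y[0])
    models = [fit_centroid(X, [row[j] for row in Y]) for j in range(n_labels)]
    return lambda x: [m(x) for m in models]


def fit_classifier_chain(X: List[Vector], Y: List[List[int]]):
    """Classifier j is trained on the features plus the true labels 0..j-1."""
    n_labels = len(Y[0])
    models = []
    for j in range(n_labels):
        X_aug = [x + [float(v) for v in row[:j]] for x, row in zip(X, Y)]
        models.append(fit_centroid(X_aug, [row[j] for row in Y]))

    def predict(x: Vector) -> List[int]:
        preds: List[int] = []
        for m in models:
            # At prediction time, earlier predictions replace the true labels.
            preds.append(m(x + [float(p) for p in preds]))
        return preds

    return predict


# Tiny made-up example: one feature, two labels.
X_toy = [[0.0], [0.1], [0.9], [1.0]]
Y_toy = [[0, 0], [0, 0], [1, 1], [1, 1]]
binary_relevance = fit_binary_relevance(X_toy, Y_toy)
classifier_chain = fit_classifier_chain(X_toy, Y_toy)
```

On the toy data above, both strategies agree. They can differ when labels are correlated, because only the chain lets a later label depend on an earlier one.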

Scripts and dataset files

This is the list of Python scripts and notebooks:

algorithms.py: auxiliary functions for processing and evaluation
data_processing.ipynb: performs the preprocessing operations on the datasets
data_understanding.ipynb: analyzes the statistics of the dataset
model_final_retraining.py: trains models on the entire dataset (no evaluation)
model_output.py: generates the output on the blind test set, formatted according to the challenge requirements
model_test.py: trains a model and evaluates its performance on the internal test set
model_training.py: trains a model and evaluates its performance on a validation set during model selection
models.py: contains the definitions of the architectures considered during model selection

Dataset files:

merged_dataset.csv: train + validation + test
merged_dataset_proc.csv: train + validation + test after preprocessing
real_test.csv: blind test set
real_test_proc: blind test set after preprocessing
test_split_proc: internal test set after preprocessing
training_split: train + validation
training_split_proc: train + validation after preprocessing

Package requirements

The required packages are listed in requirements.txt.

JacopoRaffi/EXIST2024_Medusa

Folders and files

Latest commit

History

Repository files navigation

CLEF - EXIST2024 - MEDUSA

RoBEXedda: Enhancing Sexism Detection in Tweets for the EXIST 2024 Challenge

Group Members

Project description

Scripts and dataset files

Packages requirements

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages