📖 Read the manuscript.
🔗 Gather more information on zuco-benchmark.com
💻 Look at our code for creating the baseline results
🏆 Create models or custom feature sets and participate in our challenge at EvalAI
The Zurich Cognitive Language Processing Corpus (ZuCo 2.0) is a dataset of combined EEG and eye-tracking recordings from subjects reading natural sentences, created as a resource for investigating the human reading process in adult English native speakers.
The benchmark is a cross-subject classification task: distinguishing normal reading from task-specific information search.
This repository gives you a starting point for participating in our challenge.
To run the code, follow these steps:
Version: Python 3.7.16
Using newer versions may lead to conflicts with h5py.
Should you encounter any installation difficulties, please don't hesitate to open an issue.
- Install pip
- Create a virtual environment and activate it
- Run `pip install -r requirements.txt`
To download the whole dataset, execute `bash get_data.sh`. You can also download individual files from the OSF.
If you do not want to download the whole dataset, you can download the extracted features for each subject and feature set.
To do so, download `features.zip`, place the file under `zuco-benchmark/src/`, and unzip it.
`cd src`

Run the code to produce baseline predictions with the SVM:

`python benchmark_baseline.py`
If you downloaded the whole dataset, the script will first extract the selected feature sets from the `.mat` files, which will take a while.
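If you want to inspect the raw recordings yourself, the `.mat` files are MATLAB v7.3 containers and can be read with `h5py` (which is why the pinned Python and h5py versions matter). A minimal sketch, with a placeholder file name, that just prints the internal structure of one file:

```python
# Minimal sketch (not part of the benchmark code): list the groups and
# datasets inside one of the ZuCo .mat files using h5py.
# "results_subject_NR.mat" is a placeholder; use any per-subject .mat file
# downloaded by get_data.sh.
import h5py

with h5py.File("results_subject_NR.mat", "r") as f:
    f.visit(print)  # prints every group/dataset path in the file
```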
You can use the code in `benchmark_baseline.py` as a starting point and:
- Try different models (see the sketch after this list).
- Use other feature sets or combinations; see `config.py` for the available feature sets.
- Create your own feature combinations. To do that, take a look at the feature extraction code and add your own combination there.
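As a concrete example for the first point, any other scikit-learn classifier can be dropped in place of the SVM. A minimal, self-contained sketch (the random arrays are placeholders for the feature matrices and labels that the feature extraction in `benchmark_baseline.py` produces; the model choice is only an illustration):

```python
# Minimal sketch of trying a different model on extracted features.
# The random arrays below are placeholders; replace them with the feature
# matrices and labels produced for your chosen feature set.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 20))
y_train = rng.integers(0, 2, size=100)
X_test = rng.normal(size=(30, 20))

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(predictions[:5])
```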
To experiment with different models or feature combinations, you can use `validation.py`, which tests your configuration using leave-one-out cross-validation on the training data.
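Since the benchmark is cross-subject, a natural variant of this evaluation is to leave out one subject at a time. Below is a minimal sketch of such a split with scikit-learn, using placeholder data (grouping folds by subject is our assumption here, not necessarily what `validation.py` does internally):

```python
# Minimal sketch of leave-one-subject-out cross-validation with scikit-learn.
# X, y and subjects are placeholders; 'subjects' holds one group id per sample
# so that each fold holds out all samples of a single subject.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 20))
y = rng.integers(0, 2, size=120)
subjects = np.repeat(np.arange(12), 10)  # 12 fake subjects, 10 samples each

scores = cross_val_score(SVC(kernel="linear"), X, y,
                         cv=LeaveOneGroupOut(), groups=subjects)
print(f"mean accuracy over held-out subjects: {scores.mean():.3f}")
```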
If you have `create_submission` enabled in the config, `benchmark_baseline.py` will automatically create a submission file in the correct format.
For the submission format, check out the example files.
Head to EvalAI, fill in the required information, and upload your `submission.json`.