Twitter Vaccines Dataset

This data set was obtained by processing the data gathered for the paper "Falling into the Echo Chamber: the Italian Vaccination Debate on Twitter" by Alessandro Cossard, Gianmarco De Francisci Morales, Yelena Mejova, and Daniela Paolotti.

From their data (7152 users and their tweets), we build a processed data file by (i) selecting a subset of hashtags and (ii) removing users and tweets so that every Twitter user in the data set has at least 100 retweets and every retweet has been shared by at least 10 users. The processed file is saved as processed_twitter_dataset.pickle and contains a pickled tuple:

(selected_hashtags, tw2user, gamma)

Here, selected_hashtags is the list of selected hashtags, tw2user is a map from tweet to users, each one with an anonymized 0-to-N index, and gamma is a NumPy matrix whose row i contains the hashtag-based topics of tweet i (each column corresponds to a hashtag in selected_hashtags).
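
As a minimal sketch of how the tuple described above might be loaded and inspected in Python: the helper names load_twitter_dataset and describe are hypothetical and not part of this repository, and the sketch assumes that tw2user supports len() and that gamma exposes a shape attribute, as a NumPy array or matrix would. NumPy needs to be installed for the real file to unpickle.

```python
import pickle


def load_twitter_dataset(path="processed_twitter_dataset.pickle"):
    """Unpickle the (selected_hashtags, tw2user, gamma) tuple.

    Hypothetical helper, not shipped with the data set. Only unpickle
    files you trust, since pickle can execute arbitrary code on load.
    """
    with open(path, "rb") as f:
        selected_hashtags, tw2user, gamma = pickle.load(f)
    return selected_hashtags, tw2user, gamma


def describe(selected_hashtags, tw2user, gamma):
    """Summarize the three components without assuming their exact types."""
    return {
        "n_hashtags": len(selected_hashtags),
        # Assumption: tw2user is a sized container such as a dict or list.
        "tw2user_entries": len(tw2user),
        # Assumption: gamma has a .shape attribute; None is reported otherwise.
        "gamma_shape": getattr(gamma, "shape", None),
    }
```

With the file in the working directory, calling describe(*load_twitter_dataset()) should report the number of selected hashtags, the number of entries in tw2user, and the shape of gamma, whose column count is expected to match the number of selected hashtags.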
