
[NLP] Unsupervised User Stance Detection on Twitter.

elaaf/stance-detect


Unsupervised User Stance Detection on Twitter.

A Python implementation of the paper "Unsupervised User Stance Detection on Twitter" by Darwish et al. (arXiv). This unofficial repo consolidates the code used in the paper for detecting the stance of prolific Twitter users toward controversial topics.

Approach

Given a Twitter dataset containing tweets about a divisive or controversial topic:

  • Construct feature vectors for each user (hashtags, retweeted accounts, unique tweets).

  • Apply dimensionality reduction (t-SNE, UMAP).

  • Cluster the low-dimensional data (Mean-Shift, DBSCAN).

The low-dimensional clusters can be visualized to check that users fall into well-separated groups. Each cluster can then be assigned a "Stance" label based on the original descriptors/features of its users.
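The last step, assigning a stance label to each cluster, is manual: someone inspects the features that dominate a cluster and decides what stance they indicate. A minimal sketch of that inspection is shown below, assuming cluster assignments and per-user hashtag lists are already available as plain dictionaries. The function name `top_hashtags_per_cluster` and the toy data are hypothetical and not part of this repo.

```python
from collections import Counter, defaultdict


def top_hashtags_per_cluster(user_clusters, user_hashtags, top_n=3):
    """Return the top_n most frequent hashtags for each cluster label.

    user_clusters: dict mapping user -> cluster label
    user_hashtags: dict mapping user -> list of hashtags the user posted
    """
    counts = defaultdict(Counter)
    for user, cluster in user_clusters.items():
        # Pool the hashtags of every user that landed in this cluster.
        counts[cluster].update(user_hashtags.get(user, []))
    return {
        cluster: [tag for tag, _ in counter.most_common(top_n)]
        for cluster, counter in counts.items()
    }


# Toy data, made up for illustration only.
user_clusters = {"u1": 0, "u2": 0, "u3": 1}
user_hashtags = {
    "u1": ["#yes", "#yes", "#vote"],
    "u2": ["#yes"],
    "u3": ["#no", "#no", "#vote"],
}

summary = top_hashtags_per_cluster(user_clusters, user_hashtags, top_n=1)
print(summary)  # {0: ['#yes'], 1: ['#no']}
```

A human reader would then map cluster 0 to one stance and cluster 1 to the other based on these dominant hashtags.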

Requirements

# Create a Python 3.6+ virtual environment and run
pip install -r requirements.txt

How To Run This Code

Clone this repo.

git clone https://github.com/elaaf/stance-detect.git

Place your Twitter dataset CSV in the ./datasets/ folder and set the data pipeline parameters in main.py. For a standard Twitter API dataset CSV, simply run:

python3 stance_detect/main.py

API Usage

Data Loading

from data_loading.load_data import load_dataset

load_dataset(dataset_path="./datasets/twitter_dataset.csv",
             features=["user_id", "username", "tweet", "mentions", "hashtags"], 
             num_top_users=1000,
             min_tweets=0,
             random_sample_size=0, 
             rows_to_read=None, 
             user_col="user_id", 
             str2list_cols=["mentions", "hashtags"])
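The internals of `load_dataset` are not shown here, so the sketch below only illustrates one plausible reading of two parameters: `str2list_cols` (columns whose cells hold stringified Python lists, parsed back into lists) and `num_top_users` (keep only the users with the most tweets). The helpers `read_rows` and `keep_top_users` and the sample CSV are hypothetical, and the real function may behave differently.

```python
import ast
import csv
import io
from collections import Counter

# Tiny in-memory CSV, made up for illustration only.
SAMPLE_CSV = """user_id,tweet,hashtags
1,hello,"['#a', '#b']"
1,again,"['#a']"
2,other,"[]"
"""


def read_rows(csv_file, str2list_cols):
    """Read CSV rows, converting stringified lists into real lists."""
    rows = []
    for row in csv.DictReader(csv_file):
        for col in str2list_cols:
            # "['#a', '#b']" -> ['#a', '#b']; empty cells become [].
            row[col] = ast.literal_eval(row[col]) if row[col] else []
        rows.append(row)
    return rows


def keep_top_users(rows, user_col, num_top_users):
    """Keep only the rows of the num_top_users most prolific users."""
    tweet_counts = Counter(row[user_col] for row in rows)
    top_users = {user for user, _ in tweet_counts.most_common(num_top_users)}
    return [row for row in rows if row[user_col] in top_users]


rows = read_rows(io.StringIO(SAMPLE_CSV), str2list_cols=["hashtags"])
top_rows = keep_top_users(rows, user_col="user_id", num_top_users=1)
print(rows[0]["hashtags"])  # ['#a', '#b']
print(len(top_rows))        # 2
```

`ast.literal_eval` is used instead of `eval` because it only accepts Python literals, which is safer for data read from a file.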

Feature Extraction

from feature_extraction.feat_extract import FeatureExtraction

FEATURES_TO_USE = ["T","R","H"]

ft_extract = FeatureExtraction()
user_feature_dict = ft_extract.get_user_feature_vectors(
                                FEATURES_TO_USE,
                                users_list,
                                tweets_list, 
                                mentions_list, 
                                hashtags_list,
                                feature_size=None,
                                relative_freq=True)
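To make the idea of a per-user feature vector concrete, here is a minimal sketch for a single feature type (hashtags). It assumes that `relative_freq=True` means each user's counts are divided by that user's total count, which is a guess rather than something the repo confirms. `build_feature_vectors` is a hypothetical helper, not the `FeatureExtraction` API.

```python
from collections import Counter


def build_feature_vectors(user_items, relative_freq=True):
    """Turn per-user item lists into fixed-length count vectors.

    user_items: dict mapping user -> list of items (e.g. hashtags)
    Returns (vocab, vectors), where vectors[user][i] is the count
    (or relative frequency) of vocab[i] for that user.
    """
    # Shared, sorted vocabulary so every user gets the same dimensions.
    vocab = sorted({item for items in user_items.values() for item in items})
    vectors = {}
    for user, items in user_items.items():
        counts = Counter(items)
        total = sum(counts.values())
        vec = [counts[item] for item in vocab]
        if relative_freq and total:
            # Normalize by the user's own activity (assumed behavior).
            vec = [count / total for count in vec]
        vectors[user] = vec
    return vocab, vectors


# Toy data, made up for illustration only.
user_hashtags = {"u1": ["#a", "#a", "#b"], "u2": ["#b"]}
vocab, vectors = build_feature_vectors(user_hashtags)
print(vocab)          # ['#a', '#b']
print(vectors["u2"])  # [0.0, 1.0]
```

Normalizing by each user's total keeps very active users from dominating distances purely because they tweet more.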

Dimensionality Reduction

from dimensionality_reduction.umap import get_umap_embedding

low_dim_user_feature_dict = get_umap_embedding(
                                user_feature_dict,
                                n_neighbors=20,
                                n_components=3,
                                min_distance=0.1,
                                distance_metric="correlation")
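The call above uses `distance_metric="correlation"`. Correlation distance is conventionally defined as one minus the Pearson correlation of two vectors, so users whose feature profiles rise and fall together are close regardless of scale. The pure-Python sketch below illustrates that definition only. It is not how the repo or UMAP computes it, and returning 1.0 for constant vectors is a choice made for this example.

```python
import math


def correlation_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        # Correlation is undefined for a constant vector; this sketch
        # treats such pairs as uncorrelated (an arbitrary convention).
        return 1.0
    return 1.0 - numerator / denominator


same_shape = correlation_distance([1, 2, 3], [2, 4, 6])
opposite = correlation_distance([1, 2, 3], [3, 2, 1])
print(same_shape)  # 0.0
print(opposite)    # 2.0
```

The distance ranges from 0 (perfectly correlated) to 2 (perfectly anti-correlated).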

Clustering

from clustering.mean_shift import mean_shift_clustering

user_feature_label_dict = mean_shift_clustering( low_dim_user_feature_dict )
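How `mean_shift_clustering` is implemented is not shown here. As a rough illustration of the technique itself, the sketch below implements mean shift with a flat kernel in pure Python: every point is repeatedly moved to the mean of the data points within `bandwidth` of it, and points that converge to nearby modes share a cluster label. The function `mean_shift`, the merge threshold of half the bandwidth, and the toy data are assumptions for this example.

```python
import math


def _dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift. Returns (labels, centers)."""
    modes = [list(p) for p in points]
    for _ in range(max_iter):
        largest_move = 0.0
        shifted = []
        for mode in modes:
            # Data points inside the window around the current estimate.
            neighbors = [p for p in points if _dist(mode, p) <= bandwidth]
            center = [sum(coord) / len(neighbors) for coord in zip(*neighbors)]
            largest_move = max(largest_move, _dist(mode, center))
            shifted.append(center)
        modes = shifted
        if largest_move < tol:
            break

    # Merge modes that ended up close together into one cluster.
    centers, labels = [], []
    for mode in modes:
        for index, center in enumerate(centers):
            if _dist(mode, center) < bandwidth / 2:
                labels.append(index)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return labels, centers


# Two well-separated toy blobs, made up for illustration only.
points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
labels, centers = mean_shift(points, bandwidth=1.0)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Unlike k-means, mean shift does not need the number of clusters up front. The bandwidth controls how many clusters emerge.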

Get User Labels for Interactive Plot (Optional)

user_info_label_dict = ft_extract.get_user_info_labels(
                            users_list,
                            user_info_list = hashtags_list,
                            top_n = 5)

user_hover_labels = list( user_info_label_dict.values() )

Interactive Scatter Plot

from graph_plots.plot_3d import scatter_plot_3d

scatter_plot_3d(user_feature_label_dict, 
                title="Twitter Users Scatter Plot",
                hover_info=user_hover_labels,
                plot_save_path="./stance_detect/results/3d_scatter_plot.html")

Output 3D Scatter Plot for Twitter Users

Each data point in the scatter plot represents a Twitter user, with their five most-used hashtags displayed as hover labels.

[Figure: 3D scatter plot of Twitter users. The original page links to an interactive view.]

References

Darwish, K., Stefanov, P., Aupetit, M., & Nakov, P. (2020). Unsupervised User Stance Detection on Twitter. Proceedings of the International AAAI Conference on Web and Social Media, 14(1), 141-152. Retrieved from https://www.aaai.org/ojs/index.php/ICWSM/article/view/7286

Work in progress.