Skip to content

Pet-project: MEME Stocks SentimentDinamics on the basis of Reddit posts

Notifications You must be signed in to change notification settings

koba4444/Pet_RedditWSB_SentimentDinamics

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

44 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SUBJECT.
January 2021 was somewhat splendid on Wall Street because of events connected with so called MEME stocks (GME, AMC, EXPR, MULN etc.).
Retail traders cooperated via social networks (subreddit r/whallstreetbets) have managed to oppose to hedge funds in their short sell play.
As a result several stocks briefly soared 10X - 40X.
January 2021 showed that sentiment analysis of retail investor groups will be of rising interest.
This tiny pet-project is for demonstrating sentiment dinamics in r/whallstreetbets almost real time.

DATA.
Data for the project are taken from Reddit.com. It consists of post texts along with timestamps.


MODEL.
For sentiment determining catboost model trained on datasets (https://www.kaggle.com/datasets/datatattle/covid-19-nlp-text-classification) is useds.
It was showed in https://towardsdatascience.com/unconventional-sentiment-analysis-bert-vs-catboost-90645f2437a9 that catboost model works as well as BERT but is much lighter. 


ETL
Controllers are in docker containers. Data is received on hourly basis, preprocessed. Result is pushed to github repository connected with Streamlit service. 


1. AIRFLOW

Сборка образа (из дирректории /airflow-docker)
docker build -t airflow_docker:xxxxx .
Сохранение образа


Структура каталога. 
* помечены директории, проброшенные в докер.
Образ airflow_docker:xxxxxxx вероятно лежит в директории docker.

├── airflow-docker
│   └── dags*
│       └── __pycache__
├── config*
├── data*
├── docker
├── log*
├── mlflow-docker
├── notebooks
├── src*
│   ├── data
│   ├── evaluate
│   ├── features
│   ├── pipelines
│   ├── report
│   └── train
└── temp
    └── mlflow-docker-master
        ├── mlflow
        └── quickstart


docker run -p 8080:8080 --mount type=bind,source="$(pwd)"/airflow-docker/dags,target=/root/airflow/dags --mount type=bind,source="$(pwd)"/data,target=/data --mount type=bind,source="$(pwd)"/config,target=/config --mount type=bind,source="$(pwd)"/log,target=/log --mount type=bind,source="$(pwd)"/streamlit,target=/streamlit --mount type=bind,source="$(pwd)"/src,target=/src --mount type=bind,source="$(pwd)"/../.ssh,target=/root/.ssh airflow_docker_12092022:latest (включает GIT)


2. MLFLOW
Описание в temp/mlflow-docker-master/Readme.md

3. В директории config/ находятся конфигурационные файлы params_all.yaml (несекретный) и params_secret.yaml (содержащий чувствительную информацию; вносить в .gitignore). 

About

Pet-project: MEME Stocks SentimentDinamics on the basis of Reddit posts

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published