-
Notifications
You must be signed in to change notification settings - Fork 0
koba4444/Pet_RedditWSB_SentimentDinamics
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
SUBJECT. January 2021 was somewhat splendid on Wall Street because of events connected with so called MEME stocks (GME, AMC, EXPR, MULN etc.). Retail traders cooperated via social networks (subreddit r/whallstreetbets) have managed to oppose to hedge funds in their short sell play. As a result several stocks briefly soared 10X - 40X. January 2021 showed that sentiment analysis of retail investor groups will be of rising interest. This tiny pet-project is for demonstrating sentiment dinamics in r/whallstreetbets almost real time. DATA. Data for the project are taken from Reddit.com. It consists of post texts along with timestamps. MODEL. For sentiment determining catboost model trained on datasets (https://www.kaggle.com/datasets/datatattle/covid-19-nlp-text-classification) is useds. It was showed in https://towardsdatascience.com/unconventional-sentiment-analysis-bert-vs-catboost-90645f2437a9 that catboost model works as well as BERT but is much lighter. ETL Controllers are in docker containers. Data is received on hourly basis, preprocessed. Result is pushed to github repository connected with Streamlit service. 1. AIRFLOW Сборка образа (из дирректории /airflow-docker) docker build -t airflow_docker:xxxxx . Сохранение образа Структура каталога. * помечены директории, проброшенные в докер. Образ airflow_docker:xxxxxxx вероятно лежит в директории docker. ├── airflow-docker │ └── dags* │ └── __pycache__ ├── config* ├── data* ├── docker ├── log* ├── mlflow-docker ├── notebooks ├── src* │ ├── data │ ├── evaluate │ ├── features │ ├── pipelines │ ├── report │ └── train └── temp └── mlflow-docker-master ├── mlflow └── quickstart docker run -p 8080:8080 --mount type=bind,source="$(pwd)"/airflow-docker/dags,target=/root/airflow/dags --mount type=bind,source="$(pwd)"/data,target=/data --mount type=bind,source="$(pwd)"/config,target=/config --mount type=bind,source="$(pwd)"/log,target=/log --mount type=bind,source="$(pwd)"/streamlit,target=/streamlit --mount type=bind,source="$(pwd)"/src,target=/src --mount type=bind,source="$(pwd)"/../.ssh,target=/root/.ssh airflow_docker_12092022:latest (включает GIT) 2. MLFLOW Описание в temp/mlflow-docker-master/Readme.md 3. В директории config/ находятся конфигурационные файлы params_all.yaml (несекретный) и params_secret.yaml (содержащий чувствительную информацию; вносить в .gitignore).
About
Pet-project: MEME Stocks SentimentDinamics on the basis of Reddit posts
Topics
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published