GitHub - stepchoi/DL-Price-Mom

Dependencies

Python 3.6+, Postgres, Unix (tested on Ubuntu)
pip install -r requirements.txt
Install hyperopt from source

Setup

createdb dl_price_mom
cp config.example.json config.json then modify config.json
Sign into Kaggle, download this dataset

Unzip

unzip price-volume-data-for-all-us-stocks-etfs.zip -d tmp
pushd tmp
unzip Data.zip
popd

The important additions to your dir structure should be:

- config.json (modified)
- tmp
  - Stocks
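
Before starting the import, it can help to confirm that this layout is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` is hypothetical, and it only checks the two paths listed above.

```python
from pathlib import Path

# Paths taken from the directory listing above, relative to the repo root.
EXPECTED = ["config.json", "tmp/Stocks"]


def missing_paths(root="."):
    """Return the expected paths that do not exist under `root`."""
    return [rel for rel in EXPECTED if not (Path(root) / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing before import:", ", ".join(missing))
    else:
        print("Layout looks ready for import.")
```

An empty result means both `config.json` and `tmp/Stocks` were found; it does not validate the contents of either.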

Import

python import.py

This populates your database from the Kaggle dataset and takes a few hours.
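
To give a rough idea of what an import involves, here is a sketch that parses the text of one price file into rows. It is not the logic of `import.py`, which is not shown here. It assumes each file under `tmp/Stocks` is a CSV with `Date` and `Close` columns and is named like `aapl.us.txt`; both are assumptions about the Kaggle dataset, and the function names are hypothetical.

```python
import csv
import io


def ticker_from_filename(name):
    """Derive a ticker from a file name, assuming the form 'aapl.us.txt'."""
    return name.split(".")[0].upper()


def read_closes(text, ticker):
    """Parse one file's CSV text into (ticker, date, close) tuples.

    Assumes a header row with 'Date' and 'Close' columns.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [(ticker, row["Date"], float(row["Close"])) for row in reader]
```

Rows in this shape could then be bulk-inserted into Postgres; the actual table schema depends on `import.py` and `config.json`.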

Note re: dataset

We have arranged the code to work with the Kaggle dataset, but there are a few important caveats:

- Kaggle data was NOT used for our analysis. Our research is based on proprietary licensed data from FTSE Russell, which we cannot provide under the terms of our agreement.
- The Kaggle dataset does not specify an investment universe. Ours was the Russell 1000-based universe, sourced from the FTSE Russell database.

Run

You'll train three separate components to completion (they depend on each other sequentially):

Autoencoder
Embedded Clustering
Recurrent Neural Network (GRU/FFN)

Each component can take 6h or more to run; EmbedClust, in particular, takes multiple days. Run each step in a tmux session and check back in 24h. Between steps (after each one completes), you'll choose optimal autoencoder configurations or embedded clusterings.

python ae.py - runs hyperopt for autoencoding of the data
python select_winners.py - selects optimal hyperparameter configurations for autoencoder.
python embed_clust.py - runs embedded clustering on the data for origins and k clusters based on the selected autoencoder.
Manually select embed_clust clustering outputs for each origin (set use to TRUE), based on normalized X_B or S_Dbw scores.
python rnn.py - runs GRU/FFN based on selected optimal embedded clusterings.
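
The ordering of the steps above can be sketched as a small runner. This is an illustrative assumption, not a script shipped with the repository: the names `PIPELINE`, `commands`, and `run` are hypothetical, and the manual selection of clusterings is left to you, so the runner stops before `rnn.py` by default.

```python
import subprocess
import sys

# Script names come from the steps above, in the order they must run.
PIPELINE = ["ae.py", "select_winners.py", "embed_clust.py", "rnn.py"]


def commands(start_at="ae.py"):
    """Build the remaining commands, starting from the given script."""
    idx = PIPELINE.index(start_at)
    return [[sys.executable, script] for script in PIPELINE[idx:]]


def run(start_at="ae.py", stop_before="rnn.py"):
    """Run scripts in order, halting before the step that needs manual input."""
    for cmd in commands(start_at):
        if cmd[1] == stop_before:
            print("Stopping before", stop_before,
                  "- select clusterings manually first.")
            break
        # check=True aborts the sequence if a step fails.
        subprocess.run(cmd, check=True)
```

After selecting clusterings by hand, `run(start_at="rnn.py", stop_before=None)` would run the final step. Given the multi-day runtimes, launching this inside tmux is still advisable.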

Credit

Deep Clustering with Convolutional Autoencoders (DCEC)
- Code: XifengGuo/DCEC
- Paper: Unsupervised Deep Embedding for Clustering Analysis
S_Dbw
- Code: iphysresearch/S_Dbw_validity_index
- Paper: Clustering Validity Assessment: Finding the optimal partitioning of a data set
Xie-Beni index (XB)
- Paper: A validity measure for fuzzy clustering
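
For reference when comparing clusterings, here is a minimal sketch of the Xie-Beni index for hard (crisp) cluster assignments: the total squared distance of points to their assigned centroids, divided by the number of points times the smallest squared distance between any two centroids. Lower values indicate more compact, better separated clusters. This is the textbook form with all memberships set to 0 or 1; the normalization this repository applies to X_B is not reproduced here, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def xie_beni(points, labels, centroids):
    """Xie-Beni index for hard assignments (requires at least 2 centroids)."""
    n = len(points)
    # Compactness: summed squared distance of each point to its centroid.
    compactness = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    # Separation: smallest squared distance between two distinct centroids.
    k = len(centroids)
    separation = min(
        sq_dist(centroids[i], centroids[j])
        for i in range(k)
        for j in range(i + 1, k)
    )
    return compactness / (n * separation)
```

When comparing several values of k for one origin, the clustering with the lowest index would be the candidate to mark for use, subject to the S_Dbw score as well.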

