-
Notifications
You must be signed in to change notification settings - Fork 1
/
global_config.yaml
49 lines (37 loc) · 2.21 KB
/
global_config.yaml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
# config containing global variables
# CHUNCK_SIZE - specifies the number of records to be processed during one iteration
# for computing similarity scores.
#
# This is done to avoid excessive memory consumption due to cloud compute limitations.
# According to local tests, the value of 1000 has practically no impact on calculation time,
# but 16.7 times less memory is consumed. The results of chunked calculations are
# also almost the same (corr coef > 0.999) as for non-chunked calculations.
# The actual difference is the result of precision changes (float64 to float16
# to reduce RAM consumption) and common 'float values inaccuracy' while the order
# of operations (dot product, vector sum and precision change) is not the same.
# Moreover, it was found that the result difference leads to swapping of individual pairs of
# titles in some rare cases, which slightly affects the final result. However, this behaviour
# considered as acceptable approximation.
chunk_size: 100
# DATA_PATH - specifies the path to the data file
data_path: data/anilist.pickle
# METADATA_PATH - specifies the path to the metadata file
metadata_path: data/anilist_meta.pickle
# INFO_PATH - specifies the path to the info file
info_path: data/anilist_info.pickle
# DATA_STAGED_PATH - specifies the path to the staged data file
data_staged_path: data/anilist_staged.pickle
# METADATA_STAGED_PATH - specifies the path to the metadata file for staged data
metadata_staged_path: data/anilist_meta_staged.pickle
# DATA_PROCESSED_PATH - specifies the path to the processed data file
data_processed_path: data/anilist_processed.pickle
# MAX_PER_PAGE - specifies the maximum number of records to be returned per page
pagination_size: 50
# API_URL - specifies the url of the API
api_url: https://graphql.anilist.co
# API_PERIOD - specifies the period of time in seconds for which the `API_LIMIT` specified
api_period: 60
# API_LIMIT - specifies the maximum number of requests per period
api_limit: 90
# DROPDOWN_ITEM_COUNT - specifies the number of results in items dropdown search to be returned
dropdown_item_count: 8