Thank you for making your repo available, it is extremely helpful.
Would you have any insights on the following problem I encountered:
I have a base dataframe with 50,000 rows and 8 columns, which I use to create roughly 1,500 features per multipliable feature in my feature family. The multipliers are nested: there are 4 categorical multipliers and one time multiplier. Running the code with a single multipliable feature works and takes about 2 min, but adding more multipliable features to the same feature family makes the run time grow much faster than linearly:
1 feature - 2min
3 features - 17min
4 features - 2h20min
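To put numbers on the growth, here is a quick sketch that divides each reported run time by the number of multipliable features. It uses only the three timings listed above; the names `timings_min` and `minutes_per_feature` are just illustrative and have nothing to do with the library itself.

```python
# Reported timings: number of multipliable features -> total run time in minutes.
# 2h20min is written as 140 minutes.
timings_min = {1: 2, 3: 17, 4: 140}


def minutes_per_feature(timings):
    """Return the average run time per multipliable feature for each run."""
    return {n: total / n for n, total in timings.items()}


per_feature = minutes_per_feature(timings_min)
for n, cost in per_feature.items():
    print(f"{n} feature(s): {cost:.1f} min per feature")
```

If the cost were linear, the per-feature time would stay near 2 min. Instead it rises to roughly 5.7 min with 3 features and 35 min with 4.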
Is this expected behaviour for data of this size? If not, do you have any suggestions on how to improve it?
I also noticed that the function "append_features" always runs on a single executor in my cluster, even though dynamic allocation is enabled.
Thank you,
Epson