Assign more cpu to single task to speed it up for local executor? #214
Comments
This should be the most optimized setup, yes. You can optionally increase
Thank you!!!!!
Also, how should I set the tasks parameter? For instance, if I have 10000 files, should I set tasks = 10000?
You can, yes. If tasks > number of files, then the excess tasks will not perform any work, as we do not currently split files.
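To make the relationship between tasks and files concrete, here is a minimal sketch in plain Python. It assumes files are handed out round-robin by task rank; that sharding scheme is an assumption for illustration, not something confirmed in this thread, and `shard_for_task` is a hypothetical helper rather than part of the library.

```python
def shard_for_task(files, rank, world_size):
    """Return the files assigned to task `rank` out of `world_size` tasks.

    Hypothetical helper: assumes round-robin assignment and that
    individual files are never split across tasks.
    """
    return files[rank::world_size]


# More tasks than files: the excess tasks receive an empty list.
files = [f"file_{i:03d}.warc.gz" for i in range(3)]
for rank in range(5):
    print(rank, shard_for_task(files, rank, 5))

# Fewer tasks than files: each task receives several files, and
# together the tasks still cover every file exactly once.
many_files = [f"file_{i:03d}.warc.gz" for i in range(10)]
covered = [f for rank in range(4) for f in shard_for_task(many_files, rank, 4)]
print(sorted(covered) == many_files)
```

Under this assumption, setting tasks equal to the number of files gives each task exactly one file, and any additional tasks have nothing to do.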
What about the case where tasks < number of files? Will all the files be processed? Will execution be faster?
I am using the local executor. My machine has 48 CPUs and 348 RAM. Any idea how to speed this up? Currently a single task (task=1, processing one warc.gz file of ~1 GB) takes half an hour. This is my executor code, borrowed from the fineweb example. Also, I have 200 warc.gz files to process. Is setting tasks = 200 the correct way?