Production release (#109)
* using ThreadPoolExecutor instead of ThreadPool to avoid semlock

* first attempt using multiprocessing

* added comments to clarify purpose

---------

Co-authored-by: ilkin <ilkin@nygen.io>
parashardhapola and hi-ilkin authored Oct 27, 2023
1 parent 28cc978 commit 5021948
Showing 1 changed file with 12 additions and 2 deletions.
--- a/scarf/utils.py
+++ b/scarf/utils.py
@@ -173,10 +173,20 @@ def controlled_compute(arr, nthreads):
     Returns:
         Result of computation.
     """
-    from multiprocessing.pool import ThreadPool
     import dask
 
-    with dask.config.set(schedular="threads", pool=ThreadPool(nthreads)):  # type: ignore
+    try:
+        # Multiprocessing may be faster, but it throws exception if SemLock is not implemented.
+        # For example, multiprocessing won't work on AWS Lambda, in those scenarios we switch ThreadPoolExecutor
+        from multiprocessing.pool import ThreadPool
+
+        pool = ThreadPool(nthreads)
+    except Exception:
+        from concurrent.futures import ThreadPoolExecutor
+
+        pool = ThreadPoolExecutor(nthreads)
+
+    with dask.config.set(schedular="threads", pool=pool):  # type: ignore
         res = arr.compute()
     return res
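
The fallback introduced in this hunk can be read as a standalone pattern. The sketch below is illustrative, not the code in scarf: `make_thread_pool` is a hypothetical helper name, and the `controlled_compute` shown here is a simplified stand-in that assumes `arr` is any object with a dask-style `.compute()` method. It spells the config key `scheduler`, which is the name dask documents. `dask` is imported lazily so the helper itself runs with only the standard library.

```python
from concurrent.futures import ThreadPoolExecutor


def make_thread_pool(nthreads):
    """Return a thread pool with `nthreads` workers.

    Hypothetical helper (not part of scarf). It prefers
    multiprocessing.pool.ThreadPool and falls back to
    concurrent.futures.ThreadPoolExecutor when the former cannot be
    imported or constructed, e.g. on platforms without SemLock support.
    """
    try:
        from multiprocessing.pool import ThreadPool

        return ThreadPool(nthreads)
    except Exception:
        # Same broad catch as the commit; a narrower exception tuple
        # would be an assumption about how each platform fails.
        return ThreadPoolExecutor(max_workers=nthreads)


def controlled_compute(arr, nthreads):
    """Compute `arr` on dask's threaded scheduler using a bounded pool.

    Simplified stand-in for the function touched by this commit.
    """
    import dask  # imported lazily; only needed when this function is called

    pool = make_thread_pool(nthreads)
    with dask.config.set(scheduler="threads", pool=pool):
        return arr.compute()
```

Either pool type exposes a `map` method, so a caller can check the helper without dask, for instance with `list(make_thread_pool(2).map(abs, [-1, 2, -3]))`. Neither this sketch nor the commit shuts the pool down after use; whether that matters depends on how often the function is called.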
