-
Hi @chmaz 👋, Looks like a problem with doctr/references/detection/train_pytorch.py Line 108 in 9045dcf Best regards,
-
@felixT2K
-
@felixdittrich92 Unfortunately, I tried, but I still get the error on Windows even with the switch --workers 1 :( As it works fine on Linux, I will switch to that environment for my text detection training experiments and hope inference will work smoothly on Windows (or that a fix will have been found in the meantime). I have two follow-up questions: Thanks a lot for the great feedback, Best, Chris
-
Hello,
During my first attempt with a very small training dataset on Windows 11, to check that I had the training format right, I got the following error stack trace. I looked for a workaround in the repo and online but did not find anything.
Thanks in advance for any insight you may have,
Best,
Chris
Trace:
(base) C:\deepLearning\doctr\doctr-main>python references/detection/train_pytorch.py E:/training E:/validation fast_base --name fast_base1 --device 0 --epochs 20 --batch_size 8 --lr 0.001 --amp --early-stop --early-stop-epochs 5 --early-stop-delta 0.01
Namespace(train_path='E:/training', val_path='E:/validation', arch='fast_base', name='fast_base1', epochs=20, batch_size=8, device=0, save_interval_epoch=False, input_size=1024, lr=0.001, weight_decay=0, workers=None, resume=None, test_only=False, freeze_backbone=False, show_samples=False, wb=False, push_to_hub=False, pretrained=False, rotation=False, eval_straight=False, sched='poly', amp=True, find_lr=False, early_stop=True, early_stop_epochs=5, early_stop_delta=0.01)
Validation set loaded in 0.001002s (1 samples in 1 batches)
Train set loaded in 0.008007s (15 samples in 1 batches)
Traceback (most recent call last): | 0/1 [00:00<?, ?it/s]
File "C:\deepLearning\doctr\doctr-main\references\detection\train_pytorch.py", line 481, in <module>
main(args)
File "C:\deepLearning\doctr\doctr-main\references\detection\train_pytorch.py", line 388, in main
fit_one_epoch(model, train_loader, batch_transforms, optimizer, scheduler, amp=args.amp)
File "C:\deepLearning\doctr\doctr-main\references\detection\train_pytorch.py", line 108, in fit_one_epoch
pbar = tqdm(train_loader, position=1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chris\anaconda3\Lib\site-packages\tqdm\asyncio.py", line 33, in __init__
self.iterable_iterator = iter(iterable)
^^^^^^^^^^^^^^
File "C:\Users\chris\anaconda3\Lib\site-packages\torch\utils\data\dataloader.py", line 439, in __iter__
return self._get_iterator()
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chris\anaconda3\Lib\site-packages\torch\utils\data\dataloader.py", line 387, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chris\anaconda3\Lib\site-packages\torch\utils\data\dataloader.py", line 1040, in __init__
w.start()
File "C:\Users\chris\anaconda3\Lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
^^^^^^^^^^^^^^^^^
File "C:\Users\chris\anaconda3\Lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chris\anaconda3\Lib\multiprocessing\context.py", line 336, in _Popen
return Popen(process_obj)
^^^^^^^^^^^^^^^^^^
File "C:\Users\chris\anaconda3\Lib\multiprocessing\popen_spawn_win32.py", line 95, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\chris\anaconda3\Lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'main.<locals>.<lambda>'
0%| | 0/1 [00:00<?, ?it/s]
(base) C:\deepLearning\doctr\doctr-main>Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\chris\anaconda3\Lib\multiprocessing\spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chris\anaconda3\Lib\multiprocessing\spawn.py", line 132, in _main
self = reduction.pickle.load(from_parent)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
EOFError: Ran out of input
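For anyone landing here with the same trace: the frames in popen_spawn_win32.py and reduction.py show that the DataLoader worker is started with the "spawn" method, so everything handed to the worker is pickled first. A callable defined inside a function (here, something local to main) cannot be pickled, and the second traceback (EOFError) is most likely just the child process reading the truncated pickle stream. The snippet below is only a minimal sketch of that mechanism using the standard library; build_callables and can_pickle are hypothetical names for illustration, not part of docTR or PyTorch.

```python
import operator
import pickle
from functools import partial


def build_callables():
    """Hypothetical stand-in for a main() that creates transforms locally."""
    # A lambda defined inside a function is a "local object":
    # pickle cannot look it up by name, so it cannot be serialized.
    local_fn = lambda x: x * 2
    # A partial over an importable function is picklable; any function
    # defined at module level would work the same way.
    replacement = partial(operator.mul, 2)
    return local_fn, replacement


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # Depending on the Python version, a local object raises
        # AttributeError or PicklingError.
        return False


if __name__ == "__main__":
    local_fn, replacement = build_callables()
    print("local lambda picklable:", can_pickle(local_fn))
    print("partial picklable:", can_pickle(replacement))
```

Note that a PyTorch DataLoader only avoids this pickling step when num_workers is 0, because the data is then loaded in the main process; a single worker still means one spawned process, which would be consistent with --workers 1 not helping above. Whether the training script passes a value of 0 through unchanged is an assumption that should be checked against the script itself.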