I was trying to fine-tune on a raw text file that contains a few empty lines, and I'm getting the error below.
When I looked into the `Dataset` class, I didn't find a `from_list` method. There were others like `from_dict` and `from_text` (which reads from a file), so I wanted to know whether this line of code needs to be changed.
PS: I tried replacing that line with `data = datasets.Dataset.from_text(<file path>)` and training seems to work fine, but I'm not sure how single and multiple newline characters affect training performance. Would appreciate some light shed on that.
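For reference, here is a minimal sketch of one way to build the dataset without `from_list` on older releases of `datasets` that don't have it. It assumes `paragraphs` is a list of dicts like `[{"text": ...}]` (which is what `from_list` expects); the file path is a placeholder, and the splitting logic is just one reasonable choice, not the repo's actual method.

```python
# Sketch of a from_list-free way to build the dataset, assuming `paragraphs`
# is (or can be made into) a list of dicts with a "text" key.
# Dataset.from_dict is available in older datasets releases, so this
# sidesteps the AttributeError.
import datasets

with open("train.txt", encoding="utf-8") as f:  # hypothetical path
    raw_text = f.read()

# Split on blank lines so runs of newlines become paragraph boundaries
# rather than empty training samples.
paragraphs = [{"text": p.strip()} for p in raw_text.split("\n\n") if p.strip()]

# from_dict takes column-oriented data, so convert the list of dicts into a
# single dict of lists before building the Dataset.
data = datasets.Dataset.from_dict({"text": [p["text"] for p in paragraphs]})
print(data)  # e.g. Dataset({features: ['text'], num_rows: ...})
```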
To create a public link, set `share=True` in `launch()`.
Loading base model...
Number of samples: 28
Traceback (most recent call last):
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/gradio/routes.py", line 393, in run_predict
output = await app.get_blocks().process_api(
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/gradio/blocks.py", line 1108, in process_api
result = await self.call_function(
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/gradio/blocks.py", line 915, in call_function
prediction = await anyio.to_thread.run_sync(
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/gradio/helpers.py", line 588, in tracked_fn
response = fn(*args)
File "main.py", line 161, in tokenize_and_train
data = datasets.Dataset.from_list(paragraphs)
AttributeError: type object 'Dataset' has no attribute 'from_list'
Number of samples: 11
Traceback (most recent call last):
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/gradio/routes.py", line 393, in run_predict
output = await app.get_blocks().process_api(
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/gradio/blocks.py", line 1108, in process_api
result = await self.call_function(
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/gradio/blocks.py", line 915, in call_function
prediction = await anyio.to_thread.run_sync(
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/home/datta0/.pyenv/versions/3.8.10/lib/python3.8/site-packages/gradio/helpers.py", line 588, in tracked_fn
response = fn(*args)
File "main.py", line 161, in tokenize_and_train
data = datasets.Dataset.from_list(paragraphs)
AttributeError: type object 'Dataset' has no attribute 'from_list'
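On the newline question, a quick (non-authoritative) way to see what `from_text` actually produces is to inspect the resulting dataset: each row is one unit of text, so printing a few rows and counting blank ones shows whether empty lines survive as empty training samples. The file path below is a placeholder.

```python
# Rough check of how Dataset.from_text splits the raw file; "train.txt" is a
# placeholder for whatever text file is being used for fine-tuning.
import datasets

data = datasets.Dataset.from_text("train.txt")

print(data)      # row count and column names (typically a single "text" column)
print(data[:5])  # first few rows; look for empty or very short strings
blank = sum(1 for t in data["text"] if not t.strip())
print(f"{blank} blank samples out of {data.num_rows}")
```

If many rows come out blank or very short, filtering them out or regrouping the file into paragraphs before tokenization (as in the earlier sketch) is a reasonable way to keep each training sample meaningful.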