Suffix array initialized with length: 3055148
Calling libsais_int with parameters:
buffer.as_ptr(): 0x75fee1dff010
suffix_array.as_mut_ptr(): 0x6967ee0
buffer.len() as i32: 3055148
vocab_size: 100000
symbol_frequency_table: 0
Segmentation fault (core dumped)
Just trying the datastore creation scripts and they seem to crash on the finalize step of lib.rs. This is on Ubuntu 22 with Python 3.9. Same thing on Windows.
I am having trouble with the data reader/search part too:
let end_of_indices = end_of_indices.unwrap();
The case where end_of_indices is None is not handled, so the unwrap panics.
Edit again: increasing the vocabulary size above the tokenizer's vocab size seems to fix the segmentation fault. Whether it crashes or not seems to depend on the datastore data.
I couldn't figure out why end_of_indices was sometimes None, so for now I just return early when it is:
if end_of_indices.is_none() {
    return;
}
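For reference, the same early return can be written without a separate `is_none()` check followed by `unwrap()`. This is only a sketch: `find_end_of_indices`, `read_indices`, and the marker byte are hypothetical stand-ins, not the actual reader code in lib.rs.

```rust
/// Hypothetical stand-in for the reader step that locates the end of the
/// index section. Returns None when no boundary marker is present.
fn find_end_of_indices(data: &[u8], marker: u8) -> Option<usize> {
    data.iter().position(|&b| b == marker)
}

/// Returns the bytes before the marker, or an empty slice when the marker
/// is missing, instead of panicking on `unwrap()`.
fn read_indices(data: &[u8], marker: u8) -> &[u8] {
    // Handle the None case in the same place the value is extracted.
    let end_of_indices = match find_end_of_indices(data, marker) {
        Some(end) => end,
        None => return &[],
    };
    &data[..end_of_indices]
}
```

Returning an empty result keeps the search from crashing, but it does not explain why the boundary is missing in the first place, so that is probably still worth tracking down.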
I opened this issue prematurely: I fixed the crash by adding 1 to the vocabulary size in lib.rs (see https://discourse.julialang.org/t/segfault-calling-c-function-any-advice/94730/8) and rebuilding the wheel. It may be a model-dependent issue, I'm not sure. Maybe something to track down and handle in a future release?
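A possible way to make this robust rather than hard-coding +1, assuming the segfault happens because some token id in the buffer is not strictly below the alphabet size passed to `libsais_int` (I have not confirmed that this is the cause). The helper names below are hypothetical and not part of lib.rs:

```rust
/// Smallest alphabet size `k` such that every token id lies in `[0, k)`.
/// Returns None if a token id is negative or if `max + 1` would overflow.
fn required_alphabet_size(tokens: &[i32]) -> Option<i32> {
    let mut max_id: i32 = -1;
    for &t in tokens {
        if t < 0 {
            return None; // negative ids cannot be valid symbols
        }
        max_id = max_id.max(t);
    }
    // checked_add guards against overflow when max_id == i32::MAX.
    max_id.checked_add(1)
}

/// Hypothetical guard to run before the FFI call: keep the configured
/// vocabulary size, but widen it if the data contains a larger id.
fn safe_alphabet_size(tokens: &[i32], configured_vocab_size: i32) -> Option<i32> {
    let needed = required_alphabet_size(tokens)?;
    Some(needed.max(configured_vocab_size))
}
```

With a configured size of 100000, a buffer containing token id 100000 would yield 100001 here, which matches the +1 workaround, while buffers that only use smaller ids would keep 100000. That would also fit the crash being data dependent.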