Hello! While preparing the Chinese pretraining data, I noticed that the existing mask is ''. For fnlp/bart-base-chinese and fnlp/bart-large-chinese, shouldn't this mask be '[MASK]'?
Hi, the mask token I use is [MASK]. Could you point out which part of the code caused the confusion?
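For reference, the mask token can be checked directly on the tokenizer. A minimal sketch, assuming the Hugging Face `transformers` `BertTokenizer` that the fnlp Chinese BART model cards use for loading:

```python
from transformers import BertTokenizer

# The fnlp Chinese BART checkpoints use a BERT-style vocabulary,
# so the model card loads them with BertTokenizer.
tokenizer = BertTokenizer.from_pretrained("fnlp/bart-base-chinese")

print(tokenizer.mask_token)     # expected: [MASK]
print(tokenizer.mask_token_id)  # the vocabulary id of [MASK]
```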
It's in the phrase-extraction class; jieba's default is , which I changed, so it's not a big deal. By the way, what hardware did you use to train the base Chinese model, and how long did it take? I'm currently running 90 million samples on 8×V100 32G and it's very slow; as for text length, I set the source max length to 60 and the target max length to 410.
I trained on 10 million samples with 8×A100 80G; it took about 3 days.
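A minimal sketch of the truncation settings mentioned above (source capped at 60 tokens, target at 410); the sample strings and variable names are illustrative, not the repo's actual preprocessing code:

```python
from transformers import BertTokenizer

# fnlp Chinese BART models use a BERT-style vocabulary, hence BertTokenizer.
tokenizer = BertTokenizer.from_pretrained("fnlp/bart-base-chinese")

src_texts = ["今天天气不错。"]      # illustrative source sentence
tgt_texts = ["今天的天气非常好。"]  # illustrative target sentence

# Cap the encoder input at 60 tokens and the decoder target at 410.
model_inputs = tokenizer(src_texts, max_length=60, truncation=True)
labels = tokenizer(tgt_texts, max_length=410, truncation=True)
model_inputs["labels"] = labels["input_ids"]
```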