Is your feature request related to a problem? Please describe.
I need aggregated results so I can analyze helpfulness at the note level. As far as I can tell, the only way to get them is to run the algorithm from scratch, so I am reproducing results from the downloaded data (notes, ratings, note status history, and user enrollment) on a 64-core Intel(R) Xeon(R) Gold 6448H CPU with 500 GB of memory (correct me if I am wrong). However, after 20 hours the pre-scoring phase still hasn't completed, and it looks like it won't finish within a day, which blocks further analysis.
Since the algorithm runs every hour or so on the server, may I know:
1. Would it be possible to share the output files (scored_notes.tsv, helpfulness_scores.tsv, note_status_history.tsv, and aux_note_info.tsv)?
2. What are the hardware requirements and the expected running time if I want to generate aggregated note scores myself?
This would greatly help with research analysis, as running the algorithm locally to aggregate helpfulness scores has been quite challenging.
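For context, this is roughly the note-level view I am after. As a stopgap I can aggregate the raw ratings download directly; a minimal sketch is below, where the file name and the column names (noteId, helpfulnessLevel) are assumptions on my part and should be checked against the actual headers of the public download:

```python
# Minimal sketch (not the official scorer): a quick note-level aggregation of
# the raw ratings download as a stopgap while full scorer outputs are unavailable.
# Assumed file name and columns (noteId, helpfulnessLevel) should be verified
# against the actual download.
import pandas as pd

ratings = pd.read_csv("ratings-00000.tsv", sep="\t")

# Count ratings per note, split by helpfulness level
# (e.g. HELPFUL / SOMEWHAT_HELPFUL / NOT_HELPFUL).
per_note = (
    ratings.groupby(["noteId", "helpfulnessLevel"])
    .size()
    .unstack(fill_value=0)
)
per_note["totalRatings"] = per_note.sum(axis=1)
print(per_note.sort_values("totalRatings", ascending=False).head())
```

Of course this only counts raw ratings; it does not reproduce the helpfulness scores produced by the matrix factorization, which is why the scorer outputs themselves would be so useful.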
Describe the solution you'd like
Would it be possible to share the output files (scored_notes.tsv, helpfulness_scores.tsv, note_status_history.tsv, and aux_note_info.tsv)? They don’t need to be the latest versions—files aligned with the current download page would be fine.
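If the files were shared, my analysis would only need a simple join of them; a minimal sketch follows, assuming tab-separated files with a header row and noteId as the join key (column names are assumptions to verify against the actual files):

```python
# Minimal sketch of using the shared output files for note-level analysis.
# Assumes tab-separated files with a header row and a noteId column in both;
# these are assumptions to check against the actual outputs.
import pandas as pd

scored_notes = pd.read_csv("scored_notes.tsv", sep="\t")
status_history = pd.read_csv("note_status_history.tsv", sep="\t")

# Attach status information to each scored note for downstream analysis.
notes_with_status = scored_notes.merge(status_history, on="noteId", how="left")
print(notes_with_status.head())
```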
Describe alternatives you've considered
Alternatively, it would be helpful to share the hardware requirements and expected running time for generating aggregated note scores from scratch, or any intermediate outputs of the process.
Additional context
Thank you so much for your contributions to this amazing project! I am a PhD student working on fact-checking in Natural Language Processing, and I am very happy to explore and contribute more. I am actively working on this, and any help with the above questions would be much appreciated!
Hi - it's not surprising that the job might take that long when run sequentially. Since you don't seem to be resource-bound, you could try running with the parallel flag set to True. Let us know if that helps!
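For example, the invocation might look roughly like the sketch below; the entry point and flag names (main.py, --notes, --ratings, --status, --enrollment, --outdir, --parallel) are assumptions here, so please check `python main.py --help` in the repo for the actual arguments:

```python
# Rough sketch of launching the scorer with parallelism enabled.
# The entry point and all flag names below are assumptions to verify
# against the repository's main.py / --help output.
import subprocess

subprocess.run(
    [
        "python", "main.py",
        "--notes", "notes-00000.tsv",
        "--ratings", "ratings-00000.tsv",
        "--status", "noteStatusHistory-00000.tsv",
        "--enrollment", "userEnrollment-00000.tsv",
        "--outdir", "output",
        "--parallel",
    ],
    check=True,
)
```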
Hi @ashilgard, many thanks for your reply, I will try that! Actually, I do have resource limitations: normally we don't have that much CPU and memory (64 GB at most), and I queued for a very long time to run the algorithm. It would still be great if you could share the results and descriptions of the output formats.