Evidence-Based Fact-Checking in Healthcare

Introduction

Fact-checking in healthcare is a critical and complex task: medical claims must be validated against rigorous scientific evidence. With misinformation widespread, the accuracy and reliability of healthcare information are of utmost importance. This study aims to improve fact-checking in the healthcare domain by evaluating different datasets, machine learning models, and techniques. First, we assess the most effective fact-checking techniques proposed for healthcare information. Second, we compare two prominent models, BlueBERT and SciBERT, on the HealthVER dataset to understand how each performs at healthcare fact-checking. Third, we investigate data augmentation techniques and incorporate focal loss to address the dataset's class imbalance. Finally, we develop a classification system that predicts whether evidence supports or refutes a claim in the COVID-19 dataset, using the claim, the evidence, and the topic ID as features. The findings should help in choosing the most suitable techniques for evidence-based fact-checking in healthcare.
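
Focal loss, mentioned above as the remedy for class imbalance, scales the cross-entropy term by (1 - p_t)^gamma so that well-classified examples contribute less to the total loss. The pure-Python sketch below is illustrative only: gamma = 2.0, the optional per-class alpha weights, and the example logits are assumptions rather than values from this study, and an actual training run would compute the same quantity on PyTorch tensors.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for one example: -log(p_t)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one example: -alpha_t * (1 - p_t)^gamma * log(p_t).

    gamma and alpha are illustrative defaults, not values from the study.
    """
    p_t = softmax(logits)[target]
    weight = (1.0 - p_t) ** gamma
    if alpha is not None:
        weight *= alpha[target]
    return -weight * math.log(p_t)


# Two made-up examples whose true class is index 0.
easy = [4.0, 0.0, 0.0]  # confident and correct
hard = [0.0, 1.0, 0.0]  # true class gets low probability

for name, logits in [("easy", easy), ("hard", hard)]:
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f}")
```

With gamma = 0 the focal term reduces to plain cross-entropy. Larger gamma values shrink the loss on easy examples far more than on hard ones, which is why the technique is often paired with imbalanced label distributions.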

Dataset (github.com/sarrouti/HealthVer/blob/master/data/healthver_dev.csv)

  1. HealthVER is a dataset developed specifically for evidence-based fact-checking of health-related COVID-19 claims against scientific articles. A verification system built on HealthVER predicts the relationship between a claim and a piece of evidence extracted from a scientific article. The dataset consists of 14,330 manually annotated evidence-claim pairs, each labelled SUPPORTS, REFUTES, or NEUTRAL according to whether the evidence supports, refutes, or remains neutral towards the claim.
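
As a rough illustration of the pair structure described above, the sketch below reads claim-evidence pairs from CSV text, drops incomplete and duplicate rows, and counts the three labels. The column names (`claim`, `evidence`, `label`) and the sample rows are assumptions made up for this example and are not taken from the HealthVER files; check the actual CSV header before reusing it.

```python
import csv
import io
from collections import Counter

# Hypothetical rows for illustration only; not real HealthVER data.
SAMPLE_CSV = """claim,evidence,label
Claim A,Evidence sentence one.,SUPPORTS
Claim A,Evidence sentence one.,SUPPORTS
Claim B,Evidence sentence two.,REFUTES
Claim C,,NEUTRAL
Claim D,Evidence sentence three.,NEUTRAL
"""

VALID_LABELS = {"SUPPORTS", "REFUTES", "NEUTRAL"}


def load_pairs(handle):
    """Return cleaned claim-evidence pairs from an open CSV handle."""
    seen = set()
    pairs = []
    for row in csv.DictReader(handle):
        claim = (row.get("claim") or "").strip()
        evidence = (row.get("evidence") or "").strip()
        label = (row.get("label") or "").strip().upper()
        # Skip rows with missing text or an unknown label.
        if not claim or not evidence or label not in VALID_LABELS:
            continue
        # Skip exact duplicate claim-evidence pairs.
        key = (claim, evidence)
        if key in seen:
            continue
        seen.add(key)
        pairs.append({"claim": claim, "evidence": evidence, "label": label})
    return pairs


pairs = load_pairs(io.StringIO(SAMPLE_CSV))
label_counts = Counter(p["label"] for p in pairs)
mean_evidence_words = sum(len(p["evidence"].split()) for p in pairs) / len(pairs)

print(label_counts)
print(mean_evidence_words)
```

For a real file, pass `open(path, newline="", encoding="utf-8")` to `load_pairs` in place of the `io.StringIO` wrapper. The label counts give a first view of class imbalance.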

Proposed Methodology

• Data Cleaning and Exploration: We start by cleaning and organizing the dataset. This involves removing irrelevant or duplicate entries, handling missing values, and ensuring the data is in a format suitable for analysis. We then examine the training, validation, and testing sets carefully, looking at the label distribution, word clouds, and sentence lengths to identify patterns and themes in the text.

• Data Preparation: We then encode the pre-processed claims and evidence using each model's own tokenizer. The text is converted into tokenized sequences together with attention masks, which tell the model which positions hold real tokens and which hold padding to be ignored. We use the PyTorch framework for dataset preparation, loading, and classification. The model requires its input in a structured, batched format, so the training, validation, and testing sets are arranged into batches before training. Batching makes training more efficient and reduces overhead.

• Model Preparation: We focus on two variants of BERT, BlueBERT and SciBERT. BERT is a transformer-based model that has shown impressive results on various natural language processing tasks, including fact-checking. It learns contextualized word representations by pre-training on a large amount of text. BlueBERT is a version of BERT pre-trained on PubMed abstracts, striking a balance between medical and non-medical language. This choice is particularly important because the HealthVER dataset mainly consists of abstracts rather than full articles. Mourad Sarrouti and colleagues have already used SciBERT with the HealthVER dataset, so our objective is to compare BlueBERT against SciBERT on the same dataset and gain insight into how the two variants perform on this task.

• Fine-Tuning: The models are fine-tuned on the dataset. This step involves updating the model's parameters using the provided labelled data. The objective is to minimize a loss function, such as cross-entropy loss, which measures the difference between predicted and actual labels. Here, backpropagation is used to calculate the gradients of the loss with respect to the model's parameters. These gradients are then used to update the parameters using an optimization algorithm.

• Prediction and Veracity Assessment: Next, we use the two fine-tuned models to predict the veracity of claims based on the encoded evidence. The tokenized claim-evidence pairs are passed through each model to obtain a predicted label for each pair. These predictions represent each model's assessment of the claim's veracity given the provided evidence.

• Result Analysis and Reporting: In this step, we analyse the predictions made by the two models. We evaluate each model's performance by comparing its predictions with the ground-truth labels from the dataset, calculating accuracy, precision, recall, and F1-score. We also report the results and insights gained from the fact-checking process.

• Iterative Improvement: Based on the analysis and evaluation of the models, we then identify areas for improvement and iterate on the methodology. This may include adjusting hyperparameters, refining the pre-processing steps, or incorporating additional data to enhance the model's performance.
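
The data preparation step above (paired encoding, padding, attention masks, batching) can be sketched without any external library. This is a minimal illustration under stated assumptions: the whitespace tokenizer, the toy vocabulary, the `max_len` of 12, and the example sentences are all hypothetical stand-ins, whereas the actual pipeline would use the BlueBERT and SciBERT tokenizers and PyTorch data loaders.

```python
PAD, CLS, SEP, UNK = "[PAD]", "[CLS]", "[SEP]", "[UNK]"


def build_vocab(texts):
    """Toy vocabulary: special tokens first, then words in order of appearance."""
    vocab = {PAD: 0, CLS: 1, SEP: 2, UNK: 3}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode_pair(claim, evidence, vocab, max_len):
    """Encode as [CLS] claim [SEP] evidence [SEP], then pad to max_len."""
    claim_ids = [vocab.get(w, vocab[UNK]) for w in claim.lower().split()]
    evidence_ids = [vocab.get(w, vocab[UNK]) for w in evidence.lower().split()]
    ids = [vocab[CLS]] + claim_ids + [vocab[SEP]] + evidence_ids + [vocab[SEP]]
    # Segment 0 covers [CLS] + claim + [SEP]; segment 1 covers evidence + [SEP].
    segments = [0] * (len(claim_ids) + 2) + [1] * (len(evidence_ids) + 1)
    # Naive right truncation; real tokenizers preserve the final [SEP].
    ids, segments = ids[:max_len], segments[:max_len]
    pad = max_len - len(ids)
    return {
        "input_ids": ids + [vocab[PAD]] * pad,
        "token_type_ids": segments + [0] * pad,
        # 1 marks a real token, 0 marks padding the model should ignore.
        "attention_mask": [1] * len(ids) + [0] * pad,
    }


def batches(items, batch_size):
    """Yield consecutive fixed-size slices of items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Made-up example pairs, not taken from HealthVER.
example_pairs = [
    ("masks reduce transmission", "masks lowered spread in trials"),
    ("vitamin c helps", "no effect found"),
]
vocab = build_vocab([text for pair in example_pairs for text in pair])
encoded = [encode_pair(c, e, vocab, max_len=12) for c, e in example_pairs]

for batch in batches(encoded, batch_size=1):
    for item in batch:
        print(item["attention_mask"])
```

Every encoded pair has the same length, so a batch can be stacked into a single tensor, and the attention mask lets the model skip the padded positions.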

Conclusion

Fact-checking in healthcare is a vital and intricate task that validates medical claims and data against rigorous scientific evidence. Given how widespread incorrect information is, the precision and trustworthiness of healthcare information remain highly significant. This research examined strategies for improving healthcare fact-checking by evaluating diverse datasets, machine learning models, and techniques, and by rigorously evaluating classification systems to identify the most effective methodologies. Comparing two notable models, BlueBERT and SciBERT, on the HealthVER dataset gave us insight into their respective strengths and weaknesses in healthcare fact-checking. Our exploration of focal loss and topic classification shed light on their importance for improving fact-checking accuracy. Central to the analysis was the difference between SciBERT and BlueBERT in precision, recall, F1-score, and accuracy. SciBERT's consistent superiority across categories highlights its efficacy and resilience in various scenarios. In predicting contradicting and supporting pairs, SciBERT stood out at identifying straightforward conflicts, though it had difficulty discerning subtle and conditional relationships.
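
The precision, recall, F1-score, and accuracy comparison referred to above can be reproduced with a few lines of code. The sketch below is a generic per-class implementation, not the evaluation script used in this study, and the label lists are made-up placeholders rather than model outputs.

```python
LABELS = ["SUPPORTS", "REFUTES", "NEUTRAL"]


def per_class_metrics(y_true, y_pred, labels=LABELS):
    """Return precision, recall, and F1 for each label."""
    report = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(report):
    """Unweighted mean of per-class F1, so minority classes count equally."""
    return sum(m["f1"] for m in report.values()) / len(report)


# Placeholder labels for illustration only.
y_true = ["SUPPORTS", "SUPPORTS", "REFUTES", "REFUTES", "NEUTRAL", "NEUTRAL"]
y_pred = ["SUPPORTS", "REFUTES", "REFUTES", "REFUTES", "NEUTRAL", "SUPPORTS"]

report = per_class_metrics(y_true, y_pred)
for label, scores in report.items():
    print(label, {k: round(v, 3) for k, v in scores.items()})
print("accuracy:", round(accuracy(y_true, y_pred), 3))
print("macro F1:", round(macro_f1(report), 3))
```

Running the same functions on each model's predictions yields directly comparable per-class and macro-averaged scores.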