XDetox

This repository contains the official implementation of XDetox: Text Detoxification with Token-Level Toxicity Explanations, accepted to EMNLP 2024.

Abstract

Methods for mitigating toxic content through masking and infilling often overlook the decision-making process, leading to either insufficient or excessive modifications of toxic tokens. To address this challenge, we propose XDetox, a novel method that integrates token-level toxicity explanations with the masking and infilling detoxification process. We apply this approach through two strategies that improve detoxification performance: first, identifying toxic tokens to improve the quality of masking; second, re-ranking the regenerated candidate sentences and selecting the least toxic one. Our experimental results show state-of-the-art performance across four datasets compared to existing detoxification methods. Furthermore, human evaluations indicate that our method outperforms baselines in both fluency and toxicity reduction. These results demonstrate the effectiveness of our method in text detoxification.
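The two strategies can be illustrated with a toy sketch. This is not the repository's implementation: the per-token scores are hard-coded stand-ins for the attributions an explainer such as DecompX would produce, and `mask_toxic_tokens`, `rerank`, the `<mask>` string, the 0.5 threshold, and all numeric scores are hypothetical names and values chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical mask string; the real token depends on the infilling model.
MASK = "<mask>"


def mask_toxic_tokens(
    tokens: Sequence[str], scores: Sequence[float], threshold: float = 0.5
) -> List[str]:
    """Replace every token whose toxicity score exceeds the threshold with MASK."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    return [MASK if score > threshold else token for token, score in zip(tokens, scores)]


def rerank(candidates: Sequence[Tuple[str, float]]) -> str:
    """Return the candidate sentence with the lowest toxicity score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return min(candidates, key=lambda pair: pair[1])[0]


# Strategy 1: mask tokens flagged by (made-up) token-level toxicity scores.
tokens = ["you", "are", "an", "awful", "person"]
scores = [0.02, 0.01, 0.03, 0.91, 0.05]
masked = mask_toxic_tokens(tokens, scores)
print(" ".join(masked))

# Strategy 2: among (made-up) infilled candidates, keep the least toxic one.
candidates = [("you are an odd person", 0.30), ("you are a kind person", 0.04)]
best = rerank(candidates)
print(best)
```

In a real pipeline the scores would come from a toxicity classifier and its explanation method, and an infilling language model would generate the candidates; the sketch only shows how the two selection steps fit together.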

Run Code

Recommended Hardware

We ran our experiments on an NVIDIA A100 GPU with 40 GB of VRAM. The method also runs on GPUs with less VRAM, but you may need to reduce the batch size to fit the available memory.

Installation

To clone this repository along with its submodules, use the following command:

git clone --recurse-submodules https://github.com/LeeBumSeok/XDetox.git

Requirements

XDetox requires Python 3.8 or later. Install the required libraries with:

pip install -r requirements.txt
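If you are unsure which interpreter is active, a quick check along these lines can confirm that it meets the Python 3.8 minimum before you install anything. The helper name `meets_minimum` is hypothetical and is not part of this repository.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this README


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if meets_minimum():
    print("Python version OK")
else:
    found = sys.version.split()[0]
    print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {found}")
```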

Quick Start

After cloning the repository, run XDetox with the provided script, lab.py:

python lab.py --all --output_folder single --evaluate --ranking

Related Work

This research is inspired by ideas from previous work on text detoxification and explainability, particularly MARCO (Hallinan et al., 2023) and DecompX (Modarressi et al., 2023).

Citation

TODO
