Connecting Illegal Vendors on Darknet Markets: Responsible Authorship Attribution to Link and Connect Online Cybercrimes
In this workshop, we address a significant challenge for law enforcement agencies: uncovering illegal markets and the connections between them despite the anonymity these platforms provide. Perpetrators often remain undetected by adopting multiple aliases or frequently moving between markets. To tackle this issue, we demonstrate the responsible application of Authorship Attribution (AA) approaches ([1,2]) from Natural Language Processing (NLP).
AA approaches in NLP involve determining a text's likely author or source based on linguistic features and patterns. These approaches are particularly useful in identifying and linking written content to specific individuals, which can be valuable in various applications, including forensic analysis, plagiarism detection, and tracking activities on online platforms. This workshop aims to develop methods that enable law enforcement to connect the dots and establish relationships between illegal Darknet markets and their vendors. We employ advanced NLP techniques to unveil the distinctive linguistic fingerprints that individual authors leave in their writings. These fingerprints may include syntactic structures, vocabulary choices, writing style, and other linguistic nuances that can be extracted and analyzed. Through a responsible application of AA approaches, investigators can trace and link these linguistic patterns to specific vendors, thus uncovering their identities and affiliations within the illegal markets.
The workshop explores the difficulties of implementing these AA techniques, provides practical insights and methodologies for identifying and connecting unique vendor accounts across text advertisements from the Agora Darknet Market, and elaborates on responsible guidelines for the field. By understanding and leveraging the distinctive writing patterns of individuals, law enforcement agencies can enhance their capabilities in tracking and combating illicit activities on the Darknet.
Introduction:
- Who are we, and what do we do?
- Current Trends in NLP: The Rise of Transformer Models
- Darknet Markets
- Authorship Attribution Approaches in NLP
- Authorship Identification
- Authorship Verification
Responsible Authorship Attribution:
- Privacy & Data Protection
- Discrimination & Unintended Biases
- Transparency & Fairness
- Societal Impact
Hands-on session: Getting started
- Setting up the Google Colab notebook
- Creating the environment and installing the dependencies
- Data Analysis of Agora Marketplace
- Responsible Data Preprocessing
- Sanity Check: Performing stylometric analysis
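As a rough illustration of the stylometric sanity check, the sketch below computes a handful of surface-level style features for a single advertisement text. It is not the workshop's notebook code: the function name `stylometric_profile`, the choice of features, and the short `FUNCTION_WORDS` list are all assumptions made for this example.

```python
import re
import string
from collections import Counter

# Illustrative subset of function words; a real analysis would use a longer list.
FUNCTION_WORDS = ("the", "and", "of", "to", "in", "for", "with")


def stylometric_profile(text: str) -> dict:
    """Compute a few simple style features for one advertisement text."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    n_tokens = len(tokens) or 1  # avoid division by zero on empty text
    n_chars = len(text) or 1
    counts = Counter(tokens)

    profile = {
        # Mean number of characters per word token.
        "avg_word_len": sum(len(t) for t in tokens) / n_tokens,
        # Vocabulary richness: distinct tokens divided by all tokens.
        "type_token_ratio": len(counts) / n_tokens,
        # Share of characters that are ASCII punctuation.
        "punct_rate": sum(ch in string.punctuation for ch in text) / n_chars,
        # Share of characters that are upper case.
        "upper_rate": sum(ch.isupper() for ch in text) / n_chars,
    }
    # Relative frequency of each function word.
    for word in FUNCTION_WORDS:
        profile[f"fw_{word}"] = counts[word] / n_tokens
    return profile


if __name__ == "__main__":
    # Made-up example text, not taken from the Agora data.
    for name, value in stylometric_profile("The cat and the dog.").items():
        print(f"{name}: {value:.3f}")
```

Comparing such profiles between texts attributed to the same alias and texts from different aliases gives a quick check on whether writing style carries any signal before heavier models are trained.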
Hands-on session: Authorship Identification: A Closed-Set Multi-Class Classification Task
- Statistical Models
- Traditional NNs
- Transformer-based models
- Explainability frameworks
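To make the closed-set setting concrete, here is a minimal sketch in which every query text is assigned to one of a fixed set of known vendors. The outline does not say which statistical models the session uses, so a nearest-centroid classifier over character trigram counts stands in for them here. The helper names (`char_ngrams`, `fit_centroids`, `predict`) and the toy texts and vendor labels are invented for this example.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count character n-grams after lower-casing and whitespace normalisation."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def fit_centroids(texts, vendors) -> dict:
    """Sum the n-gram counts of all training texts per vendor."""
    centroids = defaultdict(Counter)
    for text, vendor in zip(texts, vendors):
        centroids[vendor].update(char_ngrams(text))
    return dict(centroids)


def predict(centroids: dict, text: str) -> str:
    """Closed-set prediction: always returns one of the known vendors."""
    query = char_ngrams(text)
    return max(centroids, key=lambda v: cosine(query, centroids[v]))


# Toy, made-up training data (not from the Agora corpus).
train_texts = [
    "TOP QUALITY!!! fast shipping, stealth packaging!!!",
    "BEST PRICE!!! fast shipping, stealth packaging!!!",
    "we offer discreet delivery ; please read our terms before ordering",
    "please read our terms ; we offer tracked delivery on request",
]
train_vendors = ["vendor_a", "vendor_a", "vendor_b", "vendor_b"]
centroids = fit_centroids(train_texts, train_vendors)

if __name__ == "__main__":
    print(predict(centroids, "NEW BATCH!!! fast shipping and stealth packaging!!!"))
```

Because the task is closed-set, `predict` never abstains: a text written by an unseen vendor is still forced onto the closest known label, which is one reason to treat its output as an investigative lead rather than as evidence.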
Hands-on session: Authorship Verification: An Open-Set Retrieval Task
- Extracting text representations
- FAISS
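The retrieval view of verification can be sketched as follows: every known advertisement is turned into a fixed-size vector, a query is embedded the same way, and the nearest stored vector is accepted only if its similarity clears a threshold. This sketch makes several assumptions. Hashed character trigrams stand in for the Transformer text representations, a brute-force inner-product loop stands in for a FAISS index, and the names `embed`, `search`, and `verify`, the dimension 256, and the threshold 0.5 are arbitrary choices for illustration.

```python
import math
import zlib

DIM = 256  # size of the hashed feature vector (arbitrary for this sketch)


def embed(text: str, n: int = 3) -> list:
    """Hash character n-grams into a fixed-size, L2-normalised vector."""
    text = " ".join(text.lower().split())
    vec = [0.0] * DIM
    for i in range(len(text) - n + 1):
        # crc32 is deterministic across runs, unlike the built-in hash().
        bucket = zlib.crc32(text[i:i + n].encode("utf-8")) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def search(index, query: str, k: int = 1):
    """Return the k best (score, label) pairs by inner product."""
    q = embed(query)
    scored = [
        (sum(a * b for a, b in zip(q, vec)), label) for label, vec in index
    ]
    return sorted(scored, reverse=True)[:k]


def verify(index, query: str, threshold: float = 0.5):
    """Open-set decision: return the best label, or None if nothing is close."""
    score, label = search(index, query, k=1)[0]
    return label if score >= threshold else None


# Toy, made-up listings (not from the Agora corpus).
known = {
    "vendor_a": "TOP QUALITY!!! fast shipping, stealth packaging!!!",
    "vendor_b": "we offer discreet delivery ; please read our terms",
}
index = [(label, embed(text)) for label, text in known.items()]

if __name__ == "__main__":
    print(verify(index, known["vendor_a"]))
    print(verify(index, ""))
```

Since the vectors are L2-normalised, the inner product equals cosine similarity. The `None` branch is what makes the task open-set: a query that resembles no stored text is left unlinked instead of being forced onto a known vendor. At the scale of a full marketplace, an approximate or exact nearest-neighbour index would replace the loop in `search`.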
Conclusion
- Where do we stand?
- Limitations
- Future Work
(*) Disclaimer: Due to time constraints, our exploration of the Python code will be limited, with most functionality encapsulated in functions and classes. The primary goal of this workshop is not to delve into the intricacies of text classification and retrieval in NLP. Instead, we focus on demonstrating how AA approaches can be applied responsibly to identify and connect criminal entities within illegal online markets, showcasing the application of these techniques rather than providing an in-depth tutorial. Questions are welcome during the break or after the workshop.
Agora Darknet Market: The Agora darknet market, which operated on the Tor network, was launched in 2013. It survived Operation Onymous, which targeted various darknet websites in November 2014, and gained prominence as a leading platform after the closure of Evolution in March 2015. An art exhibition in Switzerland, titled "The Darknet: From Memes to Onionland," explored darknet culture and featured purchases made on Agora by the automated shopping bot Random Darknet Shopper. Despite this resilience, after vulnerabilities in the Tor Hidden Services protocol were revealed, Agora's administrators announced a temporary suspension of operations in response to potential threats and suspicions of server de-anonymization. The market operated until August 2015, and its closure led to a shift in activity to AlphaBay, which law enforcement shut down in July 2017. To read more about the Agora darknet market, please see Wikipedia.
- Complete Google Colab Tutorial
- Scikit-Learn Python Tutorial | Machine Learning with Scikit-learn
- PyTorch Lightning Tutorials