ToS;DR-ML

Deep Learning for Terms of Service

Do you really read the terms of service presented by various services on the internet? Or do you just scroll to the bottom and click "I Agree"?

What if machine learning could help humans understand the lengthy legal documents we click to agree to?

This project is an attempt to use machine learning to extract information along multiple dimensions, such as data locality, privacy, and ownership, from the Terms of Service agreements we all accept.

Motivation

We all use a plethora of apps and services, each of which stores a variety of data generated by us. Does that data belong to us? How are these services and apps using it? Do we really care about our data and privacy? Can AI help demystify long dumps of text for us?

I have always scrolled to the bottom and clicked "I Agree"!

As responsible engineers, how can we change this? How about using AI?

A quick search revealed an existing crowdsourced project, ToS;DR: https://tosdr.org

Given that ToS;DR is a crowdsourced effort, it is hard to expect the team to review the Terms of Service documents and privacy policies published by every service on the internet.

The only way to scale this is to automate the process. Natural language processing techniques and AI models have evolved to a great extent and are already used in various legal domains. Moreover, with the advent of BERTology (BERT and BERT-like models), many NLP problems have become much easier to solve.

This talk covers the following:

  1. Why should we care about Terms of Service?
  2. Work done by ToS;DR
  3. How can we use ML models to automate efforts similar to ToS;DR?
  4. Has NLP evolved enough to handle new domains?
  5. Explainability of these models

This project is part of a larger effort (Deep Learning for Humans), whose goal is to make deep learning easy to understand and explainable to any layperson.

The final goal of this project would be to have a web app which would do the following:

  1. Given a piece of terms of service text, predict its class of goodness with respect to privacy.
  2. Given the entire terms of service text, summarize it along multiple dimensions such as data locality, data privacy, and rights, to name a few.
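
To make the second goal concrete, here is a minimal sketch of grouping sentences from a terms of service text by dimension. It is not the project's model: the keyword lookup is a deliberately simple stand-in for a trained classifier such as a BERT-like model, and the names `DIMENSION_KEYWORDS`, `split_sentences` and `group_by_dimension`, the keyword lists, and the sample text are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists, for illustration only. A real system would
# replace this lookup with a trained classifier (e.g. a BERT-like model).
DIMENSION_KEYWORDS = {
    "data locality": ["stored in", "transferred to", "servers located"],
    "data privacy": ["third parties", "personal data", "share your"],
    "rights": ["license", "ownership", "intellectual property"],
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def group_by_dimension(text):
    """Map each dimension to the sentences that mention one of its keywords."""
    grouped = defaultdict(list)
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for dimension, keywords in DIMENSION_KEYWORDS.items():
            if any(keyword in lowered for keyword in keywords):
                grouped[dimension].append(sentence)
    return dict(grouped)


# Made-up sample text, not taken from any real service.
sample = (
    "Your data may be transferred to servers located outside your country. "
    "We share your personal data with third parties for advertising. "
    "You grant us a worldwide license to use your content. "
    "Thank you for using our service."
)

summary = group_by_dimension(sample)
for dimension, sentences in summary.items():
    print(f"{dimension}:")
    for sentence in sentences:
        print(f"  - {sentence}")
```

Sentences that match no dimension, like the closing "Thank you" line, are simply left out of the summary. Swapping the keyword check for a model prediction would keep the same overall shape: split the document, label each sentence, then group by label.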