Veritas: A lightweight Generative AI detection service

Collaborators: Nick Ching (nc2935), Naren Loganathan (nl2878), Suwei Ma (sm5011), Avery Fan (mf3332)

Contributions + Tasks: Trello Board

About The Project

Veritas is a lightweight and flexible service that abstracts the task of detecting AI-generated text in various contexts. It builds on the paper "Raidar: geneRative AI Detection viA Rewriting" by Mao, Vondrick, Wang, and Yang.

Current Progress

18 Oct 2024: Completed development of Veritas service (including Java API controller, DB handlers and Python Microservice)
28 Nov 2024: Furthered development of Veritas service with an authentication mechanism, developed BotBuster client which employs our service, finalized endpoints

BotBuster Linked Here: https://github.com/nick-ching23/BotBuster/tree/main

Veritas Architecture

Veritas uses a modular microservice design:

Java Spring Boot API Handler: controls all business logic, handling interactions with the ML model and our persistent storage
Python ML microservice: this microservice is solely responsible for detecting AI-generated text. It is deployed on a separate GCP VM and is accessed over HTTP
Cloud SQL on GCP: The final component of our service is persistent storage, hosted on GCP.

Project Requirements

Java Development Kit (JDK): Version 17 or later
- Ensure that javac and java are installed and properly configured in your system's PATH.
- You can verify this by running javac --version and java --version
Maven: Used for building the project
- Maven should be installed and accessible via the CLI
- you can verify this by running mvn --version
Python 3.11 or later: For deploying the Veritas microservice:
- You can check your version using python3 --version
IntelliJ: Note that this project was built using IntelliJ IDEA, but it should work with any Java-compatible IDE.

How to build our project

Clone the GitHub repo, open the veritas directory in IntelliJ, and run the default generated run configuration for VeritasApplication from the IDE.
Tests can be run with mvn test without the extra DB/Flask server setup, since those dependencies are mocked. To run the service itself, check application.yaml for the additional env vars required.
Set up a MySQL database (we use Cloud SQL) and set the DB_URL, DB_USERNAME and DB_PASSWORD environment variables accordingly.
ML Microservice: Create a python3 venv with python3 -m venv <env-name> (we did this on a separate GCP VM), run pip install -r requirements.txt from the check_gpt directory, and start up the Flask server by running python3 app.py.
Get the URL for the server and set the MODEL_MICROSERVICE_URL env var accordingly.
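
Before starting the service, it can help to confirm that the variables named in the steps above are actually set. The following is a minimal sketch, not part of the repository: the helper name missing_env_vars is hypothetical, and application.yaml may list further variables that are not covered here.

```python
import os

# Variable names taken from the build steps above; application.yaml may require more.
REQUIRED_ENV_VARS = ("DB_URL", "DB_USERNAME", "DB_PASSWORD", "MODEL_MICROSERVICE_URL")


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running the script in the same shell that will launch Veritas shows which variables still need to be exported.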

Interacting with our service

Endpoint information is described in the next section. We use Postman to craft our requests and point it at the URL we obtain when we start up Veritas.
At the moment, we have our service running on a GCP VM. Paste http://34.70.245.192:8080 in your browser to see 'Welcome to Veritas!'. You can use the endpoint descriptions to make other types of requests.

Service Endpoint Descriptions

GET: /

Purpose: Debugging function to ensure our API is connected.

Expected Parameters: N/A

Expected Output: "Welcome to Veritas!" string
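
As an alternative to a browser or Postman, the root endpoint can be checked from a script. This is a rough sketch using only the Python standard library; the function names are hypothetical, and the default address is the one quoted in this README, which may no longer be live.

```python
import urllib.request

# Address quoted in this README; replace it with your own deployment if it is not reachable.
BASE_URL = "http://34.70.245.192:8080"


def root_url(base_url=BASE_URL):
    """Normalize the base URL so it ends with exactly one trailing slash."""
    return base_url.rstrip("/") + "/"


def fetch_welcome(base_url=BASE_URL, timeout=5.0):
    """Send GET / and return the response body as text."""
    with urllib.request.urlopen(root_url(base_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    print(fetch_welcome())
```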

POST: /checkText

Purpose: Simply determine if an independent piece of text was potentially generated by AI

Expected Parameters: String text -- must be provided in a JSON format.

Expected Output: HTTP OK Status with JSON containing a boolean true or false value

Upon Failure: HTTP Bad Request or Internal Server Error
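
A /checkText call could be scripted roughly as follows. This is a sketch under assumptions: the JSON key "text" follows the parameter name listed above, the exact shape of the JSON response is not specified in this README, and any credentials required by the login check are not shown. The function names are hypothetical.

```python
import json
import urllib.request


def build_check_text_request(base_url, text):
    """Build (but do not send) a POST /checkText request with a JSON body."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/checkText",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_text(base_url, text, timeout=10.0):
    """Send the request and return the decoded JSON response as-is."""
    req = build_check_text_request(base_url, text)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A Bad Request or Internal Server Error response surfaces as urllib.error.HTTPError, so callers may want to wrap check_text in a try/except block.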

POST: /checkTextUser

Purpose: Determine if a piece of text attributed to a user in an organization was potentially generated by AI, updating the corresponding flag count in the database if so

Expected Parameters: String text, String userId, String orgId -- must be provided in a JSON format.

Expected Output: HTTP OK, indicating that the text was analyzed (and that the database was successfully updated if necessary)

Upon Failure: HTTP Bad Request or Internal Server Error (if supplied parameters are null or the underlying checkText call errs out)
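
Since null parameters lead to a Bad Request, a client may prefer to validate its input before sending. The sketch below builds the JSON body for /checkTextUser and rejects missing values locally; the key names mirror the parameter names above, and the helper itself is hypothetical rather than part of Veritas or BotBuster.

```python
import json

# Field names follow the "Expected Parameters" line above.
REQUIRED_FIELDS = ("text", "userId", "orgId")


def build_check_text_user_body(text, user_id, org_id):
    """Return the JSON body for POST /checkTextUser, or raise if a value is None."""
    body = {"text": text, "userId": user_id, "orgId": org_id}
    missing = [key for key in REQUIRED_FIELDS if body[key] is None]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return json.dumps(body)
```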

GET: /numFlags

Purpose: Checks the number of times a particular user has been flagged for AI-generated text (from DB)

Expected Parameters: String userId, String orgId

Expected Output: HTTP OK, with the number of flags (int)
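
For a GET endpoint, the two parameters would typically travel in the query string. That is an assumption here, since this README does not say how /numFlags receives them. Under that assumption, the request URL could be assembled like this (num_flags_url is a hypothetical helper):

```python
from urllib.parse import urlencode


def num_flags_url(base_url, user_id, org_id):
    """Build a GET /numFlags URL, assuming userId and orgId are query parameters."""
    query = urlencode({"userId": user_id, "orgId": org_id})
    return base_url.rstrip("/") + "/numFlags?" + query
```

urlencode escapes spaces and reserved characters, so IDs containing them are still transmitted safely.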

POST: /register

Purpose: Register a new organization

Expected Parameters: LoginRequest loginRequest -- an object containing an orgId and password.

Expected Output: HTTP OK Status with a boolean showing if the organization has been successfully registered

Upon Failure: HTTP Bad Request or Internal Server Error

POST: /login

Purpose: Log in an existing organization

Expected Parameters: LoginRequest loginRequest -- an object containing an orgId and password.

Expected Output: HTTP OK Status with a boolean showing if the organization has been successfully logged in or not

Upon Failure: HTTP Bad Request or Internal Server Error, indicating that the supplied orgId or password was incorrect

Note: we have incorporated login functionality. Each valid user and organization will need to re-enter their credentials every 24 hours. This check has been integrated into each of our endpoints.
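
The 24-hour rule can be illustrated with a small time comparison. This is only a sketch of the idea, not the service's Java implementation: the function name, the use of a stored last-login timestamp, and the treatment of the exact 24-hour boundary as expired are all assumptions.

```python
from datetime import datetime, timedelta

# Lifetime stated in the note above.
SESSION_LIFETIME = timedelta(hours=24)


def session_is_valid(last_login, now):
    """Return True if a login happened less than 24 hours before `now`."""
    if last_login is None:
        # No recorded login: credentials must be entered first.
        return False
    return now - last_login < SESSION_LIFETIME


if __name__ == "__main__":
    logged_in_at = datetime(2024, 11, 28, 12, 0, 0)
    print(session_is_valid(logged_in_at, logged_in_at + timedelta(hours=23)))
    print(session_is_valid(logged_in_at, logged_in_at + timedelta(hours=25)))
```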

Running Tests

Run mvn test on the command line from the veritas directory. As of now, this runs 33 tests with high branch coverage. We have 3 suites of unit tests for the VeritasController, VeritasRecord and VeritasService classes.

Static Code Analysis

We use PMD as our static bug finder. Run mvn pmd:check from the veritas directory. Here is the report as of 11/28/2024 (reports can be found in the reports folder):

Style Check Report

We used Checkstyle to check the style of our code and generate style checking reports. Run mvn checkstyle:check from the veritas directory. Here is the report as of 11/28/2024 (reports can be found in the reports folder):

Branch Coverage Reporting

We used JaCoCo to perform branch analysis in order to see the branch coverage of the relevant code within the code base. Run mvn jacoco:report from the veritas directory and open (in browser) the index.html that gets generated at target/site/jacoco. See below for a screenshot of the output - indicating high branch coverage.

Tools used

Maven
GitHub Actions CI: Current workflow runs a Maven build + tests to make sure pushed code doesn't break
Checkstyle: Style checking
JUnit: Unit testing
PMD: Static Code Analysis
MyBatis Mapper: Allows us to perform SQL queries through Java mapper interfaces.
JaCoCo: Code coverage report generation
Postman: API testing
