Riofuad/Docubertify

Docubertify

This repository contains the source code from my college thesis entitled "Design and Development of an Android-Based Document Classification and Digitization Application Using ML Kit SDKs and BERT Algorithm".

Project Overview

This project automates the classification of two Indonesian documents, KTP (ID card) and SIM (driving license), and digitizes their content using machine learning. Google's ML Kit SDKs handle text recognition in the Android app, and a BERT model classifies the extracted text. The project consists of the Android application, a backend server that handles API requests for document classification, and the pre-trained BERT model.

Folder Structure

The repository is organized into the following main directories:

.
├── Backend Server API/   # Contains FastAPI code and API endpoints
├── Frontend Android/     # Android app code (Kotlin) with ML Kit integration
└── ML BERT Model/        # Pre-trained BERT model, training scripts, and model fine-tuning

Backend Server

This folder contains the source code for the FastAPI server. The API handles requests from the Android app, processes document text, and performs document classification using the BERT model.
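
The request flow described above can be sketched as a plain Python function. This is not the repository's code: the payload field `text`, the response fields, and `keyword_stub` (a keyword check standing in for the BERT model) are assumptions made for illustration. In a FastAPI app, a function like this would sit behind a POST route.

```python
import json
from typing import Callable, Dict


def keyword_stub(text: str) -> Dict[str, float]:
    """Placeholder for the BERT model: scores each label by keyword presence."""
    lowered = text.lower()
    # Hypothetical cue words; in the real server the model produces the scores.
    return {
        "KTP": 1.0 if "nik" in lowered else 0.0,
        "SIM": 1.0 if "surat izin mengemudi" in lowered else 0.0,
    }


def handle_classification(
    body: str,
    classify: Callable[[str], Dict[str, float]] = keyword_stub,
) -> str:
    """Parse a JSON request body, classify its text, and return a JSON response."""
    payload = json.loads(body)
    text = payload.get("text", "").strip()
    if not text:
        # Reject requests that carry no recognized text.
        return json.dumps({"error": "field 'text' is required"})
    scores = classify(text)
    label = max(scores, key=scores.get)  # label with the highest score
    return json.dumps({"label": label, "score": scores[label]})
```

Passing the classifier in as a parameter keeps the request handling testable without loading a model; the real model's prediction function would replace `keyword_stub`.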

Frontend Android

The Android application, built with Kotlin, allows users to capture or select documents (KTP/SIM), perform text recognition using ML Kit SDKs, and send extracted text to the backend API for classification.

ML BERT Model

This folder includes:

  • Datasets: CSV files containing text data for document classification (e.g., KTP and SIM).
  • JSON Dict: A dictionary for filtering and fixing typos in text extracted from the documents.
  • Jupyter Notebook (.ipynb): The notebook used for training and fine-tuning the BERT model.
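
As an illustration of how a typo dictionary like the one listed above might be applied to OCR output, here is a minimal sketch. The dictionary entries and the flat `{"wrong": "right"}` schema are assumptions; the repository's JSON file may be structured differently.

```python
import json
import re

# Hypothetical entries: common OCR confusions such as 1/I and 0/O.
TYPO_DICT_JSON = '{"PROVINS1": "PROVINSI", "G0L": "GOL"}'


def fix_typos(text: str, corrections: dict) -> str:
    """Replace every alphanumeric token that has an entry in the dictionary."""
    def replace(match):
        token = match.group(0)
        # Look up the upper-cased token; keep the original when there is no entry.
        return corrections.get(token.upper(), token)

    return re.sub(r"[A-Za-z0-9]+", replace, text)


corrections = json.loads(TYPO_DICT_JSON)
cleaned = fix_typos("PROVINS1 JAWA BARAT", corrections)
```

Matching whole tokens rather than substrings avoids rewriting parts of longer words that only happen to contain a misspelling.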

Requirements

Datasets for Fine-Tuned BERT Model

(Images: a sample dummy KTP and a sample dummy SIM.)

Real images of KTP (ID card) and SIM (driving license) documents of the Republic of Indonesia are sensitive, so the dataset used to build the fine-tuned BERT model consists of dummy KTP and SIM document images made in Figma. The dataset contains 1000 dummy images, 500 for each document type. The dataset used can be seen here.
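
With two equally sized classes, a class-balanced split keeps the validation set at the same 50/50 ratio. The sketch below is one way to do that with the standard library; the `label` column name, the 80/20 ratio, and the seed are assumptions, not values taken from the training notebook.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key="label", val_ratio=0.2, seed=42):
    """Split rows into train and validation lists, label by label."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        cut = int(len(group) * val_ratio)  # validation rows for this label
        val.extend(group[:cut])
        train.extend(group[cut:])
    return train, val


# Stand-in rows mirroring the 500 + 500 dataset described above.
rows = [{"text": f"doc {i}", "label": "KTP" if i < 500 else "SIM"} for i in range(1000)]
train_rows, val_rows = stratified_split(rows)
```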

⚠️ Disclaimer ⚠️

The dataset is intended for research purposes only. Although the document images are dummies (randomly generated, not based on data from real people), please use the KTP and SIM image dataset wisely. The author is not responsible for any violations arising from misuse of the KTP and SIM document images.

How To Run Backend Server API

Before using the app, start the API server by following these steps:

  1. Run the API locally using Uvicorn
    Start the FastAPI server on your local machine:
    uvicorn app:app --host 0.0.0.0 --port 8000
  2. Run Ngrok to expose the server using a static domain
    You need to expose your local server to the internet using Ngrok with a static domain. Generate a static domain from the Ngrok dashboard, then run the following command:
    ngrok http --domain=[static-domain] 8000
    For more information on how to generate static domains for Ngrok, you can visit this Ngrok blog post.
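
Once both commands are running, a quick way to check the tunnel is to send a request from any machine. The sketch below only builds the request. The `/classify` route and the `text` field are hypothetical names, so substitute the route and payload the server actually defines.

```python
import json
import urllib.request


def build_classify_request(base_url: str, text: str) -> urllib.request.Request:
    """Build a JSON POST request; '/classify' is a hypothetical route name."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/classify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send it, pass the static domain used in the Ngrok command:
# request = build_classify_request("https://[static-domain]", "NIK 1234")
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.loads(response.read()))
```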

Android App Preview

The preview images show the following screens:

  • MainActivity
  • Select Image (Camera Option)
  • Select Image (Gallery Option)
  • Crop Image
  • ResultActivity
  • ResultActivity (Loading State)
  • ResultActivity (Bottom Sheet Fragment Show Half)
  • KtpBottomSheetFragment (Bottom Sheet Fragment KTP)
  • SimBottomSheetFragment (Bottom Sheet Fragment SIM)