Be My Voice (BMV)

Cutting-Edge Video ArSL Recognition

Overview 📹

This repository contains state-of-the-art models and techniques for Arabic Sign Language (ArSL) recognition from video inputs. Our goal is to leverage deep learning and computer vision to build accurate, efficient models capable of interpreting ArSL in real time.

Features ✨

  • End-to-End ArSL Recognition: From raw video inputs to ArSL word predictions.
  • Real-Time Inference: Optimized models for low latency, suitable for real-world applications.
  • Custom Deep Learning Models: Novel convolutional and transformer-based architectures specifically designed for ArSL recognition.
  • Data Augmentation: Extensive data augmentation techniques to handle variations in lighting, background, and signer differences (a small illustrative sketch follows this list).
  • Custom Datasets: Incorporates multiple ArSL datasets, including custom-built datasets for specialized gestures.
  • Visualization Tools: Tools for visualizing model predictions, keypoints, and attention maps.
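
As a rough illustration of the lighting and contrast variation mentioned above, the hypothetical sketch below applies per-frame photometric jitter with tf.keras preprocessing layers. The layer choices and factors are assumptions for illustration, not the repository's actual pipeline.

# Hypothetical photometric jitter for a video clip (not the repo's code).
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomBrightness(0.2, value_range=(0, 1)),  # TF >= 2.9
    tf.keras.layers.RandomContrast(0.2),
])

frames = tf.random.uniform((8, 224, 224, 3))  # one clip: 8 RGB frames in [0, 1]
aug_frames = augment(frames, training=True)   # each frame jittered independently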

Table of Contents 📚

  • Installation 💻
  • Usage 🚀
  • Datasets 📦
  • Models 🧠
  • Training 🎓
  • Evaluation 📊
  • Issues 🤝
  • License 📄
  • Contact 📱

Installation 💻

To get started, clone the repository and install the required dependencies:

git clone https://github.com/OsamaM0/BMV-AI.git
cd BMV-AI
pip install -r requirements.txt

Usage 🚀

Training and Visualization

The training and visualization code is not yet included in this repository. To train a model from scratch or reproduce the visualizations, please contact the maintainer or open an issue.

Datasets 📦

This project builds upon our collected ArSL dataset:

  • Custom Dataset: Custom datasets that we built and use, created with the aid of the Faculty of Disabilities and Rehabilitation Sciences.

Models 🧠

The repository includes a custom model architecture optimized for ArSL recognition:

Custom Convolutional and Transformer-based Model

Our model leverages the strengths of both Convolutional Neural Networks (CNNs) and Transformer-based architectures to efficiently recognize ArSL from video sequences.

  • Efficient Channel Attention (ECA): Enhances feature representations by recalibrating the importance of different channels in the input tensor, improving the model's ability to focus on significant features.
  • Causal Dilated Depthwise Convolution (CausalDWConv1D): Captures temporal dependencies in sequential data, ensuring that future information is not used in the prediction of the current step.
  • Conv1D Blocks: Efficiently process temporal sequences by combining depthwise convolutions with attention mechanisms, expanding and contracting the feature space.
  • Multi-Head Self-Attention: Implements self-attention to dynamically weigh the importance of different parts of the input sequence, capturing long-range dependencies.
  • Transformer Blocks: Further refine the sequential data by stacking multi-head self-attention layers and fully connected layers, with residual connections and layer normalization for stability.

Each component is designed and fine-tuned specifically for ArSL data to maximize accuracy and efficiency in real-time applications.
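
The sketch below shows how these blocks can fit together. It is a minimal, illustrative reconstruction in tf.keras, not the repository's exact code: the dimensions (dim=192, kernel_size=17, num_heads=4), the 156-feature landmark input, and the 250-word output vocabulary are all assumptions chosen for the example.

# Illustrative reconstruction of the described blocks (assumed shapes/sizes).
import tensorflow as tf
from tensorflow.keras import layers

class ECA(layers.Layer):
    """Efficient Channel Attention: gates channels via a small 1D conv."""
    def __init__(self, kernel_size=5, **kwargs):
        super().__init__(**kwargs)
        self.conv = layers.Conv1D(1, kernel_size, padding="same", use_bias=False)

    def call(self, inputs):
        x = tf.reduce_mean(inputs, axis=1)       # pool over time: (batch, channels)
        x = self.conv(tf.expand_dims(x, -1))     # 1D conv across channels
        gate = tf.sigmoid(tf.squeeze(x, -1))     # per-channel weights
        return inputs * tf.expand_dims(gate, 1)  # rescale every time step

def causal_dw_conv1d(x, kernel_size=17, dilation_rate=1):
    """Depthwise conv that sees only past frames (left padding, no future)."""
    pad = dilation_rate * (kernel_size - 1)
    x = layers.ZeroPadding1D(padding=(pad, 0))(x)
    return layers.DepthwiseConv1D(               # requires TF >= 2.9
        kernel_size, dilation_rate=dilation_rate,
        padding="valid", use_bias=False)(x)

def conv1d_block(x, dim=192, kernel_size=17, expand=2):
    """Expand -> causal depthwise conv -> ECA -> project, with a residual."""
    skip = x
    x = layers.Dense(dim * expand, activation="swish")(x)
    x = causal_dw_conv1d(x, kernel_size)
    x = layers.BatchNormalization()(x)
    x = ECA()(x)
    x = layers.Dense(dim)(x)
    return layers.Add()([skip, x])

def transformer_block(x, dim=192, num_heads=4, expand=2):
    """Pre-norm multi-head self-attention plus feed-forward, both residual."""
    attn = layers.LayerNormalization()(x)
    attn = layers.MultiHeadAttention(num_heads, dim // num_heads)(attn, attn)
    x = layers.Add()([x, attn])
    ffn = layers.LayerNormalization()(x)
    ffn = layers.Dense(dim * expand, activation="swish")(ffn)
    ffn = layers.Dense(dim)(ffn)
    return layers.Add()([x, ffn])

# Toy assembly: (frames, 156 landmark features) -> 250-word softmax.
inputs = layers.Input(shape=(None, 156))
x = layers.Dense(192, use_bias=False)(inputs)
x = conv1d_block(x)
x = conv1d_block(x)
x = transformer_block(x)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(250, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

Placing the convolutional blocks before the transformer blocks keeps early temporal modeling cheap, while the attention layers capture the longer-range dependencies described above.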

Training 🎓

The training process is highly configurable:

  • Config Files: Use YAML configuration files for flexible experiment setups.
  • Hyperparameter Tuning: Easily adjust learning rates, batch sizes, and other hyperparameters.
  • Custom Architectures: Train our state-of-the-art custom models specifically designed for ArSL.
  • Multi-GPU Support: Train models using multiple GPUs for faster results.
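
A hedged sketch of what a config-driven, multi-GPU run can look like is shown below; the config keys, values, and placeholder network are assumptions, not the repository's actual schema.

# Hypothetical config-driven training setup; the YAML schema is illustrative.
import tensorflow as tf
import yaml

cfg = yaml.safe_load("""
lr: 0.001
batch_size: 64
epochs: 100
""")  # stand-in for reading e.g. configs/baseline.yaml from disk

# Multi-GPU data parallelism: one model replica per visible GPU.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # Tiny placeholder network; substitute the ArSL model sketched above.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(None, 156)),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(250, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(cfg["lr"]),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"])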

Evaluation 📊

Evaluate the performance of the trained models:

python evaluate.py --dataset_path path/to/dataset --model_path path/to/model.pth

  • Metrics: Accuracy, precision, recall, F1-score, and more.
  • Confusion Matrix: Generate a confusion matrix to analyze model performance.
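
The same numbers can be reproduced offline with scikit-learn; below, y_true and y_pred are dummy stand-ins for the labels and predictions that evaluate.py would produce.

# Hedged sketch of the listed metrics; label arrays are dummy stand-ins.
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

y_true = np.array([0, 1, 2, 2, 1, 0])  # ground-truth word ids
y_pred = np.array([0, 2, 2, 2, 1, 0])  # model predictions

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
print(confusion_matrix(y_true, y_pred))  # rows = true class, cols = predicted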

Issues 🤝

Feel free to open an issue if you encounter any problems or have questions.

License 📄

This project is licensed under the MIT License. See the LICENSE file for details.

Contact 📱

Facebook Telegram LinkedIn