AndroidMalwareSleuth is a user-friendly interface for analyzing Android applications for potential malware. The UI uses TensorFlow's Random Forest classifier to analyze API calls, Intents, and Permissions from Android apps and provides a detailed report on their safety.
- Clone the repository:
git clone https://github.com/rakshith111/AndroidMalwareSleuth
- Install the required dependencies:
pip install -r requirements.txt
OR create a virtual environment and install the dependencies:
conda env create -f ml.yml
- Build headers for the dataset (Chimeradataset)
⚠️ IMPORTANT ⚠️: Follow the steps here to build the headers for the dataset. Make sure you have Docker installed on your machine. Java JDK 8 is suggested for JADX to work properly.
- Run the following command to get started:
python src\web_launcher.py
- Open the UI in your browser if it doesn't open automatically:
http://localhost:7555/
Refer to Screenshots for a preview of the WebUI, which is built with Streamlit.
- In-depth Analysis: Utilizes machine learning techniques to analyze Android applications for potential threats.
- User-friendly Interface: Simple and intuitive design for easy navigation and usage.
- Detailed Reports: Provides comprehensive reports about the safety of the analyzed Android applications.
| Dataset | Confidence Score | Confidence Percentage |
| --------------------------------------| -----------------| ----------------------|
| mobfs | 0.79 | Malware Confidence @ 79.33% |
| chimera_intents | 0.03 | Benign Confidence @ 97.00% |
| chimera_permissions | 0.09 | Benign Confidence @ 91.33% |
| chimera_merged | 0.06 | Benign Confidence @ 94.33% |
Final Class: Benign Confidence @ 75.83% (Score: 0.24)
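The final line is consistent with averaging the four per-model scores and reading the result against a 0.5 cutoff: a score at or above the cutoff is reported as malware confidence, and a score below it as benign confidence of one minus the score. The sketch below reproduces the numbers in the table under that assumption. The averaging rule, the 0.5 threshold, and the helper name `label` are inferred for illustration and are not confirmed by the project's code; the four scores are back-computed from the percentages shown above.

```python
def label(score, threshold=0.5):
    """Turn a malware score in [0, 1] into a report string (assumed rule)."""
    if score >= threshold:
        return f"Malware Confidence @ {score * 100:.2f}%"
    return f"Benign Confidence @ {(1 - score) * 100:.2f}%"


# Scores back-computed from the percentages in the table above.
scores = {
    "mobfs": 0.7933,
    "chimera_intents": 0.0300,
    "chimera_permissions": 0.0867,
    "chimera_merged": 0.0567,
}

# Assumption: the final score is the plain mean of the per-model scores.
final_score = sum(scores.values()) / len(scores)

for name, score in scores.items():
    print(f"{name}: {score:.2f} -> {label(score)}")
print(f"Final Class: {label(final_score)} (Score: {final_score:.2f})")
```

Under this reading, one model flagging the app as malware is outvoted by three models that score it as benign, which is why the final class stays benign with a lower confidence than any single benign model reports.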
- Test 1: Train model on CICMalDroid dataset @ here
- Test 2: Train model on Chimaera dataset for Permissions @ here
- Test 3: Train model on Chimaera dataset for Intents @ here
- Test 4: Train model on Chimaera dataset for Intents + Permissions @ here
This was done to check whether the accuracy of the model improves when the dataset is separated by feature type (Intents, Permissions) or kept merged.
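To make the separated versus merged comparison concrete, here is a minimal sketch of how an app's Permissions and Intents could be encoded as binary feature vectors against a fixed list of dataset headers. The header lists, the sample app, and the helper name `encode` are hypothetical and chosen only for illustration; the project's actual headers and extraction pipeline may differ.

```python
def encode(headers, present):
    """Return a 0/1 vector with one entry per header, 1 if the app has it."""
    present = set(present)
    return [1 if header in present else 0 for header in headers]


# Hypothetical dataset headers (feature columns).
permission_headers = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
]
intent_headers = [
    "android.intent.action.BOOT_COMPLETED",
    "android.intent.action.MAIN",
]

# Hypothetical features extracted from one app.
app_permissions = ["android.permission.INTERNET", "android.permission.SEND_SMS"]
app_intents = ["android.intent.action.MAIN"]

# Separated: one vector per feature type, each fed to its own model.
permission_vector = encode(permission_headers, app_permissions)
intent_vector = encode(intent_headers, app_intents)

# Merged: a single vector over the concatenated headers, fed to one model.
merged_vector = encode(permission_headers + intent_headers,
                       app_permissions + app_intents)

print(permission_vector, intent_vector, merged_vector)
```

The merged vector is simply the two separated vectors side by side, so the experiment isolates whether a single model over all columns does better than models trained on each feature type alone.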
- MobSF for API rules
- libsast for feature extraction
- JADX for deobfuscation and decompilation
- Chimaera
- CICMalDroid
- Drebin: Drebin dataset, modified to make it suitable for training
- TensorFlow for the Random Forest classifier
Samaneh Mahdavifar, Andi Fitriah Abdul Kadir, Rasool Fatemi, Dima Alhadidi, and Ali A. Ghorbani (2020). Dynamic Android Malware Category Classification using Semi-Supervised Deep Learning, The 18th IEEE International Conference on Dependable, Autonomic, and Secure Computing (DASC), Aug. 17-24, 2020.
Samaneh Mahdavifar, Dima Alhadidi, and Ali A. Ghorbani (2022). Effective and Efficient Hybrid Android Malware Classification Using Pseudo-Label Stacked Auto-Encoder, Journal of Network and Systems Management 30 (1), 1-34.
Build your own model and test it
- Choose a model from the [models](Models) folder
- Modify the Dockerfile to use the chosen model
- Build the image from the Dockerfile:
docker build -t your_image_name .
- Run the Docker image:
docker run -p 8500:8500 -p 8501:8501 -t your_image_name
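Ports 8500 and 8501 are the default gRPC and REST ports of TensorFlow Serving, so the sketch below assumes the image serves the model through TensorFlow Serving's REST predict endpoint. That assumption is not stated in this README; the model name `my_model` and the sample feature row are likewise hypothetical placeholders. The code only builds the request, and the commented line at the end shows how it could be sent once the container is running.

```python
import json
from urllib import request


def build_predict_request(model_name, feature_rows, host="localhost", port=8501):
    """Build a POST request for TensorFlow Serving's REST predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": feature_rows}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical model name and a single hypothetical binary feature row.
req = build_predict_request("my_model", [[0, 1, 0, 1]])
print(req.full_url)

# With the container running, the request could be sent like this:
# with request.urlopen(req) as response:
#     print(json.load(response))
```

The length and order of each feature row must match the headers the chosen model was trained on; otherwise the server is likely to reject the request or return meaningless scores.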
docker build -t <name>:<TAG> .
docker tag <name>:<TAG> <UserID>/<name>:<TAG>
docker push <UserID>/<name>:<TAG>
Contributions are welcome! If you find any bugs or have suggestions for new features, please open an issue or submit a pull request.
Although this project is released under the GPL-3.0 license, we kindly request that you refrain from using this code for commercial purposes. This is a research project and we would like to keep it that way. Thank you for understanding.