GitHub - saral7293/Malicious-Vs-Benign-Pdf-Classification: This repo determines whether a PDF file, represented in JSON format, contains a virus by classifying it as Malicious or Benign using classification algorithms.

About

This project was part of my college Cybersecurity coursework, in which I had to build Machine Learning models to classify a PDF file in JSON format as Malicious or Benign, i.e. whether or not it contains a virus.

Data

The data was provided to me in JSON format and is stored in the features folder under two categories: Malicious and Benign. It contains 10,000 files in total, divided equally between the two categories.
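
As a rough illustration of how data laid out this way could be read, here is a minimal sketch using only the Python standard library. The subfolder names `Malicious` and `Benign`, the `.json` file extension, the 1/0 label encoding, and the helper name `load_labelled_json` are all assumptions made for this example; the notebook in this repo may parse the files differently.

```python
import json
from pathlib import Path


def load_labelled_json(features_dir):
    """Read JSON files from two category subfolders and return (records, labels).

    Assumed layout (hypothetical): <features_dir>/Malicious/*.json and
    <features_dir>/Benign/*.json. Labels: 1 = malicious, 0 = benign.
    """
    records, labels = [], []
    for folder_name, label in (("Malicious", 1), ("Benign", 0)):
        for path in sorted(Path(features_dir, folder_name).glob("*.json")):
            try:
                with path.open(encoding="utf-8") as fh:
                    record = json.load(fh)
            except (json.JSONDecodeError, UnicodeDecodeError):
                # Skip files that cannot be parsed instead of aborting the load.
                continue
            records.append(record)
            labels.append(label)
    return records, labels
```

Calling `load_labelled_json("features")` would return two parallel lists that can then be turned into a feature table. Skipping unparseable files is one possible design choice; logging or counting them may be preferable when investigating data quality.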

Report

A report on how the data was parsed, the challenges faced, EDA insights, and the different ML models built, with a comparison of performance metrics for each.
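
To make the idea of comparing models by performance metrics concrete, here is a small sketch in plain Python. The choice of metrics (accuracy, precision, recall, F1), the 1/0 label encoding, and the function names `binary_metrics` and `compare_models` are assumptions for illustration; the report may use other metrics or a library implementation.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1, treating label 1 as malicious."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def compare_models(y_true, predictions_by_model):
    """Map each (hypothetical) model name to its metrics on the same labels."""
    return {name: binary_metrics(y_true, preds)
            for name, preds in predictions_by_model.items()}
```

For a malware detector, recall on the malicious class is often worth reading alongside accuracy, since a missed malicious file is usually costlier than a false alarm.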

Folders and files

Cybersecurity project
.gitattributes
PDF Classification ML models.ipynb
README.md

History: 5 commits
