Simple Classification Problem-labelling emails as spam or non spam based on content. dataset from UCI machine learning repository.

akhapwoyaco/data-science-ml-projects

1 Introduction

Computer systems and networks are often under attack from intruders, who masquerade under different guises to avoid detection. Their modes of operation change over time as technology advances, both in general and within individual systems and networks, which makes early detection of intrusions harder. There is no complete assurance that vulnerabilities within systems can be detected early or fixed in time, since detection and risk assessment are continuous processes. Moreover, patching vulnerabilities in computer systems and networks does not mean that everything is fixed, because modes of operation keep changing based on a number of factors.

According to Lipson (2002, p. 7), computer systems acting as hosts, connected via wired or wireless links, communicate through the standardized rules known as Transmission Control Protocol/Internet Protocol (TCP/IP), with data carried as the contents of IP packets.

1.1 Context and Motivation

Attacks and intrusions may lead to unauthorized access that can go unnoticed for some time, as well as to infection, modification and even shutdown of systems and networks. Changes in the systems often go unnoticed until complaints are raised about the efficiency of the systems and networks, as argued by Lai & Hsia (2007). Successful transmission of IP packets to their destination, without interception and within a specified time, is highly desired. However, as more systems join a network, securing them completely becomes complex. Risk evaluation then encompasses, but is not limited to, how to rate the risk level of intrusions, the vulnerability and its type, the host systems facilitating the intrusion and their state, the level of security offered to users of the systems, and the extent and effectiveness of vulnerability patching.

1.2 Problem Statement

In a realistic setting, not all intrusions and vulnerabilities are easy to detect or to resolve in time. The flow of packets through the systems can provide information on intrusion, since certain events may deviate from the overall norm. Guiding our analysis is the Disc Consulting (DCE) data on IP packet event records, in which malicious and non-malicious event records within their networks and computer systems have been identified. The attacks are more sophisticated than those previously experienced, and there is no clue as to the methods used for intrusion.

A suggested approach is to incorporate a real-time threat detection system based on machine learning classification models that categorize an event record as either malicious or not, based on IP packet attributes. Performance measures and model selection will be assessed in line with the scope of the DCE objectives.
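To make "performance measures" concrete, the following is a minimal sketch of how predictions from such a binary classifier could be scored. It is not the project's actual evaluation code: the label names `malicious` and `benign`, the helper functions and the sample labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="malicious"):
    """Count true/false positives and negatives for one positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_report(y_true, y_pred, positive="malicious"):
    """Compute common binary classification measures as a dict."""
    c = confusion_counts(y_true, y_pred, positive)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "false_positive_rate": safe_div(fp, fp + tn),
    }


# Made-up labels for six event records (illustration only, not DCE data).
y_true = ["malicious", "benign", "benign", "malicious", "benign", "malicious"]
y_pred = ["malicious", "benign", "malicious", "benign", "benign", "malicious"]

report = classification_report(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

Which measure matters most depends on the objectives: in intrusion detection a missed attack (low recall) and a false alarm (high false positive rate) usually carry different costs, so accuracy alone may be misleading when malicious events are rare.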
