- 🏢 This work is part of the MOSAICO PROJECT
- 🇫🇷 Copyright (c) 2022 Centre National de la Recherche Scientifique. All Rights Reserved.
- ✒️ Author: GRAFF Philippe
- 🔗 The DT classifier was trained with `pcap` traces available on this website link.
- 📂 The `Utils` folder is used by our program
  - 👉 `Approx`: division approximation
  - 👉 `LoadRulesDT`: export trained DT in CSV
  - 👉 `Visualize`: visualize features & classification results
- ⚠️ Needed libraries:
  - You need Docker to make it work
  - Python libraries:
    - scikit-learn
    - graphviz
    - joblib
- 📂 The `includes` folder contains third-party software
  - Some of it comes from the P4 repository
  - `tcpreplay-x.y.z` comes from tcpreplay 🔗
- P4 program recognizing CG traffic
  - 33 ms windows
  - Features are computed when the window is over
  - Features are stored inside the packet closing the window (metadata)
  - If the prediction is CG 👉 the CG bit is set to `1`
- In `visualize`, we read the P4 logfile
  - Displays the 12 features on graphs
  - Prints the CG percentage (ratio of CG reports to all reports)
- Approximates a division
  - With 3 inverse powers of 2
    - $1/Y \approx 1/2^a + 1/2^b + 1/2^c$
    - `X/Y ~ (X >> a) + (X >> b) + (X >> c)`
  - For Y in [1; 4095] (`2**12-1`)
  - Will be used for mean and std calculation

‼️ We always have 3 inverse powers of 2 (👉 $+ 1/2^{255}$ when a term is not necessary)

‼️ The same table is present twice (a table cannot be looked up twice)
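The approximation above can be sketched in Python. This is only an illustration: the exhaustive search, the exponent bound `MAX_EXP` and the function names are assumptions, and the actual `Approx` program may pick the three exponents differently.

```python
from itertools import combinations_with_replacement

FILLER = 255   # unused term: X >> 255 is 0 for any X below 2**255
MAX_EXP = 16   # assumed search bound, enough for Y up to 4095

def best_shifts(y):
    """Return (a, b, c) minimising |1/y - (2^-a + 2^-b + 2^-c)|."""
    candidates = list(range(MAX_EXP + 1)) + [FILLER]
    target = 1.0 / y
    best_err, best_triple = None, None
    for a, b, c in combinations_with_replacement(candidates, 3):
        err = abs(target - (2.0 ** -a + 2.0 ** -b + 2.0 ** -c))
        if best_err is None or err < best_err:
            best_err, best_triple = err, (a, b, c)
    return best_triple

def approx_div(x, y):
    """Approximate x // y with three right shifts, as a P4 target would."""
    a, b, c = best_shifts(y)
    return (x >> a) + (x >> b) + (x >> c)

print(approx_div(100, 4))  # 25: a single shift is enough, the other two are fillers
```

In the P4 program the triple would come from a table lookup on Y (the content of `out.csv`) rather than from a search at run time.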
- Loads CSV file `FTR_Ruls.csv`
  - In `LoadRulesDT`
  - More details: 👉 in this folder
- Represents a feature with an index
  - Function of the thresholds of that feature in the DT

    | Feat_i | 0 | 37 | 42 | 58 | 63 |
    |-|-|-|-|-|-|

  - If Feat_i = 41
    - 37 < 41 < 42
    - Then index = 2
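As a rough illustration of this mapping, the index can be seen as the number of thresholds that are less than or equal to the feature value. The helper name `feature_index` and the handling of values exactly equal to a threshold are assumptions; the P4 table may treat boundaries differently.

```python
from bisect import bisect_right

def feature_index(value, thresholds):
    """Count the thresholds <= value (thresholds must be sorted)."""
    return bisect_right(thresholds, value)

thresholds = [0, 37, 42, 58, 63]       # thresholds of Feat_i from the table above
print(feature_index(41, thresholds))   # 2, since 37 < 41 < 42
```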
- Loads CSV file `ACT_Ruls.csv`
  - In `LoadRulesDT`
  - More details: 👉 in this folder
- One rule per leaf
  - Given the features' indexes (cf. II.c), find the matching leaf
  - Return the label associated with that leaf
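The leaf lookup can be pictured as a range match on the feature indexes. The sketch below uses two made-up features and three made-up leaves; the rule layout, the names `RULES` and `classify`, and the labels are all hypothetical and do not reflect the real content of `ACT_Ruls.csv`.

```python
# Hypothetical rules: one per leaf, each with an inclusive index range per feature
# and the label of that leaf (1 = CG, 0 = not CG).
RULES = [
    ({"f0": (0, 1), "f1": (0, 5)}, 0),
    ({"f0": (2, 5), "f1": (0, 2)}, 1),
    ({"f0": (2, 5), "f1": (3, 5)}, 0),
]

def classify(indexes, rules=RULES):
    """Return the label of the first leaf whose ranges contain all indexes."""
    for ranges, label in rules:
        if all(lo <= indexes[name] <= hi for name, (lo, hi) in ranges.items()):
            return label
    return None  # no leaf matched

print(classify({"f0": 2, "f1": 1}))  # 1
```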
- Limitation to UDP traffic (`hdr.udp.isValid`: ¬UDP => ¬CG)
- Get to know the traffic direction (function of the IPv4 prefix)
- Compute the hash: an index ind of the conversation in the registers
- If end of window or collision:
  - Retrieve data UP & DOWN
  - Do the divisions
  - Classify
  - Initialize
- If window is not over:
  - Update values
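The steps above can be sketched as a small Python simulation of the per-packet decision. Everything named here is an assumption made for illustration: the client prefix `CLIENT_NET`, the number of slots `NUM_SLOTS`, the CRC32 hash and the function `process_packet` are not taken from the P4 source.

```python
import ipaddress
from zlib import crc32

WINDOW_US = 33_000                                  # 33 ms window, in microseconds
NUM_SLOTS = 1024                                    # hypothetical register size
CLIENT_NET = ipaddress.ip_network("10.0.0.0/8")     # hypothetical client prefix

def process_packet(state, is_udp, ip_src, ip_dst, ts_us):
    """Return (action, direction) for one packet."""
    if not is_udp:
        return "skip", None                         # not UDP => not CG
    # Direction is a function of the IPv4 prefix of the sender.
    direction = "UP" if ipaddress.ip_address(ip_src) in CLIENT_NET else "DOWN"
    # Same key for both directions, so one conversation maps to one slot.
    key = tuple(sorted((ip_src, ip_dst)))
    ind = crc32("|".join(key).encode()) % NUM_SLOTS
    slot = state.get(ind)
    if slot is None:
        state[ind] = {"key": key, "start": ts_us}
        return "init", direction
    collision = slot["key"] != key
    window_over = ts_us - slot["start"] >= WINDOW_US
    if collision or window_over:
        # The P4 program would retrieve UP/DOWN data, divide and classify here.
        state[ind] = {"key": key, "start": ts_us}   # re-initialize the slot
        return "classify", direction
    return "update", direction                      # window still open

state = {}
print(process_packet(state, True, "10.0.0.1", "1.2.3.4", 0))       # ('init', 'UP')
print(process_packet(state, True, "1.2.3.4", "10.0.0.1", 10_000))  # ('update', 'DOWN')
print(process_packet(state, True, "10.0.0.1", "1.2.3.4", 40_000))  # ('classify', 'UP')
```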
Associate a conversation (IP1 <-> IP2) with its data.
- 16 registers
  - 2 global registers
  - 7 downlink registers
  - 7 uplink registers
- Global registers (2):
  - Links ind to the conversation key (IP1+IP2)
  - Start of the current temporal window
- Directional registers (7x2):
  - Number of elements (in the current window)
  - Sum of packet sizes
  - Last mean of packet sizes
  - Sum of squared deviations $(size - LastMean)^2$
  - Last timestamp (👉 compute IAT)
  - Last mean of IATs
  - Sum of squared deviations $(iat - LastMean)^2$
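To show how the size-related directional registers could work together, here is a minimal sketch. The class `DirectionStats` and its method names are hypothetical, the way the variance is derived from the registers is an assumption, and plain integer division stands in for the shift-based approximation. The IAT registers would follow the same pattern, with timestamp differences in place of sizes.

```python
class DirectionStats:
    """Size statistics for one direction (UP or DOWN) of one conversation."""

    def __init__(self):
        self.count = 0        # number of elements in the current window
        self.size_sum = 0     # sum of packet sizes
        self.last_mean = 0    # mean of the previous window
        self.sq_dev_sum = 0   # sum of (size - last_mean)^2

    def update(self, size):
        """Called for each packet while the window is open."""
        self.count += 1
        self.size_sum += size
        dev = size - self.last_mean
        self.sq_dev_sum += dev * dev

    def close_window(self):
        """Compute mean and variance, then reset for the next window."""
        if self.count == 0:
            return None
        mean = self.size_sum // self.count
        var = self.sq_dev_sum // self.count   # deviations are taken from last_mean
        self.last_mean = mean
        self.count = self.size_sum = self.sq_dev_sum = 0
        return mean, var

stats = DirectionStats()
for size in (100, 200, 300):
    stats.update(size)
print(stats.close_window())  # (200, 46666): first window, last_mean was still 0
```

Using the previous window's mean avoids a division per packet: the only divisions happen when the window closes.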
- 3 values linked to global registers
  - Index ind
  - Collision (2 conversations have the same ind)
  - End of window
- 7 values read in the directional registers (UP or DOWN)
- 12 features computed (end of the window)
- 12 indexes of the features (end of the window)
- Is CG (end of the window)
- Go to the folder `/Utils/Approx`
- Call `make`
- Execute `a.out` and type 4095
- It will generate `out.csv`. `out.csv` is used by the main program.
- 👍 There is a Readme if necessary
- Go to the folder `/Utils/LoadRulesDT`
- Type `python3 dum2cond.py`
- It will generate `ACT_Ruls.csv` and `FTR_Ruls.csv`
- `ACT_Ruls.csv` and `FTR_Ruls.csv` are used by the main program.
- 👍 There is a Readme if necessary
- Go to the folder `P4-Classifier/`
- Call `sudo make` ⚠️ You need root privileges
  - It compiles everything
- Instantiate a host
  - Open a new terminal in `P4-classifier/`
  - Call `sudo make h1`
- Type `./Host.sh INTERF PATH_TO_PCAP` from a host
  - 💡 You need to store a pcap file in `Data/`
  - Type `x.y.z` when asked for the tcpreplay version (Software Used / Installation)
  - It will generate traffic
- Go to the folder `Utils/Visualize`
- Execute `FtsFromLog.sh`
- You will be asked for a conversation number
  - Look at `out.csv` 👉 last column = identifier
  - Type the most frequent identifier
- You will be shown the % of CG recognition
- There will be some graphs in `Utils/Visualize/Out/`
Made with ❤️ by GRAFF Philippe