This project parses Common Log Format logs into a readable format and then outputs the fields the user selects. The initial regular expression implementation separates each log entry into 8 fields: IP_of_requesting_host, Remote_user, Timestamp, Request_from_client, HTTP_response_code, Size_of_bytes_returned, Http_referer, and Http_user_agent. A second script then takes the .json file created in the previous step and outputs the information according to the user's choice. The final report is linked in the References section.
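To make the regular expression step concrete, here is a minimal sketch of how one log line could be split into the eight fields named above. It is not the pattern that log2json uses. The names `LOG_PATTERN` and `parse_line` are hypothetical, and the sketch assumes each line also carries the quoted referer and user-agent values after the byte count.

```python
import json
import re

# Hypothetical pattern (an assumption, not log2json's own expression).
# Named groups map directly onto the eight output fields.
LOG_PATTERN = re.compile(
    r'(?P<IP_of_requesting_host>\S+) '
    r'\S+ '  # identd field, usually "-"; not one of the eight fields
    r'(?P<Remote_user>\S+) '
    r'\[(?P<Timestamp>[^\]]+)\] '
    r'"(?P<Request_from_client>[^"]*)" '
    r'(?P<HTTP_response_code>\d{3}) '
    r'(?P<Size_of_bytes_returned>\d+|-) '
    r'"(?P<Http_referer>[^"]*)" '
    r'"(?P<Http_user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of the eight fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Illustrative sample line, not taken from the project's data.
    sample = (
        '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
        '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
    )
    print(json.dumps(parse_line(sample), indent=2))
```

Returning `None` for lines that do not match lets the caller decide whether to skip or report malformed entries.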
The tools used and skills learned are the following:
- Used log2json to parse the initial Common Log Format logs.
- Used Python to create a second script that takes the .json file created in the previous step and outputs the information according to the user's choice (8 categories/fields: IP_of_requesting_host, Remote_user, Timestamp, Request_from_client, HTTP_response_code, Size_of_bytes_returned, Http_referer, and Http_user_agent).
- Learned how to pull GitHub repositories.
- Understood the concept of Common Log Format and how it relates to cybersecurity.
- Learned how to read Common Log Format files.
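
The field-selection step described in the list above could look roughly like the following sketch. It is an illustration, not the project's actual script. It assumes the .json file holds a list of objects keyed by the eight field names, and the names `FIELDS`, `load_records`, and `select_fields` are hypothetical.

```python
import json

# The eight field names listed in this document.
FIELDS = [
    "IP_of_requesting_host",
    "Remote_user",
    "Timestamp",
    "Request_from_client",
    "HTTP_response_code",
    "Size_of_bytes_returned",
    "Http_referer",
    "Http_user_agent",
]


def load_records(path):
    """Load parsed log entries; assumes the file holds a JSON list of objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def select_fields(records, wanted):
    """Keep only the requested fields from each record, in the order requested."""
    unknown = [name for name in wanted if name not in FIELDS]
    if unknown:
        raise ValueError(f"Unknown field(s): {', '.join(unknown)}")
    # A field missing from a record is reported as None rather than raising.
    return [{name: record.get(name) for name in wanted} for record in records]


if __name__ == "__main__":
    # In-memory stand-in for load_records("some_file.json"); values are illustrative.
    records = [
        {
            "IP_of_requesting_host": "127.0.0.1",
            "Timestamp": "10/Oct/2000:13:55:36 -0700",
            "HTTP_response_code": "200",
        }
    ]
    chosen = ["IP_of_requesting_host", "HTTP_response_code"]
    for row in select_fields(records, chosen):
        print(row)
```

Validating the requested names against `FIELDS` up front means a misspelled field name produces an error rather than a silent column of empty values.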