BigSnarf blog
Infosec FTW
Monthly Archives: May 2012
iPython Notebook pandas data analysis of web logs and auth logs
Posted on May 28, 2012
Get code here:
https://github.com/dgleebits/PythonSystemAdminTools/blob/master/pandasAuthLogAnalysis.ipynb
Get sample attack data set here:
http://honeynet.org/files/sanitized_log.zip
Thanks to Vincent for testing the code and helping out with the screenshots.
Influences
Using pandas to report on Apache web logs
Posted on May 28, 2012
So I got this new book:
Step 1 – Start with this Forensic Challenge dataset:
http://honeynet.org/files/sanitized_log.zip
Step 2 – Build program without pandas:
#!/usr/bin/python
'''This program takes in an Apache www-media.log and provides a basic report.'''
from collections import Counter

ipAddressList = []
methodList = []
requestedList = []
referalList = []

# Split each line on whitespace; the indices match the Apache combined log format.
with open('www-media.log') as logfile:
    for line in logfile:
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        ipAddressList.append(fields[0])
        methodList.append(fields[5])
        requestedList.append(fields[6])
        referalList.append(fields[10])

count_ip = Counter(ipAddressList)
count_requested = Counter(requestedList)
count_method = Counter(methodList)
count_referal = Counter(referalList)

# most_common() returns (value, count) pairs, most frequent first.
print(count_ip.most_common())
print(count_requested.most_common())
print(count_method.most_common())
print(count_referal.most_common())
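The index numbers in Step 2 are easier to follow next to a concrete line. Here is a small sketch of how a whitespace split lines up with the fields. It assumes the Apache combined log format. The sample line is made up for illustration and is not taken from the honeynet dataset. The helper name `parse_line` is hypothetical, and stripping the quote characters is an extra touch that the program above does not do.

```python
from collections import Counter

# Made-up sample line in Apache combined log format (not from the dataset).
SAMPLE = ('192.0.2.10 - - [28/May/2012:10:15:32 -0700] '
          '"GET /index.html HTTP/1.1" 200 2326 '
          '"http://example.com/start" "Mozilla/5.0"')


def parse_line(line):
    """Return (ip, method, requested, referer), or None for short lines."""
    fields = line.split()
    if len(fields) < 11:
        return None
    # 0 = client IP, 5 = method with its opening quote ('"GET'),
    # 6 = requested path, 10 = quoted referer
    return (fields[0],
            fields[5].lstrip('"'),
            fields[6],
            fields[10].strip('"'))


parsed = parse_line(SAMPLE)
print(parsed)

# Count IPs over a tiny list of lines; the malformed one is skipped.
lines = [SAMPLE, SAMPLE, 'too short']
rows = [row for row in map(parse_line, lines) if row is not None]
ip_counts = Counter(row[0] for row in rows)
print(ip_counts.most_common(1))
```

Skipping lines with fewer than 11 fields keeps a truncated or blank line from raising an IndexError partway through a large log.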
Step 3 – Build the program with pandas. The code is short and simple once you figure out how the DataFrame works:
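import pandas

# Split each log line on whitespace so that every field becomes a DataFrame column.
with open('www-media.log') as logfile:
    frame = pandas.DataFrame([line.split() for line in logfile])

# value_counts() tallies each distinct value, most frequent first.
countIP = frame[0].value_counts()
countRequested = frame[6].value_counts()
countReferal = frame[10].value_counts()

print(countIP)
print(countRequested)
print(countReferal)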
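Once the counts exist, a natural next step is a top-N report with readable column names. This is a rough sketch under a few assumptions: the sample lines are invented for illustration and do not come from the honeynet dataset, the column names `ip`, `requested` and `referer` are only illustrative labels, and the indices follow the combined log format used in the steps above.

```python
import pandas

# Invented sample lines in Apache combined log format (not from the dataset).
SAMPLE_LINES = [
    '192.0.2.10 - - [28/May/2012:10:15:32 -0700] '
    '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '192.0.2.10 - - [28/May/2012:10:15:40 -0700] '
    '"GET /about.html HTTP/1.1" 200 1024 "http://example.com/index.html" "Mozilla/5.0"',
    '198.51.100.7 - - [28/May/2012:10:16:01 -0700] '
    '"GET /index.html HTTP/1.1" 404 512 "-" "curl/7.21"',
]

frame = pandas.DataFrame([line.split() for line in SAMPLE_LINES])

# Keep only the columns of interest and give them readable (illustrative) names.
report = frame[[0, 6, 10]].rename(
    columns={0: 'ip', 6: 'requested', 10: 'referer'})

# value_counts() sorts by frequency, so head(n) yields the top-n entries.
top_ips = report['ip'].value_counts().head(10)
top_requested = report['requested'].value_counts().head(10)

print(top_ips)
print(top_requested)
```

To run it against the real log, swap `SAMPLE_LINES` for the lines read from `www-media.log`.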
Step 4 – Enjoy Responsibly
Step 5 – Get the code here:
https://github.com/dgleebits/PythonSystemAdminTools/blob/master/weblogAnalysis.py





