BigSnarf blog

Infosec FTW

Monthly Archives: May 2012

iPython Notebook pandas data analysis of web logs and auth logs

Get code here:

https://github.com/dgleebits/PythonSystemAdminTools/blob/master/pandasAuthLogAnalysis.ipynb

Get sample attack data set here:

http://honeynet.org/files/sanitized_log.zip

Thanks to Vincent for testing the code and helping out with the screenshots.

Influences

http://pixlcloud.com/

Using pandas to report on apache web logs

So I got this new book:

Step 1 – Start with this Forensic Challenge dataset:

http://honeynet.org/files/sanitized_log.zip

Step 2 – Build program without pandas:

#!/usr/bin/python
'''
This program takes in an Apache www-media.log and prints a basic report
'''
from collections import Counter

ipAddressList = []
methodList = []
requestedList = []
referalList = []

# Read the log line by line and pick fields out by whitespace position
with open('www-media.log') as data:
    for line in data:
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        ipAddressList.append(fields[0])
        requestedList.append(fields[6])
        methodList.append(fields[5])    # keeps its leading quote, e.g. "GET
        referalList.append(fields[10])  # still wrapped in quotes

count_ip = Counter(ipAddressList)
count_requested = Counter(requestedList)
count_method = Counter(methodList)
count_referal = Counter(referalList)

# most_common() returns (value, count) pairs, most frequent first
print(count_ip.most_common())
print(count_requested.most_common())
print(count_method.most_common())
print(count_referal.most_common())
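
The positional split above only works when every line has the same number of space-separated tokens. A rough alternative, assuming the log is in Apache's combined log format, is to match each line against a regular expression with named groups. The pattern, the parse_line helper and the sample line below are illustrative assumptions, not part of the original script.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<request>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Made-up sample line, for illustration only
sample = ('10.0.0.1 - - [19/Apr/2010:06:36:15 -0700] '
          '"GET /feed/ HTTP/1.1" 200 16605 '
          '"http://example.com/" "Mozilla/5.0 (compatible; Example/1.0)"')

fields = parse_line(sample)
count_ip = Counter([fields['ip']])
print(fields['method'], fields['request'], fields['referer'])
print(count_ip.most_common())
```

Lines that do not match come back as None, so they can be counted or skipped instead of raising an IndexError.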

Step 3 – Build program with pandas. The code is short and simple once you figure out how the DataFrame works:

import pandas

# Split every line on whitespace; each token lands in a numbered column
with open('www-media.log') as data:
    frame = pandas.DataFrame([x.split() for x in data])

# value_counts() tallies each distinct value, most frequent first
countIP = frame[0].value_counts()
countRequested = frame[6].value_counts()
countReferal = frame[10].value_counts()
print(countIP)
print(countRequested)
print(countReferal)
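
The numbered columns are easier to read once they have names. Here is a minimal sketch of that, run on a few made-up lines held in memory rather than the real www-media.log. The column names and the sample data are assumptions for illustration.

```python
import pandas

# Made-up sample lines standing in for www-media.log
lines = [
    '10.0.0.1 - - [19/Apr/2010:06:36:15 -0700] "GET /feed/ HTTP/1.1" 200 16605 "-" "agent"',
    '10.0.0.2 - - [19/Apr/2010:06:37:02 -0700] "GET / HTTP/1.1" 200 1024 "-" "agent"',
    '10.0.0.1 - - [19/Apr/2010:06:38:40 -0700] "POST /login HTTP/1.1" 302 0 "-" "agent"',
]

frame = pandas.DataFrame([x.split() for x in lines])

# Give the columns of interest readable names (same positions as the script above)
frame = frame.rename(columns={0: 'ip', 5: 'method', 6: 'request', 10: 'referer'})

# Drop the leading quote that whitespace splitting leaves on the method field
frame['method'] = frame['method'].str.lstrip('"')

top_ips = frame['ip'].value_counts()
top_methods = frame['method'].value_counts()
print(top_ips.head(10))
print(top_methods)
```

Calling head(10) on the counts keeps the report to the ten busiest addresses, which helps on a large log.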

Step 4 – Enjoy Responsibly

Step 5 – Get code here

 https://github.com/dgleebits/PythonSystemAdminTools/blob/master/weblogAnalysis.py
