BigSnarf blog

Infosec FTW

IPython: processing Apache logs with generators and visualizing with matplotlib

  • Grab logs from multiple data-centers
  • Split out anonymized and non-anonymized data into two separate files
  • Store both sets of files in HDFS – Hadoop FTW (experiments done)
  • Create Hive queries (experiments done)
  • Query the data
  • Yum! Stats!
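
A rough sketch of the generator idea from the title, covering the parse step and the anonymized/non-anonymized split. This is not the notebook's code. The regex assumes Apache Common Log Format, and the names `LOG_PATTERN`, `parse_lines`, `anonymize` and the salt value are made up for illustration. Truncating a salted hash is only a stand-in for whatever anonymization the real pipeline uses.

```python
import hashlib
import re

# Assumed layout (Common Log Format): host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_lines(lines):
    """Lazily yield one dict per line that matches LOG_PATTERN; skip the rest."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            yield match.groupdict()


def anonymize(records, salt="example-salt"):
    """Yield copies of each record with the host replaced by a short salted hash.

    Hypothetical scheme for illustration only, not a vetted anonymization method.
    """
    for record in records:
        digest = hashlib.sha256((salt + record["host"]).encode("utf-8")).hexdigest()
        yield dict(record, host=digest[:12])


# Made-up sample lines; in practice `lines` could be an open file object.
sample = [
    '192.168.0.1 - - [20/Jan/2013:11:26:02 -0800] "GET /index.html HTTP/1.1" 200 1043',
    'not a log line',
    '10.0.0.7 - - [20/Jan/2013:11:27:10 -0800] "GET /missing HTTP/1.1" 404 -',
]

raw_records = list(parse_lines(sample))        # non-anonymized set
anon_records = list(anonymize(raw_records))    # anonymized set
```

Because both functions are generators, a large log file can be streamed line by line without loading it into memory. The two `list(...)` calls are only there to make the sample easy to inspect.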

https://github.com/bigsnarfdude/pythonNetworkProgrammingN00B/blob/master/logProcessing.ipynb

Refactored for easier queries

https://github.com/bigsnarfdude/pythonNetworkProgrammingN00B/blob/master/log_analytics.ipynb
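
One guess at what "easier queries" could look like once log lines are parsed into dicts: a single helper that counts records by any field. This is an illustrative sketch, not the notebook's code. The `count_by` helper and the sample records are hypothetical.

```python
from collections import Counter

# Made-up parsed records; real ones would come from the log parsing step.
records = [
    {"host": "192.168.0.1", "status": "200"},
    {"host": "192.168.0.1", "status": "404"},
    {"host": "10.0.0.7", "status": "200"},
]


def count_by(records, field):
    """Count how many records share each value of `field`."""
    return Counter(record[field] for record in records)


status_counts = count_by(records, "status")
top_host = count_by(records, "host").most_common(1)
```

Counts like `status_counts` are the kind of small summary that can be handed straight to a matplotlib bar chart.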
