BigSnarf blog

Infosec FTW

Category Archives: Framework

Vincent Vega d3.js in python charts are super simple for pandas dataframes

Graphing different website user experiences





User experience (UX) involves a person’s emotions about using a particular product, system or service. User experience highlights the experiential, affective, meaningful and valuable aspects of human-computer interaction and product ownership. Additionally, it includes a person’s perceptions of practical aspects such as utility, ease of use and efficiency of the system. User experience is subjective in nature because it is about individual perception and thought with respect to the system. User experience is also dynamic: it changes over time as circumstances change and new innovations appear.


Metrics platitudes or just the Fogg behaviour grid applied to startups

d3.js mixedtape tutorials – creators gotta create

Bulk processing memory, network traces and HDD using fuzzy hashing and sdhash

Cloudera Impala for Real Time Queries in Hadoop

Machine Learning – LinkedIn profile matcher based on Skills tags


LinkedIn profiles 4, 2, and 1 matched to ‘jQuery’ etc. tags.

LinkedIn profiles 5 and 4 matched to ‘Data Analysis’ etc. tags.

Here is definitely something that will be part of the bigsnarf technology stack


iPython Notebook pandas data analysis of web logs and auth logs

Get code here:

Get sample attack data set here:

Thanks to Vincent for testing the code and helping out with the screenshots.


Using pandas to report on apache web logs

So I got this new book:

Step 1 – Start with this Forensic Challenge dataset:

Step 2 – Build program without pandas:

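#!/usr/bin/python
"""This program takes in an Apache www-media.log and provides a basic report."""
from collections import Counter

ipAddressList = []
methodList = []
requestedList = []
referalList = []

data = open('www-media.log').readlines()

for line in data:
    fields = line.split()
    if len(fields) < 11:
        continue  # skip short or malformed lines
    # Field positions follow the pandas version in Step 3; the method index (5) is an assumption
    ipAddressList.append(fields[0])
    methodList.append(fields[5].lstrip('"'))
    requestedList.append(fields[6])
    referalList.append(fields[10])

count_ip = Counter(ipAddressList)
count_requested = Counter(requestedList)
count_method = Counter(methodList)
count_referal = Counter(referalList)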
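
The Counter objects above can be turned into a short report with most_common. This is only a minimal sketch: the sample lines are made up (they are not from the Forensic Challenge dataset), the combined log format is assumed, and count_field is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical sample lines in Apache combined log format (not real data)
SAMPLE_LINES = [
    '10.0.0.1 - - [01/Jan/2013:10:00:00 -0800] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [01/Jan/2013:10:00:01 -0800] "POST /login.php HTTP/1.1" 302 128 "http://example.com/" "Mozilla/5.0"',
    '10.0.0.1 - - [01/Jan/2013:10:00:02 -0800] "GET /about.html HTTP/1.1" 200 256 "-" "Mozilla/5.0"',
]


def count_field(lines, index):
    """Count the values found at one whitespace-split field position."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) > index:  # skip short or malformed lines
            counts[fields[index]] += 1
    return counts


count_ip = count_field(SAMPLE_LINES, 0)         # field 0: client IP
count_requested = count_field(SAMPLE_LINES, 6)  # field 6: requested path

# Print the busiest client addresses first
for ip, hits in count_ip.most_common(2):
    print(ip, hits)
```

Skipping lines that are too short keeps one malformed entry from stopping the whole report.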

Step 3 – Build program with pandas … code is very simple and easy once you figure out how the DataFrame works

import pandas

# Split each log line on whitespace; columns are labelled 0, 1, 2, ... by position
data = open('www-media.log').readlines()
frame = pandas.DataFrame([x.split() for x in data])
countIP = frame[0].value_counts()          # client IP address
countRequested = frame[6].value_counts()   # requested path
countReferal = frame[10].value_counts()    # referrer
print(countIP)
print(countRequested)
print(countReferal)
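
Most of "figuring out how the DataFrame works" here comes down to the integer column labels: splitting each line on whitespace gives columns named 0, 1, 2 and so on. Here is a minimal sketch of giving the interesting columns readable names. The sample lines are made up, the combined log format is assumed, and the column names are my own choice, not anything pandas or Apache defines.

```python
import pandas

# Hypothetical sample lines in Apache combined log format (not real data)
SAMPLE_LINES = [
    '10.0.0.1 - - [01/Jan/2013:10:00:00 -0800] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [01/Jan/2013:10:00:01 -0800] "POST /login.php HTTP/1.1" 302 128 "http://example.com/" "Mozilla/5.0"',
    '10.0.0.1 - - [01/Jan/2013:10:00:02 -0800] "GET /about.html HTTP/1.1" 200 256 "-" "Mozilla/5.0"',
]

# One row per log line, one column per whitespace-separated field
frame = pandas.DataFrame([line.split() for line in SAMPLE_LINES])

# Rename the positional columns we care about (names are illustrative)
frame = frame.rename(columns={0: "ip", 5: "method", 6: "requested", 10: "referer"})

# The method field still carries the opening quote of the request string
frame["method"] = frame["method"].str.lstrip('"')

count_ip = frame["ip"].value_counts()
count_method = frame["method"].value_counts()

print(count_ip.head(10))
print(count_method)
```

value_counts returns a Series sorted by count, so head(10) is enough for a top-ten list.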

Step 4 – Enjoy Responsibly

Step 5 – Get code here

