BigSnarf blog

Infosec FTW

Monthly Archives: May 2013

Dirk Loss Pandas PCAP IPython Notebook

Visualize data to spot the errors




In the first chart, I plotted data from an aggregation report I built. The visualization revealed gaps in the reporting data.

In the second chart, I changed the resolution from hourly to every three hours, and the gap was still there.

The third plot shows each data source plotted separately. Notice there are no gaps in the original source data.

What I discovered was either a built-in flaw in the library I was using to aggregate data or a flaw in my own poorly implemented method. Either way, I wrote a custom aggregator to fix the problem.
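The post doesn't include the aggregator itself, so here is only a rough sketch of the kind of check described above: bucket event timestamps at two resolutions, keep empty buckets as explicit zeros, and list the buckets with no data. Everything in it (the function names `bucket_start`, `aggregate`, `find_gaps`, and the sample `events`) is hypothetical and made up for illustration. It uses only the Python standard library, not the library from the original report.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def bucket_start(ts, width):
    """Floor a timestamp to the start of its fixed-width bucket."""
    whole_buckets = (ts - EPOCH) // width  # integer number of buckets since epoch
    return EPOCH + whole_buckets * width


def aggregate(timestamps, width):
    """Count events per bucket, keeping empty buckets as explicit zeros."""
    counts = Counter(bucket_start(ts, width) for ts in timestamps)
    if not counts:
        return {}
    current, last = min(counts), max(counts)
    filled = {}
    while current <= last:
        filled[current] = counts.get(current, 0)
        current += width
    return filled


def find_gaps(series):
    """Return the bucket starts that contain no events."""
    return [bucket for bucket, n in series.items() if n == 0]


# Hypothetical sample data: events at hours 0, 1, 2, 9 and 10 of one day.
day = datetime(2013, 5, 1)
events = [day + timedelta(hours=h) for h in (0, 1, 2, 9, 10)]

hourly = aggregate(events, timedelta(hours=1))
three_hourly = aggregate(events, timedelta(hours=3))

print("hourly gaps:", find_gaps(hourly))
print("3-hour gaps:", find_gaps(three_hourly))
```

If a gap shows up at both resolutions here but not when each source is counted on its own, that points at the aggregation step rather than the source data, which is the same reasoning the three charts walk through.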

I found a related post

Force-Directed Parallel Coordinates

Mahout Parallel Frequent Pattern Mining


Hadoop MapReduce Redis Cluster

Data mining AOL web search queries

AOL Web Search Data 2006