# BigSnarf blog

Infosec FTW

## What side you on? Blue Team or Red Team?

OSS Security Distros: REMnux < SIFT Kit < Security Onion < IPCop > Samurai WTF > BackTrack > Kali

## Dude where’s my naive bayes?

A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes’ theorem with strong (naive) independence assumptions. A more descriptive term for the underlying probability model would be “independent feature model”. An overview of statistical classifiers is given in the article on Pattern recognition.

http://en.wikipedia.org/wiki/Naive_Bayes_classifier

https://github.com/bigsnarfdude/machineLearning/blob/master/mason_vs_sklearn_naive_bayes.py
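To make the "independent feature model" idea concrete, here is a minimal sketch of a multinomial naive Bayes classifier in plain Python. It is not the code from the linked repo: the `NaiveBayes` class, the add-one smoothing, and the toy web-request tokens are all assumptions made for illustration. Each token contributes its own log-probability to a class score, independently of the other tokens, which is exactly the naive assumption.

```python
from collections import Counter, defaultdict
from math import log


class NaiveBayes:
    """Multinomial naive Bayes over token lists, with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()             # samples seen per label
        self.token_counts = defaultdict(Counter)  # token frequencies per label
        self.vocab = set()                        # every token seen in training

    def fit(self, samples):
        # samples: iterable of (tokens, label) pairs
        for tokens, label in samples:
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, tokens):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n_samples in self.class_counts.items():
            score = log(n_samples / total)        # log prior
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                # Independence assumption: per-token log-likelihoods just add up
                score += log((self.token_counts[label][tok] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Hypothetical toy data: tokens pulled from web requests
training = [
    ("select union password".split(), "attack"),
    ("union select admin".split(), "attack"),
    ("script alert cookie".split(), "attack"),
    ("home page image".split(), "benign"),
    ("login page image".split(), "benign"),
    ("home login search".split(), "benign"),
]
model = NaiveBayes().fit(training)
print(model.predict("union select cookie".split()))  # attack
print(model.predict("home image search".split()))    # benign
```

Working in log space keeps long token lists from underflowing to zero, and the add-one smoothing stops a single unseen token from wiping out a class entirely.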

## python hyperloglog and webscale counters

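```python
from mmhash import mmhash
from math import log
from zlib import compress
from base64 import b64encode


class HyperLogLog:
    def __init__(self, log2m):
        self.log2m = log2m
        self.m = 1 << log2m                # number of registers
        self.data = [0] * self.m           # one small counter per register
        # Bias-correction constant alpha_m, pre-multiplied by m^2
        self.alphaMM = (0.7213 / (1 + 1.079 / self.m)) * self.m * self.m

    def offer(self, o):
        x = mmhash(str(o), 0)              # expected to be an unsigned 32-bit hash
        a, b = 32 - self.log2m, self.log2m
        i = x >> a                         # top log2m bits select the register
        v = self._bitscan(x << b, a)       # rank of the first 1-bit in the rest
        self.data[i] = max(self.data[i], v)

    def count(self):
        estimate = self.alphaMM / sum([2**-v for v in self.data])
        zeros = float(self.data.count(0))
        # Small-range correction (linear counting); skipped when no register
        # is empty, since log(0) is undefined
        if estimate <= 2.5 * self.m and zeros:
            return round(-self.m * log(zeros / self.m))
        else:
            return round(estimate)

    def _bitscan(self, x, m):
        # Position of the first set bit, scanning down from bit 31
        v = 1
        while v <= m and not x & 0x80000000:
            v += 1
            x <<= 1
        return v

    def datastr(self):
        # Registers packed as bytes, zlib-compressed, then base64-encoded
        return b64encode(compress(bytes(bytearray(self.data)), 9))
```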
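Here is a rough, self-contained sketch of the counter in use. To keep it runnable on the standard library alone, `mmhash` is swapped for a hypothetical `hash32` helper built on SHA-256 (an assumption, not what the original uses), and the `user-%d` keys are made up. The point it shows: 4096 one-integer registers are enough to estimate tens of thousands of distinct items, and offering the same items again does not move the count.

```python
import hashlib
from math import log


def hash32(value):
    # Stand-in for mmhash (assumption): first 4 bytes of SHA-256 as an
    # unsigned 32-bit integer
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HyperLogLog:
    def __init__(self, log2m):
        self.log2m = log2m
        self.m = 1 << log2m
        self.data = [0] * self.m
        self.alphaMM = (0.7213 / (1 + 1.079 / self.m)) * self.m * self.m

    def offer(self, o):
        x = hash32(o)
        a, b = 32 - self.log2m, self.log2m
        i = x >> a                                   # register index
        v = self._bitscan((x << b) & 0xFFFFFFFF, a)  # rank within remaining bits
        self.data[i] = max(self.data[i], v)

    def count(self):
        estimate = self.alphaMM / sum(2.0 ** -v for v in self.data)
        zeros = self.data.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting
            return round(self.m * log(self.m / zeros))
        return round(estimate)

    @staticmethod
    def _bitscan(x, m):
        v = 1
        while v <= m and not x & 0x80000000:
            v += 1
            x <<= 1
        return v


hll = HyperLogLog(12)            # 2**12 = 4096 registers
for n in range(50000):
    hll.offer("user-%d" % n)
first = hll.count()

for n in range(50000):           # replay the same items
    hll.offer("user-%d" % n)
second = hll.count()

print(first, second)             # the two estimates are identical
```

With m registers the typical relative error is about 1.04 / sqrt(m), so roughly 1.6% at `log2m=12`. Bumping `log2m` trades memory for accuracy.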

## Stacked bar charts work better to tell the story – barh not enough by itself

### Bar chart reporting presents data but it doesn’t provide context alone

### Transition to side by side bar chart can help display counts

http://nbviewer.ipython.org/urls/raw.github.com/bigsnarfdude/bsides_vancouver_2013/master/05-TimeSeriesReview.ipynb