Category Archives: Tools
December 5, 2013
# Recipes from the itertools documentation. They use the Python 2 names
# izip, imap, and izip_longest; on Python 3 use zip, map, and zip_longest.
import collections
import operator
from itertools import (chain, combinations, count, cycle, groupby, imap,
                       islice, izip, izip_longest, repeat, starmap, tee)
from operator import itemgetter

def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))

def enumerate(iterable, start=0):
    # Note: shadows the built-in enumerate
    return izip(count(start), iterable)

def tabulate(function, start=0):
    "Return function(0), function(1), ..."
    return imap(function, count(start))

def consume(iterator, n):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # The technique uses objects that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)

def nth(iterable, n, default=None):
    "Returns the nth item or a default value"
    return next(islice(iterable, n, None), default)

def quantify(iterable, pred=bool):
    "Count how many times the predicate is true"
    return sum(imap(pred, iterable))

def padnone(iterable):
    """Returns the sequence elements and then returns None indefinitely.

    Useful for emulating the behavior of the built-in map() function.
    """
    return chain(iterable, repeat(None))

def ncycles(iterable, n):
    "Returns the sequence elements n times"
    return chain.from_iterable(repeat(iterable, n))

def dotproduct(vec1, vec2):
    return sum(imap(operator.mul, vec1, vec2))

def flatten(listOfLists):
    return list(chain.from_iterable(listOfLists))

def repeatfunc(func, times=None, *args):
    """Repeat calls to func with specified arguments.

    Example:  repeatfunc(random.random)
    """
    if times is None:
        return starmap(func, repeat(args))
    return starmap(func, repeat(args, times))

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    nexts = cycle(iter(it).next for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # drop the exhausted iterator from the cycle
            pending -= 1
            nexts = cycle(islice(nexts, pending))

def compress(data, selectors):
    "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F"
    return (d for d, s in izip(data, selectors) if s)

def combinations_with_replacement(iterable, r):
    "combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC"
    # number items returned:  (n+r-1)! / r! / (n-1)!
    pool = tuple(iterable)
    n = len(pool)
    if not n and r:
        return
    indices = [0] * r
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - 1:
                break
        else:
            return
        indices[i:] = [indices[i] + 1] * (r - i)
        yield tuple(pool[i] for i in indices)

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in iterable:
            if element not in seen:
                seen_add(element)
                yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_justseen('ABBCcAD', str.lower) --> A B C A D
    return imap(next, imap(itemgetter(1), groupby(iterable, key)))
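The recipes above rely on izip, imap, and izip_longest, which exist only in Python 2. As a rough sketch (an adaptation, not part of the original post), here is how three of them might look on Python 3, where zip is already lazy and izip_longest is named zip_longest:

```python
from itertools import count, islice, tee, zip_longest

def take(n, iterable):
    "Return the first n items of the iterable as a list."
    return list(islice(iterable, n))

def pairwise(iterable):
    "s -> (s0, s1), (s1, s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)      # advance the second copy by one item
    return zip(a, b)   # zip is lazy on Python 3, like izip on Python 2

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n   # n references to the same iterator
    return zip_longest(*args, fillvalue=fillvalue)

print(take(3, count(10)))                                # [10, 11, 12]
print(list(pairwise("ABCD")))                            # [('A', 'B'), ('B', 'C'), ('C', 'D')]
print(["".join(g) for g in grouper(3, "ABCDEFG", "x")])  # ['ABC', 'DEF', 'Gxx']
```

The other recipes should port the same way: replace imap with map, izip with zip, and iter(it).next with iter(it).__next__.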
October 29, 2013
wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -O - | sudo python
sudo python ez_setup.py
sudo easy_install pip
sudo pip install virtualenv
sudo pip install virtualenvwrapper
curl -O http://python-distribute.org/distribute_setup.py
sudo python distribute_setup.py
sudo pip install scipy --upgrade
sudo pip install numpy --upgrade
sudo pip install matplotlib --upgrade
sudo pip install pyzmq --upgrade
sudo pip install tornado --upgrade
sudo pip install pygments --upgrade
sudo pip install pandas --upgrade
sudo pip install jinja2 --upgrade
October 20, 2013
Tutorial 10: How to Visualize Website Clickstream Data
August 18, 2013
May 17, 2013
May 14, 2013
April 26, 2013
Moloch is an open source, large-scale IPv4 packet capturing (PCAP), indexing and database system. A simple web interface is provided for PCAP browsing, searching, and exporting. APIs are exposed that allow PCAP data and JSON-formatted session data to be downloaded directly. Simple security is implemented by using HTTPS and HTTP digest password support or by using Apache in front. Moloch is not meant to replace IDS engines but instead to work alongside them, storing and indexing all the network traffic in standard PCAP format and providing fast access. Moloch is built to be deployed across many systems and can scale to handle multiple gigabits/sec of traffic.
April 17, 2013
PyOpenCL lets you access the OpenCL parallel computation API from Python. Here’s what sets PyOpenCL apart:
- Object cleanup tied to lifetime of objects. This idiom, often called RAII in C++, makes it much easier to write correct, leak- and crash-free code.
- Completeness. PyOpenCL puts the full power of OpenCL’s API at your disposal, if you wish.
- Convenience. While PyOpenCL’s primary focus is to make all of OpenCL accessible, it tries hard to make your life less complicated as it does so, without taking any shortcuts.
- Automatic Error Checking. All OpenCL errors are automatically translated into Python exceptions.
- Speed. PyOpenCL’s base layer is written in C++, so all the niceties above are virtually free.
- Helpful, complete documentation and a wiki.
- Liberal licensing (MIT).
See the PyOpenCL Documentation.
Or get it directly from my source code repository by typing
git clone http://git.tiker.net/trees/pyopencl.git
You may also browse the source.
Prerequisites: all you need is an OpenCL implementation and, of course, Python.
April 9, 2013
Classifying traffic intensity and temporal differences in access
- Total page requests per IP address
- Percentage of images requested
- Percentage of binaries requested (e.g., PDF)
- Total requests for robots.txt
- Percentage of HTML pages requested
- Percentage of text files requested
- Percentage of zip files requested
- Percentage of video files requested
- Bounce rate
- Session time
- Standard deviation between clicks
- Percentage of night time requests
- Percentage of errors
- Percentage of garbage requests
- Percentage of GET requests
- Percentage of POST requests
- Percentage of HEAD requests
- URL traversal
- Depth of URL traversal
- User Agents
- IP Address location
- Known crawler IP addresses
- Repeated requests
- Average time between clicks
- OS badges
- ARIN registration
- ASN analysis
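A few of the features listed above can be computed per IP address once an access log has been parsed. The sketch below is only an illustration under stated assumptions, not a method taken from the post: the record layout (dicts with ip, time, method, path, and status keys), the function name features_per_ip, the image extension list, counting 00:00 to 05:59 as night time, and treating any status of 400 or above as an error are all hypothetical choices.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev

# Assumed list of image extensions; adjust for the site being analysed.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".ico")

def features_per_ip(records):
    """Compute a handful of per-IP features from parsed log records.

    Each record is assumed to be a dict with keys:
    ip (str), time (datetime), method (str), path (str), status (int).
    """
    by_ip = defaultdict(list)
    for rec in records:
        by_ip[rec["ip"]].append(rec)

    features = {}
    for ip, recs in by_ip.items():
        recs.sort(key=lambda r: r["time"])
        total = len(recs)
        # Seconds between consecutive requests ("clicks") from this IP.
        gaps = [(b["time"] - a["time"]).total_seconds()
                for a, b in zip(recs, recs[1:])]

        def pct(predicate):
            # Share of this IP's requests matching the predicate, in percent.
            return 100.0 * sum(1 for r in recs if predicate(r)) / total

        features[ip] = {
            "total_requests": total,
            "robots_txt_requests": sum(1 for r in recs
                                       if r["path"] == "/robots.txt"),
            "pct_images": pct(lambda r: r["path"].lower().endswith(IMAGE_EXTENSIONS)),
            "pct_head": pct(lambda r: r["method"] == "HEAD"),
            "pct_errors": pct(lambda r: r["status"] >= 400),  # assumed error rule
            "pct_night": pct(lambda r: r["time"].hour < 6),   # assumed night window
            "avg_gap_seconds": mean(gaps) if gaps else 0.0,
            "stdev_gap_seconds": pstdev(gaps) if gaps else 0.0,
        }
    return features
```

The resulting dictionaries could then be turned into feature vectors for whatever classifier is used to separate crawlers from human visitors. Features such as ARIN registration or ASN analysis need external lookups and are left out of this sketch.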