Category Archives: Thoughts
March 14, 2014
Building a system that can do full-context PCAP capture for a single machine is, IMHO, trivial compared to creating predictive algorithms for analyzing PCAP traffic. Log search solutions such as Elasticsearch, Graylog2, ELSA, Splunk and Logstash can help you archive and dig through the data.
My favorite big data solution for network traffic (as of 2012) is PacketPig. In 2014 I noticed another player named packetsled. I found this nice setup by AlienVault. Security Onion (Bro IDS etc.) is a great network security IDS distro. I have seen one called xtractr, and MR for forensics. Several solutions exist, and PCAP files can be fed to these engines for analysis. I think ARGUS and Moloch (PCAP with Elasticsearch) have a place here too, but I haven’t tackled them yet. There’s a DNS Hadoop presentation from Endgame, clairvoyant-squirrel. There are also openfpc and streamDB, and DNS tools like passivedns. ELSA is another tool.
I started by using a PCAP-to-CSV conversion Perl program, and have since written my own sniffer-to-CSV tool in scapy. Super timelines are being done in Python too. Once I get a PCAP file converted to CSV, I load it into HDFS via HUE. I also found this PCAP visualization blog entry by Raffael Marty.
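The PCAP-to-CSV step can be sketched roughly as follows. This is not the Perl program or the scapy sniffer mentioned above; it is a minimal standard-library illustration that assumes a classic pcap file (not pcapng) with Ethernet link type and untagged IPv4 packets. The function names and the CSV column names are made up for the example.

```python
import csv
import socket
import struct


def pcap_to_rows(stream):
    """Yield (timestamp, src_ip, dst_ip, protocol, length) per IPv4 packet.

    Assumes a classic pcap stream with Ethernet framing (link type 1).
    """
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    magic = struct.unpack("<I", header[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"  # file was written little-endian
    elif magic == 0xD4C3B2A1:
        endian = ">"  # file was written big-endian
    else:
        raise ValueError("not a classic pcap file")
    linktype = struct.unpack(endian + "I", header[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    while True:
        record = stream.read(16)
        if len(record) < 16:
            break
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack(endian + "IIII", record)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            break  # truncated final packet
        # 14-byte Ethernet header; EtherType 0x0800 means IPv4
        if len(data) < 34 or data[12:14] != b"\x08\x00":
            continue
        ip_header = data[14:34]
        protocol = ip_header[9]
        src_ip = socket.inet_ntoa(ip_header[12:16])
        dst_ip = socket.inet_ntoa(ip_header[16:20])
        yield (ts_sec + ts_usec / 1e6, src_ip, dst_ip, protocol, orig_len)


def write_csv(rows, out):
    """Write the rows with a header line to an open text file object."""
    writer = csv.writer(out)
    writer.writerow(["timestamp", "src_ip", "dst_ip", "protocol", "length"])
    writer.writerows(rows)
```

To convert a capture, open the pcap in binary mode and the CSV in text mode (with `newline=""`), then call `write_csv(pcap_to_rows(pcap_file), csv_file)`. The resulting file is what would then be uploaded to HDFS.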
I’ve stored a bunch of CSV network traces and analyzed them using Hive and Pig queries. It was very simple: name the columns and query each column looking for specific entries. It is also very labour intensive. There is also binary analysis on Hadoop.
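The "name the columns, then query a column" pattern looks roughly like this on a small scale. This is plain Python rather than Hive or Pig, and the column layout is an assumption made for the example, not the layout of my actual traces.

```python
import csv
import io
from collections import Counter

# Assumed column layout for a headerless CSV trace
COLUMNS = ["timestamp", "src_ip", "dst_ip", "protocol", "length"]


def select(csv_text, column, value):
    """Return every row whose named column equals the given value."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=COLUMNS)
    return [row for row in reader if row[column] == value]


def top_values(csv_text, column, n=3):
    """Return the n most common values seen in the named column."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=COLUMNS)
    return Counter(row[column] for row in reader).most_common(n)
```

`select` plays the role of a `WHERE column = value` filter, and `top_values` the role of a `GROUP BY` with a count. Repeating this by hand for each column is where the labour goes.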
I’m working on a MapReduce library that uses machine learning to classify attackers and their network patterns. As of 2013, a few commercial vendors like IBM and RSA have added Hadoop capability to their SIEM product lines. Here is Twitter’s logging setup. In 2014 I loaded all the CSV attack data into a CDH4 cluster with the Impala query engine. I’m also looking at writing pandas DataFrames to Google’s BigQuery. As of 2014 there are solutions on Hadoop for malware analysis, forensics and DNS data mining.
The biggest advantage of all these systems will be DATA ENRICHMENT: feeding and combining data to turn a weak signal into actionable insights.
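As a small illustration of what enrichment means here, the sketch below joins flow records against two other data sources. Everything in it is hypothetical: the record fields, the passive DNS lookup table and the watchlist are stand-ins for whatever sources are actually available.

```python
def enrich(flows, dns_names, watchlist):
    """Attach a hostname and a watchlist flag to each flow record.

    flows:      list of dicts with at least a "dst_ip" key
    dns_names:  dict mapping IP address -> hostname (e.g. from passive DNS)
    watchlist:  set of hostnames considered suspicious
    """
    enriched = []
    for flow in flows:
        record = dict(flow)  # copy so the input is left untouched
        name = dns_names.get(flow["dst_ip"])
        record["dst_name"] = name
        record["watchlisted"] = name in watchlist
        enriched.append(record)
    return enriched
```

A single connection to an unknown IP address is a weak signal. The same connection labelled with a hostname that appears on a watchlist is something an analyst can act on.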
There are a few examples of PCAP ingestion with open source tools like Hadoop:
The second presentation I found was Wayne Wheeler’s SherpaSurfing, with code at https://github.com/sherpasurfing/SHERPASURFING
The third I found was https://github.com/RIPE-NCC/hadoop-pcap
March 7, 2014
tshark -i en1 -nn -e dns.qry.name -E separator=";" -T fields port 53
tshark -i en1 -Y "dns" -T pdml | tee dns_log.xml
March 6, 2014
February 27, 2014
- Download Kibana: git clone https://github.com/elasticsearch/kibana.git
- Download Elasticsearch: wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.0.1.tar.gz
- Serve Kibana locally: python -m SimpleHTTPServer 8000 (on Python 3: python3 -m http.server 8000)
- Load Apache log data using pyelasticsearch and IPython
- Query the logs
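Before Apache log lines can be loaded, each one has to be turned into a document. A minimal sketch of that parsing step follows, using only the standard library. The regular expression targets the common log format prefix, and the field names (including `@timestamp`) are my own choices for the example; the indexing call itself is left out because it depends on the client in use.

```python
import re
from datetime import datetime

# Matches the common log format prefix of an Apache access log line
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Turn one access log line into a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    doc = match.groupdict()
    doc["status"] = int(doc["status"])
    doc["size"] = 0 if doc["size"] == "-" else int(doc["size"])
    # Convert e.g. "10/Oct/2000:13:55:36 -0700" to an ISO 8601 string
    parsed = datetime.strptime(doc.pop("time"), "%d/%b/%Y:%H:%M:%S %z")
    doc["@timestamp"] = parsed.isoformat()
    return doc


def parse_log(lines):
    """Parse many lines, silently skipping the ones that do not match."""
    return [doc for doc in map(parse_line, lines) if doc is not None]
```

Each returned dict can then be handed to the Elasticsearch client as one document.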
February 18, 2014
We set up a series of authorizations to put people on systems so they can access data and, hopefully, have a matching series of authorizations and systems in place to remove them. In practice there are few systems in place to remove people from systems quickly, and maybe the systems are audited quarterly by a third party. We choose RBAC systems, encrypt passwords, enforce complicated passwords and expire passwords, all in an attempt to control access to data assets.
Verification: a process control to monitor access control
Three types of manual verification can be done:
- Ask the system custodian to verify access
- Ask the user to verify access
- Ask the data custodian to verify access
Monitoring Access Control and data mining
Monitoring access to data assets remains a difficult task. You can monitor transactions, monitor a person’s access, look at where they came from, and so on. It’s almost like a feature set for data mining. For user activity, you can look at volumes, types of transactions, time of day and access patterns. For administration, you can look at granting patterns, removal patterns and group membership patterns, again broken down by volume, type of transaction, time of day and access pattern. You can also look for skyline patterns and changes in the rolling weekly and 30-day statistics. You can even monitor the patterns of the data accessed, along the same dimensions. These might be great candidates for graph databases. These are detective controls.
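One way to watch for changes in the rolling statistics is sketched below. It assumes one access count per day for a user or a data asset, and flags a day when its count is far from the mean of the preceding window. The 7-day window and the three-standard-deviation threshold are illustrative defaults, not tuned values.

```python
from collections import deque
from statistics import mean, pstdev


def flag_unusual_days(daily_counts, window=7, threshold=3.0):
    """Return the indexes of days whose count deviates from the trailing window.

    A day is flagged when its count differs from the mean of the previous
    `window` days by more than `threshold` standard deviations.
    """
    history = deque(maxlen=window)
    flagged = []
    for day, count in enumerate(daily_counts):
        if len(history) == window:
            average = mean(history)
            spread = pstdev(history)
            # Skip windows with no variation to avoid flagging everything
            if spread > 0 and abs(count - average) > threshold * spread:
                flagged.append(day)
        history.append(count)
    return flagged
```

Calling the same function with `window=30` gives the 30-day view. The same shape works for grants, removals or group membership changes counted per day.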
For example, to find credit card fraud we use the phone number, email address and IP address to find:
1. How many unique phone numbers, emails and IP addresses are tied to the given credit card.
2. How many unique credit cards, emails, and IP addresses are tied to the given phone number.
3. How many unique credit cards, phone numbers and IP addresses are tied to the given email.
4. How many unique credit cards, phone numbers and emails are tied to the given IP address.
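The four counts above can be computed in one pass over the transactions. The sketch below assumes each transaction is a dict with the keys `card`, `phone`, `email` and `ip`; those key names and the function names are assumptions for the example. A graph database would answer the same questions with neighbour queries.

```python
from collections import defaultdict

FIELDS = ("card", "phone", "email", "ip")


def build_links(transactions):
    """Index, for every value of every field, the distinct values seen with it.

    links[field][value][other_field] is the set of other_field values that
    appeared in the same transaction as value.
    """
    links = {field: defaultdict(lambda: defaultdict(set)) for field in FIELDS}
    for tx in transactions:
        for field in FIELDS:
            for other in FIELDS:
                if other != field:
                    links[field][tx[field]][other].add(tx[other])
    return links


def link_counts(links, field, value):
    """Return how many unique values of each other field are tied to value."""
    tied = links[field].get(value, {})
    return {other: len(values) for other, values in tied.items()}
```

For instance, `link_counts(links, "card", some_card)` answers question 1, and a card tied to an unusually large number of phone numbers, emails or IP addresses is a candidate for review.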
Monitoring Access Control and Predictive models
I would argue this is the first step toward predictive controls: highlighting patterns of abuse and fraud by building predictive models for your access controls. Tightening your access controls at this level is sophisticated, and there aren’t any commercial tools that I know of that are this sophisticated at predicting volumes, types of transactions, time of day, access patterns, abuse patterns, impersonation patterns and fraud patterns in access control.
This all leads to having machines help us monitor access controls, by building systems that direct our efforts toward breach investigations and access control violations.
January 28, 2014
January 26, 2014
December 25, 2013
from twython import TwythonStreamer

# APP_KEY, APP_SECRET, OAUTH_TOKEN and OAUTH_TOKEN_SECRET are the
# credentials of your own Twitter app and must be defined before this runs.

class MyStreamer(TwythonStreamer):
    def on_success(self, data):
        # Only status updates carry a 'text' field
        if 'text' in data:
            print(data['text'])

    def on_error(self, status_code, data):
        print(status_code, data)

stream = MyStreamer(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
# Tracking Twitter search term
stream.statuses.filter(track='iphone')
November 3, 2013
- Linear Algebra and Its Applications by Gilbert Strang (Cengage Learning)
- Convex Optimization by Stephen Boyd and Lieven Vandenberghe (Cambridge University Press)
- A First Course in Probability (Pearson) and Introduction to Probability Models (Academic Press) by Sheldon Ross
- R in a Nutshell by Joseph Adler (O’Reilly)
- Learning Python by Mark Lutz and David Ascher (O’Reilly)
- R for Everyone: Advanced Analytics and Graphics by Jared Lander (Addison-Wesley)
- The Art of R Programming: A Tour of Statistical Software Design by Norman Matloff (No Starch Press)
- Python for Data Analysis by Wes McKinney (O’Reilly)
Data Analysis and Statistical Inference
- Statistical Inference by George Casella and Roger L. Berger (Cengage Learning)
- Bayesian Data Analysis by Andrew Gelman, et al. (Chapman & Hall)
- Data Analysis Using Regression and Multilevel/Hierarchical Models by Andrew Gelman and Jennifer Hill (Cambridge University Press)
- Advanced Data Analysis from an Elementary Point of View by Cosma Shalizi (under contract with Cambridge University Press)
- The Elements of Statistical Learning: Data Mining, Inference and Prediction by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (Springer)
Artificial Intelligence and Machine Learning
- Pattern Recognition and Machine Learning by Christopher Bishop (Springer)
- Bayesian Reasoning and Machine Learning by David Barber (Cambridge University Press)
- Programming Collective Intelligence by Toby Segaran (O’Reilly)
- Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig (Prentice Hall)
- Foundations of Machine Learning by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (MIT Press)
- Introduction to Machine Learning (Adaptive Computation and Machine Learning) by Ethem Alpaydin (MIT Press)
- Field Experiments by Alan S. Gerber and Donald P. Green (Norton)
- Statistics for Experimenters: Design, Innovation, and Discovery by George E. P. Box, et al. (Wiley-Interscience)
- The Elements of Graphing Data by William Cleveland (Hobart Press)
- Visualize This: The FlowingData Guide to Design, Visualization, and Statistics by Nathan Yau (Wiley)
Pinterest screenshot: http://www.pinterest.com/dangleebits/books/
November 1, 2013