BigSnarf blog

Infosec FTW

Category Archives: Tools

Tools to assess hacked machines

JSON ETL to Parquet using Apache Spark

Process logs with Kinesis, S3, Apache Spark on EMR, Amazon RDS

Apache Spark Streaming and AWS Kinesis integration in version 1.1.0

OpenSOC Machine Learning


Self Hosted Maven repo on S3

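# Create the bucket and enable static website hosting on it
s3cmd mb s3://www.example.mavenrepo
s3cmd ws-create s3://www.example.mavenrepo
# Build the Maven directory layout (groupId/artifactId/version) under /home/ubuntu
cd /home/ubuntu
mkdir -p com/amazonaws/amazon-kinesis-connector/1.0.0
# Upload the contents of the version directory and make them publicly readable (-P);
# the trailing slash on the source keeps s3cmd from nesting a second 1.0.0/ directory
s3cmd -P sync /home/ubuntu/com/amazonaws/amazon-kinesis-connector/1.0.0/ s3://www.example.mavenrepo/snapshots/com/amazonaws/amazon-kinesis-connector/1.0.0/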
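
The directory path used in these commands follows the standard Maven repository layout: the dots in the groupId become directory separators, followed by the artifactId and the version. A minimal sketch of that mapping, using the coordinates from the commands above (the variable names are only for illustration):

```shell
# Maven coordinates taken from the commands above.
group_id="com.amazonaws"
artifact_id="amazon-kinesis-connector"
version="1.0.0"

# Dots in the groupId become directory separators.
group_path=$(printf '%s' "$group_id" | tr '.' '/')
artifact_dir="$group_path/$artifact_id/$version"

# By Maven convention the jar in that directory is named artifactId-version.jar.
jar_name="$artifact_id-$version.jar"

echo "$artifact_dir/$jar_name"
```

The same relative path is what appears after the snapshots/ prefix in the bucket.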

"AWS Snapshots" at ""


Monitoring JVM


Scala REPL in Notebook


Simple Apache Auth Log Processing with Spark job




A simple Spark job that processes an Apache auth.log and counts the lines recording invalid-user login attempts and failed passwords. Submit the packaged jar with:
./bin/spark-submit --class "SimpleApp" --master local[4] target/scala-2.10/simple-project_2.10-1.0.jar

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // Path to the sanitized auth.log on the local machine
    val logFile = "/Users/antigen/Downloads/sanitized_log/auth.log"
    val conf = new SparkConf().setAppName("SimpleApacheLogProcessing Application")
    val sc = new SparkContext(conf)
    // Read with at least 2 partitions; cache because the RDD is scanned twice
    val logData = sc.textFile(logFile, 2).cache()
    // Count the lines containing each marker string (the match is case-sensitive)
    val numInvalidUser = logData.filter(line => line.contains("Invalid user")).count()
    val numFailedPassword = logData.filter(line => line.contains("Failed password")).count()
    println("Lines with INVALID USER: %s, Lines with FAILED PASSWORD: %s".format(numInvalidUser, numFailedPassword))
    sc.stop()
  }
}
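
One quick way to sanity-check the job's output is to run the same two substring counts with grep. The sketch below is not part of the original post: it writes a few made-up sample lines in sshd auth.log style to a temporary file (the hosts, PIDs, and addresses are hypothetical) and counts them the same way the job does.

```shell
# Hypothetical sample lines in sshd auth.log style, made up for illustration.
log_file="$(mktemp)"
cat > "$log_file" <<'EOF'
Aug  3 10:01:02 host sshd[1001]: Invalid user admin from 192.0.2.10
Aug  3 10:01:04 host sshd[1001]: Failed password for invalid user admin from 192.0.2.10 port 4242 ssh2
Aug  3 10:02:10 host sshd[1002]: Failed password for root from 192.0.2.11 port 4243 ssh2
Aug  3 10:03:30 host sshd[1003]: Accepted password for ubuntu from 192.0.2.12 port 4244 ssh2
EOF

# grep -c counts matching lines, mirroring filter(...).count() in the Spark job.
# Like String.contains, the match is case-sensitive, so "invalid user" is not counted.
invalid_count=$(grep -c "Invalid user" "$log_file")
failed_count=$(grep -c "Failed password" "$log_file")

printf 'Lines with INVALID USER: %s, Lines with FAILED PASSWORD: %s\n' "$invalid_count" "$failed_count"
rm -f "$log_file"
```

Pointing log_file at the real auth.log instead of the temporary file should give the same two numbers the Spark job prints, assuming both read the same file.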

Code, Folder Structure, simple.sbt, and packaged jar files here:

Data Science Stack


