BigSnarf blog

Infosec FTW

PCAP – Logs – Kafka – Kinesis – Compute – Storage

When my Scala code compiles and I’m not quite sure the reason why


AWS Lambda in Scala


Compile Apache Spark with Kinesis Support

Creating Kinesis Stream in pictures

 

[Screenshots: sign in to the AWS console, open Kinesis from the AWS landing page, click the Create Stream button, fill in the Create Kinesis Event Stream form, wait while the stream is being created, then see the stream-created confirmation.]
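The same stream can be created without clicking through the console. Here is a minimal sketch using the AWS SDK for Java (v1) from Scala; the stream name, endpoint, and shard count are placeholder values, not the ones from the screenshots above.

```scala
import com.amazonaws.services.kinesis.AmazonKinesisClient
import com.amazonaws.services.kinesis.model.CreateStreamRequest

object CreateKinesisStream {
  def main(args: Array[String]): Unit = {
    // Credentials come from the default provider chain (env vars, profile, IAM role)
    val kinesis = new AmazonKinesisClient()
    kinesis.setEndpoint("kinesis.us-east-1.amazonaws.com")

    val request = new CreateStreamRequest()
      .withStreamName("my-event-stream")   // placeholder stream name
      .withShardCount(1)                   // one shard is enough for a prototype

    kinesis.createStream(request)
    println("Stream creation requested; wait until its status is ACTIVE before writing to it.")
  }
}
```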

 

How I learned to program

Sorry honey, not tonight I’ve gotta get this Apache Spark Fat jar compiled and shipped

Streaming Prototype

Apache Spark Use Cases

[Diagram: Spark Streaming architecture]

Our specific use case


Kinesis gets raw logs

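For the producer side, here is a hedged sketch of pushing raw log lines into the stream with the AWS SDK's PutRecord call; the stream name, partition-key scheme, and sample payloads are illustrative assumptions, not the project's actual producer.

```scala
import java.nio.ByteBuffer

import com.amazonaws.services.kinesis.AmazonKinesisClient
import com.amazonaws.services.kinesis.model.PutRecordRequest

object RawLogProducer {
  def main(args: Array[String]): Unit = {
    val kinesis = new AmazonKinesisClient()
    kinesis.setEndpoint("kinesis.us-east-1.amazonaws.com")

    // Stand-in log lines; the real pipeline would read these from its log source
    val logLines = Seq(
      """{"timestamp":"2015-05-21T17:14:00Z","type":"Green"}""",
      """{"timestamp":"2015-05-21T17:14:02Z","type":"Red"}"""
    )

    logLines.foreach { line =>
      val request = new PutRecordRequest()
        .withStreamName("my-event-stream")                   // must already be ACTIVE
        .withPartitionKey(line.hashCode.toString)            // naive partition key for the sketch
        .withData(ByteBuffer.wrap(line.getBytes("UTF-8")))
      kinesis.putRecord(request)
    }
  }
}
```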

Spark Streaming does the counting

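A minimal counting sketch on the consumer side, assuming the spark-streaming-kinesis-asl artifact for Spark 1.x (the KinesisUtils.createStream signature has changed across Spark releases); the stream name, endpoint, and batch interval are placeholders.

```scala
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.kinesis.KinesisUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingCounts {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingCounts").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(10))   // 10-second micro-batches

    // One receiver is enough for a single-shard prototype stream
    val records = KinesisUtils.createStream(
      ssc, "my-event-stream", "https://kinesis.us-east-1.amazonaws.com",
      Seconds(10), InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2)

    // Decode the raw bytes and count how many records arrived in each batch
    records
      .map(bytes => new String(bytes, "UTF-8"))
      .count()
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```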

Two tables are created: one for the Kinesis log position (the checkpoint) and a second for the aggregates


DynamoDB stores the aggregations

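As a sketch of the aggregate store, here is one row written with the AWS SDK's DynamoDB document API; the table name, key, and attribute names are assumptions for illustration, not the project's real schema.

```scala
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient
import com.amazonaws.services.dynamodbv2.document.{DynamoDB, Item}

object AggregateWriter {
  def main(args: Array[String]): Unit = {
    val client = new AmazonDynamoDBClient()
    client.setEndpoint("dynamodb.us-east-1.amazonaws.com")
    val table = new DynamoDB(client).getTable("my-aggregates")   // hypothetical table name

    // One item per time bucket: how many events of a given type were counted in it
    val item = new Item()
      .withPrimaryKey("BucketStart", "2015-05-21T17:10:00Z")
      .withString("EventType", "Green")
      .withInt("Count", 42)

    table.putItem(item)
  }
}
```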

 


 

https://github.com/snowplow/spark-streaming-example-project

Building Custom Queries, Grouping, Aggregators and Filters for Apache Spark

Query Metrics

A metric query returns a list of metric values that match a set of criteria, along with the set of all tag names and values found across the returned data points.

The time range can be specified with absolute or relative time values. Absolute time values are given in milliseconds. Relative time values are specified as an integer duration and a unit; possible units are "milliseconds", "seconds", "minutes", "hours", "days", "weeks", "months", and "years". For example, a start time of "5 hours" returns metric values submitted within the last 5 hours. The end time is optional; if no end time is specified, it is assumed to be now (the current date and time).
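As a sketch of the time-range handling, here is a small helper (the names are mine, not an existing API) that turns a relative duration such as "5 hours" into an absolute millisecond range, with the end defaulting to now when no end time is given.

```scala
object TimeRange {
  // Convert a relative duration into an absolute (startMillis, endMillis) pair
  def relativeRange(amount: Long, unit: String,
                    endMillis: Long = System.currentTimeMillis()): (Long, Long) = {
    val unitMillis = unit match {
      case "milliseconds" => 1L
      case "seconds"      => 1000L
      case "minutes"      => 60L * 1000
      case "hours"        => 60L * 60 * 1000
      case "days"         => 24L * 60 * 60 * 1000
      case "weeks"        => 7L * 24 * 60 * 60 * 1000
      case other          => sys.error(s"'$other' needs calendar arithmetic (months, years)")
    }
    (endMillis - amount * unitMillis, endMillis)
  }

  def main(args: Array[String]): Unit = {
    val (start, end) = relativeRange(5, "hours")   // metric values from the last 5 hours
    println(s"query window: $start .. $end")
  }
}
```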

Grouping

The results of a query can be grouped. There are three ways to group the data: by tags, by a time range, or by value. Grouping is done with groupBy or groupByKey, which take one or more groupers.
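A minimal Spark sketch of grouping by a tag, assuming data points are simple (tags, value) pairs; the "host" tag is only an example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object GroupingExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("GroupingExample").setMaster("local[2]"))

    // (tags, value) pairs standing in for metric data points
    val points = sc.parallelize(Seq(
      (Map("host" -> "a"), 1.0),
      (Map("host" -> "a"), 3.0),
      (Map("host" -> "b"), 2.0)
    ))

    // Group by the "host" tag; each group holds that host's data points
    val byHost = points.groupBy { case (tags, _) => tags.getOrElse("host", "unknown") }
    byHost.mapValues(_.size).collect().foreach { case (host, n) => println(s"$host: $n points") }

    sc.stop()
  }
}
```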

Aggregators

Aggregators perform an operation on data points and downsample them. For example, you could sum all data points that fall within 5-minute periods.

Aggregators can be chained. For example, you could sum all data points in 5-minute periods and then average those sums over a week.
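A hedged Spark sketch of chaining those two aggregators: sum values into five-minute buckets, then average the bucket sums. The timestamps and values are made up.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object AggregatorExample {
  val FiveMinutesMs = 5L * 60 * 1000

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("AggregatorExample").setMaster("local[2]"))

    // (timestampMillis, value) pairs standing in for one metric's data points
    val points = sc.parallelize(Seq(
      (0L, 1.0), (60000L, 2.0),         // first 5-minute bucket  -> sum 3.0
      (300000L, 4.0), (360000L, 6.0)    // second 5-minute bucket -> sum 10.0
    ))

    // First aggregator: sum each 5-minute bucket
    val bucketSums = points
      .map { case (ts, v) => (ts / FiveMinutesMs, v) }
      .reduceByKey(_ + _)

    // Second aggregator: average the bucket sums (over a week this would span 2016 buckets)
    val sums = bucketSums.values
    println(s"average of 5-minute sums: ${sums.sum() / sums.count()}")   // (3.0 + 10.0) / 2 = 6.5

    sc.stop()
  }
}
```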

Filtering

The returned data can be filtered by specifying a tag: only data points associated with that tag are returned. Filtering is done using the "tags" property.
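And a small Spark sketch of tag filtering; the "datacenter" tag and its values are illustrative.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object FilterExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("FilterExample").setMaster("local[2]"))

    val points = sc.parallelize(Seq(
      (Map("datacenter" -> "us-east-1"), 1.0),
      (Map("datacenter" -> "eu-west-1"), 2.0)
    ))

    // Keep only the data points carrying the tag value we asked for
    val filtered = points.filter { case (tags, _) => tags.get("datacenter").exists(_ == "us-east-1") }
    filtered.collect().foreach(println)

    sc.stop()
  }
}
```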

Links

Netflix Security tool – FIDO


FIDO is an orchestration layer that automates the incident response process by evaluating, assessing and responding to malware and other detected threats.

http://techblog.netflix.com/2015/05/introducing-fido-automated-security.html
