Software Developer and Data Engineer
Set up Apache Spark SQL to load data from S3, CSV, and JSON sources. Set up an IPython Notebook server. Set up an Elasticsearch server to quickly explore social network analysis profile data and social share data, and to identify and report on user behaviors.
Implemented and shipped a streaming data pipeline from social firehoses, and built an AWS data pipeline consisting of Amazon Kinesis, Amazon S3, and an EC2-based Apache Spark cluster.
Implemented ETL processes for a Neo4j graph database, made the data searchable in Elasticsearch, and demonstrated it to end users in a custom web app.
Performed ad hoc data exploration and statistical analyses using IPython Notebooks and Apache Spark SQL.
All software and algorithms were developed in Python, Java, and Scala. Servers were deployed using Ansible, and code was automatically deployed to AWS via CodeShip.
I build a variety of data products that use various models and machine learning in support of the group’s mission. My focus is on modelling customer behaviour using data-driven approaches.
Building Real-time Analytics
What I’m doing right now
I’m building production data pipelines to power customer-facing analytics dashboards. I built a high-performance, scalable analytics infrastructure using Amazon Kinesis, Akka Scala, Apache Spark SQL, DStreams, Elasticsearch, and Amazon RDS/DynamoDB that can process data and deliver results in real time.
Other places where I’ve been
I’m a gun slinger turned code slinger. Writing stuff that humans & computers can read.
Worked at RCMP, CIBC, and Deloitte. @TechStars Chicago 2013. Proud @HackerSchool alumnus Winter 2012.
Things that keep me up at night
I’m interested in identifying threats to security with data science. I’ve been using Spark DataFrames with my IPython Notebooks to interactively explore the datasets. I’m using connected graphs to help identify bad actors.
With a background that includes fraud detection, data loss prevention, network security, and computer forensics, I apply data mining and analytics to enterprise IT operations and security.
I help process noisy signals and combine them with rich data sources … feeding and turning weak attack signals into actionable security insights using Spark.
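The connected-graph idea mentioned above can be sketched roughly as follows. This is only an illustrative, plain-Python sketch and not the production Spark code: the function names (`connected_components`, `flag_linked_entities`), the account, IP, and device identifiers, and the rule "anything in the same component as a known bad actor is worth a look" are all assumptions made for the example.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into connected components using union-find."""
    parent = {}

    def find(node):
        # Register unseen nodes as their own root.
        parent.setdefault(node, node)
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


def flag_linked_entities(edges, known_bad):
    """Return every entity that shares a component with a known bad actor."""
    flagged = set()
    for component in connected_components(edges):
        if component & known_bad:
            flagged |= component - known_bad
    return flagged


# Hypothetical observations: an account was seen using an IP or a device.
edges = [
    ("acct_a", "ip_1"),
    ("acct_b", "ip_1"),
    ("acct_b", "device_9"),
    ("acct_c", "ip_2"),
]
suspects = flag_linked_entities(edges, known_bad={"acct_a"})
```

Here `acct_b` is flagged because it shares `ip_1` with the known bad account, while `acct_c` sits in a separate component and is left alone. On a real dataset the same grouping would more likely run on a distributed graph library than in a single Python process.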
Things that I’m passionate about
* Reading academic papers
* Reading the citations
* Building prototypes
* Testing with random data
* Verifying with production data
* Rebuilding it for production
* Tweeting about it
* Blogging about it
* Sharing code on my GitHub or Bitbucket
* Contributing to open source projects
* Speaking about it