# BigSnarf blog

Infosec FTW

## Building your first neural network self driving car in Python

1. Get RC Car

2. Learn to drive it

3. Take apart car to see controllers and wireless controller

4. Use a soldering iron and multimeter to find the positive and negative leads and to see which circuits fire for each control

## Testing – Link Mac to Arduino to Wireless Controller

5. Need Arduino board and cable

6. Install software and load Arduino program onto board

7. Install pygame and pyserial (imported as `serial`)

    // Car Control v. 0.2
    // http://thelivingpearl.com/2013/01/04/drive-a-lamborghini-with-your-keyboard/

    int reversePin = 9;
    int forwardPin = 8;
    int leftPin = 7;
    int rightPin = 6;

    int order = 55;
    int time = 75;

    // control flag
    int flag = 0;

    void forward(int time){
      Serial.println("This is forward...");
      digitalWrite(forwardPin, LOW);
      delay(time);
      digitalWrite(forwardPin, HIGH);
    }

    void reverse(int time){
      Serial.println("This is reverse...");
      digitalWrite(reversePin, LOW);
      delay(time);
      digitalWrite(reversePin, HIGH);
    }

    void left(int time){
      Serial.println("This is left...");
      digitalWrite(leftPin, LOW);
      delay(time);
      digitalWrite(leftPin, HIGH);
    }

    void right(int time){
      Serial.println("This is right...");
      digitalWrite(rightPin, LOW);
      delay(time);
      digitalWrite(rightPin, HIGH);
    }

    void leftTurnForward(int time){
      digitalWrite(forwardPin, LOW);
      digitalWrite(leftPin, LOW);
      delay(time);
      off();
    }

    void rightTurnForward(int time){
      digitalWrite(forwardPin, LOW);
      digitalWrite(rightPin, LOW);
      delay(time);
      off();
    }

    void leftTurnReverse(int time){
      digitalWrite(reversePin, LOW);
      digitalWrite(leftPin, LOW);
      delay(time);
      off();
    }

    void rightTurnReverse(int time){
      digitalWrite(reversePin, LOW);
      digitalWrite(rightPin, LOW);
      delay(time);
      off();
    }

    void demoOne(){
      int demoTime = 500;
      forward(demoTime);
      reverse(demoTime);
      left(demoTime);
      right(demoTime);
    }

    void demoTwo(){
      int demoTime = 500;
      rightTurnForward(demoTime);
      leftTurnForward(demoTime);
      rightTurnReverse(demoTime);
      leftTurnReverse(demoTime);
    }

    void off(){
      digitalWrite(9, HIGH);
      digitalWrite(8, HIGH);
      digitalWrite(7, HIGH);
      digitalWrite(6, HIGH);
    }

    void orderControl(int order, int time){
      switch (order){
        // off order
        case 0: off(); break;

        // demo modes
        case 1: demoOne(); order=0; break;
        case 2: demoTwo(); order=0; break;
        case 3: demoOne(); demoTwo(); order=0; break;

        // movement options
        case 11: forward(time); order=0; break;
        case 12: reverse(time); order=0; break;
        case 13: right(time); order=0; break;
        case 14: left(time); order=0; break;

        // complex movement
        case 21: rightTurnForward(time); order=0; break;
        case 22: leftTurnForward(time); order=0; break;
        case 23: rightTurnReverse(time); order=0; break;
        case 24: leftTurnReverse(time); order=0; break;

        // no match...
        default: Serial.print("\nINVALID ORDER!... TURNING OFF!\n");
      }
    }

    void setup() {
      // initialize the digital pins as an output.
      pinMode(rightPin, OUTPUT);
      pinMode(leftPin, OUTPUT);
      pinMode(forwardPin, OUTPUT);
      pinMode(reversePin, OUTPUT);

      Serial.begin(115200);
      Serial.print("\n\nStart...\n");
    }

    void loop() {
      // Turn everything off...
      off();

      // get input
      if (Serial.available() > 0){
        order = Serial.read() - 65;
        Serial.print("I received: ");
        Serial.println(order);
        flag = 1;
      }

      if(flag){
        // complete orders
        orderControl(order,time);
      }
    }


8. Run `python carDriving.py` to test the soldering and drive the car from the keyboard

    # http://thelivingpearl.com/2013/01/04/drive-a-lamborghini-with-your-keyboard/
    import serial
    import pygame
    import os
    from pygame.locals import *

    def clear_screen():
        os.system('clear')

    def getOrder(run):
        for event in pygame.event.get():
            if (event.type == KEYDOWN):
                keyinput = pygame.key.get_pressed()

                # complex orders
                if keyinput[pygame.K_UP] and keyinput[pygame.K_RIGHT]:
                    run[1] = 21
                elif keyinput[pygame.K_UP] and keyinput[pygame.K_LEFT]:
                    run[1] = 22
                elif keyinput[pygame.K_DOWN] and keyinput[pygame.K_RIGHT]:
                    run[1] = 23
                elif keyinput[pygame.K_DOWN] and keyinput[pygame.K_LEFT]:
                    run[1] = 24

                # simple orders
                elif keyinput[pygame.K_UP]:
                    run[1] = 11
                elif keyinput[pygame.K_DOWN]:
                    run[1] = 12
                elif keyinput[pygame.K_RIGHT]:
                    run[1] = 13
                elif keyinput[pygame.K_LEFT]:
                    run[1] = 14
                elif keyinput[pygame.K_1]:
                    run[1] = 1
                elif keyinput[pygame.K_2]:
                    run[1] = 2
                elif keyinput[pygame.K_3]:
                    run[1] = 3

                # exit
                elif keyinput[pygame.K_x] or keyinput[pygame.K_q]:
                    print('exit')
                    run[0] = False
                    run[1] = 0

            elif event.type == pygame.KEYUP:
                # single key
                if (run[1] < 20):
                    run[1] = 0
                # up-right
                elif (run[1] == 21):
                    if event.key == pygame.K_RIGHT:
                        run[1] = 11
                    elif event.key == pygame.K_UP:
                        run[1] = 13
                # up-left
                elif (run[1] == 22):
                    if event.key == pygame.K_LEFT:
                        run[1] = 11
                    elif event.key == pygame.K_UP:
                        run[1] = 14
                # back-right
                elif (run[1] == 23):
                    if event.key == pygame.K_RIGHT:
                        run[1] = 12
                    elif event.key == pygame.K_DOWN:
                        run[1] = 13
                # back-left
                elif (run[1] == 24):
                    if event.key == pygame.K_LEFT:
                        run[1] = 12
                    elif event.key == pygame.K_DOWN:
                        run[1] = 14
        return run

    def main():
        clear_screen()
        print('\nStarting CarControl v.0.3\n')
        ser = serial.Serial('/dev/tty.usbmodem411', 115200, timeout=1)
        pygame.init()
        run = [True, 0]
        previous = -1
        while run[0]:
            run = getOrder(run)
            # debug
            # print('current orders: ' + str(run[1]))
            if (run[1] != previous):
                previous = run[1]
                # send the order as one byte, offset by 65 (works on Python 2 and 3)
                ser.write(bytearray([run[1] + 65]))
                print(run[1])
        ser.close()
        exit('\nGoodbye!\n')

    if __name__ == "__main__":
        main()

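The keyboard script and the Arduino sketch agree on a one-byte protocol: the laptop sends the order plus 65, and the sketch subtracts 65 after `Serial.read()`. Here is a minimal sketch of that round trip. The helper names `encode_order` and `decode_order` are hypothetical and do not appear in either script.

```python
# Hypothetical helpers illustrating the one-byte offset protocol.
OFFSET = 65  # same offset used by the keyboard script and the Arduino sketch


def encode_order(order):
    """Return the single byte the laptop would write to the serial port."""
    return bytes([order + OFFSET])


def decode_order(payload):
    """Mirror of `Serial.read() - 65` on the Arduino side."""
    return payload[0] - OFFSET


for order in (0, 11, 21):
    wire = encode_order(order)
    print(order, wire, decode_order(wire))
```

One side effect of the offset is that order 0 travels as the printable character `A`, so the traffic is readable in a serial monitor.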

## Testing – Capturing image data for training dataset

On the first iteration of the physical build, I mounted a disassembled Logitech C270 webcam and a Raspberry Pi on the car, using a chopped-up coat hanger to hold the camera. I pointed it down so it could see the hood and some of the “road”. The webcam captures video frames of the road ahead at ~24 fps.

The Raspberry Pi sends the captured stream over the wifi network back to my MacBook Pro using a Python server built on basic sockets.

On the MacBook Pro, I run a Python client that connects to the Raspberry Pi using basic sockets. It takes the 320×240 color stream, then downsamples and grayscales the video frames, preprocessing them into a numpy matrix.

In short: stream the video wirelessly, capture it with OpenCV (cv2), slice it into JPEG frames, preprocess and reshape each frame into a numpy array, and feed the array in with the key-press data as its label.
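To make the preprocessing step concrete, here is a rough sketch in plain Python (no OpenCV or numpy, so it runs anywhere). The function names, the tiny fake frame, and the label order in `COMMANDS` are assumptions for illustration, not the code I actually ran.

```python
# Assumed label order, for illustration only.
COMMANDS = ["left", "right", "forward", "reverse"]


def to_gray(pixel):
    """Convert one (r, g, b) pixel to a grey value with BT.601 luminance weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def preprocess(frame, step=2):
    """Grayscale, keep every `step`-th pixel in both directions, and flatten."""
    return [to_gray(px) for row in frame[::step] for px in row[::step]]


def one_hot(command):
    """Encode the pressed key as a one-hot label vector."""
    return [1.0 if name == command else 0.0 for name in COMMANDS]


# A fake 4x4 colour frame stands in for a 320x240 webcam frame.
frame = [[(255, 255, 255)] * 4 for _ in range(4)]
features = preprocess(frame)
sample = (features, one_hot("forward"))
print(len(features), sample[1])
```

Each training sample ends up as a flat feature list paired with a one-hot label, which is the shape the network expects.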

## Testing – Convert 240×240 into greyscale

    # label pixel0 pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8
    # 1 0 0 0 0 0 0 0 0 0
    # 0 0 0 0 0 0 0 0 0 0

    from skimage import color
    from skimage import io

    img = color.rgb2gray(io.imread('001.jpg'))
    img

    x1 = 0
    x2 = 240
    y1 = 0
    y2 = 240
    cropped = img[x1:x2,y1:y2]
    io.imsave("/Users/benchemployee/Desktop/kaggle/selfDriving/cropped.jpg", cropped)


That gives 240 × 240 = 57,600 input neurons.

## Take 2 : Using PiCamera and stream images to Laptop

    __author__ = 'zhengwang'

    import numpy as np
    import cv2
    import serial
    import pygame
    from pygame.locals import *
    import socket


    class CollectTrainingData(object):

        def __init__(self):
            self.server_socket = socket.socket()
            self.server_socket.bind(('192.168.1.72', 8000))
            self.server_socket.listen(0)

            # accept a single connection
            self.connection = self.server_socket.accept()[0].makefile('rb')

            # connect to a serial port
            self.ser = serial.Serial('/dev/cu.usbserial-AM01VDMD', 115200, timeout=1)
            self.send_inst = True

            # create labels: one-hot rows for left, right, forward, reverse
            self.k = np.zeros((4, 4), 'float')
            for i in range(4):
                self.k[i, i] = 1
            self.temp_label = np.zeros((1, 4), 'float')

        def collect_image(self):
            saved_frame = 0
            total_frame = 0

            # collect images for training
            print('Start collecting images...')
            e1 = cv2.getTickCount()
            image_array = np.zeros((1, 38400))
            label_array = np.zeros((1, 4), 'float')

            # stream video frames one by one
            try:
                stream_bytes = b' '
                frame = 1
                while self.send_inst:
                    stream_bytes += self.connection.read(1024)
                    # JPEG start and end markers
                    first = stream_bytes.find(b'\xff\xd8')
                    last = stream_bytes.find(b'\xff\xd9')
                    if first != -1 and last != -1:
                        jpg = stream_bytes[first:last + 2]
                        stream_bytes = stream_bytes[last + 2:]
                        image = cv2.imdecode(np.frombuffer(jpg, dtype=np.uint8), cv2.IMREAD_GRAYSCALE)

                        # select lower half of the image
                        roi = image[120:240, :]

                        # save streamed images
                        cv2.imwrite('training_images/frame{:>05}.jpg'.format(frame), image)

                        # cv2.imshow('roi_image', roi)
                        cv2.imshow('image', image)

                        # reshape the roi image into one row array
                        temp_array = roi.reshape(1, 38400).astype(np.float32)

                        frame += 1
                        total_frame += 1

                        # get input from human driver
                        for event in pygame.event.get():
                            if event.type == KEYDOWN:
                                key_input = pygame.key.get_pressed()

                                # complex orders
                                if key_input[pygame.K_UP] and key_input[pygame.K_RIGHT]:
                                    print("Forward Right")
                                    image_array = np.vstack((image_array, temp_array))
                                    label_array = np.vstack((label_array, self.k[1]))
                                    saved_frame += 1
                                    self.ser.write(bytearray([6]))

                                elif key_input[pygame.K_UP] and key_input[pygame.K_LEFT]:
                                    print("Forward Left")
                                    image_array = np.vstack((image_array, temp_array))
                                    label_array = np.vstack((label_array, self.k[0]))
                                    saved_frame += 1
                                    self.ser.write(bytearray([7]))

                                elif key_input[pygame.K_DOWN] and key_input[pygame.K_RIGHT]:
                                    print("Reverse Right")
                                    self.ser.write(bytearray([8]))

                                elif key_input[pygame.K_DOWN] and key_input[pygame.K_LEFT]:
                                    print("Reverse Left")
                                    self.ser.write(bytearray([9]))

                                # simple orders
                                elif key_input[pygame.K_UP]:
                                    print("Forward")
                                    saved_frame += 1
                                    image_array = np.vstack((image_array, temp_array))
                                    label_array = np.vstack((label_array, self.k[2]))
                                    self.ser.write(bytearray([1]))

                                elif key_input[pygame.K_DOWN]:
                                    print("Reverse")
                                    saved_frame += 1
                                    image_array = np.vstack((image_array, temp_array))
                                    label_array = np.vstack((label_array, self.k[3]))
                                    self.ser.write(bytearray([2]))

                                elif key_input[pygame.K_RIGHT]:
                                    print("Right")
                                    image_array = np.vstack((image_array, temp_array))
                                    label_array = np.vstack((label_array, self.k[1]))
                                    saved_frame += 1
                                    self.ser.write(bytearray([3]))

                                elif key_input[pygame.K_LEFT]:
                                    print("Left")
                                    image_array = np.vstack((image_array, temp_array))
                                    label_array = np.vstack((label_array, self.k[0]))
                                    saved_frame += 1
                                    self.ser.write(bytearray([4]))

                                elif key_input[pygame.K_x] or key_input[pygame.K_q]:
                                    print('exit')
                                    self.send_inst = False
                                    self.ser.write(bytearray([0]))
                                    break

                            elif event.type == pygame.KEYUP:
                                self.ser.write(bytearray([0]))

                # save training images and labels (drop the all-zero first row)
                print("train")
                train = image_array[1:, :]
                print("label")
                train_labels = label_array[1:, :]

                # save training data as a numpy file
                print("np")
                np.savez('training_data_temp/test08.npz', train=train, train_labels=train_labels)

                e2 = cv2.getTickCount()
                # calculate streaming duration
                time0 = (e2 - e1) / cv2.getTickFrequency()
                print('Streaming duration:', time0)

                print(train.shape)
                print(train_labels.shape)
                print('Total frame:', total_frame)
                print('Saved frame:', saved_frame)
                print('Dropped frame', total_frame - saved_frame)

            finally:
                self.connection.close()
                self.server_socket.close()


    if __name__ == '__main__':
        print("Entering main function")
        print("Press q to quit on the video capture pygame area")
        print("Initializing pygame")
        pygame_init = pygame.init()
        print("Initializing Collection of Training Data Object")
        ctd = CollectTrainingData()
        print("Server Sockets setup")
        ctd.collect_image()
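The collection script pulls individual JPEGs out of the socket stream by searching for the JPEG start marker (`\xff\xd8`) and end marker (`\xff\xd9`). Here is a stripped-down sketch of just that step, with no sockets or OpenCV. The function `extract_frames` and the fake chunks are made up for illustration. As a small variation on the script, it looks for the end marker only after the start marker.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def extract_frames(buffer):
    """Return (complete_frames, leftover_bytes) from a byte buffer."""
    frames = []
    while True:
        first = buffer.find(SOI)
        last = buffer.find(EOI, first + 2) if first != -1 else -1
        if first == -1 or last == -1:
            return frames, buffer
        frames.append(buffer[first:last + 2])
        buffer = buffer[last + 2:]


# Two fake "frames" arriving in arbitrary chunks, the second still incomplete.
chunks = [b"\xff\xd8AAA", b"\xff\xd9\xff\xd8BB"]
pending = b""
for chunk in chunks:
    pending += chunk
    frames, pending = extract_frames(pending)
    print(len(frames), pending)
```

Whatever is left in `pending` stays buffered until the next `read()` completes the frame.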

## Take 2 -Load new Arduino Sketch and change PINS

    // assign pin num
    int right_pin = 6;
    int left_pin = 7;
    int forward_pin = 10;
    int reverse_pin = 9;

    // duration for output
    int time = 50;
    // initial command
    int command = 0;

    void setup() {
      pinMode(right_pin, OUTPUT);
      pinMode(left_pin, OUTPUT);
      pinMode(forward_pin, OUTPUT);
      pinMode(reverse_pin, OUTPUT);
      Serial.begin(115200);
    }

    void loop() {
      // receive command
      if (Serial.available() > 0){
        command = Serial.read();
      }
      else{
        reset();
      }
      send_command(command,time);
    }

    void right(int time){
      digitalWrite(right_pin, LOW);
      delay(time);
    }

    void left(int time){
      digitalWrite(left_pin, LOW);
      delay(time);
    }

    void forward(int time){
      digitalWrite(forward_pin, LOW);
      delay(time);
    }

    void reverse(int time){
      digitalWrite(reverse_pin, LOW);
      delay(time);
    }

    void forward_right(int time){
      digitalWrite(forward_pin, LOW);
      digitalWrite(right_pin, LOW);
      delay(time);
    }

    void reverse_right(int time){
      digitalWrite(reverse_pin, LOW);
      digitalWrite(right_pin, LOW);
      delay(time);
    }

    void forward_left(int time){
      digitalWrite(forward_pin, LOW);
      digitalWrite(left_pin, LOW);
      delay(time);
    }

    void reverse_left(int time){
      digitalWrite(reverse_pin, LOW);
      digitalWrite(left_pin, LOW);
      delay(time);
    }

    void reset(){
      digitalWrite(right_pin, HIGH);
      digitalWrite(left_pin, HIGH);
      digitalWrite(forward_pin, HIGH);
      digitalWrite(reverse_pin, HIGH);
    }

    void send_command(int command, int time){
      switch (command){
        // reset command
        case 0: reset(); break;

        // single command
        case 1: forward(time); break;
        case 2: reverse(time); break;
        case 3: right(time); break;
        case 4: left(time); break;

        // combination command
        case 6: forward_right(time); break;
        case 7: forward_left(time); break;
        case 8: reverse_right(time); break;
        case 9: reverse_left(time); break;

        default: Serial.print("Invalid Command\n");
      }
    }


## Take 2 – Stream Data from Pi to Laptop

    """
    Reference:
    PiCamera documentation
    https://picamera.readthedocs.org/en/release-1.10/recipes2.html
    https://github.com/hamuchiwa/AutoRCCar
    """

    import io
    import socket
    import struct
    import time
    import picamera

    # create socket and connect to the laptop
    client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client_socket.connect(('192.168.1.72', 8000))
    connection = client_socket.makefile('wb')

    try:
        with picamera.PiCamera() as camera:
            camera.resolution = (320, 240)  # pi camera resolution
            camera.framerate = 10           # 10 frames/sec
            time.sleep(2)                   # give 2 secs for camera to initialize
            start = time.time()
            stream = io.BytesIO()

            # send jpeg format video stream
            for foo in camera.capture_continuous(stream, 'jpeg', use_video_port = True):
                # the lines between the two struct.pack calls were lost when the
                # gist was embedded; they are restored here from the PiCamera
                # recipe referenced above
                connection.write(struct.pack('<L', stream.tell()))
                connection.flush()
                stream.seek(0)
                connection.write(stream.read())
                if time.time() - start > 600:
                    break
                stream.seek(0)
                stream.truncate()
        connection.write(struct.pack('<L', 0))
    finally:
        connection.close()
        client_socket.close()

## Train Neural Network with train.pkl

I converted the numpy data to a pickle (train.pkl) and used it to train a simple 3-layer neural network in Python: 65536 neurons in the input layer, 1000 neurons in the hidden layer, and 4 output neurons (Forward, None, Left, and Right).
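To get a feel for how big that network is, here is a quick back-of-the-envelope count of its trainable parameters. It assumes a plain fully connected network with one bias per non-input neuron.

```python
sizes = [65536, 1000, 4]  # input, hidden, output, as quoted above

# Every neuron connects to every neuron in the previous layer.
weights = sum(n_in * n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))
biases = sum(sizes[1:])
print(weights, biases, weights + biases)
```

Almost all of the roughly 65.5 million parameters sit between the input and hidden layers, which is why shrinking the input image pays off so quickly.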

## Test trained Neural Network with live camera data…enjoy!

Next Steps

• Deep Learning
• Computer Vision
• Vehicle Dynamics
• Controllers
• Localization
• Mapping (SLAM)
• Sensors & Fusion
• Safety Systems and Ethics

ReportStyleDocumentaton build RC custom

## LIDAR and Deep Learning

LiDAR sensors and software for real-time capture and processing of 3D mapping data and object detection, tracking, and classification. Can be used in self driving cars, security perimeter systems, interior security systems.

https://github.com/dps/nnrccar

https://gopigo.firebaseapp.com/

http://www.danielgm.net/cc/

https://github.com/bigsnarfdude/loam_velodyne

http://www.phoenix-aerial.com/information/lidar-comparison/

http://www.gim-international.com/content/news/9-revolutionary-lidar-survey-projects

http://velodynelidar.com/vlp-16-lite.html

https://www.idaholidar.org/free-lidar-tools/

http://www.technavio.com/blog/top-companies-global-automotive-lidar-sensors-market

https://zhengludwig.wordpress.com/projects/self-driving-rc-car/

## Neural Network Driving in GTAV

http://deepdrive.io/

Register

https://github.com/samjabrahams/tensorflow-on-raspberry-pi

Drive a Lamborghini With Your Keyboard

http://www.acmesystems.it/timelaps_video

## Deep Learning talks at Black Hat 2015

In this presentation they run static analysis on code, transcode it into images, and use a CNN for predictions. The model was trained on 1.6 million examples.

In this presentation the model attempts protocol classification. I'm unsure which model they used; maybe an RNN?

Reviews

## Using Inception v3 Tensorflow for MNIST

Modern object recognition models have millions of parameters and can take weeks to fully train. Transfer learning is a technique that shortcuts a lot of this work by taking a fully-trained model for a set of categories like ImageNet, and retraining from the existing weights for new classes. In this example we’ll be retraining the final layer from scratch, while leaving all the others untouched. For more information on the approach you can see this paper on Decaf.

Though it’s not as good as a full training run, this is surprisingly effective for many applications, and can be run in as little as 75 minutes on a laptop, without requiring a GPU. The data I used is from Kaggle MNIST dataset.

#### Script to convert train.csv to images in python

    import csv
    import numpy
    import cv2

    # train.csv layout: column 0 is the label, columns 1..784 are the pixels
    with open('train.csv') as f:
        reader = csv.reader(f)
        header = next(reader)
        read_data = [row for row in reader]

    print("Working on:")
    data = numpy.array(read_data)
    ro = numpy.zeros((28, 28, 3))
    print(data.shape)

    for image_number in range(data.shape[0]):
        label = read_data[image_number][0]
        # cover all 28 rows and columns (range(0, 27) skipped the last one)
        for i in range(28):
            for j in range(28):
                # skip the label column, then copy the grey value into all three channels
                value = float(data[image_number, 1 + j + i*28])
                ro[i, j, 0] = value
                ro[i, j, 1] = value
                ro[i, j, 2] = value
        cv2.imwrite('image_' + label + '_' + str(image_number) + '_' + '.jpg', ro)


#### Let’s move the data to the proper folders

    mkdir hydrated_images
    cd hydrated_images
    mkdir zero
    mkdir nine
    mkdir eight
    mkdir seven
    mkdir six
    mkdir five
    mkdir four
    mkdir three
    mkdir two
    mkdir one
    mv ../image_4_*.jpg ./four
    mv ../image_3_*.jpg ./three
    mv ../image_2_*.jpg ./two
    mv ../image_1_*.jpg ./one
    mv ../image_0_*.jpg ./zero
    mv ../image_9_*.jpg ./nine
    mv ../image_8_*.jpg ./eight
    mv ../image_7_*.jpg ./seven
    mv ../image_6_*.jpg ./six
    mv ../image_5_*.jpg ./five

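If you'd rather not type twenty shell commands, the same sorting can be sketched in Python. This is an alternative I'm suggesting, not what I ran. The function `sort_images` is hypothetical, and it assumes the `image_<label>_<n>_.jpg` naming produced by the conversion script.

```python
import os
import shutil
import tempfile

NAMES = ["zero", "one", "two", "three", "four",
         "five", "six", "seven", "eight", "nine"]


def sort_images(src_dir, dst_dir):
    """Move image_<label>_<n>_.jpg files into dst_dir/<digit name>/."""
    moved = 0
    for filename in sorted(os.listdir(src_dir)):
        parts = filename.split("_")
        if len(parts) < 3 or parts[0] != "image" or not filename.endswith(".jpg"):
            continue
        if not parts[1].isdigit() or int(parts[1]) > 9:
            continue
        target = os.path.join(dst_dir, NAMES[int(parts[1])])
        os.makedirs(target, exist_ok=True)
        shutil.move(os.path.join(src_dir, filename), os.path.join(target, filename))
        moved += 1
    return moved


# Demo on throwaway files so nothing real is touched.
src = tempfile.mkdtemp()
dst = os.path.join(src, "hydrated_images")
for name in ("image_0_4_.jpg", "image_7_12_.jpg", "notes.txt"):
    open(os.path.join(src, name), "w").close()
print(sort_images(src, dst), sorted(os.listdir(dst)))
```

The retraining script uses the folder names as class labels, so the only thing that matters is that every image lands in the folder named after its digit.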

#### Using the re-trained model to do MNIST prediction

    ⋊> ~/tensorflow on master ◦ bazel build tensorflow/examples/label_image:label_image & \
           bazel-bin/tensorflow/examples/label_image/label_image \
           --graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
           --output_layer=final_result \
           --image=/Users/benchemployee/flower_photos/zero/image_0_4_.jpg
    WARNING: /private/var/tmp/_bazel_vincento/7a4c233cb97ceb0b839d7ecf123daf7f/external/protobuf/WORKSPACE:1: Workspace name in /private/var/tmp/_bazel_vincento/7a4c233cb97ceb0b839d7ecf123daf7f/external/protobuf/WORKSPACE (@__main__) does not match the name given in the repository's definition (@protobuf); this will cause a build error in future versions.
    WARNING: /private/var/tmp/_bazel_vincento/7a4c233cb97ceb0b839d7ecf123daf7f/external/re2/WORKSPACE:1: Workspace name in /private/var/tmp/_bazel_vincento/7a4c233cb97ceb0b839d7ecf123daf7f/external/re2/WORKSPACE (@__main__) does not match the name given in the repository's definition (@re2); this will cause a build error in future versions.
    WARNING: /private/var/tmp/_bazel_vincento/7a4c233cb97ceb0b839d7ecf123daf7f/external/highwayhash/WORKSPACE:1: Workspace name in /private/var/tmp/_bazel_vincento/7a4c233cb97ceb0b839d7ecf123daf7f/external/highwayhash/WORKSPACE (@__main__) does not match the name given in the repository's definition (@highwayhash); this will cause a build error in future versions.
    INFO: Found 1 target...
    Target //tensorflow/examples/label_image:label_image up-to-date:
      bazel-bin/tensorflow/examples/label_image/label_image
    INFO: Elapsed time: 0.114s, Critical Path: 0.00s
    W tensorflow/core/framework/op_def_util.cc:332] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
    I tensorflow/examples/label_image/main.cc:204] zero (5): 0.999788
    I tensorflow/examples/label_image/main.cc:204] six (1): 9.04188e-05
    I tensorflow/examples/label_image/main.cc:204] eight (9): 6.66903e-05
    I tensorflow/examples/label_image/main.cc:204] five (6): 2.26056e-05
    I tensorflow/examples/label_image/main.cc:204] nine (8): 2.20534e-05
    'bazel build tensorflow/examples...' has ended


## Neural Network from scratch in Python

So you want to teach a computer to recognize handwritten digits? You want to code this out in Python? You understand a little about Machine Learning? You wanna build a neural network?

Let’s try and implement a simple 3-layer neural network (NN) from scratch. I won’t get into the math because I suck at math, let alone trying to teach it. I can also point you to moar math resources if you want to read up on the details.

I assume you’re familiar with basic Machine Learning concepts like classification and regularization. Oh, and how optimization techniques like gradient descent work.

So, why not teach you Tensorflow or some other deep learning framework? I found that I learn best when I see the code, and learn the basics of the implementation. I find it helps me with intuition in choosing each part of the model. Of course, there are some AutoML solutions that could get me quicker ways to a baseline, but I still wouldn’t know anything. I’m trying to get out of just running the code like a script kiddie.

## So let’s get started!

For the past few months (thanks Arvin), I have learned to appreciate both Classic Machine Learning (pre-2012) and Deep Learning techniques for modeling Kaggle competition data.

The handwritten digits competition was my first attempt at deep learning, so I think it’s appropriate that it’s your first deep learning example too. I remember one important aha moment: seeing the relationships between the data and the pictures. It helped me to imagine the deep learning concepts visually.

## What does the data look like?

We’re going to use the classic visual recognition challenge data set, called the MNIST data set. Kaggle competitions are awesome because you can self score your solutions and they provide data in simple clean CSV files. If successful, we should have a deep learning solution that is able to classify 25,000 images with the correct label. Let’s look at the CSV data.

Using a Jupyter notebook, let’s dump the data into a numpy matrix, and reshape it back into a picture. Each digit has been normalized to a 28 by 28 matrix.
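Here is a small sketch of that reshape step. To keep it runnable anywhere it uses plain lists and a made-up CSV row instead of numpy and the real `train.csv`. In the notebook, a numpy reshape to (28, 28) does the same job.

```python
import csv
import io

# A made-up row in the Kaggle layout: label first, then 784 pixel values.
fake_pixels = [0] * 784
fake_pixels[28 * 14 + 14] = 255            # one bright pixel in the middle
csv_text = "label," + ",".join("pixel%d" % i for i in range(784)) + "\n"
csv_text += "7," + ",".join(str(p) for p in fake_pixels) + "\n"

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)
row = next(reader)

label = int(row[0])
pixels = [int(value) for value in row[1:]]                  # drop the label column
image = [pixels[r * 28:(r + 1) * 28] for r in range(28)]    # 28 rows of 28

print(label, len(image), len(image[0]), image[14][14])
```

Pixel `k` in the CSV lands at row `k // 28`, column `k % 28` of the picture.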

The goal is to take the training data as an input (handwritten digit), pump it through the deep learning model, and predict if the data is a 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9.

## Architecture of a Simple Neural Network

1. Picking the shape of the neural network. I’m gonna choose a simple NN consisting of three layers:

• First Layer: Input layer (784 neurons)
• Second Layer: Hidden layer (n = 15 neurons)
• Third Layer: Output layer (10 neurons, one per digit)
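As a quick sanity check on those layer sizes, this sketch lists the shape of each weight matrix and bias vector the network will need. It assumes 10 output neurons (one per digit) and uses the same rows-by-columns convention as the `Network` class later in this post.

```python
sizes = [784, 15, 10]  # input, hidden, output

# One weight matrix and one bias vector per non-input layer.
weight_shapes = [(n_out, n_in) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
bias_shapes = [(n_out, 1) for n_out in sizes[1:]]
print(weight_shapes)
print(bias_shapes)
```

Each weight matrix has one row per neuron in its own layer and one column per neuron in the layer before it.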

Here’s a look at the 3-layer network proposed above:

## Basic Structure of the code

    # **************** basic structure **********************

    import random
    import numpy as np

    def sigmoid_function
    def sigmoid_prime

    class Network(object):
        num_layers
        sizes
        biases
        weights

        def feedforward
        def SGD
        def update_mini_batch
        def backprop
        def evaluate
        def cost_derivative


## Data structure to hold our data

2.  Picking the right matrix data structure. Nested python lists? CudaMAT? Python Dict? I’m choosing numpy because we’ll heavily use np.dot, np.reshape, np.random, np.zeros, np.argmax, and np.exp functions that I’m not really interested in implementing from scratch.

## Simulating perceptrons using an Activation Function

3.  Picking the activation function for our hidden layer. The activation function transforms the inputs of the hidden layer into its outputs. Common choices for activation functions are tanh, the sigmoid function, or ReLUs. We’ll use the sigmoid function.

    def sigmoid(z):
        """The sigmoid function."""
        return 1.0/(1.0+np.exp(-z))

    def sigmoid_prime(z):
        """Derivative of the sigmoid function."""
        return sigmoid(z)*(1-sigmoid(z))

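If you don't trust the derivative formula, you can check it numerically. This sketch uses `math.exp` instead of numpy so it is self-contained. The helper `numeric_slope` is mine, added purely as a sanity check.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    return sigmoid(z) * (1 - sigmoid(z))


def numeric_slope(f, z, h=1e-5):
    """Central finite difference, used here only as a sanity check."""
    return (f(z + h) - f(z - h)) / (2 * h)


for z in (-2.0, 0.0, 3.0):
    print(z, round(sigmoid_prime(z), 6), round(numeric_slope(sigmoid, z), 6))
```

The two columns should agree to several decimal places. At `z = 0` the sigmoid is exactly 0.5, so its slope is 0.25, which is the steepest it ever gets.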

## Python Neural Network Object

 In [1]: import numpy as np In [2]: class Network(object): …: def __init__(self, sizes): …: self.num_layers = len(sizes) …: self.sizes = sizes …: self.biases = [np.random.randn(y, 1) for y in sizes[1:]] …: self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])] …: In [3]: neural_network = Network([1,20,10]) In [4]: neural_network.biases Out[4]: [array([[ 0.28482536], [-0.89049548], [ 0.20617518], [ 0.4158359 ], [-0.93796761], [-1.49837658], [-0.71753994], [ 0.1593402 ], [ 0.50075027], [-0.91795465], [ 1.16622691], [ 0.07412679], [ 0.95915247], [ 1.27357302], [-0.18081714], [-0.68107571], [ 0.25295953], [ 0.04032309], [-0.71716707], [-0.46420026]]), array([[ 1.03793023], [ 1.17470112], [ 0.20570392], [-2.21440845], [ 0.58324405], [ 0.4505373 ], [ 0.58999162], [-1.20247126], [-0.79782343], [ 1.04171305]])] In [5]: neural_network.weights Out[5]: [array([[-1.25500579], [-0.11754026], [-0.52644551], [ 0.70982054], [-0.19753958], [-0.30560159], [-0.64869807], [-0.0959351 ], [ 0.00607763], [-1.2932363 ], [ 1.2917588 ], [-0.9246978 ], [-0.66108135], [-1.74086206], [-1.77074381], [-0.82392728], [ 0.48077431], [ 0.69002335], [-1.47798152], [ 0.34058097]]), array([[ -4.66379471e-01, 9.34679152e-01, 1.67660702e-01, -1.12799513e+00, 3.81581590e-01, 1.47740106e+00, 1.44007712e+00, 3.16368089e-01, 4.77270298e-01, -1.17552649e+00, 1.05535962e+00, 4.68589315e-01, 1.76161441e+00, 1.50704930e+00, 4.24731475e-01, -2.00378066e-01, -9.89383268e-01, -1.61688372e+00, 1.54292169e+00, -7.94524978e-01], [ -1.36887372e+00, -4.07464906e-01, -8.18249890e-02, 3.81270961e-01, -2.50761956e-01, 1.97130152e+00, 3.72098140e-01, 7.84729226e-01, -6.42591846e-01, 6.43485388e-01, -1.38265055e+00, 2.43183695e-01, 6.55665636e-01, 5.51403453e-01, 8.12486615e-01, -6.83346668e-01, -8.08290747e-01, -8.68447206e-01, -7.26645512e-01, 2.56793945e+00], [ -2.04202733e+00, -1.05115965e+00, 9.16176896e-01, 5.98457440e-01, 5.23700857e-02, 1.37716760e+00, -2.00414176e+00, -1.01167674e+00, 
-3.27884747e+00, 4.67357213e-01, -1.04381626e+00, 9.51630607e-02, -8.21025097e-01, 1.34168886e+00, -4.69250266e-01, -2.13058599e+00, -7.59169043e-01, 4.41636737e-01, 9.70997171e-01, 3.65910642e-01], [ 2.96224393e-01, -8.51502886e-01, -3.63581410e-01, 1.40439562e+00, -2.14400305e-01, 4.79910147e-01, -2.81209770e-01, 4.35715562e-01, 1.33283734e-01, 2.14274492e+00, -4.07300222e-01, 1.50805305e-01, 1.13887757e+00, -7.75897808e-02, -8.99781261e-01, 4.53669508e-01, -2.23501031e+00, 5.67555554e-01, -6.21195162e-01, 2.32840516e-01], [ 7.69594784e-01, 6.67373451e-01, -5.19098158e-01, -9.90597228e-01, -9.02242307e-01, 1.60027781e+00, -1.54443880e+00, 2.12182678e+00, -1.68228949e-01, 7.03017020e-01, -1.79920919e-01, 1.45734404e+00, 5.28024487e-01, -8.36041803e-01, 8.38291619e-01, -3.02458855e-01, -9.72754311e-01, -2.23379185e-02, 6.29819167e-01, 4.71624228e-01], [ -2.73827049e+00, -1.23137601e+00, -5.13667202e-01, 1.20490181e+00, 2.41071636e-01, -1.02243174e+00, 1.24807613e-01, -1.38978132e+00, -9.83408159e-01, -7.95698065e-01, 1.37490138e+00, -1.61008374e+00, 2.50347152e+00, -3.12140404e-01, 8.86036952e-01, 1.26754490e+00, 8.30173086e-01, -5.42252782e-01, 1.60154518e+00, -3.01922053e-01], [ 2.47038125e-01, -5.77586943e-01, -3.53221925e-04, 8.26414095e-01, -4.54193733e-01, 5.83379656e-01, 7.09285965e-01, 1.71015373e+00, -5.30156231e-01, 4.41905982e-01, 3.65858862e-01, 6.28863024e-01, -4.51148215e-01, -4.96840730e-01, -8.88860502e-01, -8.54983751e-01, -5.39272835e-01, 1.80314187e-01, -1.63315792e+00, -1.18592337e-01], [ -1.45691956e+00, 5.78914213e-01, 1.42205135e-01, -3.99161926e-01, 1.99981936e-01, -1.99242627e-02, -6.80270027e-01, 1.14376227e+00, 1.13379158e+00, -1.62605085e+00, 3.02440462e-02, 8.12623415e-01, -6.35188704e-01, 2.80390889e-01, -3.25566694e-01, 3.78058061e-01, -5.91146697e-01, -6.16020138e-01, 8.59975010e-01, 1.25717657e+00], [ -6.17783563e-01, -1.11271300e-01, -6.95523468e-01, 1.07761134e+00, 3.06866575e-01, -1.00667752e+00, -6.99460531e-02, 8.96042222e-01, 
-6.70529698e-01, 4.67376227e-01, 9.81872751e-01, 1.33395172e+00, -5.89545746e-01, -1.27067551e+00, 1.17194488e+00, -5.30943382e-01, 9.75559426e-01, 4.69853098e-01, -7.55707525e-01, 2.27781822e-01], [ 1.13163724e+00, 1.78187712e+00, -6.62599227e-01, -2.68871029e-02, -5.87937695e-02, 1.44750129e-01, -5.39669257e-01, 1.85152838e+00, 1.54515545e+00, -5.96931160e-01, 1.20361780e-01, 6.12899657e-01, 5.40445280e-01, 5.77774584e-01, -1.83077535e-01, -2.52571377e-01, -1.13763002e-01, -1.36721348e+00, 6.12752593e-01, -1.82517001e+00]])]


## Feed Forward Function

a.k.a The Forward Pass

The feed forward function passes the input through each layer's weight matrix and bias in turn and returns the resulting output activations.

    def feedforward(self, a):
        """Return the output of the network if ``a`` is input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a)+b)
        return a

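To see the forward pass without any numpy magic, here is the same loop written with plain lists. The tiny 2-3-1 network and its all-zero parameters are made up so the result is easy to predict: every neuron computes `sigmoid(0)`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def feedforward(a, biases, weights):
    """Plain-list version of a = sigmoid(w . a + b), applied layer by layer."""
    for b, w in zip(biases, weights):
        a = [sigmoid(sum(w_jk * a_k for w_jk, a_k in zip(row, a)) + b_j)
             for row, b_j in zip(w, b)]
    return a


# Hypothetical 2-3-1 network with all-zero parameters.
weights = [[[0.0, 0.0]] * 3, [[0.0, 0.0, 0.0]]]
biases = [[0.0] * 3, [0.0]]
print(feedforward([0.7, -1.2], biases, weights))  # prints [0.5]
```

With real (random) weights, the output is what the training loop compares against the label.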

## Stochastic Gradient Descent function (SGD)

    def SGD(self, training_data, epochs, mini_batch_size, eta, test_data=None):
        """Train the neural network using mini-batch stochastic
        gradient descent. The ``training_data`` is a list of tuples
        ``(x, y)`` representing the training inputs and the desired
        outputs. The other non-optional parameters are
        self-explanatory. If ``test_data`` is provided then the
        network will be evaluated against the test data after each
        epoch, and partial progress printed out. This is useful for
        tracking progress, but slows things down substantially."""
        if test_data:
            n_test = len(test_data)
        n = len(training_data)
        for j in range(epochs):
            # reshuffle every epoch so the mini-batches differ
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in range(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch, eta)
            if test_data:
                print("Epoch {0}: {1} / {2}".format(
                    j, self.evaluate(test_data), n_test))
            else:
                print("Epoch {0} complete".format(j))


## Update Mini Batch Function

Mini-batch gradient descent can work a bit faster than stochastic gradient descent. Batch gradient descent uses all m training examples for every parameter update, whereas stochastic gradient descent uses a single example per update. Mini-batch gradient descent sits in between: each update uses b examples, where b is a parameter called the “mini batch size”.

    def update_mini_batch(self, mini_batch, eta):
        """Update the network's weights and biases by applying
        gradient descent using backpropagation to a single mini batch.
        The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
        is the learning rate."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.weights = [w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]
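
The update rule is easier to follow on a model with a single weight. The sketch below is an illustration only: it swaps the network for a made-up scalar model `y_hat = w * x` with squared-error cost, but keeps the same pattern of summing per-example gradients and stepping by `eta / len(mini_batch)`.

```python
def mini_batch_step(w, mini_batch, eta):
    """One averaged gradient step for the toy model y_hat = w * x.

    The per-example cost is (w * x - y) ** 2, so its gradient with
    respect to w is 2 * (w * x - y) * x.
    """
    grad_sum = 0.0
    for x, y in mini_batch:
        grad_sum += 2.0 * (w * x - y) * x
    return w - (eta / len(mini_batch)) * grad_sum


# Two made-up examples drawn from y = 2 * x; start from w = 0.
w = mini_batch_step(0.0, [(1.0, 2.0), (2.0, 4.0)], eta=0.1)
print(w)  # moves from 0.0 toward the true slope of 2.0
```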


## Back Prop Function

a.k.a The Backwards Pass

Our goal with back propagation is to update each of the weights in the network so that they cause the actual output to be closer to the target output, thereby minimizing the error for each output neuron and the network as a whole. Back prop does not by itself prevent overfitting; it is the method for computing the gradient of the cost with respect to every weight and bias, which gradient descent then uses to update them.

    def backprop(self, x, y):
        """Return a tuple ``(nabla_b, nabla_w)`` representing the
        gradient for the cost function C_x.  ``nabla_b`` and
        ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
        to ``self.biases`` and ``self.weights``."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # feedforward
        activation = x
        activations = [x] # list to store all the activations, layer by layer
        zs = [] # list to store all the z vectors, layer by layer
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, activation)+b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        # backward pass
        delta = self.cost_derivative(activations[-1], y) * \
            sigmoid_prime(zs[-1])
        nabla_b[-1] = delta
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())
        # Note that the variable l in the loop below is used a little
        # differently to the notation in Chapter 2 of the book.  Here,
        # l = 1 means the last layer of neurons, l = 2 is the
        # second-last layer, and so on.  It's a renumbering of the
        # scheme in the book, used here to take advantage of the fact
        # that Python can use negative indices in lists.
        for l in range(2, self.num_layers):
            z = zs[-l]
            sp = sigmoid_prime(z)
            delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
        return (nabla_b, nabla_w)
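
A common way to gain confidence in a backprop implementation is to compare one of its gradient entries against a finite-difference estimate. The sketch below rewrites the method above as a free function so it runs on its own; the `[2, 3, 1]` layer sizes, the random seed, the input, and the step size `eps` are all arbitrary choices for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_prime(z):
    return sigmoid(z) * (1 - sigmoid(z))


def cost(weights, biases, x, y):
    """Quadratic cost 0.5 * ||a - y||^2 for a single example."""
    a = x
    for b, w in zip(biases, weights):
        a = sigmoid(np.dot(w, a) + b)
    return 0.5 * np.sum((a - y) ** 2)


def backprop(weights, biases, x, y):
    """Same algorithm as the method above, written as a free function."""
    activation, activations, zs = x, [x], []
    for b, w in zip(biases, weights):
        z = np.dot(w, activation) + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)
    nabla_b = [None] * len(biases)
    nabla_w = [None] * len(weights)
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
    nabla_b[-1] = delta
    nabla_w[-1] = np.dot(delta, activations[-2].T)
    for l in range(2, len(weights) + 1):
        delta = np.dot(weights[-l + 1].T, delta) * sigmoid_prime(zs[-l])
        nabla_b[-l] = delta
        nabla_w[-l] = np.dot(delta, activations[-l - 1].T)
    return nabla_b, nabla_w


# Arbitrary small network and a single made-up training example.
rng = np.random.RandomState(0)
sizes = [2, 3, 1]
biases = [rng.randn(n, 1) for n in sizes[1:]]
weights = [rng.randn(n, m) for m, n in zip(sizes[:-1], sizes[1:])]
x, y = rng.randn(2, 1), np.array([[1.0]])

_, nabla_w = backprop(weights, biases, x, y)

# Central difference on one weight: nudge it up and down by eps.
eps = 1e-6
w_plus = [w.copy() for w in weights]
w_minus = [w.copy() for w in weights]
w_plus[0][0, 0] += eps
w_minus[0][0, 0] -= eps
numeric = (cost(w_plus, biases, x, y) - cost(w_minus, biases, x, y)) / (2 * eps)
print(numeric, nabla_w[0][0, 0])  # the two values should agree closely
```

If the two numbers disagree by more than rounding error, the usual suspects are an index slip in the backward loop or a missing transpose.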


## Cost Derivative Function

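In gradient descent, you follow the negative of the gradient toward the point where the cost is a minimum. When someone talks about gradient descent in a machine learning context, the cost function is usually implied: it is the function to which the gradient descent algorithm is applied. The function here returns the derivative of the cost with respect to the output activations; the form `output_activations - y` is the derivative of the quadratic cost, half the squared distance between the output and the target.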

    def cost_derivative(self, output_activations, y):
        r"""Return the vector of partial derivatives \partial C_x /
        \partial a for the output activations."""
        return (output_activations-y)
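
As a quick sanity check, the derivative `a - y` can be compared with a finite-difference estimate of the quadratic cost. This is a sketch under the assumption that the cost is `0.5 * sum((a - y) ** 2)`; the activation and target values are made up.

```python
import numpy as np


def quadratic_cost(a, y):
    """Half the squared distance between output and target."""
    return 0.5 * np.sum((a - y) ** 2)


def cost_derivative(a, y):
    """Partial derivatives of the quadratic cost with respect to a."""
    return a - y


a = np.array([[0.8], [0.1]])  # made-up output activations
y = np.array([[1.0], [0.0]])  # made-up target

# Central difference, one output neuron at a time.
eps = 1e-6
numeric = np.zeros_like(a)
for i in range(a.shape[0]):
    step = np.zeros_like(a)
    step[i, 0] = eps
    numeric[i, 0] = (quadratic_cost(a + step, y)
                     - quadratic_cost(a - step, y)) / (2 * eps)

print(np.allclose(numeric, cost_derivative(a, y)))  # True
```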


## Putting it all together – Network.py

    """
    network.py
    ~~~~~~~~~~

    A module to implement the stochastic gradient descent learning
    algorithm for a feedforward neural network.  Gradients are calculated
    using backpropagation.  Note that I have focused on making the code
    simple, easily readable, and easily modifiable.  It is not optimized,
    and omits many desirable features.
    """

    #### Libraries
    # Standard library
    import random

    # Third-party libraries
    import numpy as np

    class Network(object):

        def __init__(self, sizes):
            """The list ``sizes`` contains the number of neurons in the
            respective layers of the network.  For example, if the list
            was [2, 3, 1] then it would be a three-layer network, with the
            first layer containing 2 neurons, the second layer 3 neurons,
            and the third layer 1 neuron.  The biases and weights for the
            network are initialized randomly, using a Gaussian
            distribution with mean 0, and variance 1.  Note that the first
            layer is assumed to be an input layer, and by convention we
            won't set any biases for those neurons, since biases are only
            ever used in computing the outputs from later layers."""
            self.num_layers = len(sizes)
            self.sizes = sizes
            self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
            self.weights = [np.random.randn(y, x)
                            for x, y in zip(sizes[:-1], sizes[1:])]

        def feedforward(self, a):
            """Return the output of the network if ``a`` is input."""
            for b, w in zip(self.biases, self.weights):
                a = sigmoid(np.dot(w, a)+b)
            return a

        def SGD(self, training_data, epochs, mini_batch_size, eta,
                test_data=None):
            """Train the neural network using mini-batch stochastic
            gradient descent.  The ``training_data`` is a list of tuples
            ``(x, y)`` representing the training inputs and the desired
            outputs.  The other non-optional parameters are
            self-explanatory.  If ``test_data`` is provided then the
            network will be evaluated against the test data after each
            epoch, and partial progress printed out.  This is useful for
            tracking progress, but slows things down substantially."""
            if test_data: n_test = len(test_data)
            n = len(training_data)
            for j in range(epochs):
                random.shuffle(training_data)
                mini_batches = [
                    training_data[k:k+mini_batch_size]
                    for k in range(0, n, mini_batch_size)]
                for mini_batch in mini_batches:
                    self.update_mini_batch(mini_batch, eta)
                if test_data:
                    print("Epoch {0}: {1} / {2}".format(
                        j, self.evaluate(test_data), n_test))
                else:
                    print("Epoch {0} complete".format(j))

        def update_mini_batch(self, mini_batch, eta):
            """Update the network's weights and biases by applying
            gradient descent using backpropagation to a single mini batch.
            The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
            is the learning rate."""
            nabla_b = [np.zeros(b.shape) for b in self.biases]
            nabla_w = [np.zeros(w.shape) for w in self.weights]
            for x, y in mini_batch:
                delta_nabla_b, delta_nabla_w = self.backprop(x, y)
                nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
                nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
            self.weights = [w-(eta/len(mini_batch))*nw
                            for w, nw in zip(self.weights, nabla_w)]
            self.biases = [b-(eta/len(mini_batch))*nb
                           for b, nb in zip(self.biases, nabla_b)]

        def backprop(self, x, y):
            """Return a tuple ``(nabla_b, nabla_w)`` representing the
            gradient for the cost function C_x.  ``nabla_b`` and
            ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
            to ``self.biases`` and ``self.weights``."""
            nabla_b = [np.zeros(b.shape) for b in self.biases]
            nabla_w = [np.zeros(w.shape) for w in self.weights]
            # feedforward
            activation = x
            activations = [x] # list to store all the activations, layer by layer
            zs = [] # list to store all the z vectors, layer by layer
            for b, w in zip(self.biases, self.weights):
                z = np.dot(w, activation)+b
                zs.append(z)
                activation = sigmoid(z)
                activations.append(activation)
            # backward pass
            delta = self.cost_derivative(activations[-1], y) * \
                sigmoid_prime(zs[-1])
            nabla_b[-1] = delta
            nabla_w[-1] = np.dot(delta, activations[-2].transpose())
            # Note that the variable l in the loop below is used a little
            # differently to the notation in Chapter 2 of the book.  Here,
            # l = 1 means the last layer of neurons, l = 2 is the
            # second-last layer, and so on.  It's a renumbering of the
            # scheme in the book, used here to take advantage of the fact
            # that Python can use negative indices in lists.
            for l in range(2, self.num_layers):
                z = zs[-l]
                sp = sigmoid_prime(z)
                delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
                nabla_b[-l] = delta
                nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
            return (nabla_b, nabla_w)

        def evaluate(self, test_data):
            """Return the number of test inputs for which the neural
            network outputs the correct result. Note that the neural
            network's output is assumed to be the index of whichever
            neuron in the final layer has the highest activation."""
            test_results = [(np.argmax(self.feedforward(x)), y)
                            for (x, y) in test_data]
            return sum(int(x == y) for (x, y) in test_results)

        def cost_derivative(self, output_activations, y):
            r"""Return the vector of partial derivatives \partial C_x /
            \partial a for the output activations."""
            return (output_activations-y)

    #### Miscellaneous functions
    def sigmoid(z):
        """The sigmoid function."""
        return 1.0/(1.0+np.exp(-z))

    def sigmoid_prime(z):
        """Derivative of the sigmoid function."""
        return sigmoid(z)*(1-sigmoid(z))


    """
    sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl
    """
    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function

    import shutil

    import numpy as np
    from flask import Flask, abort, jsonify, request
    from sklearn import datasets, metrics, cross_validation
    from tensorflow.contrib import skflow

    #import cPickle as pickle
    # basic toy nn pickled loader
    #pkl_file = open('batch_model_2016_04_23.pkl', 'rb')
    #latest_neural_network = pickle.load(pkl_file)
    #pkl_file.close()

    iris = datasets.load_iris()
    X_train, X_test, y_train, y_test = cross_validation.train_test_split(
        iris.data, iris.target, test_size=0.2, random_state=42)

    # trained batch offline saved skflow model (latest parameters and learned variables)
    #classifier.save('/home/ubuntu/scratch/skflow_batch/batch_model_2016_04_23')

    # restore skflow model from batch run 2016-04-23
    new_classifier = skflow.TensorFlowEstimator.restore(
        '/home/ubuntu/scratch/skflow_batch/batch_model_2016_04_23')

    # check model load with test data
    score = metrics.accuracy_score(y_test, new_classifier.predict(X_test))
    print('Accuracy: {0:f}'.format(score))

    app = Flask(__name__)

    @app.route('/api', methods=['POST'])
    def make_predict():
        # incoming data converted from json
        data = request.get_json(force=True)
        # shove into array
        predict_request = [data['sl'], data['sw'], data['pl'], data['pw']]
        predict_request = np.array([predict_request])
        # np array passed to toy neural network
        # https://gist.github.com/bigsnarfdude/57ff7d6095f7ee83d4195d1fed26388b
        y_hat = new_classifier.predict(predict_request)
        output = str(y_hat[0])
        # convert output to json
        return jsonify(results=output)

    if __name__ == '__main__':
        app.run(port = 11111, debug = True)

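
For completeness, here is a sketch of what a client request to the `/api` route might look like, using only the standard library. It assumes the app is running locally on port 11111 as in the `app.run` call, and that `sl`, `sw`, `pl`, and `pw` stand for the four iris measurements (sepal length, sepal width, petal length, petal width). The helper name `build_predict_request` is made up for this example.

```python
import json
import urllib.request

# Assumption: the Flask app is running on this machine, port 11111.
API_URL = "http://127.0.0.1:11111/api"


def build_predict_request(sl, sw, pl, pw, url=API_URL):
    """Build (but do not send) a JSON POST request for the /api route."""
    payload = {"sl": sl, "sw": sw, "pl": pl, "pw": pw}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_predict_request(5.1, 3.5, 1.4, 0.2)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it against a running server:
#   with urllib.request.urlopen(req) as resp:
#       print(json.loads(resp.read().decode("utf-8")))
```

The response body should be a JSON object with a single `results` key, since the route returns `jsonify(results=output)`.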

## When Should You Perform a Security Audit?

You should audit your security configuration in the following situations:

• On a periodic basis. You should perform the steps described in this document at regular intervals as a best practice for security.
• If there are changes in your organization, such as people leaving.
• If you have stopped using one or more individual AWS services. This is important for removing permissions that users in your account no longer need.
• If you’ve added or removed software in your accounts, such as applications on Amazon EC2 instances, AWS OpsWorks stacks, AWS CloudFormation templates, etc.
• If you ever suspect that an unauthorized person might have accessed your account.

## General Guidelines for Auditing

• Be thorough. Look at all aspects of your security configuration, including those you might not use regularly.
• Don’t assume. If you are unfamiliar with some aspect of your security configuration (for example, the reasoning behind a particular policy or the existence of a role), investigate the business need until you are satisfied.
• Keep things simple. To make auditing (and management) easier, use IAM groups, consistent naming schemes, and straightforward policies.

## Review Your AWS Account Credentials

Take these steps when you audit your AWS account credentials:

1. If you’re not using the root access keys for your account, remove them. We strongly recommend that you do not use root access keys for everyday work with AWS, and that instead you create IAM users.
2. If you do need to keep the access keys for your account, rotate them regularly.

Take these steps when you audit your existing IAM users:

1. Delete users that are not active.
2. Remove users from groups that they don’t need to be a part of.
3. Review the policies attached to the groups the user is in. See Tips for Reviewing IAM Policies.
4. Delete security credentials that the user doesn’t need or that might have been exposed. For example, an IAM user that is used for an application does not need a password (which is necessary only to sign in to AWS websites). Similarly, if a user does not use access keys, there’s no reason for the user to have one. For more information, see Managing Passwords for IAM Users and Managing Access Keys for IAM Users in the IAM User Guide.

You can generate and download a credential report that lists all IAM users in your account and the status of their various credentials, including passwords, access keys, and MFA devices. For passwords and access keys, the credential report shows how recently the password or access key has been used. Credentials that have not been used recently might be good candidates for removal. For more information, see Getting Credential Reports for your AWS Account in the IAM User Guide.

5. Rotate (change) user security credentials periodically, or immediately if you ever share them with an unauthorized person. For more information, see Managing Passwords for IAM Users and Managing Access Keys for IAM Users in the IAM User Guide.
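
Once a credential report has been downloaded as CSV, finding credentials that have not been used recently can be scripted. The sketch below works on a made-up sample with only a few of the report's columns; the 90-day threshold and the helper names are assumptions for illustration, not AWS recommendations.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Made-up sample rows; a real report has more columns than shown here.
SAMPLE_REPORT = """user,password_enabled,password_last_used,access_key_1_active,access_key_1_last_used_date
alice,true,2024-05-01T10:00:00+00:00,true,2024-05-20T08:30:00+00:00
build-bot,false,N/A,true,2023-01-15T12:00:00+00:00
carol,true,no_information,false,N/A
"""


def parse_time(value):
    """Return an aware datetime, or None for placeholders such as N/A."""
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


def stale_credentials(report_csv, now, max_age_days=90):
    """List (user, credential) pairs that are enabled but unused since the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    findings = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        checks = [
            ("password", row["password_enabled"], row["password_last_used"]),
            ("access_key_1", row["access_key_1_active"],
             row["access_key_1_last_used_date"]),
        ]
        for name, enabled, last_used in checks:
            if enabled != "true":
                continue  # disabled credentials are not candidates here
            used_at = parse_time(last_used)
            if used_at is None or used_at < cutoff:
                findings.append((row["user"], name))
    return findings


# A fixed "now" keeps the example repeatable.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(stale_credentials(SAMPLE_REPORT, now))
# [('build-bot', 'access_key_1'), ('carol', 'password')]
```

Each finding is only a candidate for review; whether to delete or rotate the credential still depends on the business need.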

Take these steps when you audit your IAM groups:

1. Delete unused groups.
2. Review users in each group and remove users who don’t belong. See Review Your IAM Users earlier.
3. Review the policies attached to the group. See Tips for Reviewing IAM Policies.

Take these steps when you audit your IAM roles:

1. Delete roles that are not in use.
2. Review the role’s trust policy. Make sure that you know who the principal is and that you understand why that account or user needs to be able to assume the role.
3. Review the access policy for the role to be sure that it grants suitable permissions to whoever assumes the role—see Tips for Reviewing IAM Policies.

## Review Your IAM Providers for SAML and OpenID Connect (OIDC)

If you have created an IAM entity for establishing trust with a SAML or OIDC identity provider, take these steps:

1. Delete unused providers.
2. Download and review the AWS metadata documents for each SAML provider and make sure the documents reflect your current business needs. Alternatively, get the latest metadata documents from the SAML IdPs that you want to establish trust with and update the provider in IAM.

If you have created a mobile app that makes requests to AWS, take these steps:

1. Make sure that the mobile app does not contain embedded access keys, even if they are in encrypted storage.
2. Get temporary credentials for the app by using APIs that are designed for that purpose. We recommend that you use Amazon Cognito to manage user identity in your app. This service lets you authenticate users using Login with Amazon, Facebook, Google, or any OpenID Connect (OIDC)–compatible identity provider. You can then use the Amazon Cognito credentials provider to manage credentials that your app uses to make requests to AWS.

If your mobile app doesn’t support authentication using Login with Amazon, Facebook, Google, or any other OIDC-compatible identity provider, you can create a proxy server that can dispense temporary credentials to your app.

## Review Your Amazon EC2 Security Configuration

Take the following steps for each AWS region:

1. Delete Amazon EC2 key pairs that are unused or that might be known to people outside your organization.
2. Review your Amazon EC2 security groups:
    • Remove security groups that no longer meet your needs.
    • Remove rules from security groups that no longer meet your needs. Make sure you know why the ports, protocols, and IP address ranges they permit have been allowed.
3. Terminate instances that aren’t serving a business need or that might have been started by someone outside your organization for unapproved purposes. Remember that if an instance is started with a role, applications that run on that instance can access AWS resources using the permissions that are granted by that role.
4. Cancel spot instance requests that aren’t serving a business need or that might have been made by someone outside your organization.
5. Review your Auto Scaling groups and configurations. Shut down any that no longer meet your needs or that might have been configured by someone outside your organization.
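
Part of the security group review can be automated by flagging inbound rules that are open to any address. The sketch below runs on a hand-written sample whose shape is assumed to mirror the `SecurityGroups` list returned by the EC2 DescribeSecurityGroups API; the group IDs and rules are made up.

```python
# Made-up sample, shaped like a DescribeSecurityGroups "SecurityGroups" list.
SAMPLE_GROUPS = [
    {"GroupId": "sg-web", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]},
    ]},
    {"GroupId": "sg-db", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ]},
]


def open_ingress_rules(security_groups):
    """Return (group id, protocol, from port, to port) for rules open to 0.0.0.0/0."""
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            for ip_range in perm.get("IpRanges", []):
                if ip_range.get("CidrIp") == "0.0.0.0/0":
                    findings.append((group["GroupId"],
                                     perm.get("IpProtocol"),
                                     perm.get("FromPort"),
                                     perm.get("ToPort")))
    return findings


for finding in open_ingress_rules(SAMPLE_GROUPS):
    print(finding)
# ('sg-web', 'tcp', 443, 443)
# ('sg-db', 'tcp', 5432, 5432)
```

An open HTTPS port on a web server may be intentional, while an open database port usually deserves a closer look; the script only surfaces the rules for a human to judge.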

## Review AWS Policies in Other Services

Review the permissions for services that use resource-based policies or that support other security mechanisms. In each case, make sure that only users and roles with a current business need have access to the service’s resources, and that the permissions granted on the resources are the fewest necessary to meet your business needs.

## Monitor Activity in Your AWS Account

Follow these guidelines for monitoring AWS activity:

• Turn on AWS CloudTrail in each account and use it in each supported region.
• Periodically examine CloudTrail log files. (CloudTrail has a number of partners who provide tools for reading and analyzing log files.)
• Enable Amazon S3 bucket logging to monitor requests made to each bucket.
• If you believe there has been unauthorized use of your account, pay particular attention to temporary credentials that have been issued. If temporary credentials have been issued that you don’t recognize, disable their permissions.
• Enable billing alerts in each account and set a cost threshold that lets you know if your charges exceed your normal usage.

## Tips for Reviewing IAM Policies

Policies are powerful and subtle, so it’s important to study and understand the permissions that are granted by each policy. Use the following guidelines when reviewing policies:

• As a best practice, attach policies to groups instead of to individual users. If an individual user has a policy, make sure you understand why that user needs the policy.
• Make sure that IAM users, groups, and roles have only the permissions that they need.
• Use the IAM Policy Simulator to test policies that are attached to users or groups.
• Remember that a user’s permissions are the result of all applicable policies—user policies, group policies, and resource-based policies (on Amazon S3 buckets, Amazon SQS queues, Amazon SNS topics, and AWS KMS keys). It’s important to examine all the policies that apply to a user and to understand the complete set of permissions granted to an individual user.
• Be aware that allowing a user to create an IAM user, group, role, or policy and attach a policy to the principal entity is effectively granting that user all permissions to all resources in your account. That is, users who are allowed to create policies and attach them to a user, group, or role can grant themselves any permissions. In general, do not grant IAM permissions to users or roles whom you do not trust with full access to the resources in your account. The following list contains IAM permissions that you should review closely:
    • iam:PutGroupPolicy
    • iam:PutRolePolicy
    • iam:PutUserPolicy
    • iam:CreatePolicy
    • iam:CreatePolicyVersion
    • iam:AttachGroupPolicy
    • iam:AttachRolePolicy
    • iam:AttachUserPolicy
• Make sure policies don’t grant permissions for services that you don’t use. For example, if you use AWS managed policies, make sure the AWS managed policies that are in use in your account are for services that you actually use. To find out which AWS managed policies are in use in your account, use the IAM GetAccountAuthorizationDetails API (AWS CLI command: aws iam get-account-authorization-details).
• If the policy grants a user permission to launch an Amazon EC2 instance, it might also allow the iam:PassRole action, but if so it should explicitly list the roles that the user is allowed to pass to the Amazon EC2 instance.
• Closely examine any values for the Action or Resource element that include *. It’s a best practice to grant Allow access to only the individual actions and resources that users need. However, the following are reasons that it might be suitable to use * in a policy:
    • The policy is designed to grant administrative-level privileges.
    • The wildcard character is used for a set of similar actions (for example, Describe*) as a convenience, and you are comfortable with the complete list of actions that are referenced in this way.
    • The wildcard character is used to indicate a class of resources or a resource path (e.g., arn:aws:iam::account-id:users/division_abc/*), and you are comfortable granting access to all of the resources in that class or path.
    • A service action does not support resource-level permissions, and the only choice for a resource is *.
• Examine policy names to make sure they reflect the policy’s function. For example, although a policy might have a name that includes “read only,” the policy might actually grant write or change permissions.
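
Some of these checks are mechanical enough to script as a first pass. The sketch below scans one policy document for wildcards in Action or Resource and for the IAM permissions listed above. The sample policy and the helper names are made up, and the scan deliberately ignores Deny statements, NotAction, and conditions, so it complements a manual review without replacing it.

```python
import json

# The IAM permissions called out above, lowercased for case-insensitive matching.
SENSITIVE_ACTIONS = {
    "iam:putgrouppolicy", "iam:putrolepolicy", "iam:putuserpolicy",
    "iam:createpolicy", "iam:createpolicyversion",
    "iam:attachgrouppolicy", "iam:attachrolepolicy", "iam:attachuserpolicy",
}


def as_list(value):
    """Policy elements may be a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def review_policy(policy):
    """Return (kind, value) findings for Allow statements worth a closer look."""
    findings = []
    for stmt in as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        for action in as_list(stmt.get("Action")):
            if "*" in action:
                findings.append(("wildcard action", action))
            elif action.lower() in SENSITIVE_ACTIONS:
                findings.append(("sensitive IAM action", action))
        for resource in as_list(stmt.get("Resource")):
            if "*" in resource:
                findings.append(("wildcard resource", resource))
    return findings


# Made-up policy document for illustration.
policy = json.loads("""{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow",
     "Action": ["ec2:Describe*", "iam:AttachUserPolicy"],
     "Resource": "*"},
    {"Effect": "Allow",
     "Action": "s3:GetObject",
     "Resource": "arn:aws:s3:::example-bucket/reports/*"}
  ]
}""")

for kind, value in review_policy(policy):
    print(kind, "->", value)
```

Every finding still needs judgment: `ec2:Describe*` may be an acceptable convenience, while `iam:AttachUserPolicy` on all resources effectively grants full access.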