BigSnarf blog

Infosec FTW

Building your first neural network self driving car in Python

 

1. Get RC Car

2. Learn to drive it

3. Take the car apart to examine the onboard controller and the wireless remote

4. Use a soldering iron and multimeter to identify the positive and negative leads and which circuits fire for each control

Testing – Link Mac to Arduino to Wireless Controller

5. Get an Arduino board and a USB cable

6. Install software and load Arduino program onto board

7. Install pygame and pyserial

8. Run python carDriving.py to test the soldered connections and drive the car from the keyboard (a minimal sketch of this kind of script follows)
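
As a rough idea of what a keyboard-driving script can look like, here is a minimal sketch using pygame and pyserial. The serial port name and the single-byte command protocol are assumptions for illustration, not the actual carDriving.py:

# Minimal keyboard-driving sketch (hypothetical port name and command bytes).
import pygame
import serial

# Adjust the port to match your Arduino (e.g. /dev/ttyACM0 on Linux) -- assumption.
ser = serial.Serial('/dev/tty.usbmodem1411', 9600)

COMMANDS = {                      # single-byte commands the Arduino sketch would interpret -- assumption
    pygame.K_UP: b'F',
    pygame.K_DOWN: b'B',
    pygame.K_LEFT: b'L',
    pygame.K_RIGHT: b'R',
}

pygame.init()
pygame.display.set_mode((200, 200))   # a window is needed to receive key events

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        elif event.type == pygame.KEYDOWN and event.key in COMMANDS:
            ser.write(COMMANDS[event.key])
        elif event.type == pygame.KEYUP and event.key in COMMANDS:
            ser.write(b'S')            # stop when the key is released -- assumption

ser.close()
pygame.quit()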


 

Testing – Capturing image data for training dataset


On the first iteration of the physical devices, I mounted the disassembled Logitech C270/Raspberry Pi on the car with a coat hanger that I chopped up and modified to hold the camera. I pointed it down so it could see the hood and some of the “road”. The webcam captures video frames of the road ahead at ~24 fps.

I send the captured stream across the Wi-Fi network back to my MacBook Pro using a small Python server built on basic sockets.

On the MacBook Pro, I run a Python client that connects to the Raspberry Pi over basic sockets. I take the 320×240 color stream, then downsample and grayscale the video frames for preprocessing into a NumPy matrix.

The video is streamed wirelessly and captured with OpenCV, sliced into JPEG frames, preprocessed and reshaped into NumPy arrays, and paired with the key-press data as labels.
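
A minimal sketch of that preprocessing step might look like the following; the frame size, label encoding, and function names are assumptions for illustration:

# Decode a JPEG frame, grayscale it, downsample it, and flatten it into a feature vector.
import cv2
import numpy as np

def preprocess(jpeg_bytes):
    frame = cv2.imdecode(np.frombuffer(jpeg_bytes, dtype=np.uint8), cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    small = cv2.resize(gray, (240, 240))          # 240x240 -> 57,600 inputs, per the post
    return small.astype(np.float32).flatten() / 255.0

LABELS = {'forward': 0, 'none': 1, 'left': 2, 'right': 3}   # label encoding -- assumption

features, labels = [], []
# inside the capture loop you would do something like:
#   features.append(preprocess(jpeg_bytes))
#   labels.append(LABELS[current_key])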

Testing – First Build of Car with components


Testing – Convert 240×240 into greyscale

57,600 input neurons (240 × 240 pixels)

Take 2 : Using PiCamera and stream images to Laptop

Take 2 -Load new Arduino Sketch and change PINS

Take 2 – Stream Data from Pi to Laptop

Train Neural Network with train.pkl

I converted the NumPy data to a pickle file and then used it to train a simple 3-layer neural network in Python, with 65,536 neurons in the input layer, 1,000 neurons in the hidden layer, and 4 output neurons: Forward, None, Left, and Right.
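
A rough sketch of that training step, assuming train.pkl holds a (features, labels) pair and using a Network class like the one built in the from-scratch section later in this post (the hyperparameters here are illustrative):

# Load the pickled training data and train the 3-layer network (65,536 -> 1,000 -> 4).
import pickle
import numpy as np

with open('train.pkl', 'rb') as f:
    features, labels = pickle.load(f)      # shapes assumed: (n, 65536) and (n, 4)

training_data = [(x.reshape(-1, 1), y.reshape(-1, 1)) for x, y in zip(features, labels)]

net = Network([65536, 1000, 4])            # Forward, None, Left, Right outputs
net.SGD(training_data, epochs=10, mini_batch_size=32, eta=0.01)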

 

Check predictions of Neural Network

 

Test driving car via key press

Test driving car via prediction

 

Test trained Neural Network with live camera data…enjoy!

 

Links

Next Steps

  • Deep Learning
  • Computer Vision
  • Vehicle Dynamics
  • Controllers
  • Localization
  • Mapping (SLAM)
  • Sensors & Fusion
  • Safety Systems and Ethics

Report-style documentation of the custom RC car build

LIDAR and Deep Learning

LiDAR sensors and software enable real-time capture and processing of 3D mapping data, along with object detection, tracking, and classification. They can be used in self-driving cars, perimeter security systems, and interior security systems.

http://images.nvidia.com/content/tegra/automotive/images/2016/solutions/pdf/end-to-end-dl-using-px.pdf

http://conference.scipy.org/proceedings/scipy2012/pdfs/iqbal_mohomed.pdf

http://juxi.net/workshop/deep-learning-rss-2016/papers/Nicolai%20-%20Deep%20Learning%20Lidar%20Odometry.pdf

https://github.com/dps/nnrccar

https://gopigo.firebaseapp.com/

http://www.danielgm.net/cc/

https://github.com/bigsnarfdude/loam_velodyne

http://www.phoenix-aerial.com/information/lidar-comparison/

http://www.gim-international.com/content/news/9-revolutionary-lidar-survey-projects

http://velodynelidar.com/vlp-16-lite.html

https://www.idaholidar.org/free-lidar-tools/

http://www.technavio.com/blog/top-companies-global-automotive-lidar-sensors-market

https://zhengludwig.wordpress.com/projects/self-driving-rc-car/

Neural Network Driving in GTAV

http://deepdrive.io/


https://github.com/samjabrahams/tensorflow-on-raspberry-pi

Drive a Lamborghini With Your Keyboard

http://www.acmesystems.it/timelaps_video

 

Convolutional Neural Network in one picture


Deep Learning Malware and Network Flows

Using Inception v3 Tensorflow for MNIST

Modern object recognition models have millions of parameters and can take weeks to fully train. Transfer learning is a technique that shortcuts a lot of this work by taking a fully-trained model for a set of categories like ImageNet, and retrains from the existing weights for new classes. In this example we’ll be retraining the final layer from scratch, while leaving all the others untouched. For more information on the approach you can see this paper on Decaf.

Though it’s not as good as a full training run, this is surprisingly effective for many applications, and can be run in as little as 75 minutes on a laptop, without requiring a GPU. The data I used is from the Kaggle MNIST dataset.

Let’s reshape the train.csv data from Kaggle into JPEG images with the script below


Script to convert train.csv to images in python
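
A minimal version of such a converter might look like this; the output folder and the <label>_<index>.jpg filename scheme are assumptions:

# Convert Kaggle's train.csv (label column + 784 pixel columns) into JPEG files.
import os
import numpy as np
import pandas as pd
from PIL import Image

df = pd.read_csv('train.csv')
os.makedirs('mnist_jpegs', exist_ok=True)

for i, row in df.iterrows():
    label = int(row['label'])
    pixels = row.drop('label').values.astype(np.uint8).reshape(28, 28)
    Image.fromarray(pixels, mode='L').save('mnist_jpegs/{}_{}.jpg'.format(label, i))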

 

Let’s move the data to the proper folders
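
For example, a small sketch that sorts the JPEGs into one folder per digit, the layout the Inception retraining script expects; the folder names follow the converter sketch above and are assumptions:

# Move each <label>_<index>.jpg into mnist_training/<label>/.
import glob
import os
import shutil

for path in glob.glob('mnist_jpegs/*.jpg'):
    label = os.path.basename(path).split('_')[0]      # filename format: <label>_<index>.jpg
    dest_dir = os.path.join('mnist_training', label)
    os.makedirs(dest_dir, exist_ok=True)
    shutil.move(path, os.path.join(dest_dir, os.path.basename(path)))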

 

These are screenshots of the re-trained Inception v3 model


 

Re-training the model

Using the re-trained model to do MNIST prediction
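
A hedged sketch of running a prediction with the retrained graph, assuming the default file names and tensor names used by TensorFlow’s retraining example (output_graph.pb, output_labels.txt, DecodeJpeg/contents:0, final_result:0); the image path is a placeholder:

# Classify one digit image with the retrained Inception v3 graph (TF 1.x-era API).
import tensorflow as tf

image_data = tf.gfile.FastGFile('some_digit.jpg', 'rb').read()
labels = [line.strip() for line in tf.gfile.GFile('output_labels.txt')]

with tf.gfile.FastGFile('output_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor, {'DecodeJpeg/contents:0': image_data})[0]
    for node_id in predictions.argsort()[::-1][:3]:    # top three guesses
        print(labels[node_id], predictions[node_id])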

Links:

Neural Network from scratch in Python

So you want to teach a computer to recognize handwritten digits? You want to code this out in Python? You understand a little about Machine Learning? You wanna build a neural network?

Let’s try to implement a simple 3-layer neural network (NN) from scratch. I won’t get into the math because I suck at math, let alone at trying to teach it. I can point you to moar math resources if you want to read up on the details.

I assume you’re familiar with basic Machine Learning concepts like classification and regularization. Oh, and how optimization techniques like gradient descent work.

So, why not teach you Tensorflow or some other deep learning framework? I found that I learn best when I see the code and learn the basics of the implementation. It helps me build intuition for choosing each part of the model. Of course, there are AutoML solutions that could get me to a baseline more quickly, but I still wouldn’t know anything. I’m trying to get out of just running the code like a script kiddie.

So let’s get started!

For the past few months (thanks Arvin), I have learned to appreciate both classic Machine Learning (prior to 2012) and Deep Learning techniques to model Kaggle competition data.

The handwritten digits competition was my first attempt at deep learning, so I think it’s appropriate that it’s your first deep learning example too. I remember an important gotcha moment: seeing the relationship between the raw data and the pictures it encodes. That helped me imagine the deep learning concepts visually.

What does the data look like?

We’re going to use the classic visual recognition challenge data set, called the MNIST data set. Kaggle competitions are awesome because you can self-score your solutions, and they provide data in simple, clean CSV files. If successful, we should have a deep learning solution that is able to classify 25,000 images with the correct labels. Let’s look at the CSV data.

Using a Jupyter notebook, let’s dump the data into a numpy matrix, and reshape it back into a picture. Each digit has been normalized to a 28 by 28 matrix.
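
For example, a quick notebook sketch (the column names follow the Kaggle Digit Recognizer CSV layout):

# Load train.csv, reshape one row back into a 28x28 picture, and display it.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('train.csv')              # first column 'label', then 784 pixel columns
labels = df['label'].values
images = df.drop('label', axis=1).values   # shape (n_samples, 784)

digit = images[0].reshape(28, 28)
plt.imshow(digit, cmap='gray')
plt.title('label: {}'.format(labels[0]))
plt.show()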

 

The goal is to take the training data as an input (handwritten digit), pump it through the deep learning model, and predict if the data is a 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9.

 

Architecture of a Simple Neural Network

1. Picking the shape of the neural network. I’m gonna choose a simple NN consisting of three layers:

  • First Layer: Input layer (784 neurons)
  • Second Layer: Hidden layer (n = 15 neurons)
  • Third Layer: Output layer (10 neurons, one per digit)

Here’s a look at the 3-layer network proposed above:

Basic Structure of the code

Data structure to hold our data

2. Picking the right matrix data structure. Nested Python lists? CUDAMat? Python dicts? I’m choosing NumPy because we’ll heavily use the np.dot, np.reshape, np.random, np.zeros, np.argmax, and np.exp functions, which I’m not really interested in implementing from scratch.
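
As a sketch of what the constructor could look like, in the style of Michael Nielsen’s network.py (the class name and layer sizes are illustrative):

import numpy as np

class Network(object):
    def __init__(self, sizes):
        # sizes, e.g. [784, 15, 10], gives the number of neurons in each layer
        self.num_layers = len(sizes)
        self.sizes = sizes
        # one bias vector per non-input layer, one weight matrix per pair of adjacent layers
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]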

Simulating perceptrons using an Activation Function

3.  Picking the activation function for our hidden layer. The activation function transforms the inputs of the hidden layer into its outputs. Common choices for activation functions are tanh, the sigmoid function, or ReLUs. We’ll use the sigmoid function.
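
A minimal sketch of the sigmoid and its derivative (the derivative is needed later for backpropagation):

def sigmoid(z):
    # squash z into (0, 1); works elementwise on NumPy arrays
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # derivative of the sigmoid
    return sigmoid(z) * (1.0 - sigmoid(z))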

Python Neural Network Object

Feed Forward Function

a.k.a The Forward Pass

The purpose of the feedforward function is to pass the input forward through the network’s weights and biases, layer by layer, and return the output activations.
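
A sketch of such a feedforward method, assuming the weights and biases initialized in the constructor above:

def feedforward(self, a):
    # method of the Network class; a is a column vector of input activations
    for b, w in zip(self.biases, self.weights):
        a = sigmoid(np.dot(w, a) + b)
    return a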

Stochastic Gradient Descent function (SGD)

The SGD function drives training: for each epoch it shuffles the training data, splits it into mini-batches, and updates the weights and biases from each mini-batch in turn.
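
A sketch of what the SGD method could look like (the evaluate helper used for reporting accuracy is assumed to exist elsewhere in the class):

import random

def SGD(self, training_data, epochs, mini_batch_size, eta, test_data=None):
    # method of the Network class; training_data is a list of (x, y) tuples, eta is the learning rate
    n = len(training_data)
    for j in range(epochs):
        random.shuffle(training_data)
        mini_batches = [training_data[k:k + mini_batch_size]
                        for k in range(0, n, mini_batch_size)]
        for mini_batch in mini_batches:
            self.update_mini_batch(mini_batch, eta)
        if test_data:
            print("Epoch {0}: {1} / {2}".format(
                j, self.evaluate(test_data), len(test_data)))  # evaluate() assumed elsewhere
        else:
            print("Epoch {0} complete".format(j))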

Update Mini Batch Function

Mini-batch gradient descent can work a bit faster than stochastic gradient descent. In batch gradient descent we use all m examples in each iteration, whereas in stochastic gradient descent we use a single example in each iteration. Mini-batch gradient descent sits somewhere in between: we use b examples in each iteration, where b is a parameter called the mini-batch size.
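
A sketch of an update_mini_batch method that averages the per-example gradients returned by backprop and takes one gradient descent step:

def update_mini_batch(self, mini_batch, eta):
    # method of the Network class; applies one gradient descent step using the b examples in mini_batch
    nabla_b = [np.zeros(b.shape) for b in self.biases]
    nabla_w = [np.zeros(w.shape) for w in self.weights]
    for x, y in mini_batch:
        delta_nabla_b, delta_nabla_w = self.backprop(x, y)   # gradients for one example
        nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    self.weights = [w - (eta / len(mini_batch)) * nw
                    for w, nw in zip(self.weights, nabla_w)]
    self.biases = [b - (eta / len(mini_batch)) * nb
                   for b, nb in zip(self.biases, nabla_b)]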

Back Prop Function

a.k.a The Backwards Pass

Our goal with back propagation is to update each of the weights in the network so that the actual output moves closer to the target output, minimizing the error for each output neuron and for the network as a whole. Backprop does this by computing the gradient of the cost function with respect to every weight and bias, which gradient descent then uses for the updates.
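
A sketch of the backprop method for this fully connected sigmoid network, returning the per-layer gradients that update_mini_batch consumes:

def backprop(self, x, y):
    # method of the Network class; returns (nabla_b, nabla_w), the gradient of the cost for one example
    nabla_b = [np.zeros(b.shape) for b in self.biases]
    nabla_w = [np.zeros(w.shape) for w in self.weights]
    # forward pass, keeping every layer's weighted input z and activation
    activation = x
    activations = [x]
    zs = []
    for b, w in zip(self.biases, self.weights):
        z = np.dot(w, activation) + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)
    # backward pass: output layer error, then propagate it back layer by layer
    delta = self.cost_derivative(activations[-1], y) * sigmoid_prime(zs[-1])
    nabla_b[-1] = delta
    nabla_w[-1] = np.dot(delta, activations[-2].transpose())
    for l in range(2, self.num_layers):
        z = zs[-l]
        delta = np.dot(self.weights[-l + 1].transpose(), delta) * sigmoid_prime(z)
        nabla_b[-l] = delta
        nabla_w[-l] = np.dot(delta, activations[-l - 1].transpose())
    return (nabla_b, nabla_w)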

Cost Derivative Function

So in gradient descent, you follow the negative of the gradient to the point where the cost is a minimum. If someone is talking about gradient descent in a machine learning context, the cost function is probably implied (it is the function to which you are applying the gradient descent algorithm).
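
Assuming a quadratic cost (as in this kind of simple network), the cost derivative used by backprop is just the difference between the output activations and the target:

def cost_derivative(self, output_activations, y):
    # derivative of the quadratic cost 1/2 * ||output - y||^2 with respect to the output
    return (output_activations - y)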

 Putting it all together – Network.py

Links

Flask Digits Classifier

https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/index.html#0

Audit Security in AWS

When Should You Perform a Security Audit?

You should audit your security configuration in the following situations:

  • On a periodic basis. You should perform the steps described in this document at regular intervals as a best practice for security.
  • If there are changes in your organization, such as people leaving.
  • If you have stopped using one or more individual AWS services. This is important for removing permissions that users in your account no longer need.
  • If you’ve added or removed software in your accounts, such as applications on Amazon EC2 instances, AWS OpsWorks stacks, AWS CloudFormation templates, etc.
  • If you ever suspect that an unauthorized person might have accessed your account.

General Guidelines for Auditing

As you review your account’s security configuration, follow these guidelines:

  • Be thorough. Look at all aspects of your security configuration, including those you might not use regularly.
  • Don’t assume. If you are unfamiliar with some aspect of your security configuration (for example, the reasoning behind a particular policy or the existence of a role), investigate the business need until you are satisfied.
  • Keep things simple. To make auditing (and management) easier, use IAM groups, consistent naming schemes, and straightforward policies.

Review Your AWS Account Credentials

Take these steps when you audit your AWS account credentials:

  1. If you’re not using the root access keys for your account, remove them. We strongly recommend that you do not use root access keys for everyday work with AWS, and that instead you create IAM users.
  2. If you do need to keep the access keys for your account, rotate them regularly.

Review Your IAM Users

Take these steps when you audit your existing IAM users:

  1. Delete users that are not active.
  2. Remove users from groups that they don’t need to be a part of.
  3. Review the policies attached to the groups the user is in. See Tips for Reviewing IAM Policies.
  4. Delete security credentials that the user doesn’t need or that might have been exposed. For example, an IAM user that is used for an application does not need a password (which is necessary only to sign in to AWS websites). Similarly, if a user does not use access keys, there’s no reason for the user to have one. For more information, see Managing Passwords for IAM Users and Managing Access Keys for IAM Users in the IAM User Guide.

    You can generate and download a credential report that lists all IAM users in your account and the status of their various credentials, including passwords, access keys, and MFA devices. For passwords and access keys, the credential report shows how recently the password or access key has been used. Credentials that have not been used recently might be good candidates for removal. For more information, see Getting Credential Reports for your AWS Account in the IAM User Guide. (A boto3 sketch for pulling this report follows this list.)

  5. Rotate (change) user security credentials periodically, or immediately if you ever share them with an unauthorized person. For more information, see Managing Passwords for IAM Users and Managing Access Keys for IAM Users in the IAM User Guide.
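
A hedged boto3 sketch of pulling that credential report so stale passwords and access keys stand out:

# Generate the IAM credential report and print last-used dates per user.
import csv
import io
import time
import boto3

iam = boto3.client('iam')

# Ask AWS to (re)generate the report, then poll until it is ready.
while iam.generate_credential_report()['State'] != 'COMPLETE':
    time.sleep(2)

report = iam.get_credential_report()['Content'].decode('utf-8')
for row in csv.DictReader(io.StringIO(report)):
    print(row['user'], row['password_last_used'], row['access_key_1_last_used_date'])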

Review Your IAM Groups

Take these steps when you audit your IAM groups:

  1. Delete unused groups.
  2. Review users in each group and remove users who don’t belong. See Review Your IAM Users earlier.
  3. Review the policies attached to the group. See Tips for Reviewing IAM Policies.

Review Your IAM Roles

Take these steps when you audit your IAM roles:

  1. Delete roles that are not in use.
  2. Review the role’s trust policy. Make sure that you know who the principal is and that you understand why that account or user needs to be able to assume the role.
  3. Review the access policy for the role to be sure that it grants suitable permissions to whoever assumes the role—see Tips for Reviewing IAM Policies.

Review Your IAM Providers for SAML and OpenID Connect (OIDC)

If you have created an IAM entity for establishing trust with a SAML or OIDC identity provider, take these steps:

  1. Delete unused providers.
  2. Download and review the AWS metadata documents for each SAML provider and make sure the documents reflect your current business needs. Alternatively, get the latest metadata documents from the SAML IdPs that you want to establish trust with and update the provider in IAM.

Review Your Mobile Apps

If you have created a mobile app that makes requests to AWS, take these steps:

  1. Make sure that the mobile app does not contain embedded access keys, even if they are in encrypted storage.
  2. Get temporary credentials for the app by using APIs that are designed for that purpose. We recommend that you use Amazon Cognito to manage user identity in your app. This service lets you authenticate users using Login with Amazon, Facebook, Google, or any OpenID Connect (OIDC)–compatible identity provider. You can then use the Amazon Cognito credentials provider to manage credentials that your app uses to make requests to AWS.

    If your mobile app doesn’t support authentication using Login with Amazon, Facebook, Google, or any other OIDC-compatible identity provider, you can create a proxy server that can dispense temporary credentials to your app.

Review Your Amazon EC2 Security Configuration

Take the following steps for each AWS region:

  1. Delete Amazon EC2 key pairs that are unused or that might be known to people outside your organization.
  2. Review your Amazon EC2 security groups:
    • Remove security groups that no longer meet your needs.
    • Remove rules from security groups that no longer meet your needs. Make sure you know why the ports, protocols, and IP address ranges they permit have been allowed.
  3. Terminate instances that aren’t serving a business need or that might have been started by someone outside your organization for unapproved purposes. Remember that if an instance is started with a role, applications that run on that instance can access AWS resources using the permissions that are granted by that role.
  4. Cancel spot instance requests that aren’t serving a business need or that might have been made by someone outside your organization.
  5. Review your Auto Scaling groups and configurations. Shut down any that no longer meet your needs or that might have been configured by someone outside your organization.

Review AWS Policies in Other Services

Review the permissions for services that use resource-based policies or that support other security mechanisms. In each case, make sure that only users and roles with a current business need have access to the service’s resources, and that the permissions granted on the resources are the fewest necessary to meet your business needs.

Monitor Activity in Your AWS Account

Follow these guidelines for monitoring AWS activity:

  • Turn on AWS CloudTrail in each account and use it in each supported region.
  • Periodically examine CloudTrail log files. (CloudTrail has a number of partners who provide tools for reading and analyzing log files.)
  • Enable Amazon S3 bucket logging to monitor requests made to each bucket.
  • If you believe there has been unauthorized use of your account, pay particular attention to temporary credentials that have been issued. If temporary credentials have been issued that you don’t recognize, disable their permissions.
  • Enable billing alerts in each account and set a cost threshold that lets you know if your charges exceed your normal usage.

Tips for Reviewing IAM Policies

Policies are powerful and subtle, so it’s important to study and understand the permissions that are granted by each policy. Use the following guidelines when reviewing policies:

  • As a best practice, attach policies to groups instead of to individual users. If an individual user has a policy, make sure you understand why that user needs the policy.
  • Make sure that IAM users, groups, and roles have only the permissions that they need.
  • Use the IAM Policy Simulator to test policies that are attached to users or groups.
  • Remember that a user’s permissions are the result of all applicable policies—user policies, group policies, and resource-based policies (on Amazon S3 buckets, Amazon SQS queues, Amazon SNS topics, and AWS KMS keys). It’s important to examine all the policies that apply to a user and to understand the complete set of permissions granted to an individual user.
  • Be aware that allowing a user to create an IAM user, group, role, or policy and attach a policy to the principal entity is effectively granting that user all permissions to all resources in your account. That is, users who are allowed to create policies and attach them to a user, group, or role can grant themselves any permissions. In general, do not grant IAM permissions to users or roles whom you do not trust with full access to the resources in your account. The following list contains IAM permissions that you should review closely:
    • iam:PutGroupPolicy
    • iam:PutRolePolicy
    • iam:PutUserPolicy
    • iam:CreatePolicy
    • iam:CreatePolicyVersion
    • iam:AttachGroupPolicy
    • iam:AttachRolePolicy
    • iam:AttachUserPolicy
  • Make sure policies don’t grant permissions for services that you don’t use. For example, if you use AWS managed policies, make sure the AWS managed policies that are in use in your account are for services that you actually use. To find out which AWS managed policies are in use in your account, use the IAM GetAccountAuthorizationDetails API (AWS CLI command: aws iam get-account-authorization-details); a boto3 sketch follows this list.
  • If the policy grants a user permission to launch an Amazon EC2 instance, it might also allow the iam:PassRole action, but if so it should explicitly list the roles that the user is allowed to pass to the Amazon EC2 instance.
  • Closely examine any values for the Action or Resource element that include *. It’s a best practice to grant Allow access to only the individual actions and resources that users need. However, the following are reasons that it might be suitable to use * in a policy:
    • The policy is designed to grant administrative-level privileges.
    • The wildcard character is used for a set of similar actions (for example, Describe*) as a convenience, and you are comfortable with the complete list of actions that are referenced in this way.
    • The wildcard character is used to indicate a class of resources or a resource path (e.g., arn:aws:iam::account-id:users/division_abc/*), and you are comfortable granting access to all of the resources in that class or path.
    • A service action does not support resource-level permissions, and the only choice for a resource is *.
  • Examine policy names to make sure they reflect the policy’s function. For example, although a policy might have a name that includes “read only,” the policy might actually grant write or change permissions.
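
The list above points to the GetAccountAuthorizationDetails API for finding which AWS managed policies are in use; as a lighter-weight alternative, here is a hedged boto3 sketch using list_policies with OnlyAttached=True:

# List the AWS managed policies that are actually attached in the account.
import boto3

iam = boto3.client('iam')
paginator = iam.get_paginator('list_policies')

for page in paginator.paginate(Scope='AWS', OnlyAttached=True):
    for policy in page['Policies']:
        print(policy['PolicyName'], policy['AttachmentCount'])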

The rush to build KNOWLEDGE DISCOVERY ENGINES

People who build AI are focused on three areas: machine intelligence, natural language processing, and machine perception. That is, building systems that can think, listen, and see.

 


Data Mining Process


STAGE ONE – DETERMINE BUSINESS OBJECTIVES

The first stage of the CRISP-DM process is to understand what you want to accomplish from a business perspective. Your organization may have competing objectives and constraints that must be properly balanced. The goal of this stage of the process is to uncover important factors that could influence the outcome of the project. Neglecting this step can mean that a great deal of effort is put into producing the right answers to the wrong questions.

What are the desired outputs of the project?

  1. Set objectives – This means describing your primary objective from a business perspective. There may also be other related questions that you would like to address. For example, your primary goal might be to keep current customers by predicting when they are prone to move to a competitor. Related business questions might be “Does the channel used affect whether customers stay or go?” or “Will lower ATM fees significantly reduce the number of high-value customers who leave?”
  2. Produce project plan – Here you’ll describe the plan for achieving the data mining and business goals. The plan should specify the steps to be performed during the rest of the project, including the initial selection of tools and techniques.
  3. Business success criteria – Here you’ll lay out the criteria that you’ll use to determine whether the project has been successful from the business point of view. These should ideally be specific and measurable, for example a reduction of customer churn to a certain level; however, sometimes it might be necessary to have more subjective criteria such as “give useful insights into the relationships.” If this is the case, it needs to be clear who makes the subjective judgment.

Assess the current situation

This involves more detailed fact-finding about all of the resources, constraints, assumptions and other factors that you’ll need to consider when determining your data analysis goal and project plan.

  1. Inventory of resources – List the resources available to the project including:
    • Personnel (business experts, data experts, technical support, data mining experts)
    • Data (fixed extracts, access to live, warehoused, or operational data)
    • Computing resources (hardware platforms)
    • Software (data mining tools, other relevant software)
  2. Requirements, assumptions and constraints – List all requirements of the project including the schedule of completion, the required comprehensibility and quality of results, and any data security concerns as well as any legal issues. Make sure that you are allowed to use the data. List the assumptions made by the project. These may be assumptions about the data that can be verified during data mining, but may also include non-verifiable assumptions about the business related to the project. It is particularly important to list the latter if they will affect the validity of the results. List the constraints on the project. These may be constraints on the availability of resources, but may also include technological constraints such as the size of data set that it is practical to use for modelling.
  3. Risks and contingencies – List the risks or events that might delay the project or cause it to fail. List the corresponding contingency plans – what action will you take if these risks or events take place?
  4. Terminology – Compile a glossary of terminology relevant to the project. This will generally have two components:
    • A glossary of relevant business terminology, which forms part of the business understanding available to the project. Constructing this glossary is a useful “knowledge elicitation” and education exercise.
    • A glossary of data mining terminology, illustrated with examples relevant to the business problem in question.
  5. Costs and benefits – Construct a cost-benefit analysis for the project which compares the costs of the project with the potential benefits to the business if it is successful. This comparison should be as specific as possible. For example, you should use financial measures in a commercial situation.

Determine data mining goals

A business goal states objectives in business terminology. A data mining goal states project objectives in technical terms. For example, the business goal might be “Increase catalogue sales to existing customers.” A data mining goal might be “Predict how many widgets a customer will buy, given their purchases over the past three years, demographic information (age, salary, city, etc.), and the price of the item.”

  1. Data mining goals – Describe the intended outputs of the project that enable the achievement of the business objectives.
  2. Data mining success criteria – define the criteria for a successful outcome to the project in technical terms—for example, a certain level of predictive accuracy or a propensity-to-purchase profile with a given degree of “lift.” As with business success criteria, it may be necessary to describe these in subjective terms, in which case the person or persons making the subjective judgment should be identified.

Produce project plan

Describe the intended plan for achieving the data mining goals and thereby achieving the business goals. Your plan should specify the steps to be performed during the rest of the project, including the initial selection of tools and techniques.

  1. Project plan – List the stages to be executed in the project, together with their duration, resources required, inputs, outputs, and dependencies. Where possible, try and make explicit the large-scale iterations in the data mining process, for example, repetitions of the modelling and evaluation phases. As part of the project plan, it is also important to analyze dependencies between time schedule and risks. Mark results of these analyses explicitly in the project plan, ideally with actions and recommendations if the risks are manifested. Decide at this point which evaluation strategy will be used in the evaluation phase. Your project plan will be a dynamic document. At the end of each phase you’ll review progress and achievements and update the project plan accordingly. Specific review points for these updates should be part of the project plan.
  2. Initial assessment of tools and techniques – At the end of the first phase you should undertake an initial assessment of tools and techniques. Here, for example, you select a data mining tool that supports various methods for different stages of the process. It is important to assess tools and techniques early in the process since the selection of tools and techniques may influence the entire project.

STAGE TWO – DATA UNDERSTANDING

The second stage of the CRISP-DM process requires you to acquire the data listed in the project resources. This initial collection includes data loading, if this is necessary for data understanding. For example, if you use a specific tool for data understanding, it makes perfect sense to load your data into this tool. If you acquire multiple data sources then you need to consider how and when you’re going to integrate these.

  • Initial data collection report – List the data sources acquired together with their locations, the methods used to acquire them and any problems encountered. Record problems you encountered and any resolutions achieved. This will help both with future replication of this project and with the execution of similar future projects.

Describe data

Examine the “gross” or “surface” properties of the acquired data and report on the results.

  • Data description report – Describe the data that has been acquired including its format, its quantity (for example, the number of records and fields in each table), the identities of the fields and any other surface features which have been discovered. Evaluate whether the data acquired satisfies your requirements.

Explore data

During this stage you’ll address data mining questions using querying, data visualization and reporting techniques. These may include:

  • Distribution of key attributes (for example, the target attribute of a prediction task)
  • Relationships between pairs or small numbers of attributes
  • Results of simple aggregations
  • Properties of significant sub-populations
  • Simple statistical analyses

These analyses may directly address your data mining goals. They may also contribute to or refine the data description and quality reports, and feed into the transformation and other data preparation steps needed for further analysis.

  • Data exploration report – Describe results of your data exploration, including first findings or initial hypothesis and their impact on the remainder of the project. If appropriate you could include graphs and plots here to indicate data characteristics that suggest further examination of interesting data subsets.

Verify data quality

Examine the quality of the data, addressing questions such as:

  • Is the data complete (does it cover all the cases required)?
  • Is it correct, or does it contain errors and, if there are errors, how common are they?
  • Are there missing values in the data? If so, how are they represented, where do they occur, and how common are they?

Data quality report

List the results of the data quality verification. If quality problems exist, suggest possible solutions. Solutions to data quality problems generally depend heavily on both data and business knowledge.

STAGE THREE – DATA PREPARATION

Select your data

This is the stage of the project where you decide on the data that you’re going to use for analysis. The criteria you might use to make this decision include the relevance of the data to your data mining goals, the quality of the data, and also technical constraints such as limits on data volume or data types. Note that data selection covers selection of attributes (columns) as well as selection of records (rows) in a table.

  • Rationale for inclusion/exclusion – List the data to be included/excluded and the reasons for these decisions.

Clean your data

This task involves raising the data quality to the level required by the analysis techniques that you’ve selected. This may involve selecting clean subsets of the data, the insertion of suitable defaults, or more ambitious techniques such as the estimation of missing data by modelling.

  • Data cleaning report – Describe what decisions and actions you took to address data quality problems. Consider any transformations of the data made for cleaning purposes and their possible impact on the analysis results.

Construct required data

This task includes constructive data preparation operations such as the production of derived attributes or entire new records, or transformed values for existing attributes.

  • Derived attributes – These are new attributes that are constructed from one or more existing attributes in the same record, for example you might use the variables of length and width to calculate a new variable of area.
  • Generated records – Here you describe the creation of any completely new records. For example you might need to create records for customers who made no purchase during the past year. There was no reason to have such records in the raw data, but for modelling purposes it might make sense to explicitly represent the fact that particular customers made zero purchases.

Integrate data

These are methods whereby information is combined from multiple databases, tables or records to create new records or values.

  • Merged data – Merging tables refers to joining together two or more tables that have different information about the same objects. For example a retail chain might have one table with information about each store’s general characteristics (e.g., floor space, type of mall), another table with summarized sales data (e.g., profit, percent change in sales from previous year), and another with information about the demographics of the surrounding area. Each of these tables contains one record for each store. These tables can be merged together into a new table with one record for each store, combining fields from the source tables.
  • Aggregations – Aggregation refers to operations in which new values are computed by summarising information from multiple records and/or tables. For example, converting a table of customer purchases, with one record per purchase, into a new table with one record per customer and fields such as number of purchases, average purchase amount, percent of orders charged to credit card, and percent of items under promotion (see the pandas sketch below).
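
A hedged pandas sketch of that aggregation example; the file and column names are made up for illustration:

# Roll a one-row-per-purchase table up into one row per customer.
import pandas as pd

purchases = pd.read_csv('purchases.csv')   # hypothetical: one row per purchase
per_customer = purchases.groupby('customer_id').agg(
    number_of_purchases=('purchase_id', 'count'),
    average_purchase_amount=('amount', 'mean'),
    share_paid_by_credit_card=('paid_by_credit_card', 'mean'),  # fraction of purchases on card
)
print(per_customer.head())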

STAGE FOUR – MODELLING

Select modeling technique

As the first step in modelling, you’ll select the actual modelling technique that you’ll be using. Although you may have already selected a tool during the business understanding phase, at this stage you’ll be selecting the specific modelling technique e.g. decision-tree building with C5.0, or neural network generation with back propagation. If multiple techniques are applied, perform this task separately for each technique.

  • Modelling technique – Document the actual modelling technique that is to be used.
  • Modelling assumptions – Many modelling techniques make specific assumptions about the data, for example that all attributes have uniform distributions, no missing values allowed, class attribute must be symbolic etc. Record any assumptions made.

Generate test design

Before you actually build a model you need to generate a procedure or mechanism to test the model’s quality and validity. For example, in supervised data mining tasks such as classification, it is common to use error rates as quality measures for data mining models. Therefore, you typically separate the dataset into train and test sets, build the model on the train set, and estimate its quality on the separate test set.

  • Test design – Describe the intended plan for training, testing, and evaluating the models. A primary component of the plan is determining how to divide the available dataset into training, test and validation datasets.
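
A minimal sketch of the train/test split described above, using scikit-learn; the data here is a random stand-in:

# Hold out 20% of the examples for estimating model quality.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 20)                 # stand-in feature matrix
y = np.random.randint(0, 2, size=1000)       # stand-in class labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)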

Build model

Run the modelling tool on the prepared dataset to create one or more models.

  • Parameter settings – With any modelling tool there are often a large number of parameters that can be adjusted. List the parameters and their chosen values, along with the rationale for the choice of parameter settings.
  • Models – These are the actual models produced by the modelling tool, not a report on the models.
  • Model descriptions – Describe the resulting models, report on the interpretation of the models and document any difficulties encountered with their meanings.

Assess model

Interpret the models according to your domain knowledge, your data mining success criteria and your desired test design. Judge the success of the application of modelling and discovery techniques technically, then contact business analysts and domain experts later in order to discuss the data mining results in the business context. This task only considers models, whereas the evaluation phase also takes into account all other results that were produced in the course of the project.

At this stage you should rank the models and assess them according to the evaluation criteria. You should take the business objectives and business success criteria into account as far as you can here. In most data mining projects a single technique is applied more than once and data mining results are generated with several different techniques.

  • Model assessment – Summarise the results of this task, list the qualities of your generated models (e.g. in terms of accuracy) and rank their quality in relation to each other.
  • Revised parameter settings – According to the model assessment, revise parameter settings and tune them for the next modelling run. Iterate model building and assessment until you strongly believe that you have found the best model(s). Document all such revisions and assessments.

STAGE FIVE – EVALUATION

 

Evaluate your results

Previous evaluation steps dealt with factors such as the accuracy and generality of the model. During this step you’ll assess the degree to which the model meets your business objectives and seek to determine if there is some business reason why this model is deficient. Another option is to test the model(s) on test applications in the real application, if time and budget constraints permit. The evaluation phase also involves assessing any other data mining results you’ve generated. Data mining results involve models that are necessarily related to the original business objectives and all other findings that are not necessarily related to the original business objectives, but might also unveil additional challenges, information, or hints for future directions.

  • Assessment of data mining results – Summarise assessment results in terms of business success criteria, including a final statement regarding whether the project already meets the initial business objectives.
  • Approved models – After assessing models with respect to business success criteria, the generated models that meet the selected criteria become the approved models.

Review process

At this point, the resulting models appear to be satisfactory and to satisfy business needs. It is now appropriate for you to do a more thorough review of the data mining engagement in order to determine if there is any important factor or task that has somehow been overlooked. This review also covers quality assurance issues—for example: did we correctly build the model? Did we use only the attributes that we are allowed to use and that are available for future analyses?

  • Review of process – Summarise the process review and highlight activities that have been missed and those that should be repeated.

Determine next steps

Depending on the results of the assessment and the process review, you now decide how to proceed. Do you finish this project and move on to deployment, initiate further iterations, or set up new data mining projects? You should also take stock of your remaining resources and budget as this may influence your decisions.

  • List of possible actions – List the potential further actions, along with the reasons for and against each option.
  • Decision – Describe the decision as to how to proceed, along with the rationale.

STAGE SIX – DEPLOYMENT

Plan deployment

In the deployment stage you’ll take your evaluation results and determine a strategy for their deployment. If a general procedure has been identified to create the relevant model(s), this procedure is documented here for later deployment. It makes sense to consider the ways and means of deployment during the business understanding phase as well, because deployment is absolutely crucial to the success of the project. This is where predictive analytics really helps to improve the operational side of your business.

  • Deployment plan – Summarise your deployment strategy including the necessary steps and how to perform them.

Plan monitoring and maintenance

Monitoring and maintenance are important issues if the data mining result becomes part of the day-to-day business and its environment. The careful preparation of a maintenance strategy helps to avoid unnecessarily long periods of incorrect usage of data mining results. In order to monitor the deployment of the data mining result(s), the project needs a detailed monitoring process plan. This plan takes into account the specific type of deployment.

  • Monitoring and maintenance plan – Summarise the monitoring and maintenance strategy, including the necessary steps and how to perform them.

Produce final report

At the end of the project you will write up a final report. Depending on the deployment plan, this report may be only a summary of the project and its experiences (if they have not already been documented as an ongoing activity) or it may be a final and comprehensive presentation of the data mining result(s).

  • Final report – This is the final written report of the data mining engagement. It includes all of the previous deliverables, summarising and organising the results.
  • Final presentation – There will also often be a meeting at the conclusion of the project at which the results are presented to the customer.

Review project

Assess what went right and what went wrong, what was done well and what needs to be improved.

  • Experience documentation – Summarize important experience gained during the project. For example, any pitfalls you encountered, misleading approaches, or hints for selecting the best suited data mining techniques in similar situations could be part of this documentation. In ideal projects, experience documentation also covers any reports that have been written by individual project members during previous phases of the project.

https://www.the-modeling-agency.com/crisp-dm.pdf

Hyper Parameter Optimization and AutoML
