After reading this blog and about the tech behind it, it looks like my experiments with Bloom filters, murmurhash3, IPython Notebook and Redis will come together nicely.
Wikipedia says that in computer science, streaming algorithms are algorithms for processing data streams in which the input is presented as a sequence of items and can be examined in only a few passes (typically just one). These algorithms have limited memory available to them (much less than the input size) and also limited processing time per item.
These constraints may mean that an algorithm produces an approximate answer based on a summary or “sketch” of the data stream in memory.
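To make the idea of a "sketch" concrete, here is a minimal Count-Min Sketch in Python. It is only an illustration of the technique, not something taken from the blog referenced above: the class name, the default width and depth, and the use of salted `hashlib.blake2b` hashes are all my own assumptions (murmurhash3 could be swapped in as the hash function).

```python
import hashlib


class CountMinSketch:
    """Approximate per-key counters in fixed memory (width * depth integers)."""

    def __init__(self, width=2048, depth=4):
        # width and depth are illustrative defaults, not tuned values.
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _columns(self, key):
        # One column per row. The row index is used as the hash salt so that
        # each row behaves like an independent hash function.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"), digest_size=8, salt=str(row).encode()
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        # Single pass, constant work per item.
        for row, col in self._columns(key):
            self.rows[row][col] += count

    def estimate(self, key):
        # Collisions can only inflate a cell, so the minimum across rows is
        # the tightest estimate and is never below the true count.
        return min(self.rows[row][col] for row, col in self._columns(key))


sketch = CountMinSketch()
for user in ["alice", "bob", "alice", "alice"]:
    sketch.add("failed_login:" + user)
alice_failures = sketch.estimate("failed_login:alice")
```

The estimate for `alice` can never be lower than her true count of 3. It can be higher if other keys collide with hers in every row, which is the approximation the Wikipedia definition refers to.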
Use Cases for monitoring counts on anything and for network monitoring
Network Login counts
Failed attempts per user
Failed attempts per groups
Failed attempts per role
Success counts for the above
Password reset volumes per day, month, year
Counts for credentials per person
Password age
Password change day counts
Password lengths
Overall count of user accounts issued
Time elapsed for provision
Time elapsed for decommission
Time elapsed for authorization for changes
Number of privileged accounts per person
Infection counts per user
Infection counts per machine
Infection counts per IP
New account provisioning counts per hour, day, week, month, year
Successful and failed login counts for each IP, per user
Counts of login devices
Counts of unique login destinations
Packet Counts
Port Counts
DNS request counts per host
DNS requests overall
DNS requests to internal devices
DNS requests for each device
Per device aggregation of all types of traffic
Comparing the increase of the number of DNS requests per second with respect to the average number of DNS requests per second
Counts of uses of any word or hashtag from specific locations
Device counts
Software counts
Application patch level counts
Active user counts
Inactive user counts
Remote login per country counts
Remote login per IP address counts
Website visit counts per user
Email counts
Email attachment counts
SPAM counts
Statistics for developers
Stats on access per application, IP address, service, user
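Several of the items above, such as counts of unique login destinations, need a distinct count rather than a plain tally. Below is a minimal sketch of one way to do that with a Bloom filter: an item is counted only when the filter has not seen it before. The class, the `count_unique` helper, the bit-array size and the `user->ip` key format are hypothetical choices for illustration, and `hashlib.blake2b` stands in for murmurhash3 to keep the example dependency-free.

```python
import hashlib


class BloomFilter:
    """Set-membership filter: no false negatives, a small false-positive rate."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        # Sizes are illustrative assumptions, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Salting with the hash index gives num_hashes different bit positions.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode("utf-8"), digest_size=8, salt=str(i).encode()
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item):
        """Set the item's bits; return True if it was (probably) seen before."""
        seen = True
        for pos in self._positions(item):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen


def count_unique(stream, bloom=None):
    """Approximate distinct count: increment only for items not seen before."""
    if bloom is None:
        bloom = BloomFilter()
    unique = 0
    for item in stream:
        if not bloom.add(item):
            unique += 1
    return unique


# Hypothetical stream: one user logging in 500 times across 50 destinations.
logins = ["alice->10.0.0.%d" % (i % 50) for i in range(500)]
unique_destinations = count_unique(logins)
```

Because a Bloom filter can report a false "seen before" but never a false "new", this count can only undershoot the true number of distinct items, and the undershoot grows as the filter fills up.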
Proposed Implementation
The proposed architecture for an example real-time processing and monitoring solution consists of two modules: the online streaming module and the statistical estimation module. The online streaming module is updated upon each packet arrival. Real-time tracking of summary information in network traffic is crucial for many network functions such as network monitoring and traffic engineering.
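As a rough sketch of how the two modules could fit together, the Python below keeps per-second packet counters that are updated on every arrival, and a separate estimator that compares each second's rate with a running average (the DNS requests-per-second comparison from the use cases). Everything here is an assumption made for illustration: the class and method names, the choice of an exponentially weighted average, the `alpha` value and the sample traffic. What to do with the resulting ratio is left to the caller.

```python
from collections import defaultdict


class OnlineStreamingModule:
    """Updated on each packet arrival; keeps one counter per (host, second)."""

    def __init__(self):
        self.counts = defaultdict(int)

    def on_packet(self, host, timestamp):
        # Truncate the timestamp to whole seconds to pick the bucket.
        self.counts[(host, int(timestamp))] += 1

    def drain(self, host, second):
        # Hand a finished bucket to the estimator and free its memory.
        return self.counts.pop((host, second), 0)


class StatisticalEstimationModule:
    """Tracks an exponentially weighted average of a per-second rate."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha  # weight of the newest sample (assumed value)
        self.average = None

    def update(self, rate):
        """Return rate / previous average (None if no usable average yet),
        then fold the new rate into the average."""
        ratio = None
        if self.average is not None and self.average > 0:
            ratio = rate / self.average
        if self.average is None:
            self.average = float(rate)
        else:
            self.average = self.alpha * rate + (1 - self.alpha) * self.average
        return ratio


stream = OnlineStreamingModule()
estimator = StatisticalEstimationModule()
ratios = []
# Hypothetical traffic: 10 DNS requests per second for 5 seconds, then 50.
for second, n in enumerate([10, 10, 10, 10, 10, 50]):
    for _ in range(n):
        stream.on_packet("host-a", second + 0.5)
    ratios.append(estimator.update(stream.drain("host-a", second)))
```

The last entry in `ratios` is the burst second's rate divided by the average of the quiet seconds before it, which is the kind of number a threshold check would consume. Exact counters are used here for clarity; a sketch data structure could replace the dictionary when the number of hosts is too large to track individually.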
Threshold Analysis
This proposed system will concentrate on two types of threshold analysis: