D1.1 Analysis of Current Practices - VIS
SEVENTH FRAMEWORK PROGRAMME
Area ICT-2009.1.4 (Trustworthy ICT)

Visual Analytic Representation of Large Datasets for Enhancing Network Security

D1.1 Analysis of Current Practices

Contract No. FP7-ICT-257495-VIS-SENSE

Workpackage: WP 1 – Requirements-Specifications-Architecture
Author: UKON, IGD, SYM, CERTH
Version: 1
Date of delivery: M6
Actual Date of Delivery: M6
Dissemination level: Public
Responsible: UKON
Data included from: IGD, SYM, CERTH

The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement n°257495.

The VIS-SENSE Consortium consists of:
Fraunhofer IGD, Germany
Institut Eurecom, France
Institut Telecom, France
Centre for Research and Technology Hellas, Greece
Symantec Ltd., Ireland
Universität Konstanz, Germany

Contact information (project coordinator):
Dr Jörn Kohlhammer
Fraunhofer IGD
Fraunhoferstraße 5
64283 Darmstadt
Germany
e-mail: joern.kohlhammer@igd.fraunhofer.de
Phone: +49 6151 155 646

Contents

1 Introduction
2 Network Analytics for Security
2.1 Abnormal Network Traffic and Event Detection
2.1.1 Detecting network anomalies
2.1.2 Behavior-based Network Intrusion Detection
2.1.3 Knowledge-based Network Intrusion Detection
2.1.4 Composite Detection
2.2 Correlation Analysis and Alert Correlation
2.2.1 Alert Correlation
2.2.2 Monitoring from several vantage points
2.3 BGP State-of-the-art
2.3.1 Background
2.3.2 Prefix Hijacking
2.3.3 Securing BGP
2.3.4 BGP monitoring
2.3.5 Methods for detecting prefix hijacking
2.4 Analysis of Spam Campaigns
2.4.1 Introduction
2.4.2 IP reputation analysis
2.4.3 Message content analysis
2.4.4 Network-level spam detection
2.4.5 Analysis of scam infrastructure
2.4.6 Analysis of higher-level behaviour of spammers
2.5 Root Cause Analysis and Attack Attribution
2.5.1 Introduction
2.5.2 Investigative and Security Data Mining
2.5.3 Attack Attribution based on Multi-criteria Decision Analysis
2.5.4 Malicious Traffic Analysis and Cyber-SA
3 Visual Analysis for Network Security
3.1 Introduction
3.1.1 Visualization Techniques
3.1.2 Basic Interaction Techniques
3.1.3 Advanced Interaction Techniques
3.1.4 The Results of an Analysis
3.2 Tools for Generic Data Visualizations
3.3 Tools and Methods for BGP Data
3.4 Tools and Methods for Network Traffic Data
3.5 Tools and Methods for IDS Logs
4 Conclusions and Future Work
Abstract

The VIS-SENSE project aims to enhance network security through a novel combination of visual analytics approaches and network security analytics. To support the work and collaboration in those two research domains and their communities, this document provides an overview of both fields, detailing the state-of-the-art techniques, algorithms and tools that have been developed in the past and are in use. The survey makes it possible to identify open questions and gaps, which point to novel ways of achieving the overall goal. We also show that both research domains are still relatively separate and lack tight integration. In general, most currently available visual analysis tools focus on very specialized tasks and problems, do not integrate the most advanced network security approaches, and focus on specific data rather than providing means to correlate different data sources. The VIS-SENSE project is therefore very relevant and a highly promising direction for solving open questions in network security.

1 Introduction

Today there are hundreds of thousands of different viruses, worms and other malware spreading through the Internet and infecting unprotected computers all over the world. Every day the amount of malicious traffic increases, which makes it difficult to keep a network safe. In most cases the computer user does not even know that the machine is infected. This can lead to unnoticed data theft, or allow a criminal to use the hijacked computer to send spam and unsolicited bulk e-mail, host phishing web sites and so on. For this reason there is a great need to deal with the massive amount of malicious traffic circulating through the Internet. The VIS-SENSE project is going to use visual analytics to create a scalable framework for network security.
The idea is to detect and predict very complex patterns of abnormal traffic in order to prevent computer networks from being hacked or infected. The ultimate goal of the project is to improve international network security and to help resolve cyber crime. To achieve these aims, it is necessary to combine novel algorithms from the field of network analytics with novel visual analysis techniques and methods. These methods make it possible to deal with the massive amount of network data and the very specific tasks of network administrators.

The primary goal of visual analytics is to turn the information overload into an opportunity [281]. Visual analytics combines information visualization with automatic data mining methods to produce highly interactive software that couples human and machine analysis. The idea is to combine the strengths of human visual perception and electronic data processing, exploiting their respective advantages to achieve the most effective results. In visual analytics the user is fully integrated in the overall analysis process. This gives the user the chance to better understand automatic algorithms and their results and to steer the process in the most promising direction, thus matching the analysts' need for data exploration. To meet these requirements, visual analytics draws tools from both the information visualization and the data mining communities.

To achieve the overall goal of enhancing network security with visual analytics within the VIS-SENSE project, all partners need to be up to date on the most recent developments, state-of-the-art techniques and common practices in the fields of network analytics and visual analysis, with a focus on security applications. It is also necessary to have a baseline for the final evaluation of the whole project, against which the advantages of the framework to be developed can be assessed.
Therefore, this deliverable surveys the latest developments in network analytics and visual analytics with respect to the scope of the project.

The rest of the document is structured as follows. Chapter 2 covers anomaly detection, correlation and attack attribution, and additionally describes security issues concerning the Border Gateway Protocol (BGP). In particular, existing practices for detecting abnormal network traffic and malicious events are investigated. The survey covers intrusion detection systems (IDSs) that follow well-known paradigms and techniques such as expert systems and other data mining approaches, including different clustering and classification algorithms. In addition, techniques that combine more than one such method are discussed. A coarse-grained classification of the systems is provided, together with a short summary of the advantages and disadvantages of each technique. Because correctly detecting abnormalities remains a challenging research issue for these intrusion detection systems, correlation techniques have been employed to minimize the probability of false positive alarms and undetected events. Four basic categories of correlation methods have been identified: methods based on similarity between alert attributes; methods based on predefined attack scenarios; methods based on attack preconditions and prerequisites as well as post-conditions and consequences; and strategies that utilize several heterogeneous information sources. An approach that requires particular attention is the one based on honeypots. The security mechanisms for BGP traffic are a special area of deeper investigation, and the major secure BGP flavors that focus on the integrity of exchanged BGP messages are also discussed. Section 2.4 then discusses the state of the art in the analysis of spam campaigns as a concrete application area of the VIS-SENSE project.
In particular, it discusses methods for the analysis of IP reputation, message content, network-level spam detection, and the hosting infrastructure of scam websites that advertise their products through spam. Finally, that section outlines how these particularities are used for abstracting the higher-level behaviour of spammers. To give a more complete overview of the related work relevant to the project, Section 2.5 details recent publications about root cause analysis and attack attribution. The ultimate goal of work in this field is to understand the modus operandi of spammers and attackers in order to develop better security mechanisms. Since current attack phenomena are largely distributed across the Internet and their lifetimes vary from a few days to several months, it is difficult to attribute different multi-featured attacks to the same root source. The review of preliminary research in this subfield gives us a good starting point for future research and development in the scope of the VIS-SENSE project.

Chapter 3 then covers the state of the art in the field of visual analytics for network security. First, an introduction to the general field of visual analytics is provided by explaining basic concepts for visualizing data and describing the visualization techniques most commonly used in the field of network security. Furthermore, different interaction techniques are described, which turn visual analytics applications into interactive data exploration tools. While there are many tools that are custom-built for a particular data set or analysis task, Section 3.2 first focuses on generic data visualization tools that can be quickly applied to different kinds of data, since those tools are more likely to be used in practice by network security analysts and researchers.
The state of the art of current research projects that enhance network security tasks with visualization techniques is then discussed in Sections 3.3, 3.4 and 3.5, focussing on visual analysis tools and methods for BGP data, network traffic data and IDS logs. These sections give an overview and classification of the most relevant publications in this field. The last chapter summarizes the findings of this survey and briefly outlines its implications, and those of future developments, for both the project and the network security field in general.

2 Network Analytics for Security

2.1 Abnormal Network Traffic and Event Detection

2.1.1 Detecting network anomalies

There are two general, complementary approaches for detecting network anomalies [130]. The first defines what normal network operation is and develops techniques to identify deviations from the normal cases; these are the so-called behavior-based techniques. The second family of approaches defines the attacks directly and aims to identify them in the observed data; these are the so-called knowledge-based techniques. IDSs may be categorized according to different properties and features such as the detection principle, the behavior on detection, the source location, etc. [75], [20]. A further classification of the two complementary approaches, based on the fundamental principles of the analysis and investigation, reveals five basic categories:

1. Statistical-based approaches, which generate a profile of the stochastic behavior of the network traffic.
2. Expert systems, which are trained with rules to produce a certain output about the network state given a particular input.
3. Machine learning approaches, which initially require training with input data to identify normal network operation, building internal structures to represent such information.
Afterwards, they identify intruders by evaluating their actions and raising alarms when these differ from the range of values on which they have been trained.
4. Pattern matching approaches, which employ pattern matching (typically string matching) techniques to identify abnormalities.
5. State-based transition approaches, where finite state machines are used to represent the potential states of the network and to identify the set of requirements for a transition into an abnormal state.

Table 2.1 classifies several state-of-the-art intrusion detection platforms that contain behavior-based or knowledge-based detection modules. The majority of the investigated event detection methods fit into one of these categories; however, a number of them combine techniques from several categories. To cover the latter methods, the composite detection class is proposed. Table 2.1 also shows that the surveyed intrusion detection platforms are subdivided into two categories: self-learning and learning-based. Self-learning refers to systems that learn by example what represents normal network operation. The learning-based techniques need to be taught how to identify particular abnormal patterns of network traffic.

2.1.2 Behavior-based Network Intrusion Detection

While expert systems were the initial way of developing behavior-based network intrusion detection systems, their heavy requirements led to the investigation of methods based on neural networks, wavelets, Markovian models, Bayesian networks, genetic algorithms, etc.

Expert Systems

Expert systems were the first approaches to deal with anomaly detection. Normal operation is described in terms of rules that are stored in databases. The monitored traffic is processed by the rule-based system and alerts are raised when low-weighted matches are detected.
The three major steps of rule-based traffic classification are the identification of attributes and classes, the deduction of classification rules and parameters, and the classification of audit data. LERAD [199] is a rule-based algorithm for finding rare events in time-series data with long-range dependencies, which has been used for detecting anomalies in network traffic. LERAD uses association mining to find syntactic associations between attributes. More particularly, the anomaly score for each record is estimated based on the unsatisfied rules and the time since each particular rule was last violated. However, LERAD lacks a mechanism for distinguishing between correct and false alarms, and it cannot detect untrained anomalies. Another expert system, which employs fuzzy cognitive maps (FCM) for network anomaly detection, is described in [217]. An FCM is a visual model for encoding and processing unclear causal reasoning which, through its dynamic properties, is able to represent the time-varying characteristics of network anomalies. A matrix is used for detecting the propagation of events over managed network components, providing a causal inference representation mechanism. While expert systems provide robustness, flexibility and high-quality knowledge, they typically achieve these through time-consuming and demanding processes. Moreover, anomaly detection systems in general have challenging requirements for defining normality, which calls for exhaustive training with relevant data.

Table 2.1: Behavior and knowledge based system classification for abnormal event detection.
Behavior-based, learning based
  Expert systems: [199], [217]
  Supervised machine learning approaches: [122], [37], [73], [47], [249], [110], [213], [159], [214], [150], [80], [95], [196], [79], [58], [191], [22], [113], [329], [120], [72], [89], [115], [102], [85], [49], [202], [256], [189], [83], [114], [23], [41], [42]
Behavior-based, self-learning
  Statistical-based approaches: [77], [78], [44], [215], [197], [99], [127], [129], [324], [288], [300]
  Unsupervised machine learning approaches: [179], [128], [245], [86], [123], [135], [162], [169], [325], [319], [280], [88], [172]
Knowledge-based, learning based
  Pattern matching approaches: [134], [94], [52], [170], [231]
  Expert systems: [18], [101], [190]
  State transition and Petri-net modeling: [299], [136], [240], [173], [172]
Composite techniques, self-learning
  Hybrid approaches: [133], [233], [193]

Machine Learning Approaches

Machine learning approaches for network anomaly detection include methods based on neural networks, support vector machines, fuzzy systems, genetic algorithms, etc. The common goal of these techniques is to establish an explicit or implicit model that enables pattern categorization.

Neural Networks

The most challenging issue with a Neural Network (NN) is to train it properly and set the coefficients to their optimal values for the specified input and output. The general approach for IDSs is to initially train the NN with normal data as well as attack patterns. An early attempt to utilize NNs for IDS development is described in [122], where a hierarchical approach combines NNs with hidden Markov models (HMMs). An NN-based IDS based on well-known intrusion profiles is provided in [37]. An NN based on a statistical model, focusing on the architecture design of an expert system, is given in [73]. An NN based on Self-Organized Maps (SOMs) is described in [47], combined with a multi-level perception mechanism to detect attack patterns.
Another NN-based system using SOMs is the Integrated Network-Based Ohio University Network Detective Service (INBOUNDS) [249]. Six relevant parameters assist in characterizing network connections. The SOM utilizes structures with two-dimensional lattices of neurons. A predefined threshold is used to characterize activities as attacks or not. However, the system has limitations in identifying well-covered attacks, and corner-case behavior can give false positives. According to [110], training NNs with random data gives the best possible results in detecting unknown attacks. Using a recurrent NN, that work generalizes the results from particular users to categories. Nevertheless, NNs do not provide a descriptive model clarifying why a certain detection decision has been taken. Finally, a comparison between NNs and support vector machines (SVMs) is provided in [213], which concludes that SVMs are superior for IDS development since they are more effective in training and operation and have better accuracy and scalability.

Support Vector Machines

As stated above, Support Vector Machines (SVMs) are promising learning techniques used for categorization and regression in network anomaly detection. SVMs are a supervised learning method that is trained with normal data and afterwards used for detecting anomalies [159]. In [214], the superiority of SVMs over NNs is investigated, as mentioned above. In addition, that work deals with the feature selection issue for SVMs, demonstrating that SVMs can achieve the same performance as NNs using a smaller number of features. The investigation was based on the Knowledge Discovery in Data Competitions (KDD Cup) dataset [150], using five SVMs in total. While the SVM-based method achieved 99% accuracy, the NN-based method achieved only 87% and required more time.
However, if the training data contain traces of intrusions, SVMs will not be able to detect those intrusions in future attempts, since they are treated as normal cases.

Fuzzy Logic Techniques

To overcome the limitations of deterministic reasoning, fuzzy systems promote a framework in which reasoning is approximate rather than precise. Such reasoning fits nicely with the fuzzy variables related to network anomaly detection, where normal and abnormal events are distinguished by the values of the variables (lying in given intervals). The Fuzzy Intrusion Recognition Engine (FIRE) [80] is a fuzzy-logic-based IDS that uses data mining to classify the information and represents the discovered metrics as fuzzy sets. Well-known scenarios are expressed as fuzzy rules that are applied to the collected data in order to provide the output reasoning and classify the data as normal or malicious. An algorithm for calculating fuzzy relationship rules based on Borgelt’s prefix trees is described in [95]; it involves feature selection, optimization based on genetic algorithms, and improved-confidence fuzzy rules. Abnormal events are generated when the similarity between the trained fuzzy sets and the ones under evaluation is beyond a threshold. Such a system achieves appealing accuracy by means of modified data mining algorithms. The Intelligent Intrusion Detection Model (IIDM) [196] is another IDS based on fuzzy logic. The system involves a normalization step for balanced mining of fuzzy associations, and is afterwards applied to learn fuzzy frequency episodes, where the selected similarity function is continuous and monotonic. Nevertheless, a critical shortcoming of fuzzy systems is the requirement for costly offline analysis. A system that deals with this limitation is presented in [79]; it integrates fuzzy logic and genetic algorithms to select the best possible fuzzy rules.
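The general idea of fuzzy sets and fuzzy rules described in this subsection can be illustrated with a minimal Python sketch. It is not taken from FIRE, IIDM or any other cited system. The metric names (`syn_rate`, `distinct_ports`), the interval bounds and the alert threshold are hypothetical values chosen only for illustration, and the fuzzy AND is taken to be the minimum of the two membership degrees, which is one common choice.

```python
def rising(x, lo, hi):
    """Membership in a 'HIGH' fuzzy set: 0 below lo, 1 above hi, linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def scan_rule_degree(syn_rate, distinct_ports):
    """Degree to which the rule 'syn_rate is HIGH AND distinct_ports is HIGH' holds."""
    high_syn = rising(syn_rate, 100.0, 200.0)        # hypothetical interval
    high_ports = rising(distinct_ports, 20.0, 60.0)  # hypothetical interval
    return min(high_syn, high_ports)                 # fuzzy AND taken as the minimum


def classify(syn_rate, distinct_ports, threshold=0.5):
    """Label an observation by comparing the rule degree with a hypothetical threshold."""
    degree = scan_rule_degree(syn_rate, distinct_ports)
    return "malicious" if degree >= threshold else "normal"
```

With these assumed intervals, an observation halfway into both HIGH sets fires the rule with degree 0.5, whereas an observation with a low SYN rate is labelled normal no matter how many ports it touches.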
Genetic Algorithms and Immunological Techniques

Genetic algorithms provide another possibility for developing IDSs. Such evolutionary algorithms are considered global search heuristics that are able to detect novel network attacks, capitalizing on features such as inheritance, mutation, selection and recombination. An approach combining genetic algorithms with a decision tree is provided in [58]. It achieves a high detection rate and low false positives on previously unseen data, capitalizing on the fact that malicious events are inherently different from normal ones. However, the developed system has scalability limitations. Another approach inspired by genetic algorithms is described in [191], where both temporal and spatial information is considered for building the fuzzy rules. The evaluation function gives high weight to source and destination IP addresses as well as the duration, and less weight to the communication protocol used or the source port number. Crossover and mutation techniques are employed for the natural reproduction and mutation of the species, where the fittest chromosomes are selected. A similar technique is given in [22], where genetic algorithms are applied to TCP sessions and the packets of the same session are considered as sequences. The ROCK algorithm [113], which uses dynamic programming, is employed to cluster the sequence features of the TCP sessions. The compact clustered information provides a knowledge space in which normal and abnormal scenarios can be identified more effectively. Genetic algorithms are interesting approaches for detecting network anomalies since they are able to identify unknown attack patterns; however, they are considerably resource-demanding. A system that aims to partly address this issue is described in [329]; it combines clustering techniques with genetic algorithms, resulting in a high detection rate and reduced false positives in a resource-effective way.
Particular attention has also been given to immunological techniques. Such an IDS is presented in [120], where a set of immunological techniques is employed. More specifically, it includes permutation masks to counter false negatives, activation thresholds to aggregate activity over time, and adaptive thresholds to integrate patterns from several points. The r-contiguous bits match rule is used to compare incoming connections with classified ones, and matched connections are considered anomalous. A negative selection mechanism for distinguishing foreign patterns in the complement space is presented in [72], where a set of fuzzy rules for differentiating abnormalities in network traffic is generated using genetic search algorithms. A comparison between Positive Comparison (PC) and Negative Comparison (NC) approaches showed that the latter, while less accurate, are more effective in terms of resource requirements. Another immunologically inspired approach is presented in [89], where it has been observed that introducing synthetic abnormalities into the original data considerably improved the discovery of malicious anomalies, even unknown ones. However, immunological techniques may miss evident attacks and occasionally produce false positive alerts, as discussed in the evaluation of the Lightweight Intrusion detection System (LISYS), developed over the Artificial Immune System (ARTIS) [115]. Genetic-algorithm-based systems may face scalability limitations; however, they are able to effectively identify previously unobserved attacks.

Clustering and Outlier Detection

Unsupervised techniques such as clustering and outlier detection identify abnormal data by considering their deviation from the normal data. Such an alert clustering mechanism is presented in [102]. It operates in real time, taking into account that different network sensors may produce different reports and that a particular sensor may generate similar reports for different network events.
A geometric framework for unsupervised anomaly detection is presented in [85], where events are represented by d-dimensional vectors. Anomalies are identified as points lying in non-dense areas of that d-dimensional space. A number of mapping methods are evaluated within that work, including a data-dependent one for the network connections and one based on a spectrum kernel. The advantage of this technique is that it operates effectively on unlabeled data. Constant-width and k-nearest-neighbor clustering algorithms are employed in [49] to investigate connection logs for anomalies in the network traffic. A hybrid approach is described in [202]: user profiles are examined, expert rules are applied to decrease data dimensionality, and afterwards a clustering mechanism based on Learning Vector Quantization (LVQ) categorizes the data. Because LVQ is a nearest-neighbor method, abnormal events can be easily identified without the need to train the system a priori with network anomalies. This system achieves successful classification in about 80% of cases. ADMIT [256] uses semi-incremental methods to detect non-legitimate users of computer terminals. It introduces the concepts of dynamic training and dynamic clustering to deal adaptively with unobserved classes by creating new clusters as new data is captured. fpMAFIA [189] is a density- and grid-based high-dimensional clustering method for large amounts of data. fpMAFIA can generate clusters of arbitrary shapes and attains a high detection rate; however, it suffers from a high false positive rate. In addition, that work compares fixed-width clustering algorithms with density-based ones. The density-based clustering algorithm has an in-built limitation in effectively categorizing points that lie in sparse areas.
In general, clustering algorithms may require a long time to converge to a stable categorization. Moreover, statistical dependencies among raw data are not effectively represented by clustering methods, so such correlations may not be revealed. To achieve local convergence effectively, the SFK-means approach combines fuzzy logic and swarm intelligence algorithms [83]. The training phase produces improved classification on each repetition, while Euclidean distances are employed for the anomaly detection phase. In addition, mixture models are alternative clustering approaches that focus on modeling aspects. A finite Gaussian mixture model [114] is employed for stochastically approximating the maximum likelihood using the Expectation-Maximization (EM) method. Anomalous events are identified based on the fact that they exhibit rapid changes of the mean value, while the baseline random variable is stationary with zero mean. ADAM (Audit Data Analysis and Mining) [23] is a testbed for researching which data mining techniques are appropriate for IDSs.

A graph-theoretic approach for detecting abnormal network traffic is presented in [41], where networks are represented as graphs with relevant properties at the nodes, sampled at regular time intervals. The states of the graph snapshots form a space in which differences reflect the network changes as events occur. If the calculated distance between two subsequent states is larger than a threshold, an abnormal event alert is generated. More particularly, graphs with unique node labels are used to lower the computational complexity of graph operations. A concept called the median graph is used to measure the similarity of graphs [42]. The median of a set of graphs is a graph that minimizes the average edit distance to all graphs in the set.
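The snapshot-distance check and the median graph concept can be sketched in a few lines of Python. This is an illustrative sketch and not the implementation of [41] or [42]. It assumes unit costs for inserting or deleting a node or an edge, in which case the edit distance between two graphs with unique node labels reduces to the size of the symmetric differences of their node and edge sets. It also restricts the median to the members of the given set (the set median), which is a simplification. All function names are hypothetical.

```python
def make_graph(nodes, edges):
    """Build an undirected graph with unique node labels as (node set, edge set)."""
    return frozenset(nodes), frozenset(frozenset(edge) for edge in edges)


def edit_distance(g1, g2):
    """Unit-cost edit distance for uniquely labelled graphs (assumption: every
    node or edge insertion/deletion costs 1)."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    return len(nodes1 ^ nodes2) + len(edges1 ^ edges2)


def set_median(graphs):
    """Return the member of `graphs` with the smallest total distance to all members.
    Minimizing the total is equivalent to minimizing the average."""
    return min(graphs, key=lambda g: sum(edit_distance(g, other) for other in graphs))


def change_alerts(snapshots, threshold):
    """Indices of snapshots whose distance to the preceding snapshot exceeds threshold."""
    return [
        i
        for i in range(1, len(snapshots))
        if edit_distance(snapshots[i - 1], snapshots[i]) > threshold
    ]
```

For a series of snapshots taken at regular intervals, `change_alerts(snapshots, threshold)` flags the time steps at which the topology changed by more than the chosen threshold, and `set_median(snapshots)` gives one representative snapshot for the series.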
Multidimensional scaling (MDS) is applied to the complete set of pairwise graph distances to associate events on the network, which provides a scatterplot-based visualization for presenting anomalies.

Statistical-based Approaches

The aforementioned approaches depend heavily on the state of the network in which they are trained or configured. Significant changes in the network state require retraining of the system for it to operate effectively. In contrast, online learning and statistical techniques allow constant monitoring of the network state. An important feature that discriminates statistical anomaly detection from machine learning techniques is that statistical methods mainly focus on the statistical investigation of the gathered data, whereas machine learning methods focus on the learning procedure. The general process of statistical methods for network anomaly detection is first to preprocess and filter the raw data, then to perform the statistical analysis and transformation of the data, and finally to check whether conditions and thresholds are met to raise an anomaly alert. A significant amount of research focuses mainly on the second step, to distinguish normal operation from anomalous behavior and noise. Some early statistical approaches for network anomaly detection employed univariate models with Gaussian random variables [77], [78]. A technique based on an auto-regressive process is presented in [44], where information about the attacks is derived by applying Statistical Tests for Causality to the data from a Management Information Base (MIB). A similar IDS that uses Adaptive Regression Splines is described in [215]. More particularly, Multivariate Adaptive Regression Splines (MARS) are compared with SVMs and NNs. It is reported that MARS is superior to SVMs for classifying significant attack classes and that SVMs are superior to NNs regarding scalability, accuracy, and training and execution time.
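The three-step process of statistical methods described above (filter, analyze, threshold) can be illustrated with a minimal univariate Gaussian sketch in Python. It is not the method of [77], [78] or any other cited work. The function name `detect_anomalies`, the filtering rule (drop missing and negative readings) and the default threshold of three standard deviations are assumptions made for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(samples, baseline, z_threshold=3.0):
    """Return (index, value) pairs of samples that deviate strongly from the baseline."""
    # Step 1: preprocess and filter the raw data (assumed rule: drop missing
    # and negative readings, e.g. failed or corrupted counter polls).
    clean = [(i, x) for i, x in enumerate(samples) if x is not None and x >= 0]

    # Step 2: statistical analysis, here a Gaussian profile of normal traffic.
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has zero variance; no z-score can be computed")

    # Step 3: check the threshold condition and collect the alerts.
    return [(i, x) for i, x in clean if abs((x - mu) / sigma) > z_threshold]
```

For example, with a baseline of packet counts centred around 100 and a standard deviation of 2, a reading of 250 is reported while readings of 99 or 104 are not.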
Wavelet approaches

Wavelet analysis has been employed to model non-stationary data series, taking advantage of its time and scale localization abilities to identify abnormal events in traffic traces. For example, wavelet-based analysis techniques are employed in [197] and applied to network packets in a MIB to generate time series of traffic statistics. The system mostly aims at identifying correlations between mis-configured traffic and Retransmission Time-Out (RTO) events (which account for up to 33% of network misoperation) rather than attacks, and at generating related signatures. Another wavelet-based decomposition method is presented in [99], aiming at rapid network recovery. To provide scalability and adaptability, the method transforms the problem to the frequency domain, where anomalies are detected mainly using the medium and high frequencies. A fast wavelet algorithm allows its application to real-time traffic. Moreover, in order to achieve better performance than pattern matching methods, Waveman [127] is a wavelet-based framework that uses percentage deviation and entropy to evaluate the performance of various wavelet algorithms. It is concluded that Coiflet and Paul wavelets based on a five-minute, sixty-sample window are among the best for detecting network anomalies. WIND [129] is a prototype tool for Wavelet-based INference for Detecting network performance problems. WIND relies solely on passive packet properties collected at a single observation point, where time, scale and destination-based inter-relations among packets are detected and structured using wavelet algorithms. A covariance-matrix modeling and detection method is described in [324], where second order features are employed.
In this approach, statistical covariance matrices are used to represent normal network traffic conditions, and classification is performed using a threshold matrix generated from the Chebyshev inequality. Attacks are detected by estimating the difference from the categorized data. Such a method makes no assumptions about the distribution of the data. Wavelet approaches provide interesting scalability features; however, they require complex mathematical models.

PCA methods

An unsupervised statistical method that employs Principal Component Analysis (PCA) over global traffic matrix statistics is presented in [179]; it uses entropy as a metric to explore feature distributions and their structure. It is observed that such a method achieves an effective classification scheme in an unsupervised manner, enabling the discovery of both known and unknown anomalies. However, it is assumed that data processing takes place offline, which raises scalability issues for scenarios with real-time requirements. Another PCA-based approach is discussed in [128], suggesting a scheme that relaxes the need to centralize the available data by using filtering methods in order to achieve scalability. In addition, a stochastic matrix perturbation technique is employed to reduce the possibility of false alerts, providing the means to trade off accuracy against communication effort. Independent Component Analysis (ICA) is a similar approach presented in [245] that is able to split traffic into normal and abnormal components based on blind source separation. A scale-space filter is used to reduce the noise, and a zero-crossing technique is employed to mine the pulse widths of the stochastic behavior in order to select the largest ones as indicators of the behavior. These indicators assist in detecting abnormal events. ICA does not require supervised learning.
PCA methods are able to extract interesting features from the monitored data; however, their scalability is limited.

BN methods

Methods that employ Bayesian Networks (BNs) and Hidden Markov Models (HMMs) have also been investigated [86]. In particular, BNs are models able to capture the statistical dependence or causal relations between variables and abnormalities. The application of BNs to MIBs is presented in [123], where normal operation is captured in the structure of a BN in order to detect unknown anomalies when deviations occur. A Web-based automated network anomaly detection approach is described in [135], aiming to address issues in multitier systems by using a BN solution. More particularly, sequences of graph models represent the offered Web services and their dependencies as they vary over time. A feature vector is extracted from the adjacency matrix and the principal eigenvector of the graph eigenclusters is calculated. Anomalies are detected by observing irregular changes in the graph sequences. S3 [162] is a BN-based algorithm able to detect network anomalies. S3 aims to address short-lived anomalies by employing fine-grained timestamps on inputs such as traffic volumes, correlated packets and session bit rates. Combined, these signals provide sufficient information to detect anomalies with higher accuracy than time series-based and wavelet-based methods. Furthermore, the suitability of BNs for reducing false alerts is discussed in [169]. BNs provide a more advanced mechanism than unsophisticated aggregation methods that employ a single threshold to make decisions. Moreover, BNs allow a natural combination of information originating from different sources. However, the model used makes strong assumptions about the behavior of the target system.
HMM methods

An approach based on multivariate models that consider the correlations between two or more metrics is discussed in [325]. It relies on HMMs and the maximum likelihood principle to deal with dynamic features, while for the static ones it uses frequency distributions and minimum cross entropy. The multivariate models are suitable for experimentally collected data coming from multiple sources, since they show enhanced discrimination on such data. In addition, using subtractive clustering and HMMs, normal-anomaly patterns are produced for network traffic [319] that assist in correlating the observation sequences and network state transitions, thus detecting intrusion activities. A combination of HMM algorithms with models of user behaviour is presented in [280]. A related piece of work investigating HMMs is presented in [88], focusing on TCP session sequences. It quantizes and models TCP headers as Markov chains to represent the dynamics of the protocol; then it distinguishes normal from abnormal behavior. HMMs improve the network detection accuracy by reducing false alerts. Nevertheless, HMM-based approaches are complex procedures requiring long processing times, which makes them appropriate only for offline processing [172].

Statistical sequential change-point detection methods

A different method of designing IDSs is based on statistical sequential change-point detection, which estimates the deviation of a measured sample from the normal behavior using distance metrics based on the L-norm, Hamming or Manhattan distance, etc. More particularly, a statistical signal processing method is presented in [288], where it is assumed that traffic variables are quasi-stationary. In this work it has been observed that many false alarms are due to bursts in traffic. In the MIBs used, abrupt changes occur in a correlated manner.
Fine-granularity sampling provides useful results by means of a "network health" function that indicates anomalies in the network. A related technique is presented in [300] for identifying SYN flooding attacks at edge routers. Irregular behavior can be detected by applying sequential change-point detection, with a non-parametric cumulative sum method, to the differences between TCP SYN and FIN pairs, which are modeled as a stationary ergodic random process. Nevertheless, the aforementioned approaches assume that the network dynamics follow a quasi-stationary or stationary process, which does not always hold.

In conclusion, statistical approaches have several advantages for developing IDSs, such as the reduced need for prior knowledge. However, such approaches may be trained by attackers to consider attacks as normal. Moreover, fine tuning them to minimize false positive or false negative alerts is a complex procedure. Finally, in several cases it is not possible to model variables and system behaviors by stochastic means.

2.1.3 Knowledge-based Network Intrusion Detection

Pattern Matching Approaches

Pattern matching techniques were among the early approaches to knowledge-based network intrusion detection. However, such approaches introduced a number of issues, such as low processing capacity, high rates of false alerts, inability to identify unknown misuses and the requirement for an explicit signature for each attack. Commercial products such as ISS [134] match network data against predefined sets of patterns. An IDS that investigates string matching algorithms to detect security breaches is presented in [94]. The proposed algorithm is compared with other known string matching algorithms such as Aho-Corasick and Boyer-Moore using the Snort platform [244].
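To make the idea of multi-pattern signature matching concrete, the following is a minimal sketch of the Aho-Corasick algorithm mentioned above, which scans a payload once and reports every occurrence of every signature. It is not the algorithm proposed in [94] nor the Snort implementation; the function names, the signature strings and the payload are illustrative assumptions.

```python
from collections import deque


def build_automaton(patterns):
    """Build the goto, failure and output tables of an Aho-Corasick automaton."""
    goto, fail, out = [{}], [0], [[]]
    # Phase 1: insert every pattern into a trie.
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, pattern) for every match found in text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


# Hypothetical signatures and payload.
signatures = ["cmd.exe", "/etc/passwd", "passwd"]
payload = "GET /../../etc/passwd HTTP/1.0"
hits = search(payload, build_automaton(signatures))
```

The scan time grows with the payload length and the number of matches rather than with the number of signatures, which is why this family of algorithms is attractive for large rule sets.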
Another pattern-based IDS is described in [52], paying particular attention to reducing the processing requirements by disabling the pattern matching mechanism in periods where no traffic changes are noticed. Such changes are identified by employing time series analysis. However, this method does not work on links with high amounts of traffic. A different approach aiming to reduce the matching processing requirements is described in [170], where an ID3 clustering method is used for that purpose. The generated decision trees are employed to optimize the rules-to-input comparison, avoiding redundant operations when detecting malicious behaviors. Improved results have been reported when comparing that system with the Snort open source platform [244]. Bro [231] is a real-time IDS that employs an event engine for grouping traffic into high-level events and a policy script interpreter for defining security policies. Attacks are detected using string comparison operations. In general, while pattern matching approaches are simple, they face difficulties dealing with evolving networks and traffic conditions in a scalable manner.

Expert Systems

Based on rules that describe abnormal behavior, expert systems are able to generate alerts for network security breaches when fed with transformed audit events. The Next-generation Intrusion-Detection Expert System (NIDES) [18] has been developed to detect malicious activities on networks. NIDES has been designed around both anomaly detection and signature-based components. The system performs statistical and rule-based analysis of the audit data and presents the results to users graphically. The State Transition Analysis Technique (STAT) tool [101] models attacks as sequences of state changes that move the network from a secure state to a compromised one. CRITTER is a case-based reasoning (CBR) algorithm that is presented in [190].
CRITTER combines rules and conditions that lead to abnormalities. CBRs are alternatives to rule-based reasoning (RBR) techniques and carry fewer constraints than the latter. RBRs can be easily configured to define attacks. CBRs are more effective than RBRs for scenarios where the system must learn from past experiences and be able to cope with novel conditions that occur on the Internet. Expert systems based on the inductive approach produce if-then rules from provided data representing normal and abnormal cases, and are able to support mechanisms such as unification at a higher processing cost. Adaptive techniques are required to make CBRs function in changing environments. The number of functions necessary to address abnormalities scales linearly with the number of faults.

State Transition and Petri Net Modeling

State-transition diagrams and Petri nets are modeling techniques able to represent the different states of a network and identify intrusions. Generally speaking, state-transition diagrams are graphs where nodes are the states and links represent the potential transitions; when applied to network intrusion detection, nodes represent states of the network and links are the activities required to move from one state to another. NetSTAT [299] and its predecessor USTAT [136], [240] are prototype systems providing an environment to model networks based on state-transition diagrams. The hypergraphs employed provide vital information about the events to be monitored and the appropriate monitoring locations, assisting network administrators in their work. This approach provides a robust solution against unknown vulnerabilities. However, it does not adapt to changes in the state sequences, and considerable effort is required to configure it comprehensively.
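A minimal sketch of the state-transition idea follows, in which an attack scenario is a chain of states and each observed event may advance a host along that chain. It is not the NetSTAT or USTAT design: the state names, event names, host addresses and the `track` function are hypothetical and only illustrate the modeling principle.

```python
# Hypothetical attack scenario: (current state, observed event) -> next state.
TRANSITIONS = {
    ("secure", "port_scan"): "probed",
    ("probed", "exploit_attempt"): "exploited",
    ("exploited", "root_shell"): "compromised",
}


def track(events, transitions=TRANSITIONS, start="secure", goal="compromised"):
    """Follow each host through the scenario and return the hosts that
    reach the goal state, in the order in which they reach it."""
    state = {}
    compromised = []
    for host, event in events:
        current = state.get(host, start)
        nxt = transitions.get((current, event))
        if nxt is None:
            continue  # the event does not advance this scenario
        state[host] = nxt
        if nxt == goal:
            compromised.append(host)
    return compromised


# Hypothetical audit trail of (host, event) pairs.
events = [
    ("192.0.2.5", "port_scan"),
    ("192.0.2.7", "exploit_attempt"),  # ignored: host was never probed
    ("192.0.2.5", "exploit_attempt"),
    ("192.0.2.5", "root_shell"),
]
alerted_hosts = track(events)
```

The sketch also shows the configuration burden noted above: an attack that reaches the same goal through a different sequence of events is missed until a matching transition is added by hand.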
IDIOT [173], [172] is an approach based on coloured Petri nets (CPNs) that aims to detect components of partially ordered attack sequences. There are concerns about the scalability of this approach as the number of states increases, and about its ability to operate in real time. Nevertheless, the fact that IDIOT operates on abstractions of the raw data gives it a performance boost. Moreover, it is able to detect unknown attacks, exploit temporal relations, reuse modeled concepts and achieve a reduced false alert ratio. In summary, a shortcoming of finite state machine (FSM) methods is that some attacks require a very large number of states to be modeled comprehensively. The total number of states and related parameters therefore grows enormously and can only be handled in an offline process. These methods also lack adaptability as the network evolves.

2.1.4 Composite Detection

Some IDSs combine knowledge-based methods with anomalous behavior detection. Such approaches are designed to take advantage of both worlds. They are able both to identify the patterns of intrusive behavior and to relate them to the normal behavior of the network. The hybrid intrusion detection system (HIDS) [133] combines the advantages of an IDS and an anomaly detection system (ADS) to identify unknown attack scenarios. The ADS is built from mined episodes of anomalous network traffic. The two approaches are integrated by means of a weighted signature generation scheme. Another hybrid approach, able to detect and visualize network intrusions, is presented in [233]. Agents are employed to act against intruders in order to protect the network resources. The Production-Based Expert System Toolset (P-BEST) [193] is a system employed for developing a modern generic signature-analysis engine for detecting network misuse such as SYN flooding and buffer overruns.
Well designed composite techniques are able to provide a rich set of functionality; however, this advantage comes at the cost of high complexity and, in some cases, unnecessary redundancy.

2.2 Correlation Analysis and Alert Correlation

Intrusion Detection Systems (IDSs) operate as a supplement to other, more traditional security methods, such as network firewalls or certificates and cryptography. Depending on the configuration and on the ability of the deployed IDSs to detect intruders and their actions correctly, a very large number of alerts is generated continuously every day, many of which are false positives. Tuning and properly configuring IDSs eliminates a large number of trivial false alerts; however, a significant portion of spurious notifications remains. Moreover, most IDSs have limited observation abilities, both in terms of network space and in the kinds of attacks they can deal with. Evidence of attacks against network resources can be scattered over several hosts. It is challenging to design an IDS with properly deployed sensors and analysis capabilities that can detect the attacker's traces at different spots in the network during an intrusion attempt and find dependencies among them. Therefore, collaboration between different IDSs in correlating the analysis results and the relevant triggered alerts leads to improved results with a better description of the attacks, and provides stronger confidence in the raised security issues. Alert correlation techniques, which gather alerts from different sources and identify relationships among them in order to spot attack scenarios, are typical tasks of Security Information and Event Management (SIEM) systems.
Such techniques combine alerts with a high probability of sharing the same root cause, reduce the probability of false positive alerts, and rank the alerts by importance. Presenting this information with an appropriate visualization method can greatly assist security analysts by providing them with a decision support system. Alert correlation is an important multistep process of IDSs that combines information from heterogeneous network sensors, improves the ability to identify attacks, enriches the meaning and semantics of the attacks with more details, and reduces false positives. The general procedure includes dealing with alerts at multiple granularities, exploiting potential spatiotemporal relationships (e.g. origin, target, etc.), data fusion, and structure identification for detecting complex intrusion scenarios [296]. The Intrusion Detection Message Exchange Format (IDMEF) [74] has been proposed by the IETF for standardizing the format of the raw input alerts as well as for defining the alert exchange protocol. Time and synchronization are critical aspects of the alert correlation process in order to capture accurately the arrival order and the relevant timestamps [152]. A significant amount of work has been done on physical and logical clocks and timestamps in distributed systems.

2.2.1 Alert Correlation

Alert correlation focuses on discovering various relationships between individual alerts. The existing alert correlation techniques can be roughly divided into four categories [253], [315]:

1. methods based on similarity between alert attributes (such as start-time, end-time, source, and target of the attack), which cluster alerts by computing attribute similarity values;
2. approaches based on predefined attack scenarios, which build such scenarios by matching alerts to predefined templates;
3. techniques based on attack pre-conditions and prerequisites as well as post-conditions and consequences, which develop attack scenarios as chains in time by matching the post-conditions of earlier attacks with the pre-conditions of later attacks;
4. strategies that use several heterogeneous information sources, integrating different information types and carrying out reasoning based on triggered alerts and other collected information.

Data clustering techniques form the basic methodology of the approaches based on similarity between alert attributes, where the definition of appropriate similarity measures is the most critical issue [186]. Typical records about potentially suspicious events include information such as source and destination IP addresses and ports as well as timestamps, which form the attribute values. The similarity measurement algorithms then compute the distance between such events and cluster them accordingly, after which alert correlation techniques are triggered. This methodology yields a lower number of alerts, since similar alerts are clustered together and assigned to the same attack. An alert clustering approach to perform root-cause analysis is described in [145], where the root causes are the reasons for the triggered alerts. In such a system, groups of alerts are identified so that the grouped alerts correspond to the same root cause. In order to provide meaningful clustering methods, hierarchical generalizations are proposed for constructing high-level concepts of the alert attributes, e.g. a network address is a generalization of an IP address. Then, a series of dissimilarity measures is defined that are relevant to the produced generalizations. Measurements using this technique showed that the top 13 alert clusters account for 95% of all alerts.
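Attribute-similarity clustering can be sketched as follows. This is a minimal illustration under stated assumptions rather than the method of [145] or [186]: the equal weighting of attributes, the shared-prefix measure for IP addresses, the 300-second time window, the 0.8 threshold, the greedy assignment to the first matching cluster and the example alerts are all hypothetical choices.

```python
import ipaddress


def ip_similarity(a, b):
    """Fraction of leading bits shared by two IPv4 addresses (1.0 if equal)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return (32 - diff.bit_length()) / 32


def time_similarity(t1, t2, window=300.0):
    """1.0 for simultaneous alerts, falling linearly to 0.0 at `window` seconds."""
    return max(0.0, 1.0 - abs(t1 - t2) / window)


def similarity(x, y):
    """Unweighted mean of the per-attribute similarities."""
    parts = [
        ip_similarity(x["src"], y["src"]),
        ip_similarity(x["dst"], y["dst"]),
        1.0 if x["port"] == y["port"] else 0.0,
        time_similarity(x["time"], y["time"]),
    ]
    return sum(parts) / len(parts)


def cluster(alerts, threshold=0.8):
    """Greedy clustering: join the first cluster whose first alert is
    similar enough, otherwise start a new cluster."""
    clusters = []
    for alert in alerts:
        for group in clusters:
            if similarity(group[0], alert) >= threshold:
                group.append(alert)
                break
        else:
            clusters.append([alert])
    return clusters


# Hypothetical alerts; times are seconds since the start of the trace.
alerts = [
    {"src": "192.0.2.10", "dst": "198.51.100.5", "port": 80, "time": 0},
    {"src": "192.0.2.11", "dst": "198.51.100.5", "port": 80, "time": 30},
    {"src": "203.0.113.9", "dst": "198.51.100.77", "port": 22, "time": 4000},
]
clusters = cluster(alerts)
```

The shared-prefix measure is a crude stand-in for the hierarchical generalization of an IP address to a network address; a real system would define the hierarchy and the dissimilarity measures explicitly.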
A probabilistic framework to perform alert correlation is described in [295], where the similarity among alerts generated by different IDSs is calculated. This approach focuses on dealing with IDSs with heterogeneous alert attributes (e.g. IP addresses, ports, timestamps): the common features are identified first, and then both the minimum similarity and the expectation of similarity are estimated. The overall similarity is weighted by the expectation of similarity, with the similarities of the common features as its terms. An approach that performs time series and statistical causality analysis for detecting attacks is proposed in [246]. Moreover, an aggregation technique is defined for grouping lower-level alerts into a conceptual higher-level alert called a hyper alert, thereby leading to a smaller number of alerts and providing the means for alert ranking. Finally, after completing the aggregation, clustering and prioritization steps, the proposed system performs a statistical Granger Causality Test to detect the attack scenarios. The Mirador project [67] developed an alert correlation system based on multiple IDSs that operates in three steps. The first step is alert management, where a tuple is generated for each alert and stored in an RDBMS. Tuples are created by transforming IDMEF alert messages into the specified DB schema. The second step is alert clustering, where alerts belonging to the same attack are grouped together; its success depends on the correct evaluation of the similarity between alerts. Finally, the third step is alert merging within each cluster, where a global alert represents the whole group of alerts.

The approaches based on predefined attack scenarios require a series of attack steps to be correlated and aggregated in order to show the big picture of potential attacks. A typical approach is to define a set of required attack scenario templates.
An example of such a scenario is a template that defines an IP scan attack, then a TCP port scan attack, followed by an application buffer overflow attack. When an attack is identified, it is matched against the predefined templates as part of an attack scenario. Such an approach is quite beneficial for detecting already known attack scenarios; however, it cannot identify unknown attacks. Moreover, in certain cases it is not easy to list all attack sequence templates exhaustively. Such an approach is described in [76], where an architecture called ACC is proposed that aims to cluster alerts based on predefined relationships between them. Both aggregation and correlation relationships are identified, where the former aggregate alerts based on predefined criteria, while the latter discover the commonalities between attacks by identifying duplicates and consequences. Nevertheless, this method produces a large number of false positives. Another approach correlates alerts based on the chronicles formalism [212], where chronicles are a model of temporal event patterns used to monitor security events and to perform alert correlation. Chronicles thus aim to reduce the number of raised alerts and, more importantly, their false rate. Each chronicle includes timestamps, event patterns, time constraints and other related information. If the relevant chronicle conditions are fulfilled, the chronicle is considered valid and an alert is produced.

The approaches based on prerequisites and consequences of attacks rely on the fact that intruders usually perform attacks in steps, where earlier steps set the conditions for subsequent ones. An example of such sequential steps is to start with an IP sweep to find live hosts in a network.
Afterwards, attackers may scan for open ports on the discovered live hosts to find vulnerable services, and finally start a buffer overflow attack on the specific hosts. Within such a sequence of attacks, causal relationships can be identified that form attack scenarios and provide a more comprehensive view of security threats. The prerequisites are mandatory conditions for follow-up attack steps to take place, while the consequences are possible attack results. Attack modeling languages such as LAMBDA [69] and CAML [57], or first order logic methods, can be used for modeling the prerequisites and consequences. More particularly, an approach applying abductive correlation using pre- and post-conditions is described in [68], where alert clustering takes place first, followed by a merging process using appropriate similarity functions. The LAMBDA attack specification is employed to automatically generate correlation rules in both a directed and an undirected manner. Using the produced rules, alert information such as types, attribute values and timestamps is extracted and checked against the rules. The identified series of correlated alerts produces a complete attack scenario. On the other hand, first order logic is employed in different approaches [218], [219] to describe pre- and post-conditions as well as causal relationships among alerts. Both pre- and post-conditions are defined for each generated alert by extracting the relevant alert attribute values, which are processed afterwards to find their correlations via possible partial matches. The correlated alerts are grouped to construct potential attack scenarios, forming prepare-for relation models. These relations are used for constructing directed acyclic correlation graphs, where the nodes correspond to alerts and the links indicate "prepare-for" relations.
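The construction of such a correlation graph can be sketched as follows. This is a minimal illustration and not the formalism of [218] or [219]: predicates are plain strings matched exactly (no partial matching), and the alert names, predicate names and timestamps are hypothetical.

```python
def prepare_for_edges(alerts):
    """Link alert a to alert b when a occurred earlier and at least one
    consequence of a is a prerequisite of b."""
    edges = []
    for a in alerts:
        for b in alerts:
            if a["time"] < b["time"] and a["consequences"] & b["prerequisites"]:
                edges.append((a["name"], b["name"]))
    return edges


# Hypothetical alerts mirroring the sweep, scan and overflow sequence.
alerts = [
    {"name": "ip_sweep", "time": 1,
     "prerequisites": set(),
     "consequences": {"host_alive(192.0.2.8)"}},
    {"name": "port_scan", "time": 2,
     "prerequisites": {"host_alive(192.0.2.8)"},
     "consequences": {"service_open(192.0.2.8,80)"}},
    {"name": "icmp_ping", "time": 3,
     "prerequisites": set(),
     "consequences": set()},
    {"name": "buffer_overflow", "time": 4,
     "prerequisites": {"service_open(192.0.2.8,80)"},
     "consequences": {"root_access(192.0.2.8)"}},
]
edges = prepare_for_edges(alerts)
```

Requiring the earlier alert to have a strictly smaller timestamp guarantees that the resulting graph is acyclic. The unrelated `icmp_ping` alert stays isolated, which is how this family of methods filters out uncorrelated alerts.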
An extension of this technique [221] employs hypothesis and reasoning methods to further detect unidentified attacks, based on the observation that intermediate attacks missed by the IDSs may have split a single scenario into multiple attack scenarios. The identification of the relevant constraints regarding the possible multiple attacks can be used in the hypothesis process to discover the relevant attribute values. The hypothesized attacks are validated using the original data from the sensors, and failed attacks are removed. Using the validated attacks and the existing alerts, concise attack scenarios are constructed. In a different approach, JIGSAW [273] describes attack conditions using capabilities and concepts. Capabilities specify the information that intruders need to know to perform particular attacks, such as user names and passwords, as well as the required conditions that clarify the context of the attack. Concepts, on the other hand, model fragments of complex attacks, using capabilities to specify the pre- and post-conditions. Complex attack scenarios are then detected by correlating the capabilities included in a particular concept with the capabilities of other concepts, thereby discovering, for instance, that a remote shell connection spoofing relies on a denial-of-service attack. Nevertheless, there are several disadvantages related to the approaches based on pre- and post-conditions. They rely on the strong assumptions that only well defined alerts exist and that attacks trigger multiple alerts; therefore, they ignore unrelated and uncorrelated events [220]. However, observations of collected data show that such assumptions do not necessarily hold. Moreover, the conditions for each alert need to be specified manually, and no automatic correlation operations are involved.
Finally, when the modeling phase involves only dependencies between alerts, it is challenging to monitor the evolution of an attack scenario in real time, which makes these approaches hard to use in demanding use cases.

Developing methods that are based on multiple information sources is a promising approach to providing complementary security assurance to networks. However, the volume of produced alerts increases heavily and becomes a challenge for the design of such IDSs, as their users are overwhelmed by a huge number of alerts, making it hard to detect the critical ones and prioritize them. Moreover, lack of cooperation and coordination among the considered sources of information hinders the investigation process. A mission-impact-based technique is proposed in [241], aiming to automatically correlate alerts coming from several heterogeneous and spatially distributed information sources such as network firewalls and deployed IDSs. The host configurations are considered when alerts are inspected for system vulnerability. This approach relies heavily on maintaining two DBs, one for the incident handling fact base and one for the topology map of the protected network and hosts. Moreover, a series of processing steps is specified in order to perform filtering, topology inspection, priority calculation, event ranking and alert grouping. The DBs are critical sources of information for the topology inspection, in which a relevance score is calculated for each raised alert. This score defines the dependency between the incidents and the associated component configurations. Finally, the generated priorities indicate the degree to which an event affects the valid operation of the network. The M2D2 model [215] aims at removing false positive alerts. The model relies on defining the sensor capabilities in a formal manner and considering their scope and position in the network to decide whether an alert is a false positive.
M2D2 verifies whether all the pertinent sensors able to detect an attack confirmed it during the detection phase, under the assumption that inconsistent reports denote false alarms. However, this method can be compromised by attackers who participate in the voting process, which is in itself difficult to prevent. An extension of the M2D2 model, called the M4D4 data model [212], [31], has been designed to provide reasoning about the security alerts as well as the relevant context in a cooperative manner. The extended model is a reliable and formal foundation for reasoning about complementary evidence, providing the means to validate alerts reported by IDSs. A joint approach to correlating alerts from multiple IDSs is described in [314], where a fundamental concept is the assumption that correlation is based on triggering events. Thus, clustering incidents that come from similar triggering events allows them to be partitioned into distinct groups that may be related to the same attack attempt. In addition, the consistency between alerts of the same cluster and the related configuration descriptions can provide further assurance about the accuracy of the results and can be used to rate the severity of alerts and clusters. A second fundamental concept is the importance of input and output resources for the derived correlation level, considering that input resources are those required for an attack to be accomplished, while output resources are those supplied when the attack succeeds. The discovery of resources that are common to the input of one attack and the output of another allows causal relationships among grouped alerts to be recognized in order to develop attack scenarios. A decentralized IDS is described in [171], seeking both to correlate the gathered events and to fuse the relevant data observed among the multiple sensors.
The deployed monitoring points collaborate using the peer-to-peer paradigm, exchanging events relevant to complex distributed attack scenarios. Afterwards, a distributed misuse detection algorithm is employed to perform event correlation. An inherent challenge of this approach is the requirement for a correct temporal ordering of the events, which may be hard to achieve.

2.2.2 Monitoring from several vantage points

Maintaining the routing tables of a huge and heterogeneous network such as the Internet is a challenging issue considering the large number of prefixes, ASes and BGP updates. Therefore, detecting abnormal update events requires advanced data mining methods to investigate the roots of the problem [260]. Pinpointing the exact cause behind observed network routing issues remains a complex problem. A formal framework to represent and study MOAS events and relevant network management activities is described in [211]. A learning approach is applied to raw BGP data to evaluate and rank the possible relevant actions. It was found that, although multiple ASes promptly took reactive actions before the false BGP updates were corrected, more than 90% of the affected prefixes were routed back to their correct routing path. Another distributed measurement framework for pinpointing routing changes is described in [272]. In this work, each AS maintains an accurate view of the routing changes that have occurred. Then, for each route modification, the involved measurement servers are queried following the path from source to destination, aiming to detect the exact location of and the reason for the modification. A large study of real network control traffic over the Sprint and AT&T backbone networks has been performed in [271], where the impact of BGP routing modifications on network traffic is investigated.
It has been observed that a small number of routing modifications have a significant impact on data traffic, while the majority of them have little influence. A formal model of BGP dynamics is given in [112], highlighting the differences among the observations of multiple network monitors and focusing on route flapping. Root-cause analysis is a typical technique for detecting the reason for and the location of BGP route modifications. An investigation into detecting the AS responsible for a routing change is provided in [91]. Correlating BGP update messages for prefixes collected at several observation locations forms the basis of the method. In particular, pinpointing the AS at the origin of a change is conducted in two steps. The first involves simulations on snapshots of the AS topology as it can be reconstructed from the BGP updates, assuming properly behaving routers. Then, a number of heuristic algorithms are suggested to deal with the restrictions of the actual update procedure. The differences between the simulation and real-world observations give insights about the deployed observation points. A VA-based approach combining computer and human intelligence via properly selected visualization techniques for BGP anomaly event analysis and correlation is given in [275]. The work provides interactive means for presenting BGP OASC (Origin AS Change) events, demonstrating the superiority of VA techniques. A distributed system for real-time IP prefix hijack detection is provided in [332], capitalizing mostly on data plane observations. This work rests on two key assumptions: first, that the path hop count from a source to a legitimate prefix is generally stable, and second, that the path from a source to a legitimate prefix is almost always a superpath of the path from the same source to a reference point along the previous path, for reference points topologically close to that prefix.
The appropriate choice of vantage points for monitoring modifications that do not meet the aforementioned assumptions is critical for raising valid alerts. A Principal Components Analysis (PCA) based method for root cause analysis of BGP updates is provided in [316], aiming to build groups of prefixes or AS numbers that are affected by the same BGP update message. The method uses BGP update data from multiple border routers inside the AS to detect BGP routing changes. However, this method has limitations when two distinct events affect the same prefix or AS during the same observation time. An online tool able to generate a relatively small number of alerts out of millions of BGP messages is described in [311], where r-vector data is proposed to detect and capture BGP modifications. Correlating time and prefix modifications is a critical step to identify unstable routes. The tool has been used on a Tier-1 ISP backbone with hundreds of border routers, with very interesting results. Another root-cause analysis method is described in [45], which aims to detect the cause and the origin of a routing change. Correlating routing updates across different vantage points reduces false or redundant events. This method performs well on the analysis of events affecting relatively stable prefixes, and it has been applied to use cases such as multiple BGP session resets during Internet worm attacks [301] and the analysis of updates generated by BGP Beacons [201] to pinpoint the update sources. A spatiotemporal clustering method that utilizes path vector information for assigning several related messages to the relevant events is described in [51].
The approach classifies the effect of routing events and estimates the distances to the originating AS, observing that more than 45% of path changes are caused by events on transit peerings and that several path changes are transient, indicating short-term path modifications. An algorithm for pinpointing the origin of routing changes, called MVSChange [178], proposes a simplified BGP model called Simple Path Vector Protocol (SPVP), combined with a graph model of the Internet, to locate the origins of updates using multiple vantage points. The mechanics of router reactions in the presence of large routing tables are examined in [50]. Some routers exhibit table-size fluctuations that can cause cascading failures. Moreover, it has been found that in some cases an administrator is needed to recover from failures, while in others BGP mechanisms such as prefix limits and route flap damping only partially handle the overhead of large routing tables. The aforementioned methods study streams of BGP messages from multiple observation sensors, aiming to infer the cause and the origin of an unstable route. Root-cause analysis fits nicely with attack attribution. The aforementioned approaches, which capitalize on distributed monitoring, depend heavily on the selection of the position of the sensors. Thonnard et al. [284] employed multi-dimensional data mining techniques for extracting actionable knowledge about network security issues, aiming to build global indicators about existing malicious activities and to investigate the modus operandi of emerging security threats under the supervision of specialists. For this, a graph-based KDD approach is applied to evaluate real data from attack traces. More particularly, a clique-based clustering technique is used to extract the critical knowledge; afterwards, combined with a multidimensional synthesis process, a concept lattice is created to describe the observed phenomena.
Moreover, a customizable analysis framework for deeper investigation of raw honeynet data has also been developed [283], providing the means to discover traces in the network with common activity patterns. A clustering mechanism applicable to several feature vectors has been designed, focusing particularly on the time series of the attacks. In particular, clique-based analysis can assist honeypot forensics by highlighting correlation patterns, even when they refer to completely independent attacks. A result of this work demonstrated that appropriate similarity measures greatly assist in clustering attack patterns for detecting the probable root causes of attacks. In addition, a fuzzy inference system [285] has been employed in a knowledge discovery technique aiming to reproduce expert reasoning for attack attribution as closely as possible. A multi-criteria decision-making process takes as input the knowledge extracted from large-scale attacks in order to discover and attribute them. This method is particularly useful against distributed attacks by zombie armies. The aforementioned pieces of work on attack attribution automatically group together events that are likely to be due to the same underlying root cause. These techniques offer an automated means to apply a multi-criteria decision process to cluster groups together. Applications of the method have shown its usefulness but also its limits when it comes to explaining why events have been grouped together. Finally, systems such as honeynets [242] may also be set up specifically to support network attack detection and attribution. Another interesting approach is that of forward-deployed IDSs. The philosophy of forward-deployed IDSs differs dramatically from that of typical IDSs, since they are deployed as close as possible to the attackers in order to maximize attribution information [307].
The great benefit of these systems is that they are able to supply faster and more accurate information about the location of the attackers at reduced cost, by finding the correlations in the information gathered from the local log files [255]. However, false positives and negatives are possible, requiring continuous observation. Moreover, forward-deployed IDSs require some prior information to be deployed well, and it may not be possible to place them close enough to the attacker location. Forward-deployed IDSs also need stronger protection, since they are more exposed to attacks and must avoid being disabled, being controlled by attackers, or revealing the internal detection policies. Nevertheless, assuming that such policies can be updated fast enough, the forward-deployed IDS can become an input debugging tool [254], where upstream routers are supplied with a policy/pattern of the target and requested to generate an alert when the pattern is matched in future attempts. Forward-deployed IDSs are able to identify the initial attack event without requiring several messages to begin attribution. A number of techniques have been proposed for Level 3 attribution based on Bayesian networks [123], which are able to handle incomplete data scenarios, Hidden Markov Models (HMMs), which can unify analysis steps from different perspectives, Self-Organizing Maps (SOMs) and game-theoretic models [54]. For example, the latter method allows trackers to detect the methods utilized by intruders by comparing the actual evidence with the attack trajectory predicted by the models. Spoofed message discovery is of significant importance for attribution because it allows the interaction with protocols attempting to uncover attackers [274]. A relevant commercial tool called eTrust Network Forensics is a widely accepted environment for carrying out multiple kinds of automated analysis for attack attribution and trace-back procedures.
2.3 BGP State-of-the-art

2.3.1 Background

The Internet is partitioned into tens of thousands of independently administered routing domains called Autonomous Systems (ASes), where an AS corresponds to an ISP, a company, a government body, an academic institution, etc. The Border Gateway Protocol (BGP) is the de facto inter-domain routing protocol that maintains and exchanges routing information between ASes. Since January 2006, BGP version 4 has been specified in RFC 4271 [8]. When interconnecting, two ASes must be able to exchange network reachability information. Unlike intra-domain routing protocols that route packets through the shortest possible network path, BGP lets each AS define its routing policy, which is then enforced on each BGP-speaking router by filtering incoming and outgoing update messages [297, 198]. A BGP update message is exchanged between two BGP-enabled routers to announce or withdraw network addresses reachable through them. Such a message mainly contains the destination network address, the AS path to the destination, and preference indicators (see footnote 1). The AS path is built sequentially: when a router exports a route to a neighbour, it prepends its unique AS number to the path it has received (see footnote 2). The first AS exporting a route to a given network is called the originating AS: the update message then contains a single AS number in the AS path field. The AS path is primarily used to avoid routing loops between ASes. Indeed, a router receiving an update containing its own AS number in the AS path field will not consider the route, as it is already on the path to the destination. Because of these routing policies, and unlike intra-domain routing protocols that use the shortest possible path to the destination address, BGP uses the following mechanism to select the preferred route. First, when multiple network addresses overlap, BGP uses the longest prefix match rule.
Then, for identical network addresses, BGP selects the route with:
1. the highest local preference,
2. the shortest AS path,
3. the lowest Multi-Exit Discriminator (MED).
If multiple routes are still possible, tie-breaking rules are applied [297]. The local preference is a value assigned to a route as part of the AS policy. It is only relevant within an AS and is not communicated to external networks. The MED value is used to balance traffic between multiple possible links between two ASes and is only shared between them [297, 43]. Once the process has successfully selected a route to a prefix, the route is added to the forwarding table.

Footnote 1: Note that in case of a withdrawal, an update message only contains the network address.
Footnote 2: Public AS numbers are uniquely assigned by Regional Internet Registries (RIRs). Their values range from 1 to 64511. Private AS numbers (from 64512 to 65535) can be used locally for a connection between a network and its provider [297, 8].

BGP security issues

BGP was designed on the basis of implicit trust between all participants, and no measure exists in the protocol itself to authenticate the routes injected into or propagated through the system. Therefore, virtually any AS can announce any route into the routing system, and bogus routes can sometimes trigger large-scale anomalies in the Internet. This intrinsic weakness of BGP can lead to so-called prefix hijacking attacks, be they intentional or not (i.e., due to a real attack or to a router misconfiguration). Prefix hijacking basically consists in redirecting Internet traffic by tampering with the control plane itself (i.e., the BGP protocol). The problem of prefix hijacking is considered a crucial one and has recently received much attention.
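The selection order described in the Background above, and the reason an unauthenticated announcement can attract traffic, can be illustrated with a minimal Python sketch. It is an illustration under simplifying assumptions, not a BGP implementation: the `Route` class and the `best_route` function are hypothetical names, the tie-breaking rules are omitted, the MED is compared across all candidate routes although in practice it is only shared between two ASes, and the prefix and AS numbers are documentation and private values chosen for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified view of an announced route."""
    prefix: str            # announced network, e.g. "192.0.2.0/24"
    as_path: tuple         # AS numbers, originating AS last
    local_pref: int = 100  # local policy value, never sent to other ASes
    med: int = 0           # Multi-Exit Discriminator


def best_route(routes, destination):
    """Select a route for `destination` in the order given in the text."""
    dest = ip_address(destination)
    candidates = [r for r in routes if dest in ip_network(r.prefix)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (
            -ip_network(r.prefix).prefixlen,  # longest prefix match first
            -r.local_pref,                    # then highest local preference
            len(r.as_path),                   # then shortest AS path
            r.med,                            # then lowest MED
        ),
    )


# Assumed example data: a legitimate route and two bogus announcements.
legit = Route("192.0.2.0/24", (64512, 64513, 64514))
shorter = Route("192.0.2.0/24", (64512, 64666))
more_specific = Route("192.0.2.0/25", (64512, 64520, 64521, 64666))

print(best_route([legit, shorter], "192.0.2.10"))
print(best_route([legit, more_specific], "192.0.2.10"))
```

In both calls the bogus announcement is selected: once because its AS path is shorter, and once because its prefix is more specific, even though its AS path is longer. Nothing in the selection itself checks whether the announcing AS is entitled to the prefix.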
There are indeed some claims that the core infrastructure of the Internet may be misused by attackers in one way or another to surreptitiously perform malicious activities. For example, in [248] the authors have shown evidence that, in a few limited cases, it is quite likely that attackers were misusing the BGP routing protocol to hijack blocks of IP addresses for limited amounts of time, so as to launch spam campaigns from legitimate-looking blocks of IP addresses. If successful, such techniques would clearly defeat the spam blacklists that anti-spam tools use as a first layer of defence against spammers. Since one of the main objectives of VIS-SENSE is to correlate security events with possible traces of attacks targeting the core of the Internet, we perform an extensive study of BGP prefix hijacking and its related concepts in Section 2.3.2. We then briefly describe a few techniques developed for securing BGP in Section 2.3.3. In Section 2.3.4, we review some popular tools for the observation of the BGP routing process. Finally, we conclude this state-of-the-art on BGP by reviewing available methods and services for detecting BGP hijacking attacks in Section 2.3.5. It is worth noting that, as BGP runs over TCP/IP (BGP listens on TCP port 179), the protocol is also subject to the same attacks as any other protocol relying on TCP, e.g., Denial of Service (DoS) attacks (such as SYN flooding or RST spoofing), eavesdropping, attacks against packet integrity, replay attacks, etc. [43]. However, as interesting as these attacks may be, they are out of the scope of this document.

2.3.2 Prefix Hijacking

Prefix hijacking (also known as BGP hijacking or IP hijacking) is the act of absorbing (a part of) the traffic destined to another AS through the propagation of erroneous BGP routes. It can be the result of router misconfigurations [198] or of malicious intent [21, 43, 124, 247, 268].
Regardless of the intentions of the issuer of the incorrect routes, we will refer to it as the hijacking AS. In the same fashion, the route propagated by the hijacking AS is the hijacked route. The network whose route has been hijacked will be referred to as the victim AS. The correct route to the victim AS is referred to as the legitimate route (or the original route). Finally, any occurrence of prefix hijacking will be considered as an attack.

Objectives

By hijacking the traffic of another AS, an attacker may [331, 222]:
(i) create a black hole, i.e., perform a complete Denial of Service of an AS/prefix;
(ii) impersonate the victim by stealing its AS's identity and imitating certain services (e.g., duplicate a website);
(iii) intercept the traffic to eavesdrop on (or record) the exchanged data, and then forward the data back to the victim AS (i.e., a case of subversion);
(iv) create network instability by triggering connectivity outages [261].

To achieve these objectives, different types of attack can be employed. These are explained below. For illustrative purposes, we then describe a few public incidents of BGP hijacking that appeared in the headlines in recent years.

Types of Hijacking

IP prefix hijacking can be performed in several ways. Hu et al. present a taxonomy of hijacking attacks in [124]. Similar work was done by Lad et al. in [176] and Katz-Bassett et al. in [149]. The attacks are usually based on the following key elements:

• AS ownership: the hijacking AS claims to be the origin AS of the prefix. Since the hijacker is advertising itself as the origin AS, the AS path is much shorter than the one of a legitimate route. The hijacked route is then selected (if only by peers of the hijacking network) to route to the victim network.
This kind of attack can usually be easily detected because it creates a so-called Multiple Origin AS (MOAS): a single prefix is originated from multiple ASes. Note, however, that there may be valid reasons for a network to be a MOAS (e.g., multihomed stub networks) [330], so it is not always trivial for an external observer to differentiate between a legitimate MOAS route and a prefix hijacking attempt.

• Intermediate hop: the hijacking AS claims to be closer to the origin AS than it really is. The announced AS path is longer than for an ownership attack, but the attack is also harder to detect. Usually, the attacking AS will claim to be the second hop, since being any further down in the path would significantly decrease the amount of hijacked traffic [21]. Because the victim's AS number is still the originating AS, it does not create a MOAS route. Another approach to this type of attack is described in [203], where the authors study the amount of traffic that can be stolen with an intermediate hop attack, depending on how far in the AS path the hijacking AS is. The idea is not to hijack the whole traffic of the victim, but to attract only a small percentage of the packets destined to it, enabling the attacker to go undetected for a longer time (i.e., a stealthier attack).

• Subprefix hijacking: the hijacking AS propagates update messages containing a route to a more specific network address than the original announcement. Because of the longest prefix match rule, this is a very effective attack: any router that receives (and accepts) the incoming route will automatically forward any traffic destined to the victim to the hijacking AS. The victim has only two ways of dealing with this attack:
1. inform the NOC of the hijacking AS that they are misbehaving.
Since it is unlikely they will cooperate if the attack is not the result of a misconfiguration, the victim will have to get the cooperation of an upstream provider of the hijacking AS, which can be quite complicated.
2. announce an even more specific prefix for the network. However, this countermeasure may also fail in some cases, since most ASes tend to reject overly specific incoming routes in order to keep the size of the routing table as low as possible (usually, anything more specific than a /24 is dropped) [297, 124, 43].

• Supernet: the hijacking AS propagates update messages containing a route to a less specific network address than the original announcement, hoping to receive the traffic whenever the legitimate AS is unavailable, or to use a range of addresses not covered in the original announcement.

• Invalid or unassigned prefix: the hijacking AS announces a network prefix that is not assigned to any entity (e.g., a bogon). In this case, there is no victim AS, but malicious activities can be easily carried out by using these addresses (e.g., spam campaigns).

Finally, it is interesting to note that, depending on the position of the hijacking AS in the Internet hierarchy [100], the probability of a successful hijacking may vary quite substantially (between 38 and 63% according to [21]).

Some public incidents

In recent years, several cases of BGP hijacking have made the headlines. We briefly describe some of them to illustrate the concepts explained above.

The AS7007 incident

The first BGP-related incident on the Internet dates back to April 25, 1997, when AS7007 (assigned to MAI Network Services (MAI), a regional ISP in Virginia, USA) started, as the result of a misconfiguration, announcing highly specific routes to one of its providers: Sprint (a large backbone network). Sprint did not filter out those announcements and started propagating them.
Because of the size of Sprint's network, the erroneous routes completely contaminated the Internet, resulting in a large-scale prefix attack coupled with an ownership attack. When MAI noticed what was happening (within 15 minutes), they disconnected themselves from the Internet. However, the highly specific routes still existed for a while, resulting in a massive blackhole of the global network that lasted a bit less than 6 hours [38, 60].

Christmas Eve leak

On December 24, 2004, TTNet (the largest ISP in Turkey) started announcing over 106,000 prefixes to Telecom Italia, who had not set a maximum prefix count on the incoming routes from TTNet, so they accepted the routes and started propagating them upwards. Fortunately, the receiving peers had an upper limit on the number of accepted incoming routes and it was rapidly reached. Unfortunately, more specific routes were still propagated, albeit in a small number, which resulted in a virus-like propagation of erroneous routes (everybody got a little bit infected). The event lasted a little under 12 hours [239].

The YouTube attack

On February 24, 2008, the Pakistani government decided to forbid access to YouTube [251]. YouTube is announced with an aggregated /22 prefix. Pakistan Telecom decided to enforce the ban by BGP means and announced the /24 prefix of YouTube that contains YouTube's DNS and web servers. Somehow, Pakistan Telecom announced that route outside of their networks, including to their provider, PCCW Global, which did not filter it and propagated the more specific /24 route to the rest of the world. For approximately 80 minutes, the whole traffic of YouTube was blackholed in Pakistan. YouTube then reacted by announcing even more specific /25 subnets, which resulted in getting the traffic redirected to them.
Roughly 2 hours after the start of the attack, PCCW Global withdrew the routes originated by Pakistan Telecom, and YouTube reaggregated its announcement to the original /22 prefix.

China Telecom

On April 8, 2010, China Telecom released 37,000 prefixes instead of the normal amount of 40, affecting networks owned by CNN, Dell, Apple, US DoD, France Telecom, Amazon Deutschland, and others, for approximately 20 minutes [289, 175, 174, 206, 309]. About 15% of the global routing table was apparently affected [175, 174]. The impact in North America and Europe was minimal [175, 160], although the impact in Asia was certainly not negligible. The incident raised awareness about the fragile security of Internet routing in the media, which started drawing conclusions about "Cyber-War". However, there is a consensus among experts that the incident was most likely due to a misconfiguration.

DEFCON Man-In-The-Middle

While not an incident in its own right, the Man-In-The-Middle BGP attack presented in [238] is very instructive and probably one of the most dangerous types of hijacking. Unlike the previous incidents, the goal here is to silently redirect Internet traffic through another AS, and then forward it back to its final destination. While diverting the traffic to another network can actually be simple, the trickiest part is to be able to forward it afterwards to the legitimate owner. Pilosov and Kapela demonstrated how to do this during DEFCON 16 in 2008 [238]. First, they identify a possible path from the attacking AS to the destination. This legitimate route will be neither modified nor hijacked, as it will be used as the return path to forward the traffic back to its destination. Secondly, to attract the traffic, the hijacking AS will perform a regular subprefix attack, but it will prepend the return path to the destination AS in the announced AS path.
As a result, many routers will receive and accept the more specific routes, except the routers on the forwarding route, which will discard them because their own AS number is already in the AS path. As such, this can be seen as a combination of a subprefix attack (i.e., regular blackholing) with an intermediate hop attack. The good news, however, is that BGP man-in-the-middle has not been observed yet in the wild [118].

2.3.3 Securing BGP

Like many other protocols in the Internet, BGP was originally designed on the premise of mutually trusting and well-behaving entities, and thus no measure has been included in the protocol itself to authenticate the routes propagated by BGP routers. As a result, there have been many proposals for securing BGP and inter-domain routing. Current research efforts attempt to secure the confidentiality, integrity and availability of the BGP data. Most of the techniques proposed until now are based on cryptographic extensions of the protocol.

Confidentiality in BGP sessions

Regarding confidentiality aspects, a possible method for mitigating attacks on BGP sessions is to protect the TCP connections. The TCP protection mechanisms include the generalized TTL security mechanism, which limits the effective radius of potential attacks on BGP sessions, together with host-level defences against TCP SYN attacks [81]. Another category of TCP protection mechanisms, namely IPsec at the IP level and the TCP MD5 signature option at the TCP session level [117], protects the BGP TCP session from external disruption by applying cryptographic protection to the underlying TCP connections. On this matter, the MD5 signature option is a frequently suggested method because it combines a reasonable level of protection with simplicity; however, it has some potential weaknesses when compared with IPsec [27].
IPsec (RFC 4301 [156]) has been suggested as a secure underlying message delivery protocol that aims at providing security over plain IP. BGP operates on top of IPsec by utilizing its authentication capability, in particular the Authentication Header (AH) option, which can be used at the IP layer to implement packet-level security with differing guarantees [153]. Additionally, using the Encapsulating Security Payload (ESP) option, BGP capitalizes on an added layer of protection to encrypt BGP update messages [154]. Nevertheless, despite the higher levels of assurance provided by IPsec and the dynamic approach of secret sharing, there are several disadvantages when employing IPsec for BGP communication. Strong encryption imposes a high packet processing workload on routers, which can lengthen packet queues and turn the routers into targets for DoS attacks [59]. Moreover, a mechanism for key coordination is necessary. An approach that exploits the Time-to-Live (TTL) has been devised by the IETF and is called the BGP TTL Security Hack (BTSH) [103], which is also known as the Generalized TTL-based Security Mechanism (GTSM) (RFC 5082) [104]. This TTL-based security protection leverages the TTL value of IP packets to ensure that the received BGP packets are from a directly connected peer. However, such an approach requires cooperation and mutual acceptance among BGP routers and therefore cannot be easily deployed. Each router receiving BGP packets has to check the TTL value, which must be greater than or equal to 255 minus the specified hop count; otherwise, the packet is considered invalid and should be discarded. The big advantage of this approach is its lightweight processing requirements compared to crypto-based approaches. However, it protects only against the intrusion of external packets into an existing session, assuming that spoofing the TTL field in an IP header is a challenging task for remote attackers.
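The TTL check described above can be written down in a few lines. The following Python sketch applies the rule exactly as stated in the text (TTL greater than or equal to 255 minus the specified hop count); the function name `gtsm_accept` and its parameters are hypothetical, and real router implementations may differ in how they count hops.

```python
MAX_TTL = 255  # value the sending peer is expected to set on its BGP packets


def gtsm_accept(received_ttl, hop_count=1):
    """Return True if a received TTL passes the TTL-based check.

    received_ttl: TTL of the incoming IP packet (0..255).
    hop_count:    configured maximum distance of the peer, in hops (assumed name).
    """
    if not 0 <= received_ttl <= MAX_TTL:
        raise ValueError("TTL must be in the range 0..255")
    if not 0 <= hop_count <= MAX_TTL:
        raise ValueError("hop count must be in the range 0..255")
    # Rule from the text: TTL >= 255 - hop count, otherwise discard.
    return received_ttl >= MAX_TTL - hop_count


# A packet from a nearby peer keeps a high TTL; a packet that crossed
# many routers has been decremented at each hop and is rejected.
print(gtsm_accept(254, hop_count=1))
print(gtsm_accept(243, hop_count=1))
```

The check is a single integer comparison per packet, which is why the processing cost is so low compared with cryptographic verification. It relies on the fact that every router on the way decrements the TTL, so a remote sender cannot make a packet arrive with a TTL close to 255.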
Integrity of BGP messages

A number of studies have focused on securing BGP messages themselves and validating the integrity of a message as it is accumulated along the traversed routers (e.g., the IP prefixes in the AS PATH messages) [132]. Two important candidate solutions in this area are S-BGP [155] and soBGP [308], which address both the integrity and authenticity of the BGP data. However, it should be noted that both soBGP and S-BGP, although developed between 2000 and 2003, have not been widely deployed yet [155], mainly because they require substantial modifications to the BGP protocol and its core operation. S-BGP [155] is a solid piece of work for securing the exchange of BGP messages. S-BGP enforces both integrity and authentication by employing digital signatures for both the addresses and the AS path information. For the validation of these signatures, S-BGP requires a Public Key Infrastructure (PKI). In addition, S-BGP proposes the use of IPsec to secure the inter-router communication paths. S-BGP defines the correct operation of a BGP speaker in terms of a set of constraints placed on individual protocol messages, guaranteeing that
• no protocol update messages have been modified between the BGP routers,
• the updates were sent by the indicated BGP node,
• the updates are destined to this particular BGP node,
• the BGP node is authorized to advertise routing information on behalf of the AS it represents.
Moreover, there are a number of conditions that should hold: every pair of originating AS and related prefix must be a valid pair, the originating AS must be authorized to advertise the particular prefix, and finally, every subsequent advertisement must be authorized by the AS holder of the prefix. The security features of S-BGP are based on digital signatures for verifying BGP peer identities, IP prefix owners and their administrators.
Hence, PKI is a key element in this process, where PKI-signed certificates are used to verify each address assignment and allocation. Nevertheless, the operation of S-BGP is significantly more costly than plain BGP in terms of processing workload, required memory and utilized bandwidth, mainly because of the attestations and the certificates needed for signature generation and validation [320]. Moreover, challenging issues are the increased load during session restarts, the completeness of route attestations, and the requirement that the BGP UPDATE message has to traverse the same AS sequence as that contained in the UPDATE message [158].

In order to deal with the challenges introduced by S-BGP, another approach has been proposed, called Secure Origin BGP (soBGP) [308]. It mainly aims to reduce the processing load of attestation validation as well as the signing overhead, mainly by using locally generated route attestations (RAs) [157]. The concept of EntityCert is introduced for binding an AS to a public key. Instead of capitalizing on hierarchical infrastructures such as PKI, soBGP involves a reputation mechanism (i.e., a web of trust) for certificate validation. A second concept, called AuthCert, is introduced for correlating address prefixes and originating ASes. AuthCerts are signed with a private key bound to an AS. A third concept introduced by soBGP is the ASPolicyCert, which includes a signed list of neighbor ASes; a link is valid only if the two neighbors appear mutually in each other's lists. Such an approach deliberately avoids a strong dependency on the ASes or on the address distribution mechanism. However, it leaves open the question of how to validate and extract trustworthy relationships between the introduced objects, which is a considerable shortcoming in the design of soBGP.
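The mutual-listing rule behind ASPolicyCert can be illustrated with a small sketch. It assumes that signature verification has already succeeded and that the neighbor lists are available as plain sets; the data layout and the function names are assumptions made for illustration and are not part of soBGP itself. The AS numbers are taken from the range reserved for documentation.

```python
from typing import Dict, List, Set

# Hypothetical in-memory view of already verified ASPolicyCerts:
# each AS maps to the set of neighbor ASes it declares.
PolicyTable = Dict[int, Set[int]]


def link_is_mutual(policy: PolicyTable, a: int, b: int) -> bool:
    """A link a-b is valid only if each AS lists the other as a neighbor."""
    return b in policy.get(a, set()) and a in policy.get(b, set())


def path_links_valid(policy: PolicyTable, as_path: List[int]) -> bool:
    """Check every pair of adjacent ASes in an AS path."""
    return all(link_is_mutual(policy, a, b)
               for a, b in zip(as_path, as_path[1:]))


if __name__ == "__main__":
    policy = {
        64500: {64501},
        64501: {64500, 64502},
        64502: set(),  # 64502 does not list 64501 back
    }
    print(path_links_valid(policy, [64500, 64501]))
    print(path_links_valid(policy, [64500, 64501, 64502]))
```

The sketch checks only the topology rule; it says nothing about how much trust to place in the certificates themselves, which is exactly the open issue noted above.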
Aiming to overcome the aforementioned issues of S-BGP and soBGP, Pretty Secure BGP (pS-BGP) [227] proposes a combination of a centralized trust model for AS number authentication and a decentralized one for IP prefix verification. In particular, ASes are equipped with a trusted certificate binding their number to a public key. Moreover, a lightweight rating mechanism is used for verifying the advertised prefixes and the relevant AS PATHs. The decentralized model is then used to verify the constructed AS prefix graph. Each AS is thus given a configurable solution: it can use the rating values to give weights to AS PATHs and take local decisions on whether to accept advertisements. Such a method is preferred over globally determined ones for countering the wide spread of security threats. However, the increased design complexity that comes with two trust models is a shortcoming of this approach, and it has not been widely accepted and deployed.

Interdomain Route Validation (IRV) [108] is a proposed query-response protocol operating in parallel with BGP. It allows BGP listeners to query the originating ASes about the validity and authenticity of received UPDATE messages and advertised prefixes. However, such an approach introduces new challenges, such as how to validate IRV messages, how to authenticate and correlate routers, and how to handle collaboration issues, while also introducing additional workload.

The performance advantage of symmetric over asymmetric cryptographic functions has triggered interest in deeper investigation [125]. In particular, a tree-based hash function for authentication has been used to encode sequences of data, thus fulfilling the requirement for an ordered relationship among the data that is mandatory for the application of symmetric functions.
Such an approach is particularly useful for preventing malicious manipulation of the ASes, as they are members of a list included in BGP route UPDATE messages. Another application of symmetric cryptographic functions is origin authentication [16]. That investigation exploits properties such as the density and the static nature of the address delegation structure and analyzes their semantics. It observes that delegations are very stable over time and that ownership validation can therefore be implemented effectively with mechanisms such as Merkle hash trees [209]. Secure Path Vector (SPV) [126] is another BGP security proposal that capitalizes on symmetric hash functions. Although it achieves improved performance in terms of processing workload, it requires more storage and longer synchronization and information update times. Moreover, it is based on a complex key distribution mechanism.

Finally, even though ISPs are aware of the weaknesses of BGP, and despite all the protection mechanisms that have been proposed, there have been no important changes so far. A common practice today is to rely on ingress-filtering techniques at the AS level, manually implemented in an ad-hoc way, along with some simple transport-level techniques to ensure that BGP speakers talk only to their direct neighbours (e.g., with TTL-based protection techniques, as explained above).

2.3.4 BGP monitoring

Over the years, a variety of tools allowing the observation of BGP routing tables have been developed. This section briefly covers the most popular ones. Note, however, that a more comprehensive survey of available information sources (both BGP and attack-related) will be provided in the VIS-SENSE deliverable D2.1.

Looking glasses

A looking glass is a network, somewhere on the Internet, that is “kind enough to show you their BGP routing table” [297].
For example, Packet Clearing House (PCH) offers a looking glass web application at [228]. PCH offers archived BGP update messages from over 30 routers around the world. An archive contains 5 minutes of exchanged update messages for a single router.

RouteViews

The RouteViews project [293], run by the University of Oregon, is a network of routers of AS6447 placed at several locations and peering with different backbones. The idea is to obtain near real-time information about BGP routing in order to better understand the relationship between an AS and the rest of the network. Most of the routers are available directly via Telnet, so that information can be viewed in real time. Moreover, every two hours, the content of the BGP table is dumped into a file and made available from the RouteViews website. Also, every 15 minutes, the BGP update messages received are saved in an archive file that is also made available at [293]. Many tools have been developed based on RouteViews data, most notably Cyclops (see below). Many analytical works, such as studies of the global dynamics of BGP routing tables [132, 204], have been based on the very same data.

BGPlay

BGPlay [63] is an application that graphically displays AS relationships based on RouteViews data. It was developed by the Computer Networks Research Group of Roma Tre University.

RIPE RIS

RIPE NCC (Réseaux IP Européens - Network Coordination Centre) offers several tools as part of its Routing Information Service (RIS) project [216].

Visualize
Visualize is a Flash application that graphically displays topology changes, based on updates and withdrawals seen by RIS, towards a given prefix in a given time frame.

Search RIS
The Search RIS module enables a search, over the last three months, of announcements and withdrawals for a given prefix (with the option to search for less and/or more specific prefixes) in a given location in a given time frame.
ASInUse
ASInUse determines the last time an AS appeared in a routing table (in the last three months), and displays its known peering ASes. Filtering on a specific location is also possible.

PrefixInUse
Similar to ASInUse, PrefixInUse determines the last time a given prefix appeared in a routing table (in the last three months). The result also displays the originating AS(es) for the prefix, or related ones.

Looking glass
RIPE provides a webpage that enables the execution of a command on one of their routers. The available commands permit querying the BGP and routing tables of the router, executing a traceroute, sending a PING message, etc.

RISwhois
The RISwhois tool returns the matching prefix/origin AS pair for a given host address.

Raw Data
RIPE dumps and archives the data of their collector routers and makes it publicly available as raw data. The entire BGP routing tables are dumped every 8 hours, while the update messages are saved every 5 minutes. For example, data collected by the router located in Amsterdam dates back as far as September 1999.

BGPmon: BGP Monitoring System

BGPmon [62] aims at giving real-time access to BGP data, avoiding the update lags inherent to collector-based systems (e.g., RouteViews, RIPE RIS). Unlike collectors, BGPmon does not implement a full-blown BGP client, but only the functions it needs: receiving and logging routes. As a result, BGPmon is lightweight enough to peer with more neighbours [318]. BGPmon uses a publish/subscribe architecture. A set of brokers forms an overlay network; the brokers peer with neighbouring BGP routers and exchange information among themselves. They manage the final stream and compute the best route from the publisher to the subscriber.
Subscribers (applications) can personalise the information they want to receive in their stream (including open, close, update and notification BGP messages, state changes in BGP, break up and tear down of peering connections, etc.). Currently, the BGPmon application incorporates the three facets of the system: broker, publish, and subscribe. It is divided into three stages. The first one peers with a BGP-enabled device and places BGP messages in a queue, creating a stream of events. The second one labels the events from that queue to identify announcements, withdrawals, updates, duplicates, etc. As this second stage can be quite costly in terms of memory, it can be disabled. Disabling it, however, results in losing the ability to simulate a route refresh without the help of remote sources. The final stage adds status information and injects route table snapshots into the stream.

Cyclops

Cyclops, the AS-level connectivity observatory, is a monitoring tool developed by the University of California, Los Angeles. In a nutshell, the goals of this project are i) to detect anomalies in BGP data (such as misconfigurations and route leakages), ii) to provide a connectivity map of interconnected networks in order to detect suspicious peerings, and iii) to correlate these events. As of now, Cyclops fetches data from RouteViews devices, RIPE RIS, Abilene, Packet Clearing House and BGPmon. The data is then preprocessed by extracting AS links from the AS paths and adding timestamps. Also, a weight is associated with each link, representing the number of routes that make use of it. Finally, a relationship inference is performed and the ASes are classified (i.e., stub AS, transit AS, tier-1, etc.). After preprocessing, the data is entered in the Cyclops database [225]. The Cyclops database can then be browsed through the Cyclops website [291].
The web interface can display AS connectivity (which ASes are peering with a given AS), prefix origins (which AS announces a given prefix), transient prefix origins (which AS has announced a given prefix for less than 5 days), anomalous peerings (when an AS link disappears for more than 24 hours), and more. Data can be filtered by date, by activity, etc. The raw data of the database is also available at [195] for people who want to build their own tools based on it. Finally, Cyclops offers the possibility to register and allows users to define a set of ASes they are interested in. Cyclops will then show by default information regarding these networks (neighbours, alerts, etc.).

NetViews

NetViews [292] is a joint effort between the University of Oregon, Colorado State University, the University of Memphis, and the University of California, Los Angeles, aiming at building the next-generation RouteViews. The system relies mostly on BGPmon and therefore provides real-time information. A central server, called the data broker, is connected to BGPmon and forwards the data to its clients. The NetViews client is a Java application that displays BGP data in real time in multiple forms. The default view is quite standard: it shows plots of BGP activity, including the state of the routing table, incoming announcements and withdrawals. Another view is the visualizer, which displays geographically positioned ASes. ASes and links are drawn differently depending on selectable factors such as the number of originating ASes, the number of peers, the link degree, etc. The map is interactive and dynamic: it updates in real time. The live mode can be stopped to observe an event in more detail. In this case, update messages are queued for further processing. It is therefore always possible to go back and forth in time to view messages.
A complete BGP table can be downloaded from a source as a base so that the map is populated with correct entries at startup. Information about ASes (such as WHOIS) is also integrated. Finally, the user can filter on a given prefix (or AS), or display the routes towards a given location. The NetViews client is still in beta development and is not currently available to the public [292].

Robtex and BGP Toolkit

Because information about ASes is spread across different databases, some projects have focused on gathering these data in a centralized view. Robtex AS Analysis [252] gathers data from WHOIS and routing registries, and infers peering relations from BGP tables based on data from RouteViews and RIPE RIS. Similarly, Hurricane Electric's BGP Toolkit [131] gathers the same kind of data and performs some statistical analysis on it.

2.3.5 Methods for detecting prefix hijacking

This section focuses on existing methods and algorithms used to detect prefix hijacking attacks and briefly describes some tools, services or implementations for detecting prefix hijacking.

The Next-Hop anomaly

This method, presented in [21], was designed under the assumption of an ownership attack or an intermediate hop attack in which the hijacker is the first hop after the legitimate AS (see footnote 3). It uses information from both the control and the data plane.

Detection method

Let $p$ be a prefix originated by AS $O$. A router belonging to AS $S$ receives in its update an AS path field containing $N_1, \ldots, N_j, O$. Based on this AS path, any packet to $p$ should be directly forwarded to $O$ once it reaches $N_j$. The authors define a next-hop anomaly as a data-plane trace where AS $N_j$ forwards packets for $p$ to some AS $I$ ($I \neq N_j$). It suggests that $N_j$ and $O$ are not interconnected. The next-hop anomaly is used as the signature for detection.
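The signature can be illustrated with a short sketch that compares a control-plane AS path with an AS-level data-plane trace. It assumes that the data-plane hops have already been mapped to AS numbers and deduplicated, and it interprets the anomaly as "the AS that follows $N_j$ in the trace is not $O$", since the AS path says that $N_j$ should hand packets for $p$ directly to $O$. The function name and this interpretation are assumptions of the sketch, not the implementation of [21].

```python
from typing import List, Optional


def next_hop_anomaly(control_path: List[int],
                     data_path: List[int]) -> Optional[int]:
    """Return the unexpected AS I if N_j forwards to an AS other than O.

    control_path: AS path from the BGP update, [N_1, ..., N_j, O].
    data_path:    AS-level hops seen on the data plane (assumed to be
                  already mapped from IP addresses and deduplicated).
    Returns None when no anomaly can be established.
    """
    if len(control_path) < 2:
        return None  # no N_j to inspect
    n_j, origin = control_path[-2], control_path[-1]
    if n_j not in data_path:
        return None  # the trace never reached N_j
    idx = data_path.index(n_j)
    if idx + 1 >= len(data_path):
        return None  # the trace stopped at N_j
    following = data_path[idx + 1]
    return following if following != origin else None


if __name__ == "__main__":
    control = [64500, 64501, 64502]           # N_1, N_j, O
    print(next_hop_anomaly(control, [64500, 64501, 64502]))
    print(next_hop_anomaly(control, [64500, 64501, 64510, 64502]))
```

The sketch depends entirely on the quality of the IP-to-AS mapping that produces `data_path`; a wrong mapping immediately turns into a spurious anomaly.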
Limitations

As such, the signature generates a lot of false positives, which the authors attribute to errors in IP-to-AS mappings, including IXP routers not included in the AS path, sibling ASes that share address space and have routing agreements, and provider address spaces in which customers use a small part of the ISP's space as their own. Having removed the events attributed to the causes above, the authors are still unable to decide whether the remaining cases are the result of prefix hijacking or of traffic engineering agreements. Basically, “there is no way to verify the data-plane adjacency of two ASes as claimed by the corresponding control-plane advertisements”.

Footnote 3: The method would work for any intermediate path level attack, but was limited to this case to reduce the problem to a manageable size.

PHAS: a Prefix Hijack Alert System

The idea behind PHAS [176] is to provide a prefix hijack alert service. Based on the premise that the prefix owner is the only one who can unambiguously distinguish a legitimate route change from a hijacking attack, the authors offer network administrators the possibility to subscribe to monitoring services for a given prefix $p$, and to be notified of an origin AS change somewhere on the Internet, in near real-time.

Detection method

The system builds, over time, a set $O_p(t)$ containing the different origin ASes for prefix $p$ seen at time $t$ on every router where PHAS is deployed (see footnote 4). PHAS alerts the users whenever $O_p(t) \neq O_p(t-1)$. Obviously, by simply doing this, the system will notify a user each time there is a change in the set. To avoid notifying users of repeated origin changes, the authors introduce a time window. The origin set is extended to $O_p(t-k, t)$, which contains every origin AS seen for prefix $p$ during the interval $[t-k, t]$ on every PHAS-enabled router. The system then generates an alert when $O_p(t-k-1, t-1) \neq O_p(t-k, t)$.
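The windowed comparison can be sketched as follows. The sketch assumes discrete time steps and a flat list of (time, origin AS) observations for a single prefix; the function names and the data layout are illustrative only and do not come from the PHAS implementation.

```python
from typing import Iterable, Set, Tuple

Observation = Tuple[int, int]  # (time step, origin AS seen for prefix p)


def origin_set(observations: Iterable[Observation],
               start: int, end: int) -> Set[int]:
    """O_p(start, end): every origin AS seen for p during [start, end]."""
    return {origin for (ts, origin) in observations if start <= ts <= end}


def phas_alert(observations: Iterable[Observation], t: int, k: int) -> bool:
    """Alert when O_p(t-k-1, t-1) differs from O_p(t-k, t)."""
    observations = list(observations)
    previous = origin_set(observations, t - k - 1, t - 1)
    current = origin_set(observations, t - k, t)
    return previous != current


if __name__ == "__main__":
    # AS 64500 originates p at every step; AS 64511 shows up at t = 3.
    seen = [(0, 64500), (1, 64500), (2, 64500),
            (3, 64500), (3, 64511), (4, 64500), (4, 64511)]
    for t in (2, 3, 4):
        print(t, phas_alert(seen, t, k=2))
```

In this example the new origin raises an alert once, at the step where it first enters the window, and stays silent afterwards as long as it remains inside the window.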
The time window avoids repeated origin events, but will still generate an alert whenever a new origin AS appears or whenever a known origin AS disappears, notifying users only of potentially wrong origin ASes. Such a detection scheme works relatively well, but users should not have their notifications delayed when $O_p$ changes if their network behaves well. To avoid this, the authors introduce an adaptive window size. On top of the windowed origin set, each prefix is assigned a penalty $S_p$. When an update message is received for prefix $p$, $S_p$ is increased by $1/2$. The size of the window for $p$ is then $2^{\lfloor S_p \rfloor}$. $S_p$ decays exponentially, at a rate determined by a time value. Finally, users also have the possibility to add filters that are applied before alerts are sent to them.

Extensions

The authors also provide possible extensions to PHAS to deal with types of attacks other than the origin attack. For subnet attacks, they propose a mechanism based on watching modifications made to the set $SP_p$ that contains the advertised subprefixes of $p$. If no subprefixes are advertised, $SP_p = \emptyset$. For last hop attacks, the suggested method is to watch the set $L_A$ containing the last hops witnessed for prefixes with $A$ as the origin AS. Using these two additional sets in PHAS helps to further identify hijacking attempts. However, the subprefix set (resp. the last hops set) is potentially huge for a network such as 12.0.0.0/8 (resp. for a tier-1 ISP).

Accuracy

PHAS has successfully detected every known ownership attack. It cannot, however, detect a stealthy IP hijack, like the one presented in [203]. As a result, PHAS is unlikely to detect a man-in-the-middle attack such as the DEFCON one [238].

Footnote 4: The authors decided to use data from RouteViews routers [293].

Directed AS topology

The idea behind this method is to build a directed graph of the network topology, and to use it to verify the AS paths in update messages.
It is presented in [247] and is heavily dependent on previous work by the same authors [100].

Detection method

The authors first observe that the majority of BGP routes are stable and legitimate. Thus, these routes can be learnt over time. Consider a prefix $p$. An observer receives a legitimate update message for $p$ containing the AS path $a_k, \ldots, a_0$. In other words, ASes $a_i$ and $a_{i-1}$ are neighbours. A directed AS link is a link $a_i \to a_{i-1}$ ($i = 1, \ldots, k$). Moreover, $a_i$ (resp. $a_{i-1}$) is upstream (resp. downstream) of $a_{i-1}$ (resp. $a_i$). The directed links also indicate the import/export policies of the involved ASes. A downstream (resp. upstream) AS allows routes to be exported (resp. imported) to an upstream (resp. downstream) AS.

Consider, at time $t$, the sets $A(t-k, t)$ and $L(t-k, t)$ containing, respectively, the associations between a prefix and an origin AS number and the directed AS links seen between time $t-k$ and $t$ (i.e., in a time window of size $k$). Whenever an update message reaches the observer, the system verifies that the AS links given in the AS path of the message are valid (i.e., are part of set $L$). If the links are valid, the system verifies the association between the prefix and the origin AS (i.e., that it is part of set $A$). If an extracted $a_i \to a_j$ association from the AS path does not belong to $L$ but $a_j \to a_i$ does, there is a policy violation, and the link is a redistribution link. An example of such behaviour is a customer with two different ISPs that forwards traffic between the two providers. If $a_j \to a_i \notin L$ either, the link is a fake link: the announced neighbouring ASes are not really neighbours. It is highly likely that someone tampered with the AS path. Also, if $(p, a_0) \notin A$, there is prefix hijacking. Furthermore, if $(p, x) \in A$ for $x \neq a_0$, it is an ownership attack. If $(q, x) \in A$ with $q \subset p$ (i.e., $q$ is more specific than $p$), it is a prefix attack. Finally, if $(q, x) \in A$ with $q \supset p$ (i.e.
$q$ is less specific than $p$), it is a supernet attack. Of course, the scheme will only work if the model of the network (i.e., the sets $A$ and $L$) is close enough to reality. Therefore, the initialisation phase is very important. The authors propose heuristics to remove alerts generated by transient routes, path extensions (which are the result of address suballocation), usual BGP misconfigurations, (de)aggregations, sibling AS links, address-sharing peers, and backbone links.

Accuracy

The authors report a false positive rate as low as 0.02% and an average of 20 alerts raised per day. They achieve nearly 100% accurate detection on documented public incidents. However, the required quality of the calibration data can be a strong limitation. Moreover, the AS relationships on which this method is based are a model of a perfect Internet, and thus not entirely accurate. Finally, the authors do not provide any detail on how to set the threshold values used for the different heuristics.

Hop count to a reference point

The technique presented in [331] relies only on the data plane to detect possible hijacking events. Namely, it uses the distance (expressed in hop count) between a set of $N$ well-placed monitors $M$ and the watched network, based on the assumption that distance measurements to a destination network are relatively stable over time (which seems to be confirmed by [330]). In addition to the $N$ monitors, one (or more) reference points per monitor are needed. A reference point is a router topologically close to the network under surveillance, but outside of it.

Detection method

First, a monitor periodically measures its distance $d_t$ from the network (at time $t$). It keeps in memory a moving average window of size $k$ that contains, at time $t$, the average value of the distances between $t-k$ and $t$, called $A_t$.
Because a prefix hijack is likely to have serious consequences on the topological location of the victim network, whenever an attack occurs, $d_t$ will significantly differ from $A_t$, thus raising a red flag (see footnote 5). This step is known as the network location monitoring.

Secondly, when a red flag is raised, the path disagreement detection is called. Its goal is to compute the path similarity between the (supposedly affected) AS path to the network and the (normally unaffected) AS path to a reference point. Because the authors rely only on the data plane, they chose to use iPlane [294] to map the hop IP addresses to their (supposedly correct) AS numbers. Once the similarity $s_t$ between these paths has been computed, its value is compared with $s_h$, the path similarity value that had been computed prior to the hijacking alert. If $s_t / s_h > T$ for a threshold $T$ (i.e., the similarity has decreased dangerously), an alert is raised by the monitor. Obviously, if multiple monitors raise an alert, the probability of being under attack increases.

Footnote 5: To be complete, the authors use another window to smooth the instantaneous measurement $d_t$, as transient problems lead to noise.

Limitations

The detection accuracy highly depends on the choice of the monitors. To be effective, monitors have to be widely distributed and use different routes to the network. It may not be easy to find such positions. Also, the method relies solely on the data plane. An attacker using a tool like Fakeroute [203] will make the detection system blind. Moreover, a MITM attack such as [238] also makes the scheme useless. The path disagreement detection might not be accurate because of the policy of one AS along the way between the monitor and the network or reference point. An AS radically changing its policy could even trigger an alarm.
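The network location monitoring step of this technique can be illustrated with a small sketch. It keeps the last $k$ hop-count measurements and raises a red flag when a new measurement $d_t$ deviates from the moving average $A_t$ by more than a tolerance. The tolerance parameter, the choice to ignore flagged measurements when updating the window, and the class name are assumptions made for this sketch; the path disagreement step is not covered.

```python
from collections import deque


class LocationMonitor:
    """Moving-average check on the hop count to the watched network."""

    def __init__(self, window: int, tolerance: float) -> None:
        self.history = deque(maxlen=window)  # last k accepted distances
        self.tolerance = tolerance           # hypothetical threshold, in hops

    def observe(self, hops: int) -> bool:
        """Record d_t and return True if it raises a red flag."""
        flagged = False
        if len(self.history) == self.history.maxlen:
            average = sum(self.history) / len(self.history)  # A_t
            flagged = abs(hops - average) > self.tolerance
        if not flagged:
            # Assumption: suspicious values are kept out of the average
            # so that an ongoing attack does not become the new baseline.
            self.history.append(hops)
        return flagged


if __name__ == "__main__":
    monitor = LocationMonitor(window=3, tolerance=2)
    for distance in (10, 10, 11, 10, 16, 10):
        print(distance, monitor.observe(distance))
```

Only the jump to 16 hops is flagged in this run; the first three measurements merely fill the window.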
Fingerprinting the network

The fingerprinting technique [124] is based on the hypothesis that the hijacking network is different from the legitimate one. Consequently, it is possible to compare the fingerprint properties of the hosts on these networks to infer whether they are identical or not. Multiple fingerprinting techniques are used, both network-based and end-host-based. Network-based fingerprints include firewall policies, bandwidth information, characteristics of routers, etc. End-host fingerprints include OS, IP identifier probing, TCP/ICMP timestamp probing, uptime, etc. It is essential to select multiple discriminative properties to ensure that the hijacker cannot fake the responses.

Detection method

To detect ownership attacks, the system looks for MOAS (Multiple Origin AS) conflicts. For each prefix in a MOAS conflict, the method then builds an AS path tree rooted at the prefix. Then, it tries to find a live host to use as a probing target. Multiple probing locations are selected such that packets traverse every possible AS to the destination, and fingerprints are then acquired. Finally, the results are analysed and compared.

To detect intermediate hop attacks, the authors use an AS-level traceroute to detect fake edges in the path. They limit the number of false positives with a couple of heuristics: popularity constraints (i.e., if an edge of the network is only used by a few prefixes, it is more suspicious than a route used by a lot of prefixes), geographic constraints (i.e., a network edge corresponding to two geographically distant points is suspicious), and relationship constraints (partially based on AS relationships [100]).

When a potential subnet attack is detected, the method first removes all networks with a provider-customer relationship. This is based on the assumption that a provider has no reason to hijack the traffic of one of its customers, and that a customer cannot steal the traffic of its provider.
For the remaining routes, a reflect-scan is used for fingerprinting. The reflect-scan is similar to the TCP idle-scan technique. An additional step for the reflect-scan is to find a live host that is not inside the attacked subnet to perform the test.

Limitations

The results of fingerprinting are highly dependent on the OS installed on the machine. Also, it is not always possible to find a live host to perform those tests (or even two hosts in the case of reflect-scans). Moreover, devices on the path (e.g., firewalls) can hinder the search for probe-able hosts.

Using idle scan

Detecting BGP hijacking attacks through idle scan is presented in [121]. However, this technique relies on a single vantage point to probe networks, which makes it even more complex to detect an attack.

Detection method

The system watches BGP update messages and, whenever it detects a MOAS conflict, it starts an idle scan to find out whether the MOAS is legitimate or the result of an attack. The probing technique is similar to the reflect-scan used for fingerprinting; however, instead of using a machine outside of the hijacked subnet (but still inside the original network) to perform the test, it makes use of a host that is part of the legitimate last-hop network.

Limitations

The limitations are the same as the ones presented before for the fingerprinting technique.

Using PING tests

This method [268] focuses on the sole use of PING tests to differentiate a legitimate MOAS from an ownership attack.

Detection method

When a monitor receives a suspicious update containing a prefix to be observed, that monitor executes ping tests for every host address of that prefix. At the same time, it notifies another monitor that has not yet received the update to perform the same test on the original route. The two ping results are compared. If the results are “similar enough”, the system concludes that there is no hijacking.
Limitations

A preliminary experiment showed good results, but a large-scale test remains to be done. However, pinging a whole network range may result in substantial network load, although the authors suggest that for larger networks, only a set of distributed subnets needs to be checked.

iSPY

In [328], the authors present a method for detecting prefix hijacking without relying on external infrastructure (vantage points, monitors, etc.). The method relies on the ability of the victim AS to find its vPath. The vPath is the set of AS-level forward paths from the network to the other ASes on the Internet (see footnote 6). It can easily be obtained from tools such as traceroute.

Footnote 6: Actually, only to the transit ASes of the Internet (i.e., without stub ASes).

Detection method

Consider two forward paths $P$ and $P'$ to destination $d$, obtained at times $t$ and $t'$ ($t < t'$). If $P = P'$, then everything is fine. If $P' \neq P$ but $P'$ is complete (i.e., traceroute receives every response up to the destination), the route change was legitimate. If $P'$ is incomplete and $P' \subset P$ (i.e., every AS number in $P'$ is in $P$, up until $P'$ receives no more data), then a cut exists between the last router of $P'$ and the next one in $P$. Finally, if $P'$ is incomplete and $P' \not\subset P$, there is a cut between the last hop in $P'$ and the (unknown) next one. Defining $\Omega$ as the set of all existing cuts, the cardinality $|\Omega|$ of $\Omega$ is the detection signature: if it is bigger than a threshold value, there is hijacking.

Limitation

First and most importantly, the detection scheme only works if the hijacker blackholes the traffic. Also, iSPY is likely to mistake a stealthy attack for a legitimate cut link, and is blinded by a tool such as Fakeroute [203].

PGBGP: Pretty Good BGP

PGBGP’s goal is not only to detect hijacking events, but to improve overall routing quality and reliability.
The core idea behind PGBGP, presented in [148], is that “unfamiliar routes should be treated cautiously when forwarding data traffic”. PGBGP defines a set of normal data containing the prefix, its origin AS, and a timestamp of the last received update. The normal data set and the router’s Routing Information Base (RIB) are used to create a history for known prefixes and origins. Obviously, at startup, there is no known history, and all routing updates are accepted for $h$ days. Afterwards, incoming routes that would alter the state of the normal behaviour are quarantined for $s$ days. The quarantined routes are considered suspicious. After that time, they are accepted if they are still in the routing table. This quarantine mechanism prevents short-term erroneous announcements from disrupting routing. Finally, PGBGP removes data from the history if it has not been announced for $h$ days.

As any incoming route is tested against the history, hijacking attempts, which arrive with a new origin AS, do not match the known history for that prefix and are therefore quarantined. While the new route is suspicious, the old, trusted route is used for packet forwarding. To avoid subnet attacks, PGBGP checks whether the new incoming prefix is a subnet of a known one. If it is, and the route to the subnet does not traverse the AS of the larger prefix, it is suspicious. However, forwarding packets along the trusted route may be useless if routers along that route have been compromised. Therefore, PGBGP tries to avoid forwarding packets to neighbour routers that have announced the suspicious route. Super-prefixes of known prefixes are always accepted by PGBGP, as the authors believe that they are the result of a new network destination, not of a hijack, because traffic destined for the original network will use the legitimate, more specific route.
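The history and quarantine logic described above can be sketched as follows. This is a simplified illustration under several assumptions: time is counted in whole days, a quarantined route is released when it is announced again after $s$ days (rather than by checking the routing table), and the subnet and super-prefix rules are left out. The class and method names are hypothetical and are not taken from the PGBGP implementation.

```python
from typing import Dict, Tuple

Route = Tuple[str, int]  # (prefix, origin AS)


class PgbgpHistory:
    """Simplified history of (prefix, origin) pairs with a quarantine."""

    def __init__(self, h_days: int, s_days: int, start_day: int = 0) -> None:
        self.h = h_days            # learning period and history lifetime
        self.s = s_days            # quarantine duration
        self.start = start_day
        self.known: Dict[Route, int] = {}       # route -> day last announced
        self.quarantine: Dict[Route, int] = {}  # route -> day first seen

    def on_update(self, day: int, prefix: str, origin: int) -> str:
        """Return 'accept' or 'quarantine' for an incoming announcement."""
        # Forget routes that have not been announced for h days.
        self.known = {r: d for r, d in self.known.items() if day - d < self.h}
        route = (prefix, origin)
        if day - self.start < self.h or route in self.known:
            self.known[route] = day
            return "accept"
        first_seen = self.quarantine.setdefault(route, day)
        if day - first_seen >= self.s:
            # The route survived the quarantine: it becomes part of history.
            del self.quarantine[route]
            self.known[route] = day
            return "accept"
        return "quarantine"


if __name__ == "__main__":
    history = PgbgpHistory(h_days=10, s_days=1)
    print(history.on_update(0, "192.0.2.0/24", 64500))   # learning phase
    print(history.on_update(11, "192.0.2.0/24", 64511))  # unfamiliar origin
```

While a route is in quarantine, a router following this scheme would keep forwarding along the previously trusted route; the sketch only returns the decision and leaves forwarding aside.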
2.4 Analysis of Spam Campaigns

2.4.1 Introduction

In [248], the authors show that cybercriminals are able to misuse the BGP routing protocol to hijack blocks of IP addresses for limited periods of time, during which they can launch spam campaigns from apparently legitimate blocks of IP addresses. To the best of our knowledge, nobody else has demonstrated so far to what extent this claim holds. However, if it is true, such techniques would clearly defeat the spam blacklists that anti-spam tools use as a first layer of defence against spammers. One of the main objectives of VIS-SENSE is to provide sound scientific rationales in favour of, or against, the idea that the core infrastructure can be misused by cybercriminals to carry out malicious activities, such as launching spam campaigns. To do this, we first need a clear view of the inner workings of such spam campaigns, and of how we can observe them. As suggested in [333], there are two main approaches that can be used to get insights into spammers’ activities:
• passive observation, which consists in observing the visible effects of spammers’ activities, e.g., analyzing the content of spam messages received for a given domain (to create spam filters), or looking at the IP addresses of the machines used for sending spam messages (IP reputation analysis).
• active observation: as spammers are moving to more sophisticated techniques, and because they increasingly rely on botnets to send spam campaigns, it is sometimes necessary to infiltrate the infrastructure of spam gangs to better understand their modus operandi. Examples of active observation techniques include the execution of a malware sample in a sandbox to observe its behaviour, the manipulation of C&C servers to uncover botnet communication protocols, etc.
On the other hand, spam detection and mitigation techniques can also be categorized according to where they are applied along the path between the spam source and the destination. We can usually distinguish between:
• pre-acceptance detection techniques, by which spam emails are detected before they actually reach the destination mail server. These techniques take advantage of low-level network features to detect and block spam traffic as soon as possible, so as to reduce the load on SMTP servers. Examples include fingerprinting spam bots at the SMTP layer, IP reputation filtering, etc.
• post-acceptance detection techniques, by which spam emails are identified after they have reached the destination mail server. These techniques take advantage of features extracted from the whole spam message, including the analysis of the content of the email message, which obviously involves heavier processing.

In Sections 2.4.2, 2.4.3 and 2.4.4, we first detail a few classes of techniques that are commonly used to detect, block or analyze spam messages. Then, in Sections 2.4.5 and 2.4.6, we describe some previous works that have focused on studying the higher-level behaviour of spammers, as well as the scam infrastructure they are using.

2.4.2 IP reputation analysis

The idea behind this technique is simply to build a database of IP addresses associated with spamming activities. Upon reception of an email, this knowledge base can be queried to help determine whether it is spam or not. Basically, three types of list can be built:
- blacklist: contains IP addresses of hosts from which all emails should be blocked;
- greylist: contains IP addresses of hosts from which all emails should be first rejected and then accepted. This technique works because spamming hosts usually don’t resend emails;
- whitelist: contains IP addresses of hosts from which all emails should be accepted.
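As a rough illustration of how the three list types can be combined at connection time, the sketch below takes a decision from the source IP address alone. The class name `ReputationFilter`, the `retry_delay` parameter and the decision labels are assumptions made for this example, as is the rule that whitelist entries take precedence over blacklist entries. Keying the greylist on the IP address alone follows the description above; deployed greylisting systems commonly track more than the address.

```python
class ReputationFilter:
    """Pre-acceptance decision based only on the source IP address."""

    def __init__(self, blacklist, whitelist, retry_delay=300):
        self.blacklist = set(blacklist)
        self.whitelist = set(whitelist)
        # Minimum time (seconds) before a retry is accepted; the value is an assumption.
        self.retry_delay = retry_delay
        # Greylist state: IP address -> time of the first rejected attempt.
        self.first_seen = {}

    def decide(self, ip, now):
        if ip in self.whitelist:      # assumed to take precedence over the blacklist
            return "accept"
        if ip in self.blacklist:
            return "reject"
        first = self.first_seen.get(ip)
        if first is None:
            # Unknown source: reject temporarily and remember the attempt.
            self.first_seen[ip] = now
            return "tempfail"
        if now - first >= self.retry_delay:
            # The host resent the message after the delay, as a regular mail server would.
            return "accept"
        return "tempfail"
```

A spam bot that never retries stays in the "tempfail" state forever, while a legitimate server only incurs the retry delay on first contact.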
Relying on an IP reputation database is relatively effective since one just needs to accept, delay or reject emails from already known hosts. However, it can sometimes be hard to identify the correct source IP address, depending on the data collection infrastructure [109]. Whitelists are often implemented in mail servers to automatically accept emails from specific IP addresses regardless of the presence of that address in any list of bad hosts. Greylists take advantage of the often poorly implemented software of spamming hosts, which does not resend an email that is not accepted by the server. For every email coming from an unknown source IP address, the message is first rejected and then accepted when it is resent. This makes it possible to record potential spam bots while adding only a small delay when receiving legitimate emails from an unknown source for the first time. Blacklists contain IP addresses of hosts that have participated in spam sending operations. They are widely used by the research community [39] and implemented in many commercial anti-spam systems ([267, 1, 13]) to help classify input messages as legitimate or spam according to the source IP of the sending host. The different blacklists independently maintain records of IP addresses that have been involved in spamming activities (e.g., spam sending, open-relay, member of a botnet, etc). Examples of popular blacklists include Spamhaus (PBL, SBL, XBL) [12], SORBS [10], Spamcop [11], DSBL [3], NJABL [6], and Composite Blocking List [2], among others. However, IP blacklists are often said to be inefficient because spammers’ techniques have evolved. According to [39, 167, 147, 230], blacklists have actually forced spammers to build networks of compromised hosts (a.k.a. botnets) because many zombie machines use dynamic IP addresses, which makes blacklisting less effective. This finding is reinforced by the fact that most of today’s spam comes from botnets [265, 266, 248, 230].
Moreover, [17] claims that blacklists are often updated too slowly to allow the real-time detection of spamming hosts. It is also claimed in [230] that bots tend to send low volumes of spam in order to avoid being blacklisted. However, [19] concludes that blacklists are still quite effective in identifying spam sending hosts. MessageLabs [267] makes use of IP reputation analysis in their Traffic Management Layer. They analyze the source IP address and look at its past activities to decide whether or not they should proceed with email processing. IP reputation is also used in the Skeptic(TM) Anti-Spam Layer by means of DNSBL lookups. In [39], the authors develop a distributed system called Trinity, based on IP reputation, to detect spam sources. The system is based on the assumption that bots send a lot of spam in short amounts of time, so the reputation of these sources must be shared as soon as possible. In [264], the authors develop a system called FIRE (FInding Rogue nEtworks) to identify and expose organizations and ISPs that demonstrate persistent, malicious behavior. The goal is to isolate the networks that are consistently implicated in malicious activity from those that are victims of compromise. To this end, FIRE actively monitors botnet communication channels, spam traps, drive-by-download servers, and phishing web sites. A malscore is then computed for each AS, based on the observed activity of all IPs belonging to that AS, which also reflects to some extent the reputation of each network. IP reputation analysis is a pre-acceptance technique which can be used in both active and passive observation of spamming activities. IP testing can be performed before messages are actually received by the SMTP server.
Although blacklists and whitelists can be queried by low-level hosts, greylists require recording each new connection from unknown sources, and thus they should be deployed on powerful hosts (but before the SMTP server).

2.4.3 Message content analysis

The analysis of the email content is widely used by the research community [313] and in many commercial spam filters [267, 13]. This approach basically consists in extracting every piece of information from the content of email messages to determine whether a message is spam or not. By analyzing the message content, we can extract patterns that allow us to detect future instances of similar spam messages. This technique is very popular because it only requires access to the email messages. However, spammers now rely on more sophisticated message generation techniques that take advantage of message polymorphism [168, 147, 248, 230]. Another very popular technique is Bayesian spam filtering, like the system used in [13], which is based on a two-step process including a learning phase and a filtering phase. During learning, the filter analyses message words and computes the probability that a message is spam based on word occurrences. In the filtering phase, the Bayesian formula is used to compute the probability that the message is spam or ham based on the individual probabilities computed in the first place. The advantage of this technique is that it can adapt itself to users’ email content. However, spammers have learned in the meantime how to fool Bayesian filters by, for instance, including legitimate words/URLs in spam messages [262]. MessageLabs applies content filtering in the Skeptic(TM) Anti-Spam layer by using BrightMail in conjunction with different heuristics applied to message content and headers. New heuristics are developed and existing ones are constantly improved to reflect newly uncovered spam patterns.
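The two-phase Bayesian filtering process described above can be sketched as follows. This is a generic naive Bayes classifier with add-one smoothing, not the filter used in [13]; the class and method names are assumptions, tokenization is reduced to whitespace splitting, and only the words present in the message are taken into account.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    def __init__(self):
        # Number of training messages of each class that contain a given word.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.msg_counts = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Learning phase: record which words occur in a labelled message."""
        self.msg_counts[label] += 1
        self.word_counts[label].update(set(text.lower().split()))

    def spam_probability(self, text):
        """Filtering phase: combine per-word probabilities with Bayes' formula."""
        words = set(text.lower().split())
        total = sum(self.msg_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            n = self.msg_counts[label]
            score = math.log((n + 1) / (total + 2))  # smoothed class prior
            for w in words:
                # Smoothed probability that a message of this class contains w.
                score += math.log((self.word_counts[label][w] + 1) / (n + 2))
            log_score[label] = score
        # Normalize in log space to avoid underflow on long messages.
        top = max(log_score.values())
        weight = {k: math.exp(v - top) for k, v in log_score.items()}
        return weight["spam"] / (weight["spam"] + weight["ham"])
```

With such a filter, appending words that mostly occur in legitimate mail to a spam message lowers its spam probability, which is the evasion strategy mentioned above.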
Message content analysis falls into the post-acceptance class of techniques. Most works involving message content analysis use passively collected spam data. However, some works make use of more active techniques (e.g., infiltration), such as [230, 46, 147], although message content analysis is usually not the final objective of those studies, but rather a means to perform some higher-level analysis (e.g., studying characteristics of spam campaigns, spam marketing conversion, etc).

2.4.4 Network-level spam detection

Network-level spam detection techniques leverage features of spam traffic below the application layer to detect and stop spam from entering mailboxes. Advocates of this technique claim that network-level features vary less from spammer to spammer and from campaign to campaign. They also argue that this approach allows spam to be filtered closer to the source and, as a consequence, prevents network resources from being wasted. However, this approach is limited by the type of data available (tcpdump logs, BGP routing information, etc). In [87], the authors deploy TCP signatures of identified spamming hosts in routers. They claim that good signatures exhibit low false-positive and low false-negative rates. However, they also admit that this kind of signature can currently only complement other spam detection techniques. In [36], the authors suggest that transport-level features like the RTT, the time between each SMTP packet in a flow and the TCP flow termination process can be leveraged in order to differentiate traffic that carries spam from traffic carrying legitimate email. In [248], the authors study some network-level characteristics of email traffic and attempt to infer features that can help detect spam. They look at different features like the distribution across the IP space, across ASes and by country. They also look at the volume of spam sent and the time during which spamming hosts are active.
This paper makes an important contribution to the field of spam analysis: the authors have indeed witnessed a few spammers advertising short-lived hijacked BGP routes used to send spam in a stealthy way. However, this paper is currently the only one that has found some evidence that spammers could take advantage of BGP hijacking to send spam. Some other network-level techniques leverage features of the SMTP protocol used between mail servers and mail clients to detect connections from spamming botnets. This kind of method is based on two assumptions. First, spam bots are assumed to implement a customized version of the SMTP protocol. Second, these customized implementations are assumed to exhibit little polymorphism compared to headers and message contents. By extracting features at the level of SMTP communications with spam bots, it is possible to use them to detect future instances of those spamming hosts, but also instances of spamming hosts using the same spam engine. SecureWorks [9] and MessageLabs [267] both take advantage of that technique to detect spam originating from very large spamming botnets. MessageLabs uses regular expression-based signatures provided by CBL [2] contributors which describe the SMTP sessions of different spamming botnets. Since these techniques take advantage of low-level characteristics of spam traffic, they can be applied before messages are received by the mail server (pre-acceptance). The works described in [87, 36] rely on spam data collected passively. However, in [263], the authors hijack a botnet C&C server to discover IP addresses of bots and they correlate them with passively observed spamming activities to study spam originating from the botnet. As a result, both passive and active observation techniques can be used to analyze network-level features of spammers.
2.4.5 Analysis of scam infrastructure

The scam hosting infrastructure refers to the Internet infrastructure used to host web sites advertised in spam. By analyzing the characteristics of such an infrastructure, one might be able to identify all spam messages advertising the same web sites or the same product. Discovering the different advertised topics and products can help to study spam campaigns and how they are carried out. It also helps to learn how spammers manage their scam hosts (e.g., multiple spammers may share a common infrastructure). In [304], the authors extract embedded URLs and retrieve domains and IP addresses of hosting servers. They observe that domains are associated with several IPs, probably to increase the resilience of their infrastructure when certain servers are banned or blacklisted. IP addresses also match several domains. Most hosting infrastructures seem to be widely distributed. The rotation of IP addresses seems to occur less frequently than the change of domains, which suggests that the IP addresses of the advertised servers can be leveraged to analyze spam. In [142], the authors study the web scam infrastructure related to the spam they receive. They find that, while spammers employ sophisticated methods to generate polymorphic spam content, advertised web content is more static. They further cluster the spam messages they receive based on the IP addresses of the advertised web domains. They also find that many spam campaigns share common scam infrastructures, making it difficult to characterize individual botnets from that kind of data. In [19], Anderson et al. characterize scam infrastructure and use data related to scam to better understand the dynamics and business pressures exerted on spammers.
They designed an opportunistic measurement technique called spamscatter that mines emails in real-time, follows the embedded link structure, and automatically clusters the destination Web sites using image shingling to capture graphical similarity between rendered sites. Another work by Kanich et al. [147] analyzes the conversion rate of spam, i.e., the probability that a spam message will ultimately elicit a sale. To that end, they infiltrate a spamming botnet and swap the malicious advertised web pages with innocuous web pages under their control. This sheds light on the real benefits spammers get from sending spam. The results show that running a spamming botnet is costly and that sometimes, the spammers and the advertisers may be the same. This work highlights the importance of the scam hosting infrastructure in conveying the spammers’ message that incites users to buy products. Finally, in [66] Cova et al. have conducted a large-scale analysis of rogue AV campaigns and have studied the distribution infrastructure (i.e., the rogue AV websites) used for such campaigns. Rogue AV software is a type of misleading application that pretends to be legitimate security software, such as an anti-virus scanner, but which actually provides the user with little or no protection. Quite similarly to spam campaigns, rogue AVs typically find their way into victim machines by relying on social engineering techniques to convince inexperienced users that a rogue tool is legitimate and that its use is necessary to protect their computer. It is worth noting that [66] is one of the very first works to demonstrate the usefulness of attack attribution approaches for the problem of mining large security datasets. By using multi-criteria decision analysis (MCDA) techniques, the authors were able to discover specific campaigns likely to be associated with the action of a specific individual or group.
Prior to this work, a preliminary, high-level overview of some of the results obtained with the very same attribution method was presented in the Symantec Report on Rogue Security Software [97]. Studying scam hosting infrastructures often involves extracting URLs from spam messages. Most techniques used for this purpose are thus post-acceptance and passive techniques.

2.4.6 Analysis of higher-level behaviour of spammers

The majority of studies on spam detection and mitigation techniques concentrate on the core business activity of spammers, that is, sending spam. However, spammers must also perform many other activities before being able to flood users’ mailboxes with spam. For example, [243] describes how spammers can find email addresses as new targets. Moreover, as most of today’s spam comes from botnets, spammers have to manage these large networks to ensure that they can work properly without being detected. For instance, some bots may send spam to recruit new members to make the botnet grow, whereas others may be responsible for relaying spammers’ orders to other bots [147, 168]. In fact, although these secondary activities are critical for spammers, neither the research community nor commercial spam filter vendors pay much attention to them. However, studying these activities can help understand more about spammers’ behaviours. Another important task that spammers have to perform is called email harvesting. This consists in collecting email addresses from websites or infected computers to further use them as recipient addresses of spam emails. In [243], the authors describe the Project Honey Pot [7], which studies email harvesting by setting up honeypots that record any attempt to harvest email addresses on web sites, providing fake email addresses associated with spamtraps. This way, they are able to associate spam senders with email harvesters.
They identify two classes of spammers: those sending spam only a few hours after email addresses have been harvested and those sending spam a few weeks after email addresses have been harvested. They also show that hosts harvesting email addresses tend to be associated with static IP addresses and that they are less likely to be blacklisted than spamming hosts. Finally, they find that, quite surprisingly, many email harvesters can be fooled by means of simple email address obfuscation techniques. Spam campaigns have also become an important research topic over the past few years. Although the concept of “spam campaign” is not clearly defined in the research community, a spam campaign is often considered as a group of spam messages advertising the same product, and likely due to the same spammer or spam organization. Characteristics of spam campaigns can be uncovered by studying them using different techniques [313, 230]. One such technique is the analysis of similarity in the content of the spam messages, or of other specific features available in the message itself. For instance, one can leverage URLs to detect spam campaigns [313, 230]. In [313], the authors assume that spam campaigns are bursty and design a detection system based on the automated generation of URL regular expression signatures. On the other hand, another study of spam campaigns in [230], which uses URLs extracted from spam messages collected at an open relay, states that campaigns can be long lasting and are not necessarily bursty. The authors also found that a bot may participate in different campaigns while targeting different recipients. In [46], the authors define a spam campaign as a set of spam messages advertising the same product and using similar obfuscation and dissemination strategies.
By leveraging frequent pattern trees and a set of extracted features (i.e., the source and destination of the messages, the type of abuse and content obfuscation strategy), they can group spam messages into campaigns. In [333], the authors take advantage of text shingling to identify nearly duplicate messages in order to cluster them into campaigns. They find that half of the campaigns stay active for only a few hours and that the amount of spam sent by a botnet primarily depends on its size. Finally, in [167, 168], the authors study the way bots receive email address lists from C&C servers, how bots build spam messages from given templates, how bots report spam sending errors and the fact that spammers use email accounts to test their campaigns against different filters. All these activities are also part of the spamming process, and thus studying them really helps to gain insights into the spam phenomenon as a whole.

2.5 Root Cause Analysis and Attack Attribution

2.5.1 Introduction

In the context of cyber-attacks, a fundamental aspect is how to address the problem of attribution. Note that there is currently no universally agreed definition for “attack attribution”. If one looks at the definition of the term attribution in a dictionary, one will find something similar to: “explain by indicating a cause” (see footnote 7). However, most previous works related to that field tend to use the term “attribution” as a synonym for traceback, which consists in “determining the identity or location of an attacker or an attacker’s intermediary” [307]. In the context of a cyber-attack, the obtained identity can refer to a person’s name, an account, an alias, or similar information associated with a person, a computer or an organisation. The location may include physical (geographic) location, or any virtual address such as an IP address.
In other words, IP traceback is a process that begins with the defending computer and tries to recursively step backwards in the attack path toward the attacker so as to identify her, and to subsequently enable appropriate protection measures. The rationale for developing such attribution techniques lies in the design of the IP protocol, in which the source IP address is not authenticated and can thus be easily falsified. For this reason, most existing approaches dealing with IP traceback have been tailored toward (D)DoS attack detection, or possibly to some specific cases of targeted attacks performed by a human attacker who uses stepping stones or intermediaries in order to hide her true identity. In this project, we will refer to “attack attribution” as something quite different from what is described above, both in terms of techniques and objectives. Although tracing back to an ordinary, isolated hacker is an important issue, we are primarily concerned with larger scale attacks that could be mounted by criminal organizations, dissident groups, rogue corporations, and profit-oriented underground organizations. Consequently, we are rather looking at analysis methods that can help security analysts to determine the root cause of global attack phenomena (which usually involve a large number of sources or events), and to easily derive their modus operandi. These attack phenomena can be observed through many different means (e.g., honeypots, IDSs, sandboxes, web crawlers, malware collection systems, spamtraps, etc).

7 Definition given by Merriam-Webster. http://www.merriam-webster.com/dictionary/attribute
Typical examples of phenomena that we may want to identify and study range from malware families that propagate via code injection attacks [188], to botnets controlled by underground groups and targeting machines in the IP space [286, 71], to spam campaigns, or even to certain client-side threats such as rogue software campaigns run by the same organization, which aims at deploying numerous malicious websites (or compromising legitimate ones) in order to host and sell rogue software [66]. Attack phenomena are often widely distributed in the Internet, and their lifetime can vary from a few days to several months. They typically involve a considerable number of features, sometimes interacting in a non-obvious way, which makes them inherently complex to identify. Moreover, because several attack features may evolve over time, attributing distinct events to the same root phenomenon can be a challenging task. As noted by Richard Bejtlich on his TaoSecurity blog: “Attribution means identifying the threat, meaning the party perpetrating the attack. Attribution is not just malware analysis. There are multiple factors that can be evaluated to try to attribute an attack. [...]” [29, 28]. Bejtlich suggests that those factors are very diverse and should include, e.g., the timing of the attack, information on the targets, delivery mechanism, vulnerability or exposure, propagation method, command and control mechanisms, and several other contextual features. Finally, Tim Bass suggested in [25] that “Next-generation cyberspace intrusion detection (ID) systems will require the fusion of data from myriad heterogeneous distributed network sensors to effectively create cyberspace situational awareness [...]
Multisensor data fusion is a multifaceted engineering approach requiring the integration of numerous diverse disciplines such as statistics, artificial intelligence, signal processing, pattern recognition, cognitive theory, detection theory, and decision theory. The art and science of data fusion is directly applicable in cyberspace for intrusion and attack detection”. Hence, it is not surprising to observe that emerging methods in the field of attack attribution are at the crossroads of several research domains, which we can try to categorize as follows: i) investigative and security data mining, i.e., knowledge discovery and data mining (KDD) techniques that are specifically tailored to problems related to computer security or intelligence analysis; ii) problems related to multi-criteria decision analysis (MCDA), and multisensor data fusion; iii) general techniques for malicious traffic analyses on the Internet, with an emphasis on methods that aim to improve the “cyber situational awareness” (Cyber-SA). In the next paragraphs, we give an overview of some key contributions in each research area.

2.5.2 Investigative and Security Data Mining

In the last ten years, considerable efforts have been devoted to applying data mining techniques to problems related to computer security. However, a great deal of those efforts has been exclusively focused on the improvement of intrusion detection systems (IDS) via data mining techniques, rather than on the discovery of new fundamental insights into the nature of attacks or their underlying root causes [144]. Furthermore, only a subset of common data mining techniques (e.g., association rules, frequent episode rules or classification algorithms) have been applied to intrusion detection, either on raw network data (such as ADAM [23], MADAM ID [184, 185] and MINDS [84]), or on intrusion alerts streams [76, 146].
A comprehensive survey of Data Mining (DM) techniques applied to Intrusion Detection (ID) can be found in [24, 40]. We note that most of these previous approaches aim at improving alert classification or intrusion detection capabilities, or at constructing better detection models thanks to the automatic generation of new rules (e.g., using some inductive rule generation mechanism). Only recently, namely in the context of the WOMBAT Project [71], some emerging work has been done regarding the application of novel data mining approaches to different security data sets, and with different purposes. More precisely, in [284, 283, 285, 282, 287], the authors have developed a graph-based, unsupervised data mining technique to discover unknown attack patterns performed by groups or communities of attackers by mining data sets containing only malicious activities. The final objective does not consist in generating new detection signatures to protect a single network, but instead to understand the root causes of large-scale attack phenomena, and get insights into their long-term behavior, i.e.: how long do they stay active, what is their average size, their spatial distribution, and how do they evolve over time with respect to their origins, or the type of activities performed. During the VIS-SENSE project, we want to pursue these efforts by further developing and enhancing those graph-based clustering techniques. For example, the clustering techniques employed in [282, 287] to create some sort of viewpoints for each attack feature separately, rely on graph-based techniques that require a full similarity matrix (n × n) as input. Quite obviously, such techniques do not scale very well with the size of data sets.
Hence, we could investigate the possible application of other, more scalable clustering techniques, such as BIRCH [327] or BUBBLE [98], which are able to find clusters in very large databases in a single pass. However, for these techniques to be applicable to honeypot data (and the like), we still need to research and find the most appropriate representation for attack features extracted from such data sets.

Crime Data Mining There are also many similarities between the tasks performed by analysts in computer security and in crime investigations or in law-enforcement domains. As a result, several researchers have studied the potential of data mining techniques to assist law-enforcement professionals. In [205], McCue provides real-world examples showing how data mining has identified crime trends and helped crime investigators in refining their analysis and decisions. Prior to that work, Jesus Mena described and illustrated the usefulness of data mining as an investigative tool by showing how link analysis, text mining, neural networks and other machine learning techniques can be applied to security and crime detection [208]. More recently, Westphal has provided additional examples of real-world applications in the field of crime data mining, such as border protection, money laundering, financial crimes or fraud analytics, and also elaborates on the advantages of using information-sharing protocols and systems in combination with those analytical methods [305]. We observe, however, that most previous work in the crime data mining field has primarily focused on “off-the-shelf” software implementing traditional data mining techniques (such as clustering, classification based on neural networks and Kohonen maps, or link analysis). Although very useful, those techniques are generally not very appropriate for modeling the complex behaviors of the kind of attack phenomena that we want to identify on the Internet.
2.5.3 Attack Attribution based on Multi-criteria Decision Analysis

As mentioned above, a new approach towards attack attribution in cyberspace has recently been proposed by Thonnard et al. [282, 287]. The idea is to combine multi-criteria decision analysis (MCDA [30, 290]) with clustering techniques, in order to identify groups of security events that are likely due to the same root cause (i.e., the same underlying phenomenon). This method can be applied to a broad range of security data sets, such as intrusion detection alerts, honeypot events, malware samples, rogue AV domains, spam messages, and more. Examples of real-world applications include the analysis of Rogue AV campaigns likely run by the same group of people [97, 66], the analysis of honeypot attacks [285, 71], or potentially also the analysis of large spam campaigns run by gangs of spammers. Despite the great flexibility of MCDA approaches in combining features or evidence, we note that rather few previous works have used them to address security-related problems. Still, in [53] the authors consider the problem of discovering anomalies in a large-scale network based on the data fusion of heterogeneous monitors. The authors evaluate the usability of two different approaches for multisensor data fusion: one based on the Dempster-Shafer Theory of Evidence and one based on Principal Component Analysis. The Dempster-Shafer theory is a mathematical theory of evidence based on belief functions and plausible reasoning [257]. It allows one to combine evidence from different sources and to obtain a certain degree of belief (represented by a belief function) that takes into account all the available evidence. It can be seen as a generalization of Bayesian inference where probability distributions are replaced by belief functions.
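To make the combination of evidence concrete, the sketch below implements Dempster’s rule of combination for two sources, together with the belief and plausibility functions derived from a mass assignment. It is a textbook formulation, not code from [53] or [257]; representing hypotheses as Python frozensets and the function names are choices made for this example.

```python
def combine(m1, m2):
    """Dempster's rule: fuse two mass functions given as {frozenset: mass}."""
    fused = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                fused[common] = fused.get(common, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to incompatible hypotheses.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {h: mass / (1.0 - conflict) for h, mass in fused.items()}


def belief(m, hypothesis):
    """Total mass of the focal sets included in the hypothesis."""
    return sum(mass for h, mass in m.items() if h <= hypothesis)


def plausibility(m, hypothesis):
    """Total mass of the focal sets compatible with the hypothesis."""
    return sum(mass for h, mass in m.items() if h & hypothesis)
```

For instance, with the frame {attack, benign}, one sensor assigning mass 0.6 to {attack} and 0.4 to the whole frame, and a second sensor assigning 0.7 to {attack} and 0.3 to the whole frame, the fused belief in {attack} is 0.88, with the remaining 0.12 left uncommitted.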
When used as a method for sensor fusion, different degrees of belief are combined using Dempster's rule, which can be viewed as a generalization of the special case of Bayes' theorem where events are independent. In our attribution method, we prefer using aggregation functions as described previously, for the greater flexibility they offer in defining how we want to model interactions among criteria (e.g., a positive or negative synergy between a pair of criteria). Moreover, in Dempster-Shafer theory all criteria are considered independent of each other, which is usually not the case with features used in attack attribution. Interestingly, it has been shown that there is a direct connection between fuzzy measures used in MCDA and belief or plausibility functions used in Dempster-Shafer theory [111, 302]. During VIS-SENSE, we will thus further investigate the MCDA techniques that are best suited for attack attribution purposes. More precisely, the aggregation of attack features performed in [282, 287] deals mainly with a limited set of aggregation functions, such as the Ordered Weighted Average (OWA) operator [317], the Weighted OWA and the Choquet integral [30, 290]. However, more real-world experiments need to be carried out to fine-tune the integration of these aggregation techniques into an attack attribution framework, in particular with respect to the determination of appropriate weighting vectors and fuzzy measures. Furthermore, we need to define an aggregation function that is able to model a decision scheme matching as closely as possible the phenomena under study. In many cases, the aggregation process can be modelled using a sort of averaging function, like a simple weighted mean or an OWA-based operator.
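As an illustration of the difference between a weighted mean and an OWA operator, the sketch below implements both. It is not the implementation used in [282, 287]; the criteria scores and weighting vectors are invented for the example. The key point is that a weighted mean attaches weights to specific criteria, whereas OWA attaches weights to rank positions after sorting the scores.

```python
def weighted_mean(values, weights):
    """Weights are attached to specific criteria (position in the list)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def owa(values, weights):
    """Ordered Weighted Average: weights are attached to rank positions.

    The values are sorted in descending order first, so weights[0] applies
    to the largest value, weights[1] to the second largest, and so on.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical similarity scores of two events on three criteria.
scores = [0.9, 0.2, 0.7]

at_least_two = owa(scores, [0.5, 0.5, 0.0])  # average of the two best scores
best_only = owa(scores, [1.0, 0.0, 0.0])     # behaves like max
worst_only = owa(scores, [0.0, 0.0, 1.0])    # behaves like min
plain = weighted_mean(scores, [1.0, 1.0, 1.0])
```

With the weighting vector `[0.5, 0.5, 0.0]`, a low score on any single criterion is ignored, which models a decision scheme of the form "at least two criteria must agree".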
However, one could prefer to use another form of conjunctive or disjunctive function (such as t-norms and t-conorms), or mixed functions (such as uninorms and nullnorms) to model the aggregation of criteria in more complex scenarios. Finally, it is worth noting that MCDA has been ranked in the top 5 intelligence analysis methods by K. Wheaton, assistant professor of intelligence studies at Mercyhurst College [306].

2.5.4 Malicious Traffic Analysis and Cyber-SA

This research will also build on prior work in malicious traffic analysis, a field with a substantial literature. For example, in [321], Yegneswaran et al. studied the global characteristics and prevalence of Internet intrusions by systematically analyzing a set of firewall logs (from D-Shield) collected from a wide perspective (over four months of data collected from many different networks worldwide). Their study is a general analysis that focused on the issues of volume, distribution (e.g., spatial and temporal), categorization and prevalence of intrusions. Then, in [229], Pang et al. characterize the incessant non-productive network traffic (which they term Internet background radiation) that can be monitored on unused IP subnets when deploying network telescopes or more active responders such as honeypots. They analyzed temporal patterns and correlated activity within this unsolicited traffic, and they found that probes from worms heavily dominate. More recently, similar research has been conducted by Chen et al. in [56]. While all these previous works provide meaningful results and have contributed much to advancing malicious traffic analysis, the traffic correlation and analysis techniques used by these authors stay at a fairly basic level. Indeed, they basically break down the components of background radiation by protocol, by application and sometimes by specific exploit, and then apply some statistics across each component. In [283], Dacier et al.
developed a more elaborate clique-based clustering method to extract groups of correlated attack clusters from a large honeynet dataset. In [284, 285] the same authors explored two different approaches to combine attack knowledge obtained through these means. Then, they also presented in [234, 235] different signal processing techniques that can be used to extract, systematically, so-called attack events from a large set of honeynet traces. More recently, Leita et al. offer in [187] an empirical study of an extensive data set collected by the SGNET honeypot deployment. In particular, they show the value of combining clustering techniques based on static and behavioral characteristics of the malware samples, and show how this combination helps in detecting clustering anomalies but also in underlining relationships among different code variants. Finally, they highlight the importance of using contextual information related to malware propagation in order to get a better understanding of the malware threat ecosystem. It would be incomplete to discuss attack attribution without mentioning some active research carried out in Cyber Situational Awareness (or Cyber-SA). We acknowledge the seminal work of Yegneswaran and colleagues in this field, such as in [322], where they explore ways to integrate honeypot data into daily network security monitoring, with the purpose of effectively classifying and summarizing the data to provide ongoing situational awareness on Internet threats. However, their approach aims at providing tactical information, usable for day-to-day operations, whereas Dacier et al. are interested in strategic information that reveals long-term trends and the modus operandi of the attackers. Closer to their research, Li et al.
have described in [192] a framework for automating the analysis of large-scale botnet probing events and worm outbreaks using different statistical techniques applied to aggregated traffic flows. They also design schemes to extrapolate the global properties of the observed scanning events (e.g., total population and target scope) as inferred from the limited local view of a honeynet. A first compilation of scientific approaches for Cyber-SA has recently been published in [140], in which a multidisciplinary group of leading researchers (from cybersecurity, cognitive science, and decision science areas) try to establish the state of the art in cyber situational awareness and to set the course for future research. The goal of this pioneering book is to explore ways to elevate situational awareness in the cyber domain. Finally, another interesting project is Cyber-Threat Analytics (Cyber-TA), founded by SRI International [70]. Cyber-TA is an initiative that gathers several reputed security researchers. It aims at accelerating the ability of organizations to defend against Internet-scale threats by delivering technology that will enable the next generation of privacy-preserving digital threat analysis centers. According to Cyber-TA, these analysis centers must be fully automatic, scalable to alert volumes and data sources that characterize attack phenomena across millions of IP addresses, and give higher fidelity in their ability to recognize attack commonalities, prioritize, and isolate the most critical threats. However, very little information is available at [70] on which scientific techniques could enable organizations to achieve such goals or to elevate their cyber-situational awareness.

3 Visual Analysis for Network Security

3.1 Introduction

As described earlier, visual analytics is the combination of automatic analysis methods and visual approaches to investigate huge datasets.
After introducing the automatic algorithmic methods in network analysis in the second chapter, the focus of Chapter 3 will be the visual approaches, which emphasize interactive visualizations with the human in the loop. There are many fields where visual analysis has successfully improved automatic analysis or has even been able to provide new insights into complex relationships, recurring patterns or anomalies which were not known before. To get a better understanding of the techniques used in visualization applications, the following sections provide a brief overview of the most common visualization and interaction techniques.

3.1.1 Visualization Techniques

When we talk of datasets, we mean datasets of no particular format. In the case of data tables, we call the columns attributes. The term attributes can be applied to datasets of other paradigms. In this case an attribute is any clearly defined property of the dataset. Additional attributes can be derived during the analysis process. These generated attributes either summarize other attributes or are the result of automated processing. The individual values of each attribute are known as data items. A visualization is simply a mapping of sets of data items onto marks. In the case of interactive computer visualizations, marks are connected groups of pixels, together with their color specifications. Marks have a dimension (point, line, area), a shape, a color, a size and a texture, all of which can be used to represent different attributes. A very detailed discussion of marks can be found in the monograph Sémiologie Graphique by the French cartographer Jacques Bertin [34]. The set of marks displayed on a computer screen at any moment is the current view of the visualization. Besides the common visualization techniques there are some more specialized ones for particular data types. In the following we will introduce and briefly describe common visualization techniques.
We focus only on those visualizations that are commonly used in most network security prototypes and tools, as discussed in detail later (Section 3.2). Most of the implementations use simple timeline or graph visualizations to represent data. Sometimes a 3D display visualizes the data in a 3D space to gain a further axis, but it runs the risk of losing the overview or having some overlap among data points. Pixel visualizations try to represent as many data objects as possible on the screen by mapping each data value to a pixel and arranging the pixels adequately [151]. The best-known representations of data are tables or different kinds of charts. They are easy to create and to understand, but it is difficult to see relations or correlations in massive amounts of multi-dimensional data. To get a better understanding of interrelations between certain parameters, parallel coordinates [137] and scatterplots might be the right choice. So-called glyphs [303] can be used to map different attributes onto a single data representation. Glyphs change their appearance depending on the characteristics of certain parameters. They can be combined with different kinds of layout algorithms to create visual patterns. Matrices are a way to arrange data points in two dimensions. For hierarchical datasets a treemap visualization [143] is a good choice. It arranges the data in nested rectangles representing the hierarchical structure. The area of each rectangle is mapped to the value of a specific attribute (e.g., the number of attackers or the number of IP addresses). To easily compare different aspects of a dataset, small multiples arrange several instances of the same kind of visualization technique next to each other to make the differences between the representations visually salient.
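The treemap idea described above (nested rectangles whose areas follow an attribute value) can be sketched with the simple slice-and-dice layout, which alternates the split direction at each level of the hierarchy. This is only one of several possible treemap layouts, and the tree below, with its subnet and host names and values, is a made-up example. The sketch assumes that all leaf values are positive.

```python
def size(node):
    """Value of a node: its own value for a leaf, else the sum of its children."""
    children = node.get("children")
    if not children:
        return node["value"]
    return sum(size(child) for child in children)


def layout(node, x, y, w, h, depth=0, out=None):
    """Slice-and-dice treemap: returns a list of (name, x, y, width, height)."""
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    children = node.get("children", [])
    total = size(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in children:
        share = size(child) / total
        if depth % 2 == 0:
            # Even depth: place children side by side (split the width).
            layout(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            # Odd depth: stack children (split the height).
            layout(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Hypothetical hierarchy: values could be, e.g., numbers of attackers.
tree = {"name": "network", "children": [
    {"name": "subnet A", "children": [
        {"name": "host a1", "value": 30},
        {"name": "host a2", "value": 10},
    ]},
    {"name": "subnet B", "value": 60},
]}

rects = {name: (x, y, w, h)
         for name, x, y, w, h in layout(tree, 0.0, 0.0, 100.0, 50.0)}
```

Each rectangle's area is proportional to its value: on a 100 x 50 canvas with a total value of 100, one unit of value corresponds to 50 units of area.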
Of course there are many more ways to display or arrange different types of datasets, but the ones mentioned above are most commonly used in the field of network security and thus the most important ones to know. While some visualizations are static, modern computer-based visualizations are highly interactive and provide the analyst with a number of ways to modify the current view. The most common basic and some more advanced interactive techniques will be introduced in the following subsections.

3.1.2 Basic Interaction Techniques

The visualizations provide the analyst with a variety of ways to interact with the marks of a view and, thus, with the data itself. The most basic forms of interaction are filtering, zooming, panning and brushing. Filtering restricts the marks in the view to a data subset that fulfills the chosen criteria. Filters usually take the form of drop-down lists, sliders and check boxes, but can also involve complex graphic or textual query formulation. In its most basic form, zooming means an increase in the size of marks in the current view. Knowledge of the dataset may be used to add and remove information at different zoom levels. This technique is known as level-of-detail (LOD) zooming. A good example of LOD zooming is an interactive map: at the highest level only continents are shown in the view; after zooming in, the boundaries appear; zooming in further causes cities and important roads to appear. Closely related to zooming is panning. When the current view is enlarged it may not fit into the available screen space. In this case, panning becomes necessary to see all of the marks. Thus, while zooming modifies the granularity of the information displayed in the view, panning simply involves shifting the view to see different parts of it. Panning has no effect on view granularity. The final basic interaction is brushing.
To see how a particular mark (or group of marks) changes as the view changes, it can be brushed or highlighted to make visual tracing of the mark easier. In many cases, more than one view (often even more than one visualization) is used to see different parts or aspects of the same dataset. To enable the visual tracking of data items across views and visualizations, these can be linked. Items brushed in one view are then brushed in all other views as well. This technique is frequently referred to as brushing and linking.

3.1.3 Advanced Interaction Techniques

The basic interactions are complemented by interactions involving some form of automated data processing. Most of the techniques used in visual analytics are borrowed from the fields of statistical analysis and information retrieval. We will discuss three categories of interaction: sorting, searching and aggregation. Sorting (or ordering) can be applied to the marks in a visualization (e.g., the bars in a categorical bar chart) based on a displayed or non-visible attribute. Some visualizations consist of a number of views displayed in a list or matrix format. In this case, the views themselves can be sorted based on a chosen attribute. Aggregation also involves data processing. It is related to zooming, except that aggregation is applied to the data items themselves and not to the marks in a view. The most basic forms of aggregation involve basic operations, such as averaging, summing (also known as rolling up) and linear regression. More advanced aggregation techniques include clustering. Clustering uses statistical techniques and artificial intelligence to partition datasets into meaningful subsets. Some representation of the subsets themselves can then be used as an aggregated view of the data. There are numerous clustering algorithms, each applicable to different datasets and problems. A review of these algorithms is beyond the scope of this document.
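The basic aggregation operations mentioned above (summing, or rolling up, and averaging) act on the data items rather than on the marks. A minimal sketch is given below; the record layout, with the field names `src` and `bytes`, and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def roll_up(items, key, value):
    """Group data items by one attribute and summarize another attribute."""
    groups = defaultdict(list)
    for item in items:
        groups[item[key]].append(item[value])
    return {
        group: {"count": len(vals), "sum": sum(vals), "mean": mean(vals)}
        for group, vals in groups.items()
    }


# Hypothetical data items: one record per observed connection.
records = [
    {"src": "192.0.2.1", "bytes": 100},
    {"src": "192.0.2.1", "bytes": 300},
    {"src": "192.0.2.7", "bytes": 50},
]

per_source = roll_up(records, key="src", value="bytes")
```

A view built on `per_source` would show one mark per source address instead of one mark per connection, which is the sense in which aggregation resembles zooming out.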
The appropriate use of clustering algorithms usually requires the adjustment of certain parameters, thus clustering interactions involve the entry of information in a dialog box of some sort. When conducting an exploratory analysis of a large dataset, an analyst may wish to single out a very particular subset of data. Providing some facility for searching the data set makes finding specific data items much easier. In some cases, preprocessing may be necessary to enable fast searching. The input for a search query may be textual or take the form of a selection of marks in a view. This graphical search mode is also known as a similarity search, since the search query is for data items similar to those represented by the selected marks.

3.1.4 The Results of an Analysis

When an analyst has used a visualization to find something interesting or significant, it may be necessary to access the raw data represented by a mark. The ability to interactively bring up a detailed display or open a document from the visualization is known as details on demand. If the analysis leads to new discoveries or a better understanding of the given problem, then the analyst will probably want to make a note of the discovery. Recording such data in an orderly fashion makes it available to other analysts and for future reference. Storing data in this way is known as a feedback loop.

3.2 Tools for Generic Data Visualizations

To get a first impression of a given data set, it is possible to use free software tools, which try to visualize the data in an easy and insightful way. The idea is to quickly display different kinds of datasets without the need for special programming skills. Graphviz [82], for example, is software for viewing and manipulating abstract graphs, such as those found in the database domain or in computer networks. The algorithms used in Graphviz concentrate on static layouts.
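As a small illustration of how a graph is handed to Graphviz, the sketch below writes an undirected graph in DOT, the plain-text graph description language that Graphviz reads. The helper name `to_dot` and the node names are hypothetical; only the DOT syntax itself belongs to Graphviz.

```python
def to_dot(edges, name="network"):
    """Return an undirected graph in Graphviz DOT syntax.

    `edges` is an iterable of (node, node) pairs; node names are quoted so
    that names containing dashes or dots remain valid DOT identifiers.
    """
    lines = [f"graph {name} {{"]
    for a, b in edges:
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical small network topology.
dot_text = to_dot([("router", "host-a"), ("router", "host-b")])
```

Saved to a file, such a description can be rendered with one of the layout programs shipped with Graphviz, for example `dot -Tpng network.dot -o network.png`.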
However, it is possible to choose different kinds of layouts, for example a hierarchical, a force-based or a radial one. Each layout provides the user with certain attributes, which can be changed to improve the visual representation of the graph. Besides changing the layout algorithm it is also possible to modify the appearance of the nodes, to label nodes and edges or to change the color. The software is available under an open source license and can be downloaded from the tool's website. Another graph visualization tool is Gephi [26]. It focuses on graph data and therefore provides certain techniques for filtering, navigating, manipulating or even clustering. The open source software is available on the homepage and can display large networks with about 20,000 nodes. Every node can be individually designed with textures, photos, etc. Other attributes can be configured in real-time, like different layout algorithms, size adjustments, or node repulsion. With little programming skill the tool can be extended with filters or other kinds of algorithms. A big plus is the dynamic module, which allows the user to send data to the visualization while it is running. This means the results are immediately visible in the graph, so changes in the network structure can be examined. To facilitate data analysis on large volumes of data, the software Knime [33] offers easy access to these tasks with the ability to dig deeper into the material and to program individual data mining algorithms. The idea is to provide the user with a visual pipeline to which different modules can be added to preprocess, analyze and visualize the data. Each module (a node, in visual terms) processes the arriving data and produces results on its output. Typical tasks are filtering or merging, some statistical functions like mean calculation, or more intensive algorithms like clustering, etc.
Some nodes additionally produce a view as output, which can be displayed in a separate window. These views range from simple tables to more complex ones like scatterplots, histograms or parallel coordinates. The software can be downloaded free of charge from its website. Nearly the same idea is realized by RapidMiner [210]. The software also offers a graphical user interface to design the analytical process with the possibility to define certain display modules. An advantage is the use of the standardized XML format to exchange information in the pipeline itself. Because of the well-known format the pipeline can easily be extended with external tools or algorithms. Additionally, RapidMiner integrates Weka [14] with its machine learning algorithms to perform data preprocessing, clustering, regression or other data mining tasks. While Weka is an open source tool that can be used as a standalone version, RapidMiner is available for free in an unsupported light community version and in a supported commercial version. ManyEyes [298] is a free-to-use web tool. Datasets can be uploaded and stored on the server, or it is possible to use an already existing dataset. After choosing the data, the user can decide which visualization technique to use. The different kinds of visualization techniques are categorized by their main purpose, like analyzing a text or showing relations. The visualization is only created if the representation supports the uploaded dataset, so the user has to choose the representation type wisely. Unfortunately it is not possible to automatically analyze the data. Another visualization tool without any automatic analysis functions is Gnuplot [4]. The tool has to be used via the command line and is therefore not as easy to use as other tools.
The original idea was to plot mathematical functions, but the functionality was extended to visualize nearly any kind of data in many different ways, for example as heatmaps, vector fields or datastrings. The InfoVis toolkit [90] provides the user with almost the same functionality, but offers a graphical user interface for the data import. Additionally the toolkit can be extended with some programming skills. For statistical and mathematical computing, the R-project [161] offers software to perform these calculations and visualize the results via the command line. The tool supports different kinds of visualization techniques like scatterplots, bar charts or parallel coordinates. To facilitate access to the tool it is possible to use third-party software that provides a GUI for it. One example is RStudio [5], which is free to use. Of course there is a tradeoff between how easy a software tool is to use, how well it visualizes an individual dataset, and how well certain parameters can be adjusted. In the case of network security it is not possible to use only these common visualization tools, because the tasks of a network analyst are very specialized and must be supported with strong, rich and preferably highly interactive software solutions. That is why the following sections introduce different specialized tools for the different types of network data and tasks. The results of this survey are also summarized in Table 3.1 for a better overview and comparison.

Table 3.1: Overview Table of Network Security Tools
(The per-tool "x" markings of this table are not legible in this transcription; only its row groups, column headings and tool list are reproduced here.)
Row groups: IDS Logs, Network Traffic, BGP.
Column group labels: Data, Analytics, Interaction.
Column headings, in the order in which they appear: Online Available, Traffic, IDS, BGP, time-varying, (near) real-time, historical, Similarity Search, Ranking, Feedback Loop, Clustering, Linking & Brushing, Zoom, Ordering, Filtering, Other (Tag Cloud etc.), Small Multiples, Matrix, Geographic Map, Glyph, Scatterplot, Parallel Coordinates, TreeMap, Charts, Pixel Visualization, Tables, 3D Display, Graph, Timeline Visualization.
Tools: BGPlay [61], BGPlay++ [65], TAMP [310], BGPEye [277], VAST [224], Elisha [276], Link-Rank [177], BGPeep [258], Teoh:2004 [278], Flamingo [223], NetBytes viewer [269], FloVis [270], DNVS [226], Spinning Cube [181], InetVis [139], VIAssist [107], NVisionIP [180], NFlowVis [93], RUMINT [64], Krasser:05 [166], Pearlman:08 [232], Nfsight [32], Xiao:06 [312], Chen:07 [55], OverFlow [105], Mansmann:08 [200], PortVis [207], Existence Plots [141], Portall [92], Flowtag [182], VisFlowConnect [326], IDGraphs [250], Isis [236], TNV [106], Irwin:08 [138], NUANCE [237], Harrop:06 [116], IDS Rainstorm [15], Snort View [163], Idtk [165], Visual Firewall [183], Yelizarov:10 [323], IP Matrix [164], VisAlert [96], SnortSnarf [119], Avisa [259], SpiralView [35].

3.3 Tools and Methods for BGP Data

Much research has been done in the area of visualizing traffic flows, while there are only a few approaches for BGP-related data. However, BGP is a very vulnerable part of the Internet infrastructure and could be the main target for criminals in the future. A well-known tool to visualize routing information is BGPlay [61], in which an animated graph is used to visualize the autonomous systems (AS) and their connections with each other (see Figure 3.1). To enable this, BGPlay uses the routing information that is made available by the Route Views [293] system. BGPlay is capable of showing changes over time in an animation and enabling the user to shift the time interval in any direction to examine changes of routing events like "route withdrawal" or "new route".
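To illustrate what such routing events are, the sketch below compares two snapshots of the AS paths towards one prefix, as seen from several collector peers, and labels the differences. It is not BGPlay's algorithm: the snapshot format, the peer names and the extra "path change" label are assumptions, and the AS numbers are taken from the range reserved for documentation.

```python
def routing_events(before, after):
    """Label the differences between two routing snapshots for one prefix.

    Each snapshot maps a collector peer to the AS path (a tuple of AS
    numbers) it uses to reach the prefix. A missing peer means no route.
    """
    events = []
    for peer in sorted(set(before) | set(after)):
        old, new = before.get(peer), after.get(peer)
        if old is None:
            events.append((peer, "new route", new))
        elif new is None:
            events.append((peer, "route withdrawal", old))
        elif old != new:
            events.append((peer, "path change", new))
    return events


# Hypothetical snapshots at two points in time.
snapshot_t0 = {
    "peer-1": (64496, 64500, 64510),
    "peer-2": (64497, 64510),
}
snapshot_t1 = {
    "peer-1": (64496, 64501, 64510),  # same peer, different path
    "peer-3": (64498, 64510),         # route appears
}

events = routing_events(snapshot_t0, snapshot_t1)
```

An animation over time can then be thought of as a sequence of such event lists, one per pair of consecutive snapshots.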
If an AS path does not change in a given time interval, the connections between the vertices are dashed, combined in a set and merged in a tree. Each path and each tree has its own color. The animation of routing events is only shown on solid paths starting at the collector peer and ending in the target AS. There is an online version of the system where the user can enter the IP prefix and the time interval to be analyzed.

Figure 3.1: The BGPlay tool [61].

The tool was improved one year later with an underlying topological map [65]. The idea is to show the AS paths of the BGP and the levels of the AS hierarchy to see if some ASes have to "climb" a hierarchy level higher to reach a place in the Internet, or if they can use a path on the same level. Another animated graph visualization for BGP data is provided in TAMP [310]. It displays a pruned graph for the network topology, an animated clock with controls to show and manipulate the time of the current state of the graph, and another plot to present the events belonging to a selected edge. TAMP tracks the routing changes expressed by the events to generate frames of TAMP pictures to form an animation. BGPEye uses two different visualizations to satisfy the goal of tracking the healthiness of BGP activity [277]. The Internet-Centric View shows the activity among different ASes with a graph. The Home-Centric View uses a panel display to visualize the prefix status from a single border router perspective. With the different visualization techniques and the underlying Route Views data it is possible to watch real-time routing activity like the moving average of the total number of BGP events or the deviation from historical trends. Another tool showing the overall topology of the Internet as well as individual AS behavior is VAST [224].
VAST uses a quadtree visualization for a single AS and a 3D OctoTree visualization for multiple ASes to display the topology and the BGP behavior. Color coding, node size and link size help to display further information. Interaction possibilities such as rotating, zooming or panning help to explore the 3D information space. Furthermore, different filter techniques provide the possibility to focus on certain aspects of the data. The tool mainly allows the visualization of routing anomalies and sensitive points. The tool Elisha also uses a quadtree visualization in combination with a pixel view [276], as shown in Figure 3.2. All paths from the observation point AS to the origin AS of the IP prefix are plotted. Three detail windows help the analyst to examine certain areas of the quadtree in more detail. Additionally, the tool offers different kinds of perspectives on the data, like a 3D display that can be rotated, or a fish-eye view. Further options allow filtering or changes in the visualization itself, like additional projection planes in the 3D view. To understand changes in the dataset the system uses animation over time, which can be displayed like a video or in single steps. Color coding is used to represent the time the path was used within the currently displayed time window. For a closer look at the tool there is a free-to-use version downloadable from the Internet.

Figure 3.2: The Elisha tool [276].

To visualize routing changes at a global scale the LinkRank tool was developed [177]. A graph layout in combination with an overview plot helps to focus while maintaining scalability. The graphs only show changes triggered by BGP updates. Two connected nodes gaining one or more links are shown in green, nodes losing one or more links are displayed in red. The other links without any changes are invisible. An activity plot
as overview visualization shows all the changes as green and red bars in the context of time. To get an impression it is possible to download the system from the Internet. The BGPeep tool [258] offers a tag-cloud and a parallel coordinates visualization to gain more insight into BGP traffic. The tag-cloud view is used to represent results of different queries like "ASes originating prefixes". After selecting the interesting tags (up to four) the user can gain more information about the tags in the prefix viewer. The prefix viewer contains a parallel coordinates visualization, which consists of five axes. The first one represents the AS associated with the update message. Each of the others displays a different octet of the IP address. Because all updates are rendered simultaneously it is necessary to use opacity. With the help of color coding the user is able to spot frequently announced prefixes, route flapping or prefix hijacking at once. A timeline is implemented to investigate only specific periods of time. Teoh et al. [279] presented a combination of statistical and visual methods to analyze BGP update messages. The collected data is filtered and processed to obtain statistical measures for each BGP update message, which is also the reason why the tool is only applicable in near real-time. With the help of the visualization the user can select the prefix and time period to be displayed to detect clusters of BGP update messages and compare them to their associated statistical anomaly measures.

3.4 Tools and Methods for Network Traffic Data

Traffic flows are captured primarily on routers or switches. They are one layer above the packet captures, which is why some information will be lost. However, it is possible to gain new information, namely the AS, the next hop of a packet's path through the network, and the number of packets in a flow.
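The relation between packet captures and flows can be sketched as a grouping of packets that share the same endpoints and protocol, keeping only counters per group. This is a deliberate simplification: real flow exporters also apply timeouts and record further fields, and the `Packet` record with its field names is a hypothetical structure used only for this example.

```python
from collections import namedtuple

# Hypothetical, much reduced packet record.
Packet = namedtuple("Packet", "src dst sport dport proto size")


def to_flows(packets):
    """Aggregate packets into flows keyed by the classic 5-tuple."""
    flows = {}
    for p in packets:
        key = (p.src, p.dst, p.sport, p.dport, p.proto)
        record = flows.setdefault(key, {"packets": 0, "bytes": 0})
        record["packets"] += 1
        record["bytes"] += p.size
    return flows


packets = [
    Packet("192.0.2.1", "198.51.100.5", 40000, 80, "tcp", 60),
    Packet("192.0.2.1", "198.51.100.5", 40000, 80, "tcp", 1500),
    Packet("192.0.2.9", "198.51.100.5", 40001, 53, "udp", 80),
]

flows = to_flows(packets)
```

The payload and the individual packet sizes are no longer available after this step, which is the information loss mentioned above, while the per-flow packet count is new.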
Parts of the information are useful to understand the topology of the network. The tools introduced in this section handle either traffic flows (e.g., NetFlow data) or packet captures. Flamingo [223] is a software tool that enables 3D Internet traffic data exploration in real-time. It provides a series of different visualization methods to illustrate different aspects of the data, which is collected from NetFlow records. To visualize the traffic between two IPs, for example, a quadtree algorithm is used to place the source IP address space on one side of a cube and the destination address space on the other side. The traffic between the source and the destination IP is displayed with a line connecting the two. The thickness of the line encodes the amount of traffic. With zooming, rotating and panning options the user is able to navigate through the information space and extract the necessary information. The cube can be reordered and filtered in different ways to take into account other aspects of the NetFlow data, for example the ports used. Another interactive 3D visualization tool for NetFlow data is the NetBytes viewer [269]. It deals with historical flow data leaving or entering a single entity and displays the volume of traffic flows as well. For that purpose a 3D impulse graph with a time dimension, a port or protocol dimension and a volume dimension is used. To diminish the disadvantages of a 3D display the tool provides important interaction methods like rotating and zooming the 3D display or highlighting data to gain further information in separate 2D graphs. FloVis [270] combines the NetBytes viewer with other visualizations like a flow bundle diagram or an existence graph. With the added flow bundle diagram a user can additionally investigate host-to-host or network-to-network interactions, while the existence graph is useful to spot role-based host information.
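The quadtree placement of an IP address space on a square, as mentioned above for Flamingo, can be illustrated as follows. The exact scheme used by the tool is not specified here, so this sketch assumes one common variant: at each level, the next two bits of the IPv4 address select one of four quadrants, so that addresses sharing a prefix end up in the same region of the square.

```python
import ipaddress


def quadtree_position(address, levels=16):
    """Map an IPv4 address to (x, y) on a 2**levels by 2**levels grid.

    At every level the next two most significant bits of the address pick
    a quadrant: the low bit selects the column, the high bit the row.
    """
    value = int(ipaddress.IPv4Address(address))
    x = y = 0
    for level in range(levels):
        shift = 30 - 2 * level
        quadrant = (value >> shift) & 0b11
        x = (x << 1) | (quadrant & 1)
        y = (y << 1) | (quadrant >> 1)
    return x, y


# With 4 levels only the first octet matters, so every address in
# 10.0.0.0/8 falls into the same cell of a 16 x 16 grid.
cell_a = quadtree_position("10.0.0.1", levels=4)
cell_b = quadtree_position("10.255.255.255", levels=4)
```

Using fewer levels gives a coarser grid, which corresponds to looking at larger address blocks instead of individual hosts.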
The data source is provided by the SiLK toolkit [48], which filters the raw flow data. The tool can be downloaded from the homepage.

A 3D tool monitoring network traffic in real-time is DNVS [226]. The tool is still under development and therefore lacks important features such as interaction possibilities or filtering options. Nevertheless the system provides two visualizations, namely the Service Behavior View and the Category View, which help, even in this early state, to discover possible anomalies in the network such as DoS or probing attacks.

The Spinning Cube of Potential Doom [181] visualizes darknet data in a 3D cube. The axes represent the source IP address, the destination IP address and the destination port number. With this encoding scan attacks are quickly visible to the analyst. Vertical lines represent port scans, flat two-dimensional planes appear for port scans across multiple contiguous host IP addresses, and port scans trying to avoid detection produce spiral-like patterns. To get a better understanding of the functionality of the tool it is possible to download a video.

Focusing on scan detection, InetVis [139] captures and visualizes live traffic in a 3D scatterplot. A time window offers the possibility to change the time scale or to show certain states in the past. To investigate the data further and to avoid over-plotting, the display can be split into sub-networks or smaller port ranges. Additionally the traffic can be filtered with the help of the Berkeley Packet Filter (BPF). With this syntax the packets can be filtered on any parameter. To enhance the overview the user can color certain aspects of the data, for example different ports. A fully functional version is available on the homepage.

Leaving the area of 3D displays, VIAssist [107] provides an intuitive, customizable 2D dashboard that gives a big-picture overview of network flow data to enhance situational awareness.
Different kinds of visualizations such as scatterplots, parallel coordinates or charts are provided to analyze network activity. Users can zoom into the data by increasing the accuracy of the data. Filtering options are provided by sliders and checkboxes. Additionally, all of the visualization views in VIAssist are linked, so level-of-detail (LOD) zooming and filtering in one view is reflected in the others. The different views can be relocated and resized within the workspace. The state of each workspace can be saved and exchanged to help other cyber defenders fulfill their tasks. To increase the number of linked views and to provide a better overview the analyst can use multiple displays.

NVisionIP [180] also uses different visualizations to support the analyst in detecting network attacks. The first view, named the Galaxy View, displays high-level data about the entire network. In this pixel visualization each point represents one IP address. Every point is color coded to represent the number of unique ports used by that IP address. The second view, the Small Multiple View, is a more detailed representation of the Galaxy View. Every IP address is visualized via two bar graphs. Both of these bar graphs show traffic over ports, which are colored for better understanding. The third view, the Machine View, visualizes only a single IP address with different charts to provide the most detailed information at a single glance.

Figure 3.3: The NFlowVis tool [93].

The NFlowVis [93] system combines alerts from intrusion detection systems with NetFlow data of a whole company network. To enhance network security and to assess the impact of current attackers the system provides several views to support the workflow of the analyst.
This workflow starts with an overview based on several timeline and pixel visualizations, followed by an intrusion detection view, which shows the current IDS alerts. The flow visualization detailed in Figure 3.3 combines attacking external hosts with affected hosts within the internal network using novel visualizations based on treemaps, splines and graphs. Applying visual data analysis to traditional IDS data allows the analyst to gain deeper insight into current threat situations.

Another dashboard providing the user with many different types of visualizations is the tool RUMINT [64]. As a starting point the system shows a real-time thumbnail visualization. Each thumbnail represents one of the seven different visualizations, which can be enlarged by clicking on it. The user has the option to choose between a parallel coordinate plot, a scatterplot, a glyph-based animation and many more. The user can investigate the data and choose the degree of detail in an explorative way. The tool as well as the source code can be downloaded from the project's webpage.

Like one of RUMINT's views, two other systems use a glyph-based visualization to deal with network security. The first is from Krasser [166], who introduced a parallel coordinate plot in combination with glyphs. Each glyph represents a packet and can be clicked to retrieve more information. Additionally the analyst can choose between different time scales and zoom into interesting areas of the data. The second is provided by Pearlman [232]. He combines glyphs with a graph layout. Each glyph represents a node on the network and encodes the amount of traffic on a particular port, where each port is a slice in a circle. The size of the slice depends on the relative amount of traffic on the corresponding port. Different inner circles show changes over time for a single node. Two nodes are connected when their services communicate with each other.
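The port-slice encoding of such a node glyph can be sketched in a few lines. This is only an illustration under the assumption that "size of the slice" means the angular extent of a pie slice; the function name `slice_angles` is hypothetical and not taken from the cited tool.

```python
def slice_angles(traffic_by_port):
    """Return the angle in degrees of each port's slice in a node glyph.

    The angle of a slice is proportional to the port's share of the
    node's total traffic (assumption: slice size == angular extent).
    """
    total = sum(traffic_by_port.values())
    if total == 0:
        return {}
    return {
        port: 360.0 * volume / total
        for port, volume in sorted(traffic_by_port.items())
    }
```

For example, a node with 75 units of traffic on port 80 and 25 units on port 22 would be drawn with a three-quarter slice for port 80 and a quarter slice for port 22.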
To maintain the overview it is possible to zoom and pan the visualization to change the point of interest.

Nfsight [32] uses unidirectional NetFlow data provided by Nfdump or Nfsen to monitor client-server activity. One major part of the tool is the service detector, which converts these flows into bidirectional flows and identifies the client and the server. Additionally, event alerts generated by the self-implemented Intrusion Detection System are stored in a database. The visualization consists of a search engine, a dashboard and a network activity visualization table. The search engine enables the analyst to filter or query for specific parameters such as the IP address. The dashboard displays the latest alerts, the top 20 servers, services, scanned services, and internal scanners. The visualization table provides statistical information and displays the network activity as a time series using a heat map. Color is used to distinguish between client and server and to identify invalid flows. Because the tool cannot be used with real-time data, its main purpose is forensic analysis.

Xiao [312] stores the network flow data in a database and visualizes them with scatterplots or event diagrams. The database is used to support the use of different clauses to filter the data and to store them for later reuse. The analyst can select patterns in the different visualizations, for which he gets a list of predicates. This additional information is necessary to construct clauses, which are currently limited to conjunctions. After the analyst has found a clause which describes a certain pattern, he can name the pattern and commit it to the knowledge base for later use.

The tool invented by Chen [55] uses a machine learning method in combination with visualizations to reconstruct and classify network scan patterns.
A training set of controlled scan patterns is needed to reconstruct a noisy or incomplete pattern. This pattern can be used for later comparison or clustering to find correlations in malicious network activities. When dealing with large numbers of network scans, using visual representations in combination with machine learning methods is a great advantage.

The tool OverFlow [105] focuses on different types of overview visualizations. The system aggregates flow-level data to provide analysts with a starting point for their network traffic investigation. The idea is to show traffic between different subnets in such a way that the analyst can spot interesting areas and focus especially on those. The analyst is thereby able to quickly determine whether there is traffic between subnets that should not exist, or whether the characteristics of that traffic have changed.

Mansmann [200] introduced a graph-based metaphor to satisfy the goal of discovering anomalies in the behavior of hosts or higher-level network entities, as shown in Figure 3.4. The nodes of the graph represent the hosts, which are placed close to each other if they have similar traffic proportions. This layout algorithm can be influenced by the analyst by changing the attractor level of the nodes. Further interaction possibilities, such as highlighting different nodes or showing more detailed information, are integrated to allow the explorative investigation of the graph. Additionally the analyst can combine the graph with a treemap visualization to gain further information about the network and the host behavior.

PortVis [207] displays three visualizations to present high-level information as well as low-level semantic constructs. The first view, the timeline, shows the number of sessions on the port range in combination with the time. Different time units can be selected to be shown in the second visualization, the main view. It consists of a pixel visualization displaying each port.
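A pixel visualization of the port range can be sketched as follows. The sketch assumes, for illustration only, that the 65,536 possible port numbers are laid out row by row on a 256 x 256 grid; the grid size and the names `port_to_cell` and `port_grid` are assumptions and not details stated for the tool.

```python
GRID = 256  # assumed grid width/height: 256 * 256 = 65,536 ports


def port_to_cell(port):
    """Map a port number to a (row, column) cell of the pixel grid."""
    if not 0 <= port <= 65535:
        raise ValueError(f"invalid port number: {port}")
    return divmod(port, GRID)


def port_grid(value_by_port):
    """Build a GRID x GRID matrix holding one attribute value per port.

    The value could be, for example, the number of sessions observed
    on that port during the selected time unit; it would later be
    mapped to a pixel color.
    """
    grid = [[0] * GRID for _ in range(GRID)]
    for port, value in value_by_port.items():
        row, col = port_to_cell(port)
        grid[row][col] = value
    return grid
```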
Color is used to encode a user-selected attribute for each port. A port can be selected to receive additional information in the third view, the port visualization. The port visualization displays details over time for a selected port to identify whether the activity on the port is anomalous. The tool helps to detect port scans and suspicious traffic patterns on individual ports.

Figure 3.4: The system developed by Mansmann [200].

The same goals can be achieved by using existence plots as introduced by Janies [141]. The system uses a low-resolution visualization to represent the port usage of individual hosts over time. The display maps time to the x-axis and the port range to the y-axis. Color is used to represent the magnitude of traffic. The time scale can be changed to obtain either an overview or a more detailed presentation. In both cases, the existence plot provides useful insight into the hosts' activities by concurrently representing port usage.

For a more detailed analysis of application ports the system Portall [92] was invented. The tool gives analysts an end-to-end visualization of the host processes correlated with the network traffic in which the processes participate. Apart from the main window, which displays the communicating applications, the tool provides additional detail windows to gain further information. A timeline allows the analyst to investigate traffic and processes at some point in the past. Further interaction possibilities such as highlighting help the user to retain the overview if there are many occluding lines.

Changing the time scale and other parameters is also possible with FlowTag [182]. The tool uses double-ended sliders to manage the filtering process, and different tables and a parallel coordinate plot for the visualization task. It is possible to share the attack data to support collaborative analysis among Honeynet researchers.
Network flows can be tagged and later queried through a user interface to select only the interesting flows. Because of the graphical interface there is no need for a textual query: the flows are represented as lines and can be selected using rectangles. A video of the functionality and the program itself are available on the corresponding homepage.

VisFlowConnect [326] provides an animated parallel coordinate plot as the main view and a detailed host statistics table. With this combination the tool supports a high-level overview with the possibility to drill down into interesting or anomalous regions of the data. The animation is used to show changes over time. Additionally the user can manipulate the time, going backwards and replaying a certain event. With the different visualization techniques and interaction possibilities the tool satisfies the goal of displaying relationships between internal hosts and external machines, including the direction and volume of traffic.

Another interactive visualization system for NetFlow data streams is IDGraphs [250]. The tool uses a Histograph visualization together with a correlation matrix to reveal network anomalies and attacks such as port scans or SYN flooding. For the Histograph presentation time is mapped to the horizontal axis and SYN-SYN/ACK values to the vertical axis. This mapping is useful because high SYN-SYN/ACK values are suspicious and easy to spot. In the linked correlation matrix each row and column represents one stream. To display the correlation, every cell is additionally color coded from green (positive) to red (negative). To obtain better results from the matrix the streams are clustered to provide a better ordering. Furthermore, the matrix and the Histograph view are connected via linking and brushing interactions.
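The two quantities behind such a display can be sketched as follows. The sketch assumes that a "SYN-SYN/ACK value" is the number of SYN packets minus the number of SYN/ACK packets per time bin, and it uses the Pearson coefficient as the correlation measure; both are assumptions made for illustration, and all function names are hypothetical.

```python
import math


def syn_minus_synack(events, bin_width, n_bins):
    """Compute the per-bin difference (#SYN - #SYN/ACK) for one stream.

    events: iterable of (timestamp, kind), kind being "SYN" or "SYNACK".
    """
    series = [0] * n_bins
    for ts, kind in events:
        b = int(ts // bin_width)
        if 0 <= b < n_bins:
            if kind == "SYN":
                series[b] += 1
            elif kind == "SYNACK":
                series[b] -= 1
    return series


def pearson(x, y):
    """Pearson correlation of two equally long series (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_matrix(streams):
    """Pairwise correlation of all streams; cell [i][j] in [-1, 1]."""
    return [[pearson(a, b) for b in streams] for a in streams]
```

A cell value near +1 would be drawn towards the green end of the color scale and a value near -1 towards the red end; reordering rows and columns by cluster is a separate step that is not shown here.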
Isis [236] supports the analysis of network flows through two visualization methods, progressive multiples of timelines and event plots. The system uses a matrix-like metaphor to show the IP addresses one host has traffic with on the y-axis and the time on the x-axis. Glyphs are used to encode different events and are placed according to their time and source IP address. An interesting approach is the combination of visual affordances with structured query language (SQL) to minimize user error and maximize flexibility. To enable a feedback loop Isis keeps a history of a user's investigation, easily allowing a user to revisit a query and change a hypothesis. A MySQL database is used to store the flows, which provides the analyst with a flexible and familiar interface for specifying queries.

To preserve the big picture while performing packet-level analysis, TNV [106] was developed. As an overview the system includes a histogram of the relative network traffic activity of the entire dataset. The main view combines a matrix, displaying the time and the host IP addresses, with a link display to explicitly show connectivity between hosts. Connected to the main visualization are a port activity view and a table of the textual network packet details. The port activity view provides a visual overview of relative port activity and connections for selected hosts, while the details table offers access to the raw packet-level details required for the analysis task. Several filtering and highlighting mechanisms help to explore link patterns and activity. A Java version of the tool is available online.

Irwin [138] invented a tool to display a very large amount of network telescope traffic and in particular to compare data collected from multiple telescope sources. To achieve these goals the author invented a visualization using a Hilbert curve to lay out data points.
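A Hilbert-curve layout of IP addresses can be sketched with the well-known iterative distance-to-coordinate conversion. The choice of the first 16 address bits and a 256 x 256 grid is an assumption made for illustration, as are the names `d2xy` and `ip_to_cell`; the cited tool may use a different resolution.

```python
import ipaddress


def d2xy(order, d):
    """Convert distance d along a Hilbert curve to (x, y).

    The curve fills a 2**order x 2**order grid; consecutive values of d
    always map to neighbouring cells.
    """
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate/flip the quadrant so the sub-curves connect
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def ip_to_cell(ip, order=8):
    """Place an IPv4 address on the grid using its leading 2*order bits."""
    prefix = int(ipaddress.IPv4Address(ip)) >> (32 - 2 * order)
    return d2xy(order, prefix)
```

With `order=8` every /16 network occupies one cell, and numerically adjacent /16 networks always land in neighbouring cells.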
This layout algorithm aims to place similar IP addresses close to each other. Color can be used to encode different parameters such as networks with unique hosts or, as future work, geographical information.

Besides the more common visualization techniques, the system NUANCE [237] tries to gain context information about network attacks by building clusters of actors. Every actor is represented via an IP address and has its own profile, which describes the traffic over time. These actor models are clustered with k-means to represent similar behavioral profiles. After this clustering process NUANCE constructs a text vocabulary by performing an automated web search to describe each group. Of course the system also creates visualizations such as histograms or a geographic map to display the clusters of actors, but more exceptional is the additional text information gained from news feeds.

A very unconventional approach to visualizing network security is a system invented by Harrop [116]. A 3D game engine is used to display hosts with their traffic and port information along with different analysts collaboratively supervising the network. The analysts interact with the environment like computer game players. They can move or jump in any direction and even shoot with their weapon in order to initiate an action.

3.5 Tools and Methods for IDS Logs

The massive amount of textual alarm logs generated by intrusion detection systems makes it difficult to analyze each of them or to get an overall picture of what is occurring in the network. Therefore, visualizations are important to display alarm activity in a clearly arranged way. It is important for the analyst to get an overview of the alarms in order to obtain an idea of the general network activity and to easily detect anomalies.

IDS Rainstorm [15] provides this overview with the possibility to gain additional information via zooming and drill-down options. For the overall representation a matrix-like visualization is used.
A couple of rectangular regions are used to split the view into different sections to gain multiple y-axes. These sections provide the IP addresses on the y-axis and the time on the x-axis. Color is used to encode the alarm severity. To gain further information the analyst can use the cursor to select a focus point. After clicking on the area of interest a secondary window opens and shows additional information in a zoomed view. In this visualization colored glyphs are used to represent the alarms. A mouse-over reveals detailed information in a popup window, while double clicking allows time scaling. Currently, the tool can only be used for forensic analysis. A download link for the tool can be found on Christopher Lee's homepage.

SnortView [163] uses nearly the same visualization techniques as IDS Rainstorm. In the main view the x-axis encodes the IP addresses, the y-axis the time, and colored icons represent the alarms. One difference between the two systems is the shape of these alarm icons, which encodes additional information about the type of attack. To avoid overlapping or redundant repainting of icons a vertical red bar is used to display consecutive alarms. A detail view at the bottom of the screen shows further information about the alarms. The next difference is the additional Source-Destination Matrix frame, which is displayed on the left of the main screen. The source IP address can be seen on the y-axis and the destination IP address at the bottom. A red circle represents the communication between source and destination. When a user clicks a symbol in the alert frame, the communication path is highlighted and further information is shown in the detail view. Unlike IDS Rainstorm, SnortView performs real-time monitoring of Snort alarm logs.

A similar two-dimensional mapping of time and source IP addresses with color coding and glyph representations is used in IDtk [165].
The main difference of this tool is the possibility for the analyst to change the mappings of different data variables, for example those of the glyphs (e.g., size, opacity, ...) or the axes. It is even possible to introduce a third axis to create a 3D display. This flexibility allows the analysts to develop their own style of work for their own unique networks. With different filtering and interaction techniques the analysts can easily handle the massive amount of data.

A few tools even combine IDS alerts with other information such as traffic or other log files. Visual Firewall [183] uses four different views to handle traffic data as well as IDS logs. The Real-Time Traffic View displays packets in motion to show whether a packet is rejected by the firewall or not. Colored glyphs are used for every packet to encode the different kinds of traffic such as UDP or TCP. The Visual Signature View is a parallel coordinate plot with two axes. The one on the left displays the local host port and the one on the right shows the foreign host IP address. A connection is drawn if there is traffic between a port and a foreign host. After some time the lines fade out to avoid occlusion and to give the analyst a feeling of time. The Statistics View uses a line chart to illustrate the overall throughput of the network over time. The last view, the IDS Alarm View, displays IDS alerts in a quad-axes diagram. The time is displayed at the bottom, the left axis shows different categories of Snort rules, the right axis represents all possible subnets where attacks originate, and the top displays all the hosts on the local machine's subnet (the victims). Faded lines are drawn to visualize connections between the parameters. The fading animation encodes the time. Colored dots are used to display IDS alarms, with the color encoding the severity level.
The combination of the different views allows an analyst to form a coherent picture of the network state.

The tool invented by Yelizarov [323] uses cylinder-like glyphs to encode the severity level by height and the type of attack by color. Every glyph is linked to a previously discovered attack to reveal the relations between attacks within a complex event as well as their duration. The glyphs are placed on a 3D matrix. The y-axis represents the attacked IP address and the x-axis the time. The source IP addresses are displayed on a single line in the 3D space. Connections between this line and a cylinder visualize attacks from the source to the destination IP address.

To visualize similar IP addresses close to each other the tool IP-Matrix [164] uses two 2D matrices. The first matrix is for analysis at the Internet level and displays the first eight bits of the IP address on the vertical axis and the second eight bits on the horizontal axis. The second matrix is meant for monitoring the local network and displays the last 16 bits in the same way. To handle the large number of different alert types the system summarizes them into eight categories and colors every category. Since it is very difficult to analyze single pixels the tool builds grids which are colored according to the most frequent alert type occurring in each grid. To visualize the number of attacks for each single pixel two histograms are displayed at the bottom and on the left of the matrix. Changes in the temporal behavior are displayed with the help of animation, which can be controlled by the analyst. He can replay certain events or change the update interval. Further interaction possibilities allow him to filter for certain attributes, for example different protocols, or to gain additional information about an attack by clicking on the corresponding pixel.

VisAlert [194] aims to enhance situational awareness via visual correlation of existing alerts.
This goal is achieved via a topological map in a multiple-circle layout. The topological map is shown in the inner circle, while the time is represented as the radial coordinate of a polar coordinate system. The shape and the size of the nodes encode different parameters such as the uniqueness of alerts. When multiple alerts of the same type are triggered with regard to the same node, the alert lines are replaced by a beam which encodes the additional information by its width and color. The different cells on the outer rings represent particular types of alerts, which are colored according to their number of instances in this time slot. The system performs no analysis itself but provides a visual alert representation with which the analyst can manually explore noticeable events.

SnortSnarf [119] is not interesting in the way it displays IDS alarms, but in the way it preprocesses data and in the interaction possibilities it provides. For visualizing the alarm logs different HTML pages display simple tables and text sections. Links connect the pages, from which the user can obtain different information such as an ordering of the log files or only some filtered information. But the most important feature of the tool is the possibility to divide the alerts into a hierarchy of groups and to view only the representatives. With this method the analyst will still be able to retain an overview of the log files even if a massive number of alarms occurs. The tool can be downloaded from SourceForge.

In order to avoid an occluded, overdrawn and hard-to-perceive display the tool Avisa [259] offers an automatic as well as a user-directed prioritization of alarms and hosts. This enables the analyst to identify the hosts with interesting and often irregular behavior and discard the other ones.
With this preprocessing step the main display becomes clearly arranged, showing a radial visualization with interior arcs and an inner and outer ring. The inner ring shows the IDS alert types in different colors, while the outer ring is used for categorizing the alert types. Beside the colored IDS alert types the internal hosts of the network are displayed. The inner arcs represent the alarms, starting at the alert type panel and ending at the host panel. To gain an even better overview the analyst can apply filtering methods while interacting with the data, and animation is used to understand changes over time.

SpiralView [35] uses IDS alerts which are visualized in a spiral layout. It is a real-time tool that keeps a history of already known alerts. Older ones are located near the center of the spiral while newer alarms are on the outer ring. This kind of ordering has several advantages: more recent alarms have more space than older ones, the data is presented sequentially, and periodic behavior becomes visible. One circle represents all alarms of the last 24 hours. The alarms are color coded to visualize different types of alerts and their size encodes the severity. An additional histogram above the spiral is used to show the aggregated data over time. The analyst can select a time interval on the histogram to zoom into the corresponding ring of the spiral and investigate the result further. With the help of filtering options the user can reduce the visible data points to avoid overlapping. The tool also supports collaborative work because it is possible to label certain alarms to make other analysts aware of the discovery.

4 Conclusions and Future Work

The presented state-of-the-art techniques make it very clear that there has been much research in each of the VIS-SENSE relevant fields of network and visual analytics. There are numerous approaches for the detection of network abnormalities described in the relevant literature.
As also discussed in Chapter 2, each approach has advantages and disadvantages regarding its ability to detect abnormalities with a high success rate while avoiding false positive alarms. Some of these approaches face difficulties operating in real time and are subject to scalability limitations. A promising direction for obtaining solutions with safe and rich functionality is the combination of several techniques into integral approaches. Moreover, introducing effective correlation mechanisms for Intrusion Detection Systems helps to improve the network analysts' ability to promptly identify abnormal events. A very interesting architecture for designing IDSs is that of honeynets and honeypots. Such an architecture effectively attracts and monitors attacks and is able to locate the attackers.

Most visual analytics applications focus on the visualization part, but often do not rely on the most advanced network analytics approaches. The survey in Chapter 3 of visual analysis tools for network security, which provide interactive visual exploration of flow data, BGP and intrusion detection data, also showed that most tools are very specific and in many cases only suitable for particular tasks. In addition, the presented summary table reveals that those custom-built tools are rarely able to combine different data sources, but focus only on single data types. Because of these limited capabilities and scalability issues, deployment in real-world operational scenarios or the analysis of large datasets is often not possible.

Therefore, it has to be pointed out that there is still a substantial gap between those fields of research. Especially when very complex algorithms are involved, visual analysis could actually help to gain more insight into the data. In the field of attack attribution, for example, there are algorithms which are able to automatically group together those events which probably have the same underlying root cause.
However, because of the large number of dimensions it is no longer obvious to the analyst why these events are in the same group. In a system where interactive visual analysis is combined with analytics algorithms, the analyst would be better equipped to gain insight into the groups of attacks. This means that we need to tightly couple network security algorithms with directly integrated visual analysis methods in the future. Besides such integrative aspects, further improvements of the scalability of both the data analysis algorithms and the visualizations are necessary to eventually reach this goal. Moreover, it is important to bring this research to an operational level. The solid knowledge of the VIS-SENSE partners in network algorithmics and visual analysis will enable us to close this gap by providing a visual analytics framework that combines the respective strengths of both worlds.

Bibliography

[1] Cisco IronPort SenderBase Security Network. http://www.senderbase.org. [Online; accessed 22-Jan-2011].
[2] Composite Blocking List (DNSBL). http://cbl.abuseat.org. [Online; accessed 22-Feb-2011].
[3] Distributed Sender Blackhole List (DNSBL). http://dsbl.org. [Online; accessed 11-Jan-2011].
[4] Gnuplot Homepage. http://www.gnuplot.info/. [Online; accessed 24-Feb-2011].
[5] Introducing RStudio. http://www.rstudio.org/. [Online; accessed 24-Feb-2011].
[6] Not Just Another Bogus List (DNSBL). http://www.njabl.org. [Online; accessed 22-Jan-2011].
[7] Project Honey Pot. http://www.projecthoneypot.org. [Online; accessed 22-Jan-2011].
[8] RFC4271. A Border Gateway Protocol 4 (BGP-4). http://tools.ietf.org/html/rfc4271. [Online; accessed 22-Jan-2011].
[9] SecureWorks. http://www.secureworks.com. [Online; accessed 22-Jan-2011].
[10] Spam and Open-Relay Blocking System (DNSBL). http://www.au.sorbs.net. [Online; accessed 25-Jan-2011].
[11] Spamcop Blocking List (DNSBL). http://www.spamcop.net/bl.shtml. [Online; accessed 25-Jan-2011].
[12] Spamhaus (DNSBL). http://www.spamhaus.org. [Online; accessed 24-Jan-2011].
[13] The Apache SpamAssassin Project. http://spamassassin.apache.org. [Online; accessed 26-Jan-2011].
[14] Weka 3: Data Mining Software for Java. http://www.cs.waikato.ac.nz/ml/weka/. [Online; accessed 24-Feb-2011].
[15] K. Abdullah, C. Lee, G. Conti, J. Copeland, and J. Stasko. IDS RainStorm: Visualizing IDS alarms. Visualization for Computer Security, IEEE Workshops on, 2005. http://chrislee.dhs.org/projects/rainstorm.html.
[16] W. Aiello, J. Ioannidis, and P. D. McDaniel. Origin authentication in interdomain routing. In S. Jajodia, V. Atluri, and T. Jaeger, editors, ACM Conference on Computer and Communications Security, pages 165–178. ACM, 2003.
[17] A. Al-Bataineh and G. White. Detection and Prevention Methods of Botnet-generated Spam. In MIT Spam Conference, 2009.
[18] D. Anderson, T. Lunt, H. Javitz, A. Tamaru, and A. Valdes. Next-generation intrusion detection expert system (NIDES): A summary. Technical report, SRI International, 1995.
[19] D. S. Anderson, C. Fleizach, S. Savage, and G. M. Voelker. Spamscatter: characterizing internet scam hosting infrastructure. In SS'07: Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium, pages 1–14, Berkeley, CA, USA, 2007. USENIX Association.
[20] S. Axelsson. Intrusion detection systems: A survey and taxonomy. Technical Report 99-15, Department of Computer Engineering, Chalmers University of Technology, Goteborg, Sweden, 2000.
[21] H. Ballani, P. Francis, and X. Zhang. A Study of Prefix Hijacking and Interception in the Internet. In SIGCOMM '07: Proceedings of the 2007 conference on Applications, technologies, architectures, and protocols for computer communications, pages 265–276, New York, NY, USA, 2007. ACM.
[22] Z. Bankovic, S. Bojanic, O. Nieto-Taladriz, and A. Badii. Unsupervised genetic algorithm deployed for intrusion detection. In E. Corchado, A. Abraham, and W.
Pedrycz, editors, HAIS, volume 5271 of Lecture Notes in Computer Science, pages 132–139. Springer, 2008.
[23] D. Barbara, J. Couto, S. Jajodia, and N. Wu. Adam: A testbed for exploring the use of data mining in intrusion detection. SIGMOD Record, 30(4):15–24, 2001.
[24] D. Barbara and S. Jajodia, editors. Applications of Data Mining in Computer Security, volume 6 of Advances in Information Security. Springer, 2002.
[25] T. Bass. Intrusion detection systems and multisensor data fusion. Communications of the ACM, 43(4):99–105, 2000.
[26] M. Bastian, S. Heymann, and M. Jacomy. Gephi: An Open Source Software for Exploring and Manipulating Networks. In International AAAI Conference on Weblogs and Social Media, pages 361–362. AAAI, 2009. http://gephi.org/.
[27] M. Behringer. Bgp session security requirements. Internet Draft, draft-ietf-rpsec-bgp-session-sec-req-01.txt, July 2008.
[28] R. Bejtlich. Attribution Is Not Just Malware Analysis. http://taosecurity.blogspot.com/2010/01/attribution-is-not-just-malware.html. [Online; accessed 22-Jan-2011].
[29] R. Bejtlich. Attribution Using 20 Characteristics. http://taosecurity.blogspot.com/2010/01/attribution-using-20-characteristics.html. [Online; accessed 22-Jan-2011].
[30] G. Beliakov, A. Pradera, and T. Calvo. Aggregation Functions: A Guide for Practitioners. Springer, Berlin, New York, 2007.
[31] B. Morin, L. Mé, H. Debar, and M. Ducassé. M4d4: a logical framework to support alert correlation in intrusion detection. Information Fusion, 10(4):285–299, October 2009.
[32] R. Berthier, M. Cukier, M. Hiltunen, D. Kormann, G. Vesonder, and D. Sheleheda. Nfsight: NetFlow-based Network Awareness Tool. In Proceedings of the 24th Large Installation System Administration Conference (LISA ’10), November 2010.
[33] M. Berthold, N. Cebron, F. Dill, T. Gabriel, T. Kötter, T. Meinl, P. Ohl, C. Sieb, K. Thiel, and B. Wiswedel. KNIME: The Konstanz information miner.
Data Analysis, Machine Learning and Applications, pages 319–326, 2008. http://www.knime.org/downloads-overview.
[34] J. Bertin. Sémiologie Graphique. Les diagrammes, les réseaux, les cartes. Gauthier-Villars, Paris, France, 1967.
[35] E. Bertini, P. Hertzog, and D. Lalanne. SpiralView: towards security policies assessment through visual correlation of network resources with evolution of alarms. In Visual Analytics Science and Technology, 2007. VAST 2007. IEEE Symposium on, pages 139–146. IEEE, 2007.
[36] R. Beverly and K. Sollins. Exploiting transport-level characteristics of spam. Technical Report MIT-CSAIL-TR-2008-008, 2008.
[37] J. Bonifacio, A. Cansian, A. de Carvalho, and E. E. Moreira. Neural networks applied in intrusion detection. In Proceedings of the International Joint Conference on Neural Networks, 1998.
[38] V. J. Bono. 7007 explanation and apology. NANOG mailing list, msg00444, 1997.
[39] A. Brodsky and D. Brodsky. A distributed content independent method for spam detection. In HotBots’07: Proceedings of the first conference on First Workshop on Hot Topics in Understanding Botnets, pages 3–3, Berkeley, CA, USA, 2007. USENIX Association.
[40] S. T. Brugger. Data Mining Methods for Network Intrusion Detection. In dissertation proposal, submitted to ACM Computer Surveys (under revision), 2009.
[41] H. Bunke, P. Dickinson, A. Humm, C. Irniger, and M. Kraetzl. Computer network monitoring and abnormal event detection using graph matching and multidimensional scaling. In P. Perner, editor, Industrial Conference on Data Mining, volume 4065 of Lecture Notes in Computer Science, pages 576–590. Springer, 2006.
[42] H. Bunke and M. M. Kraetzl. Classification and detection of abnormal events in time series of graphs, volume Data Mining in Time Series Databases, chapter 6, pages 127–148. World Scientific, 2004.
[43] K. Butler, T. Farley, P. McDaniel, and J. Rexford. A Survey of BGP Security Issues and Solutions.
In Proceedings of the IEEE, volume 98, pages 100–122, January 2010.
[44] J. Cabrera, L. Lewis, X. Qin, W. Lee, R. Prasanth, B. Ravichandran, and R. Mehra. Proactive detection of distributed denial of service attacks using mib traffic variables - a feasibility study. In Integrated Network Management Proceedings, 2001 IEEE/IFIP International Symposium on, pages 609–622, 2001.
[45] M. Caesar, L. Subramanian, and R. H. Katz. Towards localizing root causes of bgp dynamics. Technical Report UCB/CSD-03-1292, EECS Department, University of California, Berkeley, 2003.
[46] P. H. Calais, D. E. V. Pires, D. O. Guedes, W. Meira, C. Hoepers, and K. Steding-Jessen. A campaign-based characterization of spamming strategies. In CEAS, 2008.
[47] J. Cannady and J. Mahaffey. The application of artificial neural networks to misuse detection. In Proceedings of the International Workshop on the Recent Advances in Intrusion Detection (RAID1998), 1998.
[48] CERT/NetSA at Carnegie Mellon University. SiLK (System for Internet-Level Knowledge). http://tools.netsa.cert.org/silk. [Online; accessed 24-Feb-2011].
[49] P. Chan, M. Mahoney, and M. Arshad. Managing Cyber Threats: Issues, Approaches and Challenges, chapter Learning Rules and Clusters for Anomaly Detection in Network Traffic, pages 81–100. Springer, 2005.
[50] D.-F. Chang, R. Govindan, and J. S. Heidemann. An empirical study of router response to large bgp routing table load. In Internet Measurement Workshop, pages 203–208. ACM, 2002.
[51] D.-F. Chang, R. Govindan, and J. S. Heidemann. The temporal and topological characteristics of bgp path changes. In ICNP, pages 190–199. IEEE Computer Society, 2003.
[52] C.-S. Chao, Y.-X. Chen, and A.-C. Liu. Abnormal event detection for network flooding attacks. J. Inf. Sci. Eng., 20(6):1079–1091, 2004.
[53] V. Chatzigiannakis, G. Androulidakis, K. Pelechrinis, S. Papavassiliou, and V. Maglaris.
Data fusion algorithms for network anomaly detection: classification and evaluation. In IEEE International Conference on Networking and Services, ICNS’07, Athens, Greece, June 2007.
[54] L. Chen and J. Leneutre. A game theoretical framework on intrusion detection in heterogeneous networks. Information Forensics and Security, IEEE Transactions on, 4(2):165–178, June 2009.
[55] L. Chen, C. Muelder, K. Ma, and A. Bartoletti. Intelligent Classification and Visualization of Network Scans. Technical report, Lawrence Livermore National Laboratory (LLNL), Livermore, CA, 2007.
[56] Z. Chen, C. Ji, and P. Barford. Spatial-temporal characteristics of internet malicious sources. In Proceedings of INFOCOM, 2008.
[57] S. Cheung, U. Lindqvist, and M. W. Fong. Modeling multistep cyber attacks for scenario recognition. In DISCEX (1), pages 284–292. IEEE Computer Society, 2003.
[58] A. Chittur. Model generation for an intrusion detection system using genetic algorithms. PhD thesis, Ossining High School. In cooperation with Columbia University, 2001.
[59] B. Christian and T. Tauber. Bgp security requirements. Internet Draft, draft-ietf-rpsec-bgpsecrec-10.txt, November 2008.
[60] CNET News. Router glitch cuts Net access. http://news.cnet.com/2100-1033-279235.html. [Online; accessed 22-Apr-1997].
[61] L. Colitti, G. Di Battista, F. Mariani, M. Patrignani, and M. Pizzonia. Visualizing Interdomain Routing with BGPlay. Journal of Graph Algorithms and Applications, 9(1):117–148, 2005. http://bgplay.routeviews.org/.
[62] Colorado State University. BGP Monitoring System: BGPmon. http://bgpmon.netsec.colostate.edu/. [Online; accessed 22-Jan-2011].
[63] Computer Networks Research Group – Roma Tre University. BGPlay. http://bgplay.routeviews.org/. [Online; accessed 24-Jan-2011].
[64] G. Conti, K. Abdullah, J. Grizzard, J. Stasko, J. Copeland, M. Ahamad, H. Owen, and C. Lee.
Countering security information overload through alert and packet visualization. IEEE Computer Graphics and Applications, pages 60–70, 2006. http://rumint.org/.
[65] P. Cortese, G. Di Battista, A. Moneta, M. Patrignani, and M. Pizzonia. Topographic visualization of prefix propagation in the internet. IEEE Transactions on Visualization and Computer Graphics, pages 725–732, 2006.
[66] M. Cova, C. Leita, O. Thonnard, A. D. Keromytis, and M. Dacier. An analysis of rogue av campaigns. In Proceedings of the 13th international conference on Recent advances in intrusion detection, RAID’10, pages 442–463, Berlin, Heidelberg, 2010. Springer-Verlag.
[67] F. Cuppens. Managing alerts in a multi-intrusion detection environment. In Computer Security Applications Conference, 2001. ACSAC 2001. Proceedings 17th Annual, pages 22–31, December 2001.
[68] F. Cuppens and A. Miege. Alert correlation in a cooperative intrusion detection framework. In Security and Privacy, 2002. Proceedings. 2002 IEEE Symposium on, pages 202–215, 2002.
[69] F. Cuppens and R. Ortalo. Lambda: A language to model a database for detection of attacks. In H. Debar, L. Me, and S. F. Wu, editors, Recent Advances in Intrusion Detection, volume 1907 of Lecture Notes in Computer Science, pages 197–216. Springer, 2000.
[70] Cyber-TA. Cyber-threat analytics (cyber-ta), sri international. Available online at http://www.cyber-ta.org/. [Online; accessed 24-Jan-2011].
[71] M. Dacier, V. Pham, and O. Thonnard. The WOMBAT Attack Attribution method: some results. In 5th International Conference on Information Systems Security (ICISS 2009), 14-18 December 2009, Kolkata, India, Dec 2009.
[72] D. Dasgupta. An immunity-based technique to characterize intrusions in computer networks. IEEE Transactions on Evolutionary Computation, 6:1081–1088, 2002.
[73] H. Debar, M. Becker, and D. Siboni. A neural network component for an intrusion detection system.
In Proceedings of the 1992 IEEE Computer Society Symposium on Research in Computer Security and Privacy, pages 240–250, 1992.
[74] H. Debar, D. Curry, and B. Feinstein. The Intrusion Detection Message Exchange Format (IDMEF). RFC 4765 (Experimental), March 2007.
[75] H. Debar, M. Dacier, and A. Wespi. A revised taxonomy for intrusion-detection systems. Annals of Telecommunications, 55:361–378, 2000. 10.1007/BF02994844.
[76] H. Debar and A. Wespi. Aggregation and correlation of intrusion-detection alerts. In W. Lee, L. Me, and A. Wespi, editors, Recent Advances in Intrusion Detection, volume 2212 of Lecture Notes in Computer Science, pages 85–103. Springer, 2001.
[77] D. Denning and P. Neumann. Requirements and model for ides a real-time intrusion detection system. Technical Report 83F83-01-00, Computer Science Laboratory, SRI International, 1985.
[78] D. E. Denning. An intrusion detection model. IEEE Transactions on Software Engineering, SE-13:222–232, 1987.
[79] Y. Dhanalakshmi and R. Babu. Intrusion detection using data mining along fuzzy logic and genetic algorithms. IJCSNS International Journal of Computer Science and Network Security, 8(2):27–32, 2008.
[80] J. Dickerson and J. Dickerson. Fuzzy network profiling for intrusion detection. In Fuzzy Information Processing Society, 2000. NAFIPS. 19th International Conference of the North American, pages 301–306, 2000.
[81] W. Eddy. TCP SYN Flooding Attacks and Common Mitigations. RFC 4987 (Informational), August 2007.
[82] J. Ellson, E. Gansner, L. Koutsofios, S. North, and G. Woodhull. Graphviz: Open source graph drawing tools. Lecture notes in computer science, pages 483–484, 2002. http://www.graphviz.org/Download..php.
[83] R. Ensafi, S. Dehghanzadeh, and M. R. Akbarzadeh-Totonchi. Optimizing fuzzy k-means for network anomaly detection using pso. In AICCSA, pages 686–693. IEEE, 2008.
[84] Ertoz, Eilertson, Lazarevic, Tan, Kumar, Srivastava, and Dokas.
MINDS - Minnesota Intrusion Detection System. In Next Generation Data Mining, MIT Press, 2004.
[85] E. Eskin, A. Arnold, M. Prerau, L. Portnoy, and S. Stolfo. A geometric framework for unsupervised anomaly detection: Detecting intrusions in unlabeled data. In D. Barbara and S. Jajodia, editors, Applications of Data Mining in Computer Security. Kluwer, 2002.
[86] F. Esponda, S. Forrest, and P. Helman. A formal framework for positive and negative detection schemes. Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, 34(1):357–373, February 2004.
[87] H. Esquivel, T. Mori, and A. Akella. Router-Level Spam Filtering Using TCP Fingerprints: Architecture and Measurement-Based Evaluation. In Conference on E-Mail and Anti-Spam (CEAS), 2009.
[88] J. M. Estevez-Tapiador, P. Garcia-Teodoro, and J. E. Diaz-Verdejo. Stochastic protocol modeling for anomaly based network intrusion detection. In IWIA, pages 3–12, 2003.
[89] W. Fan. Cost-Sensitive, Scalable and Adaptive Learning Using Ensemble-based Methods. PhD thesis, Columbia University, 2001.
[90] J. Fekete. The InfoVis Toolkit. In IEEE Symposium on Information Visualization, INFOVIS 2004, pages 167–174. IEEE, 2004.
[91] A. Feldmann, O. Maennel, Z. M. Mao, A. Berger, and B. Maggs. Locating internet routing instabilities. In SIGCOMM ’04: Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications, pages 205–218, New York, NY, USA, 2004. ACM.
[92] G. Fink, P. Muessig, and C. North. Visual correlation of host processes and network traffic. In Visualization for Computer Security, 2005 (VizSEC 05). IEEE Workshop on, pages 11–19. IEEE, 2005.
[93] F. Fischer, F. Mansmann, D. Keim, S. Pietzko, and M. Waldvogel. Large-scale network monitoring for visual analysis of attacks.
In Visualization for Computer Security: 5th International Workshop, VizSec 2008, Cambridge, MA, USA, September 15, 2008, Proceedings, page 111, 2008.
[94] M. Fisk and G. Varghese. Fast content-based packet handling for intrusion detection. Technical Report CS2001-0670, UCSD, 2001.
[95] G. Florez, S. Bridges, and R. Vaughn. An improved algorithm for fuzzy data mining for intrusion detection. In Fuzzy Information Processing Society, 2002. Proceedings. NAFIPS. 2002 Annual Meeting of the North American, pages 457–462, 2002.
[96] S. Foresti and J. Agutter. VisAlert: From Idea to Product. In VizSEC 2007, pages 159–174. Springer, 2008.
[97] M. Fossi, D. Turner, E. Johnson, T. Mack, T. Adams, J. Blackbird, M. K. Low, D. McKinney, M. Dacier, A. Keromytis, C. Leita, M. Cova, J. Overton, and O. Thonnard. Symantec report on rogue security software. Whitepaper, Symantec, October 2009.
[98] V. Ganti, R. Ramakrishnan, J. Gehrke, and A. Powell. Clustering large datasets in arbitrary metric spaces. In Proceedings of the 15th International Conference on Data Engineering, ICDE’99, pages 502–, Washington, DC, USA, 1999. IEEE Computer Society.
[99] J. Gao, G. Hu, X. Yao, and R. Chang. Anomaly detection of network traffic based on wavelet packet. In Communications, 2006. APCC ’06. Asia-Pacific Conference on, pages 1–5, 312006-sept.1 2006.
[100] L. Gao. On inferring autonomous system relationships in the Internet. IEEE/ACM Trans. Netw., 9(6):733–745, 2001.
[101] T. D. Garvey and T. F. Lunt. Model-based intrusion detection. In Proceedings of the 14th National Computer Security Conference, 1991.
[102] G. Giacinto, R. Perdisci, and F. Roli. Alarm clustering for intrusion detection systems in computer networks. In P. Perner and A. Imiya, editors, MLDM, volume 3587 of Lecture Notes in Computer Science, pages 184–193. Springer, 2005.
[103] V. Gill, J. Heasley, and D. Meyer. The bgp ttl security hack (btsh).
Presentation at NANOG-27 meeting, October 2001.
[104] V. Gill, J. Heasley, D. Meyer, P. Savola, and C. Pignataro. The generalized ttl security mechanism (gtsm). RFC 5082, Internet Engineering Task Force, October 2007.
[105] J. Glanfield, S. Brooks, T. Taylor, D. Paterson, C. Smith, C. Gates, and J. McHugh. Over flow: An overview visualization for network analysis. In Visualization for Cyber Security, 2009. VizSec 2009. 6th International Workshop on, pages 11–19. IEEE, 2010.
[106] J. Goodall, W. Lutters, P. Rheingans, and A. Komlodi. Preserving the big picture: Visual network traffic analysis with tnv. In Visualization for Computer Security, 2005 (VizSEC 05). IEEE Workshop on, pages 47–54. IEEE, 2005. http://tnv.sourceforge.net/.
[107] J. Goodall and M. Sowul. VIAssist: Visual analytics for cyber defense. In Technologies for Homeland Security, 2009. HST’09. IEEE Conference on, pages 143–150. IEEE, 2009.
[108] G. Goodell, W. Aiello, T. Griffin, J. Ioannidis, and P. McDaniel. Working around bgp: An incremental approach to improving security and accuracy of interdomain routing. In Proc. of Internet Society Symposium on Network and Distributed System Security (NDSS03), February 2003.
[109] J. Goodman. IP Addresses in Email Clients. In First Conference on Email and Anti-Spam, Mountain View, CA, 2004.
[110] A. K. Ghosh, J. Wanken, and F. Charron. Detecting anomalous and unknown intrusions against programs. In ACSAC, pages 259–267. IEEE Computer Society, 1998.
[111] M. Grabisch, T. Murofushi, M. Sugeno, and J. Kacprzyk. Fuzzy Measures and Integrals. Theory and Applications. Physica Verlag, Berlin, 2000.
[112] T. Griffin. What is the sound of one route flapping? Presentation at the Network Modeling and Simulation Summer Workshop, 2002.
[113] S. Guha, R. Rastogi, and K. Shim. ROCK: A robust clustering algorithm for categorical attributes. Information Systems, 25(5):345–366, 2000.
[114] H. Hajji.
Statistical analysis of network traffic for adaptive faults detection. Neural Networks, IEEE Transactions on, 16(5):1053–1063, September 2005.
[115] J. M. Hall. Isnids, a network intrusion detection system inspired by the human immune system. Technical Report CSDS-DF-TR-03-12, CSDS, 2002.
[116] W. Harrop and G. Armitage. Real-time collaborative network monitoring and control using 3D game engines for representation and interaction. In Proceedings of the 3rd international workshop on Visualization for computer security, pages 31–40. ACM, 2006.
[117] A. Heffernan. Protection of BGP Sessions via the TCP MD5 Signature Option. RFC 2385 (Proposed Standard), August 1998.
[118] C. Hepner and E. Zmijewski. Defending Against BGP Man-In-The-Middle Attacks. Slides, February 2009. Black Hat DC. Arlington, VA. Renesys Corporation. http://www.renesys.com/tech/presentations/pdf/blackhat-09.pdf.
[119] J. A. Hoagland and S. Staniford. Viewing ids alerts: Lessons from snortsnarf. DARPA Information Survivability Conference and Exposition, 1:0374, 2001. http://sourceforge.net/projects/snortsnarf/.
[120] S. A. Hofmeyr and S. Forrest. Immunizing computer networks: Getting all the machines in your network to fight the hacker disease. In Proc. of the 1999 IEEE Symp. on Security and Privacy, pages 9–12. IEEE Computer Society Press, 1998.
[121] S.-C. Hong, H.-T. Ju, and J. W. Hong. IP prefix hijacking detection using idle scan. In APNOMS’09: Proceedings of the 12th Asia-Pacific network operations and management conference on Management enabling the future internet for changing business and new computing services, pages 395–404, Berlin, Heidelberg, 2009. Springer-Verlag.
[122] C. Hood and C. Ji. Intelligent network monitoring. In Neural Networks for Signal Processing [1995] V. Proceedings of the 1995 IEEE Workshop, pages 521–530, August 1995.
[123] C. S. Hood and C. Ji. Proactive network fault detection. In INFOCOM, pages 1147–1155, 1997.
[124] X. Hu and Z. M. Mao. Accurate Real-time Identification of IP Prefix Hijacking. In SP ’07: Proceedings of the 2007 IEEE Symposium on Security and Privacy, pages 3–17, Washington, DC, USA, 2007. IEEE Computer Society.
[125] Y.-C. Hu, A. Perrig, and D. B. Johnson. Efficient security mechanisms for routing protocols. In Proc. NDSS03, pages 57–73, 2003.
[126] Y.-C. Hu, A. Perrig, and M. A. Sirbu. Spv: secure path vector routing for securing bgp. In R. Yavatkar, E. W. Zegura, and J. Rexford, editors, SIGCOMM, pages 179–192. ACM, 2004.
[127] C.-T. Huang, S. Thareja, and Y.-J. Shin. Wavelet-based real time detection of network traffic anomalies. I. J. Network Security, 6(3):309–320, 2008.
[128] L. Huang, X. Nguyen, M. N. Garofalakis, J. M. Hellerstein, M. I. Jordan, A. D. Joseph, and N. Taft. Communication-efficient online detection of network-wide anomalies. In INFOCOM, pages 134–142. IEEE, 2007.
[129] P. Huang, A. Feldmann, and W. Willinger. A non-intrusive, wavelet-based approach to detecting network performance problems. In Proceedings of ACM SIGCOMM Internet Measurement Workshop, November 2001.
[130] Y.-A. Huang, W. Fan, W. Lee, and P. S. Yu. Cross-feature analysis for detecting ad-hoc routing anomalies. In ICDCS, pages 478–. IEEE Computer Society, 2003.
[131] Hurricane Electric. BGP Toolkit. http://bgp.he.net/. [Online; accessed 22-Jan-2011].
[132] G. Huston, M. Rossi, and G. Armitage. Securing bgp - a literature survey. Communications Surveys & Tutorials, IEEE, PP(99):1–24, 2010.
[133] K. Hwang, M. Cai, Y. Chen, and M. Qin. Hybrid intrusion detection with weighted signature generation over anomalous internet episodes. IEEE Trans. Dependable Sec. Comput., 4(1):41–55, 2007.
[134] IBM. Iss, realsecure. http://www.iss.net, 2010.
[135] T. Ide and H. Kashima. Eigenspace-based anomaly detection in computer systems. In W. Kim, R. Kohavi, J. Gehrke, and W. DuMouchel, editors, KDD, pages 440–449. ACM, 2004.
[136] K. Ilgun.
Ustat - a real-time intrusion detection system for unix. Master thesis, University of California at Santa Barbara, November 1992.
[137] A. Inselberg and B. Dimsdale. Parallel coordinates: a tool for visualizing multidimensional geometry. In Proceedings of the 1st conference on Visualization’90, pages 361–378. IEEE Computer Society Press, 1990.
[138] B. Irwin and N. Pilkington. High level internet scale traffic visualization using hilbert curve mapping. In VizSEC 2007, pages 147–158. Springer, 2008.
[139] B. Irwin and J. Riel. Using inetvis to evaluate snort and bro scan detection on a network telescope. VizSEC 2007, pages 255–273, 2008. http://www.cs.ru.ac.za/research/g02v2468/inetvis.html.
[140] S. Jajodia, P. Liu, V. Swarup, and C. Wang, editors. Cyber Situational Awareness: Issues and Research, volume 46 of Advances in Information Security. Springer, Nov 2009.
[141] J. Janies. Existence plots: A low-resolution time series for port behavior analysis. Visualization for Computer Security, pages 161–168, 2008.
[142] J. P. John, A. Moshchuk, S. D. Gribble, and A. Krishnamurthy. Studying spamming botnets using Botlab. In NSDI’09: Proceedings of the 6th USENIX symposium on Networked systems design and implementation, pages 291–306, Berkeley, CA, USA, 2009. USENIX Association.
[143] B. Johnson and B. Shneiderman. Tree-maps: a space-filling approach to the visualization of hierarchical information structures. In Proceedings of the 2nd conference on Visualization ’91, VIS ’91, pages 284–291, Los Alamitos, CA, USA, 1991. IEEE Computer Society Press.
[144] K. Julisch. Applications of Data Mining in Computer Security, volume 6 of Advances in Information Security, chapter Data Mining For Intrusion Detection - A Critical Review. Springer, 2002.
[145] K. Julisch. Clustering intrusion detection alarms to support root cause analysis. ACM Trans. Inf. Syst. Secur., 6(4):443–471, 2003.
[146] K. Julisch and M. Dacier.
Mining intrusion detection alarms for actionable knowledge. In Proceedings of the 8th ACM International Conference on Knowledge Discovery and Data Mining, 2002.
[147] C. Kanich, C. Kreibich, K. Levchenko, B. Enright, G. M. Voelker, V. Paxson, and S. Savage. Spamalytics: an empirical analysis of spam marketing conversion. In Proceedings of the 15th ACM conference on Computer and communications security, CCS ’08, pages 3–14, New York, NY, USA, 2008. ACM.
[148] J. Karlin, S. Forrest, and J. Rexford. Pretty good bgp: Improving bgp by cautiously adopting routes. In Network Protocols, 2006. ICNP’06. Proceedings of the 2006 14th IEEE International Conference on, pages 290–299, 2006.
[149] E. Katz-Bassett, H. V. Madhyastha, J. P. John, A. Krishnamurthy, D. Wetherall, and T. Anderson. Studying black holes in the internet with hubble. In Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, NSDI’08, pages 247–262, Berkeley, CA, USA, 2008. USENIX Association.
[150] KDD. The third international knowledge discovery and data mining tools competition dataset (kdd99 cup). http://kdd.ics.uci.edu/databases/kddcup99.html.
[151] D. Keim. Designing pixel-oriented visualization techniques: Theory and applications. Visualization and Computer Graphics, IEEE Transactions on, 6(1):59–78, 2000.
[152] R. Kemmerer and G. Vigna. Intrusion detection: A brief history and overview. IEEE Computer, 35(4):27–30, April 2002.
[153] S. Kent. IP Authentication Header. RFC 4302 (Proposed Standard), December 2005.
[154] S. Kent. Ip encapsulating security payload (esp). RFC 4303 (Proposed Standard), Dec. 2005.
[155] S. Kent, C. Lynn, and K. Seo. Secure border gateway protocol (s-bgp). Selected Areas in Communications, IEEE Journal on, 18(4):582–592, Apr. 2000.
[156] S. Kent and K. Seo. Security Architecture for the Internet Protocol. RFC 4301 (Proposed Standard), December 2005.
[157] S. T. Kent.
Securing the border gateway protocol: A status update. In A. Lioy and D. Mazzocchi, editors, Communications and Multimedia Security, volume 2828 of Lecture Notes in Computer Science, pages 40–53. Springer, 2003.
[158] S. T. Kent, C. Lynn, J. Mikkelson, and K. Seo. Secure border gateway protocol (s-bgp) - real world performance and deployment issues. In NDSS. The Internet Society, 2000.
[159] L. Khan, M. Awad, and B. M. Thuraisingham. A new intrusion detection system using support vector machines and hierarchical clustering. VLDB J., 16(4):507–521, 2007.
[160] R. Kisteleki. Filtering After Recent Chinese “BGP Hijack” Does not Affect RIPE Region. http://labs.ripe.net/Members/kistel/content-recent-chinese-bgp-hijack-does-not-affect-ripe. [Online; accessed 10-Apr-2010].
[161] C. Kleiber and A. Zeileis. Applied Econometrics with R. Springer, 2008. http://www.r-project.org/index.html.
[162] J. Kline, S. Nam, P. Barford, D. Plonka, and A. Ron. Traffic anomaly detection at fine time scales with bayes nets. In Proceedings of the International Conference on Internet Monitoring and Protection (ICIMP ’08), June 2008.
[163] H. Koike and K. Ohno. SnortView: visualization system of snort logs. In Proceedings of the 2004 ACM workshop on Visualization and data mining for computer security, pages 143–147. ACM, 2004.
[164] H. Koike, K. Ohno, and K. Koizumi. Visualizing cyber attacks using ip matrix. Visualization for Computer Security, IEEE Workshops on, 0:11, 2005.
[165] A. Komlodi, P. Rheingans, U. Ayachit, J. Goodall, and A. Joshi. A user-centered look at glyph-based security visualization. Visualization for Computer Security, IEEE Workshops on, 2005.
[166] S. Krasser, G. Conti, J. Grizzard, J. Gribschaw, and H. Owen. Real-time and forensic network data analysis using animated and coordinated visualization. In Proceedings of the 6th IEEE Information Assurance Workshop, volume 142. Citeseer, 2005.
[167] C. Kreibich, C. Kanich, K.
Levchenko, B. Enright, G. M. Voelker, V. Paxson, and S. Savage. On the spam campaign trail. In LEET’08: Proceedings of the 1st Usenix Workshop on Large-Scale Exploits and Emergent Threats, pages 1–9, Berkeley, CA, USA, 2008. USENIX Association.
[168] C. Kreibich, C. Kanich, K. Levchenko, B. Enright, G. M. Voelker, V. Paxson, and S. Savage. Spamcraft: an inside look at spam campaign orchestration. In Proceedings of the 2nd USENIX conference on Large-scale exploits and emergent threats: botnets, spyware, worms, and more, LEET’09, pages 4–4, Berkeley, CA, USA, 2009. USENIX Association.
[169] C. Kruegel, D. Mutz, W. K. Robertson, and F. Valeur. Bayesian event classification for intrusion detection. In ACSAC, pages 14–23. IEEE Computer Society, 2003.
[170] C. Kruegel and T. Toth. Using decision trees to improve signature-based intrusion detection. In G. Vigna, E. Jonsson, and C. Kruegel, editors, RAID, volume 2820 of Lecture Notes in Computer Science, pages 173–191. Springer, 2003.
[171] C. Kruegel, T. Toth, and C. Kerer. Decentralized event correlation for intrusion detection. In K. Kim, editor, ICISC, volume 2288 of Lecture Notes in Computer Science, pages 114–131. Springer, 2001.
[172] S. Kumar and E. H. Spafford. A Pattern Matching Model for Misuse Intrusion Detection. In Proceedings of the 17th National Computer Security Conference, pages 11–21, 1994.
[173] S. Kumar and E. H. Spafford. An application of pattern matching in intrusion detection. Technical Report CSD-TR-94-013, Purdue University, 1994.
[174] C. Labovitz. Additional discussion of the april china bgp hijack incident. http://asert.arbornetworks.com/2010/11/additional-discussion-of-the-april-china-bgp-hijack-incident/. [Online; accessed 10-Apr-2010].
[175] C. Labovitz. China Hijacks 15% of Internet Traffic? http://asert.arbornetworks.com/2010/11/china-hijacks-15-of-internet-traffic/. [Online; accessed 10-Apr-2010].
[176] M. Lad, D. Massey, D. Pei, Y. Wu, B.
Zhang, and L. Zhang. PHAS: A Prefix Hijack Alert System. In USENIX-SS’06: Proceedings of the 15th conference on USENIX Security Symposium, Berkeley, CA, USA, 2006. USENIX Association.
[177] M. Lad, D. Massey, and L. Zhang. Visualizing internet routing changes. IEEE Transactions on Visualization and Computer Graphics, pages 1450–1460, 2006. http://linkrank.cs.ucla.edu/.
[178] M. Lad, A. Nanavati, D. Massey, and L. Zhang. An algorithmic approach to identifying link failures. In PRDC, pages 25–34. IEEE Computer Society, 2004.
[179] A. Lakhina, M. Crovella, and C. Diot. Mining anomalies using traffic feature distributions. In R. Guerin, R. Govindan, and G. Minshall, editors, SIGCOMM, pages 217–228. ACM, 2005.
[180] K. Lakkaraju, W. Yurcik, R. Bearavolu, and A. Lee. NVisionIP: an interactive network flow visualization tool for security. In Systems, Man and Cybernetics, 2004 IEEE International Conference on, volume 3, pages 2675–2680. IEEE, 2005.
[181] S. Lau. The spinning cube of potential doom. Communications of the ACM, 47(6):25–26, 2004. http://www.nersc.gov/nusers/security/TheSpinningCube.php.
[182] C. Lee and J. Copeland. Flowtag: a collaborative attack-analysis, reporting, and sharing tool for security researchers. In Proceedings of the 3rd international workshop on Visualization for computer security, pages 103–108. ACM, 2006. http://chrislee.dhs.org/projects/flowtag.html.
[183] C. Lee, J. Trost, N. Gibbs, R. Beyah, and J. Copeland. Visual firewall: real-time network security monitor. In Visualization for Computer Security, 2005 (VizSEC 05). IEEE Workshop on, pages 129–136. IEEE, 2005.
[184] W. Lee, S. Stolfo, and K. Mok. A data mining framework for building intrusion detection models. In Proceedings of the 1999 IEEE Symposium on Security and Privacy, pages 120–132, 1999.
[185] W. Lee and S. J. Stolfo. Combining knowledge discovery and knowledge engineering to build IDSs.
In RAID ’99: Proceedings of the 3th International Symposium on Recent Advances in Intrusion Detection, 1999.
[186] W. Lee, S. J. Stolfo, and K. W. Mok. A data mining framework for building intrusion detection models. In IEEE Symposium on Security and Privacy, pages 120–132, 1999.
[187] C. Leita, U. Bayer, and E. Kirda. Exploiting diverse observation perspectives to get insights on the malware landscape. In DSN 2010, 40th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, June 28-July 1, 2010, Fairmont Chicago, USA, 06 2010.
[188] C. Leita and M. Dacier. Sgnet: A worldwide deployable framework to support the analysis of malware threat models. In Seventh European Dependable Computing Conference, EDCC 2008, pages 99–109, 2008.
[189] K. Leung and C. Leckie. Unsupervised anomaly detection in network intrusion detection using clusters. In V. Estivill-Castro, editor, ACSC, volume 38 of CRPIT, pages 333–342. Australian Computer Society, 2005.
[190] L. Lewis. A case-based reasoning approach to the management of faults in communication networks. In INFOCOM ’93. Proceedings. Twelfth Annual Joint Conference of the IEEE Computer and Communications Societies. Networking: Foundation for the Future. IEEE, pages 1422–1429 vol.3, 1993.
[191] W. Li. Using genetic algorithm for network intrusion detection. In Proceedings of the United States Department of Energy Cyber Security Group 2004 Training Conference, pages 24–27, 2004.
[192] Z. Li, A. Goyal, Y. Chen, and V. Paxson. Automating analysis of large-scale botnet probing events. In Proc. of ASIACCS, March 2009.
[193] U. Lindqvist and P. Porras. Detecting computer and network misuse through the production-based expert system toolset (p-best). In Security and Privacy, 1999. Proceedings of the 1999 IEEE Symposium on, pages 146–161, 1999.
[194] Y. Livnat, J. Agutter, S. Moon, R. Erbacher, and S. Foresti. A visualization paradigm for network intrusion detection.
In Information Assurance Workshop, 2005. IAW’05. Proceedings from the Sixth Annual IEEE SMC, pages 92–99. IEEE, 2005. [195] Los Angeles, University of California. Internet topology collection. http://irl. cs.ucla.edu/. [Online; accessed 13-Jan-2010]. [196] J. Luo. Integrating fuzzy logic with data mining methods for intrusion detection. Master’s thesis, Mississippi State University, 1999. [197] A. Magnaghi, T. Hamada, and T. Katsuyama. A wavelet-based framework for proactive detection of network misconfigurations. In Proceedings of SIGCOMM 2004, 2004. [198] R. Mahajan, D. Wetherall, and T. Anderson. Understanding BGP Misconfiguration. In SIGCOMM ’02: Proceedings of the 2002 conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pages 3–16, New York, NY, USA, 2002. ACM. [199] M. V. Mahoney and P. K. Chan. Learning rules for anomaly detection of hostile network traffic. In ICDM, pages 601–604. IEEE Computer Society, 2003. [200] F. Mansmann, L. Meier, and D. Keim. Visualization of host behavior for network security. VizSEC 2007, pages 187–202, 2008. FP7-ICT-257495-VIS-SENSE 107 Bibliography [201] Z. M. Mao, R. Bush, T. Griffin, and M. Roughan. Bgp beacons. In Internet Measurement Comference, pages 1–14. ACM, 2003. [202] J. Marin, D. Ragsdale, and J. Sirdu. A hybrid approach to the profile creation and intrusion detection. In DARPA Information Survivability Conference Exposition II, 2001. DISCEX ’01. Proceedings, volume 1, pages 69 –76 vol.1, 2001. [203] C. McArthur and M. Guirguis. Stealthy IP Prefix Hijacking: Don’t Bite Off More Than You Can Chew. In Global Telecommunications Conference, GLOBECOM 2009, pages 1–6. IEEE, 2009. [204] S. Mccreary. BGP Core Routing Table Size. dynamics/. [Online; accessed 13-Jan-2011]. http://www.routeviews.org/ [205] C. McCue. Data Mining and Predictive Analysis: Intelligence Gathering and Crime Analysis. Butterworth-Heinemann (Elsevier), May 2007, 2007. [206] R. McMillan. 
A Chinese ISP Momentarily Hijacks the Internet. http://www.nytimes.com/external/idg/2010/04/08/ 08idg-a-chinese-isp-momentarily-hijacks-the-internet-33717.html. [Online; accessed 13-Apr-2010]. [207] J. McPherson, K. Ma, P. Krystosk, T. Bartoletti, and M. Christensen. Portvis: a tool for port-based detection of security events. In Proceedings of the 2004 ACM workshop on Visualization and data mining for computer security, pages 73–81. ACM, 2004. [208] J. Mena. Investigative Data Mining for Security and Criminal Detection. Butterworth-Heinemann (Elsevier,) Avril 2003, 2003. [209] R. C. Merkle. Protocols for public key cryptosystems. In IEEE Symposium on Security and Privacy, pages 122–134, 1980. [210] I. Mierswa, M. Wurst, R. Klinkenberg, M. Scholz, and T. Euler. Yale: Rapid prototyping for complex data mining tasks. In L. Ungar, M. Craven, D. Gunopulos, and T. Eliassi-Rad, editors, KDD ’06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 935–940, New York, NY, USA, August 2006. ACM. http://rapid-i.com/content/view/ 181/196/. 108 SEVENTH FRAMEWORK PROGRAMME Bibliography [211] S. Ming, S. Wu, X. Zhao, and K. Zhang. On reverse engineering the management actions from observed bgp data. In INFOCOM Workshops 2008, IEEE, pages 1 –6, April 2008. [212] B. Morin and H. Debar. Correlation of intrusion symptoms: An application of chronicles. In G. Vigna, E. Jonsson, and C. Kruegel, editors, RAID, volume 2820 of Lecture Notes in Computer Science, pages 94–112. Springer, 2003. [213] S. Mukkamala and A. Sung. Feature selection for intrusion detection using neural networks and support vector machines. Journal of the Transportation Research Board, 2003:33–39, 2003. [214] S. Mukkamala and A. H. Sung. Identifying key features for intrusion detection using neural networks. In Proceedings of the 15th international conference on Computer communication, ICCC ’02, pages 1132–1138, Washington, DC, USA, 2002. 
International Council for Computer Communication. [215] S. Mukkamala, A. H. Sung, and A. Abraham. Intrusion detection systems using adaptive regression splines. In 6th International Conference on Enterprise Information Systems, ICEIS’04, pages 26–33. Kluwer Academic Press, 2004. [216] R. NCC. Routing Information Service. http://www.ripe.net/ris/. [Online; accessed 13-Jan-2011]. [217] T. Ndousse and T. Okuda. Computational intelligence for distributed fault management in networks using fuzzy cognitive maps. In Communications, 1996. ICC 96, Conference Record, Converging Technologies for Tomorrow’s Applications. 1996 IEEE International Conference on, volume 3, pages 1558 –1562 vol.3, June 1996. [218] P. Ning, Y. Cui, and D. S. Reeves. Constructing attack scenarios through correlation of intrusion alerts. In V. Atluri, editor, ACM Conference on Computer and Communications Security, pages 245–254. ACM, 2002. [219] P. Ning, Y. Cui, D. S. Reeves, and D. Xu. Techniques and tools for analyzing intrusion alerts. ACM Trans. Inf. Syst. Secur., 7(2):274–318, 2004. [220] P. Ning and D. Xu. Learning attack strategies from intrusion alerts. In in Proceedings of 10th ACM Conference on Computer and Communications Security (CCS03, pages 200–209. ACM Press, 2003. FP7-ICT-257495-VIS-SENSE 109 Bibliography [221] P. Ning and D. Xu. Hypothesizing and reasoning about attacks missed by intrusion detection systems. ACM Trans. Inf. Syst. Secur., 7(4):591–627, 2004. [222] O. Nordstrom and C. Dovrolis. Beware of BGP attacks. Computer Communication Review, 34(2):1–8, 2004. [223] J. Oberheide, M. Goff, and M. Karir. Flamingo: Visualizing internet traffic. In Network Operations and Management Symposium, 2006. NOMS 2006. 10th IEEE/IFIP, pages 150–161. IEEE, 2006. [224] J. Oberheide, M. Karir, and D. Blazakis. VAST: visualizing autonomous system topology. In Proceedings of the 3rd international workshop on Visualization for computer security, pages 71–80. ACM, 2006. [225] R. Oliveira. 
Cyclops: The internet as-level observatory. Slides and video: http://www.nanog.org/meetings/nanog43/abstracts.php?pt= NTkmbmFub2c0Mw==&nm=nanog43. [Online; accessed 13-June-2008]. [226] I. Onut, B. Zhu, and A. Ghorbani. A novel visualization technique for network anomaly detection. In proc. 2nd Annual Conf. on Privacy Security and trust, pages 167–174. Citeseer, 2004. [227] P. v. Oorschot, T. Wan, and E. Kranakis. On interdomain routing security and pretty secure bgp (psbgp). ACM Trans. Inf. Syst. Secur., 10, July 2007. [228] Packet Clearing House. http://www.pch.net/home/index.php. [Online; accessed 13-Jan-2011]. [229] R. Pang, V. Yegneswaran, P. Barford, V. Paxson, and L. Peterson. Characteristics of Internet Background Radiation. In Proceedings of the 4th ACM SIGCOMM conference on the Internet Measurement, 2004. [230] A. Pathak, F. Qian, Y. C. Hu, Z. M. Mao, and S. Ranjan. Botnet spam campaigns can be long lasting: evidence, implications, and analysis. In Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems, SIGMETRICS ’09, pages 13–24, New York, NY, USA, 2009. ACM. [231] V. Paxson. Bro: a system for detecting network intruders in real-time. Computer Networks, 31(23-24):2435–2463, 1999. [232] J. Pearlman and P. Rheingans. Visualizing network security events using compound glyphs from a service-oriented perspective. VizSEC 2007, pages 131–146, 2008. 110 SEVENTH FRAMEWORK PROGRAMME Bibliography [233] J. Peng, C. Feng, and J. W. Rozenblit. A hybrid intrusion detection and visualization system. In ECBS, pages 505–506. IEEE Computer Society, 2006. [234] V.-H. Pham and M. Dacier. Honeypot traces forensics : the observation view point matters. In NSS 2009, 3rd International Conference on Network and System Security, October 19-21, 2009, Gold Coast, Australia, Dec 2009. [235] V.-H. Pham, M. Dacier, G. Urvoy Keller, and T. En Najjary. The quest for multiheaded worms. 
In DIMVA 2008, 5th Conference on Detection of Intrusions and Malware & Vulnerability Assessment, July 10-11th, 2008, Paris, France, Jul 2008. [236] D. Phan, J. Gerth, M. Lee, A. Paepcke, and T. Winograd. Visual analysis of network flow data with timelines and event plots. VizSEC 2007, pages 85–99, 2008. [237] W. Pike, C. Scherrer, and S. Zabriskie. Putting security in context: Visual correlation of network activity with real-world information. VizSEC 2007, pages 203–220, 2008. [238] A. Pilosov and T. Kapela. Stealing The Internet: An Internet-Scale Man In The Middle Attack. http://www.defcon.org/images/defcon-16/ dc16-presentations/defcon-16-pilosov-kapela.pdf. [Online; accessed 20Aug-2008]. [239] A. C. Popescu, B. J. Premore, and T. Underwood. The Anatomy of a Leak: AS9121. http://www.renesys.com/tech/presentations/pdf/ renesys-nanog34.pdf. [Online; accessed 13-May-2005]. [240] P. Porras and R. Kemmerer. Penetration state transition analysis: A rule-based intrusion detection approach. In Computer Security Applications Conference, 1992. Proceedings., Eighth Annual, pages 220 –229, November 1992. [241] P. A. Porras, M. W. Fong, and A. Valdes. A mission-impact-based approach to infosec alarm correlation. In RAID, pages 95–114, 2002. [242] F. Pouget and M. Dacier. Honeypot-based forensics. In Proceedings of AusCERT Asia Pacific Information Technology Security Conference, May 2004. [243] M. B. Prince, B. M. Dahl, L. Holloway, A. M. Keller, and E. Langheinrich. Understanding How Spammers Steal Your E-Mail Address: An Analysis of the First Six Months of Data from Project Honey Pot. In CEAS 2005 - Second Conference FP7-ICT-257495-VIS-SENSE 111 Bibliography on Email and Anti-Spam, July 21-22, 2005, Stanford University, California, USA, 2005. [244] T. S. Project. Snort 2.0, open source network intrusion detection system. http://www.snort.org. [245] T. Qin, X. Guan, W. Li, and P. Wang. Monitoring abnormal traffic flows based on independent component analysis. 
In ICC, pages 1–5. IEEE, 2009. [246] X. Qin and W. Lee. Statistical causality analysis of infosec alert data. In G. Vigna, E. Jonsson, and C. Kruegel, editors, RAID, volume 2820 of Lecture Notes in Computer Science, pages 73–93. Springer, 2003. [247] J. Qiu, L. Gao, S. Ranjan, and A. Nucci. Detecting bogus BGP route information: Going beyond prefix hijacking. In Security and Privacy in Communications Networks and the Workshops, 2007. SecureComm 2007., pages 381–390, 2007. [248] A. Ramachandran and N. Feamster. Understanding the network-level behavior of spammers. In SIGCOMM ’06: Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications, pages 291– 302, New York, NY, USA, 2006. ACM. [249] M. Ramadas, S. Ostermann, and B. C. Tjaden. Detecting anomalous network traffic with self-organizing maps. In G. Vigna, E. Jonsson, and C. Kruegel, editors, RAID, volume 2820 of Lecture Notes in Computer Science, pages 36–54. Springer, 2003. [250] P. Ren, Y. Gao, Z. Li, Y. Chen, and B. Watson. IDGraphs: intrusion detection and analysis using histographs. Visualization for Computer Security, IEEE Workshops on, 2005. [251] RIPE. YouTube Hijacking: A RIPE NCC RIS case study. http://www.ripe. net/news/study-youtube-hijacking.html. [Online; accessed 13-Jan-2011]. [252] Robtex. AS Analysis. http://www.robtex.com/as/. [Online; accessed 13-Jan2011]. [253] L. F. Salim and A. Mezrioui. Improving the quality of alerts with correlation in intrusion detection. IJCSNS International Journal of Computer Science and Network Security, 7(12):210–215, 2007. 112 SEVENTH FRAMEWORK PROGRAMME Bibliography [254] D. Schnackenberg, K. Djahandari, and D. Sterne. Infrastructure for intrusion detection and response. In DARPA Information Survivability Conference and Exposition, 2000. DISCEX ’00. Proceedings, volume 2, pages 3 –11 vol.2, 2000. [255] D. Schnackengerg, H. Holliday, R. Smith, K. Djahandari, and D. Sterne. 
Cooperative intrusion traceback and response architecture (citra). In DARPA Information Survivability Conference Exposition II, 2001. DISCEX ’01. Proceedings, volume 1, pages 56 –68 vol.1, 2001. [256] K. Sequira and M. Zaki. ADMIT: Anomaly-based Data Mining for Intrusions. In SIGKDD Conference, 2002. [257] G. Shafer. A mathematical theory of evidence. Princeton university press, 1976. [258] J. Shearer, K. Ma, and T. Kohlenberg. BGPeep: An IP-Space Centered View for Internet Routing Data. Visualization for Computer Security, pages 95–110, 2008. [259] H. Shiravi, A. Shiravi, and A. Ghorbani. IDS Alert Visualization and Monitoring through Heuristic Host Selection. Information and Communications Security, pages 445–458, 2010. [260] N. Spring, R. Mahajan, and T. Anderson. Quantifying the causes of path inflation. In SIGCOMM ’03: Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, pages 113–124, New York, NY, USA, 2003. ACM. [261] K. Sriram, D. Montgomery, O. Borchert, O. Kim, and D. R. Kuhn. Study of BGP peering session attacks and their impacts on routing performance. IEEE Journal on Selected Areas in Communications, 24(10):1901–1915, 2006. [262] H. Stern. A survey of modern spam tools. In Fifth Conference on Email and Anti-Spam, Mountain View, CA, 2008. [263] B. Stone-Gross, M. Cova, L. Cavallaro, B. Gilbert, M. Szydlowski, R. Kemmerer, C. Kruegel, and G. Vigna. Your botnet is my botnet: analysis of a botnet takeover. In CCS ’09: Proceedings of the 16th ACM conference on Computer and communications security, pages 635–647, New York, NY, USA, 2009. ACM. [264] B. Stone-Gross, A. Moser, C. Kruegel, E. Kirda, and K. Almeroth. FIRE: FInding Rogue nEtworks. In Proceedings of the Annual Computer Security Applications Conference (ACSAC), Honolulu, HI, December 2009. FP7-ICT-257495-VIS-SENSE 113 Bibliography [265] Symantec MessageLabs Intelligence. In the battle of the botnets rustock remains dominant. 
Monthly report, August 2010. [266] Symantec MessageLabs Intelligence. Survival of the fittest: Selfish botnets dominate the spam landscape as rustock becomes the largest botnet; linux takes a share of spam from windows. Monthly report, April 2010. [267] Symantec.cloud. Messagelabs hosted email antispam. http://www.messagelabs. com. [Online; accessed 13-Jan-2011]. [268] M. Tahara, N. Tateishi, T. Oimatsu, and S. Majima. A Method to Detect Prefix Hijacking by Using Ping Tests. In APNOMS ’08: Proceedings of the 11th Asia-Pacific Symposium on Network Operations and Management, pages 390–398, Berlin, Heidelberg, 2008. Springer-Verlag. [269] T. Taylor, S. Brooks, and J. McHugh. NetBytes viewer: An entity-based netflow visualization utility for identifying intrusive behavior. VizSEC 2007, pages 101– 114, 2008. [270] T. Taylor, D. Paterson, J. Glanfield, C. Gates, S. Brooks, and J. McHugh. Flovis: Flow visualization system. In Conference For Homeland Security, 2009. CATCH’09. Cybersecurity Applications & Technology, pages 186–198. IEEE, 2009. http://projects.cs.dal.ca/flovis/download.html. [271] R. Teixeira, S. Agarwal, and J. Rexford. Bgp routing changes: merging views from two isps. SIGCOMM Comput. Commun. Rev., 35(5):79–82, 2005. [272] R. Teixeira and J. Rexford. A measurement framework for pin-pointing routing changes. In NetT ’04: Proceedings of the ACM SIGCOMM workshop on Network troubleshooting, pages 313–318, New York, NY, USA, 2004. ACM. [273] S. J. Templeton and K. Levitt. A requires/provides model for computer attacks. In Proceedings of New Security Paradigms Workshop, pages 31–38. ACM Press, 2000. [274] S. J. Templeton and K. E. Levitt. Detecting spoofed packets. In DISCEX (1), pages 164–. IEEE Computer Society, 2003. [275] S. T. Teoh, K.-L. Ma, S. F. Wu, D. Massey, X. Zhao, D. Pei, L. Wang, L. Zhang, and R. Bush. Visual-based anomaly detection for bgp origin as change (oasc) events. In M. Brunner and A. 
Keller, editors, DSOM, volume 2867 of Lecture Notes in Computer Science, pages 155–168. Springer, 2003. 114 SEVENTH FRAMEWORK PROGRAMME Bibliography [276] S. T. Teoh, K. L. Ma, S. F. Wu, and X. Zhao. Case study: interactive visualization for internet security. In Proceedings of the conference on Visualization ’02, VIS ’02, pages 505–508, Washington, DC, USA, 2002. IEEE Computer Society. http: //www.cs.ucdavis.edu/~ma/SecVis/. [277] S. T. Teoh, S. Ranjan, A. Nucci, and C.-N. Chuah. Bgp eye: a new visualization tool for real-time detection and analysis of bgp anomalies. In VizSEC ’06: Proceedings of the 3rd international workshop on Visualization for computer security, pages 81–90, New York, NY, USA, 2006. ACM. [278] S. T. Teoh, K. Zhang, S.-M. Tseng, K.-L. Ma, and S. F. Wu. Combining visual and automated data mining for near-real-time anomaly detection and analysis in bgp. In C. E. Brodley, P. Chan, R. Lippman, and W. Yurcik, editors, VizSEC, pages 35–44. ACM, 2004. [279] S. T. Teoh, K. Zhang, S.-M. Tseng, K.-L. Ma, and S. F. Wu. Combining visual and automated data mining for near-real-time anomaly detection and analysis in bgp. In VizSEC/DMSEC ’04: Proceedings of the 2004 ACM workshop on Visualization and data mining for computer security, pages 35–44, New York, NY, USA, 2004. ACM. [280] L. Terran. Machine Learning Techniques for the Domain of Anomaly Detection for Computer Security. PhD thesis, Purdue University, 2000. [281] J. Thomas and K. Cook, editors. Illuminating the Path: the Research and Development Agenda for Visual Analytics. IEEE, 2005. [282] O. Thonnard. A multi-criteria clustering approach to support attack attribution in cyberspace. PhD thesis, École Doctorale d’Informatique, Télécommunications et Électronique de Paris, March 2010. [283] O. Thonnard and M. Dacier. A framework for attack patterns’ discovery in honeynet data. digital investigation, 5:S128–S139, 2008. [284] O. Thonnard and M. Dacier. 
Actionable knowledge discovery for threats intelligence support using a multi-dimensional data mining methodology. In Data Mining Workshops, 2008. ICDMW ’08. IEEE International Conference on, pages 154 –163, 2008. [285] O. Thonnard, W. Mees, and M. Dacier. Addressing the attack attribution problem using knowledge discovery and multi-criteria fuzzy decision-making. In Proceedings FP7-ICT-257495-VIS-SENSE 115 Bibliography of the ACM SIGKDD Workshop on CyberSecurity and Intelligence Informatics, CSI-KDD ’09, pages 11–21, New York, NY, USA, 2009. ACM. [286] O. Thonnard, W. Mees, and M. Dacier. Behavioral Analysis of Zombie Armies. In C. Czossek and K. Geers, editors, The Virtual Battlefield: Perspectives on Cyber Warfare, volume 3 of Cryptology and Information Security Series, pages 191–210, Amsterdam, The Netherlands, 2009. IOS Press. [287] O. Thonnard, W. Mees, and M. Dacier. On a multicriteria clustering approach for attack attribution. SIGKDD Explor. Newsl., 12:11–20, November 2010. [288] M. Thottan and C. Ji. Anomaly detection in ip networks. IEEE Trans. Signal Processing, 51(8):2191–2204, 2003. Special Issue of Signal Processing in Networking. [289] A. Toonk. Chinese ISP hijacks the Internet. http://bgpmon.net/blog/?p=282. [Online; accessed 10-Apr-2010]. [290] V. Torra and Y. Narukawa. Modeling Decisions: Information Fusion and Aggregation Operators. Springer, Berlin, 2007. [291] University of California Los Angeles. Cyclops. http://cyclops.cs.ucla.edu/. [Online; accessed 13-Jan-2011]. [292] University of Memphis. NetViews. http://netlab.cs.memphis.edu/projects_ netviews.html. [Online; accessed 13-Jan-2011]. [293] University of Oregon. Route Views Project. http://www.routeviews.org/. [Online; accessed 13-Jan-2011]. [294] University of Washington. iPlane. http://iplane.cs.washington.edu/. [Online; accessed 13-Jan-2011]. [295] A. Valdes and K. Skinner. Probabilistic alert correlation. In W. Lee, L. Me, and A. 
Wespi, editors, Recent Advances in Intrusion Detection, volume 2212 of Lecture Notes in Computer Science, pages 54–68. Springer, 2001. [296] F. Valeur, G. Vigna, C. Kruegel, and R. A. Kemmerer. A comprehensive approach to intrusion detection alert correlation. IEEE Trans. Dependable Sec. Comput., 1(3):146–169, 2004. [297] I. van Beijnum. BGP. O’Reilly Media, Inc., Sebastopol, CA, USA, September 2002. 116 SEVENTH FRAMEWORK PROGRAMME Bibliography [298] F. Viegas, M. Wattenberg, F. Van Ham, J. Kriss, and M. McKeon. Manyeyes: a site for visualization at internet scale. IEEE Transactions on Visualization and Computer Graphics, pages 1121–1128, 2007. [299] G. Vigna and R. A. Kemmerer. Netstat: A network-based intrusion detection system. Journal of Computer Security, 7(1), 1999. [300] H. Wang, D. Zhang, and K. G. Shin. Detecting syn flooding attacks. In INFOCOM, 2002. [301] L. Wang, X. Zhao, D. Pei, R. Bush, D. Massey, A. Mankin, S. F. Wu, and L. Zhang. Observation and analysis of bgp behavior under stress. In Internet Measurement Workshop, pages 183–195. ACM, 2002. [302] Z. Wang and G. Klir. Fuzzy Measure Theory. Plenum Press, New York, 1992. [303] M. O. Ward. Multivariate data glyphs: Principles and practice. In Handbook of Data Visualization, Springer Handbooks Comp.Statistics, pages 179–198. Springer Berlin Heidelberg, 2008. [304] C. Wei, A. Sprague, G. Warner, and A. Skjellum. Characterization of Spam Advertised Website Hosting Strategy. In Sixth Conference on Email and Anti-Spam, Mountain View, CA, 2009. [305] C. Westphal. Data Mining for Intelligence, Fraud & Criminal Detection: Advanced Analytics & Information Sharing Technologies. CRC Press, 1st edition (December 22, 2008), 2008. [306] K. J. Wheaton. Top 5 intelligence http://sourcesandmethods.blogspot.com, [sep 2009]. analysis methods, [307] D. Wheeler and G. Larson. Techniques for cyber attack attribution. IDA Paper P-3792, Institute for Defense Analyses, Alexandria, Virginia, 2003. [308] R. White. 
Securing bgp through secure origin bgp. Internet Protocol Journal, 6(3), September 2003. [309] J. Wolf. Pentagon says “aware” of China Internet rerouting. reuters.com/article/idUSTRE6AI4HJ20101119?pageNumber=1. cessed 22-Nov-2010]. FP7-ICT-257495-VIS-SENSE http://www. [Online; ac- 117 Bibliography [310] T. Wong and C. Alaettinoglu. Internet routing anomaly detection and visualization. In Dependable Systems and Networks, 2005. DSN 2005. Proceedings. International Conference on, pages 172–181. IEEE, 2005. [311] J. Wu, Z. M. Mao, J. Rexford, and J. Wang. Finding a needle in a haystack: Pinpointing significant bgp routing changes in an ip network. In NSDI. USENIX, 2005. [312] L. Xiao, J. Gerth, and P. Hanrahan. Enhancing visual analysis of network traffic using a knowledge representation. In Visual Analytics Science And Technology, 2006 IEEE Symposium On, pages 107–114. IEEE, 2006. [313] Y. Xie, F. Yu, K. Achan, R. Panigrahy, G. Hulten, and I. Osipkov. Spamming botnets: signatures and characteristics. In SIGCOMM ’08: Proceedings of the ACM SIGCOMM 2008 conference on Data communication, pages 171–182, New York, NY, USA, 2008. ACM. [314] D. Xu and P. Ning. Alert correlation through triggering events and common resources. In ACSAC, pages 360–369. IEEE Computer Society, 2004. [315] D. Xu and P. Ning. Correlation Analysis of Intrusion Alerts. Springer, 2008. [316] K. Xu, J. Chandrashekhar, and Z. Zhang. A first step towards understanding inter-domain routing dynamics. In Proceedings of ACM SIGCOMM MINENET, Philadelphia, PA, August 2005. [317] R. Yager. On ordered weighted averaging aggregation operators in multicriteria decision-making. IEEE Trans. Syst. Man Cybern., 18(1):183–190, 1988. [318] H. Yan, R. Olivera, K. Burnett, D. Matthews, L. Zhang, and D. Massey. BGPmon: A real-time, scalable, extensible monitoring system. CATCH2009, http: //bgpmon.netsec.colostate.edu/download/publications/catch09.pdf. [Online; accessed 13-May-2009]. [319] Y. Yang, F. Deng, and H. Yang. 
An unsupervised anomaly detection approach using subtractive clustering and hidden markov model. In Proceedings of Communications and Networking in China, pages 313–316, 2007. [320] M. Yannuzzi, X. Masip-Bruin, and O. Bonaventure. Open issues in interdomain routing: a survey. IEEE Network, 19(6):49–56, 2005. 118 SEVENTH FRAMEWORK PROGRAMME Bibliography [321] V. Yegneswaran, P. Barford, and U. Johannes. Internet intrusions: global characteristics and prevalence. In SIGMETRICS, pages 138–147, 2003. [322] V. Yegneswaran, P. Barford, and V. Paxson. Using honeynets for internet situational awareness. In Fourth ACM Sigcomm Workshop on Hot Topics in Networking (Hotnets IV), 2005. [323] A. Yelizarov and D. Gamayunov. Visualization of complex attacks and state of attacked network. In Visualization for Cyber Security, 2009. VizSec 2009. 6th International Workshop on, pages 1–9. IEEE, 2010. [324] D. S. Yeung, S. Jin, and X. Wang. Covariance-matrix modeling and detecting various flooding attacks. IEEE Transactions on Systems, Man, and Cybernetics, Part A, 37(2):157–169, 2007. [325] D.-Y. Yeung and Y. Ding. Host-based intrusion detection using dynamic and static behavioral models. Pattern Recognition, 36(1):229–243, 2003. [326] X. Yin, W. Yurcik, M. Treaster, Y. Li, and K. Lakkaraju. VisFlowConnect: netflow visualizations of link relationships for security situational awareness. In Proceedings of the 2004 ACM workshop on Visualization and data mining for computer security, pages 26–34. ACM, 2004. [327] T. Zhang, R. Ramakrishnan, and M. Livny. Birch: An efficient data clustering method for very large databases. In H. V. Jagadish and I. S. Mumick, editors, Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Montreal, Quebec, Canada, June 4-6, 1996, pages 103–114. ACM Press, 1996. [328] Z. Zhang, Y. Zhang, Y. C. Hu, Z. M. Mao, and R. Bush. iSPY: Detecting IP Prefix Hijacking on My Own. 
In Proceedings of the ACM SIGCOMM 2008 conference on Data communication, SIGCOMM ’08, pages 327–338, New York, NY, USA, August 2008. ACM. [329] J.-L. Zhao, J.-F. Zhao, and J.-J. Li. Intrusion detection based on clustering genetic algorithm. In Proceedings of 2005 International Conference on Machine Learning and Cybernetics, volume 6, pages 3911–3914, 2005. [330] X. Zhao, D. Pei, L. Wang, D. Massey, A. Mankin, S. F. Wu, and L. Zhang. An analysis of BGP multiple origin AS (MOAS) conflicts. In Proceedings of the 1st ACM SIGCOMM Workshop on Internet Measurement, IMW ’01, pages 31–35, New York, NY, USA, 2001. ACM. FP7-ICT-257495-VIS-SENSE 119 Bibliography [331] C. Zheng, L. Ji, D. Pei, J. Wang, and P. Francis. A Light-Weight Distributed Scheme for Detecting IP Prefix Hijacks in Real-Time. SIGCOMM Comput. Commun. Rev., 37(4):277–288, 2007. [332] C. Zheng, L. Ji, D. Pei, J. Wang, and P. Francis. A light-weight distributed scheme for detecting ip prefix hijacks in realtime. In J. Murai and K. Cho, editors, SIGCOMM, pages 277–288. ACM, 2007. [333] L. Zhuang, J. Dunagan, D. R. Simon, H. J. Wang, and J. D. Tygar. Characterizing botnets from email spam records. In LEET’08: Proceedings of the 1st Usenix Workshop on Large-Scale Exploits and Emergent Threats, pages 1–9, Berkeley, CA, USA, 2008. USENIX Association. 120 SEVENTH FRAMEWORK PROGRAMME