Information Integration via Hierarchical and Hybrid Bayesian Networks
1 Information Integration via Hierarchical and Hybrid Bayesian Networks Haiying Tu, Jeffrey Allanach, Satnam Singh, Krishna R. Pattipati, Fellow, IEEE, Peter Willett, Fellow, IEEE Abstract— A collaboration scheme for information integration among multiple agencies (and/or various divisions within a single agency) is designed using hierarchical and hybrid Bayesian networks (HHBNs). In this scheme, raw information is represented by transactions (such as communication, travel, financing), and information entities to be integrated are modeled as random variables (such as: an event occurs, an effect exists, or an action is undertaken). Each random variable has certain states with probabilities assigned to them. Hierarchical is in terms of the model structure and hybrid stems from our usage of both general Bayesian networks (BNs) and hidden Markov models (HMMs, a special form of dynamic BNs). The general Bayesian networks are adopted in the top (decision) layer to address global assessment for a specific question (e.g., “Is target A under terrorist threat?” in the context of counter-terrorism). HMMs function in the bottom (observation) layer to report processed evidence to the upper layer BN based on the local information available to a particular agency or a division. A software tool, termed the adaptive safety analysis and monitoring (ASAM) system, is developed to implement HHBNs for information integration either in a centralized or in a distributed fashion. As an example, a terrorist attack scenario gleaned from open sources is modeled and analyzed to illustrate the functionality of the proposed framework. Index Terms— Information integration, decision making, hidden Markov models, Bayesian networks, counter-terrorism. I. I NTRODUCTION A. Motivation ECISION making in the modern information era is a complex process. Not only are the sources of information diverse, distributed and possibly conflicting, the acquired information is very likely noisy, dynamic, incomplete and uncertain. In complex decision making scenarios, a strategic decision is supported by collaboration among multiple agencies (or multiple divisions in a single agency), wherein each agency has access to a portion of the total information, and may only be responsible for part of the problem under consideration. The key issues involve not only identifying valuable information in a timely fashion, sharing this information across agencies in an efficient manner, but also to integrate large volumes of disparate information to support strategic decision making. Information technologies including information integration are vital to the national security and world-wide counterterrorism operations [1]. Terrorist organizations are typically D This work was supported by Aptima Inc., Woburn, MA 01801, USA. The authors are with Electrical and Computer Engineering Department, University of Connecticut, Storrs, CT 06269-2157, USA. Jeffrey Allanach now works at Applied Physical Science Corp., New London, CT 06320, USA. (e-mail: haiying.tu@uconn.edu, jallanach@aphysci.com, satnam/krishna/willett@engr.uconn.edu). elusive, geographically distributed across many countries, highly dynamic and adaptive. Consequently, raw information from various intelligence agencies is noisy, scattered and evolving over time. However, analysis of prior terrorist attacks suggests that a high magnitude terrorist attack requires certain enabling events to take place [2]. 
We term the raw information entities about the terrorist events “transactions”; this raw information is filtered, processed in a summary form, and reported to a higher level agency. The higher level agency can be viewed as a “fusion center” that integrates the summarized information, thus possibly providing early warning to facilitate preemption and/or support strategic decision making. Information integration covers numerous research areas, such as data mining, information extraction, machine learning, constraint reasoning, databases, view integration, web services, and other related areas. This paper utilizes Bayesian networks (BNs) and hidden Markov models (HMMs, a special form of dynamic BNs) as basic modelling techniques to address the information integration problem, with the identification and analysis of the terrorist threats as the application background. HMMs are hosted in lower level (sensing) agencies that serve as information filters; that is, they take transactions as inputs and provide local assessments as outputs that are transformed into soft evidence (i.e., local decisions and the concomitant confidence levels). BNs are maintained by higher level (decision making) agencies functioning as fusion centers, and pool the summarized information (in the form of soft evidence) to support global decisions. BNs and HMMs are therefore graphically constructed in a hierarchical fashion in our modeling framework, resulting in hierarchical and hybrid Bayesian networks (HHBNs). Why do HHBNs make sense for information integration? First, a HMM is natural in situations where there is no direct access to true states of the environment having an underlying Markovian structure. Analogous to the target tracking problem where states (location, velocity, etc.) are observed through noisy measurements, the true states of terrorist activities are detected against a background of “noise” transactions (observations). Thus, HMMs provide a realistic representation for information processing in intelligence agencies tasked to identify terrorist activities. Intelligence agencies may only obtain partial information of the complete pattern of transactions associated with a particular terrorist activity. Consequently, the hidden states of the terrorist activity are observed through another set of stochastic processes that produce the sequence of observable transactions. The key problem here is to detect a suspicious pattern (i.e., a HMM corresponding to a terrorist activity) and assess its likelihood, given a sequence of partial 2 and noisy observed transactions. Secondly, a BN incorporates uncertainty in a cause-effect modeling framework. BNs are useful for both inferential exploration of previously undetermined relationships among variables as well as descriptions of these relationships upon discovery [3]. Different pieces of information from various agencies may be related to each other, if they potentially belong to the same attack plan. More information can be inferred, if not directly collected from agencies, according to prior knowledge of causal relationships. With additional choices of counter-terrorism actions, a BN can even be used to suggest optimal action strategies (i.e., the best courses of action). Finally, the hierarchical structure of HHBN is naturally suited to information integration across multiple collaborating agencies monitoring the information space. 
A software tool, termed the adaptive safety analysis and monitoring (ASAM) system, is developed to implement the HHBNs for information integration in either a centralized or a distributed fashion. Although the ASAM system itself is developed for counter-terrorism applications, the underlying HHBN theory and the prototype software tool have broad applications in command and control, strategic framework for business information, information pooling in economic organizations, to name a few. B. Related Work HMMs are powerful statistical techniques for modeling sequential data. Although they are well-known and have been successfully applied in speech recognition, HMMs have also been used in many other areas, such as DNA sequence analysis, robot control, fault diagnosis [4], signal detection [5] [6], and so on. A tutorial on HMMs can be found in [7] [8]. BNs, also known as probabilistic networks, causal networks or belief networks, are a formalism for representing uncertainty in a way that is consistent with the axioms of probability theory [9]. With the assumption of conditional independence, BNs can model complex systems in well-structured and easily interpretable ways. A fairly large set of theoretical concepts and results can be found in [10]–[12]. Research efforts also cover combinations of different models or extensions of general HMMs and BNs. Fine et al. [13] generalized the HMM inference and learning processes to hierarchical hidden Markov models (HHMMs), and demonstrated the usability of HHMMs in hand-written English text recognition. A combination of HMM and decision tree, termed the hidden Markov decision tree (HMDT), can be found in [14]. Hierarchical Bayesian networks (HBN) are introduced in [15] to represent additional information about the structure of the domains of variables. Although represented hierarchically, the HBN inference algorithm is the same as that for a general BN when applied on a fully “flattened” HBN. In [16], a hybrid HMM/BN model is proposed to supplement acoustic spectrum features in speech recognition. The HMM is used for modeling temporal speech characteristics and a state probability model is represented by the BN. As far as we are aware, this is the only work in the literature which combines HMMs and BNs in a single model, and our work is quite different in both the application background and the representation details. BN or HMM-related research has been brought into the national security community as well. Coffman and Marcus [17] employ HMMs to identify groups with suspicious behaviors based on communication patterns among the group members. An anti-terrorist risk management tool, termed “Site Profiler® ”, was introduced in [18]. This tool applies knowledge-based BN construction and allows one to combine evidence from analytical models, simulations, historical data, and user judgments. Paté-Cornell and Guikema [19] present a model in the form of an influence diagram (a variant of the BN) for setting priorities among threats and among countermeasures, based on probabilistic risk analysis, decision analysis, and elements of game theory. While the methods in [18] and [19] are consistent with our approach to fuse the data from many sources at the BN layer, we introduce HMMs at a lower (observation or sensing) layer to automate the detection of terrorist activities, and to report the concomitant soft evidence to the BN layer. C. Organization of the Paper The remainder of the paper is organized as follows. The proposed HHBN model is described in Section II. 
The theoretical background on information processing using HHBNs is discussed in Section III. In Section IV, we describe the ASAM system designed to implement the HHBN scheme. The Indian Airlines Flight IC-814 hijacking scenario is modeled and analyzed in Section V using the HHBN scheme. Finally, we conclude the paper with a summary and an outline of our current research directions in Section VI.

II. THE HHBN MODEL

As mentioned earlier, a HHBN model is a hierarchical combination of BNs and HMMs, which can be arranged in multiple layers. We demonstrate our key ideas of information integration with a two-layer model in this paper, and discuss other possibilities in Section VI. A typical HHBN model is shown in Fig. 1. It consists of a BN model at the top layer serving as a fusion center, and several HMMs (two HMMs belonging to two agencies are shown in this figure) at the bottom layer serving as information filters; they process the raw information and provide soft evidence to the corresponding node of the BN, which is maintained by another agency.

[Fig. 1. HHBN model structure: a BN level (nodes V1 through V5, with dummy evidence node EV4) producing the global belief (integrated information), and an HMM level (HMM1 with parameters Λ1 at Agency 1, HMM2 with parameters Λ2 at Agency 2) reporting confidence measurements (local information) computed from transactions (raw information) in the observation space; arrows indicate the information flow.]

Formally, a HHBN is a triplet ⟨M_BN, M_HMMs, R⟩, where

1) M_BN denotes the top-layer BN model. It contains N random variables (BN nodes) {V_i | i = 1, 2, ..., N}, and random variable V_i has Q_i discrete states. We will use upper-case letters for the random variables and lower-case letters for their instances hereafter. The relationships among the random variables are constrained by the model structure, viz., a directed acyclic graph (DAG) obeying the usual conditional independence assumptions. Specifically, the arcs (links) between the BN nodes represent probabilistic causal dependencies. The function of M_BN can therefore be characterized by the joint probability distribution function

P(v_1, v_2, \cdots, v_N) = \prod_{i=1}^{N} P(v_i \mid pa(v_i)),

where pa(v_i) is the instantiation of the parent nodes of V_i; the factorization follows from the chain rule of probability and conditional independence [20]. To be precise, given the state of a node's parents, all of its ancestors are conditionally independent of the node. Here, we use "parents" to denote the direct fan-in nodes, and "ancestors" to denote the parents' parents, and so on [21]. The conditional probabilities {P(v_i | pa(v_i)) | i = 1, 2, ..., N} constitute the numerical parameters of the model; they correspond to the conditional probability tables (CPTs) in the discrete case. The size of the CPTs increases with the number of parent nodes. In many cases, the parent nodes can be assumed to be marginally independent and linked to the effect node via Noisy-OR logic, which limits the conditional probabilities to a reasonable size. When the Noisy-OR assumption is not valid and the scale of the CPTs is a concern, one can introduce intermediate causal nodes to reduce the fan-in density of a single node. The nodes are either partially observable or probabilistically inferable based on the network structure.

2) M_HMMs is a set of discrete-time, finite-state HMMs {HMM_i | i = 1, 2, ..., M} functioning at the bottom layer.
A discrete HMM itself is a five-tuple: hS, X, A, B, Πi, where Λ = (A, B, Π) represents the set of model parameters, i.e., state transition matrix, emission matrix, and the prior probabilities of the states. In the rest of this paper, we may use the HMM parameter set {Λi |(i = 1, 2, · · · , M )} to represent the corresponding HMMs, with λi or λi denoting HM Mi being active or inactive, respectively. Here, S = {S1 , S2 , · · · , SNS } denotes the set of finite states, and X = {X1 , X2 , · · · , XNX } is the set of possible observations. MHM M s represents multiple hypotheses of the environment (e.g., diverse patterns of terrorist activities); the objective is to detect which HMM is active (or, which of several HMMs are active) at a certain time index k based on the available observations up to time k. Unlike traditional HMMs, the states as well as the possible observations in our HMM are also in the form of networks in this paper. Such a network contains nodes and arcs, where the nodes are the keywords (person, target, etc.) in the terrorist activity modeling; an arc between two nodes creates a transaction. We will further clarify this representation in Section V in the context of Indian Airlines hijacking example. 3) The relation R, a key concept of the HHBN model, provides the bridge between the top layer BN and the bottom layer HMMs. R is a set of associations {Rijk |(i = 1, 2, · · · , M ); j ∈ (1, 2, · · · , N ); k ∈ (1, 2, · · · , Qj )} with Rijk = 1 implying that HM Mi is assigned to state k of BN node Vj . In Fig. 1, HM M1 (or Λ1 ) is assigned to the BN node V1 and HM M2 (or Λ2 ) is assigned to the BN node V4 . The node states are not specified in this figure. However, as we will explain in the next section, a binary BN node (i.e., a node having two mutually exclusive states) has a one-to-one relationship with one HMM along with its alternative hypothesis (λ1 vs. λ1 , where λ1 means HM M1 is active and λ1 means HM M1 is inactive). In order to complete our model description, the definitions of several notions are provided here. They are used in Fig. 1 and/or frequently appear in the rest of the paper. 1) Transactions: A transaction is a link between key nodes. For example, a simple event “an unknown person purchased chemicals” can be modeled as a transaction, where “unknown person” and “chemicals” are the key nodes and there is a link (arc) between them denoting a transaction called “collecting resources”. Typically, the transactions occurring in the real world could be classified into two groups: “signal” (“harmful”) transactions and “noise” (“benign”) transactions. The former, i.e., signal transactions, are represented in a HMM state; and the latter, i.e., the noise transactions, are not. The noise transactions are treated as clutter. 2) Observations: Observations are the inputs to HMMs; they are a series of transactions among suspicious people, places, and things with a time stamp associated with each transaction denoting the event occurrence time. 3) Patterns: A pattern is the time evolution of different transactions. Each HMM state sequence can be viewed as a hypothetical pattern. The patterns are typically gleaned from past statistics or subject matter experts. 4) Confidence measurement: The output of a HMM is a confidence measurement, which is the likelihood ratio of observing the sequences of observation up to current time given alternative hypotheses (e.g., λ1 vs. λ1 ). 5) Evidence: Evidence is a terminology of BNs. 
The evidence for a particular BN node can be observed as one of its states, called hard evidence; or, the evidence may be observed with uncertainty, i.e., soft evidence. 6) Soft evidence: Soft or virtual evidence is the most general type of evidence introduced to reflect uncertainty [22]. For a node without parents, soft evidence is equivalent to modifying the prior probability of that node; otherwise, soft evidence on a variable V (i.e., a node in a BN) is represented by a reported state v together with its conditional 4 probability vector P (V = v|Hi ), (i = 1, 2, · · · , Q) for all the Q states, where Hi denotes the hypothesis that the true state is i. The right side of Fig. 1 shows the information flow associated with the HHBN model. Raw information arrives as sequences of transactions, which constitute the inputs to the HMMs. HMMs, based on the partition of the observation space, detect the “signal” transactions (if any), and report the local decisions and the corresponding confidence to higher layer BN nodes. Since only the active HMMs will report their findings and trigger the BN inference, the HMMs are essentially running in a faster time scale compared to the BN. The confidence measurement from the active HMMs is then transformed into soft evidence, and is used to update the evidential nodes (BN nodes assigned by R). Newly arriving evidence is thus propagated through the BN structure using the inference scheme of the BN. The details of how to process and propagate the information via the HHBN model will be discussed in the next section. How does one specify the model parameters (viz., conditional probability tables, transition probabilities)? While these parameters could be estimated using a learning algorithm such as EM (or Baum-Welch algorithm [23] in the classical HMM phraseology), in a data-scarce environment such as counterterrorism it is doubtful if one can obtain enough training data to learn the model (including the structure and model parameters)1 . Our approach has been to develop an initial model based on our understanding of the domain, and seek review and feedback from the subject matter experts, as in [18] and [19]. III. I NFORMATION P ROCESSING WITH HHBN In this section, we will discuss the theoretical foundation of the HHBN model. It addresses how the HMMs filter noise transaction data and produce local information, how the local information becomes soft evidence, and how the BN handles the soft evidence from the HMMs and integrates it for a global assessment. The information transformation between the HMMs and BN is our primary focus. The three basic problems solvable using HMMs are [24]: 1) Evaluation: Evaluating the probability of a sequence of observations given a particular HMM. 2) Decoding: Finding the most likely sequence of state transitions (i.e., the most likely path) associated with an observed sequence. 3) Training: Adjusting the parameter set Λ to maximize the probability of generating an observed sequence via the Baum-Welch algorithm. In this paper, we will assume that the model parameters are known and fixed. That is, we focus on the evaluation and decoding problems associated with HMMs. In the context of counter-terrorism, the states and observations of the HMMs are snapshots of the transactions associated with the modeled terrorist activities; graphically, they are terrorist networks with 1 It may be possible to learn HMMs that model benign behavior. Better inference of these frequent occurrences translates to easier removal of such “clutter”. 
the instantiated nodes and links. The HMM parameter set Λ = (A, B, Π) represents, respectively, the probability of moving from the current state of terrorist activity to another (usually denoting an increase in terrorist threat), the probability of observing a new set of suspicious transactions given the current state, and the prior probability of the initial threat [25]. The HMMs accept a series of transactions among suspicious people, places, and things as inputs. The goal of the HMM algorithm is to detect the "signal" transactions, which are embedded in many noise transactions, in a timely fashion.

Given new observed evidence e, probabilistic inference in a BN has four tasks [26]: 1) belief updating, P(V = v | e); 2) finding the most probable explanation (MPE); 3) finding the maximum a posteriori probability estimate of the network state (MAP); 4) finding the maximum expected utility (MEU) decision. The current realization of the HHBN model (with the ASAM system) considers the first task, viz., belief updating, only; the other three tasks will be included later. For example, the MEU task is of interest when suggesting preemptive counter-terrorism actions in response to a threat. The BN evolution is triggered by the evidence from HMMs and/or directly observed evidence (viz., hard evidence on the BN nodes). Since belief updating with hard evidence is the basic function of BN inference algorithms, we do not address this issue in the paper. Given the uncertainties in the raw information as well as the nature of the HMM detection statistics, the HHBN model considers the soft evidence gleaned from the likelihoods transmitted from the HMMs to the BN; the soft evidence measures the confidence of the corresponding HMM in detecting the monitored terrorist activities.

[Fig. 2. Three cases of the observation space for two HMMs: (a) independent, (b) overlapping, (c) intersecting.]

Fig. 2 illustrates three cases of how the observation space may be clustered for multiple HMMs (two HMMs as an example): independent, overlapping and intersecting (e.g., the model in Fig. 1 illustrates the third case). When the HMMs are based on independent observation spaces, the collaboration or information integration is among agencies that are monitoring different aspects of the problem: they are accessing different databases to obtain their raw information. Alternatively, the overlapping and intersecting spaces correspond to cases where the collaborating agencies are sharing entirely or partially the same data source, such as a transaction database. Consequently, the independent case only requires a decoupled tracking scheme, while the overlapping and intersecting cases require a multiple hypothesis tracking scheme. In the latter cases, the soft evidence requires additional processing to make it statistically meaningful.

A. Independent Tracking

In the independent case, a binary hypothesis test can be constructed for each HMM. Specifically, instead of evaluating the probability of a sequence of observations up to a specified discrete time index k given a particular HMM, as in the usual evaluation problem, we are interested in a hypothesis testing problem with the null hypothesis H0 that the observations are pure noise ("benign" or random transactions) and the alternative H1 that a HMM of interest (HMM_1, parameterized by Λ1, for example; viz., "terrorist activity") is detected at a specified discrete time index n0. The details of the single-HMM detection scheme based on Page's test [27] are given in Appendix A.
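To make the detection scheme concrete, the following minimal Python sketch implements the forward-variable recursion and the CuSum (Page's test) stopping rule summarized in Appendix A; the model matrices A, B, Π, the noise distribution p0, and the threshold are illustrative inputs supplied by the caller, not the paper's calibrated values.

```python
import numpy as np

def cusum_hmm_detector(obs, A, B, Pi, p0, h=20.0):
    """Page's-test style detector for a single HMM (sketch of Appendix A).

    obs : sequence of observation symbol indices
    A   : (Ns, Ns) state transition matrix
    B   : (Ns, Nx) emission matrix
    Pi  : (Ns,)    prior state probabilities
    p0  : (Nx,)    i.i.d. "benign" (noise) observation distribution
    h   : detection threshold
    Returns the detection time index (or None) and the CuSum path.
    """
    alpha = Pi * B[:, obs[0]]                       # forward variables at k = 1
    llr = np.log(alpha.sum()) - np.log(p0[obs[0]])  # log-likelihood ratio increment
    cusum = max(llr, 0.0)
    path = [cusum]
    if cusum >= h:
        return 0, path
    prev_sum = alpha.sum()
    for k, x in enumerate(obs[1:], start=1):
        alpha = (alpha @ A) * B[:, x]               # forward recursion, eq. (15)
        # P(x_k | x_1..k-1, lambda) = sum(alpha_k) / sum(alpha_{k-1}), eq. (20)
        llr = np.log(alpha.sum() / prev_sum) - np.log(p0[x])
        prev_sum = alpha.sum()
        cusum = max(cusum + llr, 0.0)               # CuSum update of Page's test
        path.append(cusum)
        if cusum >= h:
            return k, path
    return None, path
```

In practice the forward variables should be rescaled at each step to avoid numerical underflow on long transaction sequences; only the ratio of successive sums enters the test statistic.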
With the forward variables in Page's test, we have

P(x_1, x_2, \cdots, x_k \mid \lambda_1) = \sum_{i=1}^{N_S} \alpha_k(i),     (1)

where N_S is the total number of states in HMM_1 and the \alpha_k(i) are the forward variables (defined in equation (14) of Appendix A).

[Fig. 3. Detection of a single HMM in the presence of a "noise" background.]

Using the likelihood ratio, or the so-called confidence measurement (with x_1^k denoting the sequence of observations {x_1, x_2, \cdots, x_k}), we have

L(x_1^k) = \frac{P(x_1^k \mid \lambda_1)}{P(x_1^k \mid \bar{\lambda}_1)}.     (2)

We can calculate the posterior probability of the HMM via

P(\lambda_1 \mid x_1^k) = \frac{P(x_1^k \mid \lambda_1) P(\lambda_1)}{P(x_1^k)} = \frac{P(x_1^k \mid \lambda_1) P(\lambda_1)}{P(x_1^k \mid \lambda_1) P(\lambda_1) + P(x_1^k \mid \bar{\lambda}_1) P(\bar{\lambda}_1)} = \frac{L(x_1^k) L_0}{L(x_1^k) L_0 + 1}.     (3)

Here, P(\lambda_1) is the prior belief on the existence of HMM_1, and L_0 = P(\lambda_1)/P(\bar{\lambda}_1) is the prior odds ratio. The posterior probability is the agency's belief in the existence of HMM_1 based on the observations up to time index k. It serves as the probability of detection in the BN layer, thus forming the soft evidence used to update the BN inference, as discussed in Appendix B. Briefly, when HMM_1 is associated with a binary BN node V (with state "1" associated with \lambda_1 and state "0" associated with \bar{\lambda}_1), we augment the initial BN with a dummy node E_V, which has the same set of states as V and a link from node V, when HMM_1 is detected. The BN belief updating is triggered by the hard evidence "E_V = 1" (since the local agency reports that the HMM is active), with a conditional probability table (CPT) constructed from P(\lambda_1 | x_1^k) and 1 - P(\lambda_1 | x_1^k) to represent the uncertainty in the evidence. Actually, only the column corresponding to "E_V = 1" in the CPT is of interest for belief updating, as can be seen from the two equations in Appendix B. It is also feasible for multiple HMMs to report to different states of the same non-binary BN node. However, the states of a BN node are assumed to be mutually exclusive, which creates a conflict if more than one HMM reports itself as active to the same node at the same time. We assume that this issue is resolved in the modeling process, where we design binary BN nodes to collect information from individual HMMs, while adding intermediate nodes to specify the possible relationships and semantics among active HMMs.

An example of the detection of a terrorist network via Page's test is illustrated in Fig. 3. HMM_1 is detected at time unit n_0 = 60 with the detection threshold h set at 20. Again, a HMM only reports its confidence measurement to the BN node when it is detected (viz., when the CuSum test statistic associated with Page's test is above the threshold).

B. Multiple Hypothesis Tracking

When multiple HMMs share a data source, the inference becomes essentially a multiple hypothesis tracking (MHT) or multiple target tracking problem [28], because the HMMs now compete for the observations (i.e., the association of transactions to HMMs is uncertain). HMMs with such a superimposed observation space are similar to the so-called factorial HMMs [29]. This paper follows a technique developed in [6], which is an extension of the single-HMM detection case, to solve this multiple-HMM detection problem. For illustrative purposes, consider two HMMs: HMM_1 and HMM_2. Without loss of generality, we assume that at most one HMM can be activated or deactivated at a time. The valid multiple hypothesis tests are constructed as shown in Fig. 4. In each test, H_0 represents the null hypothesis and H_1 or H_2 is an alternative hypothesis.
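Before turning to the multiple-HMM case, the following sketch summarizes how the quantities of Section III-A, the likelihood ratio (2) and the posterior (3), become the soft-evidence CPT column attached to the dummy node E_V; the prior and the example likelihood ratio are illustrative values, not outputs of the simulation above.

```python
def soft_evidence_from_confidence(L, prior_active=0.5):
    """Turn an HMM confidence measurement L(x_1^k) (eq. (2)) into the
    soft-evidence CPT column for the dummy node E_V (Section III-A).
    prior_active is P(lambda_1); both arguments are illustrative."""
    L0 = prior_active / (1.0 - prior_active)        # prior odds ratio
    p_active = (L * L0) / (L * L0 + 1.0)            # posterior, eq. (3)
    # CPT column P(E_V = 1 | V): rows correspond to V = 1 (HMM active) and V = 0
    return {"P(EV=1|V=1)": p_active, "P(EV=1|V=0)": 1.0 - p_active}

# Example: a likelihood ratio of 9 with an even prior gives 0.9 confidence.
print(soft_evidence_from_confidence(L=9.0, prior_active=0.5))
```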
The term "NULL" implies that none of the HMMs is active, i.e., \bar{\lambda}_1 \bar{\lambda}_2. MHT starts with the independence assumption (test #1), which is identical to independent tracking. Once one of the HMMs is detected (i.e., significant transactions show that the underlying terrorist activity is active), a new hypothesis test is formed (either test #2 or test #3, based on which of the two HMMs is detected first).

[Fig. 4. Illustration of tests for two HMMs. Test #1: H_0 NULL, H_1 HMM_1 only, H_2 HMM_2 only. Test #2: H_0 HMM_1 only, H_1 HMM_1 and HMM_2, H_2 NULL. Test #3: H_0 HMM_2 only, H_1 HMM_1 and HMM_2, H_2 NULL. Test #4: H_0 HMM_1 and HMM_2, H_1 HMM_1 only, H_2 HMM_2 only. The arrows represent test outcomes: for example, in test #2, HMM_1 can disappear, or be joined by HMM_2.]

A simulation result in the form of the CuSum test statistic is shown in Fig. 5. The ground truth [2] for the simulation is superimposed in the figure, where HMM_1 is actually active from k = 1 to k = 150 and HMM_2 is truly active from k = 50 to k = 92. The decision can be made based on either L(x_1^k) = P(x_1^k | \lambda_1 \bar{\lambda}_2)/P(x_1^k | \bar{\lambda}_1 \bar{\lambda}_2) or L(x_1^k) = P(x_1^k | \bar{\lambda}_1 \lambda_2)/P(x_1^k | \bar{\lambda}_1 \bar{\lambda}_2) exceeding some predefined threshold h (h = 20 in this simulation). It is evident that HMM_1 is detected first, at time k = 25, as shown in Fig. 5(a), which causes a transition from test #1 to test #2 in Fig. 4. Starting from k = 25, a new test is generated to track whether both HMM_1 and HMM_2 are active, given that HMM_1 is already detected and still valid. Fig. 5(b) shows that the new test statistic exceeds the threshold at time k = 60. In this case, L(x_1^k) = P(x_1^k | \lambda_1 \lambda_2)/P(x_1^k | \lambda_1 \bar{\lambda}_2) \geq h. The new test result causes a transition from test #2 to test #3 as shown in Fig. 4. Extension to more than two HMMs is straightforward.

[Fig. 5. Detection of multiple HMMs: (a) detection of modeled HMM_1 at k = 25; (b) detection of HMM_1 and HMM_2 in the presence of HMM_1.]

A major output of the MHT is the likelihood function of the observation sequence given the multiple HMMs to be detected [6], e.g., P(x_1^k | \lambda_1, \lambda_2). However, we require the marginal posterior probabilities of the individual HMMs to be reported to the BN, i.e., P(\lambda_i | x_1^k) \forall i. Suppose we are currently dealing with hypothesis test #2 in the previous example, and that both HMMs are detected (viz., we accept H_1). The marginal probabilities can then be approximated by

P(\lambda_1 \mid x_1^k) \doteq P(\lambda_1 \bar{\lambda}_2 \mid x_1^k) + P(\lambda_1 \lambda_2 \mid x_1^k)     (4)

P(\lambda_2 \mid x_1^k) \doteq P(\lambda_1 \lambda_2 \mid x_1^k)     (5)

The first and second posterior probabilities in (4) come from the hypotheses H_0 and H_1 in test #2, respectively. Generally, this marginal posterior probability is approximated via

P(\lambda_i \mid x_1^k) \doteq \sum_{H_j : \lambda_i \in H_j} P(H_j \mid x_1^k) \quad \forall i,     (6)

i.e., a sum over the posterior probabilities of all hypotheses that cover the HMM of interest (HMM_i is active in the hypothesis). The joint posterior probabilities are determined in a way similar to the independent tracking case. For example,

P(\lambda_1 \lambda_2 \mid x_1^k) = \frac{P(x_1^k \mid \lambda_1 \lambda_2) P(\lambda_1 \lambda_2)}{P(x_1^k)} = \frac{L(x_1^k) L_0}{L(x_1^k) L_0 + 1},     (7)

where

P(x_1^k) = P(x_1^k \mid \lambda_1 \lambda_2) P(\lambda_1) P(\lambda_2) + P(x_1^k \mid \lambda_1 \bar{\lambda}_2) P(\lambda_1) P(\bar{\lambda}_2) + P(x_1^k \mid \bar{\lambda}_1 \bar{\lambda}_2) P(\bar{\lambda}_1) P(\bar{\lambda}_2) \approx P(x_1^k \mid \lambda_1 \lambda_2) P(\lambda_1) P(\lambda_2) + P(x_1^k \mid \lambda_1 \bar{\lambda}_2) P(\lambda_1) P(\bar{\lambda}_2),     (8)

with

L(x_1^k) = P(x_1^k \mid \lambda_1 \lambda_2) / P(x_1^k \mid \lambda_1 \bar{\lambda}_2)     (9)

and

L_0 = P(\lambda_2)/(1 - P(\lambda_2)),     (10)

and the HMMs are assumed to be marginally independent (independent in the absence of observations) in (8).
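The marginalization in (4)-(6) amounts to summing the joint posteriors of every hypothesis in which the HMM of interest is active. A minimal sketch, with illustrative posterior values for test #2 of Fig. 4:

```python
def marginal_hmm_posteriors(joint_posteriors):
    """joint_posteriors maps a hypothesis, written as a tuple of active HMM
    indices, to its posterior probability P(H_j | x_1^k).  Returns the
    approximate marginals P(lambda_i | x_1^k) of eq. (6)."""
    marginals = {}
    for active_set, prob in joint_posteriors.items():
        for i in active_set:
            marginals[i] = marginals.get(i, 0.0) + prob
    return marginals

# Test #2 of Fig. 4 with illustrative posteriors:
#   H0 = {HMM1 only}, H1 = {HMM1 and HMM2}, H2 = NULL
posteriors = {(1,): 0.25, (1, 2): 0.70, (): 0.05}
print(marginal_hmm_posteriors(posteriors))   # HMM1: 0.25 + 0.70 = 0.95, HMM2: 0.70
```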
IV. SOFTWARE IMPLEMENTATION

The adaptive safety analysis and monitoring (ASAM) system is developed based on the HHBN architecture and is aimed at supporting the collaborative analysis of intelligence information. As shown in Fig. 6, the ASAM system consists of five functional modules: a graphical modeling tool, a knowledge repository, HMM engines, a BN engine, and a web browser. The modules of the ASAM system can be either locally hosted or distributed via a network connection. While the former case can be used for demonstration purposes or for prototype testing, the latter is the deployment structure. For example, the knowledge repository and the web service can be hosted on a secure server, while authorized users can still access the ASAM web site from anywhere via the internet; the modeling tool can be installed where the modeling expertise resides; and the locations of the BN engine and HMM engines should be consistent with the geographical distribution of the agencies (or divisions) involved.

[Fig. 6. The ASAM architecture: the web GUI, BN engine, HMM engine, knowledge repository, and graphical modeling in TEAMS®, hosted locally or distributed via a network connection.]

The HMM-related algorithms are implemented in the HMM engine, and the BN engine is implemented with the support of the BN API (application programming interface) "SMILE" [30]. The graphical modeling tool is developed using TEAMS® (Testability Engineering and Maintenance Systems [31]). We utilize the hierarchical modeling capability of TEAMS®, which was expanded to include inputs related to BNs (e.g., states and conditional probabilities), HMMs (e.g., Markov chains and transition probabilities), and the relationship between them (who reports to whom). The complete model information is then exported into the knowledge repository, which is currently hosted in a MySQL database. The modeling process is offline and requires subject matter experts (i.e., intelligence analysts) to enter the scenarios (in the form of BNs) and terrorist activity templates (in the form of HMMs). A snapshot of the TEAMS® interface for the BN-layer modeling is illustrated in Fig. 7. This model will be used in the example in Section V. The conditional probability table shown is for the highlighted node, "Planning And Strategy".

Once the models are entered, the ASAM system can be deployed for online monitoring or for offline "what-if" analysis. A typical online monitoring scenario is as follows: various agencies run their local HMMs to detect patterns of terrorist events in the transaction space, and transmit their beliefs about the events, as well as the observed transactions, into the repository. The BN, which also runs in real time, obtains the new information from the related HMMs in the repository in the form of soft evidence, updates the overall network beliefs, and saves the inference results back into the repository. The analyst can query any model, the HMM results, the BN inference results, or transactions of interest via a web browser. Graphical results, such as the HMM confidence estimates or the state probabilities of user-specified BN nodes, are displayed on the web browser in real time.
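The online monitoring cycle described above can be summarized by the following schematic loop; all repository and engine methods shown (fetch_new_transactions, store_confidence, apply_soft_evidence, and so on) are hypothetical placeholders standing in for the ASAM modules, not actual ASAM or SMILE API calls.

```python
import time

def monitoring_cycle(repository, hmm_engines, bn_engine, poll_seconds=2):
    """Schematic ASAM-style online loop: HMM engines filter transactions into
    confidence measurements; the BN engine fuses them as soft evidence.
    All repository/engine methods used here are hypothetical placeholders."""
    while True:
        for agency, hmm in hmm_engines.items():
            batch = repository.fetch_new_transactions(agency)   # raw information
            if not batch:
                continue
            detected, confidence = hmm.update(batch)            # Page's-test detector
            if detected:
                # report the local decision and its confidence to the fusion layer
                repository.store_confidence(hmm.bn_node, confidence)
        for node, confidence in repository.fetch_new_confidences():
            bn_engine.apply_soft_evidence(node, confidence)      # dummy-node CPT column
        repository.store_beliefs(bn_engine.current_beliefs())    # global assessment
        time.sleep(poll_seconds)
```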
An offline usage scenario of the ASAM system is similar to the online usage scenario, except that an analyst can perform “what-if” studies by editing existing transactions or adding others into the transaction sequences, and can re-execute the BN and HMM engines to generate a new set of results to inject their subjective assessments into the analysis process. A simple “what-if” analysis example is as follows: an analyst, examining a detected transaction sequence from the web page, may realize that an unobserved transaction must have happened and thus one more transaction should be added at a certain time index2 . He can test this subjective assumption by adding a new transaction, and re-executing the HMM and BN engines in an offline mode. The analyst can compare the new results with the original ones and assess whether the results are sensitive to the newly added transaction. V. E XAMPLE : I NDIAN A IRLINES H IJACKING M ODEL As discussed in the previous sections, the HHBN models are transaction-based. A pattern of transactions is a potential realization of a possible event such as a hijack, suicide bombing, or attacking an infrastructure target as in a counter-terrorism application. A specific event scenario can be decomposed into groups of transactions, and each group is assigned to the state of a Hidden Markov chain. A BN model represents the overall threat from diverse scenarios, with each scenario modeled as a HMM. In this section, a hijacking scenario gleaned from open sources is modeled and analyzed. A HHBN model related to the Athens Olympics as well as general modeling process can be found in [32]. On December 24, 1999, an Indian Airlines (IA) flight IC814, flying from Kathmandu to New Delhi with 180 persons on board, was hijacked by a group of terrorists. The stand-off ended on December 31st when the Indian government released three high profile terrorists from a Kashmir jail. Our Indian Airline Hijacking model abstracts the IA flight IC-814 hijacking event, and is created based on open source information from the Embassy of India [33] and the Frontline Magazine [34]. The model contains patterns of terrorist activities that are present in the actual hijacking. The people, places and things involved in the IA hijacking events are encapsulated in non-specific nodes in an attempt to develop a canonical representation of any airline hijacking. Fig. 8 shows the BN model with representative prior probabilities and conditional probability tables. The Bayesian node labeled “PU” depicts the level of political unrest between India and Pakistan over the issue of Kashmir. Another Bayesian node labeled “Activity” represents the activity level of terrorist organizations in Kashmir. In the following simulations, the prior probabilities associated with the BN nodes are held constant, while the statistical inferences calculated by the underlying HMMs (“Planning and Strategy”, “Collect Resources” and “Preparations for Hijacking”) update the soft evidence of the corresponding BN nodes. The final, or global, effect of these individual terrorist activities causes the BN node, “Hijack”, to change with respect to the current belief – the state of which (in the form of a probability mass function) shows the likelihood of a hijacking taking place. 
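As a rough illustration of how the soft evidence on the three activity nodes drives the belief of the "Hijack" node, the following sketch marginalizes an illustrative CPT over the parents' current beliefs. The probability values are placeholders rather than the entries of Fig. 8, and treating the parents as independent here is a simplification of the full BN inference, which also accounts for the "PU" and "Activity" nodes.

```python
from itertools import product

# Illustrative (not the paper's) beliefs that the three terrorist activities
# are under way, i.e., the marginals of "Planning", "Resources" and "Prepare"
# after HMM soft evidence has been absorbed.
p_parent = {"Planning": 0.85, "Resources": 0.70, "Prepare": 0.60}

# Illustrative CPT: P(Hijack = Yes | Planning, Resources, Prepare).
cpt_hijack_yes = {
    (1, 1, 1): 0.99, (1, 1, 0): 0.80, (1, 0, 1): 0.80, (1, 0, 0): 0.60,
    (0, 1, 1): 0.70, (0, 1, 0): 0.50, (0, 0, 1): 0.50, (0, 0, 0): 0.02,
}

# P(Hijack = Yes) = sum over parent configurations of
# P(Hijack = Yes | config) * P(config), with the parents treated as
# independent given the current evidence (a simplification for illustration).
p_yes = 0.0
for cfg in product([1, 0], repeat=3):
    weight = 1.0
    for val, name in zip(cfg, p_parent):
        p = p_parent[name]
        weight *= p if val else (1.0 - p)
    p_yes += cpt_hijack_yes[cfg] * weight
print(round(p_yes, 3))
```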
In this model, there are three HMMs (assumed to be originally from independent observation spaces) which symbolize: planning and strategy, resource collection, and preparations 2 In fact, a proper HMM will allow for such a “missed detection”, but it may be helpful to see how important it is to the inference. 8 Fig. 7. Graphical modeling in TEAMS® . PU PU (High) 0.02 PU (Medium) 0.18 Activity (High) 0.05 PU (Low) 0.8 Activity (Medium) 0.1 High Activity (Low) 0.85 Medium Low Planning Prepare Yes Yes No Yes No No Resources Yes No Yes No Yes No Yes No Hijack (Yes) 0.99 0.8 0.7 0.5 0.8 0.6 0.5 0.02 Hijack (No) 0.01 0.2 0.3 0.5 0.2 0.4 0.5 0.98 High Medium Low High Medium Low High Medium Low Planning Yes No Resources Yes No Fig. 8. Activity Planning Yes No Yes No Planning (Yes) 0.99 0.8 0.3 0.8 0.6 0.1 0.7 0.6 0.02 Resources (Yes) 0.98 0.02 Prepare (Yes) 0.98 0.4 0.6 0.02 Planning (No) 0.01 0.2 0.7 0.2 0.4 0.9 0.3 0.4 0.98 Resources (No) 0.02 0.98 Prepare (No) 0.02 0.6 0.4 0.98 BN model for Indian airline hijacking. for hijacking. The likelihood of these events are associated with the Boolean BN node state: “Yes”. The Markov chain of the these three HMMs are shown in Figs. 9, 10 and 11, respectively. The evolution of planning activities, political ideology and general goals of the terrorist organization are depicted in the HMM: “Planning and Strategy”. Political instability associated with a terrorist organization induces them to set up bases/cells in the country X. Parallel to this, fundamentalists and separatists also announce Holy War against the country X. Headquarters personnel of terrorist organizations recruit and train new members with particular talents that can be employed in the attack. Planners analyze the targets and, in selecting the target, attention is given to seize installations that are highly visible and, consequently, would warrant extensive media coverage. A HMM representation of planning and strategy for the Indian airlines hijacking problem is illustrated in Fig. 9. This model has nine states (N = 9) with state transition probabilities (which form matrix A) labeled next to the feasible transitions. The transaction network snapshots corresponding to S1 , S2 and S9 are shown in Fig. 12(a)-(c). The other states have the same set of nodes, but different links. The transactions of solid lines in S9 represent the signal transactions of this state and the transactions with dashed lines superimpose possible signal transactions accumulated from the state transitions (those are the transactions that occurred before reaching the absorbing states). A transaction links two nodes of the network, but each state may introduce more than one new signal transaction. For instance, the assertion that this HMM is in state S1 is denoting the network state that “there is a political intent from certain terrorist organizations”; the assertion that this HMM is in state S2 corresponds to 9 6 3ROLWLFDOLQVWDELOLW\ DVVRFLDWHGZLWKD WHUURULVWRUJDQL]DWLRQ 6 6 Fig. 9. 6 6 5HFUXLWV WUDLQLQJQHZ PHPEHUV 6HWXS EDVHVFHOOVLQ FRXQWU\; 6 6 )XQGDPHQWDOLVWV DQQRXQFHKRO\ZDU DJDLQVWFRXQWU\; 3ODQQHUVHVWDEOLVK UHODWLRQVKLSZLWK ORFDOVPXJJOHUV 3ODQQHUV HPEHGGHG LQFRXQWU\; 7DUJHWLV LGHQWLILHG 6 3ODQQHUVDUH DVVLJQHG 6 3ODQQHUV DQDO\]H SRWHQWLDOWDUJHWV Markov chain for HMM: Planning and Strategy. 0.2 0.5 S1 S5 Planners arrange forged document for Hijackers Planners meet 0.45 Planners get money 0.25 S6 Planners establish relationship with local smugglers 0.2 Planners Collect Tools 0.2 0.2 0.2 2 0. S7 Fig. 10. 
S8 All resources collected 0.4 0.2 1 0.4 Assign tasks to Hijackers S3 S4 0.4 0.55 0.2 S2 0.2 0.4 0.75 0.5 Planners collect weapons Markov chain for HMM: Collect Resources. 0.7 0.5 S4 0.4 2 Arrival of Hijack leader at target airport Fig. 11. S7 0.4 1 S9 Hijacking 0.2 Weapons embedded on flight Markov chain for HMM: Preparations for Hijacking. Political intent Terrorist bases/cells Terrorist organization Fundamentalists Target country Target New terrorists Planners Local smugglers Potential targets (a) Network of S1 . Fig. 12. 0.6 Communications with weapons installment team Hijackers assemble at target airport 0.5 0.8 S6 0.2 S8 0. 3 0. 45 0.2 S5 8 0. Target airport and flight reconnaissance by Hijackers 4 0. S2 Meeting between Planners and Hijackers 0.5 0.55 S3 Planners go to hidden location 0. 0.3 S1 Target airport and flight reconnaissance by Planners Political intent Terrorist bases/cells Terrorist organization Fundamentalists Target country Target New terrorists Planners Local smugglers Potential targets (b) Network of S2 . Transaction network snapshots for HMM: Planning and Strategy. Political intent Terrorist bases/cells Terrorist organization Fundamentalists Target country Target New terrorists Planners Local smugglers Potential targets (c) Network of S9 . 10 Hijackers Planners Weapons Money Misc. tools Forged documents Local smugglers Target country (a) Last state (S8 ) of HMM: Collect Resources. Fig. 13. Hijackers Weapons Planners Hijack leader Target airline Weapon team Target airport Hijack flight Hidden location (b) Last state (S9 ) of HMM: Preparations for Hijacking. Transaction network snapshots for the last states. the event “enroll fundamentalists from the target country into the terrorist organizations”. A possible state sequence of a HMM is essentially a concatenation of all the transactions in its previous state(s) with the current set of transactions, i.e., a snapshot of a pattern. The prior probability Π for this model is set as: [0.5, 0.5, 0, 0, 0, 0, 0, 0, 0]. This implies that, at the time this HMM is detected, it will be in state S1 or S2 with a probability of 0.5. The state evolution with the structure in Fig. 9 implies that these two steps (S1 and S2 ) of terrorist planning and strategy process can be performed simultaneously. The emission probabilities are assigned by comparing the observation to the state model via the specified probabilities of false alarm and missed detection associated with the model [25]. Once a target is identified, a detailed plan of attack is developed. Such a plan includes the kinds of demands that will be made and the means by which they will be communicated to authorities and the media. The HMM corresponding to “Collect Resources”, as shown in Fig. 10, tracks the transactions that involve collecting resources to carry out a terrorist attack. Terrorists begin to function as a group, once their organizational identity is established. The tactical and logistical requirements of the operation, such as the types of weapons that will be employed, the means by which the target (an airplane in this case) will be held, the requirements of satellite phones and other miscellaneous equipment, are established. Planners acquire and transport the arms, ammunition, forged documents and related equipment through interconnections with local organized crime cells. The HMM, denoting “Preparations for Hijacking”, as shown in Fig. 11, demonstrates all the exercises for the hijacking. Planners and hijackers check the target airport and the target airline. 
They repeatedly reconnoitre the target airline to estimate the actions and measures they need to take in order to neutralize or penetrate whatever security measures had been established to protect the target. Each hijacker has an organizational affiliation and identity. The organizational identities of the hijackers enable them to get more quickly into the personal roles that they will play throughout the preparation and duration of the attack. Sometime before the hijacking, planners hide in secret locations so that security personnel cannot capture them after the hijacking. The hijack leader communicates with the weapons team sometime before the flight departure. When weapons team informs the hijack leader that weapons are installed on the plane, the hijack leader executes the hijacking of plane with his team. Due to space limitations, only the last states for the latter two HMMs are shown in Figs. 13 (a) and (b). Detection of these modeled HMMs is shown in Fig. 14 in the form of CuSum test statistic. The evolution of the corresponding Bayesian belief that the airline hijacking occurs is shown in Fig. 15. We speed up the flow of the new transactions (e.g., every two seconds in the figures) for simulation purposes. The real time associated with the IA hijacking events are labeled for reference. The starting point of each HMM detection curve is associated with the first time this HMM is detected; thus, we believe (with certain probability) that the modeled terrorist activity is in progress. A peak probability usually results when this pattern evolves into the absorbing state of the HMM. Once the peak is attained, the numerous unrelated transactions will reduce the confidence in the detection. Thus, there are two reasons which can decrease the probability in Fig. 14. They are caused by noise transactions or simply because the terrorist activities have already reached their goal and do not warrant any further transactions. The BN updates its belief only when HMMs detect significant new evidence. Typically, it merges all available information from diverse sources and generates a global alarm. VI. S UMMARY AND F UTURE W ORK An information integration scheme using hierarchical and hybrid Bayesian networks is introduced with counter-terrorism as an application context. A HHBN model is constructed from one BN and several HMMs. HMMs function in lower layer transaction spaces in a fast time-scale, while the BN is operating in top layer strategy space in a relatively slow time-scale. An analytical software tool, the ASAM system, is developed in accordance with the HHBN scheme. The ASAM system uses HMMs to model the stochastic and dynamic evolution of terrorist activities, which pertain to a particular node state in the BN. The HMMs transmit soft evidence to BN nodes, and the BN inference algorithms integrate the soft evidence from multiple HMMs into an overall assessment of terrorist threat. An example terrorist scenario, related to the Indian Airlines hijacking, was adopted to illustrate the proposed scheme and test the functionality of the software. In designing and implementing strategies of response to potential terrorist attacks, it is essential to think beyond the re- 11 HMM Detection Scheme Terrorists Have Collected All Necessary Resources Attack HMM1 HMM3 Probability CUSUM Statistic HMM2 Bayesian Network Inference Terrorists Are Planning Attack 08/1999 08/1999 Fig. 15. 10/1999 Event Time 12/1999 The belief of the Indian airline hijacking occurrence. Samples Agency 1 Fig. 14. 
Detection of three modeled HMMs in the presence of “noise” background. occurrence of the last event [19]. Although our methodology of using HHBN and the ASAM system to analyze the information is based on the knowledge of past events, a large spectrum of possible scenarios and hypothetical patterns can be generated using the ASAM modeling tool, and support the analyst in exploring a range of possible countermeasures as well as in conducting “what-if” analyses. Our current work provided a distributed processing structure for gathering, sharing, understanding, and using information to assess the evolution of the terrorist activities. In combination with counter-terrorist network models, feasible actions can be suggested to inhibit potential terrorist threats. More sophisticated BN models, such as influence diagrams, may be incorporated in the top level for strategic decision support by adding action nodes and utility nodes into the BN model. The HHBN is illustrated with a two-layer model in this paper. Theoretically, hierarchical HMMs or BNs are also possible, but dramatically increase the modeling and analysis complexity. Fig. 16 shows another reasonable model where the submodels (including HMMs and BNs) are tree-structured. While HMMs always reside in the bottom layer for information filtering, the local agency can host a local BN for further analysis. The local analysis results are then propagated upwards to a higher level agency, and finally in a threat integration center to arrive at a final decision. The information flow in Fig. 1 shows upwards propagation. More research can be done on the alternative direction, i.e., propagate backwards to suggest the future possible information that needs to be gathered by the local agencies. In other words, global assessment can give direction to the informationcollection process and thus reduce the probability of future missed detections. When the evidence from a particular HMM is always inconsistent with the rest of the BN inference (evidence conflict), it is very likely that this HMM is malfunctioning and it is preferable to prune it from the network. This function requires structural adaptation of the network and will be one of our future research efforts. In this paper, we assumed the model parameters are derived from interviews of subject matter experts. We are currently ex- BN1 Agency 2 Agency 3 top level BN second level HMMs HMM1 HMM2 BN3 BN2 HMM3 Fig. 16. second level BNs third level HMM Tree-structured HHBN model. panding the proposed HHBN mechanism to other applications where the data are available (e.g., fault diagnosis, command and control architecture), and can learn the parameters from the data. Further, online adaptation of model parameters is feasible with evolving data. We are also continuing to refine the model and performance/sensitivity tests are underway. ACKNOWLEDGMENT This work was supported by Aptima Inc. as part of the NEMESIS (NEtwork Modeling Environment for Structural Intervention Strategies) project. A preliminary version of this paper was presented in SPIE 2004 [35]. The authors would like to thank Qualtech Systems Inc. for the TEAMS® software, and the Decision Systems Laboratory at the University of Pittsburgh for the “SMILE” Bayesian Networks API in C++. We thank anonymous reviewers for valuable comments. A PPENDIX A. 
The HMM Detection Scheme

The state transition matrix of the underlying Markov chain associated with a HMM ⟨S, X, A, B, Π⟩ parameterized by Λ = (A, B, Π) is given by

A = [a_{ij}] = \left[ p\left( s(k+1) = S_j \mid s(k) = S_i \right) \right], \quad i, j \in \{1, 2, \cdots, N_S\},     (11)

where s(k) is the state at time k. The observation process is represented via the emission matrix

B = [b_{il}] = \left[ p\left( x(k) = X_l \mid s(k) = S_i \right) \right], \quad i \in \{1, 2, \cdots, N_S\}, \; l \in \{1, 2, \cdots, N_X\}.     (12)

The prior probabilities of the Markov states at time k = 1 are given by

\Pi = [\pi_i] = \left[ p\left( s(1) = S_i \right) \right], \quad i \in \{1, 2, \cdots, N_S\}.     (13)

An efficient detection scheme based on forward variables and the log-likelihood ratio was developed in [25]. The forward variable \alpha_k(i) [36] is defined as the joint probability of the observation sequence and the state at time k given \lambda (meaning that the HMM parameterized by Λ is active):

\alpha_k(i) = p\left( x(1), x(2), \cdots, x(k), s(k) = S_i \mid \lambda \right).     (14)

This variable can be updated recursively via

\alpha_{k+1}(j) = \left[ \sum_{i=1}^{N_S} \alpha_k(i)\, a_{ij} \right] b_{j\,x(k+1)},     (15)

with the initial condition

\alpha_1(j) = \pi_j\, b_{j\,x(1)}.     (16)

The detection time n_0, based on Page's test [27], can be found via

n_0 = \arg\min_n \left\{ \left( \max_{1 \le k \le n} L_k^n \right) \ge h \right\}.     (17)

Here, h is a predefined threshold and L_k^n is the log-likelihood ratio of the observations \{x(k), \cdots, x(n)\}, given by

L_k^n = \sum_{i=k}^{n} \ln \frac{ P_{H_1}\left( x(i) \mid x(i-1), \cdots, x(k) \right) }{ P_{H_0}\left( x(i) \right) },     (18)

where H_1 and H_0 are the hypotheses discussed in Section III; the formulation is consistent with both the independent tracking case and the multiple hypothesis tracking case. The unconditioned denominator comes from the assumption that the "benign" transaction-based observations are independent. The HMM detection scheme, also known as Page's test or the cumulative sum (CuSum) method, is optimal in this case [27]. We use Page's test to detect a switch from ordinary noise ("benign") transactions to the modeled "signal" ("terrorist activity") transactions. This is a change detection problem, wherein the distribution of transactions is different before and after an unknown time n_0; our objective is to detect the change, if it exists, as soon as possible. Extending Page's test to fit the theoretical framework of HMMs is straightforward, given the forward variables. Recall that at time index k,

P\left( x(1), x(2), \cdots, x(k) \mid \lambda \right) = \sum_{i=1}^{N_S} \alpha_k(i),     (19)

where N_S is the total number of states of the HMM. Given this, the conditional probability in (18) is readily obtained via

P_{H_1}\left( x(k) \mid x(k-1), \cdots, x(1) \right) = P\left( x(k) \mid x(k-1), \cdots, x(1), \lambda \right) = \frac{ \sum_{i=1}^{N_S} \alpha_k(i) }{ \sum_{i=1}^{N_S} \alpha_{k-1}(i) }.     (20)

More details on the detection algorithm, as well as the use of HMMs for prediction, can be found in [25]. In our development we have assumed the observations x(k) to be either observed or missed. However, for the (realistic) case in which transactions are imperfectly observed, that is, vague, the feature-aided tracking approach of [37] can be directly applied.

B. BN Belief Updating While Observing Soft Evidence

To simplify the presentation, consider a binary node V with states (0, 1). Define H_1 and H_0 as the binary hypotheses that node V is in state "1" or "0", with prior probabilities P(H_1) and P(H_0) = 1 - P(H_1). The two likelihoods P(V = 1 | H_1) and P(V = 1 | H_0), which form the soft evidence vector, are indeed the probability of detection (P_D) and the probability of false alarm (P_F), respectively. In order to illustrate how we update the belief with soft evidence, consider the BN in Fig.
17 with three binary (with state “1” and “0”) nodes A, B and C. Before we P receive any evidence, the belief of node C is P (C = 1) = A,B P (C = 1|A, B)P (A)P (B) = 0.624. This value is easy to obtain for small networks; however, for larger and more practical networks, efficient algorithms such as junction tree are required to reduce the computation time. A survey on the BN inference algorithms can be found in [38]. If soft evidence is observed on a node such as node A, and since this evidence is the only source of information, we can directly update the prior probability. As an example, suppose that the HMM corresponding to node A is detected to be active with confidence 0.7, we will then have P (A = 1) = 0.7 and P (A = 0) = 0.3 and use these priors to update the inference. For a node such as node C, the soft evidence can be modeled as a noisy sensor. Whenever soft evidence is reported to a BN node, a dummy node (EC as example) is added to represent the output of the sensor, and the link between the physical node and the dummy node characterizes the confidence of the sensor measurement. The soft evidence is represented as a contingency matrix, with elements that are function of the probability of detection and the probability of false alarm. Without loss of generality, we assume that the sensors have symmetric performance, that is, PD + PF = 1. This assumption is identical to the idea of normalizing the conditional probabilities in the parlance of BNs. Thus, given the parameters listed in Fig. 17, we update the belief of node C as follows: Q(C = 1) = P (C = 1|EC = 1) P (EC = 1|C = 1)P (C = 1) = P (EC = 1|C = 1)P (C = 1) + P (EC = 1|C = 0)P (C = 0) ≈ 0.937 (21) The prior probability distribution of node C is the probabilistic belief before the new evidence arrives, viz., P (C = 1) = 0.624 and P (C = 0) = 0.376. We can see that the belief updating trades off the prior knowledge and the new “dummy” observation from the soft evidence. 13 A=1 A=0 0.2 H1 H0 observe observe 1 0 PD 1-PD PF 1-PF 0.8 C 1 0 EC=1 EC=0 0.9 0.1 C 0.1 0.9 EC Fig. 17. B=1 B=0 0.9 0.1 B A A B C=1 C=0 1 1 0 0 1 0 1 0 0.9 0.7 0.6 0.2 0.1 0.3 0.4 0.8 Example for belief updating with soft evidence. R EFERENCES [1] R. Popp, T. Armour, T. Senator, and K. Numrych, “Countering terrorism through information technology,” Communications of ACM, vol. 47, no. 3, pp. 36–43, March 2004. [2] S. Singh, J. Allanach, H. Tu, K. R. Pattipati, and P. Willett, “Stochastic modeling of a terrorist event via the ASAM system,” in IEEE International Conference on SMC, The Hague, The Netherlands, October 10-13 2004. [3] D. Niedermayer. An introduction to Bayesian networks and their contemporary applications. [Online]. Available: http://www.niedermayer.ca/papers/bayesian/bayes.html [4] J. Ying, T. Kirubarajan, K. R. Pattipati, and A. Patterson-Hine, “A hidden Markov model-based algorithm for fault diagnosis with partial and imperfect tests,” IEEE Transactions on Systems, Man, and Cybernetics - Part C: Applications and Reviews, vol. 30, no. 4, pp. 463–473, November 2000. [5] B. Chen and P. Willett, “Detection of hidden Markov model transient signals,” IEEE Transactions on Aerospace and Electronic systems, vol. 36, no. 4, pp. 1253–1268, December 2000. [6] ——, “Superimposed HMM transient detection via target tracking ideas,” IEEE Transactions on Aerospace and Electronic systems, vol. 37, no. 3, pp. 946–956, July 2001. [7] L. R. Rabiner and B. H. Juang, “An introduction to hidden Markov models,” IEEE ASSP Magazine, pp. 4–16, January 1986. [8] L. R. 
REFERENCES

[1] R. Popp, T. Armour, T. Senator, and K. Numrych, "Countering terrorism through information technology," Communications of the ACM, vol. 47, no. 3, pp. 36-43, March 2004.
[2] S. Singh, J. Allanach, H. Tu, K. R. Pattipati, and P. Willett, "Stochastic modeling of a terrorist event via the ASAM system," in IEEE International Conference on Systems, Man and Cybernetics, The Hague, The Netherlands, October 10-13, 2004.
[3] D. Niedermayer, "An introduction to Bayesian networks and their contemporary applications." [Online]. Available: http://www.niedermayer.ca/papers/bayesian/bayes.html
[4] J. Ying, T. Kirubarajan, K. R. Pattipati, and A. Patterson-Hine, "A hidden Markov model-based algorithm for fault diagnosis with partial and imperfect tests," IEEE Transactions on Systems, Man, and Cybernetics - Part C: Applications and Reviews, vol. 30, no. 4, pp. 463-473, November 2000.
[5] B. Chen and P. Willett, "Detection of hidden Markov model transient signals," IEEE Transactions on Aerospace and Electronic Systems, vol. 36, no. 4, pp. 1253-1268, December 2000.
[6] B. Chen and P. Willett, "Superimposed HMM transient detection via target tracking ideas," IEEE Transactions on Aerospace and Electronic Systems, vol. 37, no. 3, pp. 946-956, July 2001.
[7] L. R. Rabiner and B. H. Juang, "An introduction to hidden Markov models," IEEE ASSP Magazine, pp. 4-16, January 1986.
[8] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, February 1989.
[9] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann, 1988.
[10] M. I. Jordan, Learning in Graphical Models. MIT Press, 1999.
[11] F. V. Jensen, An Introduction to Bayesian Networks. London: UCL Press, 1996.
[12] R. G. Cowell, A. P. Dawid, S. L. Lauritzen, and D. J. Spiegelhalter, Probabilistic Networks and Expert Systems. Springer-Verlag, 1999.
[13] S. Fine, Y. Singer, and N. Tishby, "The hierarchical hidden Markov model: Analysis and applications," Machine Learning, vol. 32, no. 1, pp. 41-62, July 1998.
[14] M. I. Jordan, Z. Ghahramani, and L. K. Saul, "Hidden Markov decision trees," in Advances in Neural Information Processing Systems, M. C. Mozer, M. I. Jordan, and T. Petsche, Eds. The MIT Press, 1997.
[15] E. Gyftodimos and P. Flach, "Hierarchical Bayesian networks: a probabilistic reasoning model for structured domains," in Proceedings of the ICML-2002 Workshop on Development of Representations, E. D. Jong and T. Oates, Eds. The University of New South Wales, July 2002, pp. 23-30.
[16] S. Nakamura and K. Markov, "A hybrid HMM/Bayesian network approach to robust speech recognition," in Proceedings of the Special Workshop in MAUI (SWIM), Maui, Hawaii, January 12-14, 2004.
[17] T. Coffman and S. Marcus, "Dynamic classification of groups through social network analysis and HMMs," in IEEE Aerospace Conference, Big Sky, Montana, March 2004.
[18] L. Hudson, B. Ware, K. Laskey, and S. Mahoney, "An application of Bayesian networks to antiterrorism risk management for military planners," Department of Systems Engineering and Operations Research, George Mason University, Tech. Rep., 2001.
[19] E. Paté-Cornell and S. Guikema, "Probabilistic modeling of terrorist threats: a systems analysis approach to setting priorities among countermeasures," Military Operations Research, vol. 7, no. 4, pp. 5-20, December 2002.
[20] D. Heckerman and J. S. Breese, "Causal independence for probability assessment and inference using Bayesian networks," IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 26, no. 6, pp. 826-831, November 1996.
[21] H. Tu, J. Levchuk, and K. R. Pattipati, "Robust action strategies to induce desired effects," IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 34, no. 5, pp. 664-680, September 2004.
[22] C. Huang and A. Darwiche, "Inference in belief networks: a procedural guide," International Journal of Approximate Reasoning, vol. 15, no. 3, pp. 225-263, 1996.
[23] L. Baum, T. Petrie, G. Soules, and N. Weiss, "A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains," Annals of Mathematical Statistics, vol. 41, no. 1, pp. 164-171, 1970.
[24] C. Zhai, "A brief note on the hidden Markov models," 2003. [Online]. Available: http://sifaka.cs.uiuc.edu/course/397cxz03f/hmm.pdf
[25] J. Allanach, H. Tu, S. Singh, P. Willett, and K. R. Pattipati, "Detecting, tracking, and counteracting terrorist networks via hidden Markov models," in IEEE Aerospace Conference, Big Sky, Montana, March 2004.
[26] I. Rish and M. Singh, "A tutorial on inference and learning in Bayesian networks." [Online]. Available: http://www.research.ibm.com/people/r/rish/talks/BN-tutorial.ppt
[27] E. Page, "Continuous inspection schemes," Biometrika, vol. 41, pp. 100-115, 1954.
[28] S. Blackman and R. Popoli, Design and Analysis of Modern Tracking Systems. Artech House, 1999.
[29] Z. Ghahramani and M. I. Jordan, "Factorial hidden Markov models," Machine Learning, vol. 29, no. 2-3, pp. 245-273, November/December 1997.
[30] GeNIe/SMILE, Decision Systems Laboratory, University of Pittsburgh. [Online]. Available: http://www.sis.pitt.edu/~genie
[31] TEAMS®. [Online]. Available: http://www.teamqsi.com
[32] S. Singh, H. Tu, J. Allanach, J. Areta, P. Willett, and K. R. Pattipati, "Modeling threats," IEEE Potentials, pp. 18-21, August/September 2004.
[33] Hijacking of Indian Airlines Flight IC-814. [Online]. Available: http://www.indianembassy.org/archive/IC 814.htm
[34] Frontline Magazine, India, vol. 17, no. 2, January-February 2000. [Online]. Available: http://www.frontlineonnet.com/fl1702/17020040.htm
[35] H. Tu, J. Allanach, S. Singh, K. R. Pattipati, and P. Willett, "The adaptive safety analysis and monitoring system," in SPIE Defense and Security Symposium, Orlando, April 12-16, 2004, pp. 153-165.
[36] S. Levinson, L. Rabiner, and M. Sondhi, "An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition," Bell System Technical Journal, vol. 62, pp. 1035-1074, 1983.
[37] Y. Bar-Shalom, T. Kirubarajan, and C. Gokberk, "Tracking with classification-aided multiframe data association," IEEE Transactions on Aerospace and Electronic Systems, vol. 41, no. 4, October 2005.
[38] H. Guo and W. Hsu, "A survey on algorithms for real-time Bayesian network inference," in the Joint AAAI-02/KDD-02/UAI-02 Workshop on Real-Time Decision Support and Diagnosis Systems, Edmonton, Alberta, Canada, 2002. [Online]. Available: citeseer.ist.psu.edu/guo02survey.html

Haiying Tu received the BS degree in automatic control from Shanghai Institute of Railway Technology in 1993 and the MS degree in transportation information engineering and control from Shanghai Tiedao University in 1996. She is currently a Ph.D. student in Electrical and Computer Engineering at the University of Connecticut (UCONN). Prior to joining UCONN, she was a lecturer at Tongji University in Shanghai, China, and also worked at the Computer Interlocking System Testing Center of the Ministry of Railways of China. Her current research interests include organizational design, Bayesian analysis, fault diagnosis and decision making.

Jeffrey Allanach is a systems engineer at Applied Physical Sciences (APS) in New London, CT. Prior to joining APS, he was a graduate student in Electrical and Computer Engineering at the University of Connecticut (UConn). He received his MS in May 2005 and BS in December 2003, both from UConn. His current research interests include signal processing and target tracking.

Satnam Singh is a PhD student at the Systems Optimization Laboratory, University of Connecticut. He received his MS degree in Electrical Engineering from the University of Wyoming. His interests are in signal processing, communication and optimization.

Krishna Pattipati is a Professor of Electrical and Computer Engineering at the University of Connecticut, Storrs, CT, USA. His research has been primarily in the application of systems theory and optimization techniques to complex systems. Prof. Pattipati received the Centennial Key to the Future award in 1984 from the IEEE Systems, Man and Cybernetics (SMC) Society, and was elected a Fellow of the IEEE in 1995.
He received the Andrew P. Sage Award for the Best SMC Transactions Paper for 1999, the Barry Carlton Award for the Best AES Transactions Paper for 2000, the 2002 NASA Space Act Award, and the 2003 AAUP Research Excellence Award at the University of Connecticut. He also won best technical paper awards at the 1985, 1990, 1994, 2002 and 2004 IEEE AUTOTEST Conferences, and at the 1997 and 2004 Command and Control Conferences. Prof. Pattipati served as Editor-in-Chief of the IEEE Transactions on SMC-Cybernetics (Part B) during 1998-2001.

Peter Willett is a Professor of Electrical and Computer Engineering at the University of Connecticut. Previously he was at the University of Toronto, from which he received his BS in 1982, and at Princeton University, from which he received his PhD in 1986. He has written, among other topics, about the processing of signals from volumetric arrays, decentralized detection, information theory, CDMA, learning from data, target tracking, and transient detection. He is a Fellow of the IEEE, a member of the Board of Governors of the IEEE AES Society, and a member of the IEEE Signal Processing Society's SAM technical committee. He is an associate editor both for the IEEE Transactions on Aerospace and Electronic Systems and for the IEEE Transactions on Systems, Man, and Cybernetics. He was a track organizer for Remote Sensing at the IEEE Aerospace Conference (2001-2003), and was co-chair of the Diagnostics, Prognosis, and System Health Management SPIE Conference in Orlando. He also served as Program Co-Chair for the 2003 IEEE Systems, Man and Cybernetics Conference in Washington, DC.