Techniques and Challenges for Temporal Event Tracking

Heng Ji
Computer Science Department
Queens College and the Graduate Center
City University of New York
hengji@cs.qc.cuny.edu
June 1, 2010
Outline
 A Cross-document IE Task
 Methods in Global Time Discovery
 Remaining Challenges
 Chinese-specific Challenges
2
Traditional Single-document IE
Barry Diller on Wednesday quit as chief of Vivendi Universal Entertainment.
Trigger: Quit (a “Personnel/End-Position” event)
Arguments:
    Role = Person: Barry Diller
    Role = Organization: Vivendi Universal Entertainment
    Role = Position: Chief
    Role = Time-within: Wednesday (2003-03-04)
3
Limitations of Single-document IE
 Various events are evolving, updated, repeated and corrected in different documents
 Most current IE analyzes single documents in isolation
 Net result is a set of facts which are
 Unconnected: Related events (e.g. “Tony Blair’s foreign trips”) appear unconnected and unordered; 13,480 event arguments for 337 articles per day in the TDT5 corpus
 Unranked: all events are considered equally important
 Redundant: many events are frequently repeated in different documents
 Erroneous and Incomplete (‘performance ceiling’): ACE event extraction systems barely exceeded 50% F-score on argument labeling; more than 50% of event instances don’t include explicit time arguments
4
A New Cross-document IE Task
…
Centroid = “Toefting”, Rank = 26
Time         Event          Arguments
2002-01-01   Attack         Person = Toefting; Place = Copenhagen; Target = workers
2003-03-15   End-Position   Person = Toefting; Entity = Bolton
2003-03-31   Sentence       Defendant = Toefting; Sentence = four months in prison; Crime = assault
…
Input: A test set of documents
Output: Identify a set of centroid entities, and then for each
centroid entity, link and order the events centered around it
on a time line
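The task output can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: the `Event` class and the `build_chain` helper are hypothetical names, and centroid matching is reduced to exact string equality on argument values, which is much simpler than the linking the task calls for.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Event:
    """One aggregated event: a type, a normalized time, and role -> argument string."""
    event_type: str
    time: date
    arguments: Dict[str, str]


def build_chain(centroid: str, events: List[Event]) -> List[Event]:
    """Keep the events that involve the centroid entity and order them by time."""
    linked = [e for e in events if centroid in e.arguments.values()]
    return sorted(linked, key=lambda e: e.time)


# The three events of the "Toefting" example, given in arbitrary order.
events = [
    Event("Sentence", date(2003, 3, 31),
          {"Defendant": "Toefting", "Sentence": "four months in prison", "Crime": "assault"}),
    Event("Attack", date(2002, 1, 1),
          {"Person": "Toefting", "Place": "Copenhagen", "Target": "workers"}),
    Event("End-Position", date(2003, 3, 15),
          {"Person": "Toefting", "Entity": "Bolton"}),
]

chain = build_chain("Toefting", events)
for e in chain:
    print(e.time.isoformat(), e.event_type, e.arguments)
```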
5
Evaluation Metrics
 Browsing Cost: Incorporate Novelty/Diversity into F-Measure
 An argument is correctly extracted in an event chain if its event type, string and role match any of the reference argument mentions
 Two arguments in an event chain are redundant if their event types, event time, string (the full or partial name) and roles overlap
 Browsing Cost (i) = the number of incorrect or redundant event arguments that a user must examine before finding i correct event arguments
 Temporal Correlation: Measure Coherence
 Temporal Correlation = the correlation of the temporal order of argset in the system output and the answer key
 Argument recall = number of unique and correct arguments in response / number of unique arguments in key
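The two counting metrics above can be sketched in a few lines. This is a simplified reading, not the official scorer: arguments are modeled as hypothetical `(event_type, role, string)` tuples, correctness is exact tuple membership in the key, and redundancy is an exact repeat, whereas the slide's definition also allows partial-name and event-time overlap.

```python
from typing import Hashable, Iterable, Optional, Set


def browsing_cost(ranked_args: Iterable[Hashable], key: Set[Hashable], i: int) -> Optional[int]:
    """Count incorrect or redundant arguments examined before the i-th correct one.

    Returns None when the output contains fewer than i correct arguments.
    """
    seen = set()
    cost = 0
    found = 0
    for arg in ranked_args:
        if arg in key and arg not in seen:
            seen.add(arg)
            found += 1
            if found == i:
                return cost
        else:
            cost += 1  # incorrect, or a repeat of an argument already found
    return None


def argument_recall(response_args: Iterable[Hashable], key: Set[Hashable]) -> float:
    """Unique correct arguments in the response divided by unique arguments in the key."""
    return len(set(response_args) & key) / len(key)


# Hypothetical system output for one event chain.
A = ("Attack", "Place", "Copenhagen")
B = ("Attack", "Target", "workers")
C = ("Sentence", "Crime", "assault")
X = ("Attack", "Place", "Bolton")  # an incorrect argument

key = {A, B, C}
ranked = [A, X, A, B]

print(browsing_cost(ranked, key, 1))
print(browsing_cost(ranked, key, 2))
print(argument_recall(ranked, key))
```

With this output, the second correct argument is reached only after one incorrect and one redundant argument have been examined.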
6
A Bit Different from What I Learned from This Workshop
 We also like event arguments
 We want to recover implicit event time arguments
 Ideally we hope to conduct cross-doc inference
 Event chains center around centroid entities, with links at the event-mention level instead of the sentence level (main verb)
7
A Cross-document IE System
[Figure: system architecture (Ji et al., RANLP 2009)]
Input: Test docs
Background Data: Wikipedia, Related docs
Components: Single-doc IE (produces Unconnected Events); Cross-doc Argument Refinement; Centroid Entity Detection; Global Time Discovery; Cross-doc Event Selection & Temporal Linking; Cross-doc Event Coreference
Output: Ranked Temporal Event Chains
8
What’s New: Research Challenges Overview
 More Salient: Detecting centroid entities using global
confidence
 More Accurate and Complete: Correcting and
enriching arguments from the background data
 More Concise: Conducting cross-document event
coreference resolution to remove redundancy
9
Why Detect Event Time?
 It’s important to many NLP applications
 Textual inference (Baral et al., 2005)
 Multi-document text summarization (e.g. Barzilay et al., 2002)
 Temporal event tracking (e.g. Bethard et al., 2007; 2008; Chambers et al., 2009; Ji and Chen, 2009)
 Template based question answering (Ahn et al., 2006)
 It’s challenging because about half of the event instances don’t include explicit time arguments
 Prior work on detecting implicit time arguments
 Filatova and Hovy, 2001; Mani et al., 2003; Lapata and Lascarides, 2006; Eidelman, 2008
 Most work focused on sentence level
 Linguistic evidence such as verb tense was used for inference
 Our Focus
 More fine-grained events
 An event mention and all of its coreferential event mentions do not include any explicit or implicit time expressions
10
Observations about Events in News
 Based on series of events
 Various situations are evolving, updated, repeated and corrected in different event mentions
 Events occur as chains
 Conflict → Life-Die/Life-Injure
 Justice-Convict → Justice-Charge-Indict/Justice-Trial-Hearing
 Writer won’t mention time repeatedly
 To avoid redundancy, writers rarely provide time arguments for all of the related events
 Reader is expected to use inference
 “On Aug 4 there is fantastic food in Suntec… Millions of people came to attend the IE session.” → the IE session is on Aug 4
11
Solution 1: Background Knowledge Reasoning
 Time Search from Related Documents
[Test Sentence]
<entity>Al-Douri</entity> said in the <entity>AP</entity> interview he would love to return to teaching but for now he plans to remain at the United Nations.
[Sentences from Related Documents]
In an interview with <entity>The Associated Press</entity> <time>Wednesday</time> night, <entity>Al-Douri</entity> said he will continue to work at the United Nations and had no intention of defecting.
 Time Search from Wikipedia
[Test Sentence]
<person>Diller</person> started his entertainment career at <entity>ABC</entity>, where he is credited with creating the ``movie of the week'' concept.
[Sentences from Wikipedia]
<person>Diller</person> was hired by <entity>ABC</entity> in <time>1966</time> and was soon placed in charge of negotiating broadcast rights to feature films.
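One way to picture the time search is as an entity-overlap lookup over background sentences. The sketch below is a loose illustration, not the system's actual retrieval method: `search_time`, the dictionary layout and the `min_shared` threshold are hypothetical, entity coreference (e.g. "AP" and "The Associated Press") is assumed to be already resolved to one canonical name, and the second background sentence is invented for contrast.

```python
from typing import Dict, List, Optional, Set


def search_time(test_entities: Set[str], background: List[Dict], min_shared: int = 2) -> Optional[str]:
    """Return the time of the background sentence sharing the most entities with the test sentence.

    Only sentences that carry a time expression and share at least
    min_shared entities are considered; None means no time was found.
    """
    best_time, best_overlap = None, 0
    for sent in background:
        if sent["time"] is None:
            continue
        overlap = len(test_entities & sent["entities"])
        if overlap >= min_shared and overlap > best_overlap:
            best_time, best_overlap = sent["time"], overlap
    return best_time


# Test sentence: Al-Douri's AP interview, which has no time argument of its own.
test_entities = {"Al-Douri", "Associated Press"}

background = [
    # From the related-document example above.
    {"entities": {"Al-Douri", "Associated Press"}, "time": "Wednesday"},
    # Hypothetical distractor: mentions Al-Douri only, so it falls below the threshold.
    {"entities": {"Al-Douri"}, "time": "Monday"},
]

print(search_time(test_entities, background))
```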
12
Solution 2: Time Propagation between Events
Event Mention with time [Sunday]
Injured Russian diplomats and a convoy of America's Kurdish comrades in arms were among unintended victims caught in crossfire and friendly fire Sunday.
Event Mention without time
Kurds said 18 of their own died in the mistaken U.S. air strike.

Event Mention with time [Saturday]
A state security court suspended a newspaper critical of the government Saturday after convicting it of publishing religiously inflammatory material.
Event Mention without time
The sentence was the latest in a series of state actions against the Monitor, the only English language daily in Sudan and a leading critic of conditions in the south of the country, where a civil war has been waged for 20 years.
(Gupta and Ji, ACL 2009)
13
Rule-based Prediction
 Same-Sentence Propagation
 EMi and EMj are in the same sentence and only one time expression exists in the sentence
 Relevant-Type Propagation
 typei = “Conflict”, typej = “Life-Die/Life-Injure”
 argi is coreferential with argj
 rolei = “Target” and rolej = “Victim”, or rolei = rolej = “Instrument”
 Same-Type Propagation
 argi is coreferential with argj, typei = typej, rolei = rolej, and match time-cue roles

Time-cue roles by event type:
Typei                                   Rolei
Conflict                                Target/Attacker/Crime
Justice                                 Defendant/Crime/Plaintiff
Life-Die/Life-Injure                    Victim
Life-Be-Born/Life-Marry/Life-Divorce    Person/Entity
Movement-Transport                      Destination/Origin
Transaction                             Buyer/Seller/Giver/Recipient
Contact                                 Person/Entity
Personnel                               Person/Entity
Business                                Organization/Entity
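The three rules can be written down as one predicate over an event mention pair. The following is a minimal sketch under several assumptions: `EventMention` and `should_propagate` are hypothetical names, coreferential arguments are assumed to share a cluster id produced by an upstream coreference step, combined table rows such as “Life-Die/Life-Injure” are split into separate keys, and the role assignments in the examples are illustrative rather than taken from annotated data.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Time-cue roles per event type, following the table above.
TIME_CUE_ROLES = {
    "Conflict": {"Target", "Attacker", "Crime"},
    "Justice": {"Defendant", "Crime", "Plaintiff"},
    "Life-Die": {"Victim"},
    "Life-Injure": {"Victim"},
    "Life-Be-Born": {"Person", "Entity"},
    "Life-Marry": {"Person", "Entity"},
    "Life-Divorce": {"Person", "Entity"},
    "Movement-Transport": {"Destination", "Origin"},
    "Transaction": {"Buyer", "Seller", "Giver", "Recipient"},
    "Contact": {"Person", "Entity"},
    "Personnel": {"Person", "Entity"},
    "Business": {"Organization", "Entity"},
}


@dataclass
class EventMention:
    etype: str
    sentence_id: int
    args: Dict[str, str]          # role -> entity cluster id
    time: Optional[str] = None


def coref_role_pairs(em_i, em_j):
    """Role pairs (role_i, role_j) whose arguments belong to the same entity cluster."""
    return [(ri, rj) for ri, ei in em_i.args.items()
            for rj, ej in em_j.args.items() if ei == ej]


def should_propagate(em_i, em_j, times_in_sentence):
    """Decide whether the time of em_i should be propagated to the time-less em_j."""
    if em_i.time is None or em_j.time is not None:
        return False
    # Same-sentence rule: same sentence with exactly one time expression.
    if (em_i.sentence_id == em_j.sentence_id
            and times_in_sentence.get(em_i.sentence_id, 0) == 1):
        return True
    pairs = coref_role_pairs(em_i, em_j)
    # Relevant-type rule: a Conflict event and the resulting Life-Die/Life-Injure event.
    if em_i.etype == "Conflict" and em_j.etype in {"Life-Die", "Life-Injure"}:
        if any((ri == "Target" and rj == "Victim") or (ri == rj == "Instrument")
               for ri, rj in pairs):
            return True
    # Same-type rule: same type, and a coreferential argument in the same time-cue role.
    if em_i.etype == em_j.etype:
        cues = TIME_CUE_ROLES.get(em_i.etype, set())
        if any(ri == rj and ri in cues for ri, rj in pairs):
            return True
    return False


attack = EventMention("Conflict", 1, {"Target": "kurds"}, time="Sunday")
died = EventMention("Life-Die", 2, {"Victim": "kurds"})
suspended = EventMention("Justice", 3, {"Defendant": "monitor"}, time="Saturday")
sentenced = EventMention("Justice", 4, {"Defendant": "monitor"})
counts = {1: 1, 2: 0, 3: 1, 4: 0}  # time expressions per sentence

print(should_propagate(attack, died, counts))          # relevant-type rule
print(should_propagate(suspended, sentenced, counts))  # same-type rule
print(should_propagate(attack, sentenced, counts))     # no rule applies
```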
14
Statistical Learning based Prediction
 Maximum Entropy based model for propagate/non-propagate classification of any event mention pair <EMi, EMj>
 Features
 Same Sentence: whether EMi and EMj are located in the same sentence or not
 Number of Time Arguments: if EMi and EMj are in the same sentence, the number of time arguments in that sentence
 Time-Cue Argument Role Matching: whether the time-cue role types in EMi and EMj match or not
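The three features could be extracted roughly as follows. This is a sketch with assumptions flagged: the dictionary layout, `pair_features` and `propagate_probability` are hypothetical, role matching is read as "the two mentions share at least one time-cue role", and the weights are placeholders rather than trained values. For a binary decision, a MaxEnt model has the logistic form used in `propagate_probability`.

```python
import math

# Excerpt of the time-cue role table; split keys are an assumption.
TIME_CUE_ROLES = {
    "Conflict": {"Target", "Attacker", "Crime"},
    "Justice": {"Defendant", "Crime", "Plaintiff"},
    "Life-Die": {"Victim"},
}


def pair_features(em_i, em_j, times_in_sentence):
    """Build the three features for an event mention pair <EMi, EMj>."""
    same_sentence = em_i["sentence_id"] == em_j["sentence_id"]
    # The time-argument count is only defined when both mentions share a sentence.
    num_time_args = times_in_sentence.get(em_i["sentence_id"], 0) if same_sentence else 0
    cues_i = set(em_i["roles"]) & TIME_CUE_ROLES.get(em_i["type"], set())
    cues_j = set(em_j["roles"]) & TIME_CUE_ROLES.get(em_j["type"], set())
    return {
        "same_sentence": float(same_sentence),
        "num_time_args": float(num_time_args),
        "time_cue_role_match": float(bool(cues_i & cues_j)),
    }


def propagate_probability(features, weights, bias=0.0):
    """Binary MaxEnt (logistic) probability of the 'propagate' class."""
    score = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


em_i = {"type": "Justice", "sentence_id": 3, "roles": ["Defendant"]}
em_j = {"type": "Justice", "sentence_id": 4, "roles": ["Defendant"]}
features = pair_features(em_i, em_j, {3: 1, 4: 0})

# Placeholder weights for illustration only; real weights come from training.
weights = {"same_sentence": 1.0, "num_time_args": -0.5, "time_cue_role_match": 2.0}
probability = propagate_probability(features, weights)
print(features, probability)
```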
15
Cross-document Event Coreference Resolution
1. An explosion in a cafe at one of the capital's busiest intersections killed one woman and injured another Tuesday
2. Police were investigating the cause of the explosion in the restroom of the multistory Crocodile Cafe in the commercial district of Kizilay during the morning rush hour
3. The blast shattered walls and windows in the building
4. Ankara police chief Ercument Yilmaz visited the site of the morning blast
5. The explosion comes a month after
6. a bomb exploded at a McDonald's restaurant in Istanbul, causing damage but no injuries
7. Radical leftist, Kurdish and Islamic groups are active in the country and have carried out the bombing in the past
(Chen and Ji, 2009)
16
Method 1: Spectral Graph Clustering
Trigger      Arguments
explosion    Role = Place: a cafe; Role = Time: Tuesday
explosion    Role = Place: restroom; Role = Time: morning rush hour
explosion    Role = Place: building
blast        Role = Place: site; Role = Time: morning
explosion    Role = Time: a month after
exploded     Role = Place: restaurant
bombing      Role = Attacker: groups
17
Spectral Graph Clustering
[Figure: weighted graph partitioned into two clusters, A and B; edge weights range from 0.1 to 0.9]
cut(A,B) = 0.1 + 0.2 + 0.2 + 0.3 = 0.8
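The cut value is the total weight of the edges that cross the partition. The sketch below recomputes it on a small graph; the node names and the graph topology are hypothetical, since the figure's layout is not recoverable, and only the four crossing weights (0.1, 0.2, 0.2, 0.3) are taken from the slide.

```python
def cut(weights, cluster_a, cluster_b):
    """Sum the weights of edges with one endpoint in each cluster."""
    return sum(
        w for (u, v), w in weights.items()
        if (u in cluster_a and v in cluster_b) or (u in cluster_b and v in cluster_a)
    )


# Undirected weighted edges; within-cluster similarities are high, crossing ones low.
weights = {
    ("a1", "a2"): 0.8, ("a2", "a3"): 0.9, ("a1", "a3"): 0.7,
    ("b1", "b2"): 0.8, ("b2", "b3"): 0.9, ("b1", "b3"): 0.6,
    ("a1", "b1"): 0.1, ("a2", "b1"): 0.2, ("a3", "b2"): 0.2, ("a3", "b3"): 0.3,
}
cluster_a = {"a1", "a2", "a3"}
cluster_b = {"b1", "b2", "b3"}

cut_value = cut(weights, cluster_a, cluster_b)
print(round(cut_value, 2))
```

A graph clustering method of this kind looks for a partition whose crossing weight is small relative to the weight kept inside each cluster.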
18
Automatically Detect Event Attributes
 Modality
Expressing degrees of possibility, belief, evidentiality, expectation, attempting, and command (Sauri et al., 2006). An event is ASSERTED when the author or speaker makes reference to it as though it were a real occurrence; all other events are annotated as OTHER
 Polarity
Polarity has a value of NEGATIVE if an event did not occur; otherwise, it has a value of POSITIVE
 Genericity
Genericity has a value of SPECIFIC if an event is a singular occurrence at a particular place and time; otherwise, it has a value of GENERIC
 Tense
Tense is determined with respect to the speaker or author. Possible values: PAST, FUTURE, PRESENT, and UNSPECIFIED
 6/11/10
eETTs 2009
19
Event Attribute Disagreement Examples
Attribute: Modality
    Other: Toyota Motor Corp. said Tuesday it will promote Akio Toyoda, a grandson of the company's founder who is widely viewed as a candidate to some day head Japan's largest automaker.
    Asserted: Managing director Toyoda, 46, grandson of Kiichiro Toyoda and the eldest son of Toyota honorary chairman Shoichiro Toyoda, became one of 14 senior managing directors under a streamlined management system set to be…
Attribute: Polarity
    Positive: At least 19 people were killed in the first blast
    Negative: There were no reports of deaths in the blast
Attribute: Genericity
    Specific: An explosion in a cafe at one of the capital's busiest intersections killed one woman and injured another Tuesday
    Generic: Roh has said any pre-emptive strike against the North's nuclear facilities could prove disastrous
Attribute: Tense
    Past: Israel holds the Palestinian leader responsible for the latest violence, even though the recent attacks were carried out by Islamic militants
    Future: We are warning Israel not to exploit this war against Iraq to carry out more attacks against the Palestinian people in the Gaza Strip and destroy the Palestinian Authority and the peace process.
20
Experiments: Data
 Test set: 106 newswire texts from the ACE 2005 training corpora
 Extracted the top 40 ranked person names as centroid entities, and manually created temporal event chains by
 Aggregating reference event mentions (inter-annotator agreement: ~90%)
 Filling in the implicit event time arguments from the background data (inter-annotator agreement: ~82%)
 Annotated by two annotators independently and adjudicated
 Background data: 278,108 texts from the English TDT5 corpus and 148 million sentences from Wikipedia
 140 events with 368 arguments (257 are unique)
 The top ranked centroid entities are “Bush”, “Ibrahim”, “Putin”, “Al-douri”, “Blair”, etc.
21
Browsing Cost
22
Temporal Correlation
Method                                        Temporal Correlation   Argument Recall
Baseline: ordered by event reporting time     3.71%                  27.63%
Method1: Single-document IE                   44.02%                 27.63%
Method2: 1 + Cross-doc Event Coreference      46.15%                 27.63%
Method3: 2 + Cross-doc Argument Refinement    55.73%                 30.74%
Method4: 3 + Global Time Discovery            70.09%                 33.07%
23
Time Propagation Experiments
 Data and Answer-Key Annotation
 Construct any pair of event mentions <EMi, EMj> as a candidate sample if EMi includes a time argument while EMj and its coreferential event mentions don’t include any time arguments; manually label “Propagate/Not-Propagate” for <EMi, EMj>
 47 ACE05 newswire texts for training (485 “Propagate” samples and 617 “Not-Propagate” samples) and blind test on 10 texts (212 samples)
 Results
Method                 P (%)    R (%)    F (%)
Rule-Based             70.40    74.06    72.18
Statistical Learning   72.48    50.94    59.83
 The most common correctly propagated pairs are
 Conflict-Attack → Life-Die/Life-Injure
 Justice-Convict → Justice-Sentence/Justice-Charge-Indict
 Movement-Transport → Contact-Meet
 Justice-Charge-Indict → Justice-Convict
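The F-scores in the results table are the harmonic mean of precision and recall, F = 2PR / (P + R). A short check, with `f_score` as an illustrative helper name:

```python
def f_score(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


results = [("Rule-Based", 70.40, 74.06), ("Statistical Learning", 72.48, 50.94)]
for name, p, r in results:
    print(name, round(f_score(p, r), 2))
```

The rule-based method wins on F because its recall is about 23 points higher, while the statistical model is only about 2 points more precise.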
24
Why Rule-based Prediction Performs Better
 Not enough training data to capture all the evidence from different time-cue roles
 Example: only one positive training sample matching the “Defendant” role (newspaper/Monitor):
Event Mention with time [Saturday]
A state security court suspended a newspaper critical of the government Saturday after convicting it of publishing religiously inflammatory material.
Event Mention without time
The sentence was the latest in a series of state actions against the Monitor, the only English language daily in Sudan and a leading critic of conditions in the south of the country, where a civil war has been waged for 20 years.
 Combining these two approaches in a self-training framework (adding the high-confidence results from rules as additional training data to re-train the MaxEnt classifier) did not provide further improvement
25
To Fix the Remaining Spurious Errors
 Incorporate distance, event reporting order, context event features and better entity coreference resolution
Event Mention with time [Monday]
American troops stormed a presidential palace and other key buildings in Baghdad as U.S. tanks rumbled into the heart of the battered Iraqi capital on Monday amid the thunder of gunfire and explosions…
Event Mention without time (time from [Monday]: √; time from [Saturday]: ?)
At the palace compound, Iraqis shot small arms fire from a clock tower, which the U.S. tanks quickly destroyed.
Event Mention with time [Saturday]
The first one was on Saturday and triggered intense gun battles, which according to some U.S. accounts, left at least 2,000 Iraqi fighters dead.
26
Remaining Challenges: Cross-document Discourse
Reasoning
 Query: When was Carol Shepp McCain acting as the wife of John
McCain?
 Answer: 1966-1980
 DOCID: LTW_ENG_20081007.0068.LDC2009T13
Carol Shepp McCain, then 42, had endured much in more than 14
years of marriage to John. She had raised their three young children
alone while her husband languished in a North Vietnamese prison camp
for 5 1/2 years
 DOCID: LTW_ENG_20081007.0068.LDC2009T13
Nine months earlier, at a cocktail reception in Hawaii, he met a
glamorous young heiress named Cindy Lou Hensley and, by all
accounts, fell instantly in love. According to public records, he and
Cindy received a marriage license in Maricopa County, Ariz., in early
March 1980, four weeks before his divorce from Carol was final.
27
Remaining Challenges: Paraphrase Discovery
 Query: During what period was R. Nicholas Burns a member of the U.S. State Department?
 Answer: 1995-2008
<DOCID> APW_ENG_19950112.0477.LDC2007T07 </DOCID>
R. Nicholas Burns, a career foreign service officer in charge of Russian affairs at
the National Security Council, is due to be named the new spokesman at the U.S.
State Department, a senior U.S. official said Thursday.
[APW_ENG_20070324.0924.LDC2009T13 and many other DOCS]
The United States is "very pleased by the strength of this resolution" after two years
of diplomacy, said R. Nicholas Burns, undersecretary for political affairs at the
State Department.
<DOCID> NYT_ENG_20080118.0161.LDC2009T13 </DOCID>
R. Nicholas Burns, the country's third-ranking diplomat and Secretary of State
Condoleezza Rice's right-hand man, is retiring for personal reasons, the State
Department said Friday.
<DOCID> NYT_ENG_20080302.0157.LDC2009T13 </DOCID>
The chief U.S. negotiator, R. Nicholas Burns, who left his job on Friday, countered
that the sanctions were all about Iran's refusal to stop enriching uranium, not about
weapons. But that argument was a tough sell.
28
Remaining Challenges: Unclear Boundaries
 With ending-date only:
 Sotheby's main shareholder and former chairman
 Vivendi Universal, the world's second-largest media group after AOL Time Warner of the United States, has been digging out from under a mountain of debt since the removal of expansionist boss Jean-Marie Messier last July, largely through asset sales
 Nathan divorced wallpaper salesman Bruce Nathan in 1992
 With vague starting-date only:
 the recently appointed Palestinian prime minister
 With time-duration only:
 Liana Owen drove 10 hours from Pennsylvania to attend the rally in Manhattan with her parents
 Hariri submitted his resignation during a 10-minute meeting with the head of state at the Baabda presidential palace, outside the capital
29
Chinese-specific Challenges
 Time Argument Associated with Un-defined Events
 贝克和他的研究小组在1990年代的初期到中期一共研究了103个人,这项最新的研究结论是在追踪这103人病例以后所获得的,研究人发现在这103人当中,婚姻不愉快的心脏的内壁都比较厚。
From the early to mid-1990s, Baker and his research team studied 103 people … the researchers found that, among these 103 people, those in unhappy marriages had thicker inner heart walls.
 检察官说,他们希望正式起诉英国和欧洲官员,因为1989年,他们在国内禁止使用被疑受污染的动物饲料,但却允许英国继续出口这种饲料
The prosecutor said they hope to formally charge British and European officials, because in 1989 they banned the domestic use of animal feed suspected of contamination but allowed Britain to continue exporting it.
 Time Argument Associated with Longer Distance Events
 第55届联大20号晚在巴以冲突紧急特别会议上以压倒性多数的表决结果通过决议
On the night of the 20th, the 55th UN General Assembly passed a resolution by an overwhelming majority at the emergency special session on the Palestinian-Israeli conflict…
30
Chinese-specific Challenges (Cont’)
 Reasoning across Multiple Time Arguments
 还有半个月是结婚一周年
The first wedding anniversary is half a month away
 昨天晚上香港中华总商会在香港会展中心举办成立100周年庆祝酒会
Last night the Chinese General Chamber of Commerce in Hong Kong held a banquet celebrating its 100th anniversary at the Hong Kong Convention and Exhibition Centre
 Reasoning across Multiple Events/Slots
 此次释放全部被捕军警的行动是在哥政府与游击队代表在哈瓦那经过一周多协商后由该游击队组织单方面决定的。
This release of all the captured soldiers and police was decided unilaterally by the guerrilla organization after more than a week of negotiation between the Colombian government and guerrilla representatives in Havana.
 摩托罗拉设计师梁丽娇(42岁)第一次出国公干,不料竟踏上不归路。
The Motorola designer Liang Lijiao (42) went abroad on business for the first time, and unexpectedly never returned.
31
Chinese-specific Challenges (Cont’)
 “Hidden” Tense
 新华社南京12月17日电(记者 赵明亮)我国家电行业大型企业
之一江苏苏宁电器集团昨天在南京宣布:未来三年内在全国建立1
500家综合电器连锁店,比现在的数量增加10倍,形成行业内
的“航空母舰”,以应对入世后跨国商业资本的<进入>。
…announced: within the next three years it will build 1,500 general appliance chain stores nationwide…
32
Related Work
 Event Tracking
 Discovering temporal event chains: TempEval (Verhagen et al., 2007); e.g. Bethard and Martin (2008), Chambers and Jurafsky (2008)
 Topic detection and tracking (Allan, 2002)
 Information Redundancy to Improve Extraction Accuracy
 Downey et al. (2005), Yangarber (2006), Mann (2007), Patwardhan and Riloff (2007; 2009)
 What’s New
 Extend the representation of each node in the linked chains from an event trigger/sentence to a structured aggregated event including fine-grained information such as event types, arguments and their roles
 Global argument correction and implicit time discovery: correct the original extracted facts and discover implicit time arguments using background knowledge
33
Conclusion
 Temporal Event Tracking is an important and challenging task
 Substantial improvement requires global reasoning and more fine-grained temporal annotation
 Let’s keep working on it. An advertisement:
 NIST Knowledge Base Population: http://nlp.cs.qc.cuny.edu/kbp/2010/
 New research focus in KBP 2011:
 Temporal KBP
 Cross-lingual (Chinese to English) KBP
Thank you
35
Evaluation Metrics
 Centroid Entity Detection
 F-Measure: A centroid entity is correctly detected if its name (and document id) matches the full or partial name of a reference centroid
 Normalized Kendall tau distance (Centroid entities) = the fraction of correct system centroid entity pairs out of salience order
 Centroid Entity Ranking Accuracy = 1 - Normalized Kendall tau distance (Centroids)
 Browsing Cost: Incorporate Novelty/Diversity into F-Measure
 An argument is correctly extracted in an event chain if its event type, string and role match any of the reference argument mentions
 Two arguments in an event chain are redundant if their event types, event time, string (the full or partial name) and roles overlap
 Browsing Cost (i) = the number of incorrect or redundant event arguments that a user must examine before finding i correct event arguments
 Temporal Correlation: Measure Coherence
 Temporal Correlation = the correlation of the temporal order of argset in the system output and the answer key
 Argument recall = number of unique and correct arguments in response / number of unique arguments in key
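The centroid ranking metric can be sketched as a pairwise count. This is an illustration under assumptions: `normalized_kendall_tau_distance` is a hypothetical helper, only centroids present in both rankings are compared, and the system ranking below is invented (the names come from the top-ranked centroids listed in the data slide).

```python
from itertools import combinations


def normalized_kendall_tau_distance(system_rank, reference_rank):
    """Fraction of correctly detected centroid pairs that the system orders differently from the key."""
    ref_pos = {name: k for k, name in enumerate(reference_rank)}
    common = [name for name in system_rank if name in ref_pos]
    pairs = list(combinations(common, 2))  # (a, b) with a ranked above b by the system
    if not pairs:
        return 0.0
    out_of_order = sum(1 for a, b in pairs if ref_pos[a] > ref_pos[b])
    return out_of_order / len(pairs)


reference_rank = ["Bush", "Ibrahim", "Putin", "Blair"]
system_rank = ["Bush", "Putin", "Ibrahim", "Blair"]  # hypothetical system output

distance = normalized_kendall_tau_distance(system_rank, reference_rank)
ranking_accuracy = 1 - distance
print(distance, ranking_accuracy)
```

Here one of the six centroid pairs (Putin, Ibrahim) is out of salience order, so the distance is 1/6 and the ranking accuracy is 5/6.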
36
Baseline Single-document IE System
 Includes entity extraction, time expression extraction and normalization, relation extraction and event extraction
 Event Extraction
 Pattern Matching
British and US forces reported gains in the advance on Baghdad → PER report gain in advance on LOC
 Maximum Entropy Models
 Trigger Labeling: to distinguish event instances from non-events, to classify event instances by type
 Argument Identification: to distinguish arguments from non-arguments
 Argument Classification: to classify arguments by argument role
 Reportable-Event Classifier: to determine whether there is a reportable event instance
 Each step produces local confidence
(Grishman et al., 2005)
37
Conclusion and Future Work
 Used propagation between related events to predict unknown time arguments, which was not possible using the traditional explicit time argument extraction techniques
 Compared two approaches and demonstrated that embarrassingly simple but smart knowledge engineering can perform better than supervised learning with small training corpora for some particular tasks
 Further work
 Applied to temporal event tracking (Ji et al., 2009) and improved the correlation score from 55.73% to 70.09%
 Future work
 Incorporate dynamic context features into MaxEnt
 Extend to cross-document IE, predict event time from related documents
 38