Topic Detection and Tracking: Event-based Information Organization

James Allan
Springer Science & Business Media, 28 February 2002 - 266 pages
The purpose of this book is to provide a record of the state of the art in Topic Detection and Tracking (TDT) in a single place. Research in TDT has been going on for about five years, and publications related to it are scattered all over the place as technical reports, unpublished manuscripts, or in numerous conference proceedings. The third and fourth in a series of on-going TDT evaluations marked a turning point in the research. As such, it provides an excellent time to pause, review the state of the art, gather lessons learned, and describe the open challenges. This book is a collection of technical papers. As such, its primary audience is researchers interested in the current state of TDT research, researchers who hope to leverage that work so that their own efforts can avoid pointless duplication and false starts. It might also point them in the direction of interesting unsolved problems within the area. The book is also of interest to practitioners in fields that are related to TDT, e.g., Information Retrieval, Automatic Speech Recognition, Machine Learning, Information Extraction, and so on. In those cases, TDT may provide a rich application domain for their own research, or it might address similar enough problems that some lessons learned can be tweaked slightly to answer, perhaps partially, ...
 

Contents

Introduction to Topic Detection and Tracking
2 TDT Tasks
3 History of TDT
4 TDT 1999 and TDT 2000
5 The Future of TDT

Topic Detection and Tracking Evaluation Overview
Stories, Events, and Topics
3 TDT Corpora
4 Evaluation Methodology
5 Task Definitions

Corpora for Topic Detection and Tracking
2 Overview of TDT Corpus Development
3 Collection of Raw Data
4 Transcription
5 Story Segmentation
6 Topic Definition
7 Topic Annotation
8 Corpus Formats
9 Some Properties of the Corpus
10 Conclusion

Probabilistic Approaches to Topic Detection and Tracking
2 Core TDT Technologies
3 Corpus Processing
5 Detection
6 Cross-lingual TDT
7 Conclusions and Future Work

Multi-strategy Learning for Topic Detection and Tracking: A Joint Report of CMU Approaches to Multilingual TDT
2 Segmentation
3 Topic and Event Tracking
4 Topic Detection
5 First Story Detection
6 Story Link Detection
7 Multilingual TDT
8 Concluding Remarks

Statistical Models of Topical Content
2 Models of Story Generation
3 Tracking Systems
4 Detection System
5 Summary

Segmentation and Detection at IBM: Hybrid Statistical Models and Two-tiered Clustering
2 Topic Detection
3 Acknowledgements

A Cluster-Based Approach to Broadcast News
2 Segmentation
3 Detection
4 Tracking
5 Acknowledgements

Signal Boosting for Translingual Topic Tracking: Document Expansion and n-best Translation
1 Introduction
2 The Signal-to-Noise Perspective
3 Topic Tracking System Architecture
4 Contrastive Conditions
5 Conclusions and Future Work

Explorations Within Topic Tracking and Detection
2 Basic System
3 Tracking
4 Cluster Detection
5 First Story Detection
7 Bounds on Effectiveness
8 Automatic Timeline Generation
9 Conclusions

Towards a Universal Dictionary for Multi-Language Information Retrieval Applications
2 Our TDT Tracking Algorithm
3 The Universal Dictionary Experiment
4 Conclusions and Directions for Future Work

An NLP & IR Approach to Topic Detection
2 General System Framework
3 Representation of News Stories and Topics
Method
5 Multilingual Topic Detection
6 Development Experiments
7 Evaluation
8 Discussion
9 Concluding Remarks and Future Works

Popular passages

Page iv - Kang Wu, Mohan S. Kankanhalli, Joo-Hwee Lim, Dezhong Hong; ISBN: 0-7923-7944-6 MINING THE WORLD WIDE WEB: An Information Search Approach, by George Chang, Marcus J. Healey, James A. M. McHugh, Jason T. L. Wang; ISBN: 0-7923-7349-9 INTEGRATED REGION-BASED IMAGE RETRIEVAL, by James Z. Wang; ISBN: 0-7923-7350-2 TOPIC DETECTION AND TRACKING: Event-based Information Organization, Language Modeling for Information Retrieval Edited by W.
Page iv - Lalmas, and Cornelis Joost van Rijsbergen; ISBN: 0-7923-8302-8 DOCUMENT COMPUTING: Technologies for Managing Electronic Document Collections, by Ross Wilkinson, Timothy Arnold-Moore, Michael Fuller, Ron Sacks-Davis, James Thom, and Justin Zobel; ISBN: 0-7923-8357-5 AUTOMATIC INDEXING AND ABSTRACTING OF DOCUMENT TEXTS, by Marie-Francine Moens; ISBN 0-7923-7793-1 ADVANCES IN INFORMATION RETRIEVAL: Recent Research from the Center for Intelligent...
