Advances in storage technologies have enabled us to create large-scale online broadcast video archives. Among various programs, news shows provide important information in an unstructured form due to the nature of their contents. Thus it is critical to analyze the semantic structures (topics, threads, clusters, and so on) that lie in a news video archive in order to efficiently retrieve its contents. We, at NII, have been archiving a Japanese daily news show since March 2001. The archive now amounts to more than 1,300 hours of video, stored as approximately 750 GB of MPEG-1 video data and 60 MB of closed-caption text data. Grasping the contents of such a large amount of video data, and moreover the relations between news stories, exceeds human ability.
This talk will introduce and demonstrate the following
projects, which support retrieval and reuse of news video in a large-scale
news video archive, as solutions to the above-mentioned problem:
(1) mediaWalker -- Retrieval of news topic threads and an
interface based on the thread structures.
(2) mediaTraveller -- Cross-language retrieval of related news stories
by text and near-duplicate video segments.
(3) trackThem -- A news browsing interface based on social
relations between people in the news.