Beyond the Informedia digital video library: video and audio analysis for remembering conversations

IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. - pp. 296-300

A.G. Hauptmann¹, Wei-Hao Lin¹

¹School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA

Abstract

The Informedia Project digital video library pioneered the automatic analysis of television broadcast news and its retrieval on demand. Building on that system, we have developed a wearable, personalized Informedia system, which listens to and transcribes the wearer's part of a conversation, recognizes the face of the current dialog partner and remembers his/her voice. The next time the system sees the same person's face and hears the same voice, it can retrieve the audio from the last conversation, replaying in compressed form the names and major issues that were mentioned. All of this happens unobtrusively, somewhat like an intelligent assistant who whispers to you: "That's Bob Jones from Tech Solutions; two weeks ago in London you discussed solar panels". This paper outlines the general system components as well as interface considerations. Initial implementations showed that both face recognition methods and speaker identification technology have serious shortfalls that must be overcome.
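The retrieval step described above (recognize a face and a voice, then replay what was discussed last time) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the face and voice inputs are stand-in feature vectors rather than the output of a real face recognizer or speaker identification system, and the names `ConversationMemory`, `remember`, `recall`, and the 0.9 threshold are hypothetical. Requiring both modalities to match mirrors the abstract's "sees the same person's face and hears the same voice".

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Person:
    face: list    # hypothetical face feature vector
    voice: list   # hypothetical voice feature vector
    summaries: list = field(default_factory=list)


class ConversationMemory:
    """Toy store that links conversation summaries to a face/voice pair."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed value, chosen for illustration
        self.people = []

    def _match(self, face, voice):
        # A person matches only if BOTH face and voice are similar enough,
        # so the weaker of the two similarities is the deciding score.
        best, best_score = None, self.threshold
        for person in self.people:
            score = min(cosine(face, person.face), cosine(voice, person.voice))
            if score >= best_score:
                best, best_score = person, score
        return best

    def remember(self, face, voice, summary):
        """Store a summary, enrolling a new person if nobody matches."""
        person = self._match(face, voice)
        if person is None:
            person = Person(face, voice)
            self.people.append(person)
        person.summaries.append(summary)

    def recall(self, face, voice):
        """Return the most recent summary for a matching person, or None."""
        person = self._match(face, voice)
        return person.summaries[-1] if person else None
```

In this sketch, a later encounter whose feature vectors are close to an enrolled pair returns the last stored summary, while an encounter that matches on voice but not on face returns `None`. Taking the minimum of the two similarities is one simple fusion choice; the abstract notes that both recognition technologies had serious shortfalls in practice, so a real system would need something more robust.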

Keywords

#Software libraries #Digital video broadcasting #Digital audio broadcasting #Face recognition #Speech recognition #Space technology #Global Positioning System #Computer science #TV broadcasting #Automatic speech recognition

