A real-time 3D video analyzer for enhanced 3D audio–visual systems
Springer Science and Business Media LLC - 2019
Sangoh Jeong, Hyun-Soo Kim, KyuWoon Kim, Byeong-Moon Jeon, Joong-Ho Won
With the recent advent of three-dimensional (3D) sound home theater systems (HTS), more and more TV viewers are experiencing rich, immersive auditory presence at home. In this paper, visual processing approaches are provided to make 3D audio–visual (AV) systems more realistic to the viewers. In the proposed system, a visual engine processes stereo video streams to extract a disparity map for each ...
Spatial attention-guided deformable fusion network for salient object detection
Springer Science and Business Media LLC - Volume 29 - Pages 2563-2573 - 2023
Aiping Yang, Yan Liu, Simeng Cheng, Jiale Cao, Zhong Ji, Yanwei Pang
Most of salient object detection methods employ U-shape architecture as the understructure. Although promising performance has been achieved, they struggle to detect salient objects with non-rigid shapes and arbitrary sizes. Besides, the features are transmitted to the decoder directly without any discrimination and active selection, resulting in prominent features underutilized. To address the ab...
Association rule mining for the usability of the CAPTCHA interfaces: a new study of multimedia systems
Springer Science and Business Media LLC - Volume 24 - Pages 625-644 - 2018
Darko Brodić, Alessia Amelio
This paper presents an analysis of the CAPTCHA interfaces in terms of their usability to Internet users. The usability is represented by the time needed to the users for finding a solution to the CAPTCHA, which is called response time. Specifically, the analysis is focused on four examples of text and image-based CAPTCHA. The aim is to study the cognitive factors influencing the Internet users in ...
The management and applications of teleaction objects
Springer Science and Business Media LLC - Volume 3 - Pages 204-216 - 1995
Hui-Jung Chang, Tai-Yuan Hou, Arding Hsu, Shi-Kuo Chang
Teleaction objects (TAOs) possess private knowledge specific to the object instances. The user can create and modify the private knowledge of a TAO, so that it automatically reacts to certain events to preperform operations for generating timely responses, improving operational efficiency and maintaining consistency. Moreover, TAOs also possess a hypergraph structure leading to the effective prese...
How does context influence music preferences: a user-based study of the effects of contextual information on users’ preferred music
Springer Science and Business Media LLC - Volume 27 Issue 2 - Pages 143-160 - 2021
Imen Ben Sassi, Sadok Ben Yahia
To simplify effective music filtering, recommender systems (RS) have received great attention from both industry and academia area. To select which music to recommend, traditional RS uses an approximation of users’ real interests. However, while discarding users’ contexts, profiles information is not able to reflect their exact needs and to provide overpowering recommendations. One of the main iss...
$\mathrm{DA}^2$Net: a dual attention-aware network for robust crowd counting
Springer Science and Business Media LLC - Volume 29 - Pages 3027-3040 - 2022
Wenzhe Zhai, Qilei Li, Ying Zhou, Xuesong Li, Jinfeng Pan, Guofeng Zou, Mingliang Gao
Crowd counting in congested scenes is a crucial yet challenging task in video surveillance and urban security system. The performance of crowd counting has been greatly boosted with the rapid development of deep learning. However, robust crowd counting in high-density environment with scale variations remains under-explored. To address this problem, we propose a dual attention-aware network (...
Audio-visual speech recognition integrating 3D lip information obtained from the Kinect
Springer Science and Business Media LLC - Volume 22 - Pages 315-323 - 2015
Jianrong Wang, Ju Zhang, Kiyoshi Honda, Jianguo Wei, Jianwu Dang
Audio-visual speech recognition (AVSR) has shown impressive improvements over audio-only speech recognition in the presence of acoustic noise. However, the problems of region-of-interest detection and feature extraction may influence the recognition performance due to the visual speech information obtained typically from planar video data. In this paper, we deviate from the traditional visual spe...