IEEE Transactions on Neural Networks
Featured scientific publications
Any reasonable cost function can be used for a posteriori probability approximation
IEEE Transactions on Neural Networks - Vol. 13, No. 5 - pp. 1204-1210 - 2002
In this paper, we provide a straightforward proof of an important, but nevertheless little known, result obtained by Lindley in the framework of subjective probability theory. This result, once interpreted in the machine learning/pattern recognition context, sheds new light on the probabilistic interpretation of the output of a trained classifier. A learning machine, or more generally a model, is usually trained by minimizing a criterion (the expectation of the cost function) measuring the discrepancy between the model output and the desired output. In this letter, we first show that, for the binary classification case, training the model with any "reasonable cost function" can lead to Bayesian a posteriori probability estimation. Indeed, after having trained the model by minimizing the criterion, there always exists a computable transformation that maps the output of the model to the Bayesian a posteriori probability of the class membership given the input. Then, necessary conditions allowing the computation of the transformation mapping the outputs of the model to the a posteriori probabilities are derived for the multioutput case. Finally, these theoretical results are illustrated through some simulation examples involving various cost functions.
#Cost function #Laboratories #Machine learning #Bayesian methods #Mean square error methods #Artificial intelligence #Computational modeling #Decision making #Artificial neural networks #Input variables
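As a hedged illustration of the binary-case claim in the abstract above (this is not the authors' code), the sketch below assumes a quartic cost L(y, t) = (y - t)^4, picked here only as an example of a non-standard cost. It minimizes the expected cost for a known class probability p, then recovers p from the minimizing output through the first-order condition p = L'(y, 0) / (L'(y, 0) - L'(y, 1)). All function names are hypothetical.

```python
def expected_cost(y, p):
    """Expected quartic cost when the target is 1 with probability p (assumed cost)."""
    return p * (y - 1.0) ** 4 + (1.0 - p) * y ** 4


def minimize_on_unit_interval(f, iters=200):
    """Ternary search for the minimizer of a convex function on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


def output_to_posterior(y):
    """Map the cost-minimizing output y back to the posterior probability.

    Uses the first-order condition p * L'(y, 1) + (1 - p) * L'(y, 0) = 0
    for the assumed quartic cost.
    """
    d0 = 4.0 * y ** 3            # dL/dy for target 0
    d1 = 4.0 * (y - 1.0) ** 3    # dL/dy for target 1
    return d0 / (d0 - d1)


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.9):
        y_star = minimize_on_unit_interval(lambda y: expected_cost(y, p))
        print(p, y_star, output_to_posterior(y_star))
```

With this cost the raw output y is not itself the posterior (except at p = 0.5), but the transformation recovers it. For the squared-error cost the same formula reduces to the identity, p = y.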
Statistical multimodal integration for audio-visual speech processing
IEEE Transactions on Neural Networks - Vol. 13, No. 4 - pp. 854-866 - 2002
Sensory information is indispensable for living things. It is also important for living things to integrate multiple types of senses to understand their surroundings. In human communications, human beings must further integrate the multimodal senses of audition and vision to understand intention. In this paper, we describe speech-related modalities, since speech is the most important medium for transmitting human intention. To date, there have been many studies concerning technologies in speech communications, but performance levels still have room for improvement. For instance, although speech recognition has achieved remarkable progress, speech recognition performance still degrades seriously in acoustically adverse environments. On the other hand, perceptual research has proved the existence of the complementary integration of audio speech and visual face movements in human perception mechanisms. Such research has stimulated attempts to apply visual face information to speech recognition and synthesis. This paper introduces works on audio-visual speech recognition, speech to lip movement mapping for audio-visual speech synthesis, and audio-visual speech translation.
#Speech processing #Speech synthesis #Speech recognition #Humans #Hidden Markov models #Keyboards #Mice #Man machine systems #Communications technology #Oral communication
An HMM-based speech-to-video synthesizer
IEEE Transactions on Neural Networks - Vol. 13, No. 4 - pp. 900-915 - 2002
Emerging broadband communication systems promise a future of multimedia telephony, e.g. the addition of visual information to telephone conversations. It is useful to consider the problem of generating the critical information useful for speechreading, based on existing narrowband communications systems used for speech. This paper focuses on the problem of synthesizing visual articulatory movements given the acoustic speech signal. In this application, the acoustic speech signal is analyzed and the corresponding articulatory movements are synthesized for speechreading. This paper describes a hidden Markov model (HMM)-based visual speech synthesizer. The key elements in the application of HMMs to this problem are the decomposition of the overall modeling task into key stages and the judicious determination of the observation vector's components for each stage. The main contribution of this paper is a novel correlation HMM model that is able to integrate independently trained acoustic and visual HMMs for speech-to-visual synthesis. This model allows increased flexibility in choosing model topologies for the acoustic and visual HMMs. Moreover, the proposed model reduces the amount of training data compared to early integration modeling techniques. Results from objective experiments show that the proposed approach can reduce time alignment errors by 37.4% compared to the conventional temporal scaling method. Furthermore, subjective results indicated that the proposed model can increase speech understanding.
#Synthesizers #Hidden Markov models #Speech synthesis #Telephony #Signal synthesis #Speech analysis #Broadband communication #Multimedia systems #Narrowband #Acoustic applications
An analysis of global asymptotic stability of delayed cellular neural networks
IEEE Transactions on Neural Networks - Vol. 13, No. 5 - pp. 1239-1242 - 2002
In this paper, a new sufficient condition is given for the uniqueness and global asymptotic stability of the equilibrium point for delayed cellular neural networks (DCNNs). This condition imposes constraints on the feedback and delayed feedback matrices of a DCNN independently of the delay parameter. This result is also compared with the previous results derived in the literature.
#Asymptotic stability #Cellular neural networks #Neurofeedback #Delay #Sufficient conditions #Stability analysis #Neural networks #Equations #Output feedback #State feedback
From members to teams to committee-a robust approach to gestural and multimodal recognition
IEEE Transactions on Neural Networks - Vol. 13, No. 4 - pp. 972-982 - 2002
When building a complex pattern recognizer with high-dimensional input features, a number of selection uncertainties arise. Traditional approaches to resolving these uncertainties typically rely either on the researcher's intuition or performance evaluation on validation data, both of which result in poor generalization and robustness on test data. This paper describes a novel recognition technique called members to teams to committee (MTC), which is designed to reduce modeling uncertainty. In particular, the MTC posterior estimator is based on a coordinated set of divide-and-conquer estimators that derive from a three-tiered architectural structure corresponding to individual members, teams, and the overall committee. Basically, the MTC recognition decision is determined by the whole empirical posterior distribution, rather than a single estimate. This paper describes the application of the MTC technique to handwritten gesture recognition and multimodal system integration and presents a comprehensive analysis of the characteristics and advantages of the MTC approach.
#Robustness #Pattern recognition #Uncertainty #Feature extraction #Handwriting recognition #Acoustic noise #Cepstral analysis #Testing #Character recognition #Decision making
A spatial-temporal approach for video caption detection and recognition
IEEE Transactions on Neural Networks - Vol. 13, No. 4 - pp. 961-971 - 2002
We present a video caption detection and recognition system based on a fuzzy-clustering neural network (FCNN) classifier. Using a novel caption-transition detection scheme, we locate both spatial and temporal positions of video captions with high precision and efficiency. Then, employing several new character segmentation and binarization techniques, we improve the Chinese video-caption recognition accuracy from 13% to 86% on a set of news video captions. As the first attempt at Chinese video-caption recognition, our experimental results are very encouraging.
#Indexing #Neural networks #Optical character recognition software #Character recognition #Shape measurement #Layout #Data mining #Video compression #Gunshot detection systems #Fuzzy neural networks
Improved neural network for SVM learning
IEEE Transactions on Neural Networks - Vol. 13, No. 5 - pp. 1243-1244 - 2002
The recurrent network of Xia et al. (1996) was proposed for solving quadratic programming problems and was recently adapted to support vector machine (SVM) learning by Tan et al. (2000). We show that this formulation contains some unnecessary circuits which, furthermore, can fail to provide the correct value of one of the SVM parameters and suggest how to avoid these drawbacks.
#Neural networks #Support vector machines #Support vector machine classification #Quadratic programming #Machine learning #Circuits #Hardware #Proposals #Very large scale integration #Differential equations
A class of physical modeling recurrent networks for analysis/synthesis of plucked string instruments
IEEE Transactions on Neural Networks - Vol. 13, No. 5 - pp. 1137-1148 - 2002
A new approach is proposed that closely synthesizes tones of plucked string instruments by using a class of physical modeling recurrent networks. The strategies employed consist of a fast training algorithm and a multistage training procedure that are able to obtain the synthesis parameters for a specific instrument automatically. The training vector can be recorded tones of most target plucked instruments with ordinary microphones. The proposed approach delivers encouraging results when it is applied to different types of plucked string instruments such as steel-string guitar, nylon-string guitar, harp, Chin, Yueh-chin, and Pipa. The synthesized tones sound very close to the originals produced by their acoustic counterparts. In addition, the paper presents an embedded technique that can produce special effects such as vibrato and portamento that are vital to the playing of plucked-string instruments. The computation required in the resynthesis processing is also reasonable.
#Network synthesis #Instruments #Signal synthesis #Speech synthesis #Digital filters #Lattices #Synthesizers #Position measurement #Time measurement #Microphones
A phase-based approach to the estimation of the optical flow field using spatial filtering
IEEE Transactions on Neural Networks - Vol. 13, No. 5 - pp. 1127-1136 - 2002
We introduce a new technique for estimating the optical flow field, starting from image sequences. As suggested by Fleet and Jepson (1990), we track contours of constant phase over time, since these are more robust to variations in lighting conditions and deviations from pure translation than contours of constant amplitude. Our phase-based approach proceeds in three stages. First, the image sequence is spatially filtered using a bank of quadrature pairs of Gabor filters, and the temporal phase gradient is computed, yielding estimates of the velocity component in directions orthogonal to the filter pairs' orientations. Second, a component velocity is rejected if the corresponding filter pair's phase information is not linear over a given time span. Third, the remaining component velocities at a single spatial location are combined and a recurrent neural network is used to derive the full velocity. We test our approach on several image sequences, both synthetic and realistic.
#Optical filters #Phase estimation #Image motion analysis #Gabor filters #Image sequences #Robustness #Filter bank #Yield estimation #Information filtering #Information filters
The Graph Neural Network Model
IEEE Transactions on Neural Networks - Vol. 20, No. 1 - pp. 61-80 - 2009
Total: 99