Pattern Analysis and Applications
Featured scientific publications
* Data is for reference only
Finding Hidden Events in Astrophysical Data using PCA and Mixture of Gaussians Clustering
Pattern Analysis and Applications - Volume 5, Issue 1 - Pages 15-22 - 2002
Wise-SrNet: a novel architecture for enhancing image classification by learning spatial resolution of feature maps
Pattern Analysis and Applications - - 2024
One of the main challenges since the advent of convolutional neural networks has been how to connect the extracted feature map to the final classification layer. VGG models used two sets of fully connected layers for the classification part of their architectures, which significantly increased the number of model weights. ResNet and later deep convolutional models used a global average pooling (GAP) layer to compress the feature map and feed it to the classification layer. Although the GAP layer reduces the computational cost, it also discards the spatial resolution of the feature map, which reduces learning efficiency. In this paper, we aim to tackle this problem by replacing the GAP layer with a new architecture called Wise-SrNet. It is inspired by the idea of depthwise convolution and is designed to process spatial resolution without increasing computational cost. We evaluated our method on three datasets: the Intel Image Classification Challenge, MIT Indoors Scenes, and a part of the ImageNet dataset. We investigated the implementation of our architecture on several models of the Inception, ResNet, and DenseNet families. Applying our architecture significantly increased convergence speed and accuracy. Our experiments on images with 224×224 resolution increased the Top-1 accuracy by 2 to 8% across datasets and models. Running our models on 512×512 resolution images of the MIT Indoors Scenes dataset improved the Top-1 accuracy by 3 to 26%. We also demonstrate the GAP layer's disadvantage when the input images are large and the number of classes is not small. In this circumstance, our proposed architecture can greatly help improve classification results. The code is shared at https://github.com/mr7495/image-classification-spatial.
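The loss of spatial information under global average pooling can be illustrated with a minimal sketch. This is not the Wise-SrNet architecture itself, only a toy comparison under stated assumptions: the function names, the 2×2 single-channel maps, and the hand-picked weights are all hypothetical, and in a real network the per-position weights would be learned.

```python
def global_average_pool(fmap):
    """Average every value of one channel's H x W feature map."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def spatially_weighted_pool(fmap, weights):
    """Weighted sum over positions: a depthwise-style kernel as large as the map."""
    return sum(
        v * w
        for row, wrow in zip(fmap, weights)
        for v, w in zip(row, wrow)
    )


# Two single-channel maps holding the same activation in different positions.
top_left = [[1.0, 0.0], [0.0, 0.0]]
bottom_right = [[0.0, 0.0], [0.0, 1.0]]

# Hypothetical weights; in a trained network these would be learned.
weights = [[0.4, 0.3], [0.2, 0.1]]

gap_a = global_average_pool(top_left)
gap_b = global_average_pool(bottom_right)
swp_a = spatially_weighted_pool(top_left, weights)
swp_b = spatially_weighted_pool(bottom_right, weights)
```

Both maps produce the same GAP value, so the classifier cannot tell where the activation occurred, while the position-dependent weighting yields different outputs for the two maps at the cost of only one weight per spatial position per channel.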
Non-deterministic approach to allay replay attack on iris biometric
Pattern Analysis and Applications - Volume 22 - Pages 717-729 - 2018
Biometric-based verification has emerged as a powerful authentication tool. Despite its advantages over traditional systems, it is prone to several attacks. These attacks may creep through the biometric system and may prove fatal if it is not robust enough. One such attack, the replay attack, which replays illegally intercepted data, has been little explored with respect to biometrics. The paper proposes a non-deterministic approach to iris recognition and attempts to show its utility in allaying replay attacks on an iris recognition system. The system determines robust iris regions for each eye using LBP-based feature extraction and uses randomly selected subsets of these regions for authentication. These data, even if intercepted, are useless because the non-deterministic nature of the technique requires a differently ordered subset of regions for each authentication. The performance of this system and its effectiveness in allaying replay attacks are shown experimentally. The results are compared with existing state-of-the-art techniques with respect to iris recognition and replay attacks. The impact of the hill-climbing attack on the proposed approach is also discussed, as various researchers have shown it to be critical to the performance of a biometric system.
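The idea of requesting a differently ordered subset of regions for each authentication can be sketched as a simple challenge-response exchange. This is an illustration under assumptions, not the paper's protocol: the function names and the integer "region codes" are hypothetical stand-ins for LBP-based features, and exact equality replaces the distance-based matching a real iris system would need.

```python
import secrets

_rng = secrets.SystemRandom()


def issue_challenge(num_regions, subset_size):
    """Server side: pick a fresh, randomly ordered subset of region indices."""
    return _rng.sample(range(num_regions), subset_size)


def respond(region_codes, challenge):
    """Client side: return the codes of the requested regions, in challenge order."""
    return [region_codes[i] for i in challenge]


def verify(enrolled_codes, challenge, response):
    """Server side: accept only a response that matches the current challenge."""
    return response == [enrolled_codes[i] for i in challenge]


# Hypothetical per-region codes standing in for LBP-based features.
enrolled = [17, 42, 8, 99, 23, 61, 5, 70]

# Fixed challenges keep the demonstration deterministic;
# issue_challenge() would normally generate them.
first_challenge = [3, 0, 6]
intercepted = respond(enrolled, first_challenge)  # attacker records this

later_challenge = [6, 1, 4]
replay_accepted = verify(enrolled, later_challenge, intercepted)
genuine_accepted = verify(
    enrolled, later_challenge, respond(enrolled, later_challenge)
)
```

In this sketch the recorded response fails against the later challenge, while a live response succeeds. A replay could still succeed if the same ordered subset were drawn again, so the number of regions and the subset size determine how unlikely that is.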
RETIN: A Content-Based Image Indexing and Retrieval System
Pattern Analysis and Applications - Volume 4 - Pages 153-173 - 2001
This paper presents RETIN, a new system for automatic image indexing and interactive content-based image retrieval. The most original aspect of our work rests on the distance computation and its adjustment by relevance feedback. First, during an offline stage, the indexes are computed from attribute vectors associated with image pixels. The feature spaces are partitioned through an unsupervised classification, and then, thanks to these partitions, statistical distributions are processed for each image. During the online use of the system, the user makes an iconic request, i.e. provides an example of the type of image sought. The query may be global or partial, since the user can restrict the request to a region of interest. The comparison between the query distribution and that of every image in the collection is carried out using a weighted dissimilarity function which manages the use of several attributes. The results of the search are then refined by means of relevance feedback, which tunes the weights of the dissimilarity measure via user interaction. Experiments are performed on large databases, and statistical quality assessment shows the good properties of RETIN for digital image retrieval. The evaluation also shows that relevance feedback brings flexibility and robustness to the search.
A new LDA-KL combined method for feature extraction and its generalisation
Pattern Analysis and Applications - Volume 7 - Pages 225-225 - 2004
Detection of artificial and scene text in images and video frames
Pattern Analysis and Applications - Volume 16 - Pages 431-446 - 2011
Textual information in images and video frames constitutes a valuable source of high-level semantics for multimedia indexing and retrieval systems. Text detection is the most crucial step in a multimedia text extraction system, and although it has been studied extensively over the past decade, there is still no generic architecture that works for both artificial and scene text in multimedia content. In this paper we propose a system for detecting both artificial and scene text in images and video frames. The system is based on a machine learning stage which uses a Random Forest classifier and a highly discriminative feature set produced by a new texture operator called Multilevel Adaptive Color edge Local Binary Pattern (MACeLBP). MACeLBP describes the spatial distribution of color edges in multiple adaptive levels of contrast. Then, a gradient-based algorithm is applied to separate text lines and to refine their localization. The whole algorithm is situated in a multiresolution framework to achieve scale invariance in the detection of text lines. Finally, an optional connected-component step segments text lines into words based on the distances between the resulting components. The experimental results are produced by applying a concise evaluation methodology and demonstrate the superior performance achieved by the proposed text detection system for artificial and scene text in images and video frames.
SWGMM: a semi-wrapped Gaussian mixture model for clustering of circular–linear data
Pattern Analysis and Applications - Volume 19 - Pages 631-645 - 2014
Finite mixture models are widely used to perform model-based clustering of multivariate data sets. Most of the existing mixture models work with linear data, whereas real-life applications may involve multivariate data having both circular and linear characteristics. No existing mixture models can accommodate such correlated circular–linear data. In this paper, we consider designing a mixture model for multivariate data having one circular variable. In order to construct a circular–linear joint distribution with proper inclusion of correlation terms, we use the semi-wrapped Gaussian distribution. Further, we construct a mixture model (termed SWGMM) of such joint distributions. This mixture model is capable of approximating the distribution of multi-modal circular–linear data. Unsupervised learning of the mixture parameters is proposed based on the expectation-maximization method. Clustering is performed using the maximum a posteriori criterion. To evaluate the performance of SWGMM, we choose the task of color image segmentation in LCH space. We present comprehensive results and compare SWGMM with existing methods. Our study reveals that the proposed mixture model outperforms the other methods in most cases.
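A joint density for one circular and one linear variable can be sketched as follows, assuming the usual construction in which only the circular coordinate is wrapped around the circle while the linear coordinate stays on the real line. The function names, the truncation of the infinite sum at `max_wraps`, and the parameter values are assumptions for illustration and are not taken from the paper.

```python
import math


def bivariate_gaussian_pdf(a, b, mean, cov):
    """Density of a 2-D Gaussian at (a, b); cov is ((s11, s12), (s12, s22))."""
    (s11, s12), (_, s22) = cov
    det = s11 * s22 - s12 * s12
    d1, d2 = a - mean[0], b - mean[1]
    quad = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def semi_wrapped_pdf(theta, x, mean, cov, max_wraps=3):
    """Wrap only the circular coordinate theta; x stays on the real line."""
    return sum(
        bivariate_gaussian_pdf(theta + 2.0 * math.pi * k, x, mean, cov)
        for k in range(-max_wraps, max_wraps + 1)
    )


mean = (1.0, 50.0)               # hypothetical (angle in radians, linear value)
cov = ((0.25, 0.3), (0.3, 4.0))  # off-diagonal term couples angle and linear value
density = semi_wrapped_pdf(1.2, 51.0, mean, cov)
density_shifted = semi_wrapped_pdf(1.2 + 2.0 * math.pi, 51.0, mean, cov)
```

Shifting the angle by a full turn leaves the density essentially unchanged, which is the property a circular variable such as hue requires, and the off-diagonal covariance term is what lets the model capture circular–linear correlation. A mixture would be a weighted sum of several such densities.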
Exemplar-based color image inpainting: a fractional gradient function approach
Pattern Analysis and Applications - Volume 17 - Pages 389-399 - 2013
Image inpainting is the art of recovering a plausible image from images which are generally incomplete due to various factors, including degradation due to ageing, damage due to wear and tear, and missing image details due to occlusion. In such situations, there is a need to predict the missing image information without introducing undesirable artifacts. The original contribution in this direction is a seminal paper by Criminisi et al. This has led to a number of novel contributions in terms of patch filling prioritization and associated metrics to measure color and structure. In this paper, we propose a fast and simple technique based on a novel gradient function and its generalization via fractional derivatives to evaluate the filling order prioritization. Results demonstrate superior and robust performance over all the recent advances in the domain of exemplar-based methods quoted in the literature.
Fully automated age-weighted expression classification using real and apparent age
Pattern Analysis and Applications - Volume 25 - Pages 451-466 - 2022
After decades of research, automatic facial expression recognition (AFER) has been shown to work well when restricted to subjects with a limited range of ages. Expression recognition in subjects having a large range of ages is harder, as it has been shown that ageing, health, and lifestyle affect facial expression. In this paper, we present a discriminative system that explicitly predicts expression across a large range of ages, which we show to perform better than an equivalent system which ignores age. In our system, we first build a fully automatic facial feature point detector (FFPD) using random forest regression voting in a constrained local model (RFRV-CLM) framework (Cootes et al., in: European conference on computer vision, Springer, Berlin, 2012), which we use to automatically detect the location of key facial points and to study the effect of ageing on the accuracy of the point localization task. Second, an age group estimator and a set of age-specific expression recognizers are trained from the extracted features, which include shape, texture, appearance and a fusion of shape with texture, to analyse the effect of ageing on the face features and subsequently on the performance of AFER. We then propose a simple and effective method to recognize expression across a large range of ages using a weighted combination rule over the age group estimator and the age-specific expression recognizers (one for each age group), where the age information is used as prior knowledge for the expression classification. The advantage of using the weighted combination of all the classifiers is that more information about the classification can be obtained, and subjects whose apparent age puts them in the wrong chronological age group are dealt with more effectively.
The performance of the proposed system was evaluated using three age-expression databases of static and dynamic images for deliberate and spontaneous expressions: FACES (Ebner et al., in Behav Res Methods 42:351–362, 2010) (2,052 images), Lifespan (Minear and Park in Behav Res Methods Instrum Comput 36:630–633, 2004) (844 images) and NEMO (Dibeklioğlu et al., in: European conference on computer vision, Springer, Berlin, 2012) (1,243 videos). The results show the system to be accurate and robust across a wide variety of expressions and subject ages. Point localization, age group estimation and expression recognition were evaluated against ground truth data and compared with the existing results of alternative approaches tested on the same data. The quantitative results, with expression classification error rates of 2.1% (using manual points) and 3.0% (fully automatic), were encouraging in comparison with state-of-the-art systems which ignore age and with alternative models recently applied to the problem.
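The weighted combination of age-specific recognizers described in this abstract can be sketched as a probability-weighted sum. The abstract does not give the exact rule, so this assumes the simplest form, in which each recognizer's output is weighted by the estimated probability of its age group. All names and numbers are hypothetical.

```python
def combine_expression_scores(age_group_probs, expression_probs_by_group):
    """Sum each age-specific recognizer's output, weighted by its age group's probability."""
    combined = {}
    for group, weight in age_group_probs.items():
        for expression, prob in expression_probs_by_group[group].items():
            combined[expression] = combined.get(expression, 0.0) + weight * prob
    return combined


# Hypothetical outputs for one face.
age_group_probs = {"young": 0.2, "middle": 0.5, "old": 0.3}
expression_probs_by_group = {
    "young": {"happy": 0.7, "sad": 0.3},
    "middle": {"happy": 0.4, "sad": 0.6},
    "old": {"happy": 0.2, "sad": 0.8},
}

scores = combine_expression_scores(age_group_probs, expression_probs_by_group)
predicted = max(scores, key=scores.get)
```

Because every recognizer contributes in proportion to its age group's probability, a subject whose apparent age straddles two groups is not forced through a single, possibly wrong, age-specific model.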
Domain adaption based on source dictionary regularized RKHS subspace learning
Pattern Analysis and Applications - Volume 24 - Pages 1513-1532 - 2021
Domain adaption transforms the source and target domain data into a common space so that the probability distributions of the transformed data are as close as possible. The domain adaption algorithm based on Maximum Mean Discrepancy (MMD) minimization and Reproducing Kernel Hilbert Space (RKHS) subspace transformation is currently the main algorithm for domain adaption; in it, the RKHS subspace transformation is determined by the MMD of the transformed source and target domain data. However, MMD has inherent theoretical defects. The shapes of the probability distributions of two different random variables do not change after their respective mean values are subtracted, but their MMD becomes zero. A reasonable criterion would be that the MMD of the source and target domain data with the same label should be as small as possible after RKHS subspace transformation. However, the labels of the target domain data are unknown, so there is no way to build a model directly on this criterion. In this paper, a domain adaption algorithm based on source dictionary regularized RKHS subspace learning is proposed, in which the source domain data are used as a dictionary and the target domain data are approximated by sparse coding over the dictionary. That is to say, in the process of RKHS subspace transformation, the target domain data are distributed around the most relevant source domain data. In this way, the proposed algorithm indirectly reduces the MMD of the source and target domain data with the same label after RKHS subspace transformation. So far, no similar work has been reported in the published literature. The experimental results presented in this paper show that the proposed algorithm outperforms 5 other state-of-the-art domain adaption algorithms on 5 commonly used datasets.
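The defect described in this abstract, that subtracting the means drives the MMD to zero even though the distributions still differ, can be illustrated with a minimal sketch. It assumes a linear kernel, so that the feature space is the input space and the squared MMD reduces to the squared distance between sample means. The function names and the toy data are hypothetical and are not from the paper.

```python
def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def linear_mmd_squared(xs, ys):
    """Squared MMD with a linear kernel: squared distance between sample means."""
    mx, my = mean_vector(xs), mean_vector(ys)
    return sum((a - b) ** 2 for a, b in zip(mx, my))


def center(samples):
    """Subtract the sample mean from every vector."""
    m = mean_vector(samples)
    return [[v - mu for v, mu in zip(row, m)] for row in samples]


# Toy domains: a tight cluster and a much wider one.
source = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
target = [[10.0, 10.0], [30.0, 10.0], [10.0, 30.0], [30.0, 30.0]]

before = linear_mmd_squared(source, target)
after = linear_mmd_squared(center(source), center(target))
```

After centering, the mean-based discrepancy vanishes although the target cluster is still ten times wider than the source cluster, which is why matching means alone says little about whether same-label data from the two domains are aligned.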
Total: 1,005