Journal of Chemometrics
Featured scientific publications
* Data are for reference only
Near‐infrared spectroscopy. Principles, instruments, applications. H. W. Siesler, Y. Ozaki, S. Kawata and H. M. Heise (eds), Wiley‐VCH, Weinheim, 2002, ISBN 3‐527‐30149‐6, 348 pp, £ 70.00
Journal of Chemometrics - Volume 16, Issue 12 - Pages 636-638 - 2002
Combination of heuristic optimal partner bands for variable selection in near‐infrared spectral analysis
Abstract: Variable selection plays a critical role in the analysis of near‐infrared (NIR) spectra. A method for variable selection based on the principle of the successive projection algorithm (SPA) and optimal partner wavelength combination (OPWC) was proposed for NIR spectral analysis. The method determines a number of knot variables with sufficient independence by SPA, and candidate variable bands with a definite width are defined. The cooperative effect of the bands is then evaluated with the partial least squares regression model by using the method of OPWC. The performance of the proposed method was compared with those of SPA, OPWC, randomization test, competitive adaptive reweighted sampling, and Monte Carlo uninformative variable elimination by using NIR datasets for pharmaceutical tablets, corn, and soil. The results show that the proposed method can select informative variable bands with a cooperative effect and improves the model for quantitative analysis.
Journal of Chemometrics - Volume 32, Issue 11 - 2018
Deflation in multiblock PLS
Abstract: This paper describes some of the deflation problems in multiblock PLS. Deflation of X using block scores leads to inferior prediction of Y. Deflation of X using super scores gives the same predictions as standard PLS with all variables in one large X‐block, but the information of the separate blocks gets mixed up owing to the deflation of X. If Y is deflated using the super score instead, these problems disappear. Copyright © 2001 John Wiley & Sons, Ltd.
Journal of Chemometrics - Volume 15, Issue 5 - Pages 485-493 - 2001
Comments on the NIPALS algorithm
Abstract: The Non‐linear Iterative Partial Least Squares (NIPALS) algorithm is used in principal component analysis to decompose a data matrix into score vectors and eigenvectors (loading vectors) plus a residual matrix. NIPALS starts with some guessed starting vector. The principal components obtained by NIPALS depend on the starting vector; the first principal component cannot always be computed. Wold has suggested a starting vector for NIPALS, but we have found that even if this starting vector is used, the first principal component cannot be obtained in all cases. The reason why such a situation occurs is explained by the power method. A simple modification of the original NIPALS procedure to avoid getting smaller eigenvalues is presented.
Journal of Chemometrics - Volume 4, Issue 1 - Pages 97-100 - 1990
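The dependence on the starting vector noted in the NIPALS abstract above can be illustrated with a minimal sketch. This is not the paper's procedure or its proposed modification; it is a plain NIPALS iteration for a single component. The function name `nipals_pc`, the default start (the column with the largest variance), and the toy matrix are all assumptions made for illustration.

```python
import numpy as np

def nipals_pc(X, t0=None, tol=1e-10, max_iter=500):
    """Extract one component from X by the basic NIPALS iteration.

    Assumes X is already preprocessed (e.g. centred) as the analysis requires.
    Returns the score vector t, the unit loading vector p, and t.t, which is
    the corresponding eigenvalue of X^T X.
    """
    X = np.asarray(X, dtype=float)
    if t0 is None:
        # Assumed default start: the column of X with the largest variance.
        t = X[:, np.argmax(X.var(axis=0))].copy()
    else:
        t = np.asarray(t0, dtype=float).copy()
    for _ in range(max_iter):
        p = X.T @ t / (t @ t)        # project X onto the current scores
        p /= np.linalg.norm(p)       # normalise the loading vector
        t_new = X @ p                # update the scores
        converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
        t = t_new
        if converged:
            break
    return t, p, t @ t

# Toy matrix whose X^T X has eigenvalues 9 and 1.
X = np.array([[3.0, 0.0],
              [0.0, 1.0]])
_, _, lam_default = nipals_pc(X)                # reaches the largest eigenvalue (9)
_, _, lam_bad = nipals_pc(X, t0=[0.0, 1.0])     # start orthogonal to PC1: stuck at 1
```

Because the iteration is a power method, a start with no component along the dominant eigenvector can never acquire one in exact arithmetic, which is why the second call returns the smaller eigenvalue.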
Common components and specific weight analysis and multiple co‐inertia analysis applied to the coupling of several measurement techniques
Abstract: The present paper compares two multiblock techniques: the Common Components and Specific Weights Analysis (CCSWA) and the Multiple Co‐inertia Analysis (MCoA). Both methods are used (1) to investigate the relationships among various data tables and (2) to extract latent variables from information of different nature, reflecting different facets of a food product. Our objective is to study the ability of these methods to extract, from a set of data tables, latent characteristics which are representative of all the modifications brought to a complex system (food product) by a modification of a given process factor. The comparison of these methods is based on the investigation of their conceptual framework, particularly highlighting new properties of CCSWA. Moreover, the two techniques of analysis are compared on the basis of a case study in cheese processing where each cheese sample is described by different kinds of measurements. Copyright © 2007 John Wiley & Sons, Ltd.
Journal of Chemometrics - Volume 20, Issue 5 - Pages 172-183 - 2006
A theoretical foundation for the PLS algorithm
Journal of Chemometrics - Volume 1, Issue 1 - Pages 19-31 - 1987
Determination of rice type by 1H NMR spectroscopy in combination with different chemometric tools
Abstract: A 400‐MHz 1H nuclear magnetic resonance (NMR) spectroscopy and multivariate data analysis were used in the context of food surveillance to discriminate 46 authentic rice samples according to type. It was found that the optimal sample preparation consists of preparing aqueous rice extracts at pH 1.9. For the first time, the chemometric method independent component analysis (ICA) was applied to differentiate clusters of rice from the same type (Basmati, non‐Basmati long‐grain rice, and round‐grain rice) and, to a certain extent, their geographical origin. ICA was found to be superior to classical principal component analysis (PCA) regarding the verification of rice authenticity. The chemical shifts of the principal saccharides and acetic acid were found to be mostly responsible for the observed clustering. Among classification methods (linear discriminant analysis, factorial discriminant analysis, partial least squares discriminant analysis (PLS‐DA), soft independent modeling of class analogy, and ICA), PLS‐DA and ICA gave the best values of specificity (0.96 for both methods) and sensitivity (0.94 for PLS‐DA and 1.0 for ICA). Hence, NMR spectroscopy combined with chemometrics could be used as a screening method in the official control of rice samples. Copyright © 2013 John Wiley & Sons, Ltd.
Journal of Chemometrics - Volume 28, Issue 2 - Pages 83-92 - 2014
Sum of ranking differences for method discrimination and its validation: comparison of ranks with random numbers
Abstract: This paper describes the theoretical background, algorithm and validation of a recently developed novel method of ranking based on the sum of ranking differences [TrAC Trends Anal. Chem. 2010; 29: 101–109]. The ranking is intended to compare models, methods, analytical techniques, panel members, etc. and it is entirely general. First, the objects to be ranked are arranged in the rows and the variables (for example model results) in the columns of an input matrix. Then, the results of each model for each object are ranked in the order of increasing magnitude. The difference between the rank of the model results and the rank of the known, reference or standard results is then computed. (If the golden standard ranking is known, the rank differences can be completed easily.) In the end, the absolute values of the differences are summed together for all models to be compared. The sum of ranking differences (SRD) arranges the models in a unique and unambiguous way. The closer the SRD value to zero (i.e. the closer the ranking to the golden standard), the better is the model. The proximity of SRD values shows similarity of the models, whereas large variation will imply dissimilarity. Generally, the average can be accepted as the golden standard in the absence of known or reference results, even if bias is also present in the model results in addition to random error. Validation of the SRD method can be carried out by using simulated random numbers for comparison (permutation test). A recursive algorithm calculates the discrete distribution for a small number of objects (n < 14), whereas the normal distribution is used as a reasonable approximation if the number of objects is large. The theoretical distribution is visualized for random numbers and can be used to identify SRD values for models that are far from being random. The ranking and validation procedures are called Sum of Ranking differences (SRD) and Comparison of Ranks by Random Numbers (CRNN), respectively. Copyright © 2010 John Wiley & Sons, Ltd.
Journal of Chemometrics - Volume 25, Issue 4 - Pages 151-158 - 2011
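The ranking steps described in the SRD abstract above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' software: the helper names `ranks`, `srd`, and `srd_random` are hypothetical, ties are assumed absent (the sketch does no tie handling), and the validation step uses simulated random permutations rather than the recursive exact distribution mentioned in the abstract.

```python
import numpy as np

def ranks(v):
    """Ranks 1..n in order of increasing magnitude (assumes no ties)."""
    order = np.argsort(v, kind="stable")
    r = np.empty(len(v))
    r[order] = np.arange(1, len(v) + 1)
    return r

def srd(X, ref=None):
    """Sum of ranking differences for each column (model) of X.

    Rows are objects, columns are models. If no reference is supplied,
    the row average is used as the golden standard, as the abstract suggests.
    """
    X = np.asarray(X, dtype=float)
    if ref is None:
        ref = X.mean(axis=1)
    ref_rank = ranks(ref)
    return np.array([np.abs(ranks(X[:, j]) - ref_rank).sum()
                     for j in range(X.shape[1])])

def srd_random(n_objects, n_draws=10000, seed=0):
    """SRD values of random rankings, for comparison with observed SRDs."""
    rng = np.random.default_rng(seed)
    ideal = np.arange(1, n_objects + 1)
    return np.array([np.abs(rng.permutation(ideal) - ideal).sum()
                     for _ in range(n_draws)])

# Two toy models for five objects: one follows the reference order,
# the other reverses it.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
X = np.array([[1.1, 5.0],
              [2.2, 4.0],
              [2.9, 3.0],
              [4.3, 2.0],
              [5.0, 1.0]])
observed = srd(X, ref=reference)           # 0 for the first model, 12 for the second
random_srd = srd_random(n_objects=5)
# Fraction of random rankings that do at least as well as each model.
p_like = [(random_srd <= s).mean() for s in observed]
```

A model whose SRD falls well inside the bulk of `random_srd` is not distinguishable from a random ranking, which is the idea behind the comparison with random numbers.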
Is it possible to improve the quality of predictions from an “intelligent” use of multiple QSAR/QSPR/QSTR models?
Abstract: Quantitative structure‐activity/property/toxicity relationship (QSAR/QSPR/QSTR) models are effectively employed to fill data gaps by predicting a given response from known structural features or physicochemical properties of new query compounds. The performance of a model should be assessed based on the quality of predictions checked through diverse validation metrics, which confirm the reliability of the developed QSAR models along with the acceptability of their prediction quality for untested compounds. There is an ongoing effort by QSAR modelers to improve the quality of predictions by lowering the predicted residuals for query compounds. In this endeavor, consensus models integrating all validated individual models were found to be more externally predictive than individual models in many previous studies. The objective of this work has been to explore whether the quality of predictions of external compounds can be enhanced through an “intelligent” selection of multiple models. The consensus predictions used in this study are not a simple average of predictions from multiple models. It has been considered in the present study that a particular QSAR model may not be equally effective for prediction of all query compounds in the list. Our approach is different from the previous ones in that none of the previously reported methods considered selection of predictive models in a query‐compound‐specific way while at the same time using all or most of the valid models for the total set of query chemicals. We have implemented our approach in a software tool that is freely available via the web http://teqip.jdvu.ac.in/QSAR_Tools/ and http://dtclab.webs.com/software‐tools .
Journal of Chemometrics - Volume 32, Issue 4 - 2018
Feature selection based on graph Laplacian by using compounds with known and unknown activities
Abstract: A semisupervised feature selection method based on graph Laplacian (S2FSGL) was proposed for quantitative structure‐activity relationship (QSAR) models, which uses an ℓ2,1‐norm and compounds with both known and unknown activities. In the proposed S2FSGL method, 2 graphs, G_unsup and G_sup, are constructed. It uses the label information of compounds with known activities and the local structure of compounds with known and unknown activities to select the most important descriptors. The weight matrix of graph G_unsup models the local structure of the compounds with known and unknown activities. The S2FSGL method uses the ℓ2,1‐norm to consider the correlation between different descriptors when conducting descriptor selection. The performance of the proposed S2FSGL coupled with a kernel smoother model was evaluated using 2 QSAR data sets and compared with the performance of other feature selection methods. For the evaluation of the performance of QSAR models and selected descriptors, several different training and test sets were produced for each data set. The comparison between the statistical parameters of QSAR models built based on the semisupervised feature selection method and those obtained by other feature selection methods revealed the superiority of the proposed S2FSGL in selecting the most relevant descriptors. The results showed that the use of compounds with unknown activities beside compounds with known activities can be helpful in selecting the relevant descriptors of QSAR models.
Journal of Chemometrics - Volume 31, Issue 8 - 2017
Total: 58