Machine Learning
Notable scientific publications
* Data provided for reference only
Optimizing Epochal Evolutionary Search: Population-Size Dependent Theory
Machine Learning - Volume 45 - Pages 77-114 - 2001
Epochal dynamics, in which long periods of stasis in an evolving population are punctuated by a sudden burst of change, is a common behavior in both natural and artificial evolutionary processes. We analyze the population dynamics for a class of fitness functions that exhibit epochal behavior using a mathematical framework developed recently, which incorporates techniques from the fields of mathematical population genetics, molecular evolution theory, and statistical mechanics. Our analysis predicts the total number of fitness function evaluations to reach the global optimum as a function of mutation rate, population size, and the parameters specifying the fitness function. This allows us to determine the optimal evolutionary parameter settings for this class of fitness functions. We identify a generalized error threshold that smoothly bounds the two-dimensional regime of mutation rates and population sizes for which epochal evolutionary search operates most efficiently. Specifically, we analyze the dynamics of epoch destabilization under finite-population sampling fluctuations and show how the evolutionary parameters effectively introduce a coarse graining of the fitness function. More generally, we find that the optimal parameter settings for epochal evolutionary search correspond to behavioral regimes in which the consecutive epochs are marginally stable against the sampling fluctuations. Our results suggest that in order to achieve optimal search, one should set evolutionary parameters such that the coarse graining of the fitness function induced by the sampling fluctuations is just large enough to hide local optima.
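The epochal behavior described above can be reproduced in a few lines of code. The following is a minimal sketch — not the paper's analytical framework — of a mutation-only genetic algorithm on a royal-road-style fitness function, whose best-fitness trace shows long plateaus punctuated by sudden jumps. The population size, mutation rate, and block structure are illustrative choices, not the paper's optimized settings.

```python
import random

def fitness(genome, block=8):
    # Royal Road: fitness = number of fully set blocks of `block` bits
    return sum(all(genome[i:i + block]) for i in range(0, len(genome), block))

def evolve(pop_size=100, length=24, block=8, mu=0.01, gens=300, seed=0):
    rng = random.Random(seed)
    pop = [[0] * length for _ in range(pop_size)]
    best = []
    for _ in range(gens):
        # fitness-proportional selection (+1 baseline so flat epochs still reproduce)
        weights = [fitness(g, block) + 1 for g in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        # per-bit mutation at rate mu
        pop = [[b ^ (rng.random() < mu) for b in g] for g in parents]
        best.append(max(fitness(g, block) for g in pop))
    return best

hist = evolve()
```

Plotting `hist` against generations makes the stasis-and-burst pattern visible; raising `mu` past the error threshold destroys the epochs entirely.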
Improving coordination in small-scale multi-agent deep reinforcement learning through memory-driven communication
Machine Learning - Volume 109 - Pages 1727-1747 - 2020
Deep reinforcement learning algorithms have recently been used to train multiple interacting agents in a centralised manner whilst keeping their execution decentralised. When the agents can only acquire partial observations and are faced with tasks requiring coordination and synchronisation skills, inter-agent communication plays an essential role. In this work, we propose a framework for multi-agent training using deep deterministic policy gradients that enables concurrent, end-to-end learning of an explicit communication protocol through a memory device. During training, the agents learn to perform read and write operations enabling them to infer a shared representation of the world. We empirically demonstrate that concurrent learning of the communication device and individual policies can improve inter-agent coordination and performance in small-scale systems. Our experimental results show that the proposed method achieves superior performance in scenarios with up to six agents. We illustrate how different communication patterns can emerge on six different tasks of increasing complexity. Furthermore, we study the effects of corrupting the communication channel, provide a visualisation of the time-varying memory content as the underlying task is being solved and validate the building blocks of the proposed memory device through ablation studies.
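As a rough illustration of the read/write mechanism described above — not the paper's learned architecture — the sketch below has each agent write a linear encoding of its observation into a shared memory vector, then read the updated memory back to produce a scalar action. The weight matrices are random stand-ins for parameters that would be learned end-to-end.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, obs_dim, mem_dim = 2, 4, 8
W_write = rng.normal(size=(n_agents, mem_dim, obs_dim))  # stand-ins for learned weights
W_read = rng.normal(size=(n_agents, 1, mem_dim))

def step(observations, memory):
    # write phase: each agent adds its encoded observation to the shared memory
    for i, obs in enumerate(observations):
        memory = memory + W_write[i] @ obs
    # read phase: each agent conditions its (scalar) action on the updated memory
    actions = [(W_read[i] @ memory).item() for i in range(n_agents)]
    return actions, memory

obs = [rng.normal(size=obs_dim) for _ in range(n_agents)]
actions, memory = step(obs, np.zeros(mem_dim))
```

Because every agent reads the same memory after all writes, the memory acts as the shared world representation the abstract refers to.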
Wasserstein-based fairness interpretability framework for machine learning models
Machine Learning - Volume 111 - Pages 3307-3357 - 2022
The objective of this article is to introduce a fairness interpretability framework for measuring and explaining the bias in classification and regression models at the level of a distribution. In our work, we measure the model bias across sub-population distributions in the model output using the Wasserstein metric. To properly quantify the contributions of predictors, we take into account favorability of both the model and predictors with respect to the non-protected class. The quantification is accomplished by the use of transport theory, which gives rise to the decomposition of the model bias and bias explanations into positive and negative contributions. To gain more insight into the role of favorability and allow for additivity of bias explanations, we adapt techniques from cooperative game theory.
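For one-dimensional model outputs, the Wasserstein metric used above has a simple closed form over equal-size samples: sort both samples and average the absolute differences (the quantile coupling). A minimal sketch with hypothetical sub-population scores:

```python
def wasserstein_1d(xs, ys):
    # W1 between two equal-size 1D empirical distributions reduces to the
    # mean absolute difference of sorted samples (quantile coupling)
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# hypothetical model scores for two sub-populations
scores_a = [0.2, 0.4, 0.5, 0.9]
scores_b = [0.1, 0.3, 0.6, 0.8]
bias = wasserstein_1d(scores_a, scores_b)  # -> 0.1
```

Decomposing which predictors move which quantiles, and in which direction, is where the paper's transport-theoretic and game-theoretic machinery comes in.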
A unified probabilistic framework for robust manifold learning and embedding
Machine Learning - 2017
Training a reciprocal-sigmoid classifier by feature scaling-space
Machine Learning - Volume 65 - Pages 273-308 - 2006
This paper presents a reciprocal-sigmoid model for pattern classification. The proposed classifier can be considered a Φ-machine, since it preserves the theoretical advantage of linear machines: the weight parameters can be estimated in a single step. The model can also be viewed as an approximation to logistic regression within the framework of Generalized Linear Models. While inheriting the necessary classification capability from logistic regression, the problems of local minima and tedious recursive search no longer arise in the proposed formulation. To handle possible over-fitting when using high-order models, the classifier is trained using multiple samples of uniformly scaled pattern features. Empirically, the classifier is evaluated on benchmark synthetic data over repeated random-sampling runs for initial statistical evidence regarding its classification accuracy and computational efficiency. Additional experiments based on ten runs of 10-fold cross-validation on 40 data sets further support the effectiveness of the reciprocal-sigmoid model, whose classification accuracy is comparable to several top classifiers in the literature. The good performance is attributed mainly to the effective use of the reciprocal sigmoid for embedding nonlinearities and of bundled feature sets for smoothing the training-error hypersurface.
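The single-step estimation property of Φ-machines mentioned above can be sketched generically: map inputs through a fixed nonlinearity and solve one least-squares problem, with no iterative search. The `phi` map below is an illustrative reciprocal-style stand-in, not the paper's exact reciprocal-sigmoid basis, and the data are a toy linearly separable problem.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # toy, linearly separable labels

def phi(X):
    # illustrative reciprocal-style basis (hypothetical, not the paper's exact map)
    return np.hstack([1.0 / (1.0 + X ** 2), X, np.ones((len(X), 1))])

# single-step weight estimation: one least-squares solve, no recursive search
w, *_ = np.linalg.lstsq(phi(X), y, rcond=None)
acc = ((phi(X) @ w > 0.5).astype(float) == y).mean()
```

The contrast with logistic regression is the training procedure: one linear solve versus iteratively reweighted optimization, at the cost of minimizing squared error rather than log-likelihood.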
Integrating Quantitative and Qualitative Discovery: The ABACUS System
Machine Learning - Volume 1 - Pages 367-401 - 1986
Most research on inductive learning has been concerned with qualitative learning, which induces conceptual, logic-style descriptions from the given facts. In contrast, quantitative learning deals with discovering numerical laws characterizing empirical data. This research attempts to integrate both types of learning by combining newly developed heuristics for formulating equations with the previously developed concept learning method embodied in the inductive learning program AQ11. The resulting system, ABACUS, formulates equations that bind subsets of observed data, and derives explicit, logic-style descriptions stating the applicability conditions for these equations. In addition, several new techniques for quantitative learning are introduced. Units analysis reduces the search space of equations by examining the compatibility of variables' units. Proportionality graph search addresses the problem of identifying relevant variables that should enter equations. Suspension search focuses the search space through heuristic evaluation. The capabilities of ABACUS are demonstrated by several examples from physics and chemistry.
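Units analysis, as described above, can be sketched by representing each variable's units as a vector of base-dimension exponents: multiplying variables sums the exponents, and candidate equation terms may only be added when their unit vectors agree. The variables below are illustrative, not drawn from ABACUS itself.

```python
# units as dicts mapping base dimensions to integer exponents
def multiply(u, v):
    out = {d: u.get(d, 0) + v.get(d, 0) for d in set(u) | set(v)}
    return {d: e for d, e in out.items() if e != 0}  # drop cancelled dimensions

def addable(u, v):
    # candidate equation terms may only be summed when units agree
    return u == v

mass = {"kg": 1}
accel = {"m": 1, "s": -2}
force = multiply(mass, accel)  # kg*m/s^2
```

A term like `mass + accel` is rejected immediately by `addable`, pruning dimensionally inconsistent equations before any data fitting is attempted.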
Asymptotic accuracy of Bayes estimation for latent variables with redundancy
Machine Learning - Volume 102 - Pages 1-28 - 2015
Hierarchical parametric models consisting of observable and latent variables are widely used for unsupervised learning tasks. For example, a mixture model is a representative hierarchical model for clustering. From the statistical point of view, the models can be regular or singular depending on the distribution of the data. In the regular case, the models are identifiable: there is a one-to-one relation between the probability density function expressed by the model and the parameter. The Fisher information matrix is positive definite, and the estimation accuracy of both observable and latent variables has been studied. In the singular case, on the other hand, the models are not identifiable and the Fisher matrix is not positive definite, so conventional statistical analysis based on the inverse Fisher matrix is not applicable. Recently, an algebraic geometrical analysis has been developed and used to elucidate the Bayes estimation of observable variables. The present paper applies this analysis to latent-variable estimation and determines its theoretical performance. Our results clarify the convergence behavior of the posterior distribution. It is found that the posterior in observable-variable estimation can differ from the one in latent-variable estimation. Because of this difference, the Markov chain Monte Carlo method based on the parameter and the latent variable cannot construct the desired posterior distribution.
Learning (predictive) risk scores in the presence of censoring due to interventions
Machine Learning - Volume 102 - Pages 323-348 - 2015
A large and diverse set of measurements are regularly collected during a patient’s hospital stay to monitor their health status. Tools for integrating these measurements into severity scores that accurately track changes in illness severity can improve clinicians’ ability to provide timely interventions. Existing approaches for creating such scores either (1) rely on experts to fully specify the severity score, (2) infer a score using detailed models of disease progression, or (3) train a predictive score, using supervised learning, by regressing against a surrogate marker of severity such as the presence of downstream adverse events. The first approach does not extend to diseases where an accurate score cannot be elicited from experts. The second assumes that the progression of disease can be accurately modeled, limiting its application to populations with simple, well-understood disease dynamics. The third, and most commonly used, approach often produces scores that suffer from bias due to treatment-related censoring (Paxton et al. in AMIA annual symposium proceedings, American Medical Informatics Association, p 1109, 2013). Specifically, since the downstream outcomes used for their training are observed only noisily and are influenced by treatment administration patterns, these scores do not generalize well when treatment administration patterns change. We propose a novel ranking-based framework for disease severity score learning (DSSL). DSSL exploits the following key observation: while it is challenging for experts to quantify the disease severity at any given time, it is often easy to compare the disease severity at two different times. Extending existing ranking algorithms, DSSL learns a function that maps a vector of patient’s measurements to a scalar severity score subject to two constraints. First, the resulting score should be consistent with the expert’s ranking of the disease severity state.
Second, changes in score between consecutive periods should be smooth. We apply DSSL to the problem of learning a sepsis severity score using a large, real-world electronic health record dataset. The learned scores significantly outperform state-of-the-art clinical scores in ranking patient states by severity and in early detection of downstream adverse events. We also show that the learned disease severity trajectories are consistent with clinical expectations of disease evolution. Further, we simulate datasets containing different treatment administration patterns and show that DSSL shows better generalization performance to changes in treatment patterns compared to the above approaches.
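The core ranking idea above — learning a scalar score consistent with expert orderings of severity states — can be sketched with perceptron-style updates on violated pairs. The actual DSSL formulation, including its smoothness constraint across consecutive periods, is richer than this toy version, and the feature vectors below are hypothetical.

```python
# learn a linear score s(x) = w.x from pairs (x_hi, x_lo) where the expert
# judges x_hi to be the more severe state
def train_ranker(pairs, dim, lr=0.1, epochs=50):
    w = [0.0] * dim
    for _ in range(epochs):
        for x_hi, x_lo in pairs:
            margin = sum(wi * (a - b) for wi, a, b in zip(w, x_hi, x_lo))
            if margin < 1.0:  # pair mis-ordered or ordered with too small a margin
                w = [wi + lr * (a - b) for wi, a, b in zip(w, x_hi, x_lo)]
    return w

# toy pairs: the first feature correlates with severity
pairs = [((2.0, 0.0), (1.0, 0.0)), ((3.0, 1.0), (1.0, 1.0))]
w = train_ranker(pairs, dim=2)
score = lambda x, w=w: sum(wi * xi for wi, xi in zip(w, x))
```

The appeal of the pairwise formulation is exactly the one the abstract states: the supervision is relative comparisons, which experts can provide far more reliably than absolute severity values.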
Model-free inverse reinforcement learning with multi-intention, unlabeled, and overlapping demonstrations
Machine Learning - Volume 112 - Pages 2263-2296 - 2022
In this paper, we define a novel inverse reinforcement learning (IRL) problem where the demonstrations are multi-intention, i.e., collected from multi-intention experts, unlabeled, i.e., without intention labels, and partially overlapping, i.e., shared between multiple intentions. In the presence of overlapping demonstrations, current IRL methods, developed to handle multi-intention and unlabeled demonstrations, cannot successfully learn the underlying reward functions. To address this limitation, we propose a novel clustering-based approach to disentangle the observed demonstrations and experimentally validate its advantages. Traditional clustering-based approaches to multi-intention IRL, which are developed on the basis of model-based Reinforcement Learning (RL), formulate the problem using parametric density estimation. However, in high-dimensional environments with unknown system dynamics, i.e., model-free RL, the solution of parametric density estimation is only tractable up to the density normalization constant. To solve this, we formulate the problem as a mixture of logistic regressions to directly handle the unnormalized density. To investigate the challenges posed by overlapping demonstrations, we introduce the concepts of shared pair, a state-action pair that is shared by more than one intention, and separability, which reflects how well the multiple intentions can be separated in the joint state-action space. We provide theoretical analyses under the global optimality condition and the existence of shared pairs. Furthermore, we conduct extensive experiments on four simulated robotics tasks, extended to accept different intentions with specific levels of separability, and a synthetic driver task developed to directly control the separability.
We evaluate the existing baselines on our defined problem and demonstrate, theoretically and experimentally, the advantages of our clustering-based solution, especially when the separability of the demonstrations decreases.
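The mixture-of-logistic-regressions formulation and the notion of a shared pair can be illustrated with a toy soft-assignment step: a state-action pair lying where two intention models score equally receives split responsibility, while a well-separated pair is assigned almost entirely to one intention. The weights and priors below are hypothetical, not learned by the paper's method.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def responsibilities(x, weights, priors):
    # unnormalized component scores: prior * logistic likelihood per intention;
    # only the ratio matters, so the density normalization constant cancels
    scores = [p * sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
              for w, p in zip(weights, priors)]
    total = sum(scores)
    return [s / total for s in scores]

# two intentions with opposing preferences over one state-action feature
weights = [(2.0,), (-2.0,)]
priors = [0.5, 0.5]
r_shared = responsibilities((0.0,), weights, priors)    # a "shared pair": 50/50
r_distinct = responsibilities((3.0,), weights, priors)  # clearly one intention
```

As separability decreases, more pairs behave like `r_shared`, which is precisely the regime where the paper shows hard clustering approaches break down.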
Total: 1,832