Constructing generative logical models for optimisation problems using domain knowledge
Tóm tắt
In this paper we seek to identify data instances with a low value of some objective (or cost) function. Normally posed as optimisation problems, our interest is in problems that have the following characteristics: (a) optimal, or even near-optimal solutions are very rare; (b) it is expensive to obtain the value of the objective function for large numbers of data instances; and (c) there is domain knowledge in the form of experience, rules-of-thumb, constraints and the like, which is difficult to translate into the usual constraints for numerical optimisation procedures. Here we investigate the use of Inductive Logic Programming (ILP) to construct models within a procedure that progressively attempts to increase the number of near-optimal solutions. Using ILP in this manner requires a change in focus from discriminatory models (the usual staple for ILP) to generative models. Using controlled datasets, we investigate the use of probability-sampling of solutions based on the estimated cost of clauses found using ILP. Specifically, we compare the results obtained against: (a) simple random sampling; and (b) generative deep network models that use a low-level encoding and automatically construct higher-level features. Our results suggest: (1) Against each of the alternatives, probability-sampling from ILP-constructed models contain more near-optimal solutions; (2) The key to the effectiveness of ILP-constructed models is the availability of domain knowledge. We also demonstrate the use of ILP in this manner on two real-world problems from the area of drug-design (predicting solubility and binding affinity), using domain knowledge of chemical ring structures and functional groups. Taken together, our results suggest that generative modelling using ILP can be very effective for optimisation problems where: (a) the number of training instances to be used is restricted, and (b) there is domain knowledge relevant to low-cost solutions.
Tài liệu tham khảo
Angelopoulos, N., & Cussens, J. (2008). Bayesian learning of Bayesian networks with informative priors. Annals of Mathematics and Artificial Intelligence, 54(1–3), 53. https://doi.org/10.1007/s10472-009-9133-x.
Bain, M. (1991). Experiments in non-monotonic learning. In Proceedings of the eighth international workshop (ML91), Northwestern University, Evanston, Illinois, USA, pp. 380–384.
Bain, M., & Muggleton, S. (1994). Learning optimal chess strategies. Machine Intelligence, 13, 291–309.
Breda, G. (2006). KRK chess endgame database. Knowledge extraction and compression (T.U. Darmstadt, 2006). Diploma Thesis.
Cussens, J. (2000). Stochastic logic programs: Sampling, inference and applications. In Proceedings of the sixteenth conference on uncertainty in artificial intelligence (Morgan Kaufmann Publishers Inc., San Francisco, CA, USA), UAI’00, pp. 115–122. http://dl.acm.org/citation.cfm?id=2073946.2073961.
D’Ariano, A., Pacciarelli, D., & Pranzo, M. (2007). A branch and bound algorithm for scheduling trains in a railway network. European Journal of Operational Research, 183, 643–657.
Dash, T., Srinivasan, A., Vig, L., Orhobor, O. I., & King, R. D. (2018). Large-scale assessment of deep relational machines. In Proceedings of 28th international conference on inductive logic programming, ILP 2018, Ferrara, Italy, September 2–4, 2018, pp. 22–37.
De Bonet, J. S., Isbell, C. L., & Viola, P. et al. (1997). MIMIC: Finding optima by estimating probability densities. In Advances in neural information processing systems, pp. 424–430.
De Raedt, L., Frasconi, P., Kersting, K., & Muggleton, S. (Eds.). (2008). Probabilistic inductive logic programming—Theory and applications, Lecture notes in computer science, Vol. 4911. Springer.
De Raedt, L., & Kimmig, A. (2015). Probabilistic (logic) programming concepts. Machine Learning, 100(1), 5. https://doi.org/10.1007/s10994-015-5494-z.
Fischer, A., & Igel, C. (2012). An introduction to restricted Boltzmann machines. In Iberoamerican congress on pattern recognition, Springer, pp. 14–36.
Gamo, F. J., Sanz, L. M., Vidal, J., de Cozar, C., Alvarez, E., Lavandera, J. L., et al. (2010). Thousands of chemical starting points for antimalarial lead identification. Nature, 465(7296), 305.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680.
Harrod, S. (2010). A tutorial on fundamental model structures for railway timetable optimization. Surveys in Operations Research and Management Science, 17, 85–96.
Hou, T., Xia, K., Zhang, W., & Xu, X. (2004). ADME evaluation in drug discovery. 4. Prediction of aqueous solubility based on atom contribution approach. Journal of Chemical Information and Modeling, 44(1), 266. https://doi.org/10.1021/ci034184n.
Joshi, S., Ramakrishnan, G., & Srinivasan, A. (2008). Feature construction using theory-guided sampling and randomised search. In ILP, pp. 140–157.
Jouglet, A., & Carlier, J. (2011). Dominance rules in combinatorial optimization problems. Operational Research, 212, 433.
King, R. D., Sternberg, M. J. E., & Srinivasan, A. (1995). Relating chemical activity to structure: An examination of ILP successes. New Generation Computer, 13(3&4), 411. https://doi.org/10.1007/BF03037232.
Lloyd, J. W. (1987). Foundations of logic programming (2nd ed.). Berlin: Springer.
Muggleton, S. (1996). Stochastic logic programs. In New generation computing, Academic Press.
Muggleton, S., & Raedt, L. D. (1994). Inductive logic programming: Theory and methods. Journal of Logic Programming, 19,20, 629.
Muggleton, S., Srinivasan, A., & Bain, M. (1992). Compression, significance, and accuracy. In Proceedings of the ninth international workshop on machine learning (ML 1992), Aberdeen, Scotland, UK, July 1–3, 1992, pp. 338–347.
Pelikan, M., Goldberg, D. E., & Lobo, F. G. (2000). A survey of optimization by building and using probabilistic models. Computational Optimization and Applications, 21, 5.
Ramakrishnan, G., Joshi, S., Balakrishnan, S., & Srinivasan, A. (2007). Using ILP to construct features for information extraction from semi-structured text. In International conference on inductive logic programming, Springer, pp. 211–224.
Riguzzi, F. (2018). Foundations of probabilistic logic programming: Languages, semantics, inference and learning. Valencia: River Publishers.
Riguzzi, F., & Swift, T. (2011). The PITA system: Tabling and answer subsumption for reasoning under uncertainty. TPLP, 11(4–5), 433.
Riguzzi, F., & Zese, R. (2017). Probabilistic inductive logic programming on the web. In Proceedings of the doctoral consortium, challenge, industry track, tutorials and posters @ RuleML+RR 2017 hosted by international joint conference on rules and reasoning 2017 (RuleML+RR 2017), London, UK, July 11–15, 2017.
Saha, A., Srinivasan, A., & Ramakrishnan, G. (2012). What kinds of relational features are useful for statistical learning? In F. Riguzzi & F. Zelezný (Eds.), Inductive logic programming—22nd international conference, ILP 2012, Dubrovnik, Croatia, September 17–19, 2012, Revised selected papers. Lecture notes in computer science (Vol. 7842, pp. 501–508). Springer
Saikia, S., Vig, L., Srinivasan, A., Shroff, G., Agarwal, P., & Rawat, R. (2016). Neuro-symbolic EDA-based optimization using ILP-enhanced DBNs. In Proceedings of the workshop on cognitive computation: Integrating neural and symbolic approaches 2016 co-located with the 30th annual conference on neural information processing systems (NIPS 2016), Barcelona, Spain, December 9, 2016.
Sammut, C. (1981). Concept learning by experiment. In Proceedings of the 7th international joint conference on artificial intelligence, IJCAI ’81, Vancouver, BC, Canada, August 24–28, 1981, pp. 104–105. http://ijcai.org/Proceedings/81-1/Papers/021.pdf.
Sato, T., & Kameya, Y. (1997). PRISM: A language for symbolic-statistical modeling. In Proceedings of the fifteenth international joint conference on artificial intelligence, IJCAI 97, Vol. 2, Nagoya, Japan, August 23–29, 1997, pp. 1330–1339.
Specia, L., Srinivasan, A., Joshi, S., Ramakrishnan, G., & Nunes, M. G. V. (2009). An investigation into feature construction to assist word sense disambiguation. Machine Learning, 76(1), 109.
Specia, L., Srinivasan, A., Ramakrishnan, G., & Nunes, M. D. (2006). Word sense disambiguation using inductive logic programming. In ILP, pp. 409–423.
Srinivasan, A. (1999). The Aleph manual. http://www.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/.
Srinivasan, A. (2001). Extracting context-sensitive models in inductive logic programming. Machine Learning, 44(3), 301.
Srinivasan, A., Faruquie, T., Bhattacharya, I., & King, R. (2012). Topic models with relational features for drug design. In ILP.
Srinivasan, A., & King, R. D. (1996). Feature Construction with inductive logic programming: A study of quantitative predictions of biological activity by structural attributes. In 6th international workshop on inductive logic programming, ILP-96, Stockholm, Sweden, August 26–28, 1996, selected papers, pp. 89–104.
Srinivasan, A., King, R. D., Muggleton, S., & Sternberg, M. J. (1997). Carcinogenesis predictions using ILP. In Proceedings of 7th international workshop inductive logic programming, ILP-97, Prague, Czech Republic, September 17–20, 1997, pp. 273–287.