Generalised likelihood profiles for models with intractable likelihoods
Abstract
Likelihood profiling is an efficient and powerful frequentist approach to parameter estimation, uncertainty quantification and practical identifiability analysis. Unfortunately, these methods cannot be easily applied to stochastic models without a tractable likelihood function. Such models are common in many fields of science, which makes these classical approaches impractical in those settings. To address this limitation, we develop a new approach that generalises likelihood profiling to situations where the likelihood cannot be evaluated but stochastic simulation of the assumed data-generating process is possible. Our approach recasts developments from generalised Bayesian inference in a frequentist setting. We derive a method for constructing generalised likelihood profiles and for calibrating these profiles to achieve the desired frequentist coverage at a given coverage level. We demonstrate the performance of our method on realistic examples from the literature and highlight its capability for practical identifiability analysis of models with intractable likelihoods.