SciPy 1.0: fundamental algorithms for scientific computing in Python

Nature Methods - Volume 17, Issue 3 - Pages 261-272 - 2020
Pauli Virtanen (1), Ralf Gommers (2), Travis E. Oliphant (2), Matt Haberland (3), Tyler Reddy (4), David Cournapeau (5), Evgeni Burovski (6), Pearu Peterson (7), Warren Weckesser (8), J.D. Bright (5), Stéfan van der Walt (8), Matthew Brett (9), Joshua Wilson (5), K. Jarrod Millman (8), Nikolay Mayorov (10), Andrew Nelson (11), Eric D. Jones (12), Roland Kern (13), Eric B. Larson (14), CJ Carey (15), İlhan Polat (5), G. Bolla (8), Eric Moore (16), Jake Vanderplas (14), Denis Laxalde (17, 18), Josef Perktold (5), Robert Cimrman (19), Ian Henriksen (20), R. Quitzow-James (5), C. R. Harris (5), Anne M. Archibald (21), Antônio H. Ribeiro (22), Fabián Pedregosa (23), Paul van Mulbregt (23), A. Vijaykumar, Alessandro Pietro Bardelli, Amy E. Rothberg, Andreas Hilboll (24), Andreas Klöckner, Anthony Scopatz, Antony Lee, Ariel Rokem, Charles Woods, C. Fulton, C. R. Masson, C. Häggström, Chris Fitzgerald, David Nicholson, David Hagen, Dmitrii V. Pasechnik, Emanuele Olivetti (25), Elizabeth A. Martin (26), Eric Wieser, Fabrice Silva, Frank K. Wilhelm (27), George S. Young, George Price, Gert‐Ludwig Ingold, G.E. Allen, A. Lister, Hervé Audren, Irvin Probst (28), J. P. Dietrich, Jacob Silterra, James T. Webber, Janko Slavič, Joel Nothman, J. Büchner (29), Jeffrey H. Kulick, Johannes L. Schönberger, José Vinícius de Miranda Cardoso, Joscha Reimer, Joseph Harrington, José A. Rodríguez, Juan Nunez-Iglesias, Justin Kuczynski, K. Tritz, M. Thoma, M. Newville (30), Matthias Kümmerer, Maximilian Bolingbroke, M. Tartre, M. Pak, Nathaniel J. Smith, Nikolai Nowaczyk, N. Shebanov, Oleksandr Pavlyk, Per A. Brodtkorb, Peter Lee (31), Robert T. McGibbon, Roman Feldbauer, S. R. Lewis (32), Sam Tygier, Scott Sievert, Sebastiano Vigna, S. Peterson, Surhud More, T. Pudlik, Takuya Oshima (33), Thomas J. Pingel, Thomas Robitaille, Thomas Spura, Thouis R. Jones, Tim Cera, Thomas M. Leslie, Tiziano Zito, Thomas F. Krauss, Udita Upadhyay, Yaroslav Halchenko, Yoshiki Vázquez‐Baeza
(1) Low Temperature Laboratory
(2) Quansight LLC
(3) California Polytechnic State University, San Luis Obispo
(4) Los Alamos National Laboratory
(5) Other
(6) National Research University Higher School of Economics, St. Petersburg
(7) Tallinn University of Technology
(8) University of California, Berkeley
(9) University of Birmingham, Birmingham
(10) Skolkovo Innovation Center
(11) Australian Nuclear Science and Technology Organisation [Australia]
(12) Enthought Inc
(13) Centre Interdisciplinaire de Nanoscience de Marseille
(14) University of Washington, Seattle
(15) University of Massachusetts Amherst
(16) Bruker BioSpin Corporation
(17) Laboratoire de Tribologie et Dynamique des Systèmes
(18) SNECMA Villaroche [Moissy-Cramayel]
(19) New Technologies Research Centre [Plzeň]
(20) University of Texas at Austin, Austin
(21) Astronomical Institute Anton Pannekoek
(22) Universidade Federal de Minas Gerais = Federal University of Minas Gerais [Belo Horizonte, Brazil]
(23) Google LLC
(24) Institute of Environmental Physics [Bremen]
(25) Istituto Nazionale di Ricerca Metrologica
(26) Laboratoire Chrono-environnement (UMR 6249)
(27) European Synchrotron Radiation Facility
(28) Lab-STICC_ENSTAB_CID_PRASYS
(29) Max-Planck-Institut für Sonnensystemforschung = Max Planck Institute for Solar System Research
(30) University of Chicago, Consortium of Advanced Radiation Sciences, Advanced Photon Source, GSECARS
(31) Laboratoire de physique des gaz et des plasmas
(32) The Open University, Milton Keynes
(33) Nobeyama Radio Observatory

Abstract

SciPy is an open-source scientific computing library for the Python programming language. Since its initial release in 2001, SciPy has become a de facto standard for leveraging scientific algorithms in Python, with over 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories and millions of downloads per year. In this work, we provide an overview of the capabilities and development practices of SciPy 1.0 and highlight some recent technical developments.
