BioVeL: a virtual laboratory for data analysis and modelling in biodiversity science and ecology

Alex Hardisty1, Finn Bacall2, Niall Beard2, Maria-Paula Balcázar-Vargas3, Bachir Balech4, Zoltán Barcza5, Sarah J. Bourlat6, Renato De Giovanni7, Yde de Jong3, Francesca De Leo4, Laura Dobor5, Giacinto Donvito8, Donal Fellows2, Antonio Fernàndez-Guerra9, Nuno G.C. Ferreira10, Yuliya Fetyukova11, Bruno Fosso4, Jonathan Giddy1, Carole Goble2, Anton Güntsch12, Robert Haines13, Vera Hernández Ernst14, Hannes Hettling15, Dóra Hidy16, Ferenc Horváth17, Dóra Ittzés17, Péter Ittzés17, Andrew C. Jones1, Renzo Kottmann9, Robert Kulawik14, Sonja Leidenberger18, Päivi Lyytikäinen‐Saarenmaa19, Mathew Cherian12, Norman Morrison2, Aleksandra Nenadić2, Abraham Nieva de la Hidalga1, Matthias Obst6, Gerard Oostermeijer3, Elisabeth Paymal20, Graziano Pesole21, Salvatore Pinto10, Axel Poigné14, Francisco Quevedo Fernandez1, Mónica Santamaría4, Hannu Saarenmaa11, Gergely Sipos10, Karl-Heinz Sylla14, Marko Tähtinen22, Saverio Vicario23, Rutger Vos15, Alan Williams2, Pelin Yilmaz9
1School of Computer Science and Informatics, Cardiff University, Queens Buildings, 5 The Parade, Cardiff, CF24 3AA, UK
2School of Computer Science, University of Manchester, Kilburn Building, Oxford Road, Manchester, M13 9PL, UK
3Institute for Biodiversity and Ecosystem Dynamics (IBED), University of Amsterdam, PO Box 94248, 1090, Amsterdam, The Netherlands
4Institute of Biomembranes and Bioenergetics (IBBE), National Research Council (CNR), via Amendola 165/A, 70126, Bari, Italy
5Department of Meteorology, Eötvös Loránd University, Pázmány sétány 1/A, Budapest, 1117, Hungary
6Department of Marine Sciences, University of Gothenburg, Box 463, 405 30, Gothenburg, Sweden
7Centro de Referência em Informação Ambiental, Avenida Dr. Romeu Tórtima, 388, Campinas, SP, 13084-791, Brazil
8Institute of Nuclear Physics (INFN), Via E. Orabona 4, 70125, Bari, Italy
9Max Planck Institute for Marine Microbiology, Celsiusstrasse 1, 28359 Bremen, Germany
10Stichting EGI (EGI.eu), Science Park 140, 1098, Amsterdam, The Netherlands
11SIB Labs, Joensuu Science Park, University of Eastern Finland, P.O. Box 111, 80101, Joensuu, Finland
12Botanic Garden and Botanical Museum Berlin, Freie Universität Berlin, Königin-Luise-Strasse 6-8, 14195, Berlin, Germany
13IT Services, University of Manchester, Kilburn Building, Oxford Road, Manchester, M13 9PL, UK
14Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
15Naturalis Biodiversity Center, Postbus 9517, 2300, Leiden, The Netherlands
16MTA-SZIE Plant Ecology Research Group, Szent István University, Páter K. u.1., Gödöllő, 2103, Hungary
17Institute of Ecology and Botany, Centre for Ecological Research, Hungarian Academy of Sciences, Alkotmány u. 2-4., Vácrátót, 2163, Hungary
18Swedish Species Information Centre/ArtDatabanken, Swedish University of Agricultural Sciences, Bäcklösavägen 10, 750 07, Uppsala, Sweden
19Department of Forest Sciences, University of Helsinki, P.O. Box 27, 00014 Helsinki, Finland
20Fondation pour la Recherche sur la Biodiversité (FRB), 195, rue Saint-Jacques, 75005, Paris, France
21Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari “A. Moro”, via Orabona, 1514, 70126, Bari, Italy
22Finnish Museum of Natural History, University of Helsinki, P.O. Box 17, 00014, Helsinki, Finland
23Institute of Biomedical Technology (ITB), National Research Council (CNR), via Amendola 122/D, 70126, Bari, Italy

Abstract

Keywords

