
Scientific data
SCIE-ISI SCOPUS (2014-2023)
2052-4463
United Kingdom
Publisher: NATURE PORTFOLIO, Nature Publishing Group
Representative articles
There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders (representing academia, industry, funding agencies, and scholarly publishers) have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them as well as some exemplar implementations in the community.
MIMIC-III (‘Medical Information Mart for Intensive Care’) is a large, single-center database comprising information relating to patients admitted to critical care units at a large tertiary care hospital. Data includes vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and more. The database supports applications including academic and industrial research, quality improvement initiatives, and higher education coursework.
The Climate Hazards group Infrared Precipitation with Stations (CHIRPS) dataset builds on previous approaches to ‘smart’ interpolation techniques and high resolution, long period of record precipitation estimates based on infrared Cold Cloud Duration (CCD) observations. The algorithm i) is built around a 0.05° climatology that incorporates satellite information to represent sparsely gauged locations, ii) incorporates daily, pentadal, and monthly 1981-present 0.05° CCD-based precipitation estimates, iii) blends station data to produce a preliminary information product with a latency of about 2 days and a final product with an average latency of about 3 weeks, and iv) uses a novel blending procedure incorporating the spatial correlation structure of CCD-estimates to assign interpolation weights. We present the CHIRPS algorithm, global and regional validation results, and show how CHIRPS can be used to quantify the hydrologic impacts of decreasing precipitation and rising air temperatures in the Greater Horn of Africa. Using the Variable Infiltration Capacity model, we show that CHIRPS can support effective hydrologic forecasts and trend analyses in southeastern Ethiopia.
We present new global maps of the Köppen-Geiger climate classification at an unprecedented 1-km resolution for the present-day (1980–2016) and for projected future conditions (2071–2100) under climate change. The present-day map is derived from an ensemble of four high-resolution, topographically-corrected climatic maps. The future map is derived from an ensemble of 32 climate model projections (scenario RCP8.5), by superimposing the projected climate change anomaly on the baseline high-resolution climatic maps. For both time periods we calculate confidence levels from the ensemble spread, providing valuable indications of the reliability of the classifications. The new maps exhibit a higher classification accuracy and substantially more detail than previous maps, particularly in regions with sharp spatial or elevation gradients. We anticipate the new maps will be useful for numerous applications, including species and vegetation distribution modeling. The new maps including the associated confidence maps are freely available via
High-resolution information on climatic conditions is essential to many applications in environmental and ecological sciences. Here we present the CHELSA (Climatologies at high resolution for the earth’s land surface areas) data of downscaled model output temperature and precipitation estimates of the ERA-Interim climatic reanalysis to a high resolution of 30 arc sec. The temperature algorithm is based on statistical downscaling of atmospheric temperatures. The precipitation algorithm incorporates orographic predictors including wind fields, valley exposition, and boundary layer height, with a subsequent bias correction. The resulting data consist of a monthly temperature and precipitation climatology for the years 1979–2013. We compare the data derived from the CHELSA algorithm with other standard gridded products and station data from the Global Historical Climate Network. We compare the performance of the new climatologies in species distribution modelling and show that we can increase the accuracy of species range predictions. We further show that CHELSA climatological data have an accuracy similar to that of other products for temperature, but that their predictions of precipitation patterns are better.
Gliomas belong to a group of central nervous system tumors, and consist of various sub-regions. Gold standard labeling of these sub-regions in radiographic imaging is essential for both clinical and computational studies, including radiomic and radiogenomic analyses. Towards this end, we release segmentation labels and radiomic features for all pre-operative multimodal magnetic resonance imaging (MRI) (
We present TerraClimate, a dataset of high-spatial resolution (1/24°, ~4-km) monthly climate and climatic water balance for global terrestrial surfaces from 1958–2015. TerraClimate uses climatically aided interpolation, combining high-spatial resolution climatological normals from the WorldClim dataset, with coarser resolution time varying (i.e., monthly) data from other sources to produce a monthly dataset of precipitation, maximum and minimum temperature, wind speed, vapor pressure, and solar radiation. TerraClimate additionally produces monthly surface water balance datasets using a water balance model that incorporates reference evapotranspiration, precipitation, temperature, and interpolated plant extractable soil water capacity. These data provide important inputs for ecological and hydrological studies at global scales that require high spatial resolution and time varying climate and climatic water balance data. We validated spatiotemporal aspects of TerraClimate using annual temperature, precipitation, and calculated reference evapotranspiration from station data, as well as annual runoff from streamflow gauges. TerraClimate datasets showed noted improvement in overall mean absolute error and increased spatial realism relative to coarser resolution gridded datasets.
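The idea of climatically aided interpolation described in the TerraClimate abstract can be illustrated with a minimal sketch. It assumes an additive anomaly (reasonable for temperature, not necessarily for precipitation) and spreads each coarse anomaly over its fine cells by nearest-neighbour lookup rather than a proper spatial interpolation. The function name `downscale_month` and all of its parameters are hypothetical and do not come from the TerraClimate code base.

```python
def downscale_month(fine_normal, coarse_normal, coarse_month, factor):
    """Add a coarse-grid anomaly to a fine-grid climatological normal.

    fine_normal   : 2D list, fine-grid long-term mean for one calendar month
    coarse_normal : 2D list, coarse-grid long-term mean for the same month
    coarse_month  : 2D list, coarse-grid value for one specific month
    factor        : number of fine cells per coarse cell along each axis

    Assumption: additive anomalies and nearest-neighbour assignment;
    a production dataset would interpolate the anomaly field instead.
    """
    result = []
    for i, row in enumerate(fine_normal):
        out_row = []
        for j, normal in enumerate(row):
            # Locate the coarse cell that contains fine cell (i, j).
            ci, cj = i // factor, j // factor
            # Anomaly = departure of this month from the coarse long-term mean.
            anomaly = coarse_month[ci][cj] - coarse_normal[ci][cj]
            out_row.append(normal + anomaly)
        result.append(out_row)
    return result
```

The fine-scale spatial pattern comes entirely from the high-resolution normals, while the month-to-month variability comes from the coarser time-varying data.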
The development of magnetic resonance imaging (MRI) techniques has defined modern neuroimaging. Since its inception, tens of thousands of studies using techniques such as functional MRI and diffusion weighted imaging have allowed for the non-invasive study of the brain. Despite the fact that MRI is routinely used to obtain data for neuroscience research, there has been no widely adopted standard for organizing and describing the data collected in an imaging experiment. This renders sharing and reusing data (within or between labs) difficult if not impossible and unnecessarily complicates the application of automatic pipelines and quality assurance protocols. To solve this problem, we have developed the Brain Imaging Data Structure (BIDS), a standard for organizing and describing MRI datasets. The BIDS standard uses file formats compatible with existing software, unifies the majority of practices already common in the field, and captures the metadata necessary for most common data processing operations.
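To make the notion of a standard for organizing MRI datasets more concrete, the sketch below builds file paths in the `sub-<label>/[ses-<label>/]<datatype>/` layout that the BIDS specification is generally known for. The abstract itself does not spell out the naming rules, so the details here are an assumption drawn from the public specification, and the helper `bids_path` is hypothetical rather than part of any official BIDS tool.

```python
from pathlib import PurePosixPath


def bids_path(subject, datatype, suffix, task=None, session=None,
              extension=".nii.gz"):
    """Build a BIDS-style relative path (illustrative helper, not an official API).

    Entities are joined as key-value pairs separated by underscores,
    in the order sub, ses, task, followed by the suffix.
    """
    entities = [f"sub-{subject}"]
    if session is not None:
        entities.append(f"ses-{session}")
    if task is not None:
        entities.append(f"task-{task}")
    filename = "_".join(entities + [suffix]) + extension

    # Directory hierarchy: subject, optional session, then data type.
    folder = PurePosixPath(f"sub-{subject}")
    if session is not None:
        folder = folder / f"ses-{session}"
    return str(folder / datatype / filename)
```

For example, `bids_path("01", "func", "bold", task="rest")` yields `sub-01/func/sub-01_task-rest_bold.nii.gz`. Because the metadata is encoded in predictable file names, automated pipelines can locate inputs without per-dataset configuration.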
China is the world’s top energy consumer and CO2 emitter, accounting for 30% of global emissions. Compiling an accurate accounting of China’s CO2 emissions is the first step in implementing reduction policies. However, no annual, officially published emissions data exist for China. The current emissions estimated by academic institutes and scholars exhibit great discrepancies. The gap between the different emissions estimates is approximately equal to the total emissions of the Russian Federation (the 4th highest emitter globally) in 2011. In this study, we constructed the time-series of CO2 emission inventories for China and its 30 provinces. We followed the Intergovernmental Panel on Climate Change (IPCC) emissions accounting method with a territorial administrative scope. The inventories include energy-related emissions (17 fossil fuels in 47 sectors) and process-related emissions (cement production). The first version of our dataset presents emission inventories from 1997 to 2015. We will update the dataset annually. The uniformly formatted emission inventories provide data support for further emission-related research as well as emissions reduction policy-making in China.
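The accounting approach mentioned in this abstract rests on the general idea of multiplying activity data (for example, fuel consumed) by an emission factor and summing over fuels and sectors. The following is a minimal sketch of that bookkeeping only; the function name, the dictionary layout, and every number in the usage note are made-up placeholders, not values or structures from the published dataset.

```python
def total_emissions(activity, emission_factor):
    """Sum activity * emission factor over every (sector, fuel) pair.

    activity        : dict mapping (sector, fuel) -> amount of fuel used
    emission_factor : dict mapping fuel -> CO2 emitted per unit of fuel

    Assumption: one factor per fuel, shared by all sectors. Real
    inventories may use sector-specific factors and oxidation rates.
    """
    total = 0.0
    for (sector, fuel), amount in activity.items():
        total += amount * emission_factor[fuel]
    return total
```

With placeholder inputs `{("power", "coal"): 100.0, ("transport", "diesel"): 10.0}` and factors `{"coal": 2.0, "diesel": 3.0}`, the function returns `230.0`. Discrepancies between published estimates can arise from differences in either term, which is one reason a uniformly formatted inventory is useful.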
The elastic constant tensor of an inorganic compound provides a complete description of the response of the material to external stresses in the elastic limit. It thus provides fundamental insight into the nature of the bonding in the material, and it is known to correlate with many mechanical properties. Despite the importance of the elastic constant tensor, it has been measured for a very small fraction of all known inorganic compounds, a situation that limits the ability of materials scientists to develop new materials with targeted mechanical responses. To address this deficiency, we present here the largest database of calculated elastic properties for inorganic compounds to date. The database currently contains full elastic information for 1,181 inorganic compounds, and this number is growing steadily. The methods used to develop the database are described, as are results of tests that establish the accuracy of the data. In addition, we document the database format and describe the different ways it can be accessed and analyzed in efforts related to materials discovery and design.
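As an illustration of how an elastic constant tensor correlates with mechanical properties, the sketch below computes the textbook Voigt averages of the bulk and shear moduli from a 6x6 stiffness matrix in Voigt notation. The abstract does not say which derived quantities the database reports, so this is an assumed example of downstream analysis, and the function name `voigt_moduli` is hypothetical.

```python
def voigt_moduli(c):
    """Return (bulk, shear) Voigt averages from a 6x6 stiffness matrix.

    c : nested list, stiffness constants in Voigt notation (e.g. in GPa),
        so c[0][0] is C11, c[0][1] is C12, c[3][3] is C44, and so on.
    """
    diag = c[0][0] + c[1][1] + c[2][2]    # C11 + C22 + C33
    off = c[0][1] + c[1][2] + c[0][2]     # C12 + C23 + C13
    shear = c[3][3] + c[4][4] + c[5][5]   # C44 + C55 + C66
    bulk_v = (diag + 2.0 * off) / 9.0
    shear_v = (diag - off + 3.0 * shear) / 15.0
    return bulk_v, shear_v
```

The Voigt average assumes uniform strain and gives an upper bound on the polycrystalline moduli; the Reuss average, computed from the compliance matrix, gives the matching lower bound.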