Geoscientific Model Development
Notable scientific publications
Abstract. The Lagrangian particle dispersion model FLEXPART in its original version in the mid-1990s was designed for calculating the long-range and mesoscale dispersion of hazardous substances from point sources, such as those released after an accident in a nuclear power plant. Over the past decades, the model has evolved into a comprehensive tool for multi-scale atmospheric transport modeling and analysis and has attracted a global user community. Its application fields have been extended to a large range of atmospheric gases and aerosols, e.g., greenhouse gases, short-lived climate forcers like black carbon and volcanic ash, and it has also been used to study the atmospheric branch of the water cycle. Given suitable meteorological input data, it can be used for scales from dozens of meters to global. In particular, inverse modeling based on source–receptor relationships from FLEXPART has become widely used. In this paper, we present FLEXPART version 10.4, which works with meteorological input data from the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecast System (IFS) and data from the United States National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS). Since the last publication of a detailed FLEXPART description (version 6.2), the model has been improved in different aspects such as performance, physicochemical parameterizations, input/output formats, and available preprocessing and post-processing software. The model code has also been parallelized using the Message Passing Interface (MPI). We demonstrate that the model scales well up to using 256 processors, with a parallel efficiency greater than 75 % for up to 64 processes on multiple nodes in runs with very large numbers of particles. The deviation from 100 % efficiency is almost entirely due to the remaining nonparallelized parts of the code, suggesting large potential for further speedup. A new turbulence scheme for the convective boundary layer has been developed that considers the skewness in the vertical velocity distribution (updrafts and downdrafts) and vertical gradients in air density. FLEXPART is the only model available considering both effects, making it highly accurate for small-scale applications, e.g., to quantify dispersion in the vicinity of a point source. The wet deposition scheme for aerosols has been completely rewritten and a new, more detailed gravitational settling parameterization for aerosols has also been implemented. FLEXPART has had the option of running backward in time from atmospheric concentrations at receptor locations for many years, but this has now been extended to also work for deposition values and may become useful, for instance, for the interpretation of ice core measurements. To our knowledge, to date FLEXPART is the only model with that capability. Furthermore, the temporal variation and temperature dependence of chemical reactions with the OH radical have been included, allowing for more accurate simulations for species with intermediate lifetimes against the reaction with OH, such as ethane. Finally, user settings can now be specified in a more flexible namelist format, and output files can be produced in NetCDF format instead of FLEXPART's customary binary format. In this paper, we describe these new developments. Moreover, we present some tools for the preparation of the meteorological input data and for processing FLEXPART output data, and we briefly report on alternative FLEXPART versions.
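The connection between the reported parallel efficiency and the remaining non-parallelized code can be made concrete with Amdahl's law. The sketch below uses a purely illustrative serial fraction, not a value measured for FLEXPART, and shows how a fraction of a percent of serial code already brings efficiency down to roughly 75 % at 64 processes.

```python
# Minimal sketch: how a small non-parallelized (serial) code fraction limits
# parallel efficiency, following Amdahl's law. The serial fraction below is
# illustrative only, not a value reported for FLEXPART.

def amdahl_speedup(serial_fraction: float, n_procs: int) -> float:
    """Speedup of a code with the given serial fraction on n_procs processes."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

def parallel_efficiency(serial_fraction: float, n_procs: int) -> float:
    """Parallel efficiency = speedup divided by number of processes."""
    return amdahl_speedup(serial_fraction, n_procs) / n_procs

if __name__ == "__main__":
    serial_fraction = 0.005  # assumed 0.5 % serial code, for illustration
    for n in (1, 16, 64, 256):
        print(f"{n:4d} processes: efficiency = {parallel_efficiency(serial_fraction, n):5.1%}")
```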
Abstract. We have developed a carbon data assimilation system to estimate surface carbon fluxes. The system uses the local ensemble transform Kalman filter (LETKF) and the GEOS-Chem atmospheric transport model driven by MERRA-1 reanalysis meteorological fields based on the Goddard Earth Observing System Model, Version 5 (GEOS-5). The assimilation system is inspired by the approach of Kang et al. (2011, 2012), who estimated surface carbon fluxes in an observing system simulation experiment (OSSE) as evolving parameters in the assimilation of atmospheric CO2, using a short 6 h assimilation window. They included the assimilation of standard meteorological variables, so that the ensemble provides a measure of the uncertainty in the CO2 transport. After introducing new techniques such as 'variable localization' and increased observation weighting near the surface, they obtained accurate surface carbon fluxes at grid-point resolution. We developed a new version of the local ensemble transform Kalman filter related to the 'running-in-place' (RIP) method used to accelerate the spin-up of ensemble Kalman filter (EnKF) data assimilation (Kalnay and Yang, 2010; Wang et al., 2013; Yang et al., 2012). Like RIP, the new assimilation system uses the 'no-cost smoother' algorithm for the LETKF (Kalnay et al., 2007b), which allows the Kalman filter solution to be shifted forward or backward within the assimilation window at essentially no computational cost. In the new scheme, a long 'observation window' (e.g., 7 days or longer) is used to create the LETKF ensemble after 7 days. The RIP smoother is then used to produce an accurate final analysis over the first day. This new approach has the advantage of being based on a short assimilation window, which makes it more accurate, while also being exposed to the future observations within the 7 days, which improves the analysis and accelerates the spin-up. The assimilation and observation windows are then advanced by 1 day, and the process is repeated. This significantly reduces the analysis errors, suggesting that the newly developed assimilation method can be used with other Earth system models, especially in order to make better use of observations in conjunction with these models.
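The two building blocks mentioned above, the LETKF analysis computed in ensemble space and the 'no-cost smoother' that re-applies the same weights to the ensemble valid at an earlier time in the window, can be sketched generically as follows. This follows the standard ensemble-space formulation of the LETKF with an assumed linear observation operator already applied to the ensemble; the dimensions and setup are illustrative, not the actual GEOS-Chem/LETKF configuration.

```python
# Minimal sketch of an LETKF analysis step in ensemble space and of the
# "no-cost smoother" idea used in the RIP scheme: the weights computed from
# observations in the window are applied to the ensemble at an earlier time.
import numpy as np

def letkf_weights(Xb_obs, y, R):
    """Return mean weights w_bar and perturbation weights W from the
    background ensemble in observation space Xb_obs (p x k), observations
    y (p,) and observation-error covariance R (p x p)."""
    p, k = Xb_obs.shape
    yb_mean = Xb_obs.mean(axis=1)
    Yb = Xb_obs - yb_mean[:, None]                       # obs-space perturbations
    C = Yb.T @ np.linalg.inv(R)                          # k x p
    Pa_tilde = np.linalg.inv((k - 1) * np.eye(k) + C @ Yb)
    w_bar = Pa_tilde @ C @ (y - yb_mean)                 # mean update weights
    evals, evecs = np.linalg.eigh((k - 1) * Pa_tilde)    # symmetric square root
    W = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
    return w_bar, W

def apply_weights(Xb, w_bar, W):
    """Apply ensemble-space weights to a state ensemble Xb (n x k). Used for
    the analysis at the end of the window and, as a 'no-cost smoother', for
    the ensemble valid at an earlier time within the window."""
    xb_mean = Xb.mean(axis=1)
    Xb_pert = Xb - xb_mean[:, None]
    return xb_mean[:, None] + Xb_pert @ (w_bar[:, None] + W)
```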
Abstract. Both the root mean square error (RMSE) and the mean absolute error (MAE) are regularly employed in model evaluation studies. Willmott and Matsuura (2005) suggested that the RMSE is not a good indicator of average model performance and might be a misleading indicator of average error, and that the MAE would therefore be a better metric for that purpose. While some concerns over using RMSE raised by Willmott and Matsuura (2005) and Willmott et al. (2009) are valid, the proposed avoidance of RMSE in favor of MAE is not the solution. Citing the aforementioned papers, many researchers chose MAE over RMSE to present their model evaluation statistics when presenting or adding RMSE measures could have been more beneficial. In this technical note, we demonstrate that the RMSE is not ambiguous in its meaning, contrary to what was claimed by Willmott et al. (2009). The RMSE is more appropriate to represent model performance when the error distribution is expected to be Gaussian. In addition, we show that the RMSE satisfies the triangle inequality requirement for a distance metric, whereas Willmott et al. (2009) indicated that sums-of-squares-based statistics do not satisfy this rule. In the end, we discuss some circumstances where using the RMSE will be more beneficial. However, we do not contend that the RMSE is superior to the MAE. Instead, a combination of metrics, including but certainly not limited to RMSEs and MAEs, is often required to assess model performance.
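As a concrete illustration of the two metrics discussed above, the sketch below computes RMSE and MAE for synthetic model-observation pairs. For zero-mean Gaussian errors the RMSE/MAE ratio approaches sqrt(pi/2), roughly 1.25, which is one way to see that each metric carries distinct, well-defined information. The data are synthetic and purely illustrative.

```python
# Minimal sketch: RMSE and MAE for a set of model-observation pairs, plus a
# check of the RMSE/MAE ratio for zero-mean Gaussian errors (-> sqrt(pi/2)).
import numpy as np

def rmse(pred, obs):
    return np.sqrt(np.mean((np.asarray(pred) - np.asarray(obs)) ** 2))

def mae(pred, obs):
    return np.mean(np.abs(np.asarray(pred) - np.asarray(obs)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs = rng.normal(size=100_000)
    pred = obs + rng.normal(scale=0.5, size=obs.size)  # Gaussian model errors
    print(f"RMSE = {rmse(pred, obs):.3f}, MAE = {mae(pred, obs):.3f}, "
          f"ratio = {rmse(pred, obs) / mae(pred, obs):.3f}")  # ~ 1.253
```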
Abstract. Mixed-precision approaches can provide substantial speed-ups for both computing- and memory-bound codes with little effort. Most scientific codes have overengineered the numerical precision, leading to a situation in which models are using more resources than required without knowing where they are required and where they are not. Consequently, it is possible to improve computational performance by establishing a more appropriate choice of precision. The only input that is needed is a method to determine which real variables can be represented with fewer bits without affecting the accuracy of the results. This paper presents a novel method that enables modern and legacy codes to benefit from a reduction of the precision of certain variables without sacrificing accuracy. It consists of a simple idea: we reduce the precision of a group of variables and measure how it affects the outputs. Then we can evaluate the level of precision that they truly need. Modifying and recompiling the code for each case that has to be evaluated would require a prohibitive amount of effort. Instead, the method presented in this paper relies on the use of a tool called a reduced-precision emulator (RPE) that can significantly streamline the process. Using the RPE and a list of parameters containing the precisions that will be used for each real variable in the code, it is possible within a single binary to emulate the effect on the outputs of a specific choice of precision. When we are able to emulate the effects of reduced precision, we can proceed with the design of the tests that will give us knowledge of the sensitivity of the model variables regarding their numerical precision. The number of possible combinations is prohibitively large and therefore impossible to explore. The alternative of performing a screening of the variables individually can provide certain insight about the required precision of variables, but, on the other hand, other complex interactions that involve several variables may remain hidden. Instead, we use a divide-and-conquer algorithm that identifies the parts that require high precision and establishes a set of variables that can handle reduced precision. This method has been tested using two state-of-the-art ocean models, the Nucleus for European Modelling of the Ocean (NEMO) and the Regional Ocean Modeling System (ROMS), with very promising results. Obtaining this information is crucial to build an actual mixed-precision version of the code in the next phase that will bring the promised performance benefits.
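The divide-and-conquer search described above can be sketched as follows. The run_with_reduced_precision() and within_tolerance() callables stand in for a model run through the reduced-precision emulator and an accuracy check against reference output; they are illustrative assumptions, not part of the RPE interface, and the simple independent recursion into halves omits the additional cross-checks needed to catch interactions between subgroups.

```python
# Minimal sketch of a divide-and-conquer search for variables that tolerate
# reduced precision: test a whole group, and recurse into halves when the
# output check fails. Single variables that still fail stay at high precision.

def find_reducible(variables, run_with_reduced_precision, within_tolerance):
    """Return the subset of `variables` that tolerates reduced precision."""
    if not variables:
        return []
    output = run_with_reduced_precision(variables)  # placeholder: emulator run
    if within_tolerance(output):                    # placeholder: accuracy check
        return list(variables)                      # whole group can use fewer bits
    if len(variables) == 1:
        return []                                   # this variable needs high precision
    mid = len(variables) // 2
    return (find_reducible(variables[:mid], run_with_reduced_precision, within_tolerance)
            + find_reducible(variables[mid:], run_with_reduced_precision, within_tolerance))
```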
Abstract. Geoscientific models and measurements generate false precision (scientifically meaningless data bits) that wastes storage space. False precision can mislead (by implying noise is signal) and be scientifically pointless, especially for measurements. By contrast, lossy compression can be both economical (save space) and heuristic (clarify data limitations) without compromising the scientific integrity of data. Data quantization can thus be appropriate regardless of whether space limitations are a concern. We introduce, implement, and characterize a new lossy compression scheme suitable for IEEE floating-point data. Our new Bit Grooming algorithm alternately shaves (to zero) and sets (to one) the least significant bits of consecutive values to preserve a desired precision. This is a symmetric, two-sided variant of an algorithm sometimes called Bit Shaving that quantizes values solely by zeroing bits. Our variation eliminates the artificial low bias produced by always zeroing bits, and makes Bit Grooming more suitable for arrays and multi-dimensional fields whose mean statistics are important. Bit Grooming relies on standard lossless compression to achieve the actual reduction in storage space, so we tested Bit Grooming by applying the DEFLATE compression algorithm to bit-groomed and full-precision climate data stored in netCDF3, netCDF4, HDF4, and HDF5 formats. Bit Grooming reduces the storage space required by initially uncompressed and compressed climate data by 25–80 and 5–65 %, respectively, for single-precision values (the most common case for climate data) quantized to retain 1–5 decimal digits of precision. The potential reduction is greater for double-precision datasets. When used aggressively (i.e., preserving only 1–2 digits), Bit Grooming produces storage reductions comparable to other quantization techniques such as Linear Packing. Unlike Linear Packing, whose guaranteed precision rapidly degrades within the relatively narrow dynamic range of values that it can compress, Bit Grooming guarantees the specified precision throughout the full floating-point range. Data quantization by Bit Grooming is irreversible (i.e., lossy) yet transparent, meaning that no extra processing is required by data users/readers. Hence Bit Grooming can easily reduce data storage volume without sacrificing scientific precision or imposing extra burdens on users.
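A minimal sketch of the quantization step is given below for single-precision data: enough mantissa bits are kept for the requested number of significant decimal digits, and the discarded bits of consecutive values are alternately shaved to zero and set to one so that quantization biases largely cancel in means. The exact bit-count rule and the (omitted) handling of special values are simplifying assumptions here, not a reference implementation.

```python
# Minimal sketch of Bit Grooming for float32 data: keep roughly
# ceil(nsd * log2(10)) mantissa bits plus guard bits, then alternately zero
# (shave) and one (set) the discarded bits of consecutive values.
import math
import numpy as np

def bit_groom(data, nsd):
    """Quantize a float32 array to about nsd significant decimal digits."""
    x = np.ascontiguousarray(data, dtype=np.float32)
    keep_bits = int(math.ceil(nsd * math.log2(10))) + 2   # assumed guard-bit rule
    drop_bits = max(23 - keep_bits, 0)                     # float32 has 23 mantissa bits
    if drop_bits == 0:
        return x.copy()
    ints = x.view(np.uint32).copy()
    shave_mask = np.uint32(0xFFFFFFFF) << np.uint32(drop_bits)  # zeros the trailing bits
    set_mask = np.uint32(~shave_mask)                            # ones in the trailing bits
    ints[0::2] &= shave_mask      # shave even-indexed values (quantize toward zero)
    ints[1::2] |= set_mask        # set odd-indexed values (quantize away from zero)
    return ints.view(np.float32)
```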
Abstract. We present a suite of nine scenarios of future emissions trajectories of anthropogenic sources, a key deliverable of the ScenarioMIP experiment within CMIP6. Integrated assessment model results for 14 different emissions species and 13 emissions sectors are provided for each scenario with consistent transitions from the historical data used in CMIP6 to future trajectories using automated harmonization before being downscaled to provide higher emissions source spatial detail. We find that the scenarios span a wide range of end-of-century radiative forcing values, thus making this set of scenarios ideal for exploring a variety of warming pathways. The set of scenarios is bounded on the low end by a 1.9 W m⁻² scenario, ideal for analyzing a world with end-of-century temperatures well below 2 °C, and on the high end by an 8.5 W m⁻² scenario, resulting in an increase in warming of nearly 5 °C over pre-industrial levels. Between these two extremes, scenarios are provided such that differences between forcing outcomes provide statistically significant regional temperature outcomes to maximize their usefulness for downstream experiments within CMIP6. A wide range of scenario data products are provided for the CMIP6 scientific community including global, regional, and gridded emissions datasets.
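As an illustration of the harmonization step mentioned above, the sketch below applies one simple method: an additive offset that matches the scenario trajectory to the historical value in a harmonization year and then decays linearly to zero by a convergence year. The automated harmonization used for the actual scenario products selects among several such methods per trajectory; the single method and the years chosen here are illustrative assumptions.

```python
# Minimal sketch of additive harmonization with linear convergence: shift a
# scenario so it matches the historical value in the harmonization year, then
# ramp the shift down to zero by a convergence year. Years are illustrative.
import numpy as np

def harmonize_offset(years, scenario, hist_value, harm_year=2015, conv_year=2050):
    years = np.asarray(years, dtype=float)
    scenario = np.asarray(scenario, dtype=float)
    offset0 = hist_value - scenario[years == harm_year][0]
    # full offset at harm_year, linearly decreasing to zero at conv_year
    ramp = np.clip((conv_year - years) / (conv_year - harm_year), 0.0, 1.0)
    return scenario + offset0 * ramp
```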
Abstract. The Fire INventory from NCAR version 1.0 (FINNv1) provides daily, 1 km resolution, global estimates of the trace gas and particle emissions from open burning of biomass, which includes wildfire, agricultural fires, and prescribed burning and does not include biofuel use and trash burning. Emission factors used in the calculations have been updated with recent data, particularly for the non-methane organic compounds (NMOC). The resulting global annual NMOC emission estimates are as much as a factor of 5 greater than some prior estimates. Chemical speciation profiles, necessary to allocate the total NMOC emission estimates to lumped species for use by chemical transport models, are provided for three widely used chemical mechanisms: SAPRC99, GEOS-CHEM, and MOZART-4. Using these profiles, FINNv1 also provides global estimates of key organic compounds, including formaldehyde and methanol. Uncertainties in the emissions estimates arise from several of the method steps. The use of fire hot spots, assumed area burned, land cover maps, biomass consumption estimates, and emission factors all introduce error into the model estimates. The uncertainty in the FINNv1 emission estimates is about a factor of two, but the global estimates agree reasonably well with other global inventories of biomass burning emissions for CO, CO2, and other species with less variable emission factors. FINNv1 emission estimates have been developed specifically for modeling atmospheric chemistry and air quality in a consistent framework at scales from local to global. The product is unique because of the high temporal and spatial resolution, global coverage, and the number of species estimated. FINNv1 can be used for both hindcast and forecast or near-real-time model applications, and the results are being critically evaluated with models and observations whenever possible.
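The emission calculation underlying a fire inventory of this kind can be written as area burned times fuel loading times combustion completeness times a species-specific emission factor. The sketch below uses this generic formulation with placeholder numbers, not FINNv1 parameter values.

```python
# Minimal sketch of a fire emission calculation: emissions of a species at a
# fire pixel are area burned x biomass loading x combustion completeness x
# emission factor. All numbers below are placeholders, not FINNv1 values.
def fire_emission(area_burned_m2, biomass_loading_kg_m2,
                  combustion_completeness, emission_factor_g_kg):
    """Return the emission (kg) of one species for one fire pixel and time step."""
    dry_matter_burned_kg = area_burned_m2 * biomass_loading_kg_m2 * combustion_completeness
    return dry_matter_burned_kg * emission_factor_g_kg / 1000.0  # g -> kg

# Example: a 1 km2 fire pixel in a hypothetical grassland land cover class
print(fire_emission(area_burned_m2=1.0e6,         # 1 km2
                    biomass_loading_kg_m2=0.5,    # placeholder fuel load
                    combustion_completeness=0.9,  # placeholder
                    emission_factor_g_kg=60.0))   # placeholder CO emission factor
```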
Abstract. We describe the HadGEM2 family of climate configurations of the Met Office Unified Model, MetUM. The concept of a model "family" comprises a range of specific model configurations incorporating different levels of complexity but with a common physical framework. The HadGEM2 family of configurations includes atmosphere and ocean components, with and without a vertical extension to include a well-resolved stratosphere, and an Earth-System (ES) component which includes dynamic vegetation, ocean biology and atmospheric chemistry. The HadGEM2 physical model includes improvements designed to address specific systematic errors encountered in the previous climate configuration, HadGEM1, namely Northern Hemisphere continental temperature biases and tropical sea surface temperature biases and poor variability. Targeting these biases was crucial in order that the ES configuration could represent important biogeochemical climate feedbacks. Detailed descriptions and evaluations of particular HadGEM2 family members are included in a number of other publications, and the discussion here is limited to a summary of the overall performance using a set of model metrics which compare the way in which the various configurations simulate present-day climate and its variability.
Abstract. The core version of the Norwegian Climate Center's Earth System Model, named NorESM1-M, is presented. The NorESM family of models is based on the Community Climate System Model version 4 (CCSM4) of the University Corporation for Atmospheric Research, but differs from the latter by, in particular, an isopycnic coordinate ocean model and advanced chemistry–aerosol–cloud–radiation interaction schemes. NorESM1-M has a horizontal resolution of approximately 2° for the atmosphere and land components and 1° for the ocean and ice components. NorESM is also available in a lower resolution version (NorESM1-L) and a version that includes prognostic biogeochemical cycling (NorESM1-ME). The latter two model configurations are not part of this paper. Here, a first-order assessment of the model stability, the mean model state and the internal variability based on the model experiments made available to CMIP5 is presented. Further analysis of the model performance is provided in an accompanying paper (Iversen et al., 2013), presenting the corresponding climate response and scenario projections made with NorESM1-M.
Abstract. The Joint UK Land Environment Simulator (JULES) is a process-based model that simulates the fluxes of carbon, water, energy and momentum between the land surface and the atmosphere. Many studies have demonstrated the important role of the land surface in the functioning of the Earth System. Different versions of JULES have been employed to quantify the effects on the land carbon sink of climate change, increasing atmospheric carbon dioxide concentrations, changing atmospheric aerosols and tropospheric ozone, and the response of methane emissions from wetlands to climate change. This paper describes the consolidation of these advances in the modelling of carbon fluxes and stores, in both the vegetation and soil, in version 2.2 of JULES. Features include a multi-layer canopy scheme for light interception, including a sunfleck penetration scheme, a coupled scheme of leaf photosynthesis and stomatal conductance, representation of the effects of ozone on leaf physiology, and a description of methane emissions from wetlands. JULES represents the carbon allocation, growth and population dynamics of five plant functional types. The turnover of carbon from living plant tissues is fed into a 4-pool soil carbon model. The process-based descriptions of key ecological processes and trace gas fluxes in JULES mean that this community model is well-suited for use in carbon cycle, climate change and impacts studies, either in standalone mode or as the land component of a coupled Earth system model.
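The soil carbon component described above can be illustrated with a four-pool, first-order decay model: litter enters two decomposable pools, and decomposed carbon is partly respired and partly transferred to biomass and humus pools. The sketch below is a generic illustration in the spirit of such a scheme; all rate constants and partitioning fractions are placeholders, not JULES parameter values.

```python
# Minimal sketch of a four-pool, first-order soil carbon model: each pool
# decays at a base rate scaled by an environmental factor; decomposed carbon
# is partly respired as CO2 and partly transferred to biomass and humus pools.
import numpy as np

POOLS = ("DPM", "RPM", "BIO", "HUM")           # decomposable/resistant plant material, biomass, humus
BASE_RATE = np.array([10.0, 0.3, 0.66, 0.02])  # placeholder decay rates (yr^-1)

def step_soil_carbon(carbon, litter_dpm, litter_rpm, env_factor, dt):
    """Advance the pool carbon stocks (kg C m-2) by one time step dt (years)."""
    decomp = BASE_RATE * env_factor * carbon * dt          # carbon leaving each pool
    respired = 0.5 * decomp.sum()                          # placeholder respired fraction
    to_bio, to_hum = 0.3 * decomp.sum(), 0.2 * decomp.sum()
    new = carbon - decomp
    new[0] += litter_dpm * dt                              # litter input to DPM
    new[1] += litter_rpm * dt                              # litter input to RPM
    new[2] += to_bio
    new[3] += to_hum
    return new, respired
```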