Performance and power consumption evaluation of concurrent queue implementations in embedded systems
Computer Science - Research and Development - Volume 30 - Pages 165-175 - 2014
Lazaros Papadopoulos, Ivan Walulya, Paul Renaud-Goud, Philippas Tsigas, Dimitrios Soudris, Brendan Barry
Embedded and high performance computing (HPC) systems face many common challenges. One of them is the synchronization of the memory accesses in shared data. Concurrent queues have been extensively studied in the HPC domain and they are used in a wide variety of HPC applications. In this work, we evaluate a set of concurrent queue implementations in an embedded platform, in terms of execution time ...
Automatic detection and quantification of coronary calcium on 3D CT angiography data
Computer Science - Research and Development - Volume 26 - Pages 117-124 - 2010
Matthias Teßmann, Fernando Vega-Higuera, Bernhard Bischoff, Jörg Hausleiter, Günther Greiner
Cardiac calcium scoring is an important step for the diagnosis of coronary heart diseases. Therefore, non-contrast enhanced cardiac computed tomography has been established as the de facto standard method for clinical risk assessment and contrast enhanced computed tomography has proven to be a reliable, non-invasive alternative to traditional coronary angiography. However, calcium scores determine...
Profiling high performance dense linear algebra algorithms on multicore architectures for power and energy efficiency
Computer Science - Research and Development - Volume 27 - Pages 277-287 - 2011
Hatem Ltaief, Piotr Luszczek, Jack Dongarra
This paper presents the power profile of two high performance dense linear algebra libraries, i.e., LAPACK and PLASMA. The former is based on block algorithms that use the fork-join paradigm to achieve parallel performance. The latter uses fine-grained task parallelism that recasts the computation to operate on submatrices called tiles. In this way tile algorithms are formed. We show results from t...
Mapping fine-grained power measurements to HPC application runtime characteristics on IBM POWER7
Computer Science - Research and Development - Volume 29 - Pages 211-219 - 2013
Michael Knobloch, Maciej Foszczynski, Willi Homberg, Dirk Pleiter, Hans Böttiger
Optimization of energy consumption is a key issue for future HPC. Evaluation of energy consumption requires a fine-grained power measurement. Additional useful information is obtained when performing these measurements at component level. In this paper we describe a setup that allows fine-grained power measurements with up to 1 ms resolution at component level on IBM POWER (IBM and POWER ...
Modeling power and energy of the task-parallel Cholesky factorization on multicore processors
Computer Science - Research and Development - Volume 29 - Pages 105-112 - 2012
Pedro Alonso, Manuel F. Dolz, Rafael Mayo, Enrique S. Quintana-Ortí
In this paper we introduce a model for the total energy consumption of the Cholesky factorization on a multicore processor. Our model assumes a task-parallel execution of the factorization process, with concurrency leveraged via a run-time such as those recently proposed in projects like SMPSs, PLASMA or libflame, and decomposes the power usage into its system, static and dynamic components. A few simp...
Energy-aware job scheduler for high-performance computing
Computer Science - Research and Development - Volume 27 - Pages 265-275 - 2011
Olli Mämmelä, Mikko Majanen, Robert Basmadjian, Hermann De Meer, André Giesler, Willi Homberg
In recent years energy-aware computing has become a major topic, not only in wireless and mobile devices but also in devices using wired technology. The ICT industry is consuming an increasing amount of energy and a large part of the consumption is generated by large-scale data centers. In High-Performance Computing (HPC) data centers, higher performance equals higher energy consumption. This has ...
Scalable parallel AMG on ccNUMA machines with OpenMP
Computer Science - Research and Development - Volume 26 - Pages 221-228 - 2011
Malte Förster, Jiri Kraus
In many numerical simulation codes the backbone of the application covers the solution of linear systems of equations. Often, being created via a discretization of differential equations, the corresponding matrices are very sparse. One popular way to solve these sparse linear systems is with multigrid methods, in particular AMG, because of their numerical scalability. But looking at modern multi-core ar...
Using LAMA for efficient AMG on hybrid clusters
Computer Science - Research and Development - Volume 28 - Pages 211-220 - 2012
Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann
In this paper, we describe the implementation of an AMG solver for a hybrid cluster that exploits distributed and shared memory parallelization and uses the available GPU accelerators on each node. This solver has been written by using LAMA (Library for Accelerated Math Applications). This library does not only provide an easy-to-use framework for solvers that might run on different devices with d...