Multiprocessor Systems (multiprocessor + system)

Selected Abstracts


Performance of computationally intensive parameter sweep applications on Internet-based Grids of computers: the mapping of molecular potential energy hypersurfaces

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 4 2007
S. Reyes
Abstract This work focuses on the use of computational Grids for processing the large set of jobs arising in parameter sweep applications. In particular, we tackle the mapping of molecular potential energy hypersurfaces. For computationally intensive parameter sweep problems, performance models are developed to compare the parallel computation in a multiprocessor system with the computation on an Internet-based Grid of computers. We find that the relative performance of the Grid approach increases with the number of processors, being independent of the number of jobs. The experimental data, obtained using electronic structure calculations, fit the proposed performance expressions accurately. To automate the mapping of potential energy hypersurfaces, an application based on GRID superscalar is developed. It is tested on the prototypical case of the internal dynamics of acetone. Copyright © 2006 John Wiley & Sons, Ltd.


Inexact information aided, low-cost, distributed genetic algorithms for aerodynamic shape optimization

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, Issue 10-11 2003
Marios K. Karakasis
Abstract Despite its robustness, the design and optimization of aerodynamic shapes using genetic algorithms suffers from high computing cost requirements, due to excessive calls to Computational Fluid Dynamics tools for the evaluation of candidate solutions. To alleviate this problem, either the use of distributed genetic algorithms or the implementation of surrogate evaluation models have separately been proposed in the past. A distributed genetic algorithm relies on the handling of population subsets that evolve in a semi-isolated manner by regularly exchanging their best individuals. It is known that distributed schemes generally outperform single-population ones. On the other hand, the implementation of less costly surrogate evaluation tools, such as the autocatalytic radial basis function networks developed by the authors for the purpose of getting rid of most of the 'useless' exact evaluations, reduces considerably the computational cost. The aim of the present paper is to employ a surrogate evaluation model in the context of a distributed genetic algorithm and to demonstrate that the combination of both results in maximum economy in CPU cost. In addition, whenever a multiprocessor system is available, the gain is much more pronounced, since the new optimization method maximizes parallel efficiency. The proposed method is used to solve inverse design and optimization problems in aeronautics and turbomachinery. Copyright © 2003 John Wiley & Sons, Ltd.
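
The distributed scheme described in this abstract (population subsets that evolve semi-independently and regularly exchange their best individuals) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the objective `sphere` stands in for an expensive exact evaluation, the ring migration topology, truncation selection, and all parameter values are hypothetical, and the surrogate evaluation model is left out entirely.

```python
import random

def sphere(x):
    """Hypothetical cheap objective standing in for an expensive exact evaluation."""
    return sum(v * v for v in x)

def evolve(island, rng, mutation=0.1):
    """One generation: keep the better half, refill with mutated copies of survivors."""
    island.sort(key=sphere)
    survivors = island[: len(island) // 2]
    children = [[v + rng.gauss(0.0, mutation) for v in rng.choice(survivors)]
                for _ in range(len(island) - len(survivors))]
    return survivors + children

def migrate(islands):
    """Ring migration: the best individual of the previous island replaces this island's worst."""
    bests = [min(isl, key=sphere) for isl in islands]
    for i, isl in enumerate(islands):
        worst = max(range(len(isl)), key=lambda k: sphere(isl[k]))
        isl[worst] = list(bests[i - 1])

def run(n_islands=4, size=10, dim=3, generations=40, interval=5, seed=0):
    """Evolve the islands in isolation, exchanging best individuals every `interval` generations."""
    rng = random.Random(seed)
    islands = [[[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(size)]
               for _ in range(n_islands)]
    initial_best = min(sphere(x) for isl in islands for x in isl)
    for g in range(1, generations + 1):
        islands = [evolve(isl, rng) for isl in islands]
        if g % interval == 0:
            migrate(islands)
    final_best = min(sphere(x) for isl in islands for x in isl)
    return initial_best, final_best

initial_best, final_best = run()
```

Because each island keeps its better half unchanged, the best objective value never gets worse in this sketch. On a multiprocessor system, the per-island `evolve` calls between migrations are independent and could run in parallel.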


Java multithreading-based parallel approximate arrow-type inverses

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 10 2008
George A. Gravvanis
Abstract A new parallel shared memory Java multithreaded design and implementation of the explicit approximate inverse preconditioning, for efficiently solving arrow-type linear systems on symmetric multiprocessor systems (SMPs), is presented. A new parallel algorithm for computing a class of optimized approximate arrow-type inverse matrix is introduced. The performance on an SMP, using Java multithreading, is investigated by solving arrow-type linear systems and numerical results are given. The parallel performance of the construction of the optimized approximate inverse and the explicit preconditioned generalized conjugate gradient square scheme, using a dynamic workload scheduling, is also presented. Copyright © 2007 John Wiley & Sons, Ltd.


An overlapping task assignment scheme for hierarchical coarse-grain task parallel processing

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 11 2006
Akimasa Yoshida
Article first published online: 12 JAN 200
Abstract This paper proposes an overlapping task assignment scheme for the hierarchical coarse-grain task parallel processing on multiprocessor systems. In coarse-grain task parallel processing, the compiler extracts parallelism among coarse-grain tasks automatically and the coarse-grain tasks are assigned to processor clusters at runtime. However, several programs may decrease the processor-cluster utilization factor owing to lack of parallelism inside each coarse-grain task. Therefore, in order to improve the processor-cluster utilization factor, this paper proposes the execution scheme with overlapping task assignment whose dynamic scheduler can assign several coarse-grain tasks to a processor cluster simultaneously. Also, the performance evaluations by simulations and executions on SMP showed that the proposed scheme could reduce the execution time remarkably. Copyright © 2006 John Wiley & Sons, Ltd.


Distributed loop-scheduling schemes for heterogeneous computer systems

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 7 2006
Anthony T. Chronopoulos
Abstract Distributed computing systems are a viable and less expensive alternative to parallel computers. However, a serious difficulty in concurrent programming of a distributed system is how to deal with scheduling and load balancing of such a system which may consist of heterogeneous computers. Some distributed scheduling schemes suitable for parallel loops with independent iterations on heterogeneous computer clusters have been designed in the past. In this work we study self-scheduling schemes for parallel loops with independent iterations which have been applied to multiprocessor systems in the past. We extend one important scheme of this type to a distributed version suitable for heterogeneous distributed systems. We implement our new scheme on a network of computers and make performance comparisons with other existing schemes. Copyright © 2005 John Wiley & Sons, Ltd.
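
As a rough illustration of what a self-scheduling scheme for a parallel loop looks like, the sketch below uses guided self-scheduling, a well-known rule in which each idle worker takes ceil(remaining / workers) iterations. The abstract does not say which scheme the authors extend, so this choice is an assumption, as are the function names `guided_chunks` and `simulate` and the relative worker speeds.

```python
import heapq
import math

def guided_chunks(total_iterations, workers):
    """Yield chunk sizes: each request receives ceil(remaining / workers) iterations."""
    remaining = total_iterations
    while remaining > 0:
        chunk = math.ceil(remaining / workers)
        yield chunk
        remaining -= chunk

def simulate(total_iterations, speeds):
    """Whichever worker becomes idle first pulls the next chunk.

    `speeds` holds hypothetical relative speeds (iterations per time unit).
    Returns the number of iterations each worker ends up executing.
    """
    done = [0] * len(speeds)
    ready = [(0.0, w) for w in range(len(speeds))]  # (time the worker is free, worker id)
    heapq.heapify(ready)
    for chunk in guided_chunks(total_iterations, len(speeds)):
        t, w = heapq.heappop(ready)
        done[w] += chunk
        heapq.heappush(ready, (t + chunk / speeds[w], w))
    return done

chunks = list(guided_chunks(100, 4))
share = simulate(100, [4.0, 1.0])
```

Chunk sizes shrink as the loop drains, which keeps the final chunks small enough to even out finishing times. The rule itself ignores worker speed, so a slow machine that grabs an early large chunk can still delay completion; that limitation is one reason heterogeneous systems call for adapted schemes.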


Clustering revealed in high-resolution simulations and visualization of multi-resolution features in fluid-particle models

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 2 2003
Krzysztof Boryczko
Abstract Simulating natural phenomena at greater accuracy results in an explosive growth of data. Large-scale simulations with particles currently involve ensembles consisting of between 10^6 and 10^9 particles, which cover 10^5 to 10^6 time steps. Thus, the data files produced in a single run can reach from tens of gigabytes to hundreds of terabytes. This data bank allows one to reconstruct the spatio-temporal evolution of both the particle system as a whole and each particle separately. Realistically, for one to look at a large data set at full resolution at all times is not possible and, in fact, not necessary. We have developed an agglomerative clustering technique, based on the concept of a mutual nearest neighbor (MNN). This procedure can be easily adapted for efficient visualization of extremely large data sets from simulations with particles at various resolution levels. We present the parallel algorithm for MNN clustering and its timings on the IBM SP and SGI/Origin 3800 multiprocessor systems for up to 16 million fluid particles. The high efficiency obtained is mainly due to the similarity in the algorithmic structure of MNN clustering and particle methods. We show various examples drawn from MNN applications in visualization and analysis of the order of a few hundred gigabytes of data from discrete particle simulations, using dissipative particle dynamics and fluid particle models. Because data clustering is the first step in this concept extraction procedure, we may employ this clustering procedure to many other fields such as data mining, earthquake events and stellar populations in nebula clusters. Copyright © 2003 John Wiley & Sons, Ltd.
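
The mutual nearest neighbor idea named in this abstract can be shown on a toy point set: two points form an MNN pair when each is the other's closest neighbor, and such pairs are natural candidates for merging in one agglomeration step. The sketch below is a brute-force O(n^2) illustration only; the function names and sample coordinates are hypothetical, and it does not reproduce the authors' parallel algorithm.

```python
import math

def nearest(points, i):
    """Index of the point closest to points[i], found by brute force."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: math.dist(points[i], points[j]))

def mutual_nearest_pairs(points):
    """Pairs (i, j) with i < j where each point is the other's nearest neighbor."""
    nn = [nearest(points, i) for i in range(len(points))]
    return [(i, j) for i, j in enumerate(nn) if i < j and nn[j] == i]

# Two tight pairs plus one point in between that belongs to neither.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.5, 0.4)]
pairs = mutual_nearest_pairs(points)
```

Here the last point's nearest neighbor is point 1, but point 1 is closer to point 0, so the last point stays unpaired. Each nearest-neighbor query is independent of the others, which resembles the neighbor searches in particle methods that the abstract credits for the parallel efficiency.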


A class of parallel multiple-front algorithms on subdomains

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, Issue 11 2003
A. Bose
Abstract A class of parallel multiple-front solution algorithms is developed for solving linear systems arising from discretization of boundary value problems and evolution problems. The basic substructuring approach and frontal algorithm on each subdomain are first modified to ensure stable factorization in situations where ill-conditioning may occur due to differing material properties or the use of high degree finite elements (p methods). Next, the method is implemented on distributed-memory multiprocessor systems with the final reduced (small) Schur complement problem solved on a single processor. A novel algorithm that implements a recursive partitioning approach on the subdomain interfaces is then developed. Both algorithms are implemented and compared in a least-squares finite-element scheme for viscous incompressible flow computation using h- and p-finite element schemes. Copyright © 2003 John Wiley & Sons, Ltd.
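
For readers unfamiliar with the reduced Schur complement problem mentioned in this abstract, the standard substructuring algebra is sketched below. The notation is an assumption and is not taken from the paper: superscript s labels a subdomain, I its interior unknowns, B the shared interface unknowns, and the restriction operators that map global interface unknowns to each subdomain are omitted for brevity.

```latex
% Block system on subdomain s (interior I, interface B)
\begin{pmatrix} K_{II}^{s} & K_{IB}^{s} \\ K_{BI}^{s} & K_{BB}^{s} \end{pmatrix}
\begin{pmatrix} u_{I}^{s} \\ u_{B} \end{pmatrix}
=
\begin{pmatrix} f_{I}^{s} \\ f_{B}^{s} \end{pmatrix}

% Eliminating the interior unknowns of every subdomain (independent, hence parallel)
S = \sum_{s} \left( K_{BB}^{s} - K_{BI}^{s} \,(K_{II}^{s})^{-1} K_{IB}^{s} \right),
\qquad
g = \sum_{s} \left( f_{B}^{s} - K_{BI}^{s} \,(K_{II}^{s})^{-1} f_{I}^{s} \right)

% Reduced interface problem, then back-substitution on each subdomain
S\, u_{B} = g,
\qquad
u_{I}^{s} = (K_{II}^{s})^{-1} \left( f_{I}^{s} - K_{IB}^{s}\, u_{B} \right)
```

The per-subdomain eliminations and back-substitutions are independent of one another, so only the interface system S u_B = g couples the subdomains; this is the small problem the abstract describes as being solved on a single processor.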