Parallel Efficiency (parallel + efficiency)

Selected Abstracts


Parallel divide-and-conquer scheme for 2D Delaunay triangulation

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 12 2006
Min-Bin Chen
Abstract This work describes a parallel divide-and-conquer Delaunay triangulation scheme. This algorithm finds the affected zone, which covers the triangulation and may be modified when two sub-block triangulations are merged. Finding the affected zone can reduce the amount of data required to be transmitted between processors. The time complexity of the divide-and-conquer scheme remains O(n log n), and the affected region can be located in O(n) time steps, where n denotes the number of points. The code was implemented with C, FORTRAN and MPI, making it portable to many computer systems. Experimental results on an IBM SP2 show that a parallel efficiency of 44–95% for general distributions can be attained on a 16-node distributed memory system. Copyright © 2006 John Wiley & Sons, Ltd.
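As a reading aid for the figures quoted above: parallel efficiency is conventionally defined as speedup divided by processor count, E = T1 / (p * Tp), so an efficiency on p processors implies a speedup S = E * p. The sketch below applies that standard definition to the 44–95% range on 16 nodes reported in the abstract. The function names are illustrative assumptions, and the implied speedups are derived here, not stated by the author.

```python
def parallel_efficiency(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Standard definition: E = T1 / (p * Tp)."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    return t_serial / (n_procs * t_parallel)


def implied_speedup(efficiency: float, n_procs: int) -> float:
    """Invert the definition: S = E * p."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    return efficiency * n_procs


# Efficiency range and node count as quoted in the abstract.
low = implied_speedup(0.44, 16)
high = implied_speedup(0.95, 16)
print(f"implied speedup on 16 nodes: {low:.2f}x to {high:.2f}x")
```

Under this definition, the quoted range corresponds to speedups of roughly 7x to 15x over a single node.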


On a multilevel preconditioning module for unstructured mesh Krylov solvers: two-level Schwarz

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING, Issue 6 2002
R. S. Tuminaro
Abstract Multilevel methods offer the best promise to attain both fast convergence and parallel efficiency in the numerical solution of parabolic and elliptic partial differential equations. Unfortunately, they have not been widely used in part because of implementation difficulties for unstructured mesh solvers. To facilitate use, a multilevel preconditioner software module, ML, has been constructed. Several methods are provided requiring relatively modest programming effort on the part of the application developer. This report discusses the implementation of one method in the module: a two-level Krylov–Schwarz preconditioner. To illustrate the use of these methods in computational fluid dynamics (CFD) engineering applications, we present results for 2D and 3D CFD benchmark problems. Copyright © 2002 John Wiley & Sons, Ltd.


Parallel eigenanalysis of multiaquifer systems

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, Issue 15 2005
L. Bergamaschi
Abstract Finite element discretizations of flow problems involving multiaquifer systems deliver large, sparse, unstructured matrices, whose partial eigenanalysis is important for both solving the flow problem and analysing its main characteristics. We studied and implemented an effective preconditioning of the Jacobi–Davidson algorithm by FSAI-type preconditioners. We developed efficient parallelization strategies in order to solve very large problems, which could not fit into the storage available to a single processor. We report our results about the solution of multiaquifer flow problems on an SP4 machine and a Linux Cluster. We analyse the sequential and parallel efficiency of our algorithm, also compared with standard packages. Questions regarding the parallel solution of finite element eigenproblems are addressed and discussed. Copyright © 2005 John Wiley & Sons, Ltd.


Large-scale parallel finite-element analysis using the internet: a performance study

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, Issue 2 2005
Ryuji Shioya
Abstract This paper describes a parallel finite-element system implemented using the domain decomposition method on a cluster of remote computers connected via the Internet. This technique is also readily applicable to a grid computing environment. A three-dimensional finite-element elastic analysis involving more than one million degrees of freedom was solved using this system, and a good approximate solution was obtained with high parallel efficiency of over 90% using remote computers located in three different countries. Copyright © 2005 John Wiley & Sons, Ltd.


A distributed memory parallel implementation of the multigrid method for solving three-dimensional implicit solid mechanics problems

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, Issue 8 2004
A. Namazifard
Abstract We describe the parallel implementation of a multigrid method for unstructured finite element discretizations of solid mechanics problems. We focus on a distributed memory programming model and use the MPI library to perform the required interprocessor communications. We present an algebraic framework for our parallel computations, and describe an object-based programming methodology using Fortran90. The performance of the implementation is measured by solving both fixed- and scaled-size problems on three different parallel computers (an SGI Origin2000, an IBM SP2 and a Cray T3E). The code performs well in terms of speedup, parallel efficiency and scalability. However, the floating point performance is considerably below the peak values attributed to these machines. Lazy processors are documented on the Origin that produce reduced performance statistics. The solution of two problems on an SGI Origin2000, an IBM PowerPC SMP and a Linux cluster demonstrates that the algorithm performs well when applied to the unstructured meshes required for practical engineering analysis. Copyright © 2004 John Wiley & Sons, Ltd.
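The distinction between fixed-size and scaled-size measurements mentioned above can be sketched as follows. This uses the usual textbook definitions (fixed total problem size versus problem size grown in proportion to the processor count), which are assumed here rather than taken from the paper. All timings are hypothetical.

```python
def fixed_size_efficiency(t1: float, tp: float, p: int) -> float:
    """Fixed-size (strong scaling) case: the same total problem is run
    on 1 and on p processors, so the ideal time is t1 / p."""
    if t1 <= 0 or tp <= 0 or p < 1:
        raise ValueError("timings must be positive and p at least 1")
    return t1 / (p * tp)


def scaled_size_efficiency(t1: float, tp: float) -> float:
    """Scaled-size (weak scaling) case: the work per processor is held
    constant as p grows, so the ideal time stays equal to t1."""
    if t1 <= 0 or tp <= 0:
        raise ValueError("timings must be positive")
    return t1 / tp


# Hypothetical timings in seconds, not results from the paper.
print(fixed_size_efficiency(t1=120.0, tp=10.0, p=16))   # same mesh on 16 processors
print(scaled_size_efficiency(t1=10.0, tp=12.5))          # mesh 16x larger on 16 processors
```

Reporting both is useful because a code can hold its scaled-size efficiency well while its fixed-size efficiency drops once each processor has too little work relative to communication.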


Coupled Navier–Stokes–Molecular dynamics simulations using a multi-physics flow simulation framework

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, Issue 10 2010
R. Steijl
Abstract Simulation of nano-scale channel flows using a coupled Navier–Stokes/Molecular Dynamics (MD) method is presented. The flow cases serve as examples of the application of a multi-physics computational framework put forward in this work. The framework employs a set of (partially) overlapping sub-domains in which different levels of physical modelling are used to describe the flow. This way, numerical simulations based on the Navier–Stokes equations can be extended to flows in which the continuum and/or Newtonian flow assumptions break down in regions of the domain, by locally increasing the level of detail in the model. Then, the use of multiple levels of physical modelling can reduce the overall computational cost for a given level of fidelity. The present work describes the structure of a parallel computational framework for such simulations, including details of a Navier–Stokes/MD coupling, the convergence behaviour of coupled simulations as well as the parallel implementation. For the cases considered here, micro-scale MD problems are constructed to provide viscous stresses for the Navier–Stokes equations. The first problem is the planar Poiseuille flow, for which the viscous fluxes on each cell face in the finite-volume discretization are evaluated using MD. The second example deals with fully developed three-dimensional channel flow, with molecular level modelling of the shear stresses in a group of cells in the domain corners. An important aspect in using shear stresses evaluated with MD in Navier–Stokes simulations is the scatter in the data due to the sampling of a finite ensemble over a limited interval. In the coupled simulations, this prevents the convergence of the system in terms of the reduction of the norm of the residual vector of the finite-volume discretization of the macro-domain. Solutions to this problem are discussed in the present work, along with an analysis of the effect of number of realizations and sample duration. The averaging of the apparent viscosity for each cell face, i.e. the ratio of the shear stress predicted from MD and the imposed velocity gradient, over a number of macro-scale time steps is shown to be a simple but effective method to reach a good level of convergence of the coupled system. Finally, the parallel efficiency of the developed method is demonstrated. Copyright © 2009 John Wiley & Sons, Ltd.
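The averaging step described above can be illustrated with a minimal sketch for a single cell face. The abstract states only that the apparent viscosity (MD shear stress divided by imposed velocity gradient) is averaged over a number of macro-scale time steps. The sliding-window form, the class name and the sample values below are assumptions made for illustration and may differ from the authors' implementation.

```python
from collections import deque


class ApparentViscosityAverager:
    """Sliding-window mean of the apparent viscosity on one cell face.

    Assumption: a fixed-length window over recent macro time steps;
    the paper may use a different averaging scheme.
    """

    def __init__(self, window: int):
        if window < 1:
            raise ValueError("window must be at least 1")
        # deque with maxlen drops the oldest sample automatically.
        self._samples = deque(maxlen=window)

    def update(self, shear_stress: float, velocity_gradient: float) -> float:
        """Add one MD sample and return the current averaged viscosity."""
        if velocity_gradient == 0.0:
            raise ValueError("velocity gradient must be non-zero")
        self._samples.append(shear_stress / velocity_gradient)
        return sum(self._samples) / len(self._samples)


# Hypothetical noisy MD shear stresses for an imposed gradient of 1.0.
averager = ApparentViscosityAverager(window=4)
for tau in [1.8, 2.2, 2.1, 1.9]:
    mu_avg = averager.update(shear_stress=tau, velocity_gradient=1.0)
print(f"averaged apparent viscosity: {mu_avg:.3f}")
```

The averaged value, rather than each raw MD sample, would then be passed to the macro-scale solver, which damps the statistical scatter that otherwise stalls the residual reduction.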


Inexact information aided, low-cost, distributed genetic algorithms for aerodynamic shape optimization

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, Issue 10-11 2003
Marios K. Karakasis
Abstract Despite its robustness, the design and optimization of aerodynamic shapes using genetic algorithms suffers from high computing cost requirements, due to excessive calls to Computational Fluid Dynamics tools for the evaluation of candidate solutions. To alleviate this problem, either the use of distributed genetic algorithms or the implementation of surrogate evaluation models have separately been proposed in the past. A distributed genetic algorithm relies on the handling of population subsets that evolve in a semi-isolated manner by regularly exchanging their best individuals. It is known that distributed schemes generally outperform single-population ones. On the other hand, the implementation of less costly surrogate evaluation tools, such as the autocatalytic radial basis function networks developed by the authors for the purpose of getting rid of most of the 'useless' exact evaluations, reduces considerably the computational cost. The aim of the present paper is to employ a surrogate evaluation model in the context of a distributed genetic algorithm and to demonstrate that the combination of both results in maximum economy in CPU cost. In addition, whenever a multiprocessor system is available, the gain is much more pronounced, since the new optimization method maximizes parallel efficiency. The proposed method is used to solve inverse design and optimization problems in aeronautics and turbomachinery. Copyright © 2003 John Wiley & Sons, Ltd.


Parallelization of a vorticity formulation for the analysis of incompressible viscous fluid flows

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, Issue 11 2002
Mary J. Brown
Abstract A parallel computer implementation of a vorticity formulation for the analysis of incompressible viscous fluid flow problems is presented. The vorticity formulation involves a three-step process, two kinematic steps followed by a kinetic step. The first kinematic step determines vortex sheet strengths along the boundary of the domain from a Galerkin implementation of the generalized Helmholtz decomposition. The vortex sheet strengths are related to the vorticity flux boundary conditions. The second kinematic step determines the interior velocity field from the regular form of the generalized Helmholtz decomposition. The third kinetic step solves the vorticity equation using a Galerkin finite element method with boundary conditions determined in the first step and velocities determined in the second step. The accuracy of the numerical algorithm is demonstrated through the driven-cavity problem and the 2-D cylinder in a free-stream problem, which represent both internal and external flows. Each of the three steps requires a unique parallelization effort, each of which is evaluated in terms of parallel efficiency. Copyright © 2002 John Wiley & Sons, Ltd.


A new parallel algorithm of MP2 energy calculations

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 4 2006
Kazuya Ishimura
Abstract A new parallel algorithm has been developed for second-order Møller–Plesset perturbation theory (MP2) energy calculations. Its main projected applications are for large molecules, for instance, for the calculation of dispersion interaction. Tests on a moderate number of processors (2–16) show that the program has high CPU and parallel efficiency. Timings are presented for two relatively large molecules, taxol (C47H51NO14) and luciferin (C11H8N2O3S2), the former with the 6-31G* and 6-311G** basis sets (1032 and 1484 basis functions, 164 correlated orbitals), and the latter with the aug-cc-pVDZ and aug-cc-pVTZ basis sets (530 and 1198 basis functions, 46 correlated orbitals). An MP2 energy calculation on C130H10 (1970 basis functions, 265 correlated orbitals) completed in less than 2 h on 128 processors. © 2006 Wiley Periodicals, Inc. J Comput Chem 27: 407–413, 2006