Performance Comparisons



Selected Abstracts


Performance comparison of MPI and OpenMP on shared memory multiprocessors

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 1 2006
Géraud Krawezik
Abstract When using a shared memory multiprocessor, the programmer faces the issue of selecting the portable programming model that will provide the best performance. Even if they restrict their choice to the standard programming environments (MPI and OpenMP), they still have to select a programming approach from among MPI and the variety of OpenMP programming styles. To help the programmer in this decision, we compare MPI with three OpenMP programming styles (loop level, loop level with large parallel sections, SPMD) using a subset of the NAS benchmarks (CG, MG, FT, LU), two dataset sizes (A and B), and two shared memory multiprocessors (IBM SP3 NightHawk II, SGI Origin 3800). We have developed the first SPMD OpenMP version of the NAS benchmarks and gathered other OpenMP versions from independent sources (PBN, SDSC and RWCP). Experimental results demonstrate that OpenMP provides competitive performance compared with MPI for a large set of experimental conditions. Not surprisingly, the two best OpenMP versions are those requiring the strongest programming effort. MPI still provides the best performance under some conditions. We present breakdowns of the execution times and measurements of hardware performance counters to explain the performance differences. Copyright © 2005 John Wiley & Sons, Ltd. [source]
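
The distinction between the loop-level and SPMD styles compared above can be sketched outside OpenMP itself. The toy Python threading sketch below is a stand-in illustration, not the paper's OpenMP code; the thread count and array size are invented. It shows the structural difference: loop-level parallelism forks and joins workers around every parallel loop, whereas SPMD keeps one persistent team of workers that each own a fixed slice of the data and synchronize at barriers.

```python
import threading

N, WORKERS, STEPS = 100_000, 4, 10
a = [0.0] * N
barrier = threading.Barrier(WORKERS)

def spmd_worker(rank):
    # SPMD style: each worker owns a fixed index range for the whole run
    # and synchronizes at barriers, instead of being re-forked per loop.
    lo, hi = rank * N // WORKERS, (rank + 1) * N // WORKERS
    for _ in range(STEPS):
        for i in range(lo, hi):
            a[i] += 1.0
        barrier.wait()   # replaces the implicit join of the loop-level style

threads = [threading.Thread(target=spmd_worker, args=(r,)) for r in range(WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(a[0], a[-1])       # both 10.0
```

A loop-level version would instead create and join the threads inside the STEPS loop, paying a fork-join cost per loop; that repeated overhead is one reason the SPMD and large-parallel-section styles, which demand more programming effort, come out ahead. (Because of the GIL, this Python sketch illustrates structure only, not speedup.)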


Performance comparison of checkpoint and recovery protocols

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 15 2003
Himadri Sekhar Paul
Abstract Checkpoint and rollback recovery is a well-known technique for providing fault tolerance to long-running distributed applications. Performance of a checkpoint and recovery protocol depends on the characteristics of the application and the system on which it runs. However, given an application and system environment, there is no easy way to identify which checkpoint and recovery protocol will be most suitable for it. Conventional approaches require implementing the application with all the protocols under consideration, running them on the desired system, and comparing their performances. This process can be very tedious and time consuming. This paper first presents the design and implementation of a simulation environment, distributed process simulation or dPSIM, which enables easy implementation and evaluation of checkpoint and recovery protocols. The tool enables the protocols to be simulated under a wide variety of application, system, and network characteristics. The paper then presents performance evaluation of five checkpoint and recovery protocols. These protocols are implemented and executed in dPSIM under different simulated application, system, and network characteristics. Copyright © 2003 John Wiley & Sons, Ltd. [source]
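
To see why such protocol and parameter choices are hard to settle analytically, the following toy Monte Carlo (a generic single-process model with assumed exponential failures, not dPSIM itself) estimates total wall-clock time as a function of the checkpoint interval, the basic trade-off every checkpoint and recovery protocol must balance:

```python
import random

def run_time(work, interval, ckpt_cost, restart_cost, mtbf, seed=0):
    """Wall-clock time to finish `work` s of computation, checkpointing
    every `interval` s, under exponentially distributed failures."""
    rng = random.Random(seed)
    done, clock = 0.0, 0.0
    next_fail = clock + rng.expovariate(1.0 / mtbf)
    while done < work:
        seg = min(interval, work - done)
        if clock + seg + ckpt_cost <= next_fail:
            clock += seg + ckpt_cost          # segment completes and is saved
            done += seg
        else:
            clock = next_fail + restart_cost  # failure: roll back to last checkpoint
            next_fail = clock + rng.expovariate(1.0 / mtbf)
    return clock

for interval in (60, 300, 900, 3600):
    avg = sum(run_time(36_000, interval, 10, 30, 7200, s) for s in range(200)) / 200
    print(f"checkpoint every {interval:4d} s -> mean run time {avg / 3600:.2f} h")
```

With these invented numbers the sweep reproduces the familiar U-shape: too-frequent checkpoints pay overhead, too-rare ones pay rework, with the best interval near Young's classic sqrt(2 x checkpoint cost x MTBF) estimate (about 380 s here).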


Performance comparison of square root raised-cosine and Lerner filters for the MDFT-TMUX filter bank

EUROPEAN TRANSACTIONS ON TELECOMMUNICATIONS, Issue 6 2002
A. Sudana Madhu Rao
This paper deals with prototype filters, namely, square root raised-cosine (SRC) and Lerner filters, in the modified discrete Fourier transform (MDFT) transmultiplexer (TMUX) filter bank. The error obtained in the reconstruction of the original signal is compared for various filter orders. These filters can be employed in multicarrier modulation systems such as overlapped discrete multitone (overlapped DMT) or discrete wavelet multitone (DWMT). However, the SRC filter requires a large number of computations. By employing the Lerner filter, we can reduce this by at least three-fold. We have carried out simulation studies for an 8-channel MDFT-TMUX filter bank and the results obtained are presented. [source]
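
A square-root raised-cosine prototype can be obtained by taking the square root of the raised-cosine spectrum and transforming back to the time domain. The numpy sketch below illustrates the SRC prototype only, with assumed oversampling and roll-off; it is not the paper's 8-channel MDFT-TMUX. It also checks the property that makes SRC attractive for reconstruction: the cascade of the filter with its matched copy is a Nyquist pulse, nearly zero at non-zero symbol instants.

```python
import numpy as np

def srrc_taps(n_fft=4096, sps=8, beta=0.25):
    """Square-root raised-cosine taps via the sqrt of the RC spectrum."""
    f = np.fft.fftfreq(n_fft) * sps                 # frequency in symbol-rate units
    H = np.zeros(n_fft)
    flat = np.abs(f) <= (1 - beta) / 2
    roll = (np.abs(f) > (1 - beta) / 2) & (np.abs(f) <= (1 + beta) / 2)
    H[flat] = 1.0
    H[roll] = 0.5 * (1 + np.cos(np.pi / beta * (np.abs(f[roll]) - (1 - beta) / 2)))
    h = np.fft.fftshift(np.fft.ifft(np.sqrt(H)).real)
    return h / np.sqrt(np.sum(h ** 2))

h = srrc_taps()
g = np.convolve(h, h)                # transmit SRC cascaded with matched SRC
mid = np.argmax(g)
isi = g[mid + 8::8]                  # samples at non-zero symbol instants (sps = 8)
print("peak:", round(g[mid], 4), " max |ISI|:", round(float(np.abs(isi).max()), 6))
```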


Performance comparison of some dynamical and empirical downscaling methods for South Africa from a seasonal climate modelling perspective

INTERNATIONAL JOURNAL OF CLIMATOLOGY, Issue 11 2009
Willem A. Landman
Abstract The ability of advanced state-of-the-art methods of downscaling large-scale climate predictions to regional and local scale as seasonal rainfall forecasting tools for South Africa is assessed. Various downscaling techniques and raw general circulation model (GCM) output are compared to one another over 10 December-January-February (DJF) seasons from 1991/1992 to 2000/2001, and also to a baseline prediction technique that uses only global sea-surface temperature (SST) anomalies as predictors. The downscaling techniques described in this study include both an empirical technique called model output statistics (MOS) and a dynamical technique in which a finer-resolution regional climate model (RCM) is nested into the large-scale fields of a coarser GCM. The study addresses the performance of a number of simulation systems (no forecast lead-time) of varying complexity. These systems' performance is tested both for homogeneous regions and for 963 stations over South Africa, and compared over the 10-year test period. For the most part, the simulation methods outscore the baseline method that uses SST anomalies to simulate rainfall, providing evidence that current approaches in seasonal forecasting outperform earlier ones. Current operational forecasting approaches involve the use of GCMs, which are considered to be the main tool through which seasonal forecasting efforts will improve in the future. Advantages in statistically post-processing output from GCMs as well as output from RCMs are demonstrated. Evidence is provided that skill should further improve with an increased number of ensemble members. The demonstrated importance of statistical models in operational capacities is a major contribution to the science of seasonal forecasting. Although RCMs are preferable due to physical consistency, statistical models still provide similar or even better skill and should still be applied. Copyright © 2008 Royal Meteorological Society [source]
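
At its core, the MOS technique referred to above is a regression from archived GCM output to observed local rainfall, applied to new GCM output at forecast time. A minimal sketch under invented data shapes (10 DJF seasons, 50 grid-point predictors, one station; the paper's actual predictor sets and statistical recipe are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))                 # archived GCM predictors, 10 seasons
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) \
    + rng.normal(scale=0.3, size=10)          # observed station rainfall anomalies

# MOS as regularized least squares: with few seasons and many predictors,
# some form of shrinkage (ridge here) stands in for the paper's method.
lam = 1.0
beta = np.linalg.solve(X.T @ X + lam * np.eye(50), X.T @ y)

x_new = rng.normal(size=50)                   # a new season's GCM output
print("downscaled rainfall anomaly:", x_new @ beta)
```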


Performance comparison between fixed length switching and variable length switching

INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, Issue 5 2008
Chengchen Hu
Abstract Fixed length switching (FLS) and variable length switching (VLS) are the two main types of switching architecture in high-speed input-queued switches. FLS is based on a cell-by-cell scheduling algorithm, while VLS operates at variable packet granularity. This paper aims to make a comprehensive comparison between these two switching modes to guide industrial design and academic research. We use stochastic models, Petri net models, analysis and simulations to investigate various performance measures of interest. Average packet latency, bandwidth utilization, segmentation and reassembly overhead, as well as packet loss are identified as the key parameters that influence the outcome of the comparison. The results achieved in this paper are twofold. On the one hand, it is shown that FLS provides smaller packet loss and lower packet delay for short packets. On the other hand, VLS offers better bandwidth utilization, reduced implementation complexity and lower average packet delay. We recommend VLS in the conclusion, since its disadvantages can be compensated for by several methods, while the problems with FLS are difficult to solve. Copyright © 2007 John Wiley & Sons, Ltd. [source]
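
The segmentation-and-reassembly overhead that penalizes FLS is easy to quantify: a variable-length packet is carved into fixed cells, each carrying its own header, and the last cell is padded. A small sketch under assumed cell sizes (the paper's exact cell format is not given in the abstract):

```python
import math

CELL_PAYLOAD = 64        # assumed payload bytes per fixed-length cell
CELL_HEADER = 8          # assumed per-cell header bytes

def fls_expansion(pkt_bytes):
    cells = math.ceil(pkt_bytes / CELL_PAYLOAD)
    on_wire = cells * (CELL_PAYLOAD + CELL_HEADER)
    return cells, on_wire / pkt_bytes          # bandwidth expansion factor

for pkt in (40, 64, 65, 576, 1500):
    cells, factor = fls_expansion(pkt)
    print(f"{pkt:5d}-byte packet -> {cells:3d} cells, expansion x{factor:.2f}")
```

A 65-byte packet already needs two cells and more than doubles its wire footprint, the kind of pathological case behind VLS's better bandwidth utilization, while FLS keeps the fixed-size slots that make cell-by-cell scheduling simple and delay predictable for short packets.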


Performance comparison of slow-light coupled-resonator optical gyroscopes

LASER & PHOTONICS REVIEWS, Issue 5 2009
M. Terrel
Abstract We investigate the connection between group velocity and rotation sensitivity in a number of resonant gyroscope designs. Two key comparisons are made. First, we compare two conventional sensors, namely a resonant fiber optic gyroscope (RFOG) and an interferometric fiber optic gyroscope (FOG). Second, we compare the RFOG to several recently proposed coupled-resonator optical waveguide (CROW) gyroscopes. We show that the relationship between loss and maximum rotation sensitivity is the same for both conventional and CROW gyroscopes. Thus, coupling multiple resonators together cannot enhance rotation sensitivity. While CROW gyroscopes offer the potential for large group indices, this increase of group index does not provide a corresponding increase in the maximum sensitivity to rotation. For a given footprint and a given total loss, the highest sensitivity is shown to be achieved either in a conventional RFOG utilizing a single resonator, or a conventional FOG. [source]


Performance comparison of thermal insulated packaging boxes, bags and refrigerants for single-parcel shipments

PACKAGING TECHNOLOGY AND SCIENCE, Issue 1 2008
S. P. Singh
Abstract A range of packaging solutions exists for products that must be kept within a specific temperature range throughout the supply-and-distribution chain. This report summarizes the results of studies conducted over a span of 2 years by the Consortium for Distribution Packaging at Michigan State University. Thermal insulation packaging materials such as expanded polystyrene, polyurethane, corrugated fibreboard, ThermalCor® and other composite packaging such as thermal insulating bags were studied. Phase change materials such as gel packs were also evaluated. Properties such as R-value, melting point and heat absorption were examined and are reported. Copyright © 2007 John Wiley & Sons, Ltd. [source]
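
The reported properties combine into the standard first-cut estimate used to size such packaging: steady heat ingress Q = A·ΔT/R through the insulated walls, and the hold time of a phase change refrigerant is its latent heat divided by that ingress. A toy calculation with assumed numbers (these are not the paper's measured values):

```python
# Assumed shipper: 30 x 30 x 30 cm box, ~33 mm EPS walls (k ~ 0.033 W/m*K).
area_m2 = 6 * 0.3 * 0.3            # total wall area
r_si = 1.0                         # R-value in SI units (m^2*K/W): 0.033 m / 0.033 W/mK
dT = 30.0                          # 35 C ambient, 5 C target inside
q_watts = area_m2 * dT / r_si      # steady-state heat ingress

gel_kg = 1.5                       # water-based gel pack mass
latent_j = gel_kg * 334_000        # latent heat of fusion, ~334 kJ/kg
print(f"ingress {q_watts:.1f} W -> hold time ~ {latent_j / q_watts / 3600:.1f} h")
```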


Parallel tiled QR factorization for multicore architectures

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 13 2008
Alfredo Buttari
Abstract As multicore systems continue to gain ground in the high-performance computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these new processors. Fine-grain parallelism becomes a major requirement and introduces the necessity of loose synchronization in the parallel execution of an operation. This paper presents an algorithm for the QR factorization where the operations can be represented as a sequence of small tasks that operate on square blocks of data (referred to as 'tiles'). These tasks can be dynamically scheduled for execution based on the dependencies among them and on the availability of computational resources. This may result in an out-of-order execution of the tasks that will completely hide the presence of intrinsically sequential tasks in the factorization. Performance comparisons are presented with the LAPACK algorithm for QR factorization where parallelism can be exploited only at the level of the BLAS operations and with vendor implementations. Copyright © 2008 John Wiley & Sons, Ltd. [source]
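
The tile-level data flow can be mimicked with dense kernels: factor the diagonal tile, update the tiles to its right, then eliminate each tile below the diagonal by QR-factoring a stacked tile pair and applying the same transform across the trailing tiles. The serial numpy sketch below shows only this dependency structure (the paper's implementation uses blocked Householder kernels plus dynamic scheduling; here we assume a square matrix whose size is a multiple of the tile size):

```python
import numpy as np

def tiled_qr_R(A, b):
    """R factor of A computed tile by tile (tile size b, b divides A.shape[0])."""
    A = A.copy()
    nt = A.shape[0] // b
    T = lambda k: slice(k * b, (k + 1) * b)
    for k in range(nt):
        Q, A[T(k), T(k)] = np.linalg.qr(A[T(k), T(k)])        # factor diagonal tile
        for j in range(k + 1, nt):
            A[T(k), T(j)] = Q.T @ A[T(k), T(j)]               # update its row of tiles
        for i in range(k + 1, nt):                            # eliminate tiles below
            stack = np.vstack([A[T(k), T(k)], A[T(i), T(k)]])
            Qs, Rs = np.linalg.qr(stack, mode="complete")
            A[T(k), T(k)], A[T(i), T(k)] = Rs[:b], 0.0
            for j in range(k + 1, nt):                        # trailing updates
                both = Qs.T @ np.vstack([A[T(k), T(j)], A[T(i), T(j)]])
                A[T(k), T(j)], A[T(i), T(j)] = both[:b], both[b:]
    return np.triu(A)

A = np.random.default_rng(1).normal(size=(8, 8))
R = tiled_qr_R(A, 2)
print(np.allclose(np.abs(R), np.abs(np.linalg.qr(A)[1])))    # True (R unique up to signs)
```

Each innermost update reads and writes only two tiles, which is exactly what lets a runtime scheduler execute independent tasks out of order and hide the sequential panel factorizations.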


Cushioning the pressure vibration of a zeolite concentrator system using a decoupled balancing duct system

ENVIRONMENTAL PROGRESS & SUSTAINABLE ENERGY, Issue 2 2007
Feng-Tang Chang
Abstract A honeycomb Zeolite Rotor Concentrator (HZRC) is the main air pollution control device utilized by many semiconductor and optoelectronics manufacturers. Various plant exhaust streams are collected and then transferred to the HZRC for decontamination. In a conventional HZRC, the exhaust fan movement and the switching between different air ducts can cause significant duct pressure variations, resulting in production interruptions. Minimizing pressure fluctuations to ensure continuous operation of production lines, while maintaining a high volatile organic compound (VOC) removal efficiency, is essential for exhaust treatment in these high-technology plants. The article introduces a decoupled balancing duct system (DBDS) that controls the airflows to achieve a balanced pressure in the HZRC system by adding a flow-rate control device to the VOC-loaded stream bypass duct of a conventional system. Performance comparisons of the HZRC with the DBDS and with other airflow control systems used by wafer manufacturers in Hsinchu Science Park, Taiwan are presented. The DBDS proved effective in stabilizing the pressure in the airflow ducts, thus avoiding pressure fluctuations; it helped achieve a high VOC removal efficiency while ensuring the stability of the HZRC. © 2007 American Institute of Chemical Engineers Environ Prog, 2007 [source]


Accelerating adaptive trade-off model using shrinking space technique for constrained evolutionary optimization

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, Issue 11 2009
Yong Wang
Abstract The adaptive trade-off model (ATM) is a recently proposed constraint-handling mechanism. The main advantages of this model are its simplicity and adaptability. Moreover, it can be easily embedded into evolutionary algorithms for solving constrained optimization problems. This paper proposes a novel method for constrained optimization, which aims at accelerating the ATM using a shrinking space technique. Eighteen benchmark test functions and five engineering design problems are used to test the performance of the proposed method. Experimental results suggest that combining the ATM with the shrinking space technique is very beneficial. The proposed method converges promptly to competitive results without sacrificing the quality or precision of the final results. Performance comparisons with some other state-of-the-art approaches from the literature are also presented. Copyright © 2008 John Wiley & Sons, Ltd. [source]


An analytical approach to the performance evaluation of the balanced gamma switch under multicast traffic

INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, Issue 4 2007
Cheng Li
Abstract This paper presents the performance evaluation of a new cell-based multicast switch for broadband communications. Using distributed control and a modular design, the balanced gamma (BG) switch features high performance for unicast, multicast and combined traffic under both random and bursty conditions. Although it has buffers on input and output ports, the multicast BG switch follows a predominantly output-buffered architecture. The performance is evaluated under uniform and non-uniform traffic conditions in terms of cell loss ratio and cell delay. An analytical model is presented to analyse the performance of the multicast BG switch under multicast random traffic and is used to verify simulation results. The delay performance under multicast bursty traffic is compared with that of an ideal pure output-buffered multicast switch to demonstrate how close its performance is to that of the ideal but impractical switch. Performance comparisons with other published switches are also studied through simulation for non-uniform and bursty traffic. It is shown that the multicast BG switch achieves a performance close to that of the ideal switch while keeping hardware complexity reasonable. Copyright © 2006 John Wiley & Sons, Ltd. [source]


Performance comparisons for adaptive LEO satellite links

INTERNATIONAL JOURNAL OF SATELLITE COMMUNICATIONS AND NETWORKING, Issue 3 2006
William G. Cowley
Abstract This paper considers the potential to achieve improved throughput in time-varying satellite links which have flexibility in information bit rate and/or transmit power. We assume that other parameters of the link budget such as antenna gains and operating frequency are fixed. Simple results are derived, which illustrate what improvements in data throughput or power consumption are possible under two low-earth orbit scenarios: inter-satellite links and satellite to ground communications. Copyright © 2006 John Wiley & Sons, Ltd. [source]
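
The arithmetic behind such adaptation is plain link-budget algebra: with antenna gains and frequency fixed, the received carrier-to-noise density varies with slant range over a pass, and a rate-flexible link converts every decibel of surplus margin into throughput. A hedged sketch with invented numbers (these are not the paper's scenarios):

```python
import math

def cn0_dbhz(eirp_dbw, gt_dbk, range_km, freq_ghz):
    """Carrier-to-noise density from a standard free-space link budget."""
    fspl_db = 92.45 + 20 * math.log10(range_km) + 20 * math.log10(freq_ghz)
    return eirp_dbw - fspl_db + gt_dbk + 228.6    # 228.6 = -10*log10(k_Boltzmann)

def adaptive_rate_bps(cn0, ebn0_req_db=4.0, margin_db=3.0):
    return 10 ** ((cn0 - ebn0_req_db - margin_db) / 10)

for slant_km in (600, 1200, 2500):                # slant ranges over a LEO pass
    cn0 = cn0_dbhz(eirp_dbw=10, gt_dbk=-15, range_km=slant_km, freq_ghz=2.2)
    rate = adaptive_rate_bps(cn0)
    print(f"{slant_km:5d} km: C/N0 = {cn0:5.1f} dBHz -> rate ~ {rate / 1e3:7.1f} kbit/s")
```

A fixed-rate link must be sized for the worst (longest-range) geometry, so the roughly 12 dB spread between 600 and 2500 km is throughput left on the table, which is the improvement potential the paper quantifies.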


Performance comparisons and attachment: An investigation of competitive responses in close relationships

PERSONAL RELATIONSHIPS, Issue 3 2005
ANTHONY SCINTA
Two studies investigated whether affective responses to competitive performance situations are moderated by attachment style. In Study 1, participants (n = 115) imagined their reactions to a superior or inferior performance against their romantic partner or an acquaintance. Results showed that participants low in attachment avoidance, relative to those high in avoidance, indicated more positivity toward their partners after an inferior performance (empathy effect), and this finding held only in domains of high importance to the partner. In Study 2, participants (n = 53) imagined comparisons with their partner or a close friend. Low-avoidance participants, relative to high-avoidance participants, exhibited sympathy and empathy effects in comparisons involving their romantic partner but not those involving a friend. The findings are discussed in terms of one's model of other and perceived self-other separation, which are defined by avoidance but not anxiety. [source]


Evaluating high-performance computers

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 10 2005
Jeffrey S. Vetter
Abstract Comparisons of high-performance computers based on their peak floating point performance are common but seldom useful when comparing performance on real workloads. Factors that influence sustained performance extend beyond a system's floating-point units, and real applications exercise machines in complex and diverse ways. Even when it is possible to compare systems based on their performance, other considerations affect which machine is best for a given organization. These include the cost, the facilities requirements (power, floorspace, etc.), the programming model, the existing code base, and so on. This paper describes some of the important measures for evaluating high-performance computers. We present data for many of these metrics based on our experience at Lawrence Livermore National Laboratory (LLNL), and we compare them with published information on the Earth Simulator. We argue that evaluating systems involves far more than comparing benchmarks and acquisition costs. We show that evaluating systems often involves complex choices among a variety of factors that influence the value of a supercomputer to an organization, and that the high-end computing community should view cost/performance comparisons of different architectures with skepticism. Published in 2005 by John Wiley & Sons, Ltd. [source]


APEX-Map: a parameterized scalable memory access probe for high-performance computing systems

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 17 2007
Erich Strohmaier
Abstract The memory wall between the peak performance of microprocessors and their memory performance has become the prominent performance bottleneck for many scientific application codes. New benchmarks that measure data access speeds locally and globally in a variety of different ways are needed to explore the ever-increasing diversity of architectures for high-performance computing. In this paper, we introduce a novel benchmark, APEX-Map, which focuses on global data movement and measures how fast global data can be fed into computational units. APEX-Map is a parameterized, synthetic performance probe and integrates concepts for temporal and spatial locality into its design. Our first parallel implementation in MPI and various results obtained with it are discussed in detail. By measuring APEX-Map performance with parameter sweeps over a whole range of temporal and spatial localities, performance surfaces can be generated. These surfaces are ideally suited to studying the characteristics of computational platforms and are useful for performance comparisons. Results on a global-memory vector platform and distributed-memory superscalar platforms clearly reflect the design differences between these architectures. Published in 2007 by John Wiley & Sons, Ltd. [source]
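
The probe's two knobs can be imitated in a few lines: draw block start addresses from a skewed (power-law) distribution so that a tunable share of accesses revisits a hot region (temporal locality, α), and read L contiguous elements per access (spatial locality). The toy below mimics only those knobs on one node; the real APEX-Map is a tuned C/MPI benchmark and its α/L parameterization differs in detail:

```python
import time
import numpy as np

def probe(mem_words=1 << 22, alpha=0.25, L=16, accesses=50_000, seed=0):
    rng = np.random.default_rng(seed)
    data = np.ones(mem_words)
    # U**(1/alpha): small alpha piles start addresses near 0 (high re-use);
    # alpha = 1 is uniform over memory (no temporal locality).
    starts = ((rng.random(accesses) ** (1.0 / alpha)) * (mem_words - L)).astype(np.int64)
    t0 = time.perf_counter()
    s = 0.0
    for st in starts:
        s += data[st:st + L].sum()
    dt = time.perf_counter() - t0
    return accesses * L / dt / 1e6          # effective Mwords/s

for alpha in (0.05, 0.5, 1.0):
    for L in (1, 16, 256):
        print(f"alpha={alpha:4.2f}  L={L:4d}: {probe(alpha=alpha, L=L):8.1f} Mword/s")
```

Sweeping both parameters yields exactly the kind of performance surface the paper describes, with the cache-friendly corner (small α, large L) far outrunning the random-access corner.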


A performance comparison between the Earth Simulator and other terascale systems on a characteristic ASCI workload

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 10 2005
Darren J. Kerbyson
Abstract This work gives a detailed analysis of the relative performance of the recently installed Earth Simulator and the next top four systems in the Top500 list using predictive performance models. The Earth Simulator uses vector processing nodes interconnected using a single-stage, cross-bar network, whereas the next top four systems are built using commodity based superscalar microprocessors and interconnection networks. The performance that can be achieved results from an interplay of system characteristics, application requirements and scalability behavior. Detailed performance models are used here to predict the performance of two codes representative of the ASCI workload, namely SAGE and Sweep3D. The performance models encapsulate fully the behavior of these codes and have been previously validated on many large-scale systems. One result of this analysis is to size systems, built from the same nodes and networks as those in the top five, that will have the same performance as the Earth Simulator. In particular, the largest ASCI machine, ASCI Q, is expected to achieve a similar performance to the Earth Simulator on the representative workload. Published in 2005 by John Wiley & Sons, Ltd. [source]


A performance comparison of individual and combined treatment modules for water recycling

ENVIRONMENTAL PROGRESS & SUSTAINABLE ENERGY, Issue 4 2005
Stuart Khan
Abstract An Advanced Water Recycling Demonstration Plant (AWRDP) was commissioned and constructed by the Queensland State Government in Australia. The AWRDP was used to study the effectiveness of a variety of treatment processes in the upgrading of municipal wastewater for water recycling applications. The AWRDP consists of eight modules, each housing an individual specific treatment process. These processes are flocculation, dissolved air flotation, dual media filtration, ozonation, biological activated carbon adsorption, microfiltration, reverse osmosis, and ultraviolet disinfection. The individual performances of the treatment processes were determined, as well as their interdependence in series. A range of chemical water quality parameters was investigated. The study provides a broad process comparison on the basis of an important catalogue of these key parameters. This will be valuable in the selection and optimization of treatment process trains in full-scale water recycling applications. © 2005 American Institute of Chemical Engineers Environ Prog, 2005 [source]


Analytical modelling of users' behaviour and performance metrics in key distribution schemes

EUROPEAN TRANSACTIONS ON TELECOMMUNICATIONS, Issue 1 2010
Massimo Tornatore
Access control for group communications must ensure that only legitimate users can access the authorised data streams. This can be done by distributing an encrypting key to each member of the group to be secured. To achieve a high level of security, the group key should be changed every time a user joins or leaves the group, so that a former group member has no access to current communications and a new member has no access to previous communications. Since group memberships can be very dynamic, the group key should be changed frequently. So far, different schemes for efficient key distribution have been proposed to limit the key-distribution overhead. In previous works, the performance comparison among these different schemes has been based on simulative experiments, in which users join and leave secure groups according to a basic statistical model of users' behaviour. In this paper, we propose a new statistical model to account for the behaviour of users and compare it to the modelling approach adopted so far in the literature. Our new model is able to lead the system to a steady state (allowing greater statistical confidence in the results), as opposed to current models, in which the system is permanently in a transient and diverging state. We also provide analytical formulations of the main performance metrics usually adopted to evaluate key distribution systems, such as rekey overheads and storage overheads. We then validate our simulative outcomes against results obtained from the analytical formulations. Copyright © 2009 John Wiley & Sons, Ltd. [source]
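
The steady-state point can be reproduced in a few lines: if users join as a Poisson stream and each stays for an exponentially distributed membership time, group size behaves as an M/M/∞ queue and settles around λ/μ, so long-run rekey rates are well defined. A hedged sketch of that behaviour (an illustrative model in the same spirit, not the paper's exact formulation):

```python
import random

def simulate(lam=5.0, mu=0.1, horizon=500.0, seed=1):
    """Group size over time: Poisson joins (rate lam), exponential stays (rate mu)."""
    rng = random.Random(seed)
    t, n, samples = 0.0, 0, []
    while t < horizon:
        total_rate = lam + n * mu                  # competing join/leave clocks
        t += rng.expovariate(total_rate)
        if rng.random() < lam / total_rate:
            n += 1                                 # join -> one rekey event
        else:
            n -= 1                                 # leave -> one rekey event
        samples.append((t, n))
    return samples

tail = [n for t, n in simulate() if t > 250]       # drop the warm-up transient
print(f"steady-state mean group size ~ {sum(tail) / len(tail):.1f} (theory: lam/mu = 50)")
```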


Assessment of fire protection performance of water mist applied in exhaust ducts for semiconductor fabrication process

FIRE AND MATERIALS, Issue 5 2005
Yi-Liang Shu
Abstract Fume exhaust pipes used in semiconductor facilities underwent a series of fire tests to evaluate the performance of a water mist system. The parameters considered were the amount of water used by the mist nozzles, the air flow velocity, the fire intensity and the operating pressure of the water mist system. To enable a performance comparison, tests were also performed with a standard sprinkler system. The base case served as a reference and applied a single water mist nozzle (100 bar operating pressure, 7.3 l/min water volume flux and 200 µm mean droplet size) installed in a pipe (60 cm in diameter) subjected to a 350°C air flow with an average velocity of 2 m/s. In this case, the temperature in the hot flow dropped sharply once the water mist nozzle was activated and reached a 60°C saturation point. Under the same operating conditions, four mist nozzles were applied but made no further contribution to reducing the fire temperature compared with a single nozzle. Fire protection performance similar to the base case was retained both when the exhaust flow velocity was increased to 3 m/s and when the inlet air temperature was raised to 500°C by a stronger input fire. The water mist system produced a better performance than the standard sprinkler. With regard to operating pressure, a higher pressure of the water mist system gave a better performance. These results indicate that droplet size plays a critical role in water-based fire protection systems. Copyright © 2005 John Wiley & Sons, Ltd. [source]


A flammability performance comparison between synthetic and natural clays in polystyrene nanocomposites

FIRE AND MATERIALS, Issue 4 2005
Alexander B. Morgan
Abstract Polymer-clay nanocomposites are a newer class of flame retardant materials of interest due to their balance of mechanical, thermal and flammability properties. Much more work has been done with natural clays than with synthetic clays for nanocomposite flammability applications. There are advantages and disadvantages to both natural and synthetic clay use in a nanocomposite, and some of these, both fundamental and practical, are discussed in this paper. To compare natural and synthetic clays with regard to polymer flammability, two clays were used. The natural clay was a US mined and refined montmorillonite, while the synthetic clay was a fluorinated synthetic mica. These two clays were used as inorganic clays for control experiments in polystyrene, and then converted into organoclays by ion exchange with an alkyl ammonium salt. The organoclays were used to synthesize polystyrene nanocomposites by melt compounding. Each of the formulations was analysed by X-ray diffraction (XRD), thermogravimetric analysis (TGA) and transmission electron microscopy (TEM). Flammability performance was measured by cone calorimeter. The data from the experiments show that the synthetic clay does slightly better at reducing the heat release rate (HRR) than the natural clay. However, all the samples, including the inorganic clay polystyrene microcomposites, showed a decreased time to ignition, with the actual nanocomposites showing the most marked decrease. The reason for this is postulated to be related to the thermal instability of the organoclay (via the quaternary alkyl ammonium). An additional experiment using a more thermally stable organoclay showed a time to ignition identical to that of the base polymer. Finally, it was shown that while polymer-clay nanocomposites (either synthetic or natural clay based) greatly reduce the HRR of a material, making it more fire safe, they do not provide ignition resistance by themselves, at least at practical loadings. Specifically, the cone calorimeter HRR curve data appear to support that these nanocomposites continue to burn once ignited, rather than self-extinguish. Copyright © 2004 John Wiley & Sons, Ltd. [source]


The performance analysis of a two-stage transcritical CO2 cooling cycle with various gas cooler pressures

INTERNATIONAL JOURNAL OF ENERGY RESEARCH, Issue 14 2008
Arif Emre Özgür
Abstract A theoretical analysis of a two-stage transcritical CO2 cooling cycle is presented. The effect of a two-stage cycle with intercooling on the system's coefficient of cooling performance is presented for various gas cooler pressures. In addition, a performance comparison between one-stage and two-stage cycles is presented for the same operating conditions. Gas cooler pressure, compressor isentropic efficiency, gas cooler efficiency, intercooling quantity and refrigerant outlet temperature from the gas cooler are used as variable parameters in the analysis. It is concluded that the performance of the two-stage transcritical CO2 cycle is approximately 30% higher than that of the one-stage transcritical CO2 cycle. Hence, two-stage compression with intercooling can be regarded as a valuable measure for improving transcritical CO2 cycle performance. Copyright © 2008 John Wiley & Sons, Ltd. [source]
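
Readers who want to reproduce the flavour of such an analysis can compute a one-stage transcritical COP directly from fluid properties. The sketch below uses the CoolProp library as a stand-in property source (an assumption; the paper's tool and state points are not specified in the abstract): saturated vapour at the evaporator, compression with an assumed isentropic efficiency to the gas cooler pressure, isobaric cooling, isenthalpic expansion.

```python
from CoolProp.CoolProp import PropsSI  # assumes CoolProp is installed

def cop_one_stage(t_evap=273.15, p_gc=10e6, t_gc_out=313.15, eta_is=0.75):
    h1 = PropsSI('H', 'T', t_evap, 'Q', 1, 'CO2')       # saturated vapour, evaporator
    s1 = PropsSI('S', 'T', t_evap, 'Q', 1, 'CO2')
    h2s = PropsSI('H', 'P', p_gc, 'S', s1, 'CO2')       # isentropic discharge state
    h2 = h1 + (h2s - h1) / eta_is                       # real compressor work
    h3 = PropsSI('H', 'P', p_gc, 'T', t_gc_out, 'CO2')  # gas cooler outlet
    h4 = h3                                             # isenthalpic expansion valve
    return (h1 - h4) / (h2 - h1)                        # cooling COP

for p_gc in (8.5e6, 10e6, 12e6):
    print(f"gas cooler at {p_gc / 1e6:4.1f} MPa: COP = {cop_one_stage(p_gc=p_gc):.2f}")
```

Sweeping the gas cooler pressure this way exposes the COP optimum transcritical CO2 cycles are known for; the paper's two-stage variant additionally splits the compression and rejects heat between the stages.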


Achieving a near-optimum erasure correction performance with low-complexity LDPC codes

INTERNATIONAL JOURNAL OF SATELLITE COMMUNICATIONS AND NETWORKING, Issue 5-6 2010
Gianluigi Liva
Abstract Low-density parity-check (LDPC) codes are shown to tightly approach the performance of idealized maximum distance separable (MDS) codes over memoryless erasure channels, under maximum likelihood (ML) decoding. This is possible down to low error rates and even for small and moderate block sizes. The decoding complexity of ML decoding is kept low thanks to a class of decoding algorithms, which exploit the sparseness of the parity-check matrix to reduce the complexity of Gaussian elimination. ML decoding of LDPC codes is reviewed at first. A performance comparison among various classes of LDPC codes is then carried out, including a comparison with fixed-rate Raptor codes for the same parameters. The results confirm that a judicious LDPC code design allows achieving a near-optimum performance over the erasure channel, with very low error floors. Furthermore, it is shown that LDPC and Raptor codes, under ML decoding, provide almost identical performance in terms of decoding failure probability vs. overhead. Copyright © 2010 John Wiley & Sons, Ltd. [source]
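
Over an erasure channel, ML decoding reduces to linear algebra over GF(2): the parity checks restricted to the erased positions form a linear system whose unique solution, when the system has full rank, recovers the codeword. A minimal dense sketch on a toy (7,4) Hamming code (practical decoders instead exploit the sparseness of H to cheapen the elimination, as the abstract notes):

```python
import numpy as np

def ml_erasure_decode(H, y, erased):
    """Solve the parity checks over GF(2) for the erased bit positions."""
    known = [i for i in range(H.shape[1]) if i not in set(erased)]
    A = H[:, erased] % 2                        # coefficients of the unknowns
    b = (H[:, known] @ y[known]) % 2            # syndrome contributed by known bits
    row, m = 0, len(erased)
    for col in range(m):                        # Gauss-Jordan elimination in GF(2)
        piv = next((r for r in range(row, len(b)) if A[r, col]), None)
        if piv is None:
            return None                         # rank deficiency: ML decoding fails
        A[[row, piv]], b[[row, piv]] = A[[piv, row]], b[[piv, row]]
        for r in range(len(b)):
            if r != row and A[r, col]:
                A[r] ^= A[row]
                b[r] ^= b[row]
        row += 1
    x = y.copy()
    x[erased] = b[:m]                           # recovered values, in erased order
    return x

H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]], dtype=np.uint8)
y = np.array([1, 0, 1, 1, 0, 1, 0], dtype=np.uint8)   # a valid codeword of H
# The decoder never reads y at the erased positions; it solves for them.
print(ml_erasure_decode(H, y, erased=[0, 4]))          # -> [1 0 1 1 0 1 0]
```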


On the effectiveness of runtime techniques to reduce memory sharing overheads in distributed Java implementations

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 13 2008
Marcelo Lobosco
Abstract Distributed Java virtual machine (dJVM) systems enable concurrent Java applications to run transparently on clusters of commodity computers. This is achieved by supporting Java's shared-memory model over multiple JVMs distributed across the cluster's computer nodes. In this work, we describe and evaluate selective dynamic diffing and lazy home allocation, two new runtime techniques that enable dJVMs to efficiently support memory sharing across the cluster. Specifically, the two proposed techniques, used either in isolation or in combination, can reduce the overheads due to message traffic, extra memory space, and the high latency of remote memory accesses that such dJVM systems incur in implementing their memory-coherence protocol. In order to evaluate the performance benefits of dynamic diffing and lazy home allocation, we implemented both techniques in Cooperative JVM (CoJVM), a basic dJVM system we developed in previous work. We then carried out performance comparisons between the basic CoJVM and versions modified with our proposed techniques for five representative concurrent Java applications (matrix multiply, LU, Radix, fast Fourier transform, and SOR). Our experimental results show that dynamic diffing and lazy home allocation significantly reduced memory sharing overheads. The reduction resulted in considerable gains in the CoJVM system's performance, ranging from 9% up to 20% in four out of the five applications, with resulting speedups varying from 6.5 up to 8.1 on an 8-node cluster of computers. Copyright © 2007 John Wiley & Sons, Ltd. [source]
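
The diffing idea is borrowed from software DSM systems: keep a pristine "twin" of each shared object and, at synchronization time, ship only the byte runs that actually changed rather than the whole object. A toy run-length diff sketch (hypothetical wire format; CoJVM's actual protocol is not reproduced here):

```python
def make_diff(twin: bytes, current: bytes):
    """Encode the byte runs where `current` differs from `twin` as (offset, data)."""
    diff, i, n = [], 0, len(current)
    while i < n:
        if twin[i] != current[i]:
            j = i
            while j < n and twin[j] != current[j]:
                j += 1
            diff.append((i, current[i:j]))
            i = j
        else:
            i += 1
    return diff

def apply_diff(twin: bytes, diff):
    out = bytearray(twin)
    for off, data in diff:
        out[off:off + len(data)] = data
    return bytes(out)

twin = bytes(64)                        # pristine copy kept at the writer
cur = bytearray(twin)
cur[5:8] = b"abc"
cur[40] = 7
d = make_diff(twin, bytes(cur))
print(d, apply_diff(twin, d) == bytes(cur))   # two short runs travel, not 64 bytes
```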


Distributed loop-scheduling schemes for heterogeneous computer systems

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 7 2006
Anthony T. Chronopoulos
Abstract Distributed computing systems are a viable and less expensive alternative to parallel computers. However, a serious difficulty in concurrent programming of a distributed system is how to deal with scheduling and load balancing of such a system, which may consist of heterogeneous computers. Some distributed scheduling schemes suitable for parallel loops with independent iterations on heterogeneous computer clusters have been designed in the past. In this work we study self-scheduling schemes for parallel loops with independent iterations, schemes that have previously been applied to multiprocessor systems. We extend one important scheme of this type to a distributed version suitable for heterogeneous distributed systems. We implement our new scheme on a network of computers and make performance comparisons with other existing schemes. Copyright © 2005 John Wiley & Sons, Ltd. [source]
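
Self-scheduling schemes hand out shrinking chunks of the iteration space on demand; heterogeneous variants additionally weight each chunk by relative node speed so fast nodes receive proportionally more work. A hedged sketch of weighted, factoring-style chunk sizing (illustrative only; the paper's exact scheme is not reproduced here):

```python
import math

def weighted_chunks(total_iters, speeds, factor=2.0):
    """Each round splits 1/factor of the remaining iterations among the
    nodes in proportion to their relative speeds; chunks shrink over time."""
    wsum = sum(speeds)
    remaining = total_iters
    while remaining > 0:
        batch = max(len(speeds), math.ceil(remaining / factor))
        for node, s in enumerate(speeds):
            if remaining == 0:
                break
            chunk = min(remaining, max(1, round(batch * s / wsum)))
            yield node, chunk
            remaining -= chunk

for node, chunk in weighted_chunks(1000, speeds=[4.0, 2.0, 1.0]):
    print(f"node {node}: {chunk:4d} iterations")
```

Large early chunks keep scheduling overhead low, the shrinking tail evens out load imbalance, and the speed weights keep slow nodes from becoming stragglers; in a distributed setting the yield would be replaced by chunk requests over the network.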


Sexual conflicts, loss of flight, and fitness gains in locomotion of polymorphic water striders

ENTOMOLOGIA EXPERIMENTALIS ET APPLICATA, Issue 3 2007
Pablo Perez Goodwyn
Abstract In insect wing polymorphism, morphs with fully developed wings, intermediate wings, and no wings are recognized. The morphs are interpreted as a trade-off between flight and flightlessness; the benefits of flight are counterbalanced by the costs of developing and maintaining wings and flight muscles. Such a trade-off has been widely shown for reproductive and developmental parameters, and wing reduction is associated with species of stable habitats. However, in this context, the role of water locomotion performance has not been well explored. We chose seven water striders (Heteroptera: Gerridae) as a model to study this trade-off and its relation to sexual conflicts, namely, Aquarius elongatus (Uhler), Aquarius paludum (Fabr.), Gerris insularis (Motschulsky), Gerris nepalensis Distant, Gerris latiabdominis Miyamoto, Metrocoris histrio (White), and Rhagadotarsus kraepelini Breddin. We estimated locomotion performance as the legs' stroke force, measured on tethered specimens placed on water with a force transducer attached to their backs. By dividing force by body weight, we made performance comparisons. We found a positive relationship between weight and force, and a negative one between weight and the force-to-weight ratio among species. The trade-off between water and flight locomotion was manifested as differences in performance in terms of the force-to-weight ratio. However, the bias toward winged or wing-reduced morphs was species dependent, and presumably related to habitat preference. The water strider species favouring permanent habitats (G. nepalensis) showed higher performance in the apterous morph, but in species favouring temporary habitats (A. paludum and R. kraepelini) the morphs' performance did not differ significantly. Males had higher performance than females in all but three of the species studied (namely, A. elongatus, G. nepalensis, and R. kraepelini); these three have a type II mating strategy with minimized mating struggle. We hypothesized that in a type I mating system, in which males must struggle strongly to subdue the female, males should outperform females to copulate successfully. This was not necessarily true among males of species with type II mating. [source]


DNAPL Characterization Methods and Approaches, Part 2: Cost Comparisons

GROUND WATER MONITORING & REMEDIATION, Issue 1 2002
Mark L. Kram
Contamination from the use of chlorinated solvents, often classified as dense nonaqueous phase liquids (DNAPLs) when in an undissolved state, poses environmental threats to ground water resources worldwide. DNAPL site characterization method performance comparisons are presented in a companion paper (Kram et al. 2001). This study compares the costs of implementing various characterization approaches using synthetic unit model scenarios (UMSs), each with particular physical characteristics. Unit costs and assumptions related to labor, equipment, and consumables are applied to determine the costs associated with each approach for various UMSs. In general, the direct-push sensor systems provide cost-effective characterization information in penetrable soils with relatively shallow (less than 10 to 15 m) water tables. For sites with lithology that is impenetrable using direct-push techniques, the Ribbon NAPL Sampler Flexible Liner Underground Technologies Everting (FLUTe) membrane appears to be the most cost-effective approach. For all scenarios studied, partitioning interwell tracer tests (PITTs) are the most expensive approach due to the extensive pre- and post-PITT requirements. However, the PITT is capable of providing useful additional information, such as approximate DNAPL saturation, which is not generally available from any of the other approaches included in this comparison. [source]


Error performance for relaying protocols with multiple decode-and-forward relays

INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, Issue 8 2010
Wenbo Xu
Abstract This paper investigates the error performance of three relaying protocols with multiple decode-and-forward relays. In the first protocol, all relays that decode correctly forward the signals from the source. In contrast, selection cooperation (SC) and opportunistic relaying (OR) are adopted to select only a single forwarding relay in the other two protocols, respectively. At sufficiently high signal-to-noise ratio, upper bounds on the bit error probability are derived for the three protocols, and the derivations apply to various channel fading models. Simulation results are provided to verify the tightness of the analytical bounds, and performance comparisons among the different relaying protocols are presented. Copyright © 2010 John Wiley & Sons, Ltd. [source]


Scalable and lightweight key distribution for secure group communications

INTERNATIONAL JOURNAL OF NETWORK MANAGEMENT, Issue 3 2004
Fu-Yuan Lee
Securing group communications in dynamic and large-scale groups is more complex than securing one-to-one communications, due to the inherent scalability issues of group key management. In particular, the cost of key establishment and key renewal typically grows with the group size and consequently becomes a performance bottleneck for scalability. To address this problem, this paper proposes a new approach that decouples group size from the computation cost of group key management. By using a hierarchical key distribution architecture and load sharing, the load of key management can be shared by a cluster of third parties without revealing group messages to them. The proposed scheme provides better scalability because the cost of key management in each component is independent of the group size. Specifically, our scheme incurs constant computation and communication overheads for key renewal. In this paper, we present the detailed design of the proposed scheme and performance comparisons with other schemes. Briefly, our scheme provides better scalability than existing group key distribution approaches. Copyright © 2004 John Wiley & Sons, Ltd. [source]


Semi-random LDPC codes for CDMA communication over non-linear band-limited satellite channels

INTERNATIONAL JOURNAL OF SATELLITE COMMUNICATIONS AND NETWORKING, Issue 4 2006
Mohamed Adnan Landolsi
Abstract This paper considers the application of low-density parity check (LDPC) error correcting codes to code division multiple access (CDMA) systems over satellite links. The adapted LDPC codes are selected from a special class of semi-random (SR) constructions characterized by low encoder complexity, and their performance is optimized by removing short cycles from the code bipartite graphs. Relative performance comparisons with turbo product codes (TPC) for rate 1/2 and short-to-moderate block sizes show some advantage for SR-LDPC, both in terms of bit error rate and complexity requirements. CDMA systems using these SR-LDPC codes and operating over non-linear, band-limited satellite links are analysed, and their performance is investigated for a number of signal models and code parameters. The numerical results show that SR-LDPC codes can offer good capacity improvements in terms of the supportable number of users at a given bit error performance. Copyright © 2006 John Wiley & Sons, Ltd. [source]


Enhancing molecular discovery using descriptor-free rearrangement clustering techniques for sparse data sets

AICHE JOURNAL, Issue 2 2010
Peter A. DiMaggio Jr.
Abstract This article presents a descriptor-free method for identifying library compounds with desired properties while synthesizing and assaying only a minimal portion of the library space. The method works by identifying the optimal substituent ordering (i.e., the optimal assignment of encoding integers to the functional groups on each substituent site of the molecular scaffold) based on a global pairwise difference metric intended to capture the smoothness of the compound library. The reordering can be accomplished via (i) a mixed-integer linear programming (MILP) model, (ii) a genetic-algorithm-based approach, or (iii) a heuristic approach. We present performance comparisons between these techniques as well as an independent analysis of the characteristics of the MILP model. Two sparsely sampled data matrices provided by Pfizer are analyzed to validate the proposed approach, and we show that the rearrangement of these matrices leads to regular property landscapes which enable reliable property estimation/interpolation over the full library space. An iterative strategy for compound synthesis is also introduced that utilizes the reordered data to direct the synthesis toward desirable compounds. We demonstrate in a simulated experiment, using held-out subsets of the data, that the proposed iterative technique is effective in identifying compounds with desired physical properties. © 2009 American Institute of Chemical Engineers AIChE J, 2010 [source]
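
The reordering objective, placing similar substituents next to each other so the property landscape varies smoothly, can be approximated greedily: start from an arbitrary row and repeatedly append the nearest unvisited row, comparing only the entries both rows have measured (the matrices are sparse). A toy sketch of that heuristic (the paper's MILP model solves the ordering problem exactly; this greedy stand-in merely illustrates the objective):

```python
import numpy as np

def greedy_order(M):
    """Order rows of a NaN-sparse property matrix so consecutive rows
    differ as little as possible on their shared (measured) entries."""
    def dist(a, b):
        both = ~np.isnan(M[a]) & ~np.isnan(M[b])
        return np.abs(M[a, both] - M[b, both]).mean() if both.any() else np.inf
    order, left = [0], set(range(1, M.shape[0]))
    while left:
        nxt = min(left, key=lambda r: dist(order[-1], r))
        order.append(nxt)
        left.remove(nxt)
    return order

rng = np.random.default_rng(0)
latent = np.linspace(0.0, 1.0, 8)            # hidden smooth substituent property
perm = rng.permutation(8)                    # scrambled ordering, as synthesized
prop = latent[perm][:, None] + rng.normal(scale=0.02, size=(8, 5))
prop[rng.random(prop.shape) < 0.3] = np.nan  # sparsely assayed library
order = greedy_order(prop)
print(perm[order])                           # similar rows now adjacent (near-sorted runs)
```

Once the rows are arranged this way, the regular landscape makes interpolation of the unmeasured (NaN) entries far more reliable, which is the basis of the article's estimation and iterative synthesis strategy.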