Real Data (real + data)
Terms modified by Real Data: Selected Abstracts

Probability plots based on Student's t-distribution
ACTA CRYSTALLOGRAPHICA SECTION A, Issue 4 2009. Rob W. W. Hooft.
The validity of the normal distribution as an error model is commonly tested with a (half) normal probability plot. Real data often contain outliers. The use of t-distributions in a probability plot to model such data more realistically is described. It is shown how a suitable value of the parameter ν of the t-distribution can be determined from the data. The results suggest that even data that seem to be modeled well using a normal distribution can be better modeled using a t-distribution. [source]

An automated pottery archival and reconstruction system
COMPUTER ANIMATION AND VIRTUAL WORLDS (prev. Journal of Visualisation and Computer Animation), Issue 3 2003. Martin Kampel.
Motivated by the current requirements of archaeologists, we are developing an automated archival system for archaeological classification and reconstruction of ceramics. Our system uses the profile of an archaeological fragment, which is the cross-section of the fragment in the direction of the rotational axis of symmetry, to classify and reconstruct it virtually. Ceramic fragments are recorded automatically by a 3D measurement system based on structured (coded) light. The input data for the estimation of the profile is a set of points produced by the acquisition system. By registering the front and the back views of the fragment, the profile is computed, and measurements such as diameter, area percentage of the complete vessel, height and width are derived automatically. We demonstrate the method and give results on synthetic and real data. Copyright © 2003 John Wiley & Sons, Ltd. [source]

Pedestrian Reactive Navigation for Crowd Simulation: a Predictive Approach
COMPUTER GRAPHICS FORUM, Issue 3 2007. Sébastien Paris.
This paper addresses the problem of virtual pedestrian autonomous navigation for crowd simulation. It describes a method for solving interactions between pedestrians and avoiding inter-collisions. Our approach is agent-based and predictive: each agent perceives surrounding agents and extrapolates their trajectory in order to react to potential collisions. We aim at obtaining realistic results, so the proposed model is calibrated from experimental motion capture data. Our method is shown to be valid and solves major drawbacks of previous approaches, such as oscillations due to a lack of anticipation. We first describe the mathematical representation used in our model, then detail its implementation, and finally its calibration and validation from real data. [source]
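As an illustration of the kind of prediction step such agent models rely on, the sketch below extrapolates two agents along straight lines and reports their time and distance of closest approach. It is a generic geometric check, not the paper's calibrated pedestrian model, and the "personal space" radius is an invented value.

```python
import numpy as np

def closest_approach(p_a, v_a, p_b, v_b, horizon=5.0):
    """Closest approach between two agents assumed to move in straight lines
    at constant velocity (positions p_*, velocities v_* as 2-D vectors)."""
    dp = np.asarray(p_b, float) - np.asarray(p_a, float)   # relative position
    dv = np.asarray(v_b, float) - np.asarray(v_a, float)   # relative velocity
    speed2 = dv @ dv
    # time of minimal separation, clamped to the prediction horizon
    t_star = 0.0 if speed2 < 1e-12 else float(np.clip(-(dp @ dv) / speed2, 0.0, horizon))
    d_min = float(np.linalg.norm(dp + t_star * dv))
    return t_star, d_min

# Example: agent B crosses the path of agent A
t_star, d_min = closest_approach([0, 0], [1.2, 0.0], [4, -2], [0.0, 0.75])
if d_min < 0.6:   # illustrative "personal space" radius
    print(f"predicted conflict in {t_star:.2f} s at a distance of {d_min:.2f} m")
```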
Parallel heterogeneous CBIR system for efficient hyperspectral image retrieval using spectral mixture analysis
CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 9 2010. Antonio J. Plaza.
The purpose of content-based image retrieval (CBIR) is to retrieve, from real data stored in a database, information that is relevant to a query. In remote sensing applications, the wealth of spectral information provided by latest-generation (hyperspectral) instruments has quickly introduced the need for parallel CBIR systems able to effectively retrieve features of interest from ever-growing data archives. To address this need, this paper develops a new parallel CBIR system that has been specifically designed to be run on heterogeneous networks of computers (HNOCs). These platforms have quickly become a standard computing architecture in remote sensing missions due to the distributed nature of data repositories. The proposed heterogeneous system first extracts an image feature vector able to characterize image content with sub-pixel precision using spectral mixture analysis concepts, and then uses the obtained feature as a search reference. The system is validated using a complex hyperspectral image database, and implemented on several networks of workstations and a Beowulf cluster at NASA's Goddard Space Flight Center. Our experimental results indicate that the proposed parallel system can efficiently retrieve hyperspectral images from complex image databases by adapting to the underlying parallel platform on which it is run, regardless of the heterogeneity in the compute nodes and communication links that form that platform. Copyright © 2009 John Wiley & Sons, Ltd. [source]

Measuring dispersal and detecting departures from a random walk model in a grasshopper hybrid zone
ECOLOGICAL ENTOMOLOGY, Issue 2 2003. R. I. Bailey.
1. The grasshopper species Chorthippus brunneus and C. jacobsi form a complex mosaic hybrid zone in northern Spain. Two mark-release-recapture studies were carried out near the centre of the zone in order to make direct estimates of lifetime dispersal.
2. A model framework based on a simple random walk in homogeneous habitat was extended to include the estimation of philopatry and flying propensity. Each model was compared with the real data, correcting for spatial and temporal biases in the data sets.
3. All four data sets (males and females at each site) deviated significantly from a random walk. Three of the data sets showed strong philopatry and three had a long dispersal tail, indicating a low propensity to move further than predicted by the random walk model.
4. Neighbourhood size estimates were 76 and 227 for the two sites. These estimates may underestimate effective population size, which could be increased by the long tail of the dispersal function. The random walk model overestimates lifetime dispersal and hence the minimum spatial scale of adaptation.
5. Best estimates of lifetime dispersal distance of 7–33 m per generation were considerably lower than a previous indirect estimate of 1344 m per generation. This discrepancy could be influenced by prezygotic isolation, an inherent by-product of mosaic hybrid zone structure. [source]
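The random-walk null model against which such recapture data are tested can be simulated directly. The sketch below draws lifetime net displacements under a simple unbiased random walk; the number of moves and the step length are illustrative values, not the study's estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

def lifetime_displacement(n_days=30, moves_per_day=4, step_m=3.0, n_sims=10_000):
    """Net lifetime displacement under a simple unbiased random walk:
    a fixed number of fixed-length moves in uniformly random directions."""
    n_steps = n_days * moves_per_day
    theta = rng.uniform(0.0, 2.0 * np.pi, size=(n_sims, n_steps))
    dx = step_m * np.cos(theta)
    dy = step_m * np.sin(theta)
    return np.hypot(dx.sum(axis=1), dy.sum(axis=1))

d = lifetime_displacement()
print(f"median {np.median(d):.1f} m, 95th percentile {np.percentile(d, 95):.1f} m")
# Recapture data with many more very short displacements (philopatry) and a few
# very long ones (flights) than this distribution would indicate a departure
# from the random-walk null model.
```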
Short-term load forecasting using informative vector machine
ELECTRICAL ENGINEERING IN JAPAN, Issue 2 2009. Eitaro Kurata.
In this paper, a novel method is proposed for short-term load forecasting, which is one of the important tasks in power system operation and planning. The load behavior is so complicated that it is hard to predict the load. The deregulated power market is faced with the new problem of an increased degree of uncertainty. Thus, power system operators are concerned with the significance level of load forecasting; namely, probabilistic load forecasting is required for smooth power system operation and planning. In this paper, an IVM (informative vector machine) based method is proposed for short-term load forecasting. IVM is one of the kernel machine techniques derived from the SVM (support vector machine). The Gaussian process (GP) satisfies the requirement that the prediction results are expressed as a distribution rather than as points. However, it is inclined to be overtrained on noise because the basis function has N² elements for N data points. To overcome this problem, this paper makes use of IVM, which selects the data necessary for the model approximation using the a posteriori distribution of entropy; this helps to suppress excess training. The proposed method is tested using real data for short-term load forecasting. © 2008 Wiley Periodicals, Inc. Electr Eng Jpn, 166(2): 23–31, 2009; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20693 [source]

A study of economic evaluation of demand-side energy storage system in consideration of market clearing price
ELECTRICAL ENGINEERING IN JAPAN, Issue 1 2007. Ken Furusawa.
In Japan the electricity market will open on April 1, 2004. Electric utilities, Power Producers and Suppliers (PPSs), and Load Service Entities (LSEs) will join the electricity market. LSEs purchase electricity from the electricity market based on the Market Clearing Price (MCP). LSEs supply electricity to customers that have contracted with them at a fixed electricity price, and to customers that have introduced an Energy Storage System (ES) under time-of-use pricing. It is difficult for LSEs to estimate whether or not they have any incentive to encourage customers to introduce ES. This paper evaluates the reduction of an LSE's purchasing cost from the electricity market, and of other LSEs' purchasing costs, brought about by introducing ES to customers. It is clarified which kinds of customers have the effect of decreasing LSEs' purchasing cost, and by how much demand-side energy storage systems change the MCP of the whole power system. Through numerical examples using real data for a year's worth of MCP, this paper evaluates the possibility of giving a cost merit to both customers with energy storage systems and LSEs. © 2006 Wiley Periodicals, Inc. Electr Eng Jpn, 158(1): 22–35, 2007; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20447 [source]

Sampling from Dirichlet partitions: estimating the number of species
ENVIRONMETRICS, Issue 7 2009. Thierry Huillet.
The Dirichlet partition of an interval can be viewed as a generalization of several classical models in ecological statistics. We recall the unordered Ewens sampling formulae (ESF) from finite Dirichlet partitions. As this is a key variable for estimation purposes, the focus is on the number of distinct visited species in the sampling process, which is illustrated in specific cases. We use these preliminary statistical results on frequency distributions to address the following sampling problem: what is the estimated number of species when sampling is from Dirichlet populations? The results obtained are in accordance with those found in sampling theory from random proportions with the Poisson–Dirichlet (PD) distribution. To conclude, we apply the different estimators suggested to two different sets of real data. Copyright © 2009 John Wiley & Sons, Ltd. [source]
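The sampling set-up described here is easy to explore numerically. The following sketch draws species proportions from a symmetric Dirichlet and counts the distinct species observed in a finite sample; the number of species, the Dirichlet parameter and the sample sizes are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_distinct_species(n_species=200, theta=0.3, sample_size=100, n_sims=2000):
    """Monte Carlo estimate of the expected number of distinct species observed
    when sampling `sample_size` individuals from a Dirichlet(theta,...,theta)
    population of `n_species` species."""
    counts = np.empty(n_sims, dtype=int)
    for i in range(n_sims):
        p = rng.dirichlet(np.full(n_species, theta))   # random species proportions
        sample = rng.multinomial(sample_size, p)       # one sample of individuals
        counts[i] = np.count_nonzero(sample)           # species actually seen
    return counts.mean()

for m in (50, 100, 400):
    print(f"sample size {m}: about {mean_distinct_species(sample_size=m):.1f} distinct species")
```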
Design-based empirical orthogonal function model for environmental monitoring data analysis
ENVIRONMETRICS, Issue 8 2008. Breda Munoz.
An empirical orthogonal function (EOF) model is proposed as a prediction method for data collected over space and time. EOF models are widely used in a number of disciplines, including meteorology and oceanography. The appealing feature of this model is that it requires no assumption about the structure of the covariance matrix. However, there is a need to account for the errors associated with the spatial and temporal features of the data. This is accomplished by incorporating information from the sampling design used to establish the network into the model. The theoretical developments and numerical solutions are presented in the first section of the paper. An application of the model to real data and the results of validation analyses are also presented. Copyright © 2008 John Wiley & Sons, Ltd. [source]

Generalized Birnbaum-Saunders distributions applied to air pollutant concentration
ENVIRONMETRICS, Issue 3 2008. Víctor Leiva.
The generalized Birnbaum-Saunders (GBS) distribution is a new class of positively skewed models with lighter and heavier tails than the traditional Birnbaum-Saunders (BS) distribution, which is largely applied to study lifetimes. However, the theoretical arguments and the interesting properties of the GBS model have made its application possible beyond lifetime analysis. The aim of this paper is to present the GBS distribution as a useful model for describing pollution data and to derive its positive and negative moments. Based on these moments, we develop estimation and goodness-of-fit methods. Also, some properties of the proposed estimators useful for developing asymptotic inference are presented. Finally, an application with real data from the environmental sciences is given to illustrate the methodology developed. This example shows that the empirical fit of the GBS distribution to the data is very good. Thus, the GBS model is appropriate for describing air pollutant concentration data, and it produces better results than the lognormal model when the administrative target is determined for abating air pollution. Copyright © 2007 John Wiley & Sons, Ltd. [source]

Emissions of greenhouse gases attributable to the activities of the land transport: modelling and analysis using I-CIR stochastic diffusion: the case of Spain
ENVIRONMETRICS, Issue 2 2008. R. Gutiérrez.
In this study, carried out on the basis of the conclusions and methodological recommendations of the Fourth Assessment Report (2007) of the Intergovernmental Panel on Climate Change (IPCC), we consider the emissions of greenhouse gases (GHG), and particularly those of CO2, attributable to the activities of land transport across all sectors of the economy, as these constitute a significant proportion of total GHG emissions. In particular, the case of Spain is an example of a worrying situation in this respect, both in itself and in the context of the European Union. To analyse the evolution of such emissions, to enable medium-term forecasts to be made, and to obtain a model with which to analyse the effects of possible corrective mechanisms, we have statistically fitted an inverse Cox-Ingersoll-Ross (I-CIR) type nonlinear stochastic diffusion process on the basis of the real data measured for the period 1990–2004, during which the Kyoto protocol has been applicable. We study the evolution of the trend of these emissions using estimated trend functions, for which purpose probabilistic complements such as trend functions and the stationary distribution are incorporated, together with a statistical methodology (estimation and asymptotic inference) for this diffusion, these tools being necessary for the application of the proposed analytical methodology. Copyright © 2007 John Wiley & Sons, Ltd. [source]
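For readers unfamiliar with this family of models, the sketch below simulates a CIR diffusion by the Euler-Maruyama scheme and takes its reciprocal to obtain an inverse-CIR trajectory; the drift, volatility and time step are illustrative values only, not the estimates fitted in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def inverse_cir_path(x0=1.0, kappa=0.5, mu=1.2, sigma=0.3, dt=1.0 / 12, n_steps=180):
    """Euler-Maruyama simulation of a CIR diffusion
        dX = kappa * (mu - X) dt + sigma * sqrt(X) dW,
    returning the reciprocal path 1/X as an inverse-CIR trajectory."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt))
        x[k + 1] = x[k] + kappa * (mu - x[k]) * dt + sigma * np.sqrt(max(x[k], 0.0)) * dw
        x[k + 1] = max(x[k + 1], 1e-8)   # keep the discretized path positive
    return 1.0 / x

path = inverse_cir_path()
print(path[:5].round(3), "...", path[-1].round(3))
```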
Case-control association testing in the presence of unknown relationships
GENETIC EPIDEMIOLOGY, Issue 8 2009. Yoonha Choi.
Genome-wide association studies result in inflated false-positive results when unrecognized cryptic relatedness exists. A number of methods have been proposed for testing association between markers and disease with a correction for known, pedigree-based relationships. However, in most case-control studies relationships are generally unknown, yet the design is predicated on the assumption of at least ancestral relatedness among cases. Here, we focus on adjusting for cryptic relatedness when the genealogy of the sample is unknown, particularly in the context of samples from isolated populations where cryptic relatedness may be problematic. We estimate cryptic relatedness using maximum-likelihood methods and use a corrected χ2 test with estimated kinship coefficients for testing in the context of unknown cryptic relatedness. Estimated kinship coefficients characterize precisely the relatedness between truly related people, but are biased for unrelated pairs. The proposed test substantially reduces spurious positive results, producing a uniform null distribution of P-values. Especially with missing pedigree information, estimated kinship coefficients can still be used to correct for non-independence among individuals. The corrected test was applied to real data sets from genetic isolates and produced a distribution of P-values that was close to uniform. Thus, the proposed test corrects the non-uniform distribution of P-values obtained with the uncorrected test and illustrates the advantage of the approach on real data. Genet. Epidemiol. 33:668–678, 2009. © 2009 Wiley-Liss, Inc. [source]
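A minimal sketch of the general idea of a kinship-corrected case-control test is given below: the usual allele-frequency contrast between cases and controls, with its null variance computed through a quadratic form in a kinship-derived covariance matrix. The genotype coding, the variance approximation and the assumption of a known kinship matrix are simplifications for illustration; this is not the paper's estimator of cryptic relatedness.

```python
import numpy as np
from scipy import stats

def kinship_corrected_chi2(genotypes, is_case, kinship):
    """Allele-frequency contrast between cases and controls, with the null
    variance inflated for relatedness.

    genotypes : (n,) minor-allele counts coded 0/1/2
    is_case   : (n,) boolean case indicator
    kinship   : (n, n) kinship matrix (phi_ii = 0.5 for non-inbred individuals)
    """
    g = np.asarray(genotypes, dtype=float)
    case = np.asarray(is_case, dtype=bool)
    n_case, n_ctrl = case.sum(), (~case).sum()
    # contrast: allele frequency in cases minus allele frequency in controls
    c = np.where(case, 1.0 / (2 * n_case), -1.0 / (2 * n_ctrl))
    stat = c @ g
    p_hat = g.mean() / 2.0
    # Under the null, Cov(g) is approximated by 2 p (1 - p) * (2 * kinship)
    cov_g = 2.0 * p_hat * (1.0 - p_hat) * (2.0 * np.asarray(kinship))
    chi2 = stat**2 / (c @ cov_g @ c)
    return chi2, stats.chi2.sf(chi2, df=1)

# With kinship = 0.5 * np.eye(n) this reduces to an ordinary allelic test.
```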
Quantification and correction of bias in tagging SNPs caused by insufficient sample size and marker density by means of haplotype-dropping
GENETIC EPIDEMIOLOGY, Issue 1 2008. Mark M. Iles.
Tagging single nucleotide polymorphisms (tSNPs) are commonly used to capture genetic diversity cost-effectively. It is important that the efficacy of tSNPs is correctly estimated, otherwise coverage may be inadequate and studies underpowered. Using data simulated under a coalescent model, we show that insufficient sample size can lead to overestimation of tSNP efficacy. Quantifying this, we find that even when insufficient marker density is adjusted for, estimates of tSNP efficacy are up to 45% higher than the true values. Even with as many as 100 individuals, estimates of tSNP efficacy may be 9% higher than the true value. We describe a novel method for estimating tSNP efficacy that accounts for limited sample size. The method is based on the exclusion of haplotypes, incorporating a previous adjustment for insufficient marker density. We show that this method outperforms an existing bootstrap approach. We compare the efficacy of multimarker and pairwise tSNP selection methods on real data. These comparisons confirm our findings with simulated data and suggest that pairwise methods are less sensitive to sample size, but more sensitive to marker density. We conclude that a combination of insufficient sample size and overfitting may cause overestimation of tSNP efficacy and underpowering of studies based on tSNPs. Our novel method corrects much of this bias and is superior to a previous method. However, sample sizes larger than previously suggested may be required for accurate estimation of tSNP efficacy. This has obvious ramifications for tSNP selection both in candidate regions and using HapMap or SNP chips for genome-wide studies. Genet. Epidemiol. 31, 2007. © 2007 Wiley-Liss, Inc. [source]

A novel method to identify gene-gene effects in nuclear families: the MDR-PDT
GENETIC EPIDEMIOLOGY, Issue 2 2006. E. R. Martin.
It is now well recognized that gene-gene and gene-environment interactions are important in complex diseases, and statistical methods to detect interactions are becoming widespread. Traditional parametric approaches are limited in their ability to detect high-order interactions and to handle sparse data, and standard stepwise procedures may miss interactions that occur in the absence of detectable main effects. To address these limitations, the multifactor dimensionality reduction (MDR) method [Ritchie et al., 2001: Am J Hum Genet 69:138–147] was developed. The MDR is well suited for examining high-order interactions and detecting interactions without main effects. The MDR was originally designed to analyze balanced case-control data. The analysis can use family data, but requires a single matched pair to be selected from each family. This may be a discordant sib pair, or may be constructed from triad data when parents are available. Taking advantage of additional affected and unaffected siblings requires a test statistic that measures the association of genotype with disease in general nuclear families. We have developed a novel test, the MDR-PDT, by merging the MDR method with the genotype-Pedigree Disequilibrium Test (geno-PDT) [Martin et al., 2003: Genet Epidemiol 25:203–213]. The MDR-PDT allows identification of single-locus effects or joint effects of multiple loci in families of diverse structure. We present simulations to demonstrate the validity of the test and to evaluate its power. To examine its applicability to real data, we applied the MDR-PDT to data from candidate genes for Alzheimer disease (AD) in a large family dataset. These results show the utility of the MDR-PDT for understanding the genetics of complex diseases. Genet. Epidemiol. 2006. © 2005 Wiley-Liss, Inc. [source]
Finding starting points for Markov chain Monte Carlo analysis of genetic data from large and complex pedigrees
GENETIC EPIDEMIOLOGY, Issue 1 2003. Yuqun Luo.
Genetic data from founder populations are advantageous for studies of complex traits that are often plagued by the problem of genetic heterogeneity. However, the desire to analyze large and complex pedigrees that often arise from such populations, coupled with the need to handle many linked and highly polymorphic loci simultaneously, poses challenges to current standard approaches. A viable alternative for solving such problems is via Markov chain Monte Carlo (MCMC) procedures, where a Markov chain, defined on the state space of a latent variable (e.g., genotypic configuration or inheritance vector), is constructed. However, finding starting points for the Markov chains is a difficult problem when the pedigree is not single-locus peelable; methods proposed in the literature have not yielded completely satisfactory solutions. We propose a generalization of the heated Gibbs sampler with relaxed penetrances (HGRP) of Lin et al. ([1993] IMA J. Math. Appl. Med. Biol. 10:1–17) to search for starting points. HGRP guarantees that a starting point will be found if there is no error in the data, but the chain usually needs to be run for a long time if the pedigree is extremely large and complex. By introducing a forcing step, the current algorithm substantially reduces the state space and hence effectively speeds up the process of finding a starting point. Our algorithm also has a built-in preprocessing procedure for Mendelian error detection. The algorithm has been applied to both simulated and real data on two large and complex Hutterite pedigrees under many settings, and good results have been obtained. The algorithm has been implemented in a user-friendly package called START. Genet Epidemiol 25:14–24, 2003. © 2003 Wiley-Liss, Inc. [source]

Joint inversion of multiple data types with the use of multiobjective optimization: problem formulation and application to the seismic anisotropy investigations
GEOPHYSICAL JOURNAL INTERNATIONAL, Issue 2 2007. E. Kozlovskaya.
In geophysical studies the problem of joint inversion of multiple experimental data sets obtained by different methods is conventionally considered as a scalar one: a solution is found by minimization of a linear combination of functions describing the fit of the values predicted from the model to each set of data. In the present paper we demonstrate that this standard approach is not always justified and propose to consider a joint inversion problem as a multiobjective optimization problem (MOP), for which the misfit function is a vector. The method is based on the analysis of two types of solutions to the MOP, considered in the space of misfit functions (objective space). The first is the set of complete optimal solutions that minimize all the components of the vector misfit function simultaneously. The second is the set of Pareto optimal solutions, or trade-off solutions, for which it is not possible to decrease any component of the vector misfit function without increasing at least one other. We investigate the connection between the standard formulation of a joint inversion problem and the multiobjective formulation and demonstrate that the standard formulation is a particular case of scalarization of a multiobjective problem using a weighted sum of the component misfit functions (objectives). We illustrate the multiobjective approach with a non-linear problem of the joint inversion of shear-wave splitting parameters and longitudinal-wave residuals. Using synthetic data and real data from three passive seismic experiments, we demonstrate that random noise in the data and inexact model parametrization destroy the complete optimal solution, which degenerates into a fairly large Pareto set. As a result, the non-uniqueness of the joint inversion problem increases. If the random noise in the data is the only source of uncertainty, the Pareto set expands around the true solution in the objective space. In this case the 'ideal point' method of scalarization of multiobjective problems can be used. If the uncertainty is due to inexact model parametrization, the Pareto set in the objective space deviates strongly from the true solution. In this case all scalarization methods fail to find a solution close to the true one and a change of model parametrization is necessary. [source]
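The Pareto (trade-off) set at the heart of this formulation is straightforward to extract once a collection of candidate models has been evaluated against each objective. The sketch below identifies the non-dominated models for a generic two-objective misfit table; the random misfits stand in for, e.g., a splitting-parameter misfit and a traveltime-residual misfit.

```python
import numpy as np

def pareto_front(misfits):
    """Return the indices of non-dominated (Pareto optimal) models.
    `misfits` is an (n_models, n_objectives) array; model i is dominated if
    another model is <= in every objective and strictly < in at least one."""
    m = np.asarray(misfits, dtype=float)
    keep = np.ones(len(m), dtype=bool)
    for i in range(len(m)):
        others = np.delete(m, i, axis=0)
        dominated = np.any(np.all(others <= m[i], axis=1) &
                           np.any(others < m[i], axis=1))
        keep[i] = not dominated
    return np.flatnonzero(keep)

rng = np.random.default_rng(3)
misfits = rng.uniform(size=(200, 2))    # two objectives per candidate model
print("Pareto optimal models:", pareto_front(misfits))
```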
Measuring finite-frequency body-wave amplitudes and traveltimes
GEOPHYSICAL JOURNAL INTERNATIONAL, Issue 1 2006. Karin Sigloch.
We have developed a method to measure finite-frequency amplitude and traveltime anomalies of teleseismic P waves. We use a matched filtering approach that models the first 25 s of a seismogram after the P arrival, which includes the depth phases pP and sP. Given a set of broad-band seismograms from a teleseismic event, we compute synthetic Green's functions using published moment tensor solutions. We jointly deconvolve global or regional sets of seismograms with their Green's functions to obtain the broad-band source time function. The matched filter of a seismogram is the convolution of the Green's function with the source time function. Traveltimes are computed by cross-correlating each seismogram with its matched filter. Amplitude anomalies are defined as the multiplicative factors that minimize the RMS misfit between matched filters and data. The procedure is implemented in an iterative fashion, which allows for joint inversion for the source time function, the amplitudes, and a correction to the moment tensor. Cluster analysis is used to identify azimuthally distinct groups of seismograms when source effects with azimuthal dependence are prominent; we then invert for one source time function per group. We implement this inversion for a range of source depths to determine the most likely depth, as indicated by the overall RMS misfit and by the non-negativity and compactness of the source time function. Finite-frequency measurements are obtained by filtering broad-band data and matched filters through a bank of passband filters. The method is validated on a set of 15 events of magnitude 5.8 to 6.9. Our focus is on the densely instrumented Western US. Quasi-duplet events ('quplets') are used to estimate measurement uncertainty on real data. Robust results are achieved for wave periods between 24 and 2 s. Traveltime dispersion is on the order of 0.5 s. Amplitude anomalies are on the order of 1 dB in the lowest bands and 3 dB in the highest bands, corresponding to amplification factors of 1.2 and 2.0, respectively. Measurement uncertainties for amplitudes and traveltimes depend mostly on station coverage, the accuracy of the moment tensor estimate, and the frequency band. We investigate the influence of those parameters in tests on synthetic data. Along the RISTRA array in the Western US, we observe amplitude and traveltime patterns that are coherent on scales of hundreds of kilometres. Below two sections of the array, we observe a combination of frequency-dependent amplitude and traveltime patterns that strongly suggest wavefront healing effects. [source]
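The two measurements described here (a cross-correlation traveltime and a least-squares amplitude factor) can be written in a few lines. The sketch below applies them to a toy pulse; it leaves out the band-pass filter bank and the iterative source inversion of the full procedure.

```python
import numpy as np

def amplitude_and_delay(data, matched, dt):
    """Amplitude factor and traveltime anomaly of a windowed seismogram with
    respect to its matched filter (both 1-D arrays with sampling interval dt).

    amplitude: the scale factor a minimizing ||data - a * matched|| (least squares)
    delay:     the lag of the cross-correlation maximum, in seconds
    """
    a = np.dot(data, matched) / np.dot(matched, matched)
    xcorr = np.correlate(data, matched, mode="full")
    lag = int(np.argmax(xcorr)) - (len(matched) - 1)
    return a, lag * dt

# Toy check: the data are a scaled, delayed copy of the matched filter
dt = 0.05
t = np.arange(0.0, 25.0, dt)
matched = np.exp(-((t - 5.0) / 0.5) ** 2)
data = 1.3 * np.exp(-((t - 5.4) / 0.5) ** 2)
print(amplitude_and_delay(data, matched, dt))
# approximately (0.94, 0.4): the 0.4 s delay is recovered; the raw scale factor
# is biased low because the time shift reduces the overlap of the two pulses.
```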
Geoelectric dimensionality in complex geological areas: application to the Spanish Betic Chain
GEOPHYSICAL JOURNAL INTERNATIONAL, Issue 3 2004. Anna Martí.
Rotational invariants of the magnetotelluric impedance tensor may be used to obtain information on the geometry of the underlying geological structures. The set of invariants proposed by Weaver et al. (2000) allows the determination of a suitable dimensionality for the modelling of observed data. The application of the invariants to real data must take into account the errors in the data and also the fact that geoelectric structures in the Earth will not exactly fit 1-D, 2-D or simple 3-D models. In this work we propose a method to estimate the dimensionality of geoelectric structures based on the rotational invariants, bearing in mind the experimental error of real data. A data set from the Betic Chain (Spain) is considered. We compare the errors of the invariants estimated by different approaches (classical error propagation, generation of random Gaussian noise and bootstrap resampling), and we investigate the question of the threshold value to be used in the determination of dimensionality. We conclude that the errors of the invariants can be properly estimated by classical error propagation, but that the generation of random values is better for ensuring stability in the errors of the strike direction and distortion parameters. The use of a threshold value between 0.1 and 0.15 is recommended for real data of medium to high quality. The results for the Betic Chain show that the general behaviour is 3-D with a disposition of 2-D structures, which may be correlated with the nature of the crust of the region. [source]

Synthesis of a seismic virtual reflector
GEOPHYSICAL PROSPECTING, Issue 3 2010. Flavio Poletto.
We describe a method to process the seismic data generated by a plurality of sources and registered by an appropriate distribution of receivers, which provides new seismic signals as if in the position of the receivers (or sources) there were an ideal reflector, even if this reflector is not present there. The data provided by this method represent the signals of a virtual reflector. The proposed algorithm performs the convolution and the subsequent sum of the real traces without needing subsurface model information. The approach can be used in combination with seismic interferometry to separate wavefields and process the reflection events. The application is described with synthetic examples, including stationary-phase analysis, and with real data in which the virtual reflector signal can be appreciated. [source]

Addressing non-uniqueness in linearized multichannel surface wave inversion
GEOPHYSICAL PROSPECTING, Issue 1 2009. Michele Cercato.
The multichannel analysis of surface waves method is based on the inversion of observed Rayleigh-wave phase-velocity dispersion curves to estimate the shear-wave velocity profile of the site under investigation. This inverse problem is nonlinear and is often solved using 'local' or linearized inversion strategies. Among linearized inversion algorithms, least-squares methods are widely used in research and prevalent in commercial software; the main drawback of this class of methods is their limited capability to explore the model parameter space. The possibility for the estimated solution to be trapped in local minima of the objective function strongly depends on the degree of non-uniqueness of the problem, which can be reduced by an adequate model parameterization and/or by imposing constraints on the solution. In this article, a linearized algorithm based on inequality constraints is introduced for the inversion of observed dispersion curves; this provides a flexible way to insert a priori information as well as physical constraints into the inversion process. As linearized inversion methods are strongly dependent on the choice of the initial model and on the accuracy of the partial derivative calculations, these factors are carefully reviewed. Attention is also focused on the appraisal of the inverted solution, using resolution analysis and uncertainty estimation together with a posteriori effective-velocity modelling. The efficiency and stability of the proposed approach are demonstrated using both synthetic and real data; in the latter case, cross-hole S-wave velocity measurements are blind-compared with the results of the inversion process. [source]
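A hedged sketch of the kind of inequality-constrained linearized step such an algorithm builds on is shown below, using SciPy's bounded least-squares solver; the Jacobian, residual, damping and velocity bounds are invented stand-ins, not the paper's parameterization.

```python
import numpy as np
from scipy.optimize import lsq_linear

def bounded_update(jacobian, residual, model, lower, upper, damping=0.1):
    """One damped, linearized inversion step with inequality (bound) constraints:
    minimize ||J dm - r||^2 + damping * ||dm||^2
    subject to lower <= model + dm <= upper."""
    n = jacobian.shape[1]
    A = np.vstack([jacobian, np.sqrt(damping) * np.eye(n)])
    b = np.concatenate([residual, np.zeros(n)])
    step = lsq_linear(A, b, bounds=(lower - model, upper - model)).x
    return model + step

# Toy usage with a hypothetical three-layer shear-velocity model (m/s)
rng = np.random.default_rng(4)
J = rng.normal(size=(8, 3))            # stand-in Jacobian of the dispersion data
r = rng.normal(size=8)                 # stand-in data residual
m0 = np.array([200.0, 350.0, 600.0])
m1 = bounded_update(J, r, m0, lower=0.8 * m0, upper=1.2 * m0)
print(m1.round(1))
```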
Inversion of time-dependent nuclear well-logging data using neural networks
GEOPHYSICAL PROSPECTING, Issue 1 2008. Laura Carmine.
The purpose of this work was to investigate a new and fast inversion methodology for the prediction of subsurface formation properties such as porosity, salinity and oil saturation, using time-dependent nuclear well-logging data. Although the ultimate aim is to apply the technique to real field data, an initial investigation, as described in this paper, was first required; this has been carried out using simulation results for the time-dependent radiation transport problem within a borehole. Simulated neutron and γ-ray fluxes at two sodium iodide (NaI) detectors, one near and one far from a pulsed neutron source emitting at about 14 MeV, were used for the investigation. A total of 67 energy groups from the BUGLE96 cross-section library, together with 567 property combinations, were employed for the original flux response generation, achieved by solving numerically the time-dependent Boltzmann radiation transport equation in its even-parity form. Material property combinations (scenarios) and their corresponding teaching outputs (flux responses at the detectors) are used to train the artificial neural networks (ANNs), and test data are used to assess the accuracy of the ANNs. The trained networks are then used to produce a surrogate model of the forward model, which is expensive in terms of computing time and resources, and with this surrogate a simple inversion method is applied to calculate material properties from the time evolution of the flux responses at the two detectors. The inversion technique uses a fast surrogate model comprising 8026 artificial neural networks, each consisting of an input layer with three input units (neurons) for porosity, salinity and oil saturation, two hidden layers, and one output neuron representing the scalar photon or neutron flux prediction at the detector. This is the first time this technique has been applied to invert pulsed neutron logging tool information, and the results produced are very promising. The next step in the procedure is to apply the methodology to real data. [source]
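The surrogate-then-invert idea can be illustrated compactly: train a small neural network to emulate an expensive forward model on precomputed (property, response) pairs, then recover the properties behind an observed response by minimizing the misfit of the surrogate's predictions. The forward model, network size and grid search below are illustrative stand-ins, not the paper's 8026-network surrogate or its transport solver.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(5)

def forward(props):
    """Stand-in forward model mapping (porosity, salinity, saturation) to a
    10-sample time response; the real study solves the Boltzmann transport
    equation instead."""
    t = np.linspace(0.1, 1.0, 10)
    por, sal, sat = props.T
    return (np.outer(por, np.exp(-2.0 * t)) + np.outer(sal, np.exp(-5.0 * t))
            + np.outer(sat, np.exp(-9.0 * t)))

X = rng.uniform(0.0, 1.0, size=(500, 3))        # property scenarios (scaled)
Y = forward(X)                                  # precomputed flux responses
surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000,
                         random_state=0).fit(X, Y)

# "Observed" response for unknown properties, inverted by a simple grid search
observed = forward(np.array([[0.3, 0.6, 0.4]]))[0]
grid = np.stack(np.meshgrid(*[np.linspace(0, 1, 21)] * 3), axis=-1).reshape(-1, 3)
misfit = np.sum((surrogate.predict(grid) - observed) ** 2, axis=1)
print("estimated properties:", grid[np.argmin(misfit)])
```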
2D data modelling by electrical resistivity tomography for complex subsurface geology
GEOPHYSICAL PROSPECTING, Issue 2 2006. E. Cardarelli.
A new tool for two-dimensional apparent-resistivity data modelling and inversion is presented. The study is developed according to the idea that the best way to deal with the ill-posedness of geoelectrical inverse problems lies in constructing algorithms which allow flexible control of the physical and mathematical elements involved in the resolution. The forward problem is solved through a finite-difference algorithm, whose main features are a versatile user-defined discretization of the domain and a new approach to the solution of the inverse Fourier transform. The inversion procedure is based on an iterative smoothness-constrained least-squares algorithm. As mentioned, the code is constructed to ensure flexibility in resolution. This is first achieved by starting the inversion from an arbitrarily defined model. In our approach, a Jacobian matrix is calculated at each iteration, using a generalization of Cohn's network sensitivity theorem. Another versatile feature is the possibility of introducing a priori information about the solution. Regions of the domain can be constrained to vary between two limits (lower and upper bounds) by using inequality constraints. A second possibility is to include the starting model in the objective function used to determine an improved estimate of the unknown parameters, and so to constrain the solution towards that model. Furthermore, the possibility either of defining a discretization of the domain that exactly fits the underground structures or of refining the mesh of the grid certainly leads to more accurate solutions. Control of the mathematical elements in the inversion algorithm is also allowed. The smoothness matrix can be modified in order to penalize roughness in any one direction. An empirical way of assigning the regularization parameter (damping) is defined, but the user can also decide to assign it manually at each iteration. An appropriate tool was constructed for handling the inversion results, for example to correct reconstructed models and to check the effects of such changes on the calculated apparent resistivity. Tests on synthetic and real data, in particular in handling indeterminate cases, show that the flexible approach is a good way to build a detailed picture of the prospected area. [source]
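The core update of an iterative smoothness-constrained least-squares scheme of this kind can be sketched in a few lines; the first-difference roughness operator, damping value and reference model below are generic choices for illustration, not the tool's actual implementation.

```python
import numpy as np

def smoothness_constrained_step(jacobian, residual, model, ref_model,
                                smooth_matrix, damping):
    """One damped Gauss-Newton iteration of a smoothness-constrained inversion:
        dm = (J^T J + damping * L^T L)^(-1) (J^T r - damping * L^T L (m - m_ref))
    with Jacobian J, data residual r, roughness operator L and reference model m_ref."""
    J = np.asarray(jacobian)
    L = np.asarray(smooth_matrix)
    lhs = J.T @ J + damping * L.T @ L
    rhs = J.T @ residual - damping * L.T @ (L @ (model - ref_model))
    return model + np.linalg.solve(lhs, rhs)

def first_difference(n):
    """First-difference roughness operator for a 1-D string of model cells."""
    L = np.zeros((n - 1, n))
    L[np.arange(n - 1), np.arange(n - 1)] = -1.0
    L[np.arange(n - 1), np.arange(1, n)] = 1.0
    return L

rng = np.random.default_rng(6)
n = 10
J = rng.normal(size=(15, n))          # stand-in sensitivity (Jacobian) matrix
r = rng.normal(size=15)               # stand-in apparent-resistivity residual
m0 = np.full(n, 100.0)                # starting resistivities (ohm-m)
m1 = smoothness_constrained_step(J, r, m0, m0, first_difference(n), damping=1.0)
print(m1.round(1))
```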
The effects of near-surface conditions on anisotropy parameter estimations from 4C seismic data
GEOPHYSICAL PROSPECTING, Issue 1 2006. Bärbel Traub.
We present a study of anisotropy parameter estimation in the near-surface layers for P-wave and converted-wave (C-wave) data. Near-surface data are affected by apparent anisotropy due to a vertical velocity compaction gradient. We have carried out a modelling study which showed that a velocity gradient introduces apparent anisotropy into an isotropic medium. Thus, parameter estimation will give anomalous values that affect the imaging of the target area. The parameter estimation technique is also influenced by phase reversals with diminishing amplitude, leading to erroneous parameters. In a modelling study using a near-surface model, we have observed phase reversals in near-surface PP reflections. The values of the P-wave anisotropy parameter estimated from these events are about an order of magnitude larger than the model values. Next, we use C-wave data to estimate the C-wave anisotropy parameter and compute the P-wave parameter from these values. The calculated values are closer to the model values, and NMO correction with both sets of values shows a better correction for the calculated ones. Hence, we believe that calculating the P-wave anisotropy parameter from the C-wave parameter gives a better representation of the anisotropy than picking it from the P-wave data alone. Finally, we extract the anisotropy parameters from real data from the Alba Field in the North Sea. Comparing the results with reference values from a model built according to well-log, VSP and surface data, we find that the parameters show differences of up to an order of magnitude. The values calculated from the C-wave anisotropy parameter fit the reference values much better and are of the same order of magnitude. [source]

Minimum weighted norm wavefield reconstruction for AVA imaging
GEOPHYSICAL PROSPECTING, Issue 6 2005. Mauricio D. Sacchi.
Seismic wavefield reconstruction is posed as an inversion problem where, from inadequate and incomplete data, we attempt to recover the data we would have acquired with a denser distribution of sources and receivers. A minimum weighted norm interpolation method is proposed to interpolate prestack volumes before wave-equation amplitude-versus-angle imaging. Synthetic and real data were used to investigate the effectiveness of our wavefield reconstruction scheme when preconditioning seismic data for wave-equation amplitude-versus-angle imaging. [source]

Green's function interpolations for prestack imaging
GEOPHYSICAL PROSPECTING, Issue 1 2000. Manuela Mendes.
A new interpolation method is presented to estimate the Green's function values, taking into account the migration/inversion accuracy requirements and the trade-off between resolution and computing costs. The fundamental tool used for this technique is the Dix hyperbolic equation (DHE). The procedure, when applied to evaluate the Green's function for a real source position, uses the DHE to derive the root-mean-square velocity, vRMS, from the precomputed traveltimes for the nearest virtual sources, and by linear interpolation generates vRMS for the real source. Then, by applying the DHE again, the required traveltimes and geometrical spreading can be estimated. The inversion of synthetic data demonstrates that the new interpolation yields excellent results, which give better qualitative and quantitative resolution of the imaged sections compared with those obtained by conventional linear interpolation. Furthermore, the application to synthetic and real data demonstrates the ability of the technique to interpolate Green's functions from widely spaced virtual sources. Thus the proposed interpolation, besides improving the imaging results, also reduces the overall CPU time and the hard disk space required, hence decreasing the computational effort of the imaging algorithms. [source]

Source-based morphometry: The use of independent component analysis to identify gray matter differences with application to schizophrenia
HUMAN BRAIN MAPPING, Issue 3 2009. Lai Xu.
We present a multivariate alternative to the voxel-based morphometry (VBM) approach, called source-based morphometry (SBM), to study gray matter differences between patients and healthy controls. The SBM approach begins with the same preprocessing procedures as VBM. Next, independent component analysis is used to identify naturally grouping, maximally independent sources. Finally, statistical analyses are used to determine the significant sources and their relationship to other variables. The identified "source networks," groups of spatially distinct regions with common covariation among subjects, provide information about the localization of gray matter changes and their variation among individuals. In this study, we first compared VBM and SBM via a simulation and then applied both methods to real data obtained from 120 chronic schizophrenia patients and 120 healthy controls. SBM identified five gray matter sources as significantly associated with schizophrenia. These included sources in the bilateral temporal lobes, thalamus, basal ganglia, parietal lobe, and frontotemporal regions. None of these showed an effect of sex. Two sources in the bilateral temporal and parietal lobes showed age-related reductions. The most significant source of schizophrenia-related gray matter changes identified by SBM occurred in the bilateral temporal lobe, while the most significant change found by VBM occurred in the thalamus. The SBM approach found changes not identified by VBM in the basal ganglia, parietal, and occipital lobes. These findings show that SBM is a multivariate alternative to VBM, with wide applicability to studying changes in brain structure. Hum Brain Mapp, 2009. © 2008 Wiley-Liss, Inc. [source]
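A minimal sketch of the spatial-ICA decomposition plus group test that this kind of analysis relies on is given below, using scikit-learn's FastICA on synthetic gray-matter maps; the matrix sizes, implanted effect and significance threshold are illustrative assumptions, not the study's processing pipeline.

```python
import numpy as np
from scipy import stats
from sklearn.decomposition import FastICA

rng = np.random.default_rng(8)

# Toy gray-matter data: one row per subject, one column per voxel
n_patients, n_controls, n_voxels = 40, 40, 2000
X = rng.normal(size=(n_patients + n_controls, n_voxels))
X[:n_patients, 100:200] -= 0.4          # implant a patient deficit in one region

# Spatial ICA: voxels are treated as samples, so the components are spatial
# maps and the mixing matrix holds one loading per subject and component.
ica = FastICA(n_components=10, random_state=0, max_iter=1000)
spatial_maps = ica.fit_transform(X.T)    # (n_voxels, n_components)
loadings = ica.mixing_                   # (n_subjects, n_components)

# Test each component's subject loadings for a patient/control difference
pvals = [stats.ttest_ind(loadings[:n_patients, k], loadings[n_patients:, k]).pvalue
         for k in range(loadings.shape[1])]
best = int(np.argmin(pvals))
print(f"most group-discriminating source: component {best}, p = {pvals[best]:.2e}")
```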
Assessing the predictive performance of artificial neural network-based classifiers based on different data preprocessing methods, distributions and training mechanisms
INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE & MANAGEMENT, Issue 4 2005. Adrian Costea.
We analyse the implications of three different factors (preprocessing method, data distribution and training mechanism) for the classification performance of artificial neural networks (ANNs). We use three preprocessing approaches: no preprocessing, division by the maximum absolute values, and normalization. We study the implications of input data distributions by using five datasets with different distributions: the real data, uniform, normal, logistic and Laplace distributions. We test two training mechanisms: one belonging to the gradient-descent techniques, improved by a retraining procedure, and the other a genetic algorithm (GA), which is based on the principles of natural evolution. The results show statistically significant influences of all individual and combined factors on both training and testing performances. A major difference from other related studies is that for both training mechanisms we train the network using as the starting solution the one obtained when constructing the network architecture; in other words, we use a hybrid approach by refining a previously obtained solution. We found that when the starting solution had relatively low accuracy rates (80–90%) the GA clearly outperformed the retraining procedure, whereas the difference was smaller to non-existent when the starting solution had relatively high accuracy rates (95–98%). As reported in other studies, we found little to no evidence of crossover operator influence on the GA performance. Copyright © 2005 John Wiley & Sons, Ltd. [source]

Clustering technique for risk classification and prediction of claim costs in the automobile insurance industry
INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE & MANAGEMENT, Issue 1 2001. Ai Cheo Yeo.
This paper considers the problem of predicting claim costs in the automobile insurance industry. The first stage involves classifying policy holders according to their perceived risk, followed by modelling the claim costs within each risk group. Two methods are compared for the risk classification stage: a data-driven approach based on hierarchical clustering, and a previously published heuristic method that groups policy holders according to pre-defined factors. Regression is used to model the expected claim costs within a risk group. A case study is presented utilizing real data, and both risk classification methods are compared according to a variety of accuracy measures. The results of the case study show the benefits of employing a data-driven approach. © 2001 John Wiley & Sons, Ltd. [source]
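The two-stage design (cluster policyholders into risk groups, then regress claim costs within each group) can be sketched as follows; the risk factors, cluster count and simulated costs are invented for illustration and do not come from the case study.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(9)

# Hypothetical policyholder risk factors (standardized): age, vehicle age, mileage
X = rng.normal(size=(300, 3))
claims = 500 + 120 * X[:, 0] - 80 * X[:, 1] + 60 * X[:, 2] + rng.normal(0, 50, 300)

# Stage 1: hierarchical (Ward) clustering into risk groups
groups = fcluster(linkage(X, method="ward"), t=4, criterion="maxclust")

# Stage 2: a separate claim-cost regression within each risk group
for g in np.unique(groups):
    idx = groups == g
    model = LinearRegression().fit(X[idx], claims[idx])
    print(f"group {g}: n = {idx.sum()}, "
          f"mean predicted cost = {model.predict(X[idx]).mean():.0f}")
```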
Improved GMM with parameter initialization for unsupervised adaptation of Brain-Computer interface
INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING, Issue 6 2010. Guangquan Liu.
An important property of brain signals is their nonstationarity. How to adapt a brain-computer interface (BCI) to the changing brain states is one of the challenges faced by BCI researchers, especially in real applications where the subject's real intent is unknown to the system. The Gaussian mixture model (GMM) has been used for the unsupervised adaptation of the classifier in BCI. In this paper, a method of initializing the model parameters is proposed for expectation-maximization-based GMM parameter estimation. This improved GMM method and two other existing unsupervised adaptation methods are applied to groups of constructed artificial data with different data properties. The performances of these methods in different situations are analyzed. Compared with the other two unsupervised adaptation methods, this method shows a better ability to adapt to changes and to discover class information from unlabelled data. The methods are also applied to real EEG data recorded in 19 experiments. For the real data, the proposed method achieves an error rate significantly lower than those of the other two unsupervised methods. The results for the real data agree with the analysis based on the artificial data, which confirms not only the effectiveness of our method but also the validity of the constructed data. Copyright © 2009 John Wiley & Sons, Ltd. [source]

Real-time signal processing by adaptive repeated median filters
INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, Issue 5 2010. K. Schettlinger.
In intensive care, a basic goal is to extract the signals from very noisy time series in real time. We propose a robust online filter with an adaptive window width, which yields a smooth representation of the denoised data in stable periods and which is also able to trace typical patterns such as level shifts or trend changes with a small time delay. Several versions of this method are evaluated and compared in a simulation study and on real data. Copyright © 2009 John Wiley & Sons, Ltd. [source]
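As an illustration of the robust building block behind such filters, the sketch below computes the repeated median level estimate at the right end of a moving window and applies it online to a noisy toy signal; the window width is fixed here, whereas the paper's contribution is choosing it adaptively.

```python
import numpy as np

def repeated_median_level(y):
    """Repeated median regression level at the right end of a window:
    slope = med_i med_{j != i} (y_j - y_i) / (j - i)
    level = med_i (y_i - slope * (i - (n - 1)))   evaluated at the last index."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    idx = np.arange(n)
    slopes = np.empty(n)
    for i in range(n):
        j = idx[idx != i]
        slopes[i] = np.median((y[j] - y[i]) / (j - i))
    slope = np.median(slopes)
    return np.median(y - slope * (idx - (n - 1)))

rng = np.random.default_rng(10)
signal = np.linspace(0.0, 5.0, 200) + rng.standard_t(df=2, size=200)  # heavy-tailed noise
width = 21
filtered = [repeated_median_level(signal[t - width + 1:t + 1])
            for t in range(width - 1, len(signal))]
print(np.round(filtered[:5], 2))
```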