Synthetic Data Sets (synthetic + data_set)
Selected Abstracts

New procedures to decompose geomagnetic field variations and application to volcanic activity
GEOPHYSICAL JOURNAL INTERNATIONAL, Issue 1 2008
Ikuko Fujii

SUMMARY We report the development of numerical procedures for extracting long-term geomagnetic field variations caused by volcanic activity from an observed geomagnetic field by using statistical methods. The newly developed procedures estimate the trend of the observed data, as well as variations of non-volcanic origin such as periodic components, components related to external geomagnetic variations, and observational noise. We also aim to refer to data obtained at a remote standard geomagnetic observatory, rather than a temporarily installed reference site, for reasons of data quality. Two different approaches, a Bayesian statistical method and a Kalman filter method, are applied to decompose the geomagnetic field data into four components for comparison. The number of filter coefficients and the degree of condition realizations are optimized by minimizing information criteria. The two procedures were evaluated by using a synthetic data set. Generally, the results of both methods are equally satisfactory. Subtle differences are seen at the first several data points, owing to arbitrarily selected initial values in the case of the Kalman filter method, and in the smaller residual of the Bayesian statistical method. The largest differences are in computation time and memory size: the Kalman filter method runs a thousand times faster on a test workstation and requires less memory than the Bayesian method. The Kalman filter method was applied to the total intensity data at Kuchi-erabu-jima volcano, and the result suggests that the procedure works reasonably well. [source]
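As a rough sketch of the kind of state-space decomposition described in the abstract above, the following illustrates a Kalman filter separating a series into a local-linear trend, one periodic component, and observational noise. The state layout, noise variances, and period are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a Kalman-filter decomposition of a geomagnetic-style
# series into trend + periodic component + noise. All parameters are
# illustrative assumptions, not values from the paper.
import numpy as np

def kalman_decompose(y, period=24.0, q_level=1e-4, q_slope=1e-6,
                     q_cycle=1e-3, r_obs=1e-1):
    """Local-linear trend + one harmonic cycle, filtered recursively."""
    w = 2 * np.pi / period
    # State: [level, slope, cycle_cos, cycle_sin]
    F = np.array([[1, 1, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, np.cos(w),  np.sin(w)],
                  [0, 0, -np.sin(w), np.cos(w)]])
    H = np.array([[1, 0, 1, 0]])
    Q = np.diag([q_level, q_slope, q_cycle, q_cycle])
    x = np.zeros(4)        # arbitrary initial state (cf. the start-up
    P = np.eye(4) * 1e3    # transient the abstract notes for this method)
    trend, cycle = [], []
    for obs in y:
        # Predict
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the new observation
        S = H @ P @ H.T + r_obs
        K = (P @ H.T) / S
        x = x + (K * (obs - H @ x)).ravel()
        P = (np.eye(4) - K @ H) @ P
        trend.append(x[0])
        cycle.append(x[2])
    return np.array(trend), np.array(cycle)

# Toy data: slow drift + daily cycle + noise
t = np.arange(500.0)
y = 0.01 * t + 2 * np.sin(2 * np.pi * t / 24) + np.random.randn(500) * 0.3
trend, cycle = kalman_decompose(y)
```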
Seismic modelling study of a subglacial lake
GEOPHYSICAL PROSPECTING, Issue 6 2003
José M. Carcione

ABSTRACT We characterize the seismic response of Lake Vostok, an Antarctic subglacial lake located at nearly 4 km depth below the ice sheet. This study is relevant for determining the location and morphology of subglacial lakes. The characterization requires the design of a methodology based on rock physics and numerical modelling of wave propagation. The methodology involves rock-physics models of the shallow layer (firn), the ice sheet and the lake sediments; numerical simulation of synthetic seismograms; ray tracing; τ-p transforms; and AVA analysis based on the theoretical reflection coefficients. The modelled reflection seismograms show a set of straight events (refractions through the firn and top-ice layer) and the two reflection events associated with the top and bottom of the lake. Theoretical AVA analysis of these reflections indicates that, at near offsets, the PP-wave anomaly is negative for the ice/water interface and constant for the water/sediment interface. This behaviour is shown by AVA analysis of the synthetic data set. This study shows that subglacial lakes can be identified by using seismic methods. Moreover, the methodology provides a tool for designing suitable seismic surveys. [source]

A kernel-based core growing clustering method
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 4 2009
T. W. Hsieh

In this paper, a novel clustering method in the kernel space is proposed. It effectively integrates several existing algorithms into an iterative clustering scheme that can handle clusters with arbitrary shapes. In our proposed approach, a reasonable initial core for each of the clusters is estimated. This allows us to adopt a cluster-growing technique, and the growing cores offer partial hints on the cluster association. Consequently, methods used for classification, such as support vector machines (SVMs), can be useful in our approach. To obtain initial clusters effectively, the notion of the incomplete Cholesky decomposition is adopted so that fuzzy c-means (FCM) can be used to partition the data in a kernel-defined space. Then a one-class and a multiclass soft-margin SVM are adopted to detect the data within the main distributions (the cores) of the clusters and to repartition the data into new clusters iteratively. The structure of the data set is explored by pruning the data in the low-density regions of the clusters; data are then gradually added back to the main distributions to ensure exact cluster boundaries. Unlike the ordinary SVM algorithm, whose performance relies heavily on the kernel parameters given by the user, in our approach the parameters are estimated naturally from the data set. Experimental evaluations on two synthetic data sets and four University of California, Irvine real-data benchmarks indicate that the proposed algorithms outperform several popular clustering algorithms, such as FCM, support vector clustering (SVC), hierarchical clustering (HC), self-organizing maps (SOM), and non-Euclidean-norm fuzzy c-means (NEFCM). © 2009 Wiley Periodicals, Inc. [source]
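For the subglacial-lake study above (Carcione), the negative near-offset PP anomaly at the ice/water interface can be reproduced with a simple plane-wave acoustic reflection coefficient. The sketch below uses rough textbook velocities and densities, not the paper's rock-physics models, and ignores the mode conversions that a full elastic (Zoeppritz) treatment would include.

```python
# Acoustic PP reflection coefficient vs. incidence angle at an ice/water
# interface (rough illustration of the AVA behaviour in the abstract).
# Vp and rho are generic textbook values, not the paper's models.
import numpy as np

def pp_reflection(theta1_deg, vp1, rho1, vp2, rho2):
    """Plane-wave acoustic (fluid-fluid) reflection coefficient."""
    theta1 = np.radians(theta1_deg)
    sin_theta2 = vp2 / vp1 * np.sin(theta1)     # Snell's law
    cos_theta2 = np.sqrt(1.0 - sin_theta2**2)   # no critical angle: vp2 < vp1
    z1 = rho1 * vp1 / np.cos(theta1)            # vertical impedances
    z2 = rho2 * vp2 / cos_theta2
    return (z2 - z1) / (z2 + z1)

angles = np.linspace(0, 40, 9)
# Ice over water (approximate values, SI units)
r_ice_water = pp_reflection(angles, vp1=3800, rho1=917, vp2=1450, rho2=1000)
print(r_ice_water)  # negative at near offsets, as the abstract describes
```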
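For the kernel-based clustering method above (Hsieh), here is a minimal sketch of the initialization step only: a pivoted incomplete Cholesky factorization of the kernel matrix yields low-rank features on which ordinary fuzzy c-means can run. The SVM-based core-growing stages of the actual method are omitted, and the kernel choice and parameters are assumptions.

```python
# Sketch: incomplete (pivoted) Cholesky of an RBF kernel matrix, then
# plain fuzzy c-means on the resulting low-rank features. This covers
# only the initialisation step described in the abstract.
import numpy as np

def rbf_kernel(X, gamma=1.0):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def incomplete_cholesky(K, rank, tol=1e-8):
    """Pivoted Cholesky: returns G with K ~= G @ G.T."""
    n = K.shape[0]
    G = np.zeros((n, rank))
    d = np.diag(K).copy()              # residual diagonal
    for j in range(rank):
        i = int(np.argmax(d))          # largest remaining pivot
        if d[i] < tol:
            return G[:, :j]
        G[:, j] = (K[:, i] - G[:, :j] @ G[i, :j]) / np.sqrt(d[i])
        d = d - G[:, j] ** 2
    return G

def fuzzy_c_means(X, c, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))   # fuzzy memberships
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(0)[:, None]
        d2 = ((X[:, None, :] - centers[None]) ** 2).sum(-1) + 1e-12
        U = 1.0 / (d2 ** (1 / (m - 1)))
        U /= U.sum(1, keepdims=True)
    return U.argmax(1)

# Two well-separated toy blobs
X = np.vstack([np.random.randn(50, 2) + mu for mu in ([0, 0], [5, 5])])
G = incomplete_cholesky(rbf_kernel(X, gamma=0.5), rank=10)
labels = fuzzy_c_means(G, c=2)
```

Running FCM on the rows of G approximates clustering with kernel-induced distances at a fraction of the cost of the full n-by-n kernel matrix factorization.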
Deconvolution of femtosecond time-resolved spectroscopy data in multivariate curve resolution. Application to the characterization of ultrafast photo-induced intramolecular proton transfer
JOURNAL OF CHEMOMETRICS, Issue 7-8 2010

Abstract In femtosecond absorption spectroscopy, deconvolution of the measured kinetic traces is still an important issue, as the photochemical processes under study may possess shorter characteristic times than the time resolution of the experiment. In this work, we propose to perform deconvolution of the time-dependent concentration profiles extracted from multivariate curve resolution (MCR) applied to spectrokinetic data. The profiles are fitted with a model function that includes a description of the instrumental response function (IRF) of the experiment. The method combines the potential benefits of soft-modeling data analysis with those of hard-modeling for parameter estimation. The potential of the method is demonstrated by first analyzing five synthetic data sets for which IRFs of different widths are simulated. It is then successfully applied to resolve femtosecond UV-visible transient absorption spectroscopy data investigating the photodynamics of salicylidene aniline, a photochromic molecule of wide interest. Considering a time resolution of 150 fs for the IRF, a characteristic time of 45 fs is recovered for the first step of the photo-induced process, which consists of an ultrafast intramolecular proton transfer. Our results also confirm the existence of an intermediate species with a characteristic time of 240 fs. Copyright © 2010 John Wiley & Sons, Ltd. [source]

Releasing multiply imputed, synthetic public use microdata: an illustration and empirical study
JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES A (STATISTICS IN SOCIETY), Issue 1 2005
Jerome P. Reiter

Summary. The paper presents an illustration and empirical study of releasing multiply imputed, fully synthetic public use microdata. Simulations based on data from the US Current Population Survey are used to evaluate the potential validity of inferences based on fully synthetic data for a variety of descriptive and analytic estimands, to assess the degree of protection of confidentiality that is afforded by fully synthetic data, and to illustrate the specification of synthetic data imputation models. Benefits and limitations of releasing fully synthetic data sets are discussed. [source]

Wavelet-based estimation of a discriminant function
APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, Issue 3 2003
Woojin Chang

Abstract In this paper, we consider wavelet-based binary linear classifiers. Both consistency results and implementational issues are addressed. We show that, under mild assumptions on the design density, wavelet discrimination rules are L2-consistent. The proposed method is illustrated on synthetic data sets in which the 'truth' is known and on an applied discrimination problem from the industrial field. Copyright © 2003 John Wiley & Sons, Ltd. [source]
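To make the hard-modeling step in the chemometrics abstract above concrete: a single-exponential decay convolved with a Gaussian IRF has a closed form that can be fitted to an MCR concentration profile by nonlinear least squares. The sketch below shows only this generic one-component case with illustrative numbers; the paper's kinetic scheme is richer.

```python
# Sketch: exponential decay convolved with a Gaussian IRF, fitted by
# nonlinear least squares -- the generic form of the hard-modeling step
# described in the abstract (the paper's full kinetic scheme is omitted).
import numpy as np
from scipy.special import erfc
from scipy.optimize import curve_fit

def decay_with_irf(t, amp, tau, t0, sigma):
    """Closed form of a step * exp(-(t - t0)/tau) decay convolved with
    a Gaussian IRF of standard deviation sigma."""
    arg = sigma / (np.sqrt(2) * tau) - (t - t0) / (np.sqrt(2) * sigma)
    return 0.5 * amp * np.exp(sigma**2 / (2 * tau**2)
                              - (t - t0) / tau) * erfc(arg)

# Synthetic profile: tau = 240 fs, IRF of 150 fs FWHM (sigma = FWHM/2.355)
t = np.linspace(-300, 1500, 400)                     # time axis in fs
true = decay_with_irf(t, amp=1.0, tau=240, t0=0, sigma=150 / 2.355)
noisy = true + np.random.randn(t.size) * 0.01

popt, _ = curve_fit(decay_with_irf, t, noisy, p0=[1, 100, 10, 50],
                    bounds=([0, 1, -200, 1], [10, 2000, 200, 300]))
print(popt)  # recovered [amp, tau, t0, sigma]
```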
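For the synthetic-microdata paper above (Reiter), inference proceeds through combining rules for fully synthetic data: as usually stated following Raghunathan, Reiter and Rubin (2003), the point estimates are averaged across the m synthetic data sets, and the variance estimate mixes between- and within-set variability. The sketch below transcribes those rules under the assumption that each synthetic data set l yields an estimate q_l with estimated variance u_l.

```python
# Combining rules for fully synthetic data: average the m point
# estimates; the variance estimate mixes between- and within-set
# components. q[l], u[l] = estimate and its variance from data set l.
import numpy as np

def combine_fully_synthetic(q, u):
    q, u = np.asarray(q, float), np.asarray(u, float)
    m = q.size
    q_bar = q.mean()                   # combined point estimate
    b_m = q.var(ddof=1)                # between-set variance
    u_bar = u.mean()                   # average within-set variance
    T = (1 + 1 / m) * b_m - u_bar      # variance attached to q_bar
    T = max(T, 0.0)                    # guard against negative estimates
    return q_bar, T

# e.g. five synthetic data sets, each giving a mean estimate and variance
q_bar, T = combine_fully_synthetic(q=[10.2, 9.8, 10.5, 10.1, 9.9],
                                   u=[0.04, 0.05, 0.04, 0.05, 0.04])
```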
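Finally, a toy version of the wavelet discrimination rule in the last abstract: estimate the discriminant d(x) = p1*f1(x) - p0*f0(x) by its empirical Haar scaling coefficients and classify by the sign of the estimate. Restricting to a Haar basis at a single resolution level is a simplification made here for brevity; the paper's estimator and consistency conditions are more general.

```python
# Toy wavelet discrimination rule: estimate d(x) = p1*f1(x) - p0*f0(x)
# via empirical Haar scaling-function coefficients on [0, 1], classify
# by sign(d_hat). Single resolution level j, for brevity.
import numpy as np

def haar_phi(x, j, k):
    """Haar scaling function 2**(j/2) * 1[k <= 2**j * x < k + 1]."""
    return 2 ** (j / 2) * ((2 ** j * x >= k) & (2 ** j * x < k + 1))

def fit_discriminant(x, y, j=4):
    """y in {-1, +1}; the coefficient c_k = E[y * phi_jk(x)] is the
    k-th Haar coefficient of d, estimated by a sample mean."""
    ks = np.arange(2 ** j)
    coef = np.array([np.mean(y * haar_phi(x, j, k)) for k in ks])
    def rule(xnew):
        d_hat = sum(c * haar_phi(xnew, j, k) for c, k in zip(coef, ks))
        return np.where(d_hat >= 0, 1, -1)
    return rule

# Toy 1-D problem on [0, 1]: class +1 concentrated on the right half
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 2000)
y = np.where(rng.uniform(0, 1, 2000) < 0.2 + 0.6 * (x > 0.5), 1, -1)
rule = fit_discriminant(x, y, j=3)
print((rule(x) == y).mean())  # training accuracy of the Haar rule
```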