Weierstrass Institute for Applied Analysis and Stochastics
We are delighted to be hosting the COMPSTAT conference here at the Humboldt-Universität zu Berlin. For the second time the conference is in Berlin, but for the first time in the former Eastern part of the city: a dream of many IASC members became reality!
More wishes for this 15th COMPSTAT may come true, since this COMPSTAT differs from all the former conferences. There are several main changes:
1. We have only accepted electronic submissions
2. All papers were graded and handled electronically
3. Registration and workflow was entirely web-based
4. We have changed the conference model to a LONG weekend type one
5. We provide the proceedings on-line and on CD
We have undergone this effort to signal to the statistical and computer science community that COMPSTAT and IASC are the number one markets and organizations for conferences and top publication activities in Computational Statistics. You will be given the opportunity to express your opinion in a questionnaire. We had 280 submissions, of which 90 papers were selected for long presentations, 74 as short communications and 63 as posters. Given the high quality of the papers we have also created a CD with the Short Communications. In cooperation with Physica and Springer Verlag we realized the platform. This technique allows the conference visitor and reader of our COMPSTAT proceedings to read the book in parallel on the website www.mdtech.de. From this website, and also from the CD that is enclosed in all conference bags, the reader may rerun examples interactively and follow the computational arguments given in the papers.
This conference was only possible through the combined efforts of a number of different people. First I would like to personally thank the Scientific Programme Committee (SPC), who dedicated so much of their time to selecting and judging the high-quality submissions. I thank them for all of their patience and initiative in working with the new on-line grading system. Their comments and feedback have been extremely valuable and highly appreciated. I would also like to thank the session chairs and discussants, whose participation is vital to the success of our conference.
Many thanks go to the many people who have helped to bring this conference together through numerous hours of planning and organizing. These are my colleagues of the Local Organizing Committee (LOC), who managed the boat trip, the conference dinner with its musical framework, the tour to Potsdam and the internal framework. These are of course also the support staff, the Master's and Ph.D. students who helped in numerous ways to run this COMPSTAT 2002 smoothly. In particular these are Uwe Ziegenhagen and Benjamin Schüler, who were the hands of the LOC and were crucial to the on-line implementations. The willingness of all these people to dedicate their evenings and weekends to this great conference will certainly bear fruit in the future. Finally, I would like to express my appreciation and deep gratitude to our sponsors, without whom this conference would not have been possible.
Berlin, 2nd October 2002
The Scientific Programme Committee (SPC) was responsible for the scientific content of Compstat 2002. It prepared the final list of conference topics and invited speakers, selected contributed papers, short communications and posters from amongst the submitted abstracts, and refereed contributed papers. The SPC consists of:
Prof. Jaromir Antoch - Charles University, Prague
Prof. Adrian Bowman - University of Glasgow, Glasgow
Prof. Michel Delecroix - ENSAI, Bruz
Prof. Wolfgang Härdle - Humboldt-Universität zu Berlin
Prof. Wenceslao Manteiga - Universidad de Santiago de Compostela
Prof. Junji Nakano - Institute of Statistical Mathematics, Tokyo
Prof. Michael Schimek - Karl-Franzens-University, Graz
Prof. Antony Unwin - University of Augsburg, Augsburg
Prof. Peter Van der Heijden - University of Utrecht, Utrecht
The preparation of the conference was only possible through the combined effort of several institutes: the Institute for Statistics and Econometrics of Humboldt-Universität zu Berlin, the Institute for Statistics of Freie Universität Berlin and the Weierstrass Institute for Applied Analysis and Stochastics.
The Local Organizing Committee (LOC) was responsible for the functional organization of Compstat 2002, including the selection of the most suitable locations, preparation of the internet site and conference software, arrangement of the social programme, production and publication of the proceedings volume, organization of the software and book exhibitions, and coordination of the contact between invited speakers, discussants, contributing authors, participants, sponsors, publishers and exhibitors. The LOC consists of:
Dr. Yasemin Boztug
Prof. Dr. Herbert Büning
Prof. Dr. Wolfgang Härdle
Prof. Dr. Lutz Hildebrandt
Dr. Sigbert Klinke
Prof. Dr. Uwe Küchler
Prof. Dr. Bernd Rönz
Dr. Peter Schirmbacher
Dipl. Ing. Benjamin Schüler
Prof. Dr. Vladimir Spokoiny
Prof. Dr. Hans Gerhard Strohe
Prof. Dr. Jürgen Wolters
Cand. econ. Uwe Ziegenhagen
The registration desk in the foyer of the main building (Unter den Linden 6) is open:
Saturday, August 24 from 8:00 until 18:00
Sunday, August 25 from 8:00 until 18:00
Monday, August 26 from 8:00 until 14:00
Tuesday, August 27 from 8:00 until 18:00
Wednesday, August 28 from 8:00 until 13:00
Meeting badges are essential for admission to the meeting venues and to the academic sessions and social events. We therefore ask that you wear your badge at all times.
Accompanying persons are welcome to attend the Welcome Reception in the Sauriersaal at the Museum of Natural History on August 24th.
For the boat tour with the conference dinner on Monday, accompanying persons are asked to buy a ticket. Dinner tickets can be purchased at the conference desk in the main University building. All tickets for the Potsdam tour are sold, but due to cancellations the staff at the conference desk might be able to offer you a ticket.
Drinks will be served in the conference tent in the garden of the university building: coffee, tea, soft drinks and juices.
XploRe is a combination of classical and modern statistical procedures, in conjunction with sophisticated, interactive graphics. XploRe is the basis for statistical analysis, research, and teaching. Its purpose lies in the exploration and analysis of data, as well as in the development of new techniques. The statistical methods of XploRe are provided by various quantlets.
In this workshop we give an overview of the functions and features of XploRe and demonstrate detailed examples of data analysis with XploRe.
During the Computational Statistics Conference 2002 held in Berlin and proudly sponsored by MD*Tech, we are planning to organize the first XploRe User Group (XUG) meeting.
Our planned topics are:
1. Future of XploRe (codename: Ixylon)
2. Suggestions to improve the user interface
3. Suggestions for further statistical methods
4. Requests for comments
5. Open discussion
XML, a markup language for presenting information as semi-structured documents, is often called an "enabling technology". To cite the Oxford English Dictionary: "Enable, v. To supply with the requisite means or opportunities to an end or for an object." But markup languages do not, by themselves, give off-the-shelf answers to questions related to electronic or web publishing or content dissemination. They provide an opportunity: an opportunity to look at the information from a more abstract level. Not as pages, lines, tables or graphics, but as packets with labels. Once information is "tagged", it has been classified into distinct logical elements, independently from any layout or visualization concepts. This turns the information into a reusable miniature library: searchable, retrievable, usable.
This four-hour tutorial will give an introduction to the concepts and languages that stand behind the eXtensible Markup Language. It will show how XML works, how a server-side usage can be implemented, and what the advantages of this technology are in comparison to other standards. Example implementations will go into details of XML usage. They will demonstrate how XML technology can be used for applications in the field of computational statistics.
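The tutorial's idea of information as labelled packets can be illustrated with a small sketch (the document and element names are invented for illustration; Python's standard-library ElementTree stands in for the server-side tooling discussed in the tutorial):

```python
import xml.etree.ElementTree as ET

# A hypothetical conference paper tagged into logical elements,
# independent of any layout or visualization concept.
doc = """
<paper id="cs2002-001">
  <title>Robust Time Series Filtering</title>
  <author>Jane Doe</author>
  <keywords>
    <keyword>robustness</keyword>
    <keyword>time series</keyword>
  </keywords>
</paper>
"""

root = ET.fromstring(doc)
# Because elements carry labels, the "miniature library" is searchable:
title = root.findtext("title")
keywords = [k.text for k in root.iter("keyword")]
print(title)      # Robust Time Series Filtering
print(keywords)   # ['robustness', 'time series']
```

The same tagged document can later be rendered as a web page, a CD index or a print layout without touching the content, which is the separation the tutorial emphasizes.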
Tickets can be purchased at the conference desk.
There will be three prizes:
the Young Authors Prize for the best FULL or SHORT contribution by an author younger than 35 years
the Poster Competition for the best poster presented during the conference
the MD*Tech Software Award sponsored by MD*Tech for the best Software contribution
The winners of the prizes will be presented during the Closing Session on Wednesday.
Access to the internet and email will be available via workstations, kindly sponsored by SUN Microsystems. Furthermore, there are several internet cafés near Humboldt-Universität zu Berlin.
The computers are in the Computer Center (Humboldt-Galerie). Please check the door to the Computer Center for opening hours.
The Humboldt-Universit¨at zu Berlin will not assume any responsibility for accident, loss or damage, or for delays or modifications in the programme, caused by unforeseen circumstances. We will not assume indemnities requested by contractors or participants in the case of cancellation of the Meeting due to unforeseen circumstances.
Official opening hours of banks in Germany vary. Some exchange offices located at the larger train stations (Friedrichstrasse and Alexanderplatz) are open on weekends. It is also possible to change foreign currency into Euro in many hotels, but for a higher transaction fee.
Electric sockets in Germany carry 220V/50Hz and conform to the standard continental type. Travel adaptors may be useful for electric appliances with other standards and can be bought, e.g., in the Saturn department store at Alexanderplatz.
At the bigger shopping centers and malls, shops are open until 8 p.m., on Saturdays until 4 p.m. Other shops close at 6:30 p.m. Generally, shops are not closed for lunch breaks. The high temple of the consumer religion is the KaDeWe department store on Wittenbergplatz (U-Bahn station), the largest department store in Europe. The newly opened complex on the Potsdamer Platz offers more than 100 new shops and a multitude of restaurants, cafés, and pubs.
Each weekend many flea markets invite you to look for great bargains. The most popular one is the "Kunst- und Trödelmarkt" on Straße des 17. Juni, but there are also stalls along the Kupfergraben near Museum Island and Humboldt-Universität, with art objects, books and records.
Funk-Taxi Berlin: (030) 26 10 26
Spree-Funk: (030) 44 33 22
In Berlin pharmacies can be found all over the town. For overnight service there are always one or two pharmacies open in every district. All pharmacies post signs directing you to the nearest open one.
Hotline: 0 11 89
(English speakers available at all numbers)
Police: 110
Fire Brigade and Ambulance: 112
German Red Cross: (030) 85 00 55
Ambulance: (030) 31 00 31
After-hour doctor: (030) 31 00 33 99
The boat trip through Berlin will take place on the afternoon of August 26th 2002.
We will board the "MS Brandenburg" at the "Berliner Dom", which is near Humboldt-Universität, and start the cruise in the historical center of Berlin: Museumsinsel, theatre quarter, station Friedrichstrasse, the former border between the Eastern and the Western part of Berlin. After visiting the Reichstag (parliament), the seat of the German Chancellor and the construction site for the new Central Station "Lehrter Stadtbahnhof", we turn around at the Humboldt harbor and go back to the historical center, where we pass the St. Nicolas quarter, the Alexanderplatz and the Berlin Town Hall.
After passing the remains of the Berlin Wall at the East Side Gallery and the industrial districts of East Berlin, we cruise to Köpenick, where Berlin's green heart beats, and see the beautiful castle.
We visit the big and the small Müggelsee, take a look at "Neu Venedig" and, after rounding the "Müggelberge" mountains, head towards the Marriott Courtyard Hotel in Köpenick for an excellent dinner.
The Brandenburg state capital used to be the residence of Prussian kings and German Kaisers before World War I. That is why the city is surrounded by picturesque parks and palaces. One of the smaller palaces, which is nevertheless a masterpiece of German Rococo architecture, is Sanssouci. It was built for King Frederick II (the Great) by his friend and architect Georg Wenzeslaus von Knobelsdorff around 1746. The palace gave its name to the huge Sanssouci park, which includes several other palaces, original buildings and hundreds of sculptures and monuments. Besides the Sanssouci park, you will visit the historical city centre with the so-called Dutch Quarter and the classicist St. Nicolai church, which was built by the most creative Prussian architect, Karl Friedrich Schinkel.
Furthermore, you will have a walk in the park "Neuer Garten" with the comparatively new palace Cecilienhof, built for the last imperial crown prince, Wilhelm von Preußen, and his wife, Duchess Cecilie von Mecklenburg-Schwerin. Here, the leaders of the victorious powers held the Potsdam Conference after World War II in order to re-structure post-war Central Europe. The bus tour to Potsdam starts and ends in front of the main building of the Humboldt-Universität zu Berlin.
Brandenburger Tor, Unter den Linden, Friedrichstraße, Alexanderplatz ... you could continue the enumeration of first-class sights in Berlin's old and new centre endlessly; an excursion through this historical as well as lively district belongs to every Berlin tour. In the avenues Unter den Linden and Karl-Liebknecht-Straße every building has its own story to tell: the government and embassy buildings, Staatsoper, Komische Oper (national and comic opera), Maxim-Gorki-Theater at the Lustgarten park, the Humboldt University, Museum Island with exhibitions of worldwide rank, Berlin's cathedral, the former "Palast der Republik" which replaced the demolished city palace, the Zeughaus, the Neue Wache, to name only the most significant attractions.
The Brandenburger Tor, symbol of Berlin and of German separation and reunification, is surely the city's most famous building. In one of its wings you will find an office of Berlin's tourist information. Alexanderplatz square, with its coolish but impressive tower blocks, is surmounted by the 365-meter-high Fernsehturm (television tower), the city's highest building.
The shopping and strolling avenue Friedrichstraße heads south towards the former border crossing Checkpoint Charlie (see Kreuzberg) and north towards Oranienburger Straße. This former Jewish quarter has developed a vital clubbing scene that is overtopped by the New Synagogue's golden dome. Exclusive and eccentric shops, chic cocktail bars and scruffy backyard romance meet in and around the Hackesche Höfe. Around Schiffbauerdamm, near the Deutsches Theater, the Charité clinic and the new government quarter, you will often meet prominent politicians.
If you prefer to avoid turbulence, you might wish to visit the "Dorotheenstädtischer Friedhof" cemetery, where outstanding personalities like Hegel, Brecht or the architect Schinkel are buried.
Mitte has a lot to offer south of Unter den Linden / Karl-Liebknecht-Straße, too: next to the majestic Rotes Rathaus townhall, the Nikolaiviertel quarter has preserved the charm of a small 18th-century town. Not far from the conspicuous dome of St. Hedwig's cathedral you will encounter one of Europe's most beautiful squares: the Schinkel-designed Gendarmenmarkt, with the Schauspielhaus theatre and concert hall and the German and French cathedrals. Between the districts of Mitte and Tiergarten spreads Potsdamer Platz, one of Berlin's centres.
Berlin has three different fare zones (A, B, C):
Zone A: This is the area within the Berlin urban rail (S-Bahn) ring line.
Zone B: The area outside the ring up to the city border.
Zone C: The area surrounding Berlin (3 honeycombs). This sub-area is divided into 8 parts, each belonging to an administrative district. The Potsdam-Mittelmark area is included in the city district of Potsdam.
With the AB ticket, you can always be sure of having the right fare when travelling in Berlin. Single fare tickets are valid for 2 hours, whereas short trip tickets can be used for at most three stops.
For further information about tickets, timetables, stops, etc. you can look at: http:
Prof. Dr. Hans Jürgen Prömel
Vice President for Research of Humboldt-Universität zu Berlin
Prof. Dr. Wolfgang Härdle
Professor for Statistics and Head of SPC & LOC
13:30 An implementation for regression quantile estimation
13:30 A Hotelling Test based on MCD
14:00 Data Depth and Quality Control
Giovanni C. Porzio (cancelled)
14:15 Experiments of robust ESACF identification of ARIMA models
14:30 Analyzing data with robust multivariate methods and diagnostic plots
13:30 A Selfdocumenting Programming Environment for Weighting
14:00 Analysing large data sets: is aggregation necessarily bad?
14:15 Statistical tests for pruning association rules
14:30 Blockmodeling techniques for Web Mining
13:30 An index of dissimilarity among time series: an application to the inflation rates of the EU countries
13:45 An algorithm to estimate time varying parameter SURE models under different type of restrictions
14:15 Collinearity Diagnostics: VIF’s Revisited
14:30 ModelBuilder - an automated general-to-specific modelling tool
14:15 Computational Methods for Time Series Analysis
15:30 skewness and fat tails in discrete choice models
16:00 Structural Equation Models for Finite Mixtures
16:30 Unobserved heterogeneity in store choice models
Ignacio Rodriguez-del-Bosque (cancelled)
17:00 Long range dependence in spatial processes
María del Pilar Frias
17:15 Overdispersed integer valued autoregressive models
15:30 Canonical variates for Recursive Partitioning in Data Mining
16:00 Interactive Graphics for Data Mining
Daniela Di Benedetto
16:30 Factorial Correspondence Analysis : A dual approach for semantics and indexing
16:45 Research Issues in E-Commerce Mining
17:00 Data Driven Approach To Market Data Visualization and Segmentation: an application on Yugoslav market
15:30 Fast and robust filtering of time series with trends
16:00 Robust Estimation with Discrete Explanatory Variables
16:30 Robust Principal Components Regression
17:00 Robust Learning Process in Neural Network with applications in
17:15 The medcouple, a robust estimator of skewness
16:30 Algorithmic and Computational Procedures for a Markov Model in
Juan Eloy Ruiz-Castro
17:00 Application of ”aggregated classifiers” in survival time studies
09:00 Forecasting PC-ARIMA models for functional data
Mariano J. Valderrama
09:00 The Forward Search
09:30 Computational connections between robust multivariate analysis and clustering
10:00 Weights and Fragments
09:00 A Wildlife Simulation Package (WiSP)
09:30 Managing biological knowledge using text clustering and feature extraction
09:45 Optimal sampling with quantization applied to a pharmacokinetic problem
10:00 Bagging tree classifiers for Glaucoma diagnosis
09:00 Detection of locally stationary segments in time series - algorithms and applications.
09:30 A Time Series construction of an alert threshold using resampling methods
Andres M. Alonso
09:45 Statistical Temporal Rules
10:00 Pattern Recognition of Time Series using Wavelets
09:45 Iterating m out of n Bootstrap under Nonregular Smooth Function
Kar Yee Cheung
10:00 A re-sampling approach to cluster validation
11:00 Bootstrapping Threshold Autoregressive Models
11:30 Evaluating the GPH estimator via bootstrap technique
11:00 Robust Time Series Analysis Through the Forward Search
11:30 Using the Forward Library in S-Plus
11:00 Time Series Modelling using mobile communications and Broadband Internet
11:30 Missing values resampling for Time Series
Andres M. Alonso
11:00 Sample Size Computations for Tolerance Region Estimation
11:15 Supervised Clustering of Genes
11:30 Two-Stage Screening With a Limited Number of Samples in the
11:45 Exploring high-dimensional dynamic data via dimension reduction
14:15 KyPlot as a tool for graphical data analysis
14:15 CAnoVaQc : a Software for Causal Modeling
Marc A. Müller
14:45 Java Applet for Asymmetrical Face Plots
15:00 Exact Nonparametrical Inference in R
14:15 Statistical inference for a Robust Measure of Multiple Correlation
14:45 Understanding influential observations in regression estimation of finite population means using graphs.
Maria Giovanna Ranalli (cancelled)
15:00 Variance stabilization and robust normalization for microarray gene expression data
Anja von Heydebreck
15:30 White test for the least weighted squares
Jan Amos Visek
14:15 An algorithm for the construction of experimental designs with fixed and random blocks
14:45 Optimal cross-over designs for mixed models
15:00 Comparison of Nested Simulated Annealing and Reactive Tabu Search for Efficient Experimental Designs with Correlated Data
Neil Coombes
15:00 Bayesian Semiparametric Seemingly Unrelated Regression
15:30 Data extraction from dense 3D surface models
16:30 Standardized Partition Spaces
17:00 Cross-Selection Exchange Algorithm for Constructing Supersaturated Designs
17:15 Design Approach for the Exponential Regression Model
17:30 Construction of T-optimum designs for multiresponse dynamic models
16:30 Representing Knowledge in the Statistical System Jasp
17:00 GENESEES v1.0 - GENEralised software for Sampling Estimates and Errors in Surveys
17:15 A prototype of GENEralised software for Sampling Estimates and Errors in Surveys (GENESEES) in presence of unit non-response
Fabrizio Solari
16:30 Exploring the structure of regression surfaces by using SiZer map for additive models
Rocío Raya Miranda
17:00 Locally adaptive function for categorical regression models
17:30 Unbiased Partial Spline Fitting under Autoregressive Errors
Michael G. Schimek
16:30 The MISSION Client: Navigating Ontology Information for Statistical Query Formulation and Publication in Distributed Statistical Information Systems
17:00 Data representation system intended for statistical education
17:15 Parallel and Distributed Computing in a Java Based Statistical
17:30 Statistical computing on Web Browsers with the dynamic link library
09:00 Supervised Learning from Micro-Array Data
09:45 Development of a Framework for Analyzing Process Monitoring Data with Applications to Semiconductor Manufacturing Process
Yeo-Hun Yoon
10:15 A 2 step experimental method for the analysis of multiple tables of categorical variables
Juan Ignacio Modroño
10:30 A computer and statistics analysis of the 2000 United States presidential election
09:45 Trans-Dimensional Markov Chains and their Applications in Statistics
09:45 Testing for simplification in spatial models
10:15 A study on the spatial distribution of convenience stores in the Tokyo metropolitan area
10:30 Likelihood Based Parameter Estimation for Gaussian Markov Random Fields
09:45 Sweave: Dynamic generation of statistical reports using literate data analysis
10:15 JMulTi - A toolset for analysing multiple time series
10:30 NP-SIZE: A program to compute power and sample size for nonparametric tests
Daniele De Martini
09:45 Data Compression and Selection of Variables, with Respect to Exact Inference
10:15 Fractional Exponential Models: a multi-step ahead frequency domain criterion
10:30 Model Selection in Neural Network Regression with Dependent
Michele La Rocca
11:15 Interactive Exploratory Analysis of Spatio-Temporal Data
11:15 Comparing Two Partitions : Some Proposals and Experiments
11:15 Estimating the number of unused words that Shakespeare knew
11:30 The development of a Web-based Rainfall Atlas for Southern Africa
11:15 Statistical software VASMM for variable selection in multivariate methods
11:15 Data augmentation algorithm in the analysis of contingency tables with misclassification
11:30 Model-based Classification of Large Data Sets
Tickets may be purchased at the conference desk; for the starting point please see the map of the "inner city".
09:00 Mice and Elephants Visualization of Internet Traffic
J. S. Marron
09:00 Imputation of Continuous Variables Missing at Random using a Simulated Scores Estimator
09:30 Fitting factor models for ranking data
09:00 Clockwise bivariate boxplots
09:30 Clustering Graphics
09:45 Variability of Nominal Data: Measurement and Graphical Presentation
10:00 Visual Exploration of Simple Association Models with Doubledecker & Mosaic Plots
09:00 Computer intensive methods for mixed-effects models
09:30 Decomposition of a Probability Table into a Mixture of Permuted Discretized Binormal Tables
09:45 Parameters Estimation of Block Mixture Models
09:00 Induction of Association Rules: Apriori Implementation
09:30 Rough Sets and Association Rules - which is efficient?
09:45 Functional Principal Component of the intensity of a Doubly Stochastic Poisson Process
Paula R. Bouzas
10:15 Forced Classification in Principal Component Analysis
10:30 An asymptotic non-null distribution of Hotelling T-square statistics under the elliptical distributions
09:45 Application of Hopfield-like Neural Networks to Nonlinear Factorization
10:15 Logistic Discriminant Analysis and Probabilistic Neural Networks in estimating erosion risk
10:30 Neural networks and air pollution control system
Belén Fernández de Ca
10:15 Maximum likelihood estimation of a CAPM with time-varying beta
Giovanni De Luca
10:30 Tests for covariance stationarity and white noise, with an application to Euro/US Dollar exchange rate
10:15 Least squares reconstruction of binary images using eigenvalue optimization
11:15 A conceptual approach to edit and imputation in repeated surveys
11:30 Measuring Electronic Commerce by Non Symmetrical Association
11:45 Semiparametric Regression Analysis under Imputation for Missing
11:15 MCMC model for estimation poverty risk factors using household budget data
11:45 Choosing the order of a hidden Markov chain through cross-validated likelihood
12:00 Connected Markov chain over three way contingency tables with fixed two-dimensional marginals
11:15 e-stat: A web-based learning environment in applied statistics
11:45 EMILeA-stat: Structural and didactic features for teaching statistics through an internet-based multi-medial environment
12:15 e-stat: Automatic evaluation of exercises and tests
12:45 e-stat: Basic Stochastic Finance At School Level
11:15 Combining Graphical Models and Principal Component Analysis for Statistical Process Control
11:45 Detection of outliers in multivariate data
Carla Santos Pereira
12:15 Output-sensitive computation of the multivariate ECDF and related problems
Carlos M. Fonseca
12:30 Model-free Prediction Method for Multivariate Data
12:45 Improved Fitting of Constrained Multivariate Regression Models using Automatic Differentiation
12:00 A State Space Model for Non-Stationary Functional Data
12:30 A new non-linear partial least squares modelling based on a full inner relation
12:45 Clusterwise PLS regression on a stochastic process
13:00 Effect of analysis of pair-matched data using unconditional logistic regression model
Mohamed Ahmed Moussa
13:15 Regression using core functions
12:15 Metadata-bound Data Combination in Database Federations
12:30 MetaNet:Towards an integrated view of statistical metadata
13:00 On the Use of Metadata in Statistical Computing
14:30 Relativity and Resolution for High Dimensional Information Visualization with Generalized Association Plots (GAP)
14:30 A Bayesian Model for Compositional Data Analysis
15:00 Accurate Signal Estimation Near Discontinuities Using Bayesian
15:15 A Comparison of Marginal Likelihood Computation Methods
14:30 e-stat: Development of a Scenario for Statistics in Chemical Engineering
15:00 XQC: A User Interface for Statistical Computing via the Internet
15:15 e-stat: Web-based learning and teaching of statistics in secondary schools
15:15 Missing Data Incremental Imputation through Tree Based Methods
15:45 Hyper-rectangular Space Partitioning Trees: a few insights
Isabelle De Macq
15:45 Data quality and confidentiality issues in large surveys
16:30 Bayesian automatic parameter estimation of Threshold Autoregressive (TAR) models using Markov chain Monte Carlo
17:00 Maneuvering target tracking by using particle filter method with model switching structure
17:30 On the use of particle filters for Bayesian image restoration
16:30 Intelligent Web-Based Training (I-WBT) in Applied Statistics
17:00 MD*Book online & e-stat : Generating e-stat modules from LaTeX
Sigbert Klinke (cancelled)
17:30 XQS/MD*Crypt as Means of Education and Computation
16:30 Growing and Visualizing Prediction Paths Trees in Market Basket
17:00 Different ways to see a tree - KLIMT
17:30 Optimally trained regression trees and Occam’s razor
09:00 Teaching Experiences with CyberStats and other Electronic Text- books
09:00 Evolutionary algorithms with competing heuristics in computa- tional statistics
09:30 Algorithms for CM- and S-estimates
09:45 Parallel Algorithms for Inference in Spatial Gaussian Models
10:15 Calculation of Partial Spline Estimates Using Orthogonal Transformations
09:15 Nonparametric Components in Multivariate Discrete Choice Models
09:30 On the estimation of functional logistic regression
09:45 Classification based on the support vector machine, regression depth, and discriminant analysis
10:15 Statistical optimization in the computer age
Closing Session with presentation of awards.
Computational Finance is a satellite conference of Compstat 2002. It is organized by Dr. Helmut Herwartz and the Sonderforschungsbereich 373.
14:00 On explicit option pricing in OU-type and related stochastic volatility models
Robert G. Tompkins
14:35 A functional principal components approach to modelling implied volatilities
14:55 Efficient price caching for derivatives pricing
15:15 Multiscale estimation of processes related to the fractional Black-Scholes equation
16:00 Open issues in the Monte-Carlo simulation of credit portfolios
Hans G. Lotter
16:35 Assessing the quality of VaR forecasts
16:55 Comparison of Fourier inversion methods for delta-gamma approximations to VaR
17:15 Two-step generalized maximum entropy estimation of dynamic programming models with sample selection bias
Rosa Bernardini Papalia
17:35 Spreadsheet-based option pricing under stochastic volatility
09:00 Risk Analysis for Large Diversified Portfolios
10:00 Evaluation of out-of-sample probability density forecasts with applications to stock prices
10:35 An algorithm to detect multiple change-points in the residual variance of financial time series
10:55 Exchange rate returns and time varying correlations
11:15 Likelihood inferences in GARCH-BL models
Giuseppe Storti, Cosimo Vitale
11:35 Online forecasting of house prices with the Kalman filter
14:00 Portfolio optimization under credit risk
14:35 Fitting a normal-Pareto distribution to the residuals of financial data
14:55 Credit contagion and aggregate losses
15:30 A PDE Based Implementation of the Hull&White Model for Cashflow Derivatives
16:05 Testing the Diffusion Coefficient
16:25 Using spreadsheets for financial statistics
16:45 Implied trinomial trees and their implementations with XploRe
17:05 Nonparametric Moment Estimation for Short Term Interest Rate
Robert G. Tompkins firstname.lastname@example.org, Vienna University of Technology
Friedrich Hubalek Vienna University of Technology
Given that the traded prices for options are inconsistent with the assumptions of Geometric Brownian Motion and constant volatility, a number of authors have considered alternative modelling approaches to Black and Scholes (1973). An extremely popular approach has been to include stochastic volatility when pricing options. Heston (1993), among others, has derived a closed form solution for such a model. Given that the price process is not necessarily continuous, one approach has been to include jumps in the underlying asset price process. The first to consider this was Merton (1976) in a jump diffusion context. Recently, Bates (1996, 2000) and Barndorff-Nielsen and Shephard (2001) extended this approach to consider models incorporating non-normal price processes (mimicking jumps) with subordinated stochastic volatility processes.
In this paper, we consider closed form solutions for a model drawn from the general framework suggested by Barndorff-Nielsen and Shephard (2001). This model assumes that an Ornstein-Uhlenbeck process describes the volatility, with disturbances following a generalised hyperbolic distribution. As with the Madan, Carr and Chang (1998) Variance Gamma process, the variance of this model follows a Gamma distribution. We also consider alternative generalised hyperbolic distributions, including the inverse Gaussian distribution.
We provide a derivation of an explicit option pricing formula for European options which preserves the (minimal) martingale measures and relies upon Laplace inversion methods. Simulations show that this model can address the return skewness and strike-price biases in the Black and Scholes (1973) model. Empirical tests of the model for the S&P 500 and British Pound/US Dollar futures and options markets reveal differences in the statistical and risk neutral density functions.
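Since the Black and Scholes (1973) formula is the benchmark that the stochastic-volatility models in this session extend, a minimal self-contained sketch may be useful for reference; the parameter values below are illustrative, not taken from the paper.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes (1973) European call price -- the constant-volatility
    benchmark generalised by the stochastic-volatility models above."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# At-the-money example: S = K = 100, one year, 5% rate, 20% volatility.
price = bs_call(100.0, 100.0, 1.0, 0.05, 0.2)
```

The stochastic-volatility and jump models discussed above replace the constant `sigma` by a latent process, which is what destroys this simple closed form and motivates the Laplace inversion techniques of the paper.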
Matthias Fengler, Humboldt-Universität zu Berlin
In this paper we propose a functional principal components approach to modelling implied volatilities. In a first step we regard the realizations of implied volatilities of a given maturity as random functions and develop a functional principal components approach. We provide stability analysis of the principal components and hypothesis testing for the principal functions. Secondly, we show how this can be used to model the several maturity strings jointly. Our results indicate that this may be a superior modelling approach compared to the linear specifications proposed in earlier literature.
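The first step above, functional PCA on discretised curves, can be sketched as follows; the simulated smile curves, the moneyness grid, and the two-factor data-generating process are hypothetical stand-ins for the implied-volatility data used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 250 daily implied-volatility curves, each observed on a
# common grid of 20 moneyness points (a discretised stand-in for the
# random functions considered in the paper).
grid = np.linspace(0.8, 1.2, 20)
level = 0.2 + 0.02 * rng.standard_normal((250, 1))          # random level shifts
skew = 0.05 * rng.standard_normal((250, 1)) * (grid - 1.0)  # random skew component
curves = level + skew

# Functional PCA on the discretised curves: centre, then eigendecompose the
# empirical covariance operator (here simply the sample covariance matrix).
mean_curve = curves.mean(axis=0)
centred = curves - mean_curve
cov = centred.T @ centred / len(curves)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Share of variance explained by the first two principal functions.
explained = eigvals[:2].sum() / eigvals.sum()
```

Because the simulated curves are driven by exactly two random factors (level and skew), the first two principal functions capture essentially all the variation; on real smile data the decay of `eigvals` is what the paper's stability analysis examines.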
Dr. Peter Schwendner, Sal. Oppenheim; Dr. Bernd Engelmann, Deutsche Bundesbank; Sören Gerlach, shift, Software für trading und risk
Market makers for derivative products have to quote the prices of thousands of instruments continuously during the trading day while market parameters (underlying prices, implied volatilities) are fluctuating. Being able to quickly recalculate the product prices is essential for market makers to prevent trading losses, because their contributed prices are watched all day by competitors and other traders searching for arbitrage opportunities. Many papers in the standard computational finance literature answer the question of how a single option can be priced a single time, quickly and precisely. Our paper discusses several methods to price many options very often, with slightly different market parameters, throughout the trading day. The basic idea is to calculate a large number of prices for each instrument on a market parameter grid during the night and to store the results in appropriate matrices (price caches). During the trading day, the precalculated prices are only read from the price caches. The different methods vary in the choice of coordinates and pricing models, the design of the price caches and the interpolation schemes that are applied to the price data. Numerical examples are given.
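The night-time precomputation and intraday cache lookup described above can be sketched as follows. The grid ranges, the use of Black-Scholes as the pricing model, and bilinear interpolation are illustrative choices, not the paper's actual design.

```python
import numpy as np
from math import log, sqrt, exp, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    # Stand-in pricing model for the expensive night-time computation.
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# --- Night-time step: fill a price cache on a (spot, volatility) grid. ---
K, T, r = 100.0, 0.5, 0.03
spots = np.linspace(80.0, 120.0, 81)
vols = np.linspace(0.10, 0.40, 31)
cache = np.array([[bs_call(S, K, T, r, v) for v in vols] for S in spots])

# --- Intraday step: only read the cache, with bilinear interpolation. ---
def cached_price(S, v):
    i = int(np.clip(np.searchsorted(spots, S) - 1, 0, len(spots) - 2))
    j = int(np.clip(np.searchsorted(vols, v) - 1, 0, len(vols) - 2))
    ws = (S - spots[i]) / (spots[i + 1] - spots[i])
    wv = (v - vols[j]) / (vols[j + 1] - vols[j])
    return ((1 - ws) * (1 - wv) * cache[i, j] + ws * (1 - wv) * cache[i + 1, j]
            + (1 - ws) * wv * cache[i, j + 1] + ws * wv * cache[i + 1, j + 1])

approx = cached_price(101.3, 0.213)          # fast intraday lookup
exact = bs_call(101.3, K, T, r, 0.213)       # what the full model would give
```

On this grid the interpolation error is far below a typical tick size, which is the trade-off the paper studies when choosing grid coordinates and interpolation schemes.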
Rosaura Fernandez-Pascual, University of Jaen, Spain; M.D. Ruiz-Medina, University of Granada, Spain; J.M. Angulo, University of Granada, Spain
The Black-Scholes theory plays an important role in the derivation of volatility estimates (see Barles and Soner, 1998; Dempster and Richards, 2000; Pechtl, 1999; Smith, 2000, among others). Different generalizations of the Black-Scholes equation have been studied, involving, for example, stochastic volatility (see Bates, 1996; Ghysels, Harvey and Renault, 1996; Hobson and Rogers, 1998; Lewis, 2000), fractional derivatives (see Wang, Qiu and Ren, 2001; Wyss, 2000), fractal and multifractal processes (see Heyde, 1999; Mandelbrot, 1997), etc. In particular, in the case where fractional derivatives are considered, the Black-Scholes model is defined in terms of fractional integrated white noise when the derivatives are of order larger than or equal to 0.5, and in terms of fractional Brownian motion when the derivatives are of order less than 0.5. In this paper we address the filtering and prediction problems associated with these models using the theory of fractional generalized random fields (see Angulo, Ruiz-Medina and Anh, 2000; Ruiz-Medina, Angulo and Anh, 2001; Ruiz-Medina, Anh and Angulo, 2001).
Ludger Overbeck, Deutsche Bank
- Basic features of credit portfolio models
- Extreme quantiles of a simulated loss distribution
- Correlations/Joint occurrence of rare events
- Capital allocation based on Monte-Carlo-Simulations
- Importance sampling in the underlying factor model
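The last point, importance sampling in the underlying factor model, can be illustrated with a one-factor Gaussian sketch in which the systematic factor is shifted into the loss region and the estimates are reweighted by the likelihood ratio. All parameter values here are hypothetical and the model is a generic one-factor setup, not the talk's specific portfolio model.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# One-factor Gaussian model: obligor i defaults when
#   sqrt(rho) * Z + sqrt(1 - rho) * eps_i < norm.ppf(p).
n_obligors, p, rho = 100, 0.01, 0.2
threshold = norm.ppf(p)
loss_level = 10  # tail event: at least 10 of 100 obligors default

def estimate(n_sims, shift):
    # Sample the systematic factor Z from N(shift, 1) instead of N(0, 1)
    # and reweight with the likelihood ratio exp(-shift*Z + shift^2/2).
    z = shift + rng.standard_normal(n_sims)
    weights = np.exp(-shift * z + 0.5 * shift**2)
    # Conditional default probability given Z; defaults are independent
    # given Z, so the loss count is binomial.
    p_cond = norm.cdf((threshold - np.sqrt(rho) * z) / np.sqrt(1 - rho))
    losses = rng.binomial(n_obligors, p_cond)
    return float(np.mean(weights * (losses >= loss_level)))

plain = estimate(200_000, 0.0)    # crude Monte Carlo
tilted = estimate(200_000, -2.5)  # factor shifted into the loss tail
```

Both estimators are unbiased for the same tail probability; the shifted one concentrates simulation effort where the rare losses occur, which is the point of importance sampling for extreme quantiles.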
Zdenek Hlavka, Humboldt-Universität zu Berlin; Prof. Dr. Wolfgang Härdle, Humboldt-Universität zu Berlin; Gerhard Stahl Gerhard.Stahl@bakred.bund.de, BAKred
The simplest models for Value-at-Risk prediction are based on the reduction of the dimensionality of the risk-factor space. We briefly discuss various types of models based on this approach. The main focus of the paper is the comparison of the quality of VaR forecasts. For this purpose, we adapt methods for the verification of probability forecasts which are often used in weather forecasting (Murphy and Winkler, 1992). These methods of comparing VaR forecasts are based on the probabilities of fixed intervals rather than on a single quantile of the underlying probability distribution, and thus allow the differences between the considered VaR models to be assessed from yet another point of view.
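The idea of scoring interval probabilities rather than single quantiles can be sketched with a Brier-score comparison in the style of the weather-forecasting literature. The data-generating process and the two competing models below are purely illustrative.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Simulated daily P&L with slowly varying volatility (the "truth").
n = 5000
true_sigma = 1.0 + 0.5 * np.sin(np.arange(n) / 50.0)
pnl = true_sigma * rng.standard_normal(n)

# Each model forecasts the probability that P&L falls below a fixed interval
# endpoint c; model A knows the volatility, model B assumes it constant.
c = -2.0
prob_a = norm.cdf(c / true_sigma)
prob_b = np.full(n, norm.cdf(c))
outcome = (pnl < c).astype(float)

def brier(prob, outcome):
    # Mean squared error of the probability forecasts
    # (a Murphy-Winkler style verification score).
    return float(np.mean((prob - outcome) ** 2))

score_a, score_b = brier(prob_a, outcome), brier(prob_b, outcome)
```

The correctly specified model attains the lower score, and unlike a pure quantile-exceedance count, the score also penalises miscalibrated probabilities inside the interval.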
Stefan Jaschke, Weierstrass Institute for Applied Analysis and Stochastics (WIAS)
Different Fourier inversion techniques are compared in terms of their worst-case behavior in the context of computing Value at Risk in Delta-Gamma-Approximations.
Rosa Bernardini Papalia, Perugia University
Over the last decade a substantial literature has grown on the structural estimation of dynamic discrete choice models based on dynamic programming (DP) (Rust 1994; Hotz, Miller et al. 1992, 1993). These models have found application in many fields such as industrial organization, labor economics, and public finance, among others. However, the computational burden of estimation has been an important constraint, with relevant implications for empirical work. In this paper we are concerned with estimation problems in the presence of corner solutions. In this case, the common approach of estimating the Euler stochastic equations (Hansen and Singleton, 1982) has to be extended to allow for discontinuity and non-differentiability in the utility function. However, marginal conditions of optimality hold only at interior solutions, and a selection bias is introduced when the sub-sample of interior solutions is used. Moreover, not all the structural parameters can be identified exploiting only the structure in the marginal conditions of optimality. This paper presents an estimation approach based on the Maximum Entropy Principle (MEP). The GME formalism represents a new basis for recovering the unknown parameters and the unobserved variables. Following Hotz and Miller (1993), a representation of the conditional choice value functions in terms of choice probabilities is considered. We show how GME can be used to deal with the problem of defining the Euler equation for the corner solution case, and we suggest a specification of the DP model based on the idea of introducing an additional constraint on the conditional choice value functions, which provides a correction for the sample selection bias. The presented two-step GME estimator is more flexible than standard ML and ME estimators. The estimation procedure has the advantage of being consistent with the underlying data generation process and eventually with the restrictions implied by economic theory or by some non-sample information. The GME method of estimation proceeds by specifying certain moments of unknown probability distributions and does not require the specification of the density function of the population. Since we can have a large number of density functions with the same moments, the MEP is used to obtain a unique distribution with these moments. In other words, using the MEP it is possible to indirectly determine the density function of the population. Unlike ML estimators, the GME approach does not require explicit distributional assumptions, performs well with small samples, and can incorporate inequality restrictions.
Christian Hafner Christian.Hafner@electrabel.com, Electrabel
Option pricing under stochastic volatility does not yield, in general, closed form solutions and often relies on time-consuming methods such as Monte Carlo simulation. In practice, this is the most important argument for still using Black-Scholes type formulae. In this talk it is argued that using simple approximations of the Hull and White type, one can generate closed form expressions of sufficient accuracy for a variety of underlying processes such as GBM and OU. These approximations can conveniently be used in spreadsheet applications, as will be demonstrated in the talk.
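The Hull-White-style mixing idea, averaging Black-Scholes prices over the distribution of the mean integrated variance when volatility is independent of the price shocks, can be sketched as follows. The mean-reverting variance process, its truncation at zero, and all parameter values are illustrative assumptions, not the talk's exact specification.

```python
import numpy as np
from math import log, sqrt, exp, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

rng = np.random.default_rng(2)

def hull_white_price(S, K, T, r, v0, kappa, theta, xi, n_paths=20_000, n_steps=50):
    """Mixing price: with volatility independent of the price shocks, the
    call equals E[ BS(sqrt(mean integrated variance)) ].  The variance here
    follows a simple mean-reverting (OU-type) process truncated at zero --
    an illustrative choice."""
    dt = T / n_steps
    v = np.full(n_paths, v0)
    integrated = np.zeros(n_paths)
    for _ in range(n_steps):
        integrated += v * dt
        v = np.maximum(v + kappa * (theta - v) * dt
                       + xi * sqrt(dt) * rng.standard_normal(n_paths), 1e-8)
    avg_vol = np.sqrt(integrated / T)
    return float(np.mean([bs_call(S, K, T, r, s) for s in avg_vol]))

price_det = hull_white_price(100.0, 100.0, 1.0, 0.05, 0.04, 1.0, 0.04, 0.0)   # no vol-of-vol
price_sv = hull_white_price(100.0, 100.0, 1.0, 0.05, 0.04, 1.0, 0.04, 0.02)  # small vol-of-vol
```

With the vol-of-vol set to zero the mixing price collapses exactly to Black-Scholes, which is the sense in which such approximations stay close to formulae usable in a spreadsheet.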
Eckhard Platen Eckhard.Platen@uts.edu.au, University of Technology Sydney
The measurement of risk for portfolios of large financial institutions plays a crucial role in the future development of financial technologies. Based on a new characterisation of asymptotic portfolios, highly efficient risk measurement methodologies can be derived. These are suitable for portfolios with hundreds or thousands of instruments. Value at Risk analysis can be performed without time-consuming simulations.
Yongmiao Hong Cornell University
A recent important development in time series econometrics and financial econometrics is to forecast probability distributions and to track certain aspects of distribution forecasts such as value at risk to quantify portfolio risk. In this paper, a rigorously founded, generally applicable, omnibus procedure for evaluating out-of-sample probability density forecasts is proposed, using a newly developed generalized spectral approach. This procedure is supplemented with a class of separate inference procedures that focus on various specific aspects of density forecasts to reveal information on possible sources of suboptimal density forecasts. The finite sample performance of the proposed procedures is examined via simulation. Applications to S&P 500 index and IBM daily returns are considered, where a variety of popular density forecast models, including RiskMetrics of J.P. Morgan and various GARCH models, are evaluated using the proposed procedures. Evidence on various deficiencies in these forecast models is documented, suggesting room and directions for further improvements upon them. In particular, it is found that poor modelling for conditional mean is as important as poor modelling for conditional variance and improper specification for innovation distribution. This sheds some light on the existing density forecast practice, which focuses on volatility dynamics and innovation distributions and pays little attention to conditional mean dynamics.
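The principle behind such evaluations, that the probability integral transform (PIT) of a correctly specified density forecast is iid uniform, can be reduced to a small sketch. The GARCH-type simulation and the Kolmogorov-Smirnov check below are simplifications standing in for the paper's generalized spectral procedure.

```python
import numpy as np
from scipy.stats import norm, kstest

rng = np.random.default_rng(3)

# Hypothetical returns from a GARCH(1,1)-type volatility process.
n = 2000
sigma2 = np.empty(n)
returns = np.empty(n)
sigma2[0] = 1.0
returns[0] = rng.standard_normal()
for t in range(1, n):
    sigma2[t] = 0.05 + 0.10 * returns[t - 1] ** 2 + 0.85 * sigma2[t - 1]
    returns[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

# PIT under two candidate density forecasts.
pit_good = norm.cdf(returns / np.sqrt(sigma2))         # correct conditional density
pit_bad = norm.cdf(returns / (1.5 * np.sqrt(sigma2)))  # volatility overstated by 50%

# Under a correct forecast the PITs are iid U(0,1); a simple KS test
# stands in here for the paper's omnibus generalized spectral test.
p_good = kstest(pit_good, "uniform").pvalue
p_bad = kstest(pit_bad, "uniform").pvalue
```

The misspecified forecast is rejected decisively, while the correct one is not; the paper's spectral approach additionally detects serial dependence in the PITs, which a plain KS test ignores.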
Andreas Stadie, University of Göttingen
The cumulative sums of squares (CUSUMSQ) provide a means for detecting change-points in the volatility (variance) of time series. An algorithm for this purpose was proposed by Inclan and Tiao (1994) and an alternative by Diebold, Hahn and Tay (1998). I describe a third algorithm that combines some of the features of the previous two. The results of a simulation experiment to compare the performance of the algorithms are presented. As an example, the algorithm is applied in the following context: a popular method of checking the fit of financial time series models is to examine the probability integral transform of the observations with respect to their forecast distribution. If the model is correct then the transformed values are iid U(0,1) distributed. Applying a second transformation, the inverse standard normal distribution function, produces "pseudo-residuals" that are iid N(0,1) distributed. Applying the above algorithm to the CUSUMSQ of the pseudo-residuals provides a method of detecting periods in which the model fails to predict the observed volatility in the series. This is illustrated for a moving-density model fitted to the daily returns of the Deutsche Bank AG.
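The CUSUMSQ idea behind all three algorithms can be sketched with the Inclan and Tiao (1994) statistic on simulated data; the sample sizes and the single change-point are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def inclan_tiao(x):
    """Centred cumulative sum of squares D_k = C_k/C_T - k/T and the
    Inclan-Tiao (1994) statistic sqrt(T/2) * max|D_k|, compared against
    the asymptotic 5% critical value of about 1.358."""
    T = len(x)
    c = np.cumsum(x**2)
    d = c / c[-1] - np.arange(1, T + 1) / T
    k_hat = int(np.argmax(np.abs(d)))   # most likely change-point location
    return float(np.sqrt(T / 2.0) * np.max(np.abs(d))), k_hat

# Series with a variance change-point at t = 500: sd 1 before, sd 2 after.
x = np.concatenate([rng.standard_normal(500), 2.0 * rng.standard_normal(500)])
stat, k_hat = inclan_tiao(x)
```

Applied iteratively to subsegments, this single-break detector yields the multiple change-point procedures compared in the paper.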
Edy Zahnd, Rue des Parcs 67, 2000 Neuchatel
We will estimate GARCH models without the restriction of constant conditional correlation.
Dr Giuseppe Storti, University of Salerno
In this paper we analyse the statistical properties of the GARCH Bilinear (GARCH-BL) model recently proposed by Storti and Vitale (2001). With respect to the standard GARCH (Bollerslev, 1986), the GARCH-BL model offers two main advantages. First, the model is able to capture leverage effects in the volatility process by including interaction terms between past returns and volatilities in the equation for the conditional variance. Second, a more flexible parametric structure is permitted. Under the assumption of conditional normality, the log-likelihood function of the model can be maximized with respect to the unknown parameters by means of an EM-type algorithm. The procedure is illustrated in the paper. The availability of likelihood inference for this model makes it possible to define a formal testing procedure for verifying the hypothesis of asymmetry in the volatility process (that is, the presence of leverage effects). More precisely, the null hypothesis of no asymmetric effects is tested against the alternative hypothesis of a GARCH-BL model of a predefined order. The actual modelling procedure will be illustrated by means of an application to some financial time series data. From a statistical point of view, the so-called leverage effect, often observed in financial time series, stems from the presence of a negative correlation between past shocks and volatility. Hence, we expect the effect of a negative return on the current volatility to be higher than that associated with a positive return of the same magnitude.
In the last ten years, starting from the seminal paper of Nelson (1991), a substantial amount of research effort has been devoted to the analysis of conditionally heteroskedastic models allowing for similar asymmetric effects. Among others, we recall the work of Glosten, Jagannathan and Runkle (1993) proposing the GJR model, as well as the Generalized Quadratic ARCH (GQARCH) model by Sentana (1995) and the Threshold GARCH (TGARCH) model proposed by Rabemananjara and Zakoian (1993).
We develop a statistical model of house price determination that is well-grounded in economic theory.
The model is cast into state space form, which opens up the application of the Kalman filter to estimate the model and use it for online forecasting. These forecasts combine the model parameters with the house characteristics provided by users through an HTML interface.
Point forecasts and prediction intervals are returned online. These provide useful information for potential buyers and sellers of a house in an otherwise opaque market.
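The forecasting machinery can be sketched with a scalar local-level model and the Kalman recursions; the state-space specification and noise variances below are a minimal stand-in for the paper's house-price model.

```python
import numpy as np

rng = np.random.default_rng(5)

# Local-level model: observed (log) price = latent level + noise,
# where the latent level follows a random walk.
n, q, r = 200, 0.01, 0.25          # q: state noise var, r: observation noise var
level = np.cumsum(np.sqrt(q) * rng.standard_normal(n)) + 12.0
y = level + np.sqrt(r) * rng.standard_normal(n)

# Kalman filter: predict, record the one-step forecast, then update.
a, p = y[0], 1.0                   # state estimate and its variance
forecasts, variances = [], []
for t in range(1, n):
    p_pred = p + q                 # predict step (random-walk transition)
    forecasts.append(a)            # one-step-ahead point forecast of y[t]
    variances.append(p_pred + r)   # forecast variance
    k = p_pred / (p_pred + r)      # Kalman gain
    a = a + k * (y[t] - a)         # update with the new observation
    p = (1 - k) * p_pred

forecasts, variances = np.array(forecasts), np.array(variances)
# 95% prediction interval for the next observation, as returned online.
lower = forecasts - 1.96 * np.sqrt(variances)
upper = forecasts + 1.96 * np.sqrt(variances)
coverage = float(np.mean((y[1:] >= lower) & (y[1:] <= upper)))
```

With a correctly specified model the interval covers roughly 95% of the realised observations, which is exactly the kind of point-plus-interval output delivered through the web interface.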
Based on the models of Hull and White (1990) for the pricing of non-defaultable bonds and Schmid and Zagst (2000) for the pricing of defaultable bonds, we develop a framework for the optimal allocation of assets out of a universe of sovereign bonds with different time to maturity and quality of the issuer. Our methodology can also be applied to other asset classes like corporate bonds. We estimate the model parameters by applying Kalman filtering methods as described in Schmid and Kalemanova (2001). Based on these estimates we apply Monte Carlo simulation techniques to simulate the prices for a given set of bonds for a future time horizon. For each future time step and for each given portfolio composition these scenarios yield distributions of future cash flows and portfolio values. We show how the portfolio composition can be optimized by maximizing the expected final value or return of the portfolio under given constraints like a minimum cash flow per period to cover the liabilities of a company and a maximum tolerated risk. To visualize our methodology we present a case study for a portfolio consisting of German, Italian, and Greek sovereign bonds.
The Normal-Pareto (NP) distribution assumes that, for negative log returns of financial series, the innovations after the fitting of AR-GARCH models have a normal distribution below a threshold value and a generalised Pareto distribution (GPD) above it. This threshold value can be estimated by maximum likelihood estimation (MLE). Monte Carlo simulations of normal as well as heavy-tailed error distributions are used to compare the fit of this distribution with other methods of calculating Value-at-Risk and Expected Shortfall. It is also applied to real time series such as stock exchange data.
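A reduced sketch of the tail half of this construction: fit a GPD to exceedances over a threshold and read off a tail quantile. Here the threshold is fixed at an empirical quantile for simplicity (the paper estimates it by MLE), and the t-distributed innovations are hypothetical.

```python
import numpy as np
from scipy.stats import genpareto, norm

rng = np.random.default_rng(6)

# Hypothetical standardised innovations (e.g. after AR-GARCH filtering)
# from a heavy-tailed distribution; NP fits a normal body and a GPD tail.
losses = rng.standard_t(4, size=20_000)   # negative log returns, t(4) tails

# Fix the threshold at the empirical 90% quantile (illustrative shortcut).
u = float(np.quantile(losses, 0.90))
exceed = losses[losses > u] - u
xi, loc, beta = genpareto.fit(exceed, floc=0.0)   # MLE for shape and scale

# Value-at-Risk at level alpha from the GPD tail, vs. a normal quantile.
alpha = 0.99
n_u = len(exceed) / len(losses)           # fraction of exceedances (about 0.10)
var_gpd = u + (beta / xi) * (((1 - alpha) / n_u) ** (-xi) - 1.0)
var_normal = float(losses.std()) * float(norm.ppf(alpha))
```

For t(4) innovations the fitted shape parameter is positive (heavy tail) and the GPD-based VaR exceeds the normal one, which is the kind of comparison the paper's simulations formalise for VaR and Expected Shortfall.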
Credit contagion refers to the propagation of economic distress from one firm or sovereign government to another. In this paper we model credit contagion phenomena and study the fluctuation of aggregate credit losses on large portfolios of financial positions. The joint dynamics of firms' credit ratings is modeled by a voter process, which is well-known in the theory of interacting particle systems. We clarify the structure of the equilibrium joint rating distribution using ergodic decomposition. We analyze the quantiles of the portfolio loss distribution and in particular their relation to the degree of model risk. After a proper re-scaling taking care of the fat tails of the contagion dynamics, we provide a normal approximation of both the equilibrium rating distribution and the portfolio loss distribution. Using large-deviations techniques, we consider the issue of gains from portfolio diversification.
A new implementation for the one-dimensional Hull&White model is developed. It is motivated by a geometrical approach to constructing an invariant manifold for the future dynamics of forward zero coupon bond prices under the forward martingale measure. This reduces the option-pricing problem for cashflow derivatives to the solution of a series of heat equations. The heat equation is solved by a standard Crank-Nicolson scheme. The new method avoids the calibration used in traditional solution approaches such as trinomial trees. A generalisation of the reduction approach is also presented for some generalised Hull&White models.
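The Crank-Nicolson workhorse referred to above can be sketched in a few lines for the plain heat equation with zero boundary values; the grid sizes and the single-mode test problem are illustrative, not the paper's pricing setup.

```python
import numpy as np

def crank_nicolson_heat(u0, dx, dt, n_steps):
    """Crank-Nicolson scheme for u_t = u_xx with zero boundary values:
    solve (I - dt/2 * A) u_{n+1} = (I + dt/2 * A) u_n at each step,
    where A is the discrete 1-D Laplacian on the interior grid."""
    m = len(u0)
    A = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
         + np.diag(np.ones(m - 1), -1)) / dx**2
    lhs = np.eye(m) - 0.5 * dt * A
    rhs = np.eye(m) + 0.5 * dt * A
    u = u0.copy()
    for _ in range(n_steps):
        u = np.linalg.solve(lhs, rhs @ u)
    return u

# Single Fourier mode on [0, 1]: the exact solution decays as exp(-pi^2 t).
m = 99
x = np.linspace(0.0, 1.0, m + 2)[1:-1]            # interior points, dx = 1/100
u0 = np.sin(np.pi * x)
u = crank_nicolson_heat(u0, 0.01, 0.001, 100)     # integrate to t = 0.1
exact = np.exp(-np.pi**2 * 0.1) * u0
```

The scheme is unconditionally stable and second-order accurate in both time and space, which is why it is a standard choice once the pricing problem has been reduced to heat equations.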
In mathematical finance diffusion models are widely used, and a variety of different parametric models for the drift and diffusion coefficient coexist in the literature. Since derivative prices depend on the particular parametric model of the diffusion coefficient function, a misspecification of this function leads to misspecified option prices. We develop two tests about a parametric form of the diffusion coefficient. The finite sample properties of the tests are investigated in a simulation study, and the tests are applied to the 7-day Eurodollar rate, the DAX and the NEMAX stock market indices. For all three processes, we find in the empirical analysis that our tests reject all tested models.
MD*ReX is a client/server based spreadsheet solution designed for statistical analysis. Using Excel's well-known interface, this application provides intuitive access to XploRe's statistical methodology. Circumventing the numerical impreciseness of Excel while exploiting its graphical capabilities is the development rationale for MD*ReX. The design addresses novice users of statistical applications as well as experienced researchers interested in developing their own methods. We will demonstrate the usage of MD*ReX in the context of financial statistics with applications of copula-based Value-at-Risk, methods of quantifying implied volatilities, and non-linear time series analysis.
Option pricing has become one of the most commonly used tools for portfolio management and hedging. The very first approach of Black and Scholes, assuming a constant volatility, has been dismissed as implausible. Especially since the 1987 crash, market-implied Black-Scholes volatilities have shown a negative relationship between implied volatilities and strike prices. This skew structure of the volatilities is commonly called "the volatility smile". Implied multinomial trees are one of several methods that enable pricing of derivative securities consistent with the existing market. There are various ways to construct implied trees. While binomial trees have just the minimal number of parameters needed to fit the smile, multinomial trees have some degrees of freedom. However, any kind of implied tree converges to the same theoretical results. Derman, Kani and Chriss (1996), in the Quantitative Research group of Goldman & Sachs, devised an algorithm for constructing Implied Trinomial Trees from current market option data. They used the additional parameters to conveniently choose the state space of all node prices; only the transition probabilities are constrained by market options. This flexibility can be advantageous in matching trees to smiles. This algorithmic approach is presented in this paper. No new theoretical aspects are added, but the way to implement such a model in XploRe has been shown. This software seems to be very suitable for solving this type of problem. The whole procedure is then illustrated on real market data.
The main feature of options is their forward-looking character. The development of the underlying asset process is uncertain, so the holder of an option has to form expectations of its future value. Researchers try to extract this information from market option prices. There exist various approaches to recovering the state price density at one particular time to maturity: for example, the most famous Black-Scholes model (although researchers have found that this model cannot describe the underlying process precisely), the newer parametric method based on a mixture of two log-normal densities, some nonparametric methods, such as those of Aït-Sahalia (1996) and Rookley (1997), and the implied binomial trees method (see e.g. Rubinstein (1994), Derman and Kani (1994), Barle and Cakici (1998)). In this paper, we estimate the variance, skewness and kurtosis of the state price density function nonparametrically for short term interest rate options, whose movements over time are found to be associated with interventions of the European Central Bank, such as adjustments of the interest rates and interventions in the exchange markets.