This course will be given in period 3, 2001, starting January 26, 2001. Lectures are on Fridays, 9:15-11, in room 1537.

Preliminary information:

The first part of the course material can be obtained from
Course Package
Reading Assignments & Lecture plan

Home Works

Revisions: This time the course will concentrate on what I regard as the core areas: inference, Bayesian statistics, and support vector technology. I will try to interpret popular alternative approaches in these terms. I will also cover some fundamental parts of causality inference, a hitherto rather heretical area that has been made respectable by the pioneering work of Judea Pearl and others. Unlike many courses in this area, this one is entirely free of commercial software and business applications. You will get more out of the course if you take the examination as a project related to your research data, but the form of examination is entirely up to you: a presentation in class, homework, a project, or a combination of these and/or something else! The most difficult part is perhaps defining your personal examination procedure!

Send me an email if you want to participate!

Tentative reading list:

A: General methodology:

  1. Keiichi Noe: Philosophical Aspects of Discovery Science

B: Bayesian methods:

  1. E. T. Jaynes: Probability Theory: The Logic of Science, Ch. 1, 2, 4, 5, 24.
  2. S. Arnborg: Bayesian Data Mining - Part I.
  3. Robins, Wasserman: Conditioning, Likelihood and Coherence: A Review of Some Foundational Concepts. JASA 95 (Dec 2000), 1340-1345.
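The common thread in these readings is Bayes' rule as the engine of inference. As a minimal sketch (not taken from any of the texts above), here is a brute-force grid approximation of the posterior for a coin's bias, checked against the exact conjugate answer; the data (7 heads in 10 flips) and the uniform prior are arbitrary illustrations:

```python
# Posterior for a coin's bias theta given 7 heads in 10 flips,
# under a uniform prior, by grid approximation over [0, 1].
# The numbers are arbitrary; this only illustrates Bayes' rule.

N = 1000  # number of grid midpoints
thetas = [(i + 0.5) / N for i in range(N)]

heads, tails = 7, 3
# Unnormalized posterior: (constant) prior times binomial likelihood.
weights = [t**heads * (1 - t)**tails for t in thetas]

total = sum(weights)
post_mean = sum(t * w for t, w in zip(thetas, weights)) / total

# With a Beta(1,1) prior the exact posterior is Beta(8,4),
# whose mean is 8/12, so post_mean should be close to 0.6667.
print(post_mean)
```

For this conjugate setup the grid is of course unnecessary, but the same few lines work for any one-dimensional prior and likelihood you can evaluate pointwise.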

C: Markov Chain Monte Carlo

  1. Gilks, Richardson, Spiegelhalter: Introducing Markov Chain Monte Carlo.
  2. Green: MCMC in image analysis.
  3. Mannila, Toivonen, Korhola and Olander: Learning, mining or modeling? A case study from Paleoecology, in Discovery Science, LNCS 1532.
  4. Niclas Bergman: Recursive Bayesian Estimation, PhD Thesis, Linköping University.
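As a concrete reference point for the MCMC readings, here is a minimal random-walk Metropolis sampler for a standard normal target, in plain Python. The proposal width, seed, and sample count are arbitrary choices for illustration, not from any of the texts above:

```python
import math
import random

random.seed(1)

def target(x):
    # Unnormalized standard normal density.
    return math.exp(-0.5 * x * x)

x = 0.0
samples = []
for _ in range(20000):
    # Symmetric random-walk proposal.
    proposal = x + random.uniform(-1.0, 1.0)
    # Metropolis rule: accept with probability min(1, pi(x')/pi(x)).
    if random.random() < target(proposal) / target(x):
        x = proposal
    samples.append(x)  # on rejection the current state is repeated

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
# mean should be near 0 and var near 1
```

Gilks et al. discuss why the chain's draws are correlated and how burn-in and proposal scaling affect the quality of such estimates.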

D: Time series and prediction:

  1. G. L. Bretthorst: Bayesian Spectrum Analysis and Parameter Estimation, Lecture Notes in Statistics 48, Springer-Verlag, Ch. 3-5.
  2. Ch 1 of: Time Series Prediction: Forecasting the Future and Understanding the Past. Weigend, A. S., and N. A. Gershenfeld (Eds.) (1994) Santa Fe Institute Studies in the Sciences of Complexity XV. (Proceedings of the NATO Advanced Research Workshop on Comparative Time Series Analysis, Santa Fe, NM, May 1992.) Reading, MA: Addison-Wesley.
  3. P. Vitanyi, Ming Li: On Prediction by data Compression, in LNCS ?

E: Stochastic Complexity and Classification (unsupervised and supervised)

  1. J. J. Oliver and D. J. Hand, Introduction to Minimum Encoding Inference, [TR 4-94] Dept. Stats. Open Univ. and also TR 94/205 Dept. Comp. Sci. Monash Univ.
  2. G. I. Webb: Further Experimental Evidence against the Utility of Occam's Razor,
    Journal of AI Research 4 (1996), 397-417.
  3. J. J. Oliver and R. A. Baxter, MML and Bayesianism: Similarities and Differences, [TR 94/206]

F: Causality.

  1. Glymour, Cooper: Computation, Causation and Discovery (1999), Ch. 2-3.
  2. Dawid: Causal Inference without Counterfactuals (with Discussion). JASA 95, No. 450 (2000).
  3. N. Friedman, K. Murphy, S. Russell: Learning the Structure of Dynamic Probabilistic Networks, Uncertainty in Artificial Intelligence, 1998.

G: Support vector technology, applications in genomics.

  1. "Theory of SV Machines", CSD-TR-96-17, Royal Holloway, University of London, Egham, UK, 1996.
  2. D. Slonim et al.: Class prediction and discovery using gene expression data. TR, Whitehead Inst., Sept 1999 (see also Golub et al., Science, Oct 15 1999, 531-537).
  3. M. Brown et al.: Knowledge-based analysis of microarray gene expression data by using support vector machines. PNAS 97 (Jan 2000).
  4. Butte et al.: Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks. PNAS 97, No. 22, 12182-12186, Oct 2000.
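The genomics papers above all rest on the same core machinery: a maximum-margin linear classifier trained on labeled vectors. As a toy sketch (the data, step size, and regularization constant are invented for illustration), here is a linear SVM trained by subgradient descent on the regularized hinge loss, Pegasos-style:

```python
# A linear SVM on a tiny separable toy set, trained by subgradient
# descent on lam/2 * |w|^2 + hinge loss. Everything here is an
# arbitrary illustration, not taken from the papers above.

# A constant 1 is appended to each point so the bias lives in w.
data = [((2, 2, 1), +1), ((3, 3, 1), +1), ((2, 3, 1), +1),
        ((0, 0, 1), -1), ((1, 0, 1), -1), ((0, 1, 1), -1)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

w = [0.0, 0.0, 0.0]
lr, lam = 0.1, 0.01  # step size and regularization strength

for _ in range(200):
    for x, y in data:
        # Regularizer subgradient: shrink w toward zero.
        w = [wi * (1 - lr * lam) for wi in w]
        if y * dot(w, x) < 1:  # point violates the margin
            w = [wi + lr * y * xi for wi, xi in zip(w, x)]

# On a separable toy set every point should end up on the
# correct side of the learned hyperplane.
correct = all(y * dot(w, x) > 0 for x, y in data)
print(correct)
```

In the gene-expression setting each vector would instead hold thousands of expression levels per sample, which is where the margin-based regularization these papers rely on earns its keep.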

H: Visualisation of non-geometrical data.

  1. Buja et al. (1996): Interactive High-Dimensional Data Visualization. Journal of Computational and Graphical Statistics, Vol. 5, No. 1.
  2. J. H. Friedman: Exploratory projection pursuit. JASA 82 (1987), 249-266.


Jan 26: Overview, administrative and planning discussions.

Reading: Jaynes, Preface, Ch. 1, Ch. 2 (you may skip the detailed derivations of the product and sum rules; these are also in my report 'Bayesian Data Mining', but newer and better derivations exist), Ch. 4, Ch. 5. The whole set of Jaynes's notes is temporarily in Nada's directory /misc/tcs/datamining/Jayne. I will remove it when the book is out.
Feb 2: Bayesianism, inference. Normative claims.

Reading: A Survey of Bayesian Data Mining, Sections 1-3 (skip what you already read in Jaynes).
Feb 9: Bayesian decision making, alternatives.

Reading: Gilks et al.: Introducing MCMC. (If you have time: Green, MCMC in image analysis.)
Feb 16: Ellsberg's paradox and ambiguity aversion. Nonlinear utility functions. Markov Chain Monte Carlo.

Reading: Green, Cheeseman & Stutz. If you have time: Bergman, Ch. 3, Ch. 6 (includes a summary of what you already read).
Feb 23: MCMC Applications. Prediction and estimation. Particle filters.
March 2: No lecture (sportlov, the Swedish winter sports break).
March 9: Time series mining. Application: Tomas Carlsson.
March 16: Time series, identifiability. Distribution free learning. Support vector technology.
March 23: Causality. Confounders, inference
March 30: TBA.

Some resources:

The Nada /misc directory has afs address /afs/