There will be a repeat of the previous week's lecture outside my room at 9:15 on Fridays.

Reading assignment for Feb 4: Jaynes: preface, Ch. 1 and Ch. 2 (do not follow the 'proof' too closely -- it is the type of explanatory proof typical of theoretical physics), and Ch. 4 and 5. See the folder 'Course Package (kurspaket)'. Take a look at Exercises 2.1 and 2.2.

The whole set of Jaynes notes is temporarily in Nada's directory /misc/tcs/datamining/Jayne. It will be removed when the book is published.

Lecture on Friday Jan 28, 15:15.
Topic: Principles of Bayesian Statistics: Priors, Likelihoods,
posteriors, decision analysis and estimates, with examples.

Friday Feb 4: Popular models: generalization of the Beta to the Dirichlet,
graphical models and HMMs, Bayesian regression and feature selection.
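As an illustrative sketch of the Beta-to-Dirichlet generalization mentioned above (not lecture material; the function names are my own), here is the conjugate posterior update in plain Python. The Beta posterior for a coin is just the two-category special case of the Dirichlet posterior for a die: observed counts are added to the prior pseudo-counts.

```python
# Conjugate updating: Dirichlet prior + multinomial counts -> Dirichlet posterior.
# With K = 2 categories this reduces to the familiar Beta-Bernoulli update.

def dirichlet_posterior(alpha, counts):
    """Posterior parameters: add observed counts to the prior pseudo-counts."""
    return [a + n for a, n in zip(alpha, counts)]

def posterior_mean(alpha):
    """Mean of a Dirichlet(alpha) distribution: alpha_k / sum(alpha)."""
    s = sum(alpha)
    return [a / s for a in alpha]

# Beta special case: uniform Beta(1, 1) prior, 7 heads and 3 tails observed.
beta_post = dirichlet_posterior([1, 1], [7, 3])       # -> [8, 4]

# Dirichlet generalization: uniform prior over a 3-sided die, counts [2, 5, 3].
die_post = dirichlet_posterior([1, 1, 1], [2, 5, 3])  # -> [3, 6, 4]
```

The posterior mean of `beta_post` is [2/3, 1/3], the usual "pseudo-count smoothed" frequency estimate.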

Reading: Jaynes Ch2.

Friday Feb 11: Foundations and alternatives to Bayesian inference.
Is Cox's/Jaynes' argument valid? Coherence.
Extended probability. Robust Bayes analysis,
Dempster/Shafer theory and fuzzy sets.

Reading: Bergman's thesis, Ch. 3 and 6.

Friday Feb 18: Markov Chain Monte Carlo.
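As a small taste of this topic (an illustrative sketch only, not the lecture's code), a minimal random-walk Metropolis sampler for a one-dimensional target density can be written in a few lines of plain Python:

```python
import math
import random

def metropolis(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: a minimal MCMC sampler for a 1-D target.

    log_target: log density of the target, up to an additive constant.
    """
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)            # symmetric proposal
        log_accept = log_target(proposal) - log_target(x)
        if math.log(rng.random()) < log_accept:        # accept with prob min(1, ratio)
            x = proposal
        samples.append(x)
    return samples

# Example: sample from a standard normal, whose log density is -x^2/2 + const.
draws = metropolis(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
mean = sum(draws) / len(draws)   # should be near 0 for a long enough chain
```

Because the proposal is symmetric, the Hastings correction cancels and the acceptance ratio is just the ratio of target densities, computed here in log space for numerical stability.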

Friday Feb 25: No Lecture.

Friday March 4: Chapman-Kolmogorov equation and particle filters
(sequential MCMC). Lecture starts at 15:30!

Friday March 11: Time Series and latent states

Morning lecture on Wednesday March 16.

Friday March 18: Support vectors and kernel methodology.

Future schedule will be announced here later.

COURSE STYLE

Two-hour lectures are given on Friday afternoons.
Language is English or Swedish, depending on participants.
Students are expected to improve their understanding of the
topics by reading recommended survey and research papers, and
by solving at least some of the weekly bookwork/rider problems.
Topics are agreed on during the first lecture,
and students are encouraged to present in class
material they have mastered well.
Examination consists of the following parts (the mix is negotiated; only the first part
is required):

Discussion of a submitted list of studied papers, not necessarily only
among those recommended.

Discussion of solutions to weekly problems.

Discussion of solutions to open-ended homeworks.

Discussion of a project involving methods studied and
(preferably) related to the student's research topic.

RECOMMENDED BOOKS:

One of the following is recommended as a useful reference
and summary of the research area:

"Principles of Data Mining", MIT Press 2001, by David J. Hand, Heikki Mannila and Padhriac Smyth.

Michael Berthold and David J. Hand (eds):
Intelligent Data Analysis, An Introduction

Springer-Verlag, 1999
ISBN 3-540-65808-4

E.T. Jaynes: Probability Theory: The Logic of Science (L. Bretthorst, Editor)
Cambridge University Press 2003,
ISBN 0-521-59271-2

(fascinating, sometimes controversial)

A. Gelman, J. Carlin, H. Stern, D. Rubin: Bayesian Data Analysis,
Chapman & Hall, 2003 (second edition)

(very comprehensive, both practice and theory)

Bernardo and Smith: Bayesian Theory, John Wiley & Sons, 1994,
ISBN 0-471-92416-4

(Fairly advanced/theoretical)

Data Mining: Challenges and Opportunities, Idea Group Publishing, 2003
Editor: John Wang. ISBN 1591400511

(Your lecture notes are an expanded version of Ch. 1 of this book.)

First Lecture: Course overview, presentation of participants,
and planning discussion.
The lecture schedule will be defined based on
participants' interests. The following are among the topics that
have been covered in this course previously
(not all in the same course instance; no more than half the
topics can be conveniently covered in one course):

Bayesian and frequentist inference - relationships and
philosophical issues

Overview of important statistical models

Multiple testing: FWE, FDR, specificity and power, ROC characterization

Model and predictor selection as decision problem:
Loss functions, cross validation (Bernardo/Smith Ch6)

Bayesian view of exploratory data analysis and model adequacy (Gelman et al.)

The overfitting problem and its appearances in Bayesian statistics.

Graphical statistical models and Bayesian networks

Unsupervised classification and clustering through mixture models

Markov Chain Monte Carlo methods in Bayesian inference

Exchangeability, Bayesian regression approaches, hierarchical models.

Sequential Markov models (particle filters)

Robust Bayes, Dempster/Shafer and soft computing

Support vector techniques: distribution-independent analysis

More on the literature for the above topics can be found on previous course pages.
The first part of the course material can be obtained from

Course Package

- B-course e-course, Complex Systems Group, University of Helsinki
- XGobi tool at /misc/tcs/xgobi (also compiled for Sun Solaris 2). A new version of XGobi, GGobi, can be interfaced to R and Postgres: www.ggobi.org
- Support Vector Machines: the Support Vector Machine software is installed at /afs/nada.kth.se/misc/tcs/svm/sv-src/.