The lectures will be of two kinds: Foundational areas like the
of Bayesian analysis, unsupervised classification, MDL, soft computing
will be covered in seminar lectures by me and invited speakers.
in areas of interest to students will be presented by students and/or
speakers and discussed in class. There will be less emphasis on a
of concrete computational procedures, since once the basic theory and
range of application areas are understood, it is easy to look up the
computational procedures required.
U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy (Eds)
Advances in Knowledge Discovery and Data Mining.
AAAI Press, Menlo Park, CA 1996. ISBN 0-262-56097-6
H. Mannila: Methods and problems in Data Mining,
ICDT 97, LNCS 1186, pp 41-55
T. R. Willemain, "Model Formulation: What experts think
and when," Operations Research, Vol. 43, no. 6, pp.916-932,
B: Bayesian methods and Markov Chain Monte Carlo:
In Fayyad et al: Ch 3, 11, 13, 20
E. T. Jaynes: Probability theory: The logic of Science, Ch
D.S. Sivia: Ch2: Parameter estimation and Ch 6: Non-parametric
in: Data Analysis, A Bayesian Tutorial Clarendon Press Oxford 1996
Hastings: Monte Carlo sampling methods using Markov chains and their
applications. Biometrika 57(1970) pp 97-109
R.M.Neal: Probabilistic Inference using Markov Chain Monte Carlo
TR CRG-TR-93-1, University of Toronto, CS Department
Bibliography from B 5.
C: Soft Computing:
Azvine, Azarmi and Tsui: An Introduction to Soft Computing - A
for building Intelligent Systems LNCS1198, 1997, pp 191-210.
Baldwin, Martin: Basic concepts of a Fuzzy Logic Data Browser
applications. Software Agents and Soft Computing, LNCS1198, 1997, pp
Xiaohua Hu, Nick Cercone: Mining Knowledge Rules from Databases: A Rough
Set Approach. IEEE 1996 Data Engineering Conference, pp 96-105.
SummarySQL - A Fuzzy Tool For Data Mining
Dan Rasmussen, Ronald R. Yager, Intelligent Data Analysis, 1(1)(1997)
Heinonen, Mannila: Attribute oriented induction and conceptual
University of Helsinki Dept Computer Science, report C-1996-2.
D: Stochastic Complexity and Classification (unsupervised and
In Fayyad et al: Ch 6, 7, 19
J. Rissanen: Stochastic Complexity (with discussion). J.R. Statist.
Soc B(1987) 49(3) pp 223-239 and 252-265.
C.S. Wallace and P.R. Freeman: Estimation and inference by Compact
Coding (with discussion). J.R. Statist. Soc B(1987) 49(3) pp 240-265.
Cullen Schaffer: Selecting a Classification Method by
G.I. Webb: Further Experimental Evidence against the Utility of Occam's Razor.
Journal of AI Research 4(1996) 397-417.
S.P. Curram & J. Mingers. "Neural Networks, Decision Tree
Induction and Discriminant Analysis: An Empirical Comparison." Journal
of the Operational Research Society. 45(4) 1994 pp 440-450.
Mats Gyllenberg, Timo Koski and Martin Verlaan:
Classification of binary vectors by stochastic complexity.
J Multivariate Analysis 62(1997)
H.G. Gyllenberg, M. Gyllenberg, T. Koski, T Lund:
Stochastic complexity as a taxonomic tool. TRITA-MAT-97-MS-02, KTH.
M. Gyllenberg, T. Koski, T. Lahti:
Associative memories for clusters of binary vectors using MATLAB neural network toolbox. Proc of the Nordic MATLAB conference.
E: Time series and prediction:
In Fayyad et al: Ch 9, 22
Casdagli, Des Jardins, Eubank, Farmer, Gibson, Hunter, Theiler:
modeling of Chaotic Time Series: Theory and Applications.
(to read before Casdagli et al): Ch 1 of: Time Series Prediction: Forecasting the Future and
Understanding the Past. Weigend, A. S., and N. A. Gershenfeld (Eds.) (1994) Santa Fe
Institute Studies in the Sciences of Complexity XV. (Proceedings of the
NATO Advanced Research Workshop on Comparative Time Series Analysis,
Santa Fe, NM, May 1992.) Reading, MA: Addison-Wesley.
F: Spatial applications:
"Fast Spatio-Temporal Data Mining of Large Geophysical
The First International Conference on Knowledge Discovery and Data
Montreal, Quebec, Canada, Aug 1995.
Bettini, Wang, Jajodia: Testing Complex Relationships Involving
Granularities and its Application to Data Mining. PODS 96
Wang, Chirn, Marr, Shapiro, Shasha, Zhang: Combinatorial Pattern
for Scientific Data: Some preliminary results SIGMOD 94 115-125.
Li, Yu, Castelli: Hierarchyscan: A hierarchical Similarity Search
for Databases of Long Sequences IEEE 96 Data Engineering
Gray, Bosworth, Layman, Pirahesh: Data Cube: A Relational Aggregation
Operator Generalizing Group-By, Cross-Tab, and Sub-totals. IEEE 96 Data
Engineering, pp 152-159
Abram, Treinish: An extended data-flow architecture for data analysis
and visualization. IEEE 95 Visualization, pp 263-270.
Martin, Ward: High Dimensional Brushing for Interactive Exploration
of Multivariate Data. IEEE 95 Visualization, pp 271-278.
Buja et al (1996). Interactive
High-Dimensional Data Visualization. Journal of Computational and Graphical
Statistics. Vol 5, No. 1.
H: Mediation and Brokerage:
Calmet, Debertin, Jekutsch, Schu: An executable graphical
of mediatory information systems. IEEE 96 Data Engineering, pp
Papakonstantinou, Garcia-Molina, Ullman: MedMaker: A mediation
based on declarative specifications. IEEE 96 Data Engineering, pp
"Integrating Distributed Object Management into EOS",
Info Systems, 5(5):58-59, May 1995.
"The Conquest Modeling Framework for Geoscientific
UCLA CSD Technical Report #940039, Oct 1994.
Participating students can choose from the reading list and define
individual mix of the following examination forms, depending on
preferences and learning needs, to a total assessed by the examiner to
credits (poäng). One project, possibly a small one, must be
Presentations at Seminars,
The project could involve data used in your research project
with your project leader).