## Course 2D5345: Introductory Spatial Information Theory for Geoinformatics Applications. Course Overview

1. Introduction to a course about concepts of two-dimensional spatial modelling and computation

1.1 What is the course about ?

The stated goal of the course is to present and analyse a selection of geometric concepts, data structures and algorithms which have found important applications in geoinformatics.

This will be done by first informally presenting elements of Spatial Information Theory. This presentation will include an introduction to the basic mathematical concepts upon which much of the rest of the course will be based. To CS students, some of this material will be well-known, but I will make an effort to present the material so as to make a reasonably coherent theory, to enable both those with little previous computer science background to understand the following subjects, and also to give those with a solid CS background some perspective of the area of geographic computing.

This part will require roughly one third of the lectures in the course.

Following this conceptual introduction, there will be a part which covers in some depth representations, algorithms, and "computational algebras" for object- ("vector") and field- ("raster") based models. Together, this part is planned to take up almost half of the lectures.

The final two lectures will discuss a special but very important topic in GIS design: data structures and algorithms for efficient retrieval of spatial data in databases.

The course has been announced as a course in Applied Computational Geometry, but it would probably be more appropriately described as a course in Spatial Information Theory. The latter term has only very recently come into use, which is why the first name was originally preferred.

Computational Geometry is traditionally a subfield of the theory of algorithms and data structures and has taken much of its methods and objectives from that major subject of Theoretical Computer Science, whereas Spatial Information Theory is the general study of spatial modelling, including conceptual structures, language and other architectural issues for geometric computation, and geometric computing, with an emphasis on methods which are robust, topology-preserving, efficient, and have a reasonably transparent implementation.

Thus, rather than putting all emphasis on theoretical issues of spatial modelling or algorithmic complexity, the perspective taken here is the holistic one of a geographical information system architect.

In that profession you need to thoroughly understand the conceptual structure of the underlying problem area, and you need to select methods that fit well together into one large system. It is not essential that you use for every problem the theoretically fastest algorithm, although overall performance is a very important objective, and some particularly fundamental and ubiquitous methods, like triangulation and map overlay, need to perform both almost optimally fast and simultaneously behave in a numerically robust and topology-preserving manner.

In most geographical applications - although not in geological or city planning ones - the space to be represented is usually two-dimensional, or sometimes "2 1/2 - dimensional", but never truly three- dimensional. We take this as an excuse for avoiding any discussion of 3-D problems, which are in general much more complicated than 2-D ones.

The course is divided into three parts.

The presentation starts by informally introducing basic topological and geometrical concepts. Methods for representing plane geometric objects in computers are then described. A set of fundamental computational geometry problems in the plane are introduced and discussed, such as convexity, shortest paths, and neighbourhood.

The second part of the course deals with two important applications: plane network problems and TIN (Triangulated Irregular Network) models and their associated basic computational issues.

The third and last part of the course discusses a special but very important topic in GIS: the design of data structures and access algorithms for efficient storage and retrieval of spatial data in databases. Basic file organisation techniques are covered. Indexing techniques for point, line, and region data are surveyed.

The course consists of lectures and mandatory computer exercises.

Examination is taken in the form of program system development and evaluation group projects, with written and oral presentation.

Prerequisites:

Since the course is intended to be relevant for all CGI Ph D students, only basic programming competence will be required to be able to follow the course.

However, to carry out the group projects, usually considerable computer science maturity will be required from the group as a whole, as well as domain knowledge and management and documentation skills.

Therefore, we will form "cross-disciplinary" project groups, where the different competences of group members is employed in different project roles.

1.2 Who should attend ?

The course is primarily intended for Ph D students of the Centre for Geoinformatics (CGI), which is a joint research competence centre of KTH, Stockholm University and FOA, the Defence Research Institute.

It is open also for those KTH M. Sc. students who want to learn about how geometry and topology is applied in Geographical Information Systems, GIS. Such students may be expected to come from the curricula of Surveying, Computing Technology, Electronics, or Technical Physics, but also students of Vehicle Technology and Mechanical Technology might find some interesting applications to robotics and navigation.

Because the CGI Ph D students come from a wide variety of academic backgrounds, the course does not require much prior mathematical or computer science knowledge in addition to high-school mathematics, but basic programming competence is assumed.

Why would students from all these specialized fields of study want to know about this subject in the fashion to be applied here ?

The CGI students have previously taken part in a seminar series entitled Readings in GIS. During that course, one of the topics discussed was "GIS - science or tool in academic education ?". So, most if not all CGI students have already come across the issue of formal representation and computing techniques in GIS applications.

For other participants, it should be pointed out that geographical information systems may very well be used without much knowledge of the methods presented here, indeed almost all uses of GIS are practised in that way.

However, for thoroughly professional GIS users, it frequently becomes necessary to acquire a detailed insight into computer representations of space and sometimes also into questions of how to structure geo- metrical or topological computations.

Also, this is obviously true for "GIS tool builders", i e people who want to build programs and systems for geographical applications. It is my ambition that such students would find the course contents highly relevant to them, although maybe the level of presentation is sometimes a little superficial.

If you are neither aspiring to become a professional GIS user or a GIS tool builder, maybe anyway you could find some of the material in the course intellectually stimulating in its own right. Geometry is, after all, one of oldest branches of mathematics, and its comparatively recent wedding to computer science has produced some quite beautiful offspring.

However, it should follow from the course objectives that the course is not intended for those primarily interested in the theory, rather than the applications, of geometric and topological reasoning and algorithms. And if you have advanced knowledge about data structures and algorithms already, you might find most of the treatment superficial and boring.

1.3 How is the course going to be taught, organized, and managed ?

Formally, the course consists of lectures, programming projects, and evaluation projects. There will be no formal written examinations, but there will be a rigorous credit system anyway. Here is how it is intended to work:

- the style of treatment will be informal, leaving most proofs to the serious student to ponder on her own, and not really requiring a deep theoretical knowledge of the presented material. This might leave the mathematically mature student without much gratification, but we will do our best to make up for that by presenting topics which should be interesting and relevant as a mathematical and computational basis for the subject of geoinformatics. Although the mathematical depth of the concepts presented is not much more than that of high-school mathematics, a considerable amount of practically relevant new research results will be covered. No sophisticated performance analysis of algorithms will be given.

- the lectures are intended to give the attendees a basic insight into the subject; they will only cover material which you will also find in the course literature in places specified in the detailed course agenda (see 1.4, below). So why would you want to go to the lectures ? Perhaps because we will try to make them a collegial discussion rather than a formal, one-sided presentation, and a place where you can get some help with structuring your own study.

Credit will be given from making both written and oral project presentations to the class; to pass the course, you need to carry out both a fully-documented system design and evaluation project, and prepare and present a twenty-minute lecture on the essential results of that project. Now, since the projects are far too ambitious to be carried out by each participant alone, after registering your interest in the proposed project topics,

(1) you will be assigned to one of these project topics and to a cross-disciplinary project group

(2) each participant will be given a formal project role after having taken part in a group interview

The reasons for this bureaucracy are twofold:

(1) it is a way for me to control the wide span of computing competence which is to be expected from the participants; and

(2) it will provide the participants with some live experience in the methods of management of system design and implementation projects, one which most of the students will otherwise find it hard to get until they are asked by their future business manager to carry out such a role. This experience might hopefully also promote future understanding of the different roles that are occupied by people from different professional categories in the workplace.

The following roles will be available:

(1) Program manager. The program manager is to play the role of user representative, which means that (s)he must set down the requirements for the project clearly in writing, review the project specification's conformance with these requirements, regularly check the progress of the project, report when there seem to be a danger of significant delays or technical failures, and when changes in specifications are requested by the designer project, sign that the delivery of the final product is satisfying the requirements, and make up and present a full report on the project and its result from the user's perspective. It may turn out that the same person will have to be program manager for more than one project.

(2) Project manager. The project manager will have all the formal responsibility to make agreements with the Program manager on behalf of the project group. With this responsibility also follows the right to have the final word in disputes over overall design decisions, although the project manager will usually not be a computing expert. The project manager must ensure that the design is documented from all relevant perspectives: user manual, design documents on the system level and on the level of program modules, test and evaluation report.

(3) Technical experts. The technical experts (usually two of them in this case) will do the real programming, testing, and evaluation work, as well as the basic technical documentation. But it is not the intention that they are to carry the whole burden of work in the project, so if they find the management uncooperative, they may call a strike...

1.4 What is contained in the package ?

To answer this question, I will start by giving first a brief overview of the main topics that will be covered. Below, you will also find a detailed plan of the lecture contents and where to find a treatment of the topic in the course literature.

As you would expect, the course starts with an:

1.4.1 Introduction

Here, I will guide the participants through a discussion about WHY a geoinformatics researcher should know about computational geometry. I will tell you HOW this course is going to be taught, roughly WHAT contents it will have, and also HOW and WHERE one will find additional information if one wants to build a program system for geographical computations.

The first two issues have already been dealt with above. So, WHAT topics will be treated in the course ?

1.4.2 Fundamentals of Spatial Information Theory.

The first four topical lectures will be about fundamentals of Spatial Information Theory. It will include an introduction to the basic mathematical concepts upon which much of the rest of the course will be based. To CS students, most of the material will be well-known, but I will make an effort to present the material so as to make a reasonably coherent theory, to enable both those with little previous computer science background to understand the following subjects, and also to give those with a solid CS background some perspective of the foundations of geographic computing.

The lectures will deal with fundamental concepts like Sets, Relations, Functions, Neighborhood, Distance, Geometry, Topology, Dimensionality, Convexity etc. I will further discuss the concept of abstract spatial data types and their algebraic properties, i.e., combinations of data structures and operations which can be used to implement the conceptual models of objects in Euclidean 2-space. The lectures will be based mainly on material from Ch. 4, 3, and 5 of Michael Worboys' book, GIS: A Computing Perspective, and from Ch. 2 of Markus Schneider's Ph D thesis, Spatial Data Types for Database Systems. To complement these, the paper by Clementini & al (see the references), will be surveyed.

1.4.3 Computer Representation of Abstract Spatial Data Types.

This topic will take up six lectures, or half the course except for the introduction. It will deal with modern methods for representing spatial objects in computers, and with the realization of operations on them. Most of the time will be devoted to geometrical-object-based concepts. Two lectures will deal with Triangulated Irregular Networks, which is one way to represent spatial fields, although not the simplest one, which is raster representation.

1.4.4 Overview of Indexing Techniques for Efficient Spatial Search in Databases.

This is a large topic since the issue of efficient spatial indexing of databases has not yet found its canonical form, although there is a candidate, a tree structure which was introduced by Michael Freeston in 1995. Freeston's paper is briefly surveyed, together with a number of other approaches: R-trees, BANG-files, LSD-trees, and various forms of Quadtrees. For the practical usefulness of GIS, it is also an important topic, although it may seem esoteric from a user's viewpoint.

2. Detailed course contents (with references to relevant chapters in the course literature).

NOTE: this is a preliminary plan only !

Lecture 1: Introduction to this course on spatial modelling and computation. See above.

Lecture 2: Introduction to spatial modelling.
The modelling process (W 4.1). The object-oriented approach (W 2.4). Field-based models (W 4.2 except 4.2.3). Object-based models (W 4.3 except 4.3.2-4.3.4). Integrating field and object models (W 4.4). Computing with geographic data (W 5.1).

Lecture 3: Fundamental spatial concepts I: Basics.
What is space ? (W 3.1). Euclidean space: points, lines, polygons, transformations (W 3.2). Set-based geometry of space: sets, relations, functions, convexity (W 3.3).

Lecture 4: Fundamental spatial concepts II: Topology of space.
Topological spaces, pointset topology, Euclidean pointset topology, Euclidean combinatorial topology (W 3.4).

Lecture 5: Fundamental spatial concepts III: Network spaces and metric spaces. Abstract graphs, planar graphs (W 3.5). Metric spaces (W 3.6).

Lecture 6: Discrete representations of space and spatial objects I: Realm-based representation of the discrete Euclidean plane (W 5.2, S 2.4-2.5).

Lecture 7: Discrete representations of space and spatial objects II: The spatial object domain: spaghetti, topology, doubly-connected-edge-list, representing strongly connected areal planar objects, other boundary representations (W 5.3). Representations of field-based models: regular tessellations, irregular tessellations, Delaunay triangulations, triangulations of polygons, tessellations of the sphere (W 5.4). Representation of networks (W 5.7.1)

Lecture 8: Operations on spatial objects: Operations on fields (W 4.2.3). Static spatial operations (W 4.3.2). Dynamic spatial operations (W 4.3.3). Realm-based spatial data types (S 4.1). Topological relationships (CFO). The ROSE algebra (S 4.4). Operations on networks (W 5.7.2-5.7.4).

Lecture 9: Algorithms for spatial operations I: Metric and Euclidean algorithms (W 5.5.1). Topological algorithms (W 5.5.2). Set-based algorithms (W 5.5.3). Raster algorithms (W 5.5.4).

Lecture 10: Algorithms for spatial operations II: Triangulation algorithms (W 5.5.5). Raster-vector conversion (W 5.6). Network traversal (W 5.7.2). Shortest path (W 5.7.3). Other network algorithms (W 5.7.4).

Lecture 11: Hierarchical representations of space.

Lecture 12: Adaptive triangulated irregular models.

Lecture 13: Search and indexing I:

Lecture 14: Search and indexing II:

1. Books

1.1. Worboys, M.: GIS - a Computing Perspective. Taylor & Francis 1995. Ch. 3-6. Copies will be available for purchase at the THS bookshop.
1.2. Schneider, M.: Spatial Data Types for Database Systems. Ph D thesis, FernUniversität Hagen, 1995 (15 copies of the thesis will be available for purchase during the course).

2. Articles

2.1. E. Clementini, P. Di Felice, P. van Oosterom: A Small Set of Formal Topological Relationships Suitable for End-User Interaction. In: Abel and B. C. Ooi (eds), Advances in Spatial Databases. Lecture Notes in Computer Science 692. Springer Verlag 1993, pp. 277-295.
2.2. E. Clementini, P. Di Felice: Approximate Topological Relations. Int. J. Approximate Reasoning, Vol. , No. , 1997.
2.3. J. G. Stell, M. F. Worboys: The Algebraic Structure of Sets of Regions. To appear in Proc. COSIT '97.
2.4. M. Overmars: Teaching Computational Geometry. ftp://ftp.cs.ruu.nl/pub/RUU/CS/techreps/CS-1993/1993-34.ps.gz
2.5. D. T. Lee: Computational Geometry. ACM Computing Surveys, Vol. 28, No. 1, March 1996, pp. 27-31.
2.6. R. Tamassa: Data Structures. ACM Computing Surveys, Vol. 28, No. 1, March 1996, pp. 23-26.
2.7. E. M. Reingold: Basic Techniques for Design and Analysis of Algorithms. ACM Computing Surveys, Vol. 28, No. 1, March 1996, pp. 19-21.
2.8. De Floriani, E. Puppo, P. Magillo: Geometric Structures and Algorithms for Geographic Information Systems. Technical Report DISI-TR-97-08. Dipartimento di Informatica e Scienze dell'Informazione, Universita di Genova, June 1997.
2.9. L. De Floriani, E. Puppo, A Hierarchical Triangle-based Model for Terrain Description. In: A.U. Frank, I. Campari, U. Formentini (Eds.): Theories and Methods of Spatio-Temporal Reasoning in Geographic Space, Lecture Notes in Computer Science, No. 639, Springer-Verlag, 1992, pp. 236-251.
2.10. M. Bertolotto, L. De Floriani, P. Marzano: A Unifying Framework for Multilevel Description of Spatial Data. In: A. U. Frank, W. Kuhn (Eds.): Spatial Information Theory: A Theoretical Basis for GIS, International Conference COSIT '95, Lecture Notes in Computer Science, Vol. 988, Springer, 1995, pp. 259-278.