**DD1344 Fundamentals of Computer Science (Grundläggande Datalogi)**

**Grudat-07**

**2008-02-25**

**Take Home Exam (tentamen)**

**NOTES: (Please read these carefully!)**

**(i)** Clearly mark at the top of __ each__ sheet you use: (a)
your

**(ii)** On the __ front page__ indicate (a) how many pages are
contained in your work in

**(iii)** There are two ways to submit your work. (**a**) Your work may
be handed in to NADA's studentexpedition no later than **Tuesday 26th February 2008
at 12.00 am midday**. After this time it will be marked as late, and marks will be
subtracted pro-rata. (**b**) If for any reason your are unable to reach the
studentexpedition yourself (for example if you are at work) then you may __post__
your manuscript to: *Studentexpeditionen, CSC, KTH, 100 44 Stockholm*.
The date on the postmark will be taken as the date and time of your
submission. The __same deadline__ applies to manuscripts submitted by post.

I do not accept electronic submissions under any circumstances.

**(iv)** If manuscripts are submitted in any other place or by any other
means than those described in Part (iii) then the examiner and CSC cannot be
held responsible in the case that manuscripts are lost.

**(v)** If you have any questions about the exam (for example, if you do
not understand a question) for the fastest response you should call my mobile number 076 223 86 79. Please
do not call before 09.00 or after 17.00!

**(vi)** You may use your kursbunt, and any course books, as well as a dictionary. You are allowed
to use the internet including search engines. However, any work which you download
and use must be **credited to its source**. You are **not** allowed to
discuss or develop your answer with anyone else before all manuscripts have
been handed in. You are **not** allowed to copy anyone else's work. By handing
in your manuscript you are declaring that you have abided by these rules. In
the case that cheating is suspected, actions will be taken against all students
involved.

**(vii)** Write **clearly**. No marks will be awarded for work that I
cannot read. You may write your answers in swedish or in english.

**(viii)** **Model answers** will be placed on the course web page, but not before all students
who are doing komplettering have finished and submitted their work satisfactorily.

**(ix)** **English Language Definition** "brute force":
A primitive programming style in which the
programmer relies on the computer's processing power instead
of using his own intelligence to simplify the problem, often
ignoring problems of scale and applying naive methods suited
to small problems directly to large ones.

__ Marking Scheme__.

This paper is divided into two parts I and part II.

The grades D and E, can be achieved by answering part I questions only to the level of 80% and 70% of the available marks respectively.

The grade Fx (which allows komplettering) requires 60% of the available marks on part I.

The grades A, B and C are achieved by obtaining at least 80% on part I, and respectively 60%, 40% and 20% of the marks available on part II.

Part I questions cover basic material from the course, drawn from lectures 2 to 6 (as described on the course web page). Part II questions cover the more advanced material of the course drawn from lectures 7 to 11.

You should use the Python syntax for any algorithms which you are asked to design. Be aware that any syntax errors you make may result in your algorithm design being wrong in which case marks will be deducted accordingly. Minor and unimportant syntax errors may however be ignored by the examiner.

** Part I (Total 50 points) **.

** Question 1 (Total 10 points) **.

** (i) (8 points)** At the end of each week, a Swedish web site publishes details
of the ten cheapest and ten most expensive apartments sold in Stockholm during that week.
Write an efficient algorithm which calculates these lists.
You may assume that each week's data is
stored internally in the program in the form of a vector WeekList[i]. You should make your algorithm
as efficient as possible with regard to computation time.

** (ii) (2 points)** If x apartments are sold in one week, how many comparisons does your
algorithm normally require to build both lists?

** Question 2 (Total 15 points)**.

** (i) (3 points) ** A list of n floating point numbers is stored in an array MyList.
Show that the problem of computing a (not necessarily unique) pair of elements in MyList
with the smallest difference, i.e. returning a pair i,j with 0 <= i < n and 0 <= j < n such that for any
0 <= k < n and 0 <= m < n

Abs ( MyList[m] - MyList[k] ) >= Abs ( MyList[i] - MyList[j] ) ,

can be solved by a brute force algorithm with worst case performance of about n^2 (i.e. n squared) computation steps. Use the Python language to express your algorithm design.

** (ii) (3 points) ** Motivate your estimate of the run-time performance of your algorithm
for part (i) by means of informal reasoning about the behaviour of your program.

** (iii) (9 points) ** Using a divide and conquer algorithm design, show that the problem described in part (i)
can also be solved by an algorithm with a worst case performance of about n log(n) (i.e. logarithm to base 2)
computation steps. You may establish the run-time performance of your algorithm by simply quoting
the performance of any algorithm studied in the course. You will not need to reason about the behaviour
of your program in this case.

** Question 3 (Total 25 points) **.

** (i) (5 points) ** Explain how a binary tree with nodes that are labelled by

** (ii) (6 points) ** Explain when this data structure is likely to be efficient in terms of memory usage,
and when it is likely to be inefficient. What better data structure could be used if the tree is inefficient.
Explain your alternative approach briefly.

** (iii) (10 points) ** Using the Python language, design a recursive algorithm which compares two
binary trees of integers (both stored as arrays), and returns the value TRUE if they have exactly the same tree structure and node
labelling, and FALSE otherwise. You may assume that a tree always has at least one node. You may also make the simplifying assumption that both arrays have the same length.

** (iv) (4 points)** Explain carefully why your algorithm never enters an infinite loop.

** Part II (Total 45 points) **.

** Question 4 (Total 15 points)**.
Recall how the problem of string matching can be solved by building a Knuth-Morris-Pratt (KMP) automaton.
For this automaton, the problem of avoiding backtracking in the text to be searched can be solved
by just three rules for backtracking in the automaton alone. These rules are normally used to define
a vector Next[], they are:

(1) Next[1] = 0

(2) Next[i] = 1 if the search string up to position i has no repetitions

(3) Next[i] = j+1 if for some j < i -1, the last j characters in the search string are the same as the first j characters in the search string.

** (i) (4 points)** Using rules (1), (2) and (3) only, draw a KMP automaton which searches for the string "11011011" .

** (ii) (8 points)** Show that your automaton described in part (i) can in fact be further optimised by modifying some of the transition steps. Redraw your optimised version of the automaton.
By numbering the states appropriately, list the number of every state which has a transition that can be optimised. For each such state, explain in words which transition from that state can be optimised

** (iii) (3 points) ** Special agent Smith runs your string matching automaton on secret
text files containing the digits "0" , ... , "9", and doesn't see much improvement over
his old brute force algorithm. Dissappointed,
he complains to you the algorithm designer. What would you say to him?

** Question 5 (Total 30 points) **.
Consider the following recursive BNF grammar definition, over the alphabet { a, b }, for a
language L:

< L > ::= < P1 > | < P2 >

< P1 > ::= "b" | "b" < P3 > | "a" < P4 >

< P2 > ::= "a" | "a" < P3 > | "b" < P4 >

< P3 > ::= "a" < P2 > | "b" < P1 >

< P4 > ::= "a" < P1 > | "b" < P2 >

** (i) (3 points) ** Explain in words what language of strings
of a's and b's the grammar for language L defines.

** (ii) (8 points) ** Draw a finite automaton which accepts exactly the strings in language L. Make sure you clearly mark the start state, all accepting states, and each transition arrow including the
alphabet symbol that triggers it.

** (iii) (3 points)** Can you conclude that the language L is a regular language?
If so, how? Motivate your answer.

** (iv) (10 points)** Define a recursive descent parser for the language L by means of Python code for
an appropriate set of parsing functions. You may assume that the input string to be parsed is stored in a queue.

** (v) (4 points) ** In terms of computer memory usage, what advantages does the finite automaton parser for L have over the recursive descent parser for L?

** (vi) (2 points)** Generally speaking, does a recursive descent parser have any advantages over a finite automaton based parser? Describe two important advantages.