DD1344 Fundamentals of Computer Science (Grundläggande Datalogi)

Grudat-07

2008-02-25

Take Home Exam (tentamen)

NOTES: (Please read these carefully!)

(i) Clearly mark at the top of each sheet you use: (a) your name, (b) the page number.

(ii) On the front page indicate (a) how many pages are contained in your work in total. Clearly mark: (b) your name (c) your personnummer, (d) your e-mail address (in case I need to contact you)

(iii) There are two ways to submit your work. (a) Your work may be handed in to NADA's studentexpedition no later than Tuesday 26th February 2008 at 12.00 am midday. After this time it will be marked as late, and marks will be subtracted pro-rata. (b) If for any reason your are unable to reach the studentexpedition yourself (for example if you are at work) then you may post your manuscript to: Studentexpeditionen, CSC, KTH, 100 44 Stockholm. The date on the postmark will be taken as the date and time of your submission. The same deadline applies to manuscripts submitted by post.

I do not accept electronic submissions under any circumstances.

(iv) If manuscripts are submitted in any other place or by any other means than those described in Part (iii) then the examiner and CSC cannot be held responsible in the case that manuscripts are lost.

(v) If you have any questions about the exam (for example, if you do not understand a question) for the fastest response you should call my mobile number 076 223 86 79. Please do not call before 09.00 or after 17.00!

(vi) You may use your kursbunt, and any course books, as well as a dictionary. You are allowed to use the internet including search engines. However, any work which you download and use must be credited to its source. You are not allowed to discuss or develop your answer with anyone else before all manuscripts have been handed in. You are not allowed to copy anyone else's work. By handing in your manuscript you are declaring that you have abided by these rules. In the case that cheating is suspected, actions will be taken against all students involved.

(vii) Write clearly. No marks will be awarded for work that I cannot read. You may write your answers in swedish or in english.

(viii) Model answers will be placed on the course web page, but not before all students who are doing komplettering have finished and submitted their work satisfactorily.

(ix) English Language Definition "brute force": A primitive programming style in which the programmer relies on the computer's processing power instead of using his own intelligence to simplify the problem, often ignoring problems of scale and applying naive methods suited to small problems directly to large ones.


Marking Scheme.
This paper is divided into two parts I and part II.
The grades D and E, can be achieved by answering part I questions only to the level of 80% and 70% of the available marks respectively.
The grade Fx (which allows komplettering) requires 60% of the available marks on part I.
The grades A, B and C are achieved by obtaining at least 80% on part I, and respectively 60%, 40% and 20% of the marks available on part II.

Part I questions cover basic material from the course, drawn from lectures 2 to 6 (as described on the course web page). Part II questions cover the more advanced material of the course drawn from lectures 7 to 11.

You should use the Python syntax for any algorithms which you are asked to design. Be aware that any syntax errors you make may result in your algorithm design being wrong in which case marks will be deducted accordingly. Minor and unimportant syntax errors may however be ignored by the examiner.



Part I (Total 50 points) .

Question 1 (Total 10 points) .

(i) (8 points) At the end of each week, a Swedish web site publishes details of the ten cheapest and ten most expensive apartments sold in Stockholm during that week. Write an efficient algorithm which calculates these lists. You may assume that each week's data is stored internally in the program in the form of a vector WeekList[i]. You should make your algorithm as efficient as possible with regard to computation time.

(ii) (2 points) If x apartments are sold in one week, how many comparisons does your algorithm normally require to build both lists?

Question 2 (Total 15 points).

(i) (3 points) A list of n floating point numbers is stored in an array MyList. Show that the problem of computing a (not necessarily unique) pair of elements in MyList with the smallest difference, i.e. returning a pair i,j with 0 <= i < n and 0 <= j < n such that for any 0 <= k < n and 0 <= m < n
Abs ( MyList[m] - MyList[k] ) >= Abs ( MyList[i] - MyList[j] ) ,
can be solved by a brute force algorithm with worst case performance of about n^2 (i.e. n squared) computation steps. Use the Python language to express your algorithm design.

(ii) (3 points) Motivate your estimate of the run-time performance of your algorithm for part (i) by means of informal reasoning about the behaviour of your program.

(iii) (9 points) Using a divide and conquer algorithm design, show that the problem described in part (i) can also be solved by an algorithm with a worst case performance of about n log(n) (i.e. logarithm to base 2) computation steps. You may establish the run-time performance of your algorithm by simply quoting the performance of any algorithm studied in the course. You will not need to reason about the behaviour of your program in this case.

Question 3 (Total 25 points) .

(i) (5 points) Explain how a binary tree with nodes that are labelled by any data structure whatsoever can be stored and accessed using an array.

(ii) (6 points) Explain when this data structure is likely to be efficient in terms of memory usage, and when it is likely to be inefficient. What better data structure could be used if the tree is inefficient. Explain your alternative approach briefly.

(iii) (10 points) Using the Python language, design a recursive algorithm which compares two binary trees of integers (both stored as arrays), and returns the value TRUE if they have exactly the same tree structure and node labelling, and FALSE otherwise. You may assume that a tree always has at least one node. You may also make the simplifying assumption that both arrays have the same length.

(iv) (4 points) Explain carefully why your algorithm never enters an infinite loop.



Part II (Total 45 points) .

Question 4 (Total 15 points). Recall how the problem of string matching can be solved by building a Knuth-Morris-Pratt (KMP) automaton. For this automaton, the problem of avoiding backtracking in the text to be searched can be solved by just three rules for backtracking in the automaton alone. These rules are normally used to define a vector Next[], they are:
(1) Next[1] = 0
(2) Next[i] = 1 if the search string up to position i has no repetitions
(3) Next[i] = j+1 if for some j < i -1, the last j characters in the search string are the same as the first j characters in the search string.

(i) (4 points) Using rules (1), (2) and (3) only, draw a KMP automaton which searches for the string "11011011" .

(ii) (8 points) Show that your automaton described in part (i) can in fact be further optimised by modifying some of the transition steps. Redraw your optimised version of the automaton. By numbering the states appropriately, list the number of every state which has a transition that can be optimised. For each such state, explain in words which transition from that state can be optimised and why.

(iii) (3 points) Special agent Smith runs your string matching automaton on secret text files containing the digits "0" , ... , "9", and doesn't see much improvement over his old brute force algorithm. Dissappointed, he complains to you the algorithm designer. What would you say to him?

Question 5 (Total 30 points) . Consider the following recursive BNF grammar definition, over the alphabet { a, b }, for a language L:
< L > ::= < P1 > | < P2 >
< P1 > ::= "b" | "b" < P3 > | "a" < P4 >
< P2 > ::= "a" | "a" < P3 > | "b" < P4 >
< P3 > ::= "a" < P2 > | "b" < P1 >
< P4 > ::= "a" < P1 > | "b" < P2 >

(i) (3 points) Explain in words what language of strings of a's and b's the grammar for language L defines.

(ii) (8 points) Draw a finite automaton which accepts exactly the strings in language L. Make sure you clearly mark the start state, all accepting states, and each transition arrow including the alphabet symbol that triggers it.

(iii) (3 points) Can you conclude that the language L is a regular language? If so, how? Motivate your answer.

(iv) (10 points) Define a recursive descent parser for the language L by means of Python code for an appropriate set of parsing functions. You may assume that the input string to be parsed is stored in a queue.

(v) (4 points) In terms of computer memory usage, what advantages does the finite automaton parser for L have over the recursive descent parser for L?

(vi) (2 points) Generally speaking, does a recursive descent parser have any advantages over a finite automaton based parser? Describe two important advantages.