# TDTS10 Exam Computer Architecture **Jour:** Ahmed Rezine (Tel: 013 28 1938) #### Admitted materials o Dictionary between English and some other language #### • General instructions - o You may answer in Swedish or in English. - o Write clearly, unreadable text will be ignored. - o The exam is for 40 points. **Preliminary** scale: 36 to 40 points correspond to a 5, 31 to 36 points correspond to a 4, 21 to 30 points correspond to a 3. This scale is **Preliminary** and can be changed by the examiner. It is only given as an indication. #### Question 1. (4 points) Suppose the procedure below is called with "a"=4: - If "a" is smaller or equal to one (line 2) then - It returns the value of "a" (line 3) - Otherwise, it returns "a" times the result of calling itself with "a"-1 (line 4). ``` 1. int factorial(int a){ 2. if(a <= 1) 3. return a; 4. return a*factorial(a-1); 5. }</pre> ``` Such code can be translated to MIPS in order to run on a MIPS architecture. Without writing any MIPS code, explain how can such a translation handle the fact that the procedure "factorial" is recursive (i.e., calls itself with possibly different arguments each time)? Explain the involved mechanism with a neat figure. The figure need not be specific to the factorial procedure, but it should explain how recursive procedures are handled in MIPS. #### Question 2. (2 points) Name, and briefly explain, an advantage and a disadvantage of direct mapped caches over fully associative caches. #### Question 3. (3 points) Give a concrete example (i.e., MIPS code and a property of the underlying architecture) of a structural performance hazard in a pipelined architecture. ## Question 4. (2 points) Briefly explain the difference between data pipeline hazards and control pipeline hazards. ## Question 5. (3 points) Give an example of each of the following instruction classes: arithmetic instructions, logical instructions and data transfer instructions in the MIPS assembly language. #### Question 6. (2 points) Why are processes preempted by the OS when they result in a Translation Lookaside Buffer miss? #### Question 7. (2 points) Give two examples of datapath elements: a combinational logic element and a state element. For the following questions, Q8 to Q13, it is mandatory to write couple of lines of explanation to discuss your approach to solve the problem. #### Question 8. (4 points) Consider two different implementations of the same instruction set architecture. There are four classes of instructions, A, B, C, and D. The clock rate and CPI of each implementation are given in the following table. | | Clock Rate | CPI Class A | CPI Class B | CPI Class C | CPI Class D | |----|------------|-------------|-------------|-------------|-------------| | P1 | 2 GHz | 1 | 2 | 3 | 1 | | P2 | 3 GHz | 2 | 1 | 2 | ? | Assume a program with 10% of the instructions belonging to class A, 20% to class B, 40% to class C, and 30% to class D. The program is to be repeatedly run as fast as possible. What is the smallest CPI value for Class D instructions on architecture P2 (marked "?" in the table) that makes executions of the assumed program in average faster on P1 than on P2? #### Question 9. (4 points) Consider the MIPS assembly instructions below. Assume "\$a0" contains some natural value in {0, 1, 2, ...}. - (I) What is the value stored in "\$v0" at the end of the loop (i.e., line 6) if "\$a0" contained 2 at the beginning of the loop (i.e., line 1)? (1 point) - (II) Same question as above if "\$a0" contained 10 instead of 2 at the beginning of the loop? (3 points) ``` 1. move $t0, $zero 2. move $v0, $zero 3. loop: add $v0, $v0, $t0 4. addi $t0, $t0, 1 5. bne $t0, $a0, loop 6. ``` ## Question 10. (2 points) Convert the numbers -11 and 7 to 16-bit binary numbers (use 2's complement for negative numbers). Perform the addition -11 + 7 in binary. #### Question 11. (4 points) Consider the datapath shown in Figure 1. - (i) Explain in detail how a "sub \$d, \$s, \$t" instruction (computing \$d = \$s \$t) is executed in the datapath. Mention the involved parts and how they contribute to the execution. - (ii) Explain in detail how a "lw \$t, C(\$s)" instruction (where C is some immediate) is executed in the datapath. Mention the involved parts and how they contribute to the execution. # Question 12. (3 points) Draw the pipelined execution corresponding to the code below. Recall that five stages were involved in the lectures: instruction fetch, register read, ALU operation, memory access, and register write. You should not assume forwarding (aka. bypassing) in your answer. Different caches are used for instructions and data. Identify possible data-hazards in this code. Can you reschedule the instructions in order to decrease the total number of required cycles while maintaining the same result after execution? What is the number of required cycles before and after rescheduling? lw \$t1, 0(\$t0) lw \$t2, 4(\$t0) addi \$t3, \$t2, 2 add \$t4, \$t4, \$t4 ## Question 13. (3 points) Assume a direct mapped cache using 6 bits per block and 14 bits per tag. - (i) What is the size of each block? What is the largest number of blocks this cache can contain? (1 point) - (ii) What is the index of the block of the byte at address 133? (1 point) - (iii) Suppose the tag of the block of the byte at address 133 is "tg". What is the address of the byte with the same offset and index as the byte at address 133 but with a tag equal to "tg+1"? (1 point)