### LINKÖPINGS TEKNISKA HÖGSKOLA

Institutionen för datavetenskap

Petru Eles

## Examination in the course
## Computer Architecture (Datorarkitektur) - TDDI03

2019-01-16, 14:00-18:00

**Supporting material:** English dictionary.

**Points:** Maximum points: 40. To pass the exam you need a total of at least 21 points.

**Teacher on duty:** Petru Eles, 013281396

Good luck!

# Examination in the course Computer Architecture - TDDI03, 2019-01-16, 14:00-18:00

You may write in Swedish or English!

1.
   - a)
   - b)

   (3p)

2. A four-way set-associative cache has lines of 32 ($= 2^5$) bytes and a total size of 16 Kbytes ($= 2^{14}$). The main memory has a size of 64 Mbytes ($= 2^{26}$). Show the format of main memory addresses (give the total number of address bits, what the different groups of address bits indicate, and how long each of these groups is). (3p)

3.
   - a) What is the role of the page table in a virtual memory system? What data does it store?
   - b) The page table is very large, usually too large to be stored in main memory. Such a large size, at the same time, makes access to the page table very slow. How is this solved in current microprocessor architectures?

   (3p)

4.
   - a) Consider a pipelined processor with $k$ pipeline stages. What is the theoretical acceleration (ignoring overheads) for a sequence of $n$ instructions, compared to a similar but non-pipelined processor? Show how you obtain the formula!
   - b) What is the acceleration for a sequence of 75 instructions if the number of pipeline stages is 14?
   - c) What is the acceleration for an infinitely long sequence if the number of pipeline stages is 14?

   (3p)

5. Dynamic branch prediction with a two-bit scheme. How does it work? (2p)

6. Enumerate five of the main characteristics of RISC architectures. (2p)

7. Branch history table: what does it contain and how is it used? (2p)

8.
   - a) What is a superscalar architecture?
   - b) Draw a block diagram of a superscalar unit.

   (3p)

9. Consider the following sequence of machine instructions:

   - 1. R4 <- R10 + R1
   - 2. R7 <- R5 \* 10
   - 3. R11 <- R4 - R0
   - 4. R3 <- R11 - R16
   - 5. R10 <- R7 + 101
   - 6. R9 <- R3 + R12
   - 7. R8 <- R6 \* R0
   - 8. R2 <- R3 \* R7
   - 9. R13 <- R17 + 2
   - 10. R5 <- R12 \* R22
   - 11. R15 <- R5 + 3

   - a) Indicate the data dependencies among the instructions.
   - b) Consider a superscalar computer on which the execution of each instruction takes one cycle; the computer has two units that execute addition and subtraction and one unit for multiplication. Produce a table, according to the model below, showing how the instructions are executed in consecutive cycles, assuming in-order execution.
   - c) Rename the registers in the above sequence to prevent, where possible, dependency problems. Produce a second table, according to the model below, showing how the instructions (after register renaming) are executed in consecutive cycles, assuming out-of-order execution.

   In the table cells, indicate the sequence number (1, 2, ..., 11) of the instruction executed in the corresponding cycle on the respective unit.

   |         | ADD/SUB | ADD/SUB | MUL |
   |---------|---------|---------|-----|
   | Cycle 1 |         |         |     |
   | Cycle 2 |         |         |     |
   | Cycle 3 |         |         |     |
   | ...     |         |         |     |

10. Compare VLIW architectures with superscalar architectures:
    - a) Show the similarities and differences.
    - b) Show the advantages and disadvantages of the two approaches.
    - c) Why does a superscalar computer consume more power than a VLIW computer?

    (4p)

11. What is loop unrolling? How does it work? Why is it important for VLIW architectures? (2p)

12. Consider an instruction sequence such that 30% of the computation has to be executed sequentially, while the rest is executed with full parallelism on 8 processors. Calculate the expected speedup and efficiency. (3p)

13.
    - a) What is hardware multithreading?
    - b) Why do multithreaded processors provide higher performance?
    - c) We have described three approaches to multithreading: interleaved, blocked, and simultaneous. What is the main characteristic of each of them?

    (3p)

14. What is a vector processor? Draw a block diagram. What is the role of the mask register? What is the basic difference between array processors and vector processors? (3p)