## TEKNISKA HÖGSKOLAN I LINKÖPING

Institutionen för datavetenskap
Zebo Peng

## Examination on **TDTS08** Advanced Computer Architecture (Datorarkitektur)

2009-12-19, 14:00 to 18:00

Supporting material: English dictionary.

**Points:** Maximum points: 40. You need 21 points to pass the exam.

Teacher on duty: Zebo Peng, tel. 070 258 2067

Note: You can give the answers in English or Swedish.

1. a) What are the advantages and disadvantages of having a cache?
   - b) For fully-associative mapping, a main memory address is viewed as consisting of two fields. Define these two fields, and explain how they are used.
   - c) What are the differences between fully-associative mapping and set-associative mapping? Why is set-associative mapping usually used? (3p)

2. The following sequence of virtual page numbers is encountered in the course of execution on a computer with virtual memory:

   53434236471216

   Assume that the least-recently used (LRU) page replacement policy is used. Assume also that the main memory has four page frames and is initially empty. How many page misses will occur during this execution? Which virtual pages are in the main memory when this execution finishes? (3p)

3. a) What is a data hazard in a pipelined unit? Illustrate this problem with an example and show how penalties are produced (consider a 6-stage pipeline).
   - b) How can this penalty be reduced with the forwarding (bypassing) technique? Draw figures to illustrate the pipelined executions without and with forwarding. (3p)

4. Why is the pipeline technique much more efficient in a RISC computer than in a CISC computer? List all the features of a RISC architecture that contribute to the efficient operation of a pipeline. List also the features of a CISC architecture that make pipelined execution of instructions difficult. (3p)

5. a) What are the most essential characteristics of a superscalar architecture?
   - b) Explain the following two policies for instruction execution:
     - in-order issue with in-order completion, and
     - out-of-order issue with out-of-order completion.
   - c) Why is the window of execution an important mechanism for a superscalar architecture? What are the parameters that influence the selection of the size of the window of execution?

6. a) Define the concepts of output dependency and anti-dependency.
   - b) Give an example with output dependency and another one with anti-dependency. Show how they can be removed by register renaming. (3p)

7. An instruction bundle for the IA-64 architecture includes a template field, which is used to specify the execution pattern of the operations. Discuss how this field is used, and all the advantages of using this template. (3p)

8. a) What is meant by branch predication (as implemented in the Itanium machine)? How does it work? Illustrate the technique with an example.
   - b) What are the differences between branch predication and branch prediction? What are their respective advantages and disadvantages? (3p)

9. What is Amdahl's law? What is the important message from Amdahl's law? (2p)

10. a) What is a non-uniform memory access (NUMA) system? Draw a picture of a typical NUMA organization. Use the picture to illustrate the important concepts and components of such a system.
    - b) What are the advantages and disadvantages of NUMA? (3p)

11. a) What is a vector processor? Draw the block diagram of a typical vector processor architecture.
    - b) What is the role of the mask register in a vector unit? Give an example to illustrate the use of the mask register. (3p)

12. a) There are two basic approaches to the snoopy protocol: write-invalidate and write-update. How do they work?
    - b) Describe the situation in which the write-invalidate approach works better, and the situation in which the write-update approach works better.
    - c) Both these approaches suffer from false sharing overheads. What is meant by false sharing here? (4p)

13. Describe the different mechanisms (techniques) for the LongRun power management in the Crusoe processor. How do these mechanisms contribute to the reduction of power consumption? (3p)

Exam for TDTS 55, Advanced Computer Architecture, 2009-12-19
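
The LRU policy named in question 2 can be checked mechanically. The following is a minimal sketch, not part of the exam: the function name `simulate_lru`, its parameters, and the short reference string are hypothetical and chosen only for illustration. The example deliberately uses a different sequence and three frames rather than the exam's data.

```python
from collections import OrderedDict


def simulate_lru(references, num_frames):
    """Count page misses under LRU replacement (hypothetical helper).

    Returns the number of misses and the resident pages, ordered from
    least recently used to most recently used.
    """
    frames = OrderedDict()  # keys run from least to most recently used
    misses = 0
    for page in references:
        if page in frames:
            # Hit: mark the page as most recently used.
            frames.move_to_end(page)
        else:
            # Miss: evict the least recently used page if memory is full.
            misses += 1
            if len(frames) >= num_frames:
                frames.popitem(last=False)
            frames[page] = True
    return misses, list(frames)


# Illustrative reference string (an assumption, not the exam's sequence).
misses, resident = simulate_lru([1, 2, 3, 1, 4, 2], num_frames=3)
print(misses, resident)  # prints: 5 [1, 4, 2]
```

In this example the second reference to page 1 is the only hit. It makes page 2 the least recently used page, so page 2 is evicted when page 4 arrives and must be reloaded on the final reference.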