Vector instruction extensions

COMPUTER SCIENCE TRIPOS Part II – 2022 – Paper 9
Advanced Computer Architecture (rdm34)
(a) Vector instruction extensions are added to a small 32-bit microcontroller. The
vector length is 128 bits. The register bank in the processor’s floating-point
unit (32 × 32-bit single-precision registers) is reused for vector processing and
eight 128-bit vector registers alias onto it. The processor can only issue a single
instruction per cycle. It has a 32-bit wide memory datapath and a single 32-bit
multiplier.
(i) How can adding vector instruction extensions allow us to make more
efficient use of the microcontroller’s memory datapath and multiplier?
[2 marks]
(ii) What is the advantage of allowing many vector instructions to be able to
access both vector registers and registers in the scalar register file?
[2 marks]
(iii) Imagine two vector instructions are executing when the first (earlier)
instruction causes an exception late in its execution. Describe two different
ways in which precise exceptions could be implemented. [5 marks]
(iv) Describe one way in which a vector instruction-set extension may efficiently
handle cases where the number of elements we wish to process is not an
exact multiple of the maximum vector length supported in hardware.
[3 marks]
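
Note on part (a)(iv): one widely used technique for this situation is strip-mining with a programmable vector length, where each loop iteration asks the hardware how many elements it may process. The sketch below is a plain Python model of that idea, not a model answer. It assumes four 32-bit elements per 128-bit vector (taken from the parameters above); `set_vl` and `vector_scale` are hypothetical names standing in for a "set vector length" instruction and a vectorised loop.

```python
MAX_VL = 4  # assumption: 128-bit vector register / 32-bit elements


def set_vl(remaining):
    """Hypothetical 'set vector length' instruction.

    Returns how many elements the next vector instruction processes:
    the full hardware length, or fewer for the final partial strip.
    """
    return min(remaining, MAX_VL)


def vector_scale(data, factor):
    """Multiply every element by factor, one strip of up to MAX_VL at a time."""
    out = []
    strip_lengths = []  # recorded only to show how the loop was divided
    i = 0
    while i < len(data):
        vl = set_vl(len(data) - i)
        # Models a single vector multiply acting on vl elements.
        out.extend(x * factor for x in data[i:i + vl])
        strip_lengths.append(vl)
        i += vl
    return out, strip_lengths


result, strips = vector_scale(list(range(10)), 2)
print(strips)  # [4, 4, 2]
```

With ten elements the loop runs two full strips and one strip of two, so no separate scalar clean-up loop is needed. Per-element predication (masking) is another option that this sketch does not model.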
(b) Imagine a 64KiB 2-way set-associative L1 data cache with a block size of 32
bytes. The cache is Virtually Indexed Physically Tagged (VIPT). The processor
has a private L2 cache which is inclusive. Virtual memory uses 4KiB pages.
(i) What problem must be overcome to ensure correctness? [2 marks]
(ii) How could the problem be detected by storing a few bits of the virtual page
number with each line of the processor’s private inclusive L2 cache?
[4 marks]
(iii) What is the minimum associativity the L1 cache must have to completely
avoid the problem? [2 marks]
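
Note on part (b): the arithmetic behind the cache geometry can be checked with a short calculation. The sketch below is not an official solution. It assumes all sizes are powers of two, and `vipt_geometry` is a hypothetical helper name. It reports how many index bits lie above the page offset, and the smallest associativity for which a single way is no larger than a page.

```python
def vipt_geometry(cache_bytes, ways, block_bytes, page_bytes):
    """Compute index/offset bit counts for a VIPT cache (power-of-two sizes assumed)."""
    sets = cache_bytes // (ways * block_bytes)
    offset_bits = block_bytes.bit_length() - 1       # log2(block size)
    index_bits = sets.bit_length() - 1               # log2(number of sets)
    page_offset_bits = page_bytes.bit_length() - 1   # log2(page size)
    # Index bits taken from the virtual page number rather than the page offset.
    virtual_index_bits = max(0, offset_bits + index_bits - page_offset_bits)
    # Smallest associativity for which one way fits within a page.
    min_ways = max(1, cache_bytes // page_bytes)
    return {
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "virtual_index_bits": virtual_index_bits,
        "min_ways": min_ways,
    }


# Parameters from part (b): 64 KiB, 2-way, 32-byte blocks, 4 KiB pages.
g = vipt_geometry(64 * 1024, 2, 32, 4 * 1024)
print(g)
# {'sets': 1024, 'offset_bits': 5, 'index_bits': 10, 'virtual_index_bits': 3, 'min_ways': 16}
```

Each way here is 32 KiB, so the index and block offset together span 15 address bits while the page offset covers only 12.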