

RunAndExecute-cs19b005-cs19b006-spim

We tried to implement a Non-Inclusive Non-Exclusive (NINE) cache with a write-through policy. This is what we did to add the cache on top of phase 2 of the project in order to complete phase 3.

In phase 2 we built a dynamic 2-dimensional vector that displays the pipelining diagram as a table, assuming that every pipeline stage needs only 1 cycle to complete.

In a real scenario that assumption does not hold, because fetching data from RAM into the processor takes time. So we introduce a cache. We use 2 levels of cache to decrease the average access latency when fetching data, which always corresponds to some address in RAM (main memory). The level 2 cache is larger than the level 1 cache, and the access latency of the L2 cache is always greater than the access latency of the L1 cache.

A cache contains blocks, and every block is a chunk of memory associated with an address generated by the processor. Each block is associated with a tag that stores part of that address. A few blocks are grouped together into a set, and the number of blocks in a set is called the associativity of the cache. The remaining part of the address that is not stored in the tag is split into two parts, the set index and the offset, which are used while accessing the cache and sometimes while inserting a block into it.

There is also a bit associated with each block called the dirty bit. It is normally 0, but if the block is re-written it is set to 1. When such a block is replaced, it can be written back to memory if we follow a write-allocate policy together with a write-back policy.

Now the caches can have different block sizes and different access latencies too.

We use 3 classes: block, set and cache. The important attributes of block are as follows:

  1. tag
  2. participation_num (this is updated on every entry and every access so that we can tell which block is least recently accessed). We initialised a dirty bit here but do not use it, because we follow a write-through policy, so the RAM is automatically updated whenever a change/update is needed. (We do not store the data that is in memory, because we can access it directly in the simulator; in fact access takes only constant time if the address is known, since memory is an array here.)

Attributes of set:

  1. vector of blocks
  2. block_size of the cache
  3. size_keeper (this tracks how many blocks are present in the set; it is initialised to zero. On every new entry of a block into the set, both this and the number of misses are incremented, and whenever it becomes equal to the block_size of the cache we start using the LRU policy, and misses are counted here)

Attributes of cache:

  1. vector of sets
  2. block_size
  3. associativity
  4. num_of_sets in the cache
  5. num_of_blocks in the cache
  6. cache_size
  7. new_participation_num (as we insert every block into the cache, this records which access it was, so that we can tell which block is most and which is least recently used once it is assigned to the participation_num of the block)

Now we make 2 cache objects, L1 and L2, and set all their attributes (access latency, associativity, block size and cache size) from the input files.

At the time of pipelining, especially at the memory stage, we keep a variable run_cycles that says how many cycles are needed to access the data. For the address generated by the processor we search whether its block belongs to L1, to L2, or to both:

  1. If the block is present in L1, then run_cycles = L1_access_latency.
  2. Otherwise we search L2. If the block is present in L2, then run_cycles = L1_access_latency + L2_access_latency, and the block is also updated in L1.
  3. Otherwise we search main memory. There will always be a hit there (unless the memory is accessed wrongly), and the block is then fetched into both the L1 and L2 caches, as it is a NINE cache.

A block is inserted as follows. We calculate the tag, set_index and offset using cache_size, block_size and associativity:

  tag = address / block_size
  set_index = (address % block_size) / number_of_sets_in_cache
  offset = (address % block_size) % number_of_sets_in_cache

where number_of_sets_in_cache = num_of_blocks / associativity and num_of_blocks = cache_size / block_size. If the set corresponding to the address is not full, we place the block there; otherwise we use LRU: we find the least recently used block by comparing the participation_num of each block in the corresponding set, remove it, and place the new block there. In either case we count the cache misses. This is the flow for accessing some data in memory.

When we want to change or update some value in memory, i.e. write to a byte or a chunk, it will obviously take the maximum number of cycles, because we follow a no-write-allocate policy together with a write-through policy, and we access main memory for the change to those byte(s).

So run_cycles = L1_access_latency + L2_access_latency + memory_access_latency. Since the memory stage now takes more cycles, the pipelining table is built with some changes for the next instruction based on the previous ones: if a stage took more cycles, then every cycle until it finishes is marked as the memory stage, and we have to make sure that the next instruction cannot enter that same stage in those cycles until the write-back stage of this instruction has started. Since no work of the stage is done in those cycles, they are counted as stalls in the pipeline.

At the end we print the number of misses, the number of stalls, the instructions that caused stalls in the pipeline, the IPC, and the number of cache misses.

Whenever we find a miss in L1 we increment L1_cache_misses, and on every access we increment L1_cache_accesses; we then calculate the cache miss rate as L1_miss_rate = L1_cache_misses / L1_cache_accesses;

The same is done for L2.

This is how we did it.

Where it went wrong

When we compiled this code we got a kind of compilation error we had not seen before, stating:

error: no matching function for call to 'sets_of_cache::sets_of_cache()'
  109 | { ::new(static_cast<void*>(__p)) _Tp(std::forward<_Args>(__args)...); }
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We thought it was some kind of error due to the class name, so we changed it and tried different names and ran it again, but we saw the same error with every new class name. We then completely rewrote that part of the code, but it did not even take input and returned some garbage value (we use int main() and return 0;).

We believe what we did was correct and would be more than happy to know where it went wrong so we can correct it. Due to personal and health issues we submitted this a bit early. Thank you.
