Giter Club home page Giter Club logo

ticker's Introduction

Ticker Flow Cache

This is another interview question that I was asked and I wanted to fully develop to "see it working".

The problem is about saving streaming data and discarding the Least Recently Used. It is a simpler variation of LRU caching where cache faults don't need to be implemented (if the data has been discarded, instead of retrieving it from a secondary storage as a non-streaming cache would do, the program just returns a null)

The streaming data to be saved is the content of a "ticker tape": a security (e.g., stock) symbol and its corresponding price. The cache will only save the N most recent prices for the M most recently updated securities.

The problem is interesting in that the cache has 2 "levels" with slightly different characteristics. The lower level is composed of prices. When a new price comes in and the cache is full, the oldest one is always discarded, we never preserve an existing price. The higher level is composed of security symbols and re-use can happen, we may get multiple prices for the same security and when this happens the symbol needs to have its most recent use reset, without discarding any other security from the cache.

The lower level is easy to implement with a list of simple queues (FIFO), one list per security. For the higher level I chose a linked list, so that the "remove item in the middle" operation is reasonably easy and doesn't involve memory copying (note that removing from the middle is slower than adding a new item because it requires finding the item, which is an O(n) operation in a linked list). It is also necessary to have a relation between the two levels and I used a dictionary for this, with the security symbol as the key and the lower level index as the value. This dictionary also provides HashSet functionality to determine if a security is present in the cache.

Another design decision to consider is how to handle partially full situations. For the lower level no special consideration is needed, if the queue size would grow beyond its limits we just dequeue an item. For the higher level however, because we need to maintain the relation between higher an lower levels, we would have to handle differently the cases of partially full and "fully primed". I decided to initialize the higher level with dummy data because it simplifies (and makes it faster) the "fully primed" run time at the expense of slower initialization time and a slightly higher memory usage initially. I would expect that the main operation mode of the cache would be "fully primed" however if the expected usage is that the cache would be initialized and destroyed frequently and would operate more often in "not-fully-primed" mode (that is, if the number of securities in the cache is not filled to full capacity) then the code should be changed.

Another decision relates to multi-threaded access. For a cache like this I would generally expect it to be associated to a consolidated stream of data and this code is written with an expectatino of being single threaded. If there are multiple streams of data and Tick() needs to be thread-safe it would be necessary to modify the writes/updates so that they use interlocked writing and updating. Another possible multithreaded scenario would be when reading and writing happens on different threads. If the read happens between the time an item starts being dropped from the cache and when the new item has been added, then it could get incomplete/corrupt data (and locks/interlocked would be necessary to prevent this situation).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.