Giter Club home page Giter Club logo

lineserver's Introduction

/***************************LineServer****************************/
 Will Childs-Klein
 January 2013

 The server is started by running the run.sh script with the name
 	of the file to serve as the singular argument. To start the 
 	client, the run.sh script is run without any additional 
 	arguments. The system works across a network as well as locally.

 My system works by reading the file to be served line by line and
 	dividing it into separate files for each line. When a particular
 	line is requested, the appropriate file is read into memory and 
 	transferred over a network Socket.

 The system should perform well as the number of requests increases 
 	per second because the data file is already parsed, and the use
 	of Buffered I/O reduces low-level OS interaction, hopefully
 	increasing speed (according to Oracle docs). But, because my
 	system uses Java's file IO instead of a database, as file and
 	line size grow, performance will suffer greatly. 

 Currently, The system handles large data files well as long as each 
 	line is below ~100MB. Java's heap allocation can't handle that 
 	much data. Editing the code to not store lines as Strings would 
 	remedy this, as well as reading and transferring the data byte by
 	byte over the socket, possibly at the expense of read/transfer 
 	speed. Perhaps using StringBuilder, or some other such class
 	designed for manipulating large strings would be better.

 I primarily used Oracle's documentation on java.net (Sockets) and 
 	java.io (Readers/Writers/other IO miscellany) as reference
 	material. I consulted forums such as StackOverflow to deal with
 	various Exceptions as they arose.

 I attempted to use a java implementation of leveldb, a No-SQL 
 	key/value store originally made by Google in lieu of the flat file 
 	system I ended up implementing, but I ran into a lot of issues
 	getting it to build correctly, and ran out of time. Ideally, 
 	I would use key/value because it is perfect for this scenario:
 	there are no joins, no writes (other than the initial file parsing),
 	and Key/Value stores provide constant time access.


ISSUES:
	* When the client exits their program incorrectly or abruptly (by 
	sending an interrupt signal, for example), the server catches a 
	null pointer acception and fails. This is due to a readLine() being
	called on an object that now has null value, because the connection
	was not properly closed.
	* This software will not perform at scale. It does not utilize other
	drives (for more space on disk), it uses normal String types (depleting
	Java's heap space quickly), and the transfer over the network is line
	by line (the size of the line to be transferred is therefore constrained
	by the size of the buffer).
	* No multithreading. In order to serve multiple clients at once, I would
	have had to implement multithreading, spinning off a new thread for each
	new client, which I did not have time to do.
	* No test scripts. Given more time, I would have written a test script
	in either bash or python testing the performance of LineServer. I would
	test performance in two main dimensions: query rate and file size (as well as other
	edge cases like robustness/error handling).

No Third party software was used in the final version of this program.

All in all, including a fair amount of research and outlining, this
	assignment took me between 10 and 15 hours to implement. I apologize
	for the lack of completeness and thoroughness, but I have been 
	very time-constrained recently.

lineserver's People

Contributors

willchilds-klein avatar

Watchers

 avatar  avatar  avatar

Forkers

tareksfouda

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.