phil8192 / ob-analytics
R package intended for visualisation, analysis and reconstruction of limit order book data
License: Other
Bitstamp provides APIs for Live Orders and Live Ticker (i.e. trades). The Live Orders API sometimes outputs incorrect data, as in the case described below.
Here is the trade record received from the Live Ticker API:
The record contains the price and amount of the trade as well as the buy and sell order IDs.
Here are the records from the Live Orders API for the orders that participated in the above trade:
(Both screenshots are taken from a PostgreSQL database where the data received from the APIs are saved.)
Row 6 is correct and corresponds to sell order 2269748432: its 'fill' column equals the trade amount, the timing is consistent, and so on.
Row 5 corresponds to the buy order and is incorrect: it effectively reports a cancellation of the buy order, since the 'amount' column did not change from the previous event for that order. That is equivalent to a zero 'fill', i.e. no trade.
Thus the processData() function should probably handle the data from both APIs together to produce reliable results.
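One way to cross-check the two feeds, as a rough sketch rather than a change to processData(): look up the order events for the buy and sell side of each trade and flag the trades where one side has no event reporting a fill equal to the trade amount. The data frames, column names (trade.id, buy.order.id, sell.order.id, order.id, fill), the helper has_matching_fill and all values below are hypothetical toy data, not the package's or Bitstamp's actual schema.

```r
# Toy data (hypothetical): one trade and the events received for its two orders.
trades <- data.frame(
  trade.id      = 1,
  amount        = 0.5,
  buy.order.id  = 1001,
  sell.order.id = 1002
)
order.events <- data.frame(
  order.id = c(1001, 1002),
  fill     = c(0.0, 0.5)  # the buy side reports no fill, as in the report above
)

# TRUE if any event for the given order reports a fill equal to `amount`
# (compared with a small tolerance because the values are doubles).
has_matching_fill <- function(order.id, amount, events, tol = 1e-9) {
  fills <- events$fill[events$order.id == order.id]
  any(abs(fills - amount) <= tol)
}

# Check both sides of every trade against the order events.
trades$buy.ok <- mapply(has_matching_fill,
                        trades$buy.order.id, trades$amount,
                        MoreArgs = list(events = order.events))
trades$sell.ok <- mapply(has_matching_fill,
                         trades$sell.order.id, trades$amount,
                         MoreArgs = list(events = order.events))

# Trades where at least one side lacks a matching fill need manual review.
suspect <- trades[!(trades$buy.ok & trades$sell.ok), ]
```

With the toy data, the single trade ends up in suspect because only the sell side carries a matching fill.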
no pacman orders were found "replacement has 1 row, data has 0"
Error: all(orders$timestamp %in% depth.summary$timestamp) is not TRUE
In addition: Warning messages:
1: In removeDuplicates(events) :
removed 3 duplicate order cancellations: 195347950 195547832 195561334
2: In matchTrades(events) :
03/10/17 : 226 jumps > $10 (swaping makers with takers)
3: In setOrderTypes(events, trades) : could not identify 6213 orders
4: In processData(paste0("csv/", day, ".csv")) :
removed 1 duplicated updates
Execution halted
Very common operation
The following code does not work reliably when 'volume' is a double:
src$fill %in% dst$fill
At the very least, the documentation should mention that 'volume' must be an integer.
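To illustrate why exact matching on doubles is fragile, here is a small self-contained sketch with two possible workarounds. The vectors, the helper names to_units and in_tol, the scale of 1e8 and the tolerance of 1e-9 are all assumptions chosen for the example, not values taken from the package.

```r
# Doubles produced by arithmetic need not equal the "same" literal value.
src_fill <- c(0.1 + 0.2, 0.5)
dst_fill <- c(0.3, 0.5)

# Exact matching misses the first element.
exact <- src_fill %in% dst_fill

# Workaround 1: scale to whole units before matching
# (1e8 assumes at most 8 decimal places; adjust to the data).
to_units <- function(x, scale = 1e8) round(x * scale)
scaled <- to_units(src_fill) %in% to_units(dst_fill)

# Workaround 2: membership test with an absolute tolerance.
in_tol <- function(x, table, tol = 1e-9) {
  vapply(x, function(v) any(abs(v - table) <= tol), logical(1))
}
approx <- in_tol(src_fill, dst_fill)
```

Here exact is FALSE for the first element, while scaled and approx are TRUE for both. Storing volumes as whole units from the start avoids the problem entirely.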
First, thank you for such a useful package! I am not familiar with Bitstamp data, so I have this question:
I assume extdata/2015-05-01.log.xz is the raw market data recording, and it already contains trades. Why does the package spend time inferring the trades from orders?
The inferred trade count is 482. parse.sh explicitly filters out trades and generates orders.csv. The log has 575 trades, which differs from the number of inferred trades. Exchanges usually provide individual orders and trades directly, so why infer trades from orders?
discretised - summary statistics
depthMetrics in depth.R is ridiculously slow and needs rethinking.
https://www.iextrading.com/trading/alerts/2017/011/
Recently made available by IEX: the full depth of their order book.
E.g., event patterns.
Needs some work: does not look good over a wide time period. Maybe also link "strategic run" events...
head -1 x/2017-03-10.csv >csv/2017-03-10.csv
tail -100000 x/2017-03-10.csv >>csv/2017-03-10.csv
processing 2017-03-10.csv...
Error in if (abs(this.jump$price - prev.jump$price) > 10) { :
argument is of length zero
Calls: processData -> matchTrades
In addition: Warning messages:
1: In removeDuplicates(events) :
removed 2 duplicate order cancellations: 195547832 195561334
2: In matchTrades(events) :
03/10/17 : 117 jumps > $10 (swaping makers with takers)
Execution halted
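The error above says that the if condition evaluated to a zero-length vector, which happens when one of the compared rows is empty. A minimal sketch of a guarded comparison follows. The helper price_jump_exceeds and the toy jumps data frame are hypothetical and not part of the package; the threshold of 10 is taken from the condition shown in the error message.

```r
# Returns TRUE only when both rows yield exactly one non-missing price
# and the absolute difference exceeds the threshold; otherwise FALSE.
price_jump_exceeds <- function(this.jump, prev.jump, threshold = 10) {
  diff <- abs(this.jump$price - prev.jump$price)
  length(diff) == 1 && !is.na(diff) && diff > threshold
}

# Toy data (hypothetical): two jumps and an empty selection.
jumps <- data.frame(event.id = c(1, 2), price = c(230.00, 206.99))
empty <- jumps[0, ]

big.jump  <- price_jump_exceeds(jumps[2, ], jumps[1, ])
no.prev   <- price_jump_exceeds(jumps[2, ], empty)
```

With the guard, the comparison against an empty row yields FALSE instead of stopping with "argument is of length zero". Whether skipping such cases is the right behaviour for matchTrades is a separate question.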
"strategic runs"
Below is an excerpt from "Example limit order book data" (lob.data$events) showing 'changed' events for 'market' orders that do not have a corresponding limit order event (matching.event equals NA):
It appears that the trades corresponding to the above events are missing from lob.data$trades. The total number of 'market' and 'market-limit' events with matching.event equal to NA is 43, while the number of trades in lob.data$trades is 482. Thus approximately 8% of trades are missing?
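A count like the one above can be obtained along these lines. The sketch uses a small hypothetical events data frame with only the two columns it needs (type and matching.event, as named in the text); the "other" type is a placeholder. The share at the end assumes that each unmatched event corresponds to exactly one missing trade, which is an assumption rather than something the data confirms.

```r
# Toy events (hypothetical values); only the columns used below are included.
events <- data.frame(
  type           = c("market", "market-limit", "other", "market", "other"),
  matching.event = c(NA, 12, NA, 15, NA)
)

# Aggressive order events that were never paired with a limit order event.
is.taker  <- events$type %in% c("market", "market-limit")
unmatched <- sum(is.taker & is.na(events$matching.event))

# Share of missing trades using the figures quoted in the text:
# 43 unmatched events against 482 trades that were found.
missing.share <- 43 / (482 + 43)
```

With the figures from the text, missing.share comes out at roughly 0.08.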
An order that started out as a market order, came to rest in the order book, and has not been filled since then is classified as 'market' instead of 'market-limit'.
The matchTrades function in trades.R sometimes returns an incorrect trade price:
trades[2794:2797, ]
                   timestamp  price    volume direction maker.event.id taker.event.id    maker    taker
2799 2015-08-28 15:46:43.698 230.00 977822452       buy         247125         248210 80395773 80396310
2800 2015-08-28 15:46:43.736 230.03  22177548       buy         247037         248211 80395734 80396310
2804 2015-08-28 15:46:50.864 206.99   3000000       buy         248231         248240 80396320 80396322
2788 2015-08-28 15:46:50.894 229.39  36881731      sell         247643         248232 80396032 80396320
This is due to misclassification of the maker order:
events[events$id == 80396320, c("id", "event.id", "price", "volume", "direction", "type")]
             id event.id  price    volume direction         type
247732 80396320   248230 206.99 120000000       ask market-limit
247734 80396320   248231 206.99 117000000       ask market-limit
247736 80396320   248232 206.99  80118269       ask market-limit
247738 80396320   248233 206.99  76501530       ask market-limit
247739 80396320   248234 206.99  11094362       ask market-limit
247742 80396320   248235 206.99         0       ask market-limit
processing 2017-07-05.csv...
Warning messages:
1: In matchTrades(events) :
07/05/17 : 1 jumps > $10 (swaping makers with takers)
2: In deleted[deleted$id %in% created.deleted.ids, ]$volume == created[created$id %in% :
longer object length is not a multiple of shorter object length
3: In setOrderTypes(events, trades) : could not identify 534784 orders
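Warning 2 above indicates that the two sides of the == comparison have different lengths, so R recycles the shorter one and the element-wise result may pair up unrelated orders. One possible way to avoid this, shown as a sketch with hypothetical toy data, is to align both sides on the shared ids before comparing. The data frames reuse the names created and deleted from the warning, but their contents and the assumption that ids are unique within each frame are illustrative only.

```r
# Toy data (hypothetical): ids appear in a different order and only partly overlap.
created <- data.frame(id = c(1, 2, 3), volume = c(10, 20, 30))
deleted <- data.frame(id = c(3, 1),    volume = c(30, 15))

# Ids present on both sides, in the order they appear in `created`.
common <- intersect(created$id, deleted$id)

# Pull the volumes for exactly those ids, in the same order on both sides.
created.vol <- created$volume[match(common, created$id)]
deleted.vol <- deleted$volume[match(common, deleted$id)]

# Both vectors now have equal length, so no recycling takes place.
same.volume <- created.vol == deleted.vol
names(same.volume) <- common
```

In the toy data, order 1 has a different volume at deletion than at creation, while order 3 has the same volume.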