bigdatagenomics / mango
A scalable genome browser. Apache 2 licensed.
License: Apache License 2.0
When running mango on localhost, a significant share of the request time for large files is spent downloading the JSON created by the Scalatra servlet. The JSON can get quite large (100 MB+).
Find some way to eliminate this latency, at least on localhost, perhaps by writing the JSON to a working directory on disk and reading that file in from the frontend.
Currently defaults to a region between 0 and 100
Reads aligned to the forward strand should have an arrow pointing to the left from the end of the read, and reads aligned to the reverse strand should have an arrow pointing to the right from the start of the read.
To show more than just hovering over. Note that the information contained is selected by the projection in VizReads.scala. This issue may involve modifying the "print methods" such as printVariationJson and the case classes for the corresponding Json object to send over to the frontend.
ReferenceRegion places extra overhead when creating TrackedLayout. This overhead includes creating a ReferenceRegion for each record, and projecting the contig field of a record. Directly accessing the start and end fields in a record can possibly reduce this overhead.
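As a rough illustration of the idea, the sketch below assigns records to tracks using only their start and end fields, without building a region object per record. It is not the TrackedLayout implementation: the Rec type, the TrackSketch object, the greedy strategy, and the half-open interval convention are all assumptions made for the example.

```scala
// Hypothetical record type: only the two fields the layout needs.
final case class Rec(start: Long, end: Long)

object TrackSketch {
  // Greedy layout: pairs each record with the index of the track it lands on.
  // Intervals are treated as half-open [start, end), which is an assumption.
  def assignTracks(records: Seq[Rec]): Seq[(Rec, Int)] = {
    // trackEnds(i) holds the end of the last record placed on track i.
    val trackEnds = scala.collection.mutable.ArrayBuffer.empty[Long]
    records.toVector.sortBy(_.start).map { rec =>
      // First track whose last record ends at or before this record's start.
      val idx = trackEnds.indexWhere(_ <= rec.start)
      if (idx >= 0) {
        trackEnds(idx) = rec.end
        (rec, idx)
      } else {
        // No free track: open a new one.
        trackEnds += rec.end
        (rec, trackEnds.length - 1)
      }
    }
  }
}
```

Because the comparison reads two Long fields directly, no per-record object is allocated beyond the output pair. Whether this removes the overhead observed in practice would need to be measured.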
Ensure correct user input, and give feedback
For example, in the overall view, if the reference request errors out, the following requests will not be carried out due to the following error:
java.lang.AssertionError: assertion failed: Timer name from on top of stack [/GET reference(55,false)/GET features(0,false)/GET reads(0,false)/collect at VizReads.scala:424(58,true)] did not match passed-in timer name [GET features]
Currently a new RDD is created on each request to get the reference.
To solve this, we should support reference files in LazyMaterialization or keep a global RDD[NucleotideContigFragment] in VizReads
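The second option (keeping one global value) could look roughly like the sketch below. It uses a plain lazy val rather than Spark, so the load function is a stand-in for whatever builds the reference RDD; Cached and CachedDemo are hypothetical names. With a real RDD, the held value would presumably also need to be persisted (for example with cache()) for the data itself to stay in memory.

```scala
// Generic holder: `load` stands in for whatever builds the reference RDD.
final class Cached[A](load: () => A) {
  // Evaluated on first access only; later accesses reuse the stored result.
  lazy val value: A = load()
}

object CachedDemo {
  // Reads the cached value twice and reports how many times `load` ran.
  def run(): Int = {
    var loads = 0
    val reference = new Cached(() => { loads += 1; "ACGT" })
    val first = reference.value
    val second = reference.value
    assert(first == second)
    loads
  }
}
```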
When loading in two reads files with the same sampleName, the visualization for the second sample won't display.
If you try to visualize a very high coverage region on the overall page, you can run into funny issues where the non-read data isn't displayed. @erictu I have a set of files that reproduces this bug; they're pretty small so I'll go ahead and tar them up and send them to you.
Ideally, command line just boots up the webapp, and all further activity is performed through the webapp. This allows loading different files (reads, etc.) without quitting and relaunching the applications.
Users should be able to choose not to provide reads/variants/etc. The reference should be required IMO, but otherwise the user shouldn't be forced to provide one of every input.
If you resize the reads page and mismatches are visible, the mismatches disappear although they are selected in the view menu.
Only works on 1.2.1
/reads and /overall load in an RDD to calculate the number of tracks. This loading is done again when issuing a GET request to /reads/:ref that actually gets the Json information to render. Find some way to eliminate the initial loading and track calculation, as this is redundant.
I understand that Mango is still in very early stages.
I was curious about it and wanted to see how it works.
I tried building it on a linux machine (ubuntu).
I was able to start the server, but when I go to http://localhost:8080, I see an error.
Any idea on what I am doing wrong?
java.lang.NoSuchMethodError: javax.servlet.http.HttpServletResponse.getStatus()I
at org.scalatra.servlet.RichResponse.status(RichResponse.scala:16)
at org.scalatra.ScalatraContext$class.status(ScalatraContext.scala:29)
at org.scalatra.ScalatraServlet.status(ScalatraServlet.scala:49)
at org.scalatra.ScalatraBase$class.runActions$1(ScalatraBase.scala:165)
at org.scalatra.ScalatraBase$$anonfun$executeRoutes$1.apply$mcV$sp(ScalatraBase.scala:175)
at org.scalatra.ScalatraBase$$anonfun$executeRoutes$1.apply(ScalatraBase.scala:175)
at org.scalatra.ScalatraBase$$anonfun$executeRoutes$1.apply(ScalatraBase.scala:175)
at org.scalatra.ScalatraBase$class.org$scalatra$ScalatraBase$$cradleHalt(ScalatraBase.scala:193)
at org.scalatra.ScalatraBase$class.executeRoutes(ScalatraBase.scala:175)
at org.scalatra.ScalatraServlet.executeRoutes(ScalatraServlet.scala:49)
at org.scalatra.ScalatraBase$$anonfun$handle$1.apply$mcV$sp(ScalatraBase.scala:113)
at org.scalatra.ScalatraBase$$anonfun$handle$1.apply(ScalatraBase.scala:113)
at org.scalatra.ScalatraBase$$anonfun$handle$1.apply(ScalatraBase.scala:113)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.scalatra.DynamicScope$class.withResponse(DynamicScope.scala:80)
at org.scalatra.ScalatraServlet.withResponse(ScalatraServlet.scala:49)
at org.scalatra.DynamicScope$$anonfun$withRequestResponse$1.apply(DynamicScope.scala:60)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.scalatra.DynamicScope$class.withRequest(DynamicScope.scala:71)
at org.scalatra.ScalatraServlet.withRequest(ScalatraServlet.scala:49)
at org.scalatra.DynamicScope$class.withRequestResponse(DynamicScope.scala:59)
at org.scalatra.ScalatraServlet.withRequestResponse(ScalatraServlet.scala:49)
at org.scalatra.ScalatraBase$class.handle(ScalatraBase.scala:111)
at org.scalatra.ScalatraServlet.org$scalatra$servlet$ServletBase$$super$handle(ScalatraServlet.scala:49)
at org.scalatra.servlet.ServletBase$class.handle(ServletBase.scala:43)
at org.scalatra.ScalatraServlet.handle(ScalatraServlet.scala:49)
at org.scalatra.ScalatraServlet.service(ScalatraServlet.scala:54)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Thanks,
Nikhil
So Spark jobs run concurrently, not sequentially.
3 Part Hierarchy (Top down)
After #47 is resolved.
Currently defaults to 0-100 despite traversing a different region. Perhaps add in a box?
To allow easier view of variants. Could be applied to other features.
Eg. chr20: 90000-95000
Currently the chromosome is specified at the command line, and the region needs to be entered in the start and end boxes.
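A single region box would need to parse strings such as the one above and report bad input back to the user. The following is a minimal sketch of such a parser, not code from mango: Region, RegionParser, the accepted syntax, and the error messages are all assumptions.

```scala
import scala.util.Try

// Hypothetical result type for a parsed region.
final case class Region(contig: String, start: Long, end: Long)

object RegionParser {
  // Accepts "contig:start-end", allowing whitespace around the separators.
  private val Pattern = """\s*([^:\s]+)\s*:\s*(\d+)\s*-\s*(\d+)\s*""".r

  // Returns the parsed region, or a message suitable for user feedback.
  def parse(input: String): Either[String, Region] = input match {
    case Pattern(contig, startStr, endStr) =>
      // toLong can still fail if the digits overflow a Long.
      Try(Region(contig, startStr.toLong, endStr.toLong)).toOption match {
        case Some(r) if r.start < r.end => Right(r)
        case Some(_) => Left(s"start must be less than end: $input")
        case None => Left(s"coordinates out of range: $input")
      }
    case _ => Left(s"expected contig:start-end, got: $input")
  }
}
```

Returning an Either keeps the failure reason available, so the frontend can display it instead of silently falling back to a default region.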
E.g., if I am at chr20:29828000-29830000 on the /overall page, I should view chr20:29828000-29830000 when I switch to the /freq page. Currently, we "reset" to the start of the chromosome.
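One possible way to carry the region across pages is to encode it in the link to the other page. The sketch below only builds such a link; RegionLinks is a hypothetical helper, and the query parameter names (ref, start, end) are assumptions rather than the routes mango actually exposes.

```scala
import java.net.URLEncoder

object RegionLinks {
  // Builds a link to another page that keeps the current region.
  // The query parameter names (ref, start, end) are assumptions.
  def link(page: String, contig: String, start: Long, end: Long): String = {
    // Encode the contig name in case it contains reserved characters.
    val ref = URLEncoder.encode(contig, "UTF-8")
    s"/$page?ref=$ref&start=$start&end=$end"
  }
}
```

For example, RegionLinks.link("freq", "chr20", 29828000L, 29830000L) yields a /freq link carrying the same coordinates as the /overall view.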
Hardcoded for now in VizReads.scala
Currently the speed is the same whether Parquet predicates are used or not. (Though Parquet files load much more quickly than non-Parquet)
Both the reads and reference HTTP requests fetch from the reference. We should implement a working set.
The two variables are interchangeably used in the server response to get("/reads/:ref")
Using faidx and htsjdk.samtools.reference.FastaSequenceIndex
Currently just allows two
Tabix Indexing and htsjdk.tribble.index.tabix
#8 was fixed by #9, but fixing a bug when displaying Reference has made this pop up again.
Specifically, the ReferenceRegion keeping track of the current position was removed when performing a quick fix to displaying the bases in reference files in #39 .
get("/reference/:ref") {
VizTimers.RefRequest.time {
val viewRegion = ReferenceRegion(params("ref"), params("start").toLong, params("end").toLong)
Currently, large numbers of data elements take very long to process in TrackedLayout.scala.
Perhaps using KV Store
E.g. Show mismatched bases
Currently very naively just removes all elements and re-renders all svg groupings.
Utilize enter(), update, exit() correctly to only re-render elements needed, while cleanly shifting existing elements to the new correct position on the page.
The HTTP request takes additional time to complete. There is a delay between when the server has the processed JSON and when the web browser receives the HTTP response containing that JSON.