wayback's Issues

The pagination filters out some results

When page=X is added to the request, some rows disappear.

For example:

http://web.archive.org/cdx/search/cdx?url=www.expertsender.ru&matchType=domain
document.body.innerText.split("\n").length = 6486

http://web.archive.org/cdx/search/cdx?url=www.expertsender.ru&matchType=domain&page=0
document.body.innerText.split("\n").length = 6246

http://web.archive.org/cdx/search/cdx?url=www.expertsender.ru&matchType=domain&page=1
Returns empty result.

Example of a row that disappears:
ru,expertsender,blog)/ispolzovanie-gif-v-emejl-rassylkax-kejs-ot-butik-ru 20160401182916 http://blog.expertsender.ru:80/ispolzovanie-gif-v-emejl-rassylkax-kejs-ot-butik-ru/ text/html 200 K6ZNHY3FGL6X67KDYYW5U7L3WEJRSIM5 12454

Make robots.txt processing fully conform to widely adopted convention

Problems found while investigating WWM-163 (replay is blocked even though the robots.txt request returns 403):

  • Any non-200 status is treated as a failure, and cached as 502.
  • RobotExclusionFilter skips any failure and moves on to test an alternative robots.txt (www.example.com/robots.txt if example.com/robots.txt fails). This seems unnecessary, since it uses the host from the original CDX field.
  • If all attempts fail, RobotExclusionFilter assumes there is no robots.txt and allows full access. This is fine for 404 and 403, but goes against the convention for 5xx.

Wayback should differentiate 404 and 403 from other failures and treat them as successful responses (meaning no robots.txt exists), rather than failures.
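
The proposed handling could look like this minimal sketch (the function name and return values are illustrative, not Wayback's API):

```python
def robots_fetch_policy(status: int) -> str:
    """Map the HTTP status of a robots.txt fetch to an access decision.

    2xx      -> parse and apply the returned rules
    403/404  -> treat as "no robots.txt": full access allowed
    anything else (notably 5xx) -> temporary failure: disallow, per convention
    """
    if 200 <= status < 300:
        return "parse-rules"
    if status in (403, 404):
        return "allow-all"
    return "disallow-all"
```

The key change from the current behavior is that 403 and 404 land in the "allow-all" bucket instead of being cached as 502 failures.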

"filter" parameter in CDX query API can result in blank pages

Let's say I want to get a list of all images under a given domain. It so happens that this query would span multiple pages.

If I use the parameter filter=image/jpeg and a page happens to not have any images on it, that page will appear to be blank instead of being filled with results from later pages.

bugs.chromium.org reports an incorrect robots.txt restriction

Navigate to: https://web.archive.org/web/http://bugs.chromium.org/p/project-zero/issues/detail?id=1139

See that Wayback says it's blocked by robots.txt:

[screenshot: Wayback's robots.txt block message]

See that the robots.txt for that domain, while complicated, specifically allows that type of URL:

User-agent: *
# Start by disallowing everything.
Disallow: /
# Some specific things are okay, though.
Allow: /$
Allow: /hosting
Allow: /p/*/adminIntro
# Query strings are hard. We only allow ?id=N, no other parameters.
Allow: /p/*/issues/detail?id=*
Disallow: /p/*/issues/detail?id=*&*
Disallow: /p/*/issues/detail?*&id=*
# 10 second crawl delay for bots that honor it.
Crawl-delay: 10

Expect that complex robots.txt files are parsed and matched correctly by the Wayback Machine.
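
For reference, the widely adopted convention (as implemented by major crawlers) is that the longest matching rule wins, with Allow winning ties. Below is an illustrative matcher, not Wayback's RobotRules code, applied to the rules above:

```python
import re

# Rules from the bugs.chromium.org robots.txt above.
CHROMIUM_RULES = [
    ("disallow", "/"),
    ("allow", "/$"),
    ("allow", "/hosting"),
    ("allow", "/p/*/adminIntro"),
    ("allow", "/p/*/issues/detail?id=*"),
    ("disallow", "/p/*/issues/detail?id=*&*"),
    ("disallow", "/p/*/issues/detail?*&id=*"),
]

def rule_matches(pattern: str, path: str) -> bool:
    # Translate robots.txt wildcards: '*' matches anything, '$' anchors the end.
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"
    return re.match(regex, path) is not None

def is_allowed(rules, path: str) -> bool:
    # Longest matching rule wins; Allow wins a tie (the Googlebot convention).
    verdict, best_len = "allow", -1
    for kind, pattern in rules:
        if rule_matches(pattern, path):
            n = len(pattern)
            if n > best_len or (n == best_len and kind == "allow"):
                verdict, best_len = kind, n
    return verdict == "allow"
```

Under these rules, /p/project-zero/issues/detail?id=1139 matches both Disallow: / and the longer Allow: /p/*/issues/detail?id=*, so the longer Allow wins and the URL should be crawlable.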

Change the favicon to the crawled site's favicon if available

Sometimes a website will display a favicon even though one isn't explicitly defined in the page. The site for Iridion II, for example.

It would be nice if in the event of no explicitly defined favicon, Wayback Machine would look for one at %DOMAIN%/favicon.ico.

It would probably be preferable, though more resource-consuming, to start in the same folder as the current URL and step backwards until a favicon is found or the domain root is reached. However, I've seen only one case where there was a favicon in a place deeper than the root, and I'm not even sure it was ever used.
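
The stepping-back idea can be sketched as follows (the helper name is hypothetical):

```python
from urllib.parse import urlsplit

def favicon_candidates(page_url: str) -> list:
    """Candidate favicon.ico locations, from the page's folder up to the root.

    Sketch of the stepping-back idea; the last candidate is the
    conventional %DOMAIN%/favicon.ico fallback.
    """
    parts = urlsplit(page_url)
    base = f"{parts.scheme}://{parts.netloc}"
    segments = parts.path.rstrip("/").split("/")[:-1]  # drop the document name
    candidates = []
    while segments:
        candidates.append(base + "/".join(segments) + "/favicon.ico")
        segments = segments[:-1]
    if not candidates:
        candidates.append(base + "/favicon.ico")
    return candidates
```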

Playback fails with net::ERR_CONTENT_LENGTH_MISMATCH

Playback of certain URLs fails with net::ERR_CONTENT_LENGTH_MISMATCH (a Chrome error message). All captures of the URL are warc/revisit; there are no original captures.

Wayback is supposed to return a 404 response instead of 200 in this case, but it plays back content from a revisit record (which has no response payload). The closest capture has a WARC-Refers-To-Date pointing to another revisit capture. AccessPoint.retrievePayloadForIdenticalContentRevisit blindly believes that WARC-Refers-To-Date always points to a non-revisit record (i.e. the original capture), and the subsequent CDX query does not exclude revisit captures.
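
A sketch of the safer lookup logic, assuming a simple map from timestamp to capture record (names and shapes are illustrative, not Wayback's actual data model):

```python
def resolve_revisit(captures: dict, timestamp: str, max_hops: int = 10):
    """Follow a chain of revisit records to an original capture.

    `captures` maps timestamp -> ("revisit", refers_to_timestamp) or
    ("response", payload). Returning None tells the caller to respond
    with 404 instead of replaying a payload-less revisit record.
    """
    for _ in range(max_hops):
        record = captures.get(timestamp)
        if record is None:
            return None                 # broken chain
        kind, value = record
        if kind == "response":
            return value                # reached the original capture
        timestamp = value               # revisit: follow WARC-Refers-To-Date
    return None                         # cycle or overly long chain
```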

matchType=domain doesn't work as expected

Hi,
I want to get all archived pages for a domain and all its subdomains, so I'm using the following URL:

http://web.archive.org/cdx/search/cdx?url=*.tut.by&from=20150724&to=20150724&filter=mimetype:text/html&output=json&fl=timestamp,original

There are no records for the subdomain news.tut.by. But if I try the following URL, I get a lot of records for news.tut.by:

http://web.archive.org/cdx/search/cdx?url=*.news.tut.by&from=20150724&to=20150724&filter=mimetype:text/html&output=json&fl=timestamp,original

Thanks

Refactor URI build/rewrite framework

We've gone through several iterations trying to come up with a good URL rewrite scheme for archival-URL mode. Our conclusion at this point is that we need to maintain the form of the original URL before and after rewrite. By form we mean the absolute/relative-ness of the URL. In other words, we want to rewrite a full URL to a full URL (http://www.example.com to http://web.archive.org/20140101121314/http://www.example.com), protocol-relative to protocol-relative (//www.example.com to //web.archive.org/20140101121314/http://www.example.com), a relative path to a relative path (styles/mobile.css to styles/mobile.css), etc.
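
The form-preserving rules above can be sketched as follows (the replay prefix and the helper name are illustrative, not the proposed ResultURIConverter API):

```python
REPLAY_PREFIX = "http://web.archive.org/20140101121314/"  # example datespec

def rewrite_preserving_form(url: str) -> str:
    """Rewrite a URL for replay while keeping its original form.

    Full URLs gain the full replay prefix; protocol-relative URLs stay
    protocol-relative; path-relative URLs are left untouched (the page's
    base URL already carries the replay prefix).
    """
    if url.startswith(("http://", "https://")):
        return REPLAY_PREFIX + url
    if url.startswith("//"):
        # keep the protocol-relative form; assume http for the archived scheme
        return "//" + REPLAY_PREFIX.split("://", 1)[1] + "http:" + url
    return url
```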

We found this rather awkward to achieve with the existing framework for URI rewriting. The ResultURIConverter.makeReplayURI() method takes only two String parameters, datespec and url, so it doesn't have access to the context in which the url was found. To work around this, there is a lot of clumsy code around it, which results in an overly complex framework. Here are some observations:

  • ResultURIConverter is used for building replay URL and rewriting URL. While these look similar, I consider they are distinct services that call for different configuration schemes.
  • ReplayParseContext has a ResultURIConverter instance for each context flag (ex. cs_), built through ContextResultURIConverterFactory, just for including context flags in the replay URL. Those instances would be unnecessary if makeReplayURI() took context flags as an argument.
  • ContextResultURIConverterFactory has two different uses. While its getContextConverter method has a single argument called flags, implying context flags, it can also receive a replay URL prefix (see AccessPointAdapter.getUriConverter()). I suppose the ContextResultURIConverterFactory implementations taking context flags would be unnecessary if ResultURIConverter.makeReplayURI() took context flags as an argument.
  • ReplayParseContext.contextualizeUrl(String, String) checks if URL-rewrite is necessary, and then converts URL to full absolute form before passing it to ResultURIConverter. This makes it impossible for ResultURIConverter implementation to preserve mode of URL described above. Considering ResultURIConverter's primary role, these steps should be left to ResultURIConverter implementation.
  • There's one issue in rewriting relative URLs: if the URL being replayed does not have a path part (ex. http://www.example.com), relative URLs need to be converted to full paths. ResultURIConverter needs to know the URL being replayed to achieve this. This is another case supporting additional parameters in ResultURIConverter.makeReplayURI().
  • Memento code (ex. EmbeddedCDXServerIndex.addTimegateHeaders()) prepends mementoPrefix to the URI returned by ResultURIConverter to ensure Memento URLs are always in absolute form. This is necessary because ResultURIConverter is used for two different purposes, and it breaks if ResultURIConverter returns different forms of URL depending on the context.
  • There's no easy way of passing X-Forwarded-Proto request header field to URL rewriting so that it can build absolute URL with appropriate protocol (http or https). We worked around it by storing the header value in ThreadLocal.

Our JIRA ARI-4033 depends on the resolution of this issue.

Resolution Plan:

  • Make it clear that ResultURIConverter is for constructing a replay URL from a Capture (full URL and timestamp) and context information (the only thing known at present is the context flag). I know the class name doesn't represent this role well, but that accounts for >90% of this interface's current use, and renaming would have a widespread effect. We could rename the interface later.
  • Have AccessPointAdapter implement ResultURIConverter. This (along with the change below) should make ContextResultURIConverterFactory unnecessary.
  • Define new interface for customizing URL rewrite, that receives more information than current ResultURIConverter does.
  • Define an interface for passing context of URL (Capture being replayed, baseURL) that ReplayParseContext can implement (for better modularity and ease of testing)

Colors to show the status code of an archived URL

Currently, when searching for an archived version of a URL, it can take some time to find a version that was archived while the page was still available (status code 2xx). Finding the right version of an archived URL would become a lot easier if it were easy to see which archived versions returned status code 2xx or 3xx.

Currently an archived page is shown in the Wayback Machine as a blue circle on the date it was archived; see for example http://wayback.archive.org/web/20010501000000*/http://archive.org. Multiple colors could be used here to indicate the status code of a page, for example:

  • blue for status code 2xx,
  • purple for status code 3xx,
  • red for status code 4xx.

When a URL is archived multiple times on the same day, a larger circle is shown. Multiple colors can be added to this larger circle to show the status codes with which the page was archived, for example:

  • blue and red for a URL archived multiple times on the same day with status codes 2xx and 4xx.

The same idea can be used for the black bars showing the number of archived versions per month.

I think implementing colors, or some other way of showing what status code a URL returned when it was archived, would be very helpful for finding the right version of a URL.

Wayback doesn't scrape/rewrite srcset urls correctly

Let me know if this isn't the right repo, but I ran into an issue when testing archival features on http://www.goodbyetohalos.com/

Like many webcomics using WordPress nowadays, Goodbye to Halos uses the HTML5 srcset attribute to display different image sizes to different devices:

<img
    width="800" height="1200" 
    src="http://www.goodbyetohalos.com/wp-content/uploads/2017/01/WEB_ch1_108.jpg"
    class="attachment-full size-full" alt=""
    srcset="http://www.goodbyetohalos.com/wp-content/uploads/2017/01/WEB_ch1_108.jpg 800w,
            http://www.goodbyetohalos.com/wp-content/uploads/2017/01/WEB_ch1_108-480x720.jpg 480w,
            http://www.goodbyetohalos.com/wp-content/uploads/2017/01/WEB_ch1_108-96x144.jpg 96w"
    sizes="(max-width: 800px) 100vw, 800px"
    data-webcomic-parent="837"
>

So far, so good. However, after crawling/scraping these with Wayback, only the src URL is scraped and rewritten, leading to the image on the replayed page still being served from the original server:

<img
    width="800" height="1200"
    src="/web/20170127042412im_/http://www.goodbyetohalos.com/wp-content/uploads/2017/01/WEB_ch1_108.jpg"
    class="attachment-full size-full" alt=""
    srcset="http://www.goodbyetohalos.com/wp-content/uploads/2017/01/WEB_ch1_108.jpg 800w,
            http://www.goodbyetohalos.com/wp-content/uploads/2017/01/WEB_ch1_108-480x720.jpg 480w,
            http://www.goodbyetohalos.com/wp-content/uploads/2017/01/WEB_ch1_108-96x144.jpg 96w"
    sizes="(max-width: 800px) 100vw, 800px"
    data-webcomic-parent="837"
>

This is very obvious because the original site doesn't use HTTPS, so it leads to a broken image in the Wayback Machine view:

[screenshot: broken image in the Wayback Machine view]

Obviously, the correct behavior here is that all of the images should be scraped (in this case they're just resizings, but in theory they could be completely different images; nothing prevents that) and rewritten.
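
The rewrite side amounts to treating srcset as a list of URL/descriptor pairs and running the existing URL rewriter over each URL. A rough sketch (the `rewrite` callback stands in for whatever Wayback already applies to src attributes; note the real srcset grammar also permits commas inside URLs, which this simple split ignores):

```python
def rewrite_srcset(srcset: str, rewrite) -> str:
    """Rewrite every URL in an HTML5 srcset attribute value.

    srcset is a comma-separated list of "URL [descriptor]" candidates;
    only the URL part of each candidate should be rewritten.
    """
    out = []
    for candidate in srcset.split(","):
        parts = candidate.split()
        if not parts:
            continue
        parts[0] = rewrite(parts[0])          # rewrite the URL, keep "800w" etc.
        out.append(" ".join(parts))
    return ", ".join(out)
```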

Thanks! Let me know if you need more information, or if you want me to whip up a more minimal test case.

Extend JSStringTransformer to unescape language-specific escaping before URL rewrite

ReplayParseContext has ad-hoc support for the case where URLs are escaped in the target resource. For example it recognizes URLs written in JavaScript as "http://example.com/..." as absolute URLs. This approach has a few problems:

  • There are a few different forms of escaping, like "http:\u002F\u002Fexample.com/...". Adding support for these alternatives makes ReplayParseContext messy.
  • JSStringTransformer is also used for other types of resources, which can have different ways of escaping characters.
  • Characters in replayPrefix are NOT escaped to match the syntax of the target resource. For example, slashes in replayPrefix can break the page if the URL is found in a regular expression literal.

It would be more robust to implement unescaping in JSStringTransformer so that clean URLs are passed to ReplayParseContext. It can also escape special characters back before inserting rewritten URLs.
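
A minimal sketch of the unescape/re-escape round trip (Python stand-ins for the Java transformer; `unicode_escape` is a rough approximation that mishandles some non-ASCII input):

```python
import codecs

def unescape_js(s: str) -> str:
    # Decode JavaScript escapes such as \u002F back to literal characters,
    # so the URL rewriter sees a clean URL.
    return codecs.decode(s, "unicode_escape")

def escape_slashes_js(s: str) -> str:
    # Re-escape slashes so a rewritten URL stays safe when inserted
    # into, e.g., a regular expression literal.
    return s.replace("/", r"\/")
```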

Switch over to iipc/webarchive-commons 1.1.3?

Hi,

I was looking at syncing up our forks, and couldn't proceed because webarchive-commons was forked a while ago: 6555609

I've just pulled your changes to webarchive-commons into the IIPC version and rolled a 1.1.3 release including that change (and a number of bugfixes). Would you consider switching back to the IIPC version? It would make keeping our forks in sync much easier.

Thanks,
Andy

Bad response header field in replay of gzip-encoded capture

If a text (HTML, CSS, JavaScript) response is gzip-encoded (has Content-Encoding: gzip), the replay response has a weirdly-named header field: X-Archive-Orig-X-Archive-Orig-Encoding: gzip. It is supposed to be X-Archive-Orig-Encoding: gzip.

This is because TextReplayRenderer.decodeResource replaces Content-Encoding header field with X-Archive-Orig-Encoding header field while applying gzip-decode, and then later RedirectRewritingHttpHeaderProcessor prepends X-Archive-Orig- to it (note this is configurable).

An easy solution would be to avoid prepending the prefix when the header field name already starts with X-Archive-Orig-, but this sounds too ad hoc. The X-Archive-Orig- prefix is currently hard-coded, but we may want to make it configurable.
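
The easy (admittedly ad-hoc) solution would look like this sketch (illustrative Python, not the actual Java in RedirectRewritingHttpHeaderProcessor):

```python
ORIG_PREFIX = "X-Archive-Orig-"  # currently hard-coded in Wayback

def prefix_original_header(name: str, prefix: str = ORIG_PREFIX) -> str:
    # Prepend the archival prefix unless the field name is already prefixed,
    # so a field renamed earlier in the pipeline is not prefixed twice.
    if name.startswith(prefix):
        return name
    return prefix + name
```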

Allow for using different collapseTime for replay and capture search

EmbeddedCDXServerIndex has a timestampDedupLength property for culling captures so that the capture search result page does not get too crowded (we call this feature timestamp-dedup hereafter). This property applies to all capture search queries, whether for the capture list page or for looking up the closest capture for replay.

While we want timestamp-dedup for the capture list page, we learned it is problematic for capture lookup for replay, because it often breaks revisit resolution. We want to disable timestamp-dedup when the capture search query is for replay.

Internally known as ARI-3883.

RobotRules allows any path if it hits empty Disallow:

RobotRule.blocksPathForUA(String, String) returns false for any path with this robots.txt:

User-agent: *
Disallow:
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-login.php
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /comments

Per the robots.txt specification, an empty Disallow: should simply be ignored. Instead, RobotRules returns false as soon as it hits the empty Disallow:, ignoring the rest of the rules.
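
Correct handling simply skips the empty rule and keeps going, as in this sketch (illustrative Python, not the RobotRules code):

```python
def parse_disallows(lines) -> list:
    """Collect Disallow paths, skipping empty ones.

    An empty "Disallow:" means that rule disallows nothing; it must not
    short-circuit the remaining rules, which is the bug described above.
    """
    paths = []
    for line in lines:
        if line.lower().startswith("disallow:"):
            value = line.split(":", 1)[1].strip()
            if value:                  # empty Disallow: is simply ignored
                paths.append(value)
    return paths

def blocks_path(paths, path: str) -> bool:
    # Simple prefix matching, as in the classic robots.txt convention.
    return any(path.startswith(p) for p in paths)
```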

Found by ARI-4212.

HTML rewrite fails to insert body-insert if both </HEAD> and <BODY> are missing

Originally reported in ARI-3880.
The failure case has HTML like this:

<html>
<head>
<title>...</title>
<script type="text/javascript" src="scripts/header.js"></script>

<p align="center">
...

FastArchivalUrlReplayParseEventHandler fails to insert the body-insert (jspInsertPath) because the relevant code block is skipped while the inHead flag is true (set by the appearance of the HEAD tag). This results in a failure to render the top-of-the-page banner (typically a disclaimer and navigation bar).

Implement real range request handling

Currently Wayback does nothing special with range requests and simply renders whichever capture matches the URL + timestamp combination. This works as long as the capture is either a 200 response (the browser assumes the server does not support Range requests) or a 206 response with a matching Content-Range.

We recently found that some HTML5 browsers, when playing a video, first probe the server by making a range request for the entire file, then make another request for a small range near the end of the file. If the server does not return a 206 response matching the request, they stop video playback. To support HTML5 video playback, Wayback needs to implement range request handling of its own.
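
A minimal sketch of single-range handling over a fully stored 200 payload (illustrative; real handling also needs multi-range and validator support):

```python
def serve_range(payload: bytes, range_header: str):
    """Serve one "bytes=start-end" range (plus the open-ended and suffix
    forms) from a full capture; returns (status, Content-Range, body)."""
    total = len(payload)
    spec = range_header.split("=", 1)[1]
    start_s, _, end_s = spec.partition("-")
    if not start_s:                          # suffix form: last N bytes
        start = max(total - int(end_s), 0)
        end = total - 1
    else:
        start = int(start_s)
        end = min(int(end_s), total - 1) if end_s else total - 1
    if start > end or start >= total:
        return 416, f"bytes */{total}", b""
    return 206, f"bytes {start}-{end}/{total}", payload[start:end + 1]
```

The two probes described above map to "bytes=0-" (entire file) and a suffix request like "bytes=-2" (near the end of the file), both of which must come back as 206 with a matching Content-Range.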

This issue is internally known as ARI-4254.

Add mime type detection for replaying captures with incorrect content-type

(This is an issue item for already completed work)
Determine the mime type by looking into the payload when the mimetype in the search result is either suspected to be incorrect (ex. text/html) or missing (ex. unk).
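
The idea can be sketched with a few magic-byte checks (a toy stand-in; the actual detector is more thorough):

```python
# A few common file signatures (magic bytes) and their mime types.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]

def sniff_mimetype(payload: bytes, recorded: str) -> str:
    """Prefer a mime type detected from the payload; fall back to the
    recorded value, treating "unk" and empty as missing."""
    for magic, mime in MAGIC:
        if payload.startswith(magic):
            return mime
    return recorded if recorded not in ("", "unk") else "application/octet-stream"
```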

Known internally as ARI-3822, ARI-3888, WWM-58. Bug fixes in ARI-4071 and ARI-4078.

Base work is done in commits 65dfc40 through 7d9d332, and bug fixes are being tracked on the mimetype-detector branch.

Resource record always rendered as text/html

A Resource record is always rendered as text/html, regardless of the Content-Type WARC header field.

This is due to a lack of metadata record support in JWATResource. It does not return the Content-Type header field from its getHttpHeaders() method, so Tomcat supplies the default value text/html.

Known internally as WWM-126.

Make ServerRelativeArchivalRedirect easier to extend

Archive-It found an issue with the Referer header generated by the Flash plugin for Firefox (ARI-4169) and wants to extend ServerRelativeArchivalRedirect with a supplemental method for obtaining the ArchivalUrl context. As the method depends on a private JavaScript library, we'd like to keep the enhancement local to Archive-It for now. Unfortunately, ServerRelativeArchivalRedirect does not have an extension point to enable this.

The plan is to move the code in ServerRelativeArchivalRedirect that parses the Referer into a new method, so that a subclass can override it.

Fix links in the toolbar timeline when the URL has ampersands in it

Here for example. Note how the URL has ampersands in it. If you were to click to another point in the timeline, the URL you would go to would have all of the ampersands replaced with &amp;, resulting in seeing a different set of crawls.

Sure, with this page, you would still see something, but in any other case, the user won't be as lucky.

It appears the core issue is that, for whatever reason, the wbCurrentUrl variable is HTML-encoded. Bizarrely, this does not happen in the "see all crawls" page.

This could probably be fixed by changing line 74 of that file to var wbCurrentUrl = "<%= StringEscapeUtils.unescapeHtml(searchUrlJS) %>";

Rewrite percent-encoded URLs

Currently percent-encoded URLs are not rewritten. For example, the text from https://web.archive.org/web/20150804131701/http://blip.tv/file/get/NostalgiaCritic-NCPlanetOfTheApes401.m4v?showplayer=2014093037100220150422135039&referrer=http://blip.tv&mask=11&skin=flashvars&view=url should be rewritten like:
Original:

message=http%3A%2F%2Fj41.video2.blip.tv%2F5520014255207%2FNostalgiaCritic-NCPlanetOfTheApes401.m4v%3Fir%3D96428%26sr%3D2334 
Should be rewritten as:
message=http%3A%2F%2Fweb.archive.org%2Fweb%2F20150804131701%2Fhttp%3A%2F%2Fj41.video2.blip.tv%2F5520014255207%2FNostalgiaCritic-NCPlanetOfTheApes401.m4v%3Fir%3D96428%26sr%3D2334 
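
The decode/rewrite/re-encode round trip can be sketched like this (the replay prefix is taken from the example above; the helper is hypothetical):

```python
from urllib.parse import quote, unquote

REPLAY_PREFIX = "http://web.archive.org/web/20150804131701/"  # from the example

def rewrite_encoded_param(value: str) -> str:
    """Decode a percent-encoded parameter value, rewrite it if it is a URL,
    and re-encode it so it stays escaped exactly as the page expects."""
    decoded = unquote(value)
    if not decoded.startswith(("http://", "https://")):
        return value  # not a URL; leave untouched
    return quote(REPLAY_PREFIX + decoded, safe="")
```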

ClassCastException while replaying revisit record

Revisit record handling code makes a bad assumption that revisit records are always an instance of WarcResource. There's an alternative implementation, JWATResource, and revisit replay throws ClassCastException with it.

Internally known as WWM-101.

URL-decode date component of Archival-URL request

From WWM-110.
Some user agents do more URL-encoding than strictly necessary. Notably, * is sometimes passed to Wayback percent-encoded as %2A. Currently this results in a 404 error. There seems to be nothing against URL-decoding the date component of an Archival-URL before parsing, so that 2010%2A is recognized as 2010*.
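
The proposed normalization is essentially a one-liner (sketch):

```python
from urllib.parse import unquote

def normalize_datespec(datespec: str) -> str:
    # URL-decode the date component before parsing, so an over-encoded
    # request like 2010%2A is treated as 2010*.
    return unquote(datespec)
```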

Unify StringTransformer and RewriteRule

Wayback has two distinct interfaces for rewriting text resources: StringTransformer and RewriteRule. It'll be useful if we can somehow unify these. At least there's a need for using MultipleRegexReplaceStringTransformer as RewriteRule. First step is to have MultiRegexReplaceStringTransformer implement RewriteRule interface.

Changes are on unify-rewrite branch, and ready to merge into openwayback.

Some links are not presented in filtered result list

When the filter below is used, some rows are not present in the result list:
http://web.archive.org/cdx/search/cdx?url=http://www.expertsender.ru/bundles/core&matchType=prefix

Example row:
ru,expertsender)/bundles/core?v=9nx-rocbnddl6mfbsncc8jgbjid4p8wyv00b9yjdxm81 20161015143737 http://www.expertsender.ru/bundles/core?v=9nX-roCbNddL6MFBsnCc8JGbjiD4p8wYv00b9YJdXm81 text/css 200 JROZKHBMIZC6TXGOLNJGIPRE73Q23WTD 28877

To find this row, you have to specify the full URL:
http://web.archive.org/cdx/search/cdx?url=http://www.expertsender.ru/bundles/core?v=9nX-roCbNddL6MFBsnCc8JGbjiD4p8wYv00b9YJdXm81&matchType=prefix

Embed-mode replay results in repeated redirects for captures with long revisit history

Embed-mode replay first searches for captures with the timestampSearchKey flag turned on, for faster lookup. If the URL has a long revisit history, so that replay cannot resolve the revisit within the constrained time range for timestampSearchKey, it reruns the capture query with the timestampSearchKey flag turned off. It is supposed to re-initialize captureSelector at that point, but it doesn't. So the replay code goes on to the next capture, returns a redirect response, and repeats.

Make per-collection exclusion configurable

Currently, the collection-sensitive exclusion filter provided by CompositeAccessPoint is inflexible.

  • Use of CustomPolicyOracleFilter is hard-coded, and it can only be combined with the ExclusionFilterFactorys configured in the CompositeAccessPoint staticExclusions property.
  • AccessPointAuthChecker assumes exclusion rules are determined by nothing but the urlkey, prohibiting time-ranged exclusion rules.
  • CDXServer cannot pass oraclePolicy (used for delivering custom rewrite rules) from the ExclusionFilter to the capture search result.

As a result,

  • Customization is required to extend EmbeddedCDXServerIndex to inject the ExclusionFilterFactory from the AccessPointAdapter exclusionFactory into the CDXToCaptureSearchResultWriter exclusionFilter - that is, the Oracle exclusion filter runs at the final step of the CDX processing pipeline. This turned out to be problematic, since exclusion happens after timestamp-deduplication.

Apparently, the CDXToCaptureSearchResultWriter exclusionFilter is necessary solely to support use of the Oracle exclusion filter with CDXServer. Having multiple ways of configuring exclusion filters makes the code hard to follow, and customization painful.

Proper resumeKey functionality depends on including the urlkey field

In making use of the Wayback CDX server API (documented here), I noticed that when using resumeKeys I get odd behavior when leaving the urlkey field out from the fieldOrder. Specifically, it looks like the CDX server jumps directly to the 2013 era, even though there are valid records before that:

$ wget -q -U '' -O - 'https://web.archive.org/cdx/search/cdx?collapse=timestamp%3A8&url=https%3A%2F%2Farchive.org&limit=5&fl=timestamp%2Cstatuscode&showResumeKey=true'
19970126045828 200
19971011050034 200
19971211122953 200
19980109140106 200
19980113025731 200

-+19980113025732
$ wget -q -U '' -O - 'https://web.archive.org/cdx/search/cdx?collapse=timestamp%3A8&url=https%3A%2F%2Farchive.org&limit=5&fl=timestamp%2Cstatuscode&showResumeKey=true&resumeKey=-+19980113025732'
20131019030216 502
20130818180757 502
20130402123654 502
20130902085637 502
20130903032956 502

Everything seems to work fine if I include the urlkey field:

$ wget -q -U '' -O - 'https://web.archive.org/cdx/search/cdx?collapse=timestamp%3A8&url=https%3A%2F%2Farchive.org&limit=5&fl=urlkey,timestamp%2Cstatuscode&showResumeKey=true'
org,archive)/ 19970126045828 200
org,archive)/ 19971011050034 200
org,archive)/ 19971211122953 200
org,archive)/ 19980109140106 200
org,archive)/ 19980113025731 200

org%2Carchive%29%2F+19980113025732
$ wget -q -U '' -O - 'https://web.archive.org/cdx/search/cdx?collapse=timestamp%3A8&url=https%3A%2F%2Farchive.org&limit=5&fl=urlkey,timestamp%2Cstatuscode&showResumeKey=true&resumeKey=org%2Carchive%29%2F+19980113025732'
org,archive)/ 19980129163431 200
org,archive)/ 19980501124530 200
org,archive)/ 19990116225149 200
org,archive)/ 19990117003935 200
org,archive)/ 19990202042615 200

org%2Carchive%29%2F+19990202042616

Perhaps there's an undocumented dependency on passing the urlkey field?

Thanks

Missing PathIndex file results in NullPointerException

FlexResourceStore throws a NullPointerException if any configured PathIndex file is missing:

WARNING: Runtime Error
org.archive.wayback.exception.ResourceNotAvailableException: File not Found: aaa.warc.gz
        at org.archive.wayback.resourcestore.FlexResourceStore.retrieveResource(FlexResourceStore.java:266)

gapless playback on archive.org

I'm listening to the Dead on archive.org, and it would sure be swell to have gapless playback, to reduce the buzzkill when listening to Dead concerts (I'm currently working through 1989 :).

This is almost certainly the wrong repo for this ticket. Can you point me to the right repo?

I looked at the banner and "Help," "Jobs," and "Volunteer" all sound the same to me, and none of them answered the "which repo" question for me.

[screenshot: archive.org site banner]

The closest was https://developers.archive.org/get-started/. That seems to be aimed more at developers using JSON APIs than developers interested in helping with the software itself.

I also browsed around https://github.com/internetarchive ... and even ended up on https://github.com/iipc (:scream_cat: good lord what is this!?). I found a repo for Wayback, but I think that's different than what I want, right? Is there a repo for archive.org?

Thanks! :-)
