dweb-archive

A user interface for accessing the Internet Archive from the browser. It builds on dweb-transports and is currently typically loaded from dweb-transport.

Background

This library is part of a general project at the Internet Archive (archive.org) to support the decentralized web.

Goals

  • to allow unmodified browsers to access the Internet Archive's millions of items
  • to support as many of the IA's features as possible, adding them iteratively
  • to use decentralized platforms for as many features as possible, without sacrificing functionality
  • to avoid single points of failure where possible

Installation

Please see the installation instructions in the dweb-mirror repo. They are much more recent than the ones below.

All cases

git clone https://git@github.com/internetarchive/dweb-archive.git
cd dweb-archive

# install the dependencies including IPFS & WebTorrent and dweb-transports
npm install  

Installation for testing in a browser

Do the "All cases" install above.

Install a simple http-server; this may require sudo depending on permissions.

npm run setuphttp
npm install -g http-server
cd dist
http-server

Now open a browser page.

Note: Firefox works better than Chrome for local usage, because Chrome limits cross-origin HTTP to 6 streams, and we still need to implement a limited HTTP pool to work around this.

open "http://localhost:8080/archive.html"
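The note above mentions a limited HTTP pool as the fix that is still missing. As a rough sketch of that idea only (the function name runLimited and the default of 6 are assumptions for illustration, not code from this repo), a pool can cap how many asynchronous requests are in flight at once:

```javascript
// Hypothetical helper, not part of dweb-archive or dweb-transports.
// Runs an array of task functions (each returning a Promise) with at most
// `limit` of them in flight at any time, and resolves to their results
// in the original order.
async function runLimited(tasks, limit = 6) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task that has not been started yet

  // Each worker repeatedly claims the next unstarted task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

In a browser the tasks would typically wrap fetch calls, for example tasks = urls.map((u) => () => fetch(u)). If any task rejects, the returned promise rejects as well, so callers that want partial results would need to catch inside each task.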

To test with limited transports, for example HTTP only, add the transport parameter.

open "http://localhost:8080/archive.html?transport=HTTP"

To test against dweb-mirror you can pass a parameter e.g.

open "http://localhost:8080/archive.html?mirror=localhost:4244&transport=HTTP"
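The test URLs above can also be assembled programmatically. This is a small sketch, assuming only the transport and mirror parameter names shown above; the helper name buildTestUrl and its option names are made up for illustration:

```javascript
// Hypothetical helper for building local test URLs for archive.html.
// Only the "transport" and "mirror" query parameters come from this README.
function buildTestUrl({
  base = "http://localhost:8080/archive.html",
  transports = [],
  mirror,
} = {}) {
  const url = new URL(base);
  if (mirror) {
    url.searchParams.set("mirror", mirror);
  }
  // One transport=... entry per requested transport.
  for (const t of transports) {
    url.searchParams.append("transport", t);
  }
  return url.toString();
}

console.log(buildTestUrl({ transports: ["HTTP"] }));
console.log(buildTestUrl({ mirror: "localhost:4244", transports: ["HTTP"] }));
```

Note that URLSearchParams percent-encodes the colon in the mirror value, so the second URL is written slightly differently from the hand-typed one above but decodes to the same parameters.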

Node Installation to work on this repo

Note that the only reason to do this would be to work on the code.

Do the "All cases" install above.

Build (webpack) the bundles and copy needed files to dist/:

webpack --mode development

See related:

Repos:

  • dweb-transports: Common API to underlying transports (http, webtorrent, ipfs, yjs)
  • dweb-archive: Decentralized Archive webpage and bootstrapping
  • dweb-transport: Original Repo, still has some half-complete projects
  • dweb-archivecontroller: Object model for archive, includes routing table

Directory structure here

Directories
  • components - React components used by the UI (see also ia-components)
  • dist - all that is needed to run the UI - this is also in its own npm package.
  • docs - should be some documentation, but it's a bit out of date
  • ia-components - More React components, these are dual purpose, they don't depend on Dweb
  • images - extra images used (there are also ones in dist/images copied from archive.org)
  • includes - files copied over from internet archive, where we build the CSS and JS
  • node_modules - installed from the dependencies in package.json by yarn install
  • util - just has throttler.js and to be honest I can't remember why it's off on its own
  • web_modules - compiled by pix for web components (radio-player is the only one, but that has dependencies)
Files
  • archive.html - main file for displaying archive (detail or search) pages
  • archive.js - top level for creating archive-bundle.js
  • dweb-archive-styles.css - CSS styles for dweb; note that it uses the standard archive styles in includes/archive.css for most things
  • LICENSE - standard GNU Affero licence
  • webpack.config.js - defines bundling, and in particular which files are needed for the distribution
  • ... some more TODO documentation

Class hierarchy

  • ArchiveFile - represents a single file
  • ArchiveItem - represents data structures for an item (a directory of files)
    • ArchiveBase - Subclasses ArchiveItem to add functionality specific to this UI
  • ArchiveMember - represents a listing for an item (e.g. in a search)
  • React.Component - Standard React class used for building components
    • Lots of stand-alone components
    • AVDWeb - Adds functionality common to adding content to media tags
      • AudioDweb, VideoDweb
  • Nav - common class for navigation structures (mostly at the top of the page) also maps item types to classes

dweb-archive's People

Contributors

dependabot[bot], iisa, jhiesey, kant, mitra42, shreyansh23, xloem

dweb-archive's Issues

UI minor collection images

Collections list (under download options) only has names, not images.

This needs a change to the metadata from gateway.dweb.me to include thumbnails, and it should probably change to e.g.

collections: { crateddiggers: { title: "CrateDiggers", thumbnaillinks: [ipfs://, ...] , ...}... }
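To illustrate how the UI might consume that proposed shape, here is a minimal sketch. The field names title and thumbnaillinks follow the proposal above; the function name collectionSummaries, the fallback behaviour, and the placeholder link are assumptions, and the gateway does not necessarily return this structure:

```javascript
// Example metadata in the proposed shape. "ipfs://EXAMPLE" is a placeholder,
// not a real link.
const exampleMetadata = {
  collections: {
    crateddiggers: { title: "CrateDiggers", thumbnaillinks: ["ipfs://EXAMPLE"] },
    untitled: {},
  },
};

// Hypothetical helper: turn the collections map into a list the UI can render,
// falling back to the collection id when no title is given and to null when
// no thumbnail is available.
function collectionSummaries(collections = {}) {
  return Object.entries(collections).map(([id, c]) => ({
    id,
    title: c.title || id,
    thumbnail: (c.thumbnaillinks && c.thumbnaillinks[0]) || null,
  }));
}

console.log(collectionSummaries(exampleMetadata.collections));
```

Returning null for a missing thumbnail lets the caller keep the current name-only display for collections the gateway has no image for.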

Visual issues with move to updated CSS

A collection of minor, hopefully trivial CSS issues after updating

  • details: Search bar at top - two shown, overlapping
  • details: List of files to download shows two download icons

Functionality: support data mediatypes

There are items with mediatype = web | data, that in the archive.org interface have no preview.

Need a subclass of Details that handles them similarly to the archive.org interface.

(Note there are also zotero, with zero items, and educational, which are recategorised.)

Audio player unresponsive UX

In the audio player, switching tracks quickly moves the blue bar that shows which track is playing, but the player itself is slow to update.

It's complex: the UI reacts locally, then takes time to change the player, but the player is the default audio player and it doesn't display the time until WebTorrent starts to find the stream. Need to think about fixing this.

Dev: installable without dweb-transport

At the moment, to test dweb-archive and the archive UI, you need to install dweb-transport. This shouldn't be needed.

  • Add dweb bundles to git for dweb-transports, dweb-objects, dweb-archive
  • Make sure build process for dweb-archive uses the repos
  • Make sure build process for dweb-transport uses the repos
  • Maybe move examples back to dweb-objects

Feature: Downloads/All

On a details page, e.g. https://dweb.archive.org/details/commute,
the right-hand side "Download Options", "Show All"
goes to a Cweb link,
https://archive.org/download/commute
which displays a directory.

Need to replace this link with a call to Nav.factory, with a new subclass of details (?) that does a simple directory display.

UI minor - audio album image

How to select the image for the album.

According to Tracey, the heuristic is complicated, with lots of edge cases, which makes it problematic to duplicate (and keep maintained).

Solution...

  • Tracey to add a service, e.g. services/imgurl/foo or services/img/foo?output=name, that returns the name of the file rather than the actual image
  • Gateway to retrieve this name, then check the file metadata.
  • For OK files, the image is added to the dweb, and its URLs are returned in a metadata field
  • For non-OK files, the services/img/foo?scale=2 (syntax TBD) call is used to get a larger file, which is added to the dweb, and its URLs are returned.

Non-OK means any of:
file/private is set - so only accessible to privileged users
file/format is unusable e.g. BMP, TIF, PDF etc
file/size is too large (above a TBD amount)

All the details here are fuzzy, but give the idea of how to solve the problem.
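The "Non-OK" test could be sketched as a small predicate. This assumes the file metadata exposes private, format and size fields as named above; the function name, the format list (taken from the "e.g." above, so not exhaustive) and the size threshold (which the issue leaves TBD) are placeholders:

```javascript
// Formats named as unusable in the issue; the real list is not specified.
const UNUSABLE_FORMATS = new Set(["BMP", "TIF", "PDF"]);

// Placeholder threshold: the issue says only "above a TBD amount".
const MAX_SIZE_BYTES = 1000000;

// Hypothetical check: returns true when a file could be added to the dweb
// directly, false when the scaled-image service would be needed instead.
function isOkImageFile(file, maxSize = MAX_SIZE_BYTES) {
  if (file.private) return false; // only accessible to privileged users
  if (UNUSABLE_FORMATS.has(String(file.format).toUpperCase())) return false;
  if (Number(file.size) > maxSize) return false; // too large
  return true;
}

console.log(isOkImageFile({ format: "JPEG", size: 1000 }));
console.log(isOkImageFile({ format: "bmp", size: 1000 }));
```

Keeping the threshold as a parameter means the TBD value can be settled later without touching the logic.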

UI: Search sorting

STR:
dweb.archive.org/search.php?query=prelinger
See the line "Sort By" and click on any of them - doesn't work

About tab reports bad history and

STR: https://dweb.me/examples/archive.html?item=prelinger&verbose=false
Click "About"

VM3941 archive.min.js:1179 Uncaught DOMException: Failed to execute 'pushState' on 'History': A history state object with URL 'https://dweb.mehttps//archive.org/details/prelinger&tab=about' cannot be created in a document with origin 'https://dweb.me' and URL 'https://dweb.me/arc/archive.org/details/prelinger?verbose=true&transport=HTTP&transport=YJS&transport=IPFS&transport=WEBTORRENT&verbose=true'.
at Function.tabby (https://dweb.me/examples/includes/archive.min.js:1179:21)
at HTMLAnchorElement.onclick (https://dweb.me/arc/archive.org/details/prelinger?verbose=true&transport=HTTP&transport=YJS&transport=IPFS&transport=WEBTORRENT&verbose=true:1:12)

Looks like a bad write to history, which I believe is only done in one place.

Note this is AFTER URL = ../arc... so this might be caused by #1 or at least better to tackle after fixing that.
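The URL in the exception looks like the result of string concatenation (an origin glued onto an absolute URL, then "&tab=about" appended without a "?"). As a sketch of one way to avoid that class of error, not the actual code in archive.min.js, the tab parameter can be set through the URL API so the result stays on the current origin. The function name urlWithTab is made up for illustration:

```javascript
// Hypothetical helper: return the current page URL with ?tab=... set,
// leaving the origin, path and other query parameters untouched.
function urlWithTab(currentHref, tab) {
  const url = new URL(currentHref);
  url.searchParams.set("tab", tab);
  return url.toString();
}

// In a browser this would be used roughly as:
//   history.pushState({}, "", urlWithTab(window.location.href, "about"));
console.log(
  urlWithTab("https://dweb.me/arc/archive.org/details/prelinger?verbose=true", "about")
);
```

Because the new URL is derived from the document's own location, pushState should not hit the cross-origin check that produced the DOMException.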

bootstrap loses verbose=1

https://dweb.archive.org/details/ElectricSheep?verbose=true
ends up at
https://dweb.me/arc/archive.org/details/ElectricSheep?
i.e. losing the verbose (and probably other) parameters
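A sketch of the intended behaviour, assuming the bootstrap knows both the original URL and the target it redirects to (the function name redirectTarget is hypothetical and this is not the bootstrap's actual code):

```javascript
// Hypothetical helper: build the redirect URL while carrying over every
// query parameter from the original request (verbose, transport, ...).
function redirectTarget(originalHref, targetBase) {
  const original = new URL(originalHref);
  const target = new URL(targetBase);
  for (const [key, value] of original.searchParams) {
    target.searchParams.append(key, value);
  }
  return target.toString();
}

console.log(
  redirectTarget(
    "https://dweb.archive.org/details/ElectricSheep?verbose=true",
    "https://dweb.me/arc/archive.org/details/ElectricSheep"
  )
);
```

Using append rather than set keeps repeated parameters such as several transport=... entries intact.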

Feature add_bookmark

The add_bookmark code inside Collection.banner is not going to work,
a) it's an absolute link
b) we aren't logged in.
For now it's commented out.

Bug (minor UI) Sort criteria doesn't work in search

STR: 
https://dweb.archive.org
Enter a search string and wait for results
Change selector from "Relevance" to anything else; it doesn't seem to do anything

The code is in Search.php.rowColumnsItems; search for date_switcher.
There is an obvious issue that the onClick string for date_switcher makes no sense, but this object isn't visible to be clicked on anyway. There is also code in archive_setup_push that might be looking for this date_switcher id.

This needs some detective work, comparing the dweb.archive.org version with the archive.org version.

Personal pages without data don't work

Note there are at least two cases - those with data and those without.
WITHOUT DATA ....
https://dweb.me/archive/archive.html?item=@Steve%20Nordby&verbose=true
which fails in
https://dweb.archive.org/leaf?key=%40Steve%20Nordby

Error code: 404
Message: Archive item @Steve Nordby not found.
Error code explanation: 404 - Nothing matches the given URI.

Safari -

There are a number of ways it fails in Safari. Specifically

Some videos seem to stop playing (Webtorrent failure?)

More Decentralized: About bar

The about bar links all currently go to https://archive.org/about/* etc

There is no endpoint on the Archive to get just the content of that page, and since it's dynamically generated, the gateway can't easily cache it.

Can't use a relative link to a name, e.g. dweb:/arc/archive.org/about, because those pages aren't served with CORS, so it doesn't appear possible to access them from JavaScript and pass them to window.open.

Meta - edge cases

Some of the many edge cases are listed here, but not necessarily prioritised. Individual issues can be generated if/when we decide to tackle them.

UI minor: Title

The <title> in dweb.archive.org is fixed, but in archive.org it's dependent on the item served

Need to:
Figure out where the title comes from (from the metadata)
Work out where to set it - probably in Nav.js/factory

Main feature: Audio

Audio only sort-of works: single-track items work fine, but multi-track items fail and should show

Example Gramophone.Music.From.Ceylon archive.org dweb.archive.org

I've added these to the originals directory as "multitrackaudio.html" and .json

Next steps - preliminary edit of Audio.js to match

Feature edit

The editxml code inside Collection.banner is not going to work,
a) it's an absolute link
b) we aren't logged in.
For now it's commented out.
Note similar issue to #8
