jvilk / browserfs
BrowserFS is an in-browser filesystem that emulates the Node JS filesystem API and supports storing and retrieving files from various backends.
License: Other
Validate the values passed to the various buffer write methods.
FileSystems can have custom File objects.
What API do they need to specify?
What can they optionally specify?
I'm already working on writing this.
Add a tool that emits a JavaScript script that loads a directory of test files into the current file system.
This will allow us to use Node's fixtures.
Note that this will need to be synchronous for now due to how our tests are run, so I'll likely hack it in specifically for the localStorage test. Later on, if we can make the tests asynchronous, we can alter the tool to emit asynchronous loading code that triggers the test start when it completes.
IndexedDB requires the user to allow the current webpage to store data, Dropbox requires the user to log in...
I think it would be good to get some performance tests added to the test suite before spending any time implementing new methods to improve performance.
This would help us prevent regressions, and even in cases like lavelle/BrowserFS#11 where it's fairly obvious that it will improve performance, it's nice to be able to quantify and measure the improvement.
This would also allow us to make comparisons of the relative speeds of the different backends, to have some quantitative evidence for users when they're weighing up the advantages and drawbacks of each.
I couldn't find any performance tests in the Node repo, but it wouldn't be too difficult to write our own.
Doppio is unusably slow in IE9 due to all of the copying we are doing. To download a binary file using XmlHttpRequest, we copy the entire file eagerly this many times:
We can skip the third and fourth copies if we instantiate our Buffer directly with the JavaScript array, which is in the same format that our polyfill internally stores the data as. This requires the Buffer implementation to be aware of the polyfill.
I need a simple build system to compile and compress BrowserFS
into a single module.
Support reading 2-byte numbers from odd offsets, and similar unaligned accesses.
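As a sketch of why this matters: typed-array views like Uint16Array require aligned byte offsets, so an unaligned read has to be assembled byte-by-byte. The helper name below is illustrative, not BrowserFS code.

```javascript
// Illustrative sketch: read a 16-bit little-endian value from an arbitrary,
// possibly odd, byte offset by assembling it from individual bytes.
// A Uint16Array view would throw (or silently misbehave) on offset 1.
function readUInt16LE(bytes, offset) {
  // bytes is a plain array (or Uint8Array) of byte values 0-255.
  return bytes[offset] | (bytes[offset + 1] << 8);
}

var data = [0xff, 0x34, 0x12, 0x00];
readUInt16LE(data, 1); // reads the value 0x1234 from the odd offset 1
```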
We're already storing a header of stat information for files, so it's natural to store a few bytes for directories as well. This allows us to persist empty directories, and it also paves the way for eventual symlink support.
LocalStorage's API is synchronous, so I should add synchronous support to its file system.
Doing this will require adding synchronous API stubs in node_fs.coffee, filesystem.coffee, and file.coffee, which means this is blocked by #23.
Right now, Karma loads each script into the browser, and they execute immediately.
It would be better if we could dynamically trigger test start so we can better use Jasmine to kick off tests (and maybe to emulate process.exit when all callbacks finish...). This would also make it simple to load in test files asynchronously by dogfooding our asynchronous API.
There are two ways to do this -- the good way, and the easy-but-bad way:
window.bfs_tests['test_name'] = function() {...}
The good way would allow us to set up individual Jasmine tests for each Node test, which would be awesome!
This currently depends on an implementation bug where parent directories are generated when files are written to.
This is a requirement for the newly-minted doppio-demo repository.
I can see this happening in two ways:
You'll laugh, but due to a bug, this is how we currently download listings.json:
We download listings.json as text, and then run JSON.parse on the Buffer. This implicitly runs Buffer.toString, which converts the buffer contents from the binary representation of UTF8 strings back into a JavaScript string. These conversions are all unnecessary.
And here I was, wondering why Doppio with BFS hangs IE8/IE9 on load! Doppio's listings file is something like 900KB.
I opted to make my Dropbox backend a separate repo, because it seems like it should be an optional plug-in rather than core functionality. The four existing filesystems are universal enough to be part of the core, but I propose that any future ones be modular plugins.
We could create a method for registering a new backend on the BrowserFS namespace, like _.mixin, so validation can be performed.
Others could then add their own backends without needing to send PRs to this repo.
We could have a wiki page with a list of all backends.
I can then add other cloud storage providers (Box, Google Drive, etc.) this way.
Backends can list this repo as a dependency in their bower.json.
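The registration idea above could look something like the following. This is a hypothetical sketch: the names `registerBackend`, `isAvailable`, and `DropboxFS` are illustrative, not an existing BrowserFS API.

```javascript
// Hypothetical sketch of a backend-registration hook on the BrowserFS
// namespace. Core validates the backend before exposing it, so plugin
// repos can register themselves without sending PRs to core.
var BrowserFS = { FileSystem: {} };

BrowserFS.registerBackend = function (name, backend) {
  // Minimal validation: a backend must report whether it can run in the
  // current browser, and must implement at least the core stat method.
  if (typeof backend.isAvailable !== 'function' ||
      !backend.prototype || typeof backend.prototype.stat !== 'function') {
    throw new Error("Backend '" + name + "' is missing required methods.");
  }
  BrowserFS.FileSystem[name] = backend;
};

// A plugin repo could then register itself like so:
function DropboxFS() {}
DropboxFS.isAvailable = function () { return true; };
DropboxFS.prototype.stat = function (path, isLstat, cb) { /* ... */ };
BrowserFS.registerBackend('Dropbox', DropboxFS);
```

The validation step is the point of routing registration through core: it lets us reject backends that don't satisfy the FileSystem interface before any user code touches them.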
This would also suggest separating the test suite into its own repo. I'm trying to hook the Dropbox backend up to the Karma test suite, but because the tests are in this repo, I'm just doing a messy ../dropbox-fs/dropbox.js hack in karma.conf.js.
Since the external API is always the same and only the implementation differs between backends, the one test suite should be enough for all plugins (plugins can obviously have their own test suites as well for testing internal behaviour).
See commit b7d3d41.
Even with the hackfix, it's still possible to get errors on occasion.
Both sides need to set up the message passing interface correctly to handle browser FS requests without disturbing any existing messages.
Ideal solution: User can encapsulate the default supplied BrowserFS send/receive functions how they want, or use them directly. Allows users to use BrowserFS easily, or in place of webworker messages.
This will make it even easier to make a custom BrowserFS file system (and I'd use them, too).
I should start by making templates for the following cases:
We need a stat object to mirror Node's stat.
I need to investigate if this is mutable in Node. We can do clever things with getters/setters if needed.
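A minimal sketch of what such a stat object might look like, using the FILE/DIRECTORY item-type constants that appear elsewhere in the built BrowserFS output; the exact field set here is illustrative, not the final design.

```javascript
// Minimal sketch of a Stats object mirroring Node's fs.Stats.
// The item-type constants follow the Stats.FILE = 1 / Stats.DIRECTORY = 2
// scheme seen in the compiled BrowserFS dist; other fields are assumptions.
function Stats(itemType, size, mode, mtime) {
  this.size = size;
  this.mode = mode;
  this.mtime = mtime || new Date();
  this._itemType = itemType;
}
Stats.FILE = 1;
Stats.DIRECTORY = 2;
Stats.prototype.isFile = function () { return this._itemType === Stats.FILE; };
Stats.prototype.isDirectory = function () { return this._itemType === Stats.DIRECTORY; };

var s = new Stats(Stats.FILE, 1024, 0x1a4);
s.isFile();      // true
s.isDirectory(); // false
```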
Add an IndexedDB file system that is the following:
I may eventually add symlink support (just to have a symlink-capable file system), but it's much less important right now than the above.
I imagine this will be the most useful browser-local backend for BrowserFS, as IndexedDB is decently supported across browsers and can store large files.
Currently they are executed as
fs.mkdir("./test", mcb);
fs.mkdir("./test/fixtures", mcb);
fs.mkdir("test/fixtures/node", mcb);
fs.mkdir is an asynchronous method, meaning that, for example, /test/fixtures/ could be created before /test/, which would trigger an error since the parent of fixtures would not exist at that point. These calls should be chained, moving each one into the callback of its predecessor, to avoid this race condition.
I think we've agreed on keeping things modular and having separate repos for each backend (or each one not currently this repo, at least). To make it easier for users to find all the backends, it would be good to put them all under one account.
This account would have:
[Core + Test suite] <- [Backends]
[Test suite] <- [Backends] <- [Core]
Implement a buffer, which is how node users specify what to write to a file.
Add an XmlHttpRequest filesystem that grabs files from a server. To start, it will use a JSON-based index file that contains any relevant file properties (mainly: file size, whether or not it's a directory, and potentially permissions information).
I'm working on this now, as it's required for Doppio integration.
Many methods in this library, and in APIs upon which it depends, are asynchronous, which could lead to some messy callback situations down the line.
I know the plan is to move to TypeScript some time. Async support in TypeScript is a while off (the 1.x series at least). Is there any way we can use some nice await/defer constructs in the meantime?
ES6's native async support is obviously ages away, but moving to IcedCoffeeScript would be pretty easy (it's a superset).
How do ProxyFile stubs work when the filesystem uses no specialization?
How can they work with specialization?
Does the filesystem know ahead of time if the request is going to be proxied, or does it only support translating existing FDs into ProxyFiles (which I think is ideal)?
Maybe it would be good to have a FileSystem.GetProxyFd(fd) function.
Currently the Dropbox and Google Drive apps are registered under my accounts. We should create accounts with these services with some BrowserFS email address, so that we can all log in and access the app consoles, generate new API keys etc.
Any feedback users have would then go to the shared email account too, which is better than it going to my personal account.
PreloadFile should have a generic implementation for property-related functions, like chmod.
Since PreloadFile has a reference to the file system that created it, these implementations should be guarded with:
unless @_fs.supportsProps()
  cb new BrowserFS.ApiError BrowserFS.ApiError.NOT_SUPPORTED
I'm not sure if this is possible, but it would make things nicer.
Currently, here's how our tests run:
fs callbacks are handled. It would be nice to insert a second part of the unit test where the function registered with process.exit is run. I'll need to look at the Jasmine docs.
The BrowserFS core is meant to validate and normalise paths, so that backends can depend on the paths they receive being valid.
The empty string is not a valid path; however, when the stat method is called with it, the string reaches the backend's method, and that method must check for it and throw an error, as I have done in the Dropbox and HTML5 FS backends.
This should be fixed in the core, and the workarounds removed from these two backends.
Here's the relevant test case that exposes this bug.
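The core-side fix could be a single validation pass before dispatching to any backend. This is a hedged sketch: `checkPath` is a hypothetical helper, not an existing BrowserFS function.

```javascript
// Sketch of core-side path validation, run before dispatch, so backends
// can rely on never receiving an invalid path. `checkPath` is a made-up
// name for illustration.
function checkPath(path) {
  if (typeof path !== 'string' || path === '') {
    throw new Error('Invalid path.');
  }
  return path;
}

// A core-level stat wrapper would validate first, then dispatch:
function stat(path, callback) {
  var err = null;
  try { checkPath(path); } catch (e) { err = e; }
  if (err) return callback(err);
  // ...dispatch to the backend's stat implementation...
  callback(null, { /* stats */ });
}
```

With this in core, the duplicated empty-string checks in the Dropbox and HTML5 FS backends could be deleted.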
You should be able to instantiate the API in some way.
For example, for an easy one-filesystem case:
fs.instantiate(BrowserFS.BrowserStorage())
For multiple in a hierarchy, you'd have to use something more complex.
Currently, PreloadFile stores no metadata relating to what has changed.
At the least, PreloadFile should track:
We need a mechanism to say 'run these tests for this particular backend'.
Some example tests:
- localStorage: Fill up localStorage and ensure an appropriate error is thrown. I've heard some versions of IE silently fail; this would be a useful test.
- FileSystem: Try to use a FS that isn't appropriately provisioned.
- Dropbox: Try to use Dropbox when the user hasn't authenticated.
The easiest way to do this: Define a generateUnitTest function on file systems with specific tests. We include this as part of the testing code; it's not distributed with BFS at all. We pass it the structures it needs to generate its specific tests.
We could even potentially use this as a factory method for constructing backends for testing. Meaning, it's responsible for constructing the backend, generating generic unit tests, generating specific unit tests, etc. We can define generateUnitTest on FileSystem, which handles generating generic unit tests.
e.g. BrowserStorage defaults to IndexedDB if it's available, but falls back to localStorage if it's not.
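The fallback logic amounts to simple feature detection. In this sketch the environment object stands in for `window`, and `chooseStorageBackend` is an illustrative name rather than a real BrowserFS function.

```javascript
// Illustrative feature detection for the proposed BrowserStorage fallback.
// `env` stands in for `window` so the logic is testable outside a browser.
function chooseStorageBackend(env) {
  if (env.indexedDB) return 'IndexedDB';
  if (env.localStorage) return 'localStorage';
  return null; // no browser-local storage available
}

chooseStorageBackend({ indexedDB: {}, localStorage: {} }); // 'IndexedDB'
chooseStorageBackend({ localStorage: {} });                // 'localStorage'
```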
I need to use a unit testing framework that I can run in the browser, and potentially outside of the browser.
When designing this, maybe I should specify how to write an encrypted file with authentication information / test configuration for various filesystems. It would be password protected, so a password would be required to start the test.
Alternatively: Prompt for username/password before test begins, then maybe a browser-side password saver could handle saving things for convenience.
Right now, error messages thrown by the MFS are created by the mounted file systems themselves, which have no idea where they are mounted.
If we standardize our error messages, then the MFS can interpose on error messages to append the mount point to them.
This is low priority for now.
Write a simple in-memory filesystem that allows you to mount other filesystems at arbitrary locations.
This provides a simple mechanism for interacting with files across persistent storage mechanisms.
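The core of such a filesystem is the mount-table lookup: find the longest mount-point prefix for a path and hand back the backend plus the path relative to that mount point. All names in this sketch are illustrative.

```javascript
// Sketch of a mountable filesystem's path resolution: longest-prefix match
// from mount point to backend. Strings stand in for backend objects here.
function MountableFS() {
  this.mounts = {}; // mount point -> backend filesystem
}
MountableFS.prototype.mount = function (point, fs) {
  this.mounts[point] = fs;
};
MountableFS.prototype.resolve = function (path) {
  var best = '';
  for (var point in this.mounts) {
    if (path.indexOf(point) === 0 && point.length > best.length) best = point;
  }
  if (best === '') return null; // nothing mounted over this path
  return { fs: this.mounts[best], rel: path.slice(best.length) || '/' };
};

var mfs = new MountableFS();
mfs.mount('/', 'memoryFS');
mfs.mount('/home', 'localStorageFS');
mfs.resolve('/home/user/a.txt'); // { fs: 'localStorageFS', rel: '/user/a.txt' }
```

A production version would also need to match on path-component boundaries (so /homework doesn't hit the /home mount), but the prefix lookup above is the essential mechanism.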
What do users need to implement for custom filesystems?
What can they optionally implement?
We currently store files in localStorage like so:
localStorage[/path/to/file] = binary string representation of data
We should be able to convert the path into a binary string:
localStorage[binary string representation of /path/to/file] = binary string representation of data
The keys are counted against the localStorage storage quota, so this could be a decent savings.
We'd still repeat parent directories, but it's better than what we currently do.
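The savings come from how localStorage quotas are counted: they are measured in UTF-16 code units, so packing two bytes of an ASCII path into each character roughly halves what the key costs. A sketch of the packing (helper name is illustrative):

```javascript
// Sketch: pack an ASCII path into a "binary string" key, two bytes per
// UTF-16 character, roughly halving the quota the key consumes.
function packBinaryString(str) {
  var out = '';
  for (var i = 0; i < str.length; i += 2) {
    var hi = str.charCodeAt(i);
    var lo = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
    out += String.fromCharCode((hi << 8) | lo);
  }
  return out;
}

packBinaryString('/path/to/file').length; // 7 characters instead of 13
```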
We only have little endian support atm.
After experimenting with TypeScript with Doppio, I have determined that it is very suitable for projects such as this one.
As a result, sometime later in the summer I will likely move to TypeScript. I do not anticipate that it will take longer than a week of work, and the payoff will be worth the amount of churn required to perform the conversion.
List of benefits:
- FileSystem implementations are appropriately defined to the interface.

IE9 passes all of our unit tests, but fails to load classes properly in Doppio (magic number failure). It's also absurdly slow due to the TypedArray implementation; so slow that I was unable to get IE9's debug mode past preloading.
Solution (for slowness):
We only need a polyfill for DataView, which is what Buffers use. It should be easy to modify the DataView polyfill to directly use methods to convert its internal array of bytes into other forms. Right now, it constructs a new TypedArray and ArrayBuffer each time you call a get function, just for the small segment of the array that you are processing.
They did this to reuse code, but it's very very very slow.
It would be manually intensive, but probably simple, to change the DataView polyfill to use the individual get and set functions that the typed arrays use directly, bypassing this expensive object construction (which a profile reveals is the most expensive operation that's going on).
I am OK with absolutely ditching the TypedArray polyfills in favor of a shaved-down DataView-only polyfill. In fact, that's my favored solution right now.
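The shaved-down approach amounts to reading straight off the polyfill's internal byte array. In this sketch, the `bytes` argument stands in for the polyfill's internal storage; the function name is illustrative.

```javascript
// Sketch of the proposed polyfill change: compute a DataView-style
// getUint16 directly from the internal byte array, instead of building a
// fresh ArrayBuffer + TypedArray on every call.
function polyfillGetUint16(bytes, byteOffset, littleEndian) {
  var a = bytes[byteOffset], b = bytes[byteOffset + 1];
  return littleEndian ? ((b << 8) | a) : ((a << 8) | b);
}

var bytes = [0x12, 0x34];
polyfillGetUint16(bytes, 0, false); // 0x1234 -- big-endian, DataView's default
polyfillGetUint16(bytes, 0, true);  // 0x3412
```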
This is high priority for the Doppio merger, but low priority for me for now. As far as I can tell, this is the only issue blocking the Doppio merger.
Note to @perimosocordiae: I would be OK with performing the merger as-is and leaving this bug for later fixing.
We should make it possible for file systems to implement a synchronous API.
Like system properties, this will be optional.
The default FileSystem implementation can be augmented to automatically provide the asynchronous API if the file system implements the synchronous API -- it'll just call the synchronous equivalent and then run the callback.
This should happen after asynchronous support is solid.
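The automatic wrapping could look like this. `makeAsync` and `readFileSync` are hypothetical names for illustration; the real method names would come from the FileSystem API.

```javascript
// Sketch: derive an async method from a sync implementation by calling the
// sync version and deferring the callback, so the async API stays async
// even when backed by a synchronous filesystem like localStorage.
function makeAsync(fs, syncName) {
  return function (path, cb) {
    var result, error = null;
    try { result = fs[syncName](path); } catch (e) { error = e; }
    // Defer so callers never see the callback fire re-entrantly.
    setTimeout(function () { cb(error, result); }, 0);
  };
}

var backend = { readFileSync: function (p) { return 'contents of ' + p; } };
var readFile = makeAsync(backend, 'readFileSync');
readFile('/a.txt', function (err, data) { /* data === 'contents of /a.txt' */ });
```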
À la http://www.gnu.org/software/libc/manual/html_node/Error-Codes.html
Not a big priority, but it would mean we're more complete.
Currently this is only supported in Chrome, but apparently it's coming in 'near future' versions of all other desktop browsers.
I found a library which wraps it in a UNIX-like API, which should make tying it to Node's own UNIX-like API super easy. Of all the non-cloud backends, this looks like the best, feature-wise, so I think we should plan for the future when this has better support.
Chrome is the most popular browser already, so IMO it would be worth starting on it now, rather than waiting for the other browsers.
The Node API encourages using these when queueing up writes. Should be simple to implement generically as a queue of requests, right?
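A generic version could be a queue that runs one operation at a time, starting each only when its predecessor's callback fires. This is a sketch, not BrowserFS code; the class name is illustrative.

```javascript
// Sketch of a generic write-serializing request queue: operations run one
// at a time, each starting only when the previous one's done() fires.
function RequestQueue() {
  this.pending = [];
  this.busy = false;
}
RequestQueue.prototype.push = function (op) {
  this.pending.push(op);
  if (!this.busy) this._next();
};
RequestQueue.prototype._next = function () {
  var op = this.pending.shift();
  if (!op) { this.busy = false; return; }
  this.busy = true;
  var self = this;
  op(function () { self._next(); }); // op calls done() when finished
};

var order = [];
var q = new RequestQueue();
q.push(function (done) { setTimeout(function () { order.push(1); done(); }, 10); });
q.push(function (done) { order.push(2); done(); });
// order ends up [1, 2] even though the first operation is slower
```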
Certain backends will generate parent directories that don't exist when a file is written.
They should be updated so that writes only succeed when the parent directory of the file in the given path already exists. This will probably be achieved by editing the sync method of their custom File subclass.
Backends currently known to have this issue:
Hi,
I ran into some build issues with the latest version of doppio and browserfs.
Everything is fine if I use the pre-built git versions of vendor/browserfs/dist;
browserfs also passes its tests.
However, when I build my own browserfs and copy the new versions from browser/lib into browserfs/dist, I can no longer run doppio debug in the browser.
But make doppio test passes! Grrrr. Here's the original symptom.
Line 699
BrowserFS.node.fs.Stats = function() {
Uncaught TypeError: Cannot read property 'node' of undefined
Stats.FILE = 1;
Stats.DIRECTORY = 2;
Stats.SYMLINK = 3;
Stats.SOCKET = 4;
The underlying cause: a build order / dependency issue, because Make's 'wildcard' does not guarantee a build order.
Using Make's sort function is a reasonable workaround, if you're happy requiring a modern version of Make.
FYI steps to finding the problem.
In the official 'dist' builds Browser.node is defined early ...
grep 'BrowserFS.node.*=' dist-orig/browserfs.js
BrowserFS.node = {};
BrowserFS.node.Buffer = function() {
BrowserFS.node.fs = function() {
BrowserFS.node.fs.Stats = function() {
BrowserFS.node.path = function() {
resolved = BrowserFS.node.path.normalize(cwd + (cwd !== "/" ? path.sep : "") + resolved);
BrowserFS.node.process = function() {
data = BrowserFS.node.Buffer((_ref2 = req.response) != null ? _ref2 : 0);
In my locally built version, BrowserFS.node = {} appears too late...
grep 'BrowserFS.node.*=' dist/browserfs.js
BrowserFS.node.fs.Stats = function() { <-- fs_stats TOO EARLY!
BrowserFS.node.process = function() {
BrowserFS.node.path = function() {
resolved = BrowserFS.node.path.normalize(cwd + (cwd !== "/" ? path.sep : "") + resolved);
BrowserFS.node.Buffer = function() {
BrowserFS.node.fs = function() { <-- SHOULD BE BEFORE fs.Stats above
BrowserFS.node = {}; <-- 000-browserfs.js SHOULD BE FIRST
data = BrowserFS.node.Buffer((_ref2 = req.response) != null ? _ref2 : 0);
My suspicion fell on the following ...
SRCS_CORE := $(wildcard src/core/*.coffee)
FYI On my OSX machine with Make v3.82
SRCS_CORE := $(wildcard src/core/*.coffee)
is not sorted
In fact, a Google search shows this is true for 3.82 onwards, despite what the docs say -
https://bugzilla.redhat.com/show_bug.cgi?id=635607
"The NEWS file for make-3.82 says:
If I'm reading "up to and including this release" right, the order was NOT supposed to change in this version. But it did.
"
Fortunately, Make supports a sort function, so I'll send a pull-request with the following -
# From Make 3.82 onwards, wildcard returns filenames in arbitrary order.
# Alphabetically sorting the values is sufficient for browserfs build dependencies.
SRCS_CORE_UNSORTED := $(wildcard src/core/*.coffee)
SRCS_CORE := $(sort $(SRCS_CORE_UNSORTED))
Alternatively one could split the core, or just explicitly name the files in the Makefile.
Best,
Lawrence.
OK, tests are launched properly now, but they are registered on page load.
If we can figure out how to register tests asynchronously, then we can trigger test registration after loading in any test files for #19.
Right now, the issue is that Karma's Jasmine adapter appears to trigger Jasmine tests once page load finishes. I wonder if there is a way to trigger it manually?
It appears that using the ASCII string util for the binary string format is incorrect, and the Node test cases don't stress the issue.
I figured this out when I realized that Doppio used to use BINARY to get an array of numbers that represents the file. Our ASCII behavior actually removes the highest bit from the data, like Node does, as Node defines ASCII as 7-bit ASCII.
I think this is a matter of copying+pasting the ASCII behavior without the truncation, but I'm not completely certain yet. This is low priority for now.
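The difference between the two decodings is exactly one bitmask. A sketch of the per-byte behavior (helper names are illustrative):

```javascript
// Sketch of the ascii-vs-binary distinction: Node's 'ascii' decoding masks
// each byte to 7 bits, while 'binary' (a.k.a. latin1) must preserve all 8.
function byteToAsciiChar(b)  { return String.fromCharCode(b & 0x7f); }
function byteToBinaryChar(b) { return String.fromCharCode(b & 0xff); }

byteToAsciiChar(0xca).charCodeAt(0);  // 0x4a -- high bit stripped
byteToBinaryChar(0xca).charCodeAt(0); // 0xca -- byte preserved
```

Using the first helper where the second is required silently corrupts any byte >= 0x80, which is why Doppio's class files (arbitrary binary data) break while ASCII-only test fixtures pass.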