
zkapauthorizer's Introduction

Project Hosting Moved

This project can now be found at https://whetstone.private.storage/Privatestorage/PrivateStorageio

PrivateStorageio

The backend for a private, secure, and end-to-end encrypted storage solution.

Documentation

There is documentation for:

  • Users: docs/user/README.rst
  • Operators/Admins: docs/ops/README.rst
  • Developers: docs/dev/README.rst

The documentation can be built using this command:

$ nix-build docs.nix

The documentation is also built on and published by CI.

zkapauthorizer's People

Contributors

crwood, exarkun, hacklschorsch, meejah, tomprince

zkapauthorizer's Issues

lease maintenance code cannot call `stat_shares`

It fails with a Foolscap violation:

2020-02-25T15:40:33-0500 [-] Fuzzy timer service (lease maintenance service)
        Traceback (most recent call last):
          File "/nix/store/4ba7dicrqyiazsmwgqjj8p9xwx800gap-python-2.7.17-env/lib/python2.7/site-packages/twisted/internet/defer.py", line 151, in maybeDeferred
            result = f(*args, **kw)
          File "/nix/store/4ba7dicrqyiazsmwgqjj8p9xwx800gap-python-2.7.17-env/lib/python2.7/site-packages/_zkapauthorizer/lease_maintenance.py", line 402, in <lambda>
            maintain_leases,
          File "/nix/store/4ba7dicrqyiazsmwgqjj8p9xwx800gap-python-2.7.17-env/lib/python2.7/site-packages/twisted/internet/defer.py", line 1613, in unwindGenerator
            return _cancellableInlineCallbacks(gen)
          File "/nix/store/4ba7dicrqyiazsmwgqjj8p9xwx800gap-python-2.7.17-env/lib/python2.7/site-packages/twisted/internet/defer.py", line 1529, in _cancellableInlineCallbacks
            _inlineCallbacks(None, g, status)
        --- <exception caught here> ---
          File "/nix/store/4ba7dicrqyiazsmwgqjj8p9xwx800gap-python-2.7.17-env/lib/python2.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
            result = g.send(result)
          File "/nix/store/4ba7dicrqyiazsmwgqjj8p9xwx800gap-python-2.7.17-env/lib/python2.7/site-packages/_zkapauthorizer/controller.py", line 750, in bracket
            result = yield between()
        foolscap.tokens.Violation: Violation: ('RIStorageServer.tahoe.allmydata.com(<RemoteReference at 0x7fc143290c10 [pb://ysj4...@tcp:x.x.x.x:8898/3nml...]>) does not offer stat_shares',)

Hard-code 512000 of tokens/passes for voucher redemption for testing

Currently a voucher is redeemed for 100 passes. This is probably a long way from a realistic number of passes to issue in exchange for one payment because pass value will be kept fairly low to avoid wasted value.

Change the hard-coded value to 512000 for testing at a more realistic scale.

Add an interface to publish the cost (in ZKAPs) of stored data, as observed during the last lease crawl

To allow users to plan their storage usage and their purchasing of new ZKAPs, it would be helpful for them to know what it costs to maintain the data they currently have stored.

The lease renewal process defines the cost of this maintenance. As currently implemented, it periodically visits all files reachable from a root capability and renews leases of all objects found if they don't already have a lease of at least some configurable duration.

Lease renewal costs a number of ZKAPs proportional to the size of the object, with quantization. This makes the total cost of maintaining objects for one period equal to:

maintenanceCost = sum 
  [ quantizedSize(object) / ZKAPValue 
  | object <- reachableObjects(rootCap) 
  ]

And this value should be exposed in the HTTP API somewhere. A client UX can then estimate time to ZKAP depletion by dividing the number of ZKAPs left by the cost to maintain objects:

estimatedDepletion = remainingZKAPs / maintenanceCost * leasePeriod
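
The two formulas above can be sketched in Python. This is a minimal sketch, not the plugin's implementation: the function names, the round-up quantization rule, and the sample sizes are all assumptions made for illustration.

```python
from datetime import timedelta


def quantized_passes(size, zkap_value):
    """ZKAPs for one object, rounding up (an assumed quantization rule)."""
    return -(-size // zkap_value)  # ceiling division on integers


def maintenance_cost(object_sizes, zkap_value):
    """Total ZKAPs needed to renew every reachable object for one lease period."""
    return sum(quantized_passes(size, zkap_value) for size in object_sizes)


def estimated_depletion(remaining_zkaps, cost, lease_period):
    """Approximate time until the remaining ZKAPs run out."""
    if cost == 0:
        return None  # nothing is stored, so nothing is being spent
    return (remaining_zkaps / cost) * lease_period


# Hypothetical sizes, in bytes, of the objects reachable from a rootcap.
sizes = [1_500_000, 250_000, 1_000_000]
cost = maintenance_cost(sizes, zkap_value=1_000_000)
depletion = estimated_depletion(100, cost, timedelta(days=31))
```

Returning None when the cost is zero avoids a division by zero for a client that has not stored anything yet.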

The HTTP API should present the information like so:

GET /v1/unblinded-token

200 OK
Content-Type: application/json

{ ...
, "lease-maintenance-spending": {
    "when": <ISO8601 timestamp>
  , "amount": <integer ZKAP count>
  }
}

where "when" gives an approximate timestamp for when the spending took place and "amount" gives the number of ZKAPs which were spent on lease maintenance.

If lease maintenance has never been performed then the value for lease-maintenance-spending will be null instead.
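
One way the "lease-maintenance-spending" field might be produced, including the null case, is sketched below. The helper name spending_field and the sample timestamp and amount are hypothetical; only the field names come from the proposal above.

```python
import json
from datetime import datetime, timezone


def spending_field(when, amount):
    """Build the value of "lease-maintenance-spending" for the response body."""
    if when is None:
        # Lease maintenance has never run, so there is nothing to report.
        return None
    return {"when": when.isoformat(), "amount": amount}


never_run = json.dumps({"lease-maintenance-spending": spending_field(None, 0)})
after_run = json.dumps({
    "lease-maintenance-spending": spending_field(
        datetime(2020, 2, 25, 15, 40, 33, tzinfo=timezone.utc), 12
    ),
})
```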

Manage a database of ZKAPs available for use

  1. Populate the database with locally generated dummy tokens when a voucher is submitted for redemption.
  2. Update the voucher status in the database so it is reflected in the status API.
  3. Consume tokens from the database when submitting API requests to storage servers.
  4. Correctly reflect ZKAP state in the API.

Require authentication for the web interface

We don't want to allow arbitrary processes with access to the Tahoe-LAFS web API to interact with the plugin interface. We want to expose information about vouchers in this interface so that nice clients (eg GridSync) can provide a good UX. We don't want to leak this information to attackers (they might steal unredeemed vouchers, for example).

Do the usual Tahoe-LAFS thing where the web interface requires access to the private node directory (by requiring a secret be read from there to use the web interface) so that filesystem permissions can be used to control access to the web API.
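
A rough sketch of the check described above follows. The file name under the private directory and both function names are assumptions for illustration, and how the token reaches the server (a header, for example) is left open.

```python
import hmac
from pathlib import Path


def load_api_secret(node_dir):
    """Read the secret from the private node directory (hypothetical file name)."""
    return Path(node_dir, "private", "api_auth_token").read_text().strip()


def is_authorized(presented_token, secret):
    """Compare in constant time so response timing does not leak the secret."""
    return hmac.compare_digest(
        presented_token.encode("utf-8"), secret.encode("utf-8")
    )
```

Because the secret lives in a file, ordinary filesystem permissions on the private directory decide which local processes can obtain it.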

Stop using integer sets to represent groups of share sizes for pass calculations

_zkapauthorizer.storage_common.required_passes takes share_sizes as an argument. This is a set of int where each element is meant to represent the size of one share. The number of passes required is then the total of the sizes divided by the value of a pass.

The problem, of course, is that shares for a storage index are mostly (if not always) the same size so a "set" like {n, n, n} is actually just {n} which looks like it is 1/3rd the price.
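
The collapse is easy to reproduce. The function below is a stand-in written for this illustration, not the real required_passes; its name, argument order, and rounding are assumptions.

```python
from math import ceil


def required_passes_sketch(bytes_per_pass, share_sizes):
    """Total size of all shares divided by the value of one pass, rounded up."""
    return ceil(sum(share_sizes) / bytes_per_pass)


n = 1_000_000
as_set = required_passes_sketch(n, {n, n, n})   # the set collapses to {n}
as_list = required_passes_sketch(n, [n, n, n])  # a list keeps all three sizes
```

Here as_set is 1 while as_list is 3, so any ordered or counted collection (a list, or a size plus a share count) avoids the undercharge.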

renew_leases_on_server called with the wrong kind of server

renew_leases_on_server expects a ZKAPAuthorizerStorageClient for the server parameter. However, renew_leases calls it with a NativeStorageServer.

NativeStorageServer.get_storage_server will return the ZKAPAuthorizerStorageClient, though possibly the code also makes the mistake of thinking that every storage server connected will be one of ours.

Add an interface to get all unused unblinded tokens

This enables a controller to implement a backup/restore system that preserves the unblinded tokens (which can be made into ZKAPs) which came from redeeming a voucher and which will not be recoverable if the node directory is lost.

Fix the interaction between leases and mutable writes

Mutable writes are performed using slot_testv_and_readv_and_writev. The implementation of this method also renews leases.

If every write implies a lease renewal then writing becomes very expensive for users or charging a fair price becomes very hard for storage providers.

Everything is simpler if writes and leases aren't linked like this. Change the plugin storage protocol so that mutable writes have nothing to do with leases.

Considering `new_length` in `slot_testv_and_readv_and_writev` for storage usage is error prone and/or complicated

Our storage server tries to calculate the cost of storage using either the real amount of data stored or the virtual maximum size of shares deriving from the new_length field in tw_vectors (which can either extend or truncate a share without actually writing any data).

It seems reasonable to consider the virtual size of a share when calculating costs since by accepting such an operation we're strongly suggesting we will eventually accept the real data for the share.

However, actually correctly calculating storage usage accounting for new_length proves complicated and quite difficult without violating a bunch of abstraction boundaries in Tahoe-LAFS (for example, the virtual size of a share isn't exposed via any public-looking Python API).

Retry redemption on failure

If voucher redemption fails then we should try it again. If the client crashes/exits and then starts up again with an unredeemed voucher, we should try to redeem it.

Consider the privacy consequences of these operations.

Remove the pip-tools-based dependency management for CI

pip-tools can't handle versions of dependencies from git. This is not often what you want, but when you want it, you really want it. pip-tools will write out a requirements.txt file containing git+https URIs if you ask it to, but pip will not read them unless you also omit hashes for all packages.

See #36

Add an interface for getting the number of ZKAPs remaining

For debugging, reporting, UX purposes (such as estimating depletion time) it would be useful to know how many unused ZKAPs remain in the local database. Together with #55 this should allow an estimate for the time when there are no more ZKAPs in the local database.

Use Nix to pin dependencies for CI

Nix is a better packaging tool than the smog of tools that centers around pip. It has better support for different archive formats (including VCS), better tools for updating some dependencies, better support for non-Python packages, etc.

This is blocked on removing the pip-tools-based dependency management.

The lease maintenance code assumes the rootcap always exists

It might not exist very early in the lifetime of the first run of the Tahoe-LAFS process.

Currently, this causes node startup to fail which prevents the rootcap from ever being created.

Instead, the lease maintenance code should be tolerant of this condition.
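
One tolerant approach might look like the sketch below. Storing the rootcap in a file, and both function names, are assumptions made for illustration; the real code may locate the rootcap differently.

```python
from pathlib import Path


def read_rootcap(path):
    """Return the rootcap text, or None if it has not been created yet."""
    try:
        return Path(path).read_text().strip()
    except FileNotFoundError:
        return None


def maintain_leases_if_possible(rootcap_path, maintain):
    """Run maintain(rootcap) only when a rootcap exists; otherwise skip quietly."""
    rootcap = read_rootcap(rootcap_path)
    if rootcap is None:
        return False  # try again on the next scheduled run
    maintain(rootcap)
    return True
```

Skipping the run instead of raising lets node startup continue, so the rootcap can be created later and picked up by a subsequent run.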

Account for ZKAPs spent renewing leases

Somewhere, a record must be kept of how many storage indexes were renewed, their sizes, and the number of ZKAPs spent on the renewals. This will support reporting to a user so they know what they're spending.

This information must be recorded during lease maintenance and then exposed via the web interface.

The codecov reports are missing

CI is configured to collect coverage information from the test suite and submit it to codecov. However, a look at the codecov badge or the page for the project reveals no coverage information.

The number of tokens to submit with a voucher redemption is not consistently respected

PaymentController.redeem accepts a number of tokens and supplies a default. If a value is given, it is used. However, if redemption doesn't succeed here then when the retry logic activates and the redemption is tried again, the value is ignored and the default is used.

This is related to the fact that on retries the original tokens are not re-used as they should be (since the correct number of tokens is already persisted in the VoucherStore but this is ignored during retries as well).

"aniso8601" and "treq" dependencies are undeclared to setuptools

Attempting to run a tahoe daemon with the current (092c482) ZKAPAuthorizer plugin enabled produces the following exceptions:

 2020-01-02T19:45:45-0500 [-] Unhandled Error
 	Traceback (most recent call last):
 	  File "/home/user/code/tahoe-lafs/src/allmydata/storage_client.py", line 674, in _make_storage_system
 	    self.get_rref,
 	  File "/home/user/code/tahoe-lafs/src/allmydata/storage_client.py", line 584, in _storage_from_foolscap_plugin
 	    in getPlugins(IFoolscapStoragePlugin)
 	  File "/home/user/code/tahoe-lafs/src/allmydata/storage_client.py", line 582, in <dictcomp>
 	    plugin.name: plugin
 	  File "/home/user/.local/virtualenv/py27/lib/python2.7/site-packages/twisted/plugin.py", line 213, in getPlugins
 	    allDropins = getCache(package)
 	--- <exception caught here> ---
 	  File "/home/user/.local/virtualenv/py27/lib/python2.7/site-packages/twisted/plugin.py", line 171, in getCache
 	    provider = pluginModule.load()
 	  File "/home/user/.local/virtualenv/py27/lib/python2.7/site-packages/twisted/python/modules.py", line 392, in load
 	    return self.pathEntry.pythonPath.moduleLoader(self.name)
 	  File "/home/user/.local/virtualenv/py27/lib/python2.7/site-packages/twisted/python/reflect.py", line 308, in namedAny
 	    topLevelPackage = _importAndCheckStack(trialname)
 	  File "/home/user/.local/virtualenv/py27/lib/python2.7/site-packages/twisted/python/reflect.py", line 255, in _importAndCheckStack
 	    reraise(excValue, excTraceback)
 	  File "/home/user/code/ZKAPAuthorizer/src/twisted/plugins/zkapauthorizer.py", line 19, in <module>
 	    from _zkapauthorizer.api import (
 	  File "/home/user/code/ZKAPAuthorizer/src/_zkapauthorizer/api.py", line 32, in <module>
 	    from ._plugin import (
 	  File "/home/user/code/ZKAPAuthorizer/src/_zkapauthorizer/_plugin.py", line 53, in <module>
 	    from .model import (
 	  File "/home/user/code/ZKAPAuthorizer/src/_zkapauthorizer/model.py", line 37, in <module>
 	    from aniso8601 import (
 	exceptions.ImportError: No module named aniso8601
2020-01-02T19:51:52-0500 [-] Unhandled Error
 	Traceback (most recent call last):
 	  File "/home/user/code/tahoe-lafs/src/allmydata/storage_client.py", line 674, in _make_storage_system
 	    self.get_rref,
 	  File "/home/user/code/tahoe-lafs/src/allmydata/storage_client.py", line 584, in _storage_from_foolscap_plugin
 	    in getPlugins(IFoolscapStoragePlugin)
 	  File "/home/user/code/tahoe-lafs/src/allmydata/storage_client.py", line 582, in <dictcomp>
 	    plugin.name: plugin
 	  File "/home/user/.local/virtualenv/py27/lib/python2.7/site-packages/twisted/plugin.py", line 213, in getPlugins
 	    allDropins = getCache(package)
 	--- <exception caught here> ---
 	  File "/home/user/.local/virtualenv/py27/lib/python2.7/site-packages/twisted/plugin.py", line 171, in getCache
 	    provider = pluginModule.load()
 	  File "/home/user/.local/virtualenv/py27/lib/python2.7/site-packages/twisted/python/modules.py", line 392, in load
 	    return self.pathEntry.pythonPath.moduleLoader(self.name)
 	  File "/home/user/.local/virtualenv/py27/lib/python2.7/site-packages/twisted/python/reflect.py", line 308, in namedAny
 	    topLevelPackage = _importAndCheckStack(trialname)
 	  File "/home/user/.local/virtualenv/py27/lib/python2.7/site-packages/twisted/python/reflect.py", line 255, in _importAndCheckStack
 	    reraise(excValue, excTraceback)
 	  File "/home/user/code/ZKAPAuthorizer/src/twisted/plugins/zkapauthorizer.py", line 19, in <module>
 	    from _zkapauthorizer.api import (
 	  File "/home/user/code/ZKAPAuthorizer/src/_zkapauthorizer/api.py", line 32, in <module>
 	    from ._plugin import (
 	  File "/home/user/code/ZKAPAuthorizer/src/_zkapauthorizer/_plugin.py", line 57, in <module>
 	    from .resource import (
 	  File "/home/user/code/ZKAPAuthorizer/src/_zkapauthorizer/resource.py", line 52, in <module>
 	    from .controller import (
 	  File "/home/user/code/ZKAPAuthorizer/src/_zkapauthorizer/controller.py", line 66, in <module>
 	    from treq import (
 	exceptions.ImportError: No module named treq

As expected, a pip install aniso8601 treq fixes the above two errors. Accordingly, the two missing dependencies should be explicitly declared (in both setup.py and requirements.txt?) so that they get pulled in when installing ZKAPAuthorizer via setuptools/pip.
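
Until the declarations are fixed, a small check run in a fresh environment could flag this kind of problem before the daemon starts. This is only a sketch; the helper name missing_modules is hypothetical and the check covers top-level module names only.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# The two modules named in the tracebacks above.
undeclared = missing_modules(["aniso8601", "treq"])
```

An empty list means both modules are importable in the current environment; it does not prove they are declared as dependencies.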

Fix the names of "test_vectors" and friends

_zkapauthorizer.tests.strategies includes several strategies that begin with "test_...".

Certain silly overzealous test collectors might mistake these for tests. Figure out an alternate naming convention for these things and rename them.

Remove excessive privateness

There isn't much of a public interface to the Python package we implement. However, some of the steps taken to indicate privateness are redundant.

For example, _zkapauthorizer is private. Anything that is accessed beneath that is already, therefore, private. For example, _zkapauthorizer._storage_client is private by virtue of the first _ in the FQPN. The 2nd is redundant.

Watch out for cases where the privacy is not redundant. For example, the _ in _get_rref in _zkapauthorizer._storage_client.ZKAPAuthorizerStorageClient._get_rref looks like it might be redundant but it isn't because instances of ZKAPAuthorizerStorageClient are handed out to application code via the plugin system which is all public: twisted.plugin.getPlugins(allmydata.interfaces.IFoolscapStoragePlugin)[0].get_storage_client().

Rename "Secure Access Tokens" to be "zero-knowledge access passes"

Or "ZKAPs" for short.

"Secure Access Token" has a couple shortcomings:

  • "Secure" is vague. Beyond giving users good feelings here it's not clear what it might mean.
  • "Tokens" are at most half of the story. There are "random" and "blinded" tokens. There are also blinded and unblinded token signatures. There are also "passes".

Discussion of these issues and the nature of the value in question led to the idea of "zero-knowledge access passes". The passes are truly "zero-knowledge" in the sense that they prove the bearer was granted a pass and they leak no additional knowledge (and the system comes with a ZK proof of this). "Passes" in the name refocuses the concept on the part of the system where user interaction occurs and value (i.e., storage) is exchanged. The tokens are important but they're essentially internal. Passes are actually exchanged between client and server.

Add voucher state information to voucher list interface

For debugging, reporting, UX purposes it would be useful to know what the current state of each voucher known by the client side of the plugin is.

Here's my conception of the possible states of a voucher:

  • Known but unredeemed. This is the starting state of a voucher.
  • Redemption in progress. This is a transient state that exists as long as there is an in-flight API request to the issuer to redeem the voucher for ZKAPs. Some care needs to be taken with this state as it does not persist across process restarts.
  • Redemption failed for non-payment. If a request to the issuer received a "payment required" response then the voucher has not been redeemed and redemption will not succeed until payment is made.
  • Redemption failed for double-spend protection. If a request to the issuer received an "already redeemed" response then the voucher was already redeemed, either earlier by this client or by another one entirely. Redemption won't ever succeed.
  • Redemption failed for another reason. Other errors from the issuer might be transient, related to bugs in the client or server, or due to externalities like network conditions. The redemption has not yet succeeded and a retry will be made at some point.
  • Redemption succeeded. The voucher has been redeemed for ZKAPs and the ZKAPs have been added to the local database.

This is not set in stone and the point here is to reflect implementation states to make it easier for people to understand what is happening in the system.

Since we expose a JSON-based HTTP API the states of a voucher can be represented as a short, simple string. For example, "failed-for-non-payment".
We should document the specific meaning of each such string we invent. This could be included in the response in a "description" field or some such.
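
The states listed above could be encoded roughly as follows. Only the string "failed-for-non-payment" comes from this ticket; the other strings, the class name, and the shape of the returned object are placeholders.

```python
from enum import Enum


class VoucherState(Enum):
    """State strings for the JSON API (all but one are placeholder names)."""
    UNREDEEMED = "unredeemed"
    REDEEMING = "redeeming"
    FAILED_FOR_NON_PAYMENT = "failed-for-non-payment"
    FAILED_FOR_DOUBLE_SPEND = "failed-for-double-spend"
    FAILED_FOR_OTHER_REASON = "failed-for-other-reason"
    REDEEMED = "redeemed"


DESCRIPTIONS = {
    VoucherState.UNREDEEMED: "Known but unredeemed.",
    VoucherState.REDEEMING: "A redemption request to the issuer is in flight.",
    VoucherState.FAILED_FOR_NON_PAYMENT: "The issuer reported that payment is required.",
    VoucherState.FAILED_FOR_DOUBLE_SPEND: "The issuer reported the voucher as already redeemed.",
    VoucherState.FAILED_FOR_OTHER_REASON: "Redemption failed and will be retried.",
    VoucherState.REDEEMED: "Redeemed; the ZKAPs are in the local database.",
}


def voucher_status(state):
    """JSON-serializable status with the documented meaning attached."""
    return {"name": state.value, "description": DESCRIPTIONS[state]}
```

Keeping the descriptions next to the enum makes it harder to add a state string without documenting it.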

Set up CI for testing

There's an automated test suite. Run it automatically!

Since we're living on the edge and importing from Tahoe-LAFS python packages we should test against Tahoe-LAFS master@HEAD somehow. That might be all we do for now since this code is definitely broken and useless against all released versions of Tahoe-LAFS.

Deny writes to slots without an active lease

Storage usage is authorized in two ways. First, by requiring ZKAPs to add leases to shares and by eventually garbage collecting shares sometime after all their leases expire. Second, by denying write access to mutable shares when they have no unexpired leases. Read access is never denied to support using the grid itself to coordinate between devices (for example, to exchange ZKAPs or vouchers). This has the benefit of providing a more graceful exit from the service for users who decide to take their data elsewhere.

This ticket is about the second part: denying write access to mutable shares with expired leases.
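
The check itself might be as small as the sketch below. The function names, the use of seconds since the epoch for expiration times, and the choice of PermissionError are assumptions; creating a brand-new slot, which has no leases yet, would need separate handling.

```python
def has_active_lease(lease_expirations, now):
    """True if at least one lease expires strictly after `now`."""
    return any(expiration > now for expiration in lease_expirations)


def check_write_allowed(lease_expirations, now):
    """Raise if a mutable share has no unexpired lease; reads are never checked."""
    if not has_active_lease(lease_expirations, now):
        raise PermissionError("write denied: no unexpired lease on this slot")
```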

Figure out the weirdness with adaption relating to SecureAccessTokenAuthorizerStorageServer

_storage_server.py ends with

# I don't understand why this is required.
# SecureAccessTokenAuthorizerStorageServer is-a Referenceable.  It seems like
# the built in adapter should take care of this case.
from twisted.python.components import (
    registerAdapter,
)
from foolscap.referenceable import (
    ReferenceableSlicer,
)
from foolscap.ipb import (
    ISlicer,
)
registerAdapter(ReferenceableSlicer, SecureAccessTokenAuthorizerStorageServer, ISlicer)

As the comment says, it's not clear to me why this is necessary. Figure it out and fix whatever the issue is so this code can go away or explain why it needs to remain.

Expose size and lease expiration information in a single network API

Add something like this:

def stat_shares(storage_index: [StorageIndex]) -> [ShareStat]:
    """
    For each storage index, retrieve the size and latest lease expiration time (or None for missing information).
    """

where ShareStat has the stored size of the shares for the given storage index and the lease expiration times, if there are any leases.

This is required for automatic lease maintenance, #44. See discussion there for reasoning.
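
A possible shape for ShareStat is sketched here, following the docstring above and keeping only the latest lease expiration. The field names, the use of seconds since the epoch, and the helper stat_from_leases are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShareStat:
    size: int                        # stored size of the share data, in bytes
    lease_expiration: Optional[int]  # latest expiration time, or None if no leases


def stat_from_leases(size, lease_expirations):
    """Summarize a share using the latest of its lease expiration times."""
    latest = max(lease_expirations, default=None)
    return ShareStat(size=size, lease_expiration=latest)
```

Returning both values in one record lets the lease maintenance code decide in a single round trip whether a renewal is needed and what it would cost.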

Automatically maintain leases

The premise of the plugin is that leases are maintained with a scarce resource (ZKAPs) and shares without leases can be garbage collected to reclaim another scarce resource (storage).

For this to work, something must actually maintain leases on shares which are meant to remain alive. There is an existing UI for this in the Tahoe-LAFS CLI, eg tahoe deep-check --add-lease. The user could keep their data alive by using this to manually renew leases.

It should be possible to provide a better experience by renewing leases automatically, though. This removes the need for the user to remember to perform the task and creates the possibility that the task can be performed in a way that leaks less information to observers (the user may fall into patterns that can be identified, automation may be able to avoid doing so).

Rename "payment reference number" to be "vouchers"

"payment reference number" has a number of shortcomings as a name for the concept it is attached to.

  • The value itself is not really numeric. Canonically, it's an opaque byte string.
  • It may be desirable to issue these when no payment has actually taken place (eg, trial access)

Discussion of these issues and the nature of the value in question led to the idea of "voucher" as the correct word.
