opencadc / storage-inventory

Storage Inventory components

License: GNU Affero General Public License v3.0

HTML 0.86% XSLT 2.95% Java 96.14% Dockerfile 0.03% Shell 0.02%

storage-inventory's Introduction

OpenCADC Storage Inventory System

The Storage Inventory system is designed to manage millions to billions of files for a science data archive.

What is it? Concept Documentation

software components

versions

For libraries (cadc-{name}) the version is in the build.gradle file. Libraries are published to maven central under the org.opencadc groupId, for example org.opencadc:cadc-inventory.

For services and agents, the version is in the VERSION file. Docker images are published to the images.opencadc.org repository (currently a Harbor service).

storage-inventory-dm

This is the storage inventory data model and architecture documentation. TODO: Add an FAQ.

baldur

This is an implementation of the permissions service API using configurable rules to grant access based on resource identifiers (Artifact.uri values in the inventory data model).

Official docker image: images.opencadc.org/storage-inventory/baldur:$VER

critwall

This is an implementation of the file-sync process that runs at a storage site and downloads files.

Official docker image: images.opencadc.org/storage-inventory/critwall:$VER

fenwick

This is an implementation of the metadata-sync process that runs at both global inventory and at storage sites.

Official docker image: images.opencadc.org/storage-inventory/fenwick:$VER

luskan

This is an implementation of the metadata service that enables querying the storage inventory at both global inventory and storage sites. It is an IVOA TAP service that supports ad-hoc querying of the inventory data model.

Official docker image: images.opencadc.org/storage-inventory/luskan:$VER

minoc

This is an implementation of the file service that supports HEAD, GET, PUT, POST, DELETE operations and IVOA SODA operations.

Official docker image: images.opencadc.org/storage-inventory/minoc:$VER

raven

This is an implementation of the global locator service that supports transfer negotiation and direct file GET requests.

Official docker image: images.opencadc.org/storage-inventory/raven:$VER

ratik

This is an implementation of the metadata-validate process that runs at both global inventory and at storage sites.

Official docker image: images.opencadc.org/storage-inventory/ratik:$VER

ringhold

This is an implementation of a simplified part of the metadata-validate process that can be used to remove the local copy of artifacts from a site (file cleanup is done by tantar).

Official docker image: images.opencadc.org/storage-inventory/ringhold:$VER

tantar

This is an implementation of the file-validate process that compares the inventory database against the back end storage at a storage site.

Official docker image: images.opencadc.org/storage-inventory/tantar:$VER

vault

This is an implementation of an IVOA VOSpace service that uses storage-inventory as the back end storage mechanism.

Official docker image: images.opencadc.org/storage-inventory/vault:$VER

cadc-*

These are libraries used in multiple services and applications.

cadc-inventory: core data model implementation
cadc-inventory-db: database library
cadc-inventory-util: re-usable code
cadc-inventory-server: re-usable service code
cadc-storage-adapter: defines the interface between inventory and back end storage
cadc-storage-adapter-fs: storage adapter implementation for a POSIX filesystem back end
cadc-storage-adapter-ad: storage adapter for the legacy CADC Archive Directory storage system (temporary)
cadc-storage-adapter-swift: storage adapter implementation for the Swift Object Store API (e.g. CEPH Object Store)
cadc-storage-adapter-test: re-usable test suite for storage adapter implementations

storage-inventory's People

andamian brianmajor at88mph jburke-cadc hjeeves yeunga manuparra tmtsoftware pdowler

storage-inventory's Issues

minoc: support http range requests for resumable downloads

complete RFC7233 support: https://tools.ietf.org/html/rfc7233

limitation: minoc should support single byte range only so it can output binary with content-type: application/octet-stream (or maybe text/plain for part of a text/* document)
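A minimal sketch of the single-range limitation, assuming the service knows the content length before responding. The class and method names are hypothetical and not taken from minoc. A multi-range or malformed header is ignored here (so the full file would be served), which RFC 7233 permits; an unsatisfiable range is signalled with an exception that a caller could map to a 416 response.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative single-range parser; hypothetical, not minoc code. */
final class SingleByteRange {
    private static final Pattern SPEC = Pattern.compile("(\\d*)-(\\d*)");

    final long first; // offset of the first byte to send (inclusive)
    final long last;  // offset of the last byte to send (inclusive)

    private SingleByteRange(long first, long last) {
        this.first = first;
        this.last = last;
    }

    /**
     * Returns the resolved range, or null when the header is absent, malformed,
     * not in bytes, or asks for several ranges (caller serves the whole file).
     * Throws IllegalArgumentException when the range cannot be satisfied.
     */
    static SingleByteRange parse(String header, long contentLength) {
        if (header == null || !header.trim().startsWith("bytes=")) {
            return null;
        }
        String spec = header.trim().substring("bytes=".length()).trim();
        if (spec.contains(",")) {
            return null; // multi-range: ignored in this sketch
        }
        Matcher m = SPEC.matcher(spec);
        if (!m.matches() || (m.group(1).isEmpty() && m.group(2).isEmpty())) {
            return null;
        }
        if (m.group(1).isEmpty()) {
            // suffix form "-N": the final N bytes of the file
            long suffix = Long.parseLong(m.group(2));
            if (suffix == 0 || contentLength == 0) {
                throw new IllegalArgumentException("unsatisfiable range: " + spec);
            }
            return new SingleByteRange(Math.max(0, contentLength - suffix), contentLength - 1);
        }
        long first = Long.parseLong(m.group(1));
        long requestedLast = m.group(2).isEmpty() ? Long.MAX_VALUE : Long.parseLong(m.group(2));
        if (requestedLast < first) {
            return null; // invalid spec such as "5-2"
        }
        if (first >= contentLength) {
            throw new IllegalArgumentException("unsatisfiable range: " + spec);
        }
        return new SingleByteRange(first, Math.min(requestedLast, contentLength - 1));
    }

    /** Value for the Content-Range response header of a 206 response. */
    String contentRange(long contentLength) {
        return "bytes " + first + "-" + last + "/" + contentLength;
    }
}
```

With a single range the response body stays a plain byte stream, so the content-type question above reduces to choosing one type for the 206 response.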

minoc and raven: implement permission checking

minoc:

config is documented and probably implemented
enable remote call (executed in parallel if multiple) to permissions service

raven:

document and implement config
enable remote call (executed in parallel if multiple) to permissions service

Common code can go into cadc-permissions-client as raven and minoc use that lib directly:

extract settings from MultiValuedProperties object
use credentials to call services and respond with first positive result
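The "first positive result" behaviour could be sketched with the standard java.util.concurrent API as follows. This is an assumption about how the shared code might look, not the cadc-permissions-client implementation: each Callable stands in for one remote permissions-service call, and the class name FirstGrant is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: run permission checks in parallel, stop at the first grant. */
final class FirstGrant {
    static boolean anyGrants(List<Callable<Boolean>> checks) throws InterruptedException {
        if (checks.isEmpty()) {
            return false;
        }
        ExecutorService pool = Executors.newFixedThreadPool(checks.size());
        CompletionService<Boolean> completed = new ExecutorCompletionService<>(pool);
        List<Future<Boolean>> pending = new ArrayList<>();
        try {
            for (Callable<Boolean> check : checks) {
                pending.add(completed.submit(check));
            }
            // take() yields results in completion order, so the fastest grant wins
            for (int i = 0; i < checks.size(); i++) {
                try {
                    if (Boolean.TRUE.equals(completed.take().get())) {
                        return true;
                    }
                } catch (ExecutionException ex) {
                    // assumption: a failed call counts as "no grant" from that service
                }
            }
            return false;
        } finally {
            for (Future<Boolean> f : pending) {
                f.cancel(true); // abandon calls that are still running
            }
            pool.shutdownNow();
        }
    }
}
```

Treating a failed call as "no grant" is a design choice in this sketch; a real implementation may want to distinguish "all services denied" from "all services failed".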

luskan: async results are stored and never cleaned up

async query results are stored in a local directory (inside container by default) and never cleaned up.

minoc: content-disposition needs double-quotes around filename value

could be a comma in the filename
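A small sketch of the quoting, assuming ASCII filenames; the helper name is hypothetical. Inside a quoted-string, double-quote and backslash characters need a backslash escape, and control characters are rejected here so a filename cannot inject extra header lines. Non-ASCII names would need the separate filename* form, which this sketch does not cover.

```java
/** Hypothetical helper that builds a Content-Disposition header value. */
final class ContentDisposition {
    static String headerValue(String type, String filename) {
        StringBuilder sb = new StringBuilder(type).append("; filename=\"");
        for (int i = 0; i < filename.length(); i++) {
            char c = filename.charAt(i);
            if (c < 0x20 || c == 0x7f) {
                throw new IllegalArgumentException("control character in filename");
            }
            if (c == '"' || c == '\\') {
                sb.append('\\'); // escape characters that would end or break the quoted-string
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }
}
```

For example, headerValue("attachment", "a,b.fits") yields attachment; filename="a,b.fits", so the comma is no longer read as a separator.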

cadc-storage-adapter-fs: cannot compute MD5 on the fly

Computing the MD5 checksum on the fly while iterating cannot perform adequately.

Instead:

store the checksum using a file system attribute at the end of a write (see cavern)
return the xattr value in the iterator (consistency with inventory)
compute the checksum on the stream during a read and reset the checksum xattr; log this at WARN level at least (detect corruption?)
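The first two steps could look roughly like this with the JDK's DigestOutputStream and UserDefinedFileAttributeView. The class name, the attribute name "contentChecksum", and the "md5:<hex>" value form are assumptions made for illustration; the attribute view is only available on file systems that support user-defined attributes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserDefinedFileAttributeView;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: checksum computed during the write, then kept as an xattr. */
final class ChecksumOnWrite {
    static final String ATTR = "contentChecksum"; // assumed attribute name

    /** Copies src to target and returns "md5:<hex>" of the bytes written. */
    static String writeWithChecksum(InputStream src, Path target) throws IOException {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("MD5 not available", ex);
        }
        try (OutputStream out = new DigestOutputStream(Files.newOutputStream(target), md5)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = src.read(buf)) != -1) {
                out.write(buf, 0, n); // the digest is updated as a side effect of the write
            }
        }
        StringBuilder sb = new StringBuilder("md5:");
        for (byte b : md5.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Stores the checksum on the file at the end of a write. */
    static void storeChecksum(Path file, String checksum) throws IOException {
        view(file).write(ATTR, ByteBuffer.wrap(checksum.getBytes(StandardCharsets.UTF_8)));
    }

    /** Reads the stored checksum back, e.g. from within the iterator. */
    static String readChecksum(Path file) throws IOException {
        UserDefinedFileAttributeView v = view(file);
        ByteBuffer buf = ByteBuffer.allocate(v.size(ATTR));
        v.read(ATTR, buf);
        buf.flip();
        return StandardCharsets.UTF_8.decode(buf).toString();
    }

    private static UserDefinedFileAttributeView view(Path file) throws IOException {
        UserDefinedFileAttributeView v =
            Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
        if (v == null) {
            throw new IOException("user-defined attributes not supported: " + file);
        }
        return v;
    }
}
```

The iterator then only reads a small attribute per file instead of streaming the file contents, which is the point of the proposal.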

Stretch feature:

implement an async process that scans the file system and verifies checksum attributes pre-emptively; once the main feature above is implemented this should be split into its own RFE.

modify the permissions override in minoc and raven to be enabled for dev testing only

luskan: make the database schema customisable

The tap_schema content in luskan currently hardcodes the schema name to "inventory".

Introduce a luskan.properties file with
org.opencadc.inventory.db.schema={schema}

Then add some code to massage the schema in the tap_schema content appropriately. This should be easy enough since the base InitDatabase code allows for setting the schema and replacing it in SQL files dynamically... it just hasn't been set up to replace content, so that mechanism is in use for setting the schema of the tap_schema tables themselves... solvable.

cadc-inventory-db: artifact iterator auto commit hard to grok

The call that sets autocommit to false is very far removed from the call that sets autocommit to true, making it hard to analyse and prove the behaviour is correct. It violates the normal best practice of having the start transaction and either commit or rollback close together in the code.

Probably: refactor to merge the ArtifactIteratorQuery and the ArtifactResultSetExtractor into a single class that does both the query and the iteration. There is no good way to determine that the caller has abandoned an iterator - only code review can help - but that's the same as the best practice txn handling mentioned above.

If done right, this will provide a good example/template for other streaming query result implementations.

cadc-inventory-db: API to manage siteLocations and storageLocation should not expose implementation/optimisations

The ArtifactDAO.put(Artifact) method has an alternate put(Artifact, boolean) to force the update so that transient state (not part of the entity and therefore not part of the metaChecksum) will still be written to the database. This exposes the current implementation details and will make changes and optimisations hard in future.

Methods to manage (add/set and remove) siteLocation and storageLocation values for an artifact should be provided instead.

critwall: not running in correct security context with Subject.doAs

tantar: can't distinguish between 'not authorized' and 'no results found'

When tantar queries a storage site running a TAP service, it cannot tell if the caller is authorized to do the query but no query results were found, or if the caller is not authorized to do the query. Both result in zero rows returned. An incorrect certificate, or an out of date certificate, can result in all data for the archive(s) queried being deleted from the inventory database.

minoc and StorageAdapter API: support very large files

S3 API limitation: maximum 5 GiB upload in a single stream, then must use multi-part upload

multi-part upload: minimum 5 MiB part size (except last part), content-length & content-md5 per part

no content-md5 checksum after re-assembly
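The size limits above translate into simple arithmetic. The sketch below only plans part sizes and makes no S3 calls; the class name and the idea of a caller-chosen part size are assumptions for illustration.

```java
import java.util.Arrays;

/** Hypothetical planner for multi-part uploads using the limits listed above. */
final class MultipartPlan {
    static final long MIB = 1024L * 1024L;
    static final long MIN_PART_SIZE = 5L * MIB;           // every part except the last
    static final long MAX_SINGLE_PUT = 5L * 1024L * MIB;  // 5 GiB in a single stream

    static boolean needsMultipart(long contentLength) {
        return contentLength > MAX_SINGLE_PUT;
    }

    /** Returns the size of each part, in upload order; only the last may be short. */
    static long[] partSizes(long contentLength, long partSize) {
        if (contentLength <= 0) {
            throw new IllegalArgumentException("contentLength must be positive");
        }
        if (partSize < MIN_PART_SIZE) {
            throw new IllegalArgumentException("part size below 5 MiB minimum");
        }
        long remainder = contentLength % partSize;
        int count = Math.toIntExact(contentLength / partSize + (remainder > 0 ? 1 : 0));
        long[] sizes = new long[count];
        Arrays.fill(sizes, partSize);
        if (remainder > 0) {
            sizes[count - 1] = remainder;
        }
        return sizes;
    }
}
```

Since each part carries its own content-length and content-md5, a plan like this has to be known (or the part buffered) before each part is sent, which is where the StorageAdapter API would need to change.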

cadc-storage-adapter and tantar: allow StorageMetadata with invalid flag to support cleanup

if invalid stored objects end up in storage, the StorageAdapter.iterator() has to be able to return a StorageMetadata object so that tantar can perform cleanup. Otherwise, failure modes that leave any garbage behind will block cleanup by tantar

changes:

StorageMetadata allows contentLength == 0
add StorageMetadata.isValid()
tantar can then cleanup (delete stored objects that are invalid) subject to reportOnly mode and policy

minoc: consistency - change some config file items to java system properties

The following should be moved from minoc.properties to catalina.properties for consistency with other tools:

# inventory database settings
org.opencadc.inventory.db.SQLGenerator=org.opencadc.inventory.db.SQLGenerator
org.opencadc.inventory.db.schema={schema}

Artifact Iteration crashes with very large dataset

Asking the ArtifactDAO to iterate over millions of rows results in an OutOfMemoryError while the results are being gathered.

We may need a paginated solution. Here is the output from the Tantar project using the SQLGenerator class:

1456 [main] DEBUG SQLGenerator  - ArtifactGet: SELECT uri,uriBucket,contentChecksum,contentLastModified,contentLength,contentType,contentEncoding,siteLocations,storageLocation_storageID,storageLocation_storageBucket,lastModified,metaChecksum,id FROM inventory.Artifact WHERE storageLocation_storageBucket LIKE ? AND storageLocation_storageID IS NOT NULL ORDER BY storageLocation_storageBucket, storageLocation_storageID
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid1.hprof ...
Heap dump file created [1885647612 bytes in 10.179 secs]
445103 [main] DEBUG ArtifactDAO  - iterator: HST 443647ms
END
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
	at java.lang.String.toCharArray(String.java:2899)
	at java.util.zip.ZipCoder.getBytes(ZipCoder.java:78)
	at java.util.zip.ZipFile.getEntry(ZipFile.java:316)
	at java.util.jar.JarFile.getEntry(JarFile.java:240)
	at java.util.jar.JarFile.getJarEntry(JarFile.java:223)
	at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1054)
	at sun.misc.URLClassPath.getResource(URLClassPath.java:249)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2212)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:311)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:159)
	at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:109)
	at org.opencadc.inventory.db.SQLGenerator$ArtifactIterator.query(SQLGenerator.java:488)
	at org.opencadc.inventory.db.ArtifactDAO.iterator(ArtifactDAO.java:142)
	at org.opencadc.tantar.BucketValidator.iterateInventory(BucketValidator.java:494)
	at org.opencadc.tantar.BucketValidator.validate(BucketValidator.java:187)
	at org.opencadc.tantar.Main.main(Main.java:108)
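One way the paginated idea could be sketched is keyset pagination: fetch a bounded page ordered by a key, remember the last key, and ask for the rows after it. The class below is a generic illustration with hypothetical names; the page-fetching function is an assumption standing in for a query with a "key > ? ORDER BY key LIMIT ?" shape. Another option, per the PostgreSQL JDBC documentation, is cursor-based fetching, which requires autocommit to be off and a non-zero fetch size on the statement.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;
import java.util.function.Function;

/** Hypothetical iterator that holds at most one page of rows in memory. */
final class PagedIterator<T, K> implements Iterator<T> {
    private final BiFunction<K, Integer, List<T>> fetchPage; // (last key or null, limit) -> rows in key order
    private final Function<T, K> keyOf;
    private final int pageSize;
    private final Deque<T> buffer = new ArrayDeque<>();
    private K lastKey;
    private boolean exhausted;

    PagedIterator(BiFunction<K, Integer, List<T>> fetchPage, Function<T, K> keyOf, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.fetchPage = fetchPage;
        this.keyOf = keyOf;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        if (buffer.isEmpty() && !exhausted) {
            List<T> page = fetchPage.apply(lastKey, pageSize);
            buffer.addAll(page);
            if (page.size() < pageSize) {
                exhausted = true; // a short page means there are no more rows
            }
            if (!page.isEmpty()) {
                lastKey = keyOf.apply(page.get(page.size() - 1));
            }
        }
        return !buffer.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.removeFirst();
    }
}
```

For the query in the log above, the key would have to be the full ordering (storageLocation_storageBucket, storageLocation_storageID) so that pages neither overlap nor skip rows.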

luskan: add test queries that hit the inventory tables

src/intTest/resources

at least one test query for each table:

SyncQueryTest-{table-name}.properties

LANG=ADQL
QUERY=SELECT TOP 1 * from t

cadc-storage-adapter-fs: incorrect iterator() and list() output in "comprehensible" StorageAdapter

The StorageLocation.storageBucket value is not assigned, resulting in an incorrect ordering of StorageMetadata objects. This issue was exposed by the now complete and consistent implementation of StorageLocation compareTo() and equals().

FileSystemIterator.next() line ~194
FileSystemStorageAdapterTest -- some tests may be disabled so an unrelated PR can be completed

storage-adapter API: replace unsorted Iterator with Set

An unsorted iterator does not actually enable the caller to perform useful streaming operations and is more complex to implement than filling and returning a Set.

Zero length files are not properly handled

Issuing a PUT for a zero length file causes the request to hang.

$ rm -f /tmp/zerosize.txt
$ touch /tmp/zerosize.txt
$ curl <auth> -T /tmp/zerosize.txt https://myserver.com/minoc/files/zerosize.txt

minoc, critwall, tantar: figure out how to include a single storage-adapter impl library

The current build includes multiple implementations: cadc-storage-adapter-ad (temporary), cadc-storage-adapter-fs, cadc-storage-adapter-swift.

Add plugins dynamically before starting tomcat?

Just live with a fat container? What about other users of minoc with their own private storage-adapter impl? Do they always have to build from source?

minoc: availability does not verify that credentials are valid

README consistency

module README.md should be organised like this:

configuration
building it
checking it
running it

Config templates vs examples should be clearly identified: placeholders in templates should be {surrounded} by curly braces.

To be re-organised: baldur, raven
Config templates/examples needing fixing: baldur, luskan, minoc, raven

rename: make Artifact.uri modifiable rather than immutable

How:

add Artifact.setURI(URI) method to Artifact && have it recompute uriBucket
add StorageAdapter.rename(Artifact orig, Artifact a)

Can the latter be mandatory? It puts another constraint on back end implementation when combined with wanting/encouraging the back end to be able to reconstruct artifact URI in StorageAdapter.iterator() methods (so that file-validate can recover/populate inventory from storage).

Luskan inventory.Artifact.contentChecksum should have xtype of uri.

implement automated database init in relevant components

The only components that run at all locations (storage site and global inventory) are fenwick (metadata-sync) and luskan (TAP service).

minoc is the only component that is usable standalone; it writes to the db; it currently implements automated database init (for storage sites)
not obvious which component should init the global database

cadc-storage-adapter-fs consistency: configure via java system properties

configure via system properties rather than custom file

implement correct row locking in tools that use cadc-inventory-db

minoc already implements locking and transactions, but other tools do not:

tantar
fenwick
critwall

requires: all basic operations working so the set of db operations is known; then consistent sequence to prevent deadlocks can be determined and locking added.

minoc: move generate and validate pre-auth token to a library

tantar: support range of buckets in configuration

re-use the BucketSelector in critwall to support both single and range of buckets

this is useful with the opaque FS adapter and will be useful with any ceph (S3 or swift) adapters as well
proposal: create a new library cadc-inventory-util for common code, extract BucketSelector from critwall to lib
in the lib, BucketSelector will also be potentially useful to do metadata-validate (buckets in parallel)
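To make "single and range of buckets" concrete, here is a sketch that expands a configured value into the list of bucket prefixes. It assumes buckets are lower-case hex prefixes and that a range is written as min-max with equal-length ends; that syntax and the class name are assumptions for illustration, not the actual BucketSelector in critwall.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical expansion of "a" or "0-f" style bucket configuration values. */
final class BucketRange {
    static List<String> expand(String spec) {
        String[] ends = spec.trim().toLowerCase(Locale.ROOT).split("-", -1);
        if (ends.length > 2) {
            throw new IllegalArgumentException("expected {prefix} or {min}-{max}: " + spec);
        }
        String min = ends[0];
        String max = ends[ends.length - 1]; // same as min for a single bucket
        if (!min.matches("[0-9a-f]+") || !max.matches("[0-9a-f]+")
                || min.length() != max.length()) {
            throw new IllegalArgumentException("invalid bucket range: " + spec);
        }
        long lo = Long.parseLong(min, 16);
        long hi = Long.parseLong(max, 16);
        if (lo > hi) {
            throw new IllegalArgumentException("range min is greater than max: " + spec);
        }
        List<String> prefixes = new ArrayList<>();
        for (long v = lo; v <= hi; v++) {
            // zero-pad so every prefix keeps the configured width
            prefixes.add(String.format("%0" + min.length() + "x", v));
        }
        return prefixes;
    }
}
```

Each returned prefix could then be validated independently, which is what makes processing buckets in parallel straightforward.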

baldur: make permissions expiry configurable

fenwick logic: ArtifactSync will run forever and could get ahead of DeletedArtifactEventSync

DeletedArtifactEvent(s) need to be processed before (new) Artifact(s) so that an overwrite (delete+put) at a storage site is processed correctly and does not cause a duplicate key failure (two artifacts with different ID but same URI).

cadc-storage-adapter-ceph: S3 API inadequate

putObject requires the correct content length (long) or the put fails.

tantar: Parallelize iterator queries

There are two potentially long running queries to provide the iterators needed for comparison. They could be done in two threads in parallel while tantar waits, rather than in sequence.
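A rough sketch of running the two queries concurrently with a two-thread pool. The names are hypothetical, and each Callable stands in for one of the iterator-producing queries.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: start both queries, then wait for both results. */
final class ParallelQueries {
    static final class Pair<A, B> {
        final A first;
        final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }
    }

    static <A, B> Pair<A, B> runBoth(Callable<A> inventoryQuery, Callable<B> storageQuery)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // both tasks are submitted before either result is awaited
            Future<A> inventory = pool.submit(inventoryQuery);
            Future<B> storage = pool.submit(storageQuery);
            return new Pair<>(inventory.get(), storage.get());
        } finally {
            pool.shutdownNow(); // also interrupts the other query if one failed
        }
    }
}
```

The total wait becomes roughly the longer of the two queries rather than their sum. If the iterators stream lazily from open connections, each query needs its own connection for this to help.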

luskan and minoc consistency

At a single site, an artifact should only appear in the luskan query response if minoc is able to deliver the file.

Currently:

minoc does behave this way for GET and HEAD, but the body of the error message is different and leaks info (not available vs not found)
luskan returns all rows that satisfy a query independent of the storageLocation; when running as part of a storage site it should only return artifacts with non-null storageLocation (query injection); when running as part of a global inventory it should return all artifacts (there are no storageLocation values in global)

cadc-storage-adapter-fs modes

The URIBUCKET mode is more or less useless as implemented. It helps a little with debugging but doesn't have the right properties to really be usable as a filesystem or robust as an opaque storage backend.

In URI mode, the filesystem could be mounted (read-only) and users could (in principle) open and read files by knowing the Artifact URI (complication: would open by name be costly in a directory with millions of files e.g. in a flat URI structure?)

A new OPAQUE mode could use UUID for file names and place them in a hierarchy of buckets - basically URIBUCKET mode but with a non-reusable filename. The artifact URI could be stored on the file using xattr; xattr support is already needed to store the contentChecksum (see issue #33).
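As a sketch of what the OPAQUE layout might look like, the helper below derives a bucket hierarchy from the leading hex characters of a random UUID, one character per directory level. The depth, the one-character-per-level choice, and the names are all assumptions for illustration.

```java
import java.nio.file.Path;
import java.util.UUID;

/** Hypothetical mapping from a UUID file name to a path inside a bucket hierarchy. */
final class OpaquePath {
    static Path forId(Path root, UUID id, int depth) {
        String hex = id.toString().replace("-", "");
        if (depth < 0 || depth > hex.length()) {
            throw new IllegalArgumentException("depth out of range: " + depth);
        }
        Path dir = root;
        for (int i = 0; i < depth; i++) {
            dir = dir.resolve(hex.substring(i, i + 1)); // one hex character per level
        }
        return dir.resolve(id.toString());
    }
}
```

A new file would get UUID.randomUUID(), so a name is never reused after a delete, and the concatenated directory names form a bucket prefix that an iterator can walk in order.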

storage-adapter-ad: list query must provide unique files only

Currently doesn't limit to only unique files. To maintain current AD functionality, add this.

storage-adapter API: fix all implementations to reject zero length files

StorageMetadata constructor now checks that contentLength > 0.

StorageAdapter.put() specifies that zero-length files are not allowed (IllegalArgumentException). Implementations must throw and should auto-cleanup if possible.
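For a file system back end, "throw and auto-cleanup" could be as small as the sketch below. The class is hypothetical and is not a StorageAdapter implementation; it only shows the ordering: write, check the byte count, then delete before throwing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical put that refuses zero-length content and leaves nothing behind. */
final class NonEmptyPut {
    static long put(InputStream src, Path target) throws IOException {
        long bytes = Files.copy(src, target); // fails if target already exists
        if (bytes == 0) {
            Files.deleteIfExists(target); // cleanup before reporting the error
            throw new IllegalArgumentException("zero-length file not allowed: " + target.getFileName());
        }
        return bytes;
    }
}
```

The length is only known once the stream ends, so the check has to follow the write unless the caller supplies a trusted content length up front.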

luskan: AvailabilityPlugin does not check that TempStorageManager works

should check for config and check that the location is writable

Minoc relies on GMSClient.getGroups() which is not implemented

Issuing a HEAD request to a Minoc resource causes a call to getUsersGroups() in checkReadPermission(), which relies on the GMSClient.getGroups() function. This method is not implemented.

Consequently, this also prevents JDBC connections from being returned to the pool. The default pool has two connections, so two HEAD requests will cause Minoc to hang.

fenwick: multiple included condition files was a bad idea

Each include file defines a separate stream of new|modified Artifact events; that is hard to manage and won't play well with also processing DeletedArtifactEvent(s) in a timely fashion.

Should just be one optional /config/artifact-filter.sql (ish) file

Fenwick: Support multiple instances on site

Due to the nature of the HarvestState, only one Fenwick can be run on a single site. Currently the HarvestState is preserved per Item (i.e. Artifact.class, DeletedEvent.class, etc), but perhaps should be harvested per process instead to allow multiple instances to be run.

There are no workarounds for this currently.

cadc-storage-adapter-fs: must hide temp files until complete

The filesystem adapter writes to a temp file and performs an atomic move to final destination.

However, the temp file is visible within the normal tree during the write so will be picked up by the iterator method (e.g. file-validate process).

Recommend: divide the configured root directory into two separate sub dirs, e.g. "complete" and "transactions", create temp files in "transactions", and move them to "complete".
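The recommendation could be sketched like this with java.nio.file. The class name is hypothetical, and the sketch assumes both sub-directories live on the same file system, which an atomic move generally requires.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical write path: temp file under "transactions", atomic move into "complete". */
final class AtomicWrite {
    static Path write(Path root, String name, InputStream src) throws IOException {
        Path txnDir = Files.createDirectories(root.resolve("transactions"));
        Path doneDir = Files.createDirectories(root.resolve("complete"));
        Path tmp = Files.createTempFile(txnDir, "put-", ".tmp");
        try {
            Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
            // the file only becomes visible under "complete" once fully written
            return Files.move(tmp, doneDir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }
}
```

An iterator that walks only "complete" then never sees partial writes, and anything left in "transactions" after a crash can be treated as garbage.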

baldur: support group access to permissions

The baldur.properties file only supports specifying users to give access to permissions. Add a groups property to baldur.properties and give group members access to permissions.

luskan: make IdentityManager configurable

baldur: refactor authorization code

The code to check that a caller is authorized via config is very similar to LogControlServlet and AvailabilityServlet (and generally: admin interfaces have this kind of A&A)... It is time to refactor into a common utility that can be re-used.

URI columns are being returned as Strings instead of URIs from the Luskan Schema

Querying for a URI column (e.g. inventory.StorageSite.resourceID) returns a String rather than a URI object in the TapClient. Can TAP identify URI columns?

minoc and StorageAdapter API: cutouts

StorageAdapter API needs to support optional data operations:

content-range requests
FITS metadata extraction
FITS pixel cutouts
HDF5 metadata extraction?
HDF5 pixel cutouts?
sky coordinate cutouts?

cadc-storage-adapter-fs: FileSystemIterator Files.getAttributeView does not work on macOS

The cadc-storage-adapter-fs module relies on a Files API from the installed JDK. The macOS file system does not expose extended attributes that way, but rather relies on the xattr API. To work properly, we would need a JNI solution such as:
https://github.com/leiless/xattr4j

Understandably this will not be run on macOS, but it makes development a bit of a pain.

raven: implement negotiation of write to storage

Some minoc instances will accept writes and they should advertise this via their StorageSite record. The raven service at a global site can determine which sites are writable from the inventory database content.

raven: consistency - get rid of config file and use java system properties

The following should be moved from raven.properties to catalina.properties for consistency:

#inventory database settings
org.opencadc.inventory.db.SQLGenerator=org.opencadc.inventory.db.SQLGenerator
org.opencadc.inventory.db.schema=inventory