
Laravel Filer

This project was started to scratch my itch on our growing Laravel site:

  • Metadata for all files is stored in a local repository - Supported backing systems are json, database, and memory (for testing). This is designed to speed up certain operations which normally call out to the remote filesystems.
  • Handles fallback to the original disk - If you provide an "original disk" list, this adapter will attempt to import data from those original disks into the metadata repository.
  • Pluggable Strategies - The current version ships with a single strategy, but you can replace the Basic implementation to allow for behaviors such as 1 + async, mirroring, or other interactions with the backing storage adapters.
  • Manage data + files - Coming soon: query and manage the metadata to do things like:
    • Find files stored on a single service and mirror them
    • Migrate files between stores while still maintaining continuity
  • Abstract data from metadata - The plan is to eventually allow things like deduplication and copy-on-write so that copies, renames, and deletions work better.

Getting Started

To get started, require the project.

Laravel 7, 8 (Flysystem V1):

composer require nvahalik/laravel-filer@^1

Laravel 9 (Flysystem V3):

composer require nvahalik/laravel-filer@dev-laravel-9

Once that's done, you'll need to edit the filer config file and then update your filesystem configuration.

Config File

By default, the metadata is stored in a JSON file. You can edit config/filer.php to change the default storage mechanism from json to database or memory. Note that memory is really a null adapter. The JSON adapter wraps memory and just serializes and saves it after each operation.
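
As a rough illustration, the relevant choice lives in config/filer.php. The sketch below is an assumption about the shape of that file, not the package's published schema, so check the shipped config for the real key names.

<?php
// config/filer.php -- illustrative sketch only; the key names are assumptions.
return [
    // Metadata repository backend: 'json', 'database', or 'memory'.
    // 'memory' is a null adapter; 'json' wraps it and persists after each operation.
    'metadata' => env('FILER_METADATA_DRIVER', 'json'),
];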

Filesystem Configuration

The configuration is very similar to other disks:

'example' => [
    'driver' => 'filer',
    'original_disks' => [
        'test:s3-original',
    ],
    'id' => 'test',
    'disk_strategy' => 'basic',
    'backing_disks' => [
        'test:s3-new',
        'test:s3-original',
    ],
    'visibility' => 'private',
],

original_disks is an option to use if you are migrating from an existing disk or disks to the filer system. Effectively, this is a fallback: files which are not found in the local metadata store will be searched for in original_disks. If they are found, their metadata will be imported; if not, the file will be treated as missing. We'll cover mass importing of metadata later on.

Note: this will slow the filesystem down until the cache is filled. Once the cache is loaded, you can remove these original_disks and those extra operations looking for files will be eliminated.

Note 2: files which are truly missing do not get cached. Therefore, if a file is missing and you repeatedly check for its existence, the original disks will be searched over and over again. This could be improved by caching the results or by keeping some sort of missing-files cache.
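
One hedged illustration of what such a missing-files cache could look like, using Laravel's standard Cache facade. This is not something the package does today; it is just the suggested improvement sketched out.

use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\Storage;

$path = 'does/not/exist.txt';

// Remember negative lookups for five minutes so the original disks are not
// re-searched on every call for the same missing file.
$existsOnOriginal = Cache::remember("filer:exists:{$path}", 300, function () use ($path) {
    return Storage::disk('test:s3-original')->exists($path);
});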

id is just an internal ID for the metadata store. Duplicate file entries are not allowed within the metadata of a single id, for example, but are allowed across different ids.

disk_strategy has only a single option currently, 'basic', but it will be made pluggable so that different strategies can be added and used. The basic strategy simply writes to the first available disk from the list of provided backing_disks.

backing_disks allows you to define multiple Flysystem disks to use. Want to use multiple S3-compatible adapters? You can. Note that for the basic strategy, the order of the disks determines the order in which they are tried.
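
Conceptually, the basic strategy behaves like the sketch below: try each backing disk in the configured order and stop at the first successful write. This is an illustration, not the package's actual strategy class, and the function and disk names are assumptions.

use Illuminate\Support\Facades\Storage;

// Illustrative "first available disk wins" write, mirroring the basic strategy.
function writeToFirstAvailableDisk(array $backingDisks, string $path, string $contents)
{
    foreach ($backingDisks as $diskName) {
        try {
            if (Storage::disk($diskName)->put($path, $contents)) {
                return $diskName; // remember which disk now backs this file
            }
        } catch (\Throwable $e) {
            // Fall through and try the next disk in the configured order.
        }
    }

    return false; // every backing disk failed
}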

A couple of examples

Given the configuration above, if the following code is run:

Storage::disk('example')->has('does/not/exist.txt');
  1. The existing metadata repo will be searched.
  2. Then, the single 'original_disks' will be searched.
  3. Finally, the operation will fail.
Storage::disk('example')->put('does/not/exist.txt', 'content');
  1. The existing metadata repo will be searched.
  2. Then, the single 'original_disks' will be searched.
  3. Then, a write will be attempted on test:s3-new.
  4. If that fails, then a write will be attempted on test:s3-original.
  5. If any of the writes succeeds, that adapter's backing information will be returned and the entry's metadata updated.
  6. If all of them fail, then false will be returned and the operation will have failed.
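
From application code, none of this is visible: you interact with the disk through the standard Storage facade. The path below is just a made-up example.

use Illuminate\Support\Facades\Storage;

$disk = Storage::disk('example');

// Existence checks hit the metadata repository first, then original_disks.
if (! $disk->exists('reports/2021.pdf')) {
    // Writes follow the basic strategy across the configured backing_disks.
    $disk->put('reports/2021.pdf', 'report contents');
}

$contents = $disk->get('reports/2021.pdf');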

Importing Metadata

If you already have a ton of files on S3, you can use the filer:import-s3-metadata command to import that data into your metadata repository:

# Grab the existing contents.
s3cmd ls s3://bucket-name -rl > s3output.txt

# Import that data into the "example" storageId.
php artisan filer:import-s3-metadata example s3output.txt

The importer uses File::lines() to load its data, and therefore should not consume a lot of memory. Additionally, it will look at the bucket name in the URLs present in the output and attempt to find that bucket within your existing filesystems config.
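
File::lines() returns a lazy collection, so the manifest is streamed rather than loaded whole. Below is a simplified sketch of that pattern; it is not the command's actual parsing code, and the handling of s3cmd's ls -rl columns is an assumption.

use Illuminate\Support\Facades\File;

// Stream the s3cmd manifest one line at a time.
File::lines(storage_path('s3output.txt'))
    ->filter(function ($line) {
        return strpos($line, 's3://') !== false;
    })
    ->each(function ($line) {
        // Lines look roughly like: "2021-01-01 00:00 12345 ... s3://bucket-name/path/to/file"
        $url = trim(substr($line, strpos($line, 's3://')));
        // ...create or update the metadata entry for $url here.
    });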

Visibility

By default, the importer grabs the visibility from the filesystem configuration. If none is found there or provided with --visibility, it will default to private.

Filename stripping

You can strip a string from the filenames by specifying the --strip option.

Disk

If you need to specify the disk directly or want to otherwise override it, just pass it in with --disk. This is not checked, so don't mess it up.

Example

php artisan filer:import-s3-metadata example s3output.txt --disk=some-disk --visibility=public --strip=prefix-dir/ 

The above command would strip prefix-dir/ from the imported URLs, set their visibility to public, and mark their default backing-disk to some-disk.
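
The stripping itself is plain prefix removal on each imported key. Conceptually (illustrative only, not the command's internals):

use Illuminate\Support\Str;

$importedKey = 'prefix-dir/images/photo.jpg';

// Remove the --strip prefix so the stored path matches what the app uses.
$storedPath = Str::after($importedKey, 'prefix-dir/'); // "images/photo.jpg"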

laravel-filer's Issues

Support copying files

While copying files isn't part of my initial use case, it would be good to support copying files. The biggest question is how?

  1. Support something like copy-on-write by just creating a new metadata entry, setting some metadata indicating that this is a shared copy, and then updating the adapter manager accordingly to write to a new file or copy it before any write takes place (see the sketch below).
  2. Just copy the file on the adapters, honoring the rules and supporting the ability to copy to the same or to specify a specific backing adapter.

Currently, the rename functionality doesn't actually touch the backing adapters. It just changes the path.
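
For option 1, a copy would only touch metadata until someone writes to it. A hypothetical shape for such an entry (the field names are made up for illustration; this is not an existing schema):

// Hypothetical metadata entries illustrating the copy-on-write idea.
$original = [
    'path'         => 'images/photo.jpg',
    'backing_disk' => 'test:s3-new',
    'backing_path' => 'images/photo.jpg',
];

$copy = [
    'path'         => 'images/photo-copy.jpg',
    'backing_disk' => 'test:s3-new',
    'backing_path' => 'images/photo.jpg', // still points at the original object
    'shared_copy'  => true,               // a real copy is deferred until a write happens
];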

Enable file migration

Here's the use-case:

I've been using a particular S3 provider for some time, but I'm worried about their stability. It would be great if the data could be migrated from one provider to another over time.

Here are 2 possible ways to handle this:

  1. Batch system -> migration jobs
  2. Manual migration

Batch System -> migration jobs

  1. Create a command or some ability to provide a list of metadata records (e.g. using the existing Metadata repository as a source).
  2. Create a job which loads the data and writes or streams that data from one location to another.
  3. Create a batch filled with jobs which handles the migration of the data and the updating of the backing store information.
  4. Implement some means to see/view the migration process.

Slower, but could be integrated into the tooling of the module directly.
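
A rough sketch of the batch approach using Laravel's standard Bus::batch() API; MigrateFileJob and the metadata query below are hypothetical and not part of the package.

use Illuminate\Support\Facades\Bus;

// Hypothetical: one job per metadata record still living on the old disk.
$jobs = $metadataRepository
    ->recordsOnDisk('test:s3-original') // hypothetical query method
    ->map(function ($record) {
        return new MigrateFileJob($record, 'test:s3-new'); // hypothetical job class
    });

Bus::batch($jobs->all())
    ->name('migrate-s3-original-to-s3-new')
    ->then(function ($batch) {
        logger()->info("Migrated {$batch->totalJobs} files.");
    })
    ->dispatch();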

Manual migration

  1. Using a tool such as s3cmd/s5cmd/s3p, copy the files to/from the source/destination buckets.
  2. Create a new manifest using s3cmd/s5cmd (see existing metadata import).
  3. Using CLI tools, update the files within the manifests to point to the new backing store.
  4. Then, once tested, remove the backing store information from the metadata repository.

Faster, but requires careful management of the data.
