
Comments (4)

jaydenseric commented on May 18, 2024

I tried out about 4 options in the early days of this project. It's been a while since I made the choice, but there are a couple of reasons. Firstly, memory storage is not really desirable because you can easily use up all the available memory on the server. Then it really comes down to a choice of quality, ease-of-use and popularity/support:

  • For quality, I found comparative benchmarks (can't find them again now) that suggested formidable had great performance.
  • All the major options are suffering from a maintenance backlog, but formidable has a much more elegant codebase (zero dependencies) than multer (8 dependencies, including busboy). I tried using busboy on its own, but that got overly complicated.
  • For ease of use, formidable was by far the simplest and cleanest to use for our purposes.
  • Regarding popularity, formidable is dominant with 243k daily installs vs 75k for multer.

I have been pretty happy with the choice so far.

I am working with the Apollo team to look at bringing upload functionality into the official GraphQL servers, so there is an opportunity to improve a few things. For a while I have wanted to pass file streams into resolvers, but in experimentation it was challenging to engineer. If it turns out to be possible, developers could do whatever they want with uploads in the resolver and it could be an amazing performance boon for passing files through to cloud storage.

Keep in mind that uploads need to be inspectable for validation purposes in the resolver before permanent cloud storage happens, ruling out a configurable cloud upload process in apollo-upload-server. My vision is for there to be no Apollo client and GraphQL server uploads config, it just works; developers can decide what to do with uploads in the resolver.

Feel free to discuss further, we can reopen if something actionable comes up.

from graphql-upload.

du5rte commented on May 18, 2024

Thanks for getting back to me @jaydenseric. Passing streams to resolvers sounds promising and solves both issues. What challenges did you have trying to pass streams to resolvers?

It's gonna be awesome once this library becomes part of Apollo 🤙


jaydenseric commented on May 18, 2024

A multipart form gets parsed a field at a time as the request streams in. Streams for each file in the resolvers are only beneficial if the parsing of the multipart form is not blocking to GraphQL server and the resolvers. In a middleware setup, GraphQL server has to come after the multipart form is done with. By the time you have parsed the full list of files to pass into the resolvers, the upload has virtually finished anyway.
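The ordering constraint can be sketched with a minimal middleware simulation (the names here are illustrative, not any real framework API): the multipart parser has to consume the entire request before the GraphQL middleware, and therefore the resolvers, ever run.

```javascript
// Minimal simulation of the middleware-ordering problem: middleware runs
// in sequence, so the GraphQL "middleware" only executes after the
// multipart parse has fully completed.
async function runMiddleware(middlewares, ctx) {
  for (const mw of middlewares) await mw(ctx)
}

const events = []

const parseMultipart = async (ctx) => {
  // stands in for streaming and parsing the whole request body
  events.push('parse start')
  ctx.files = ['avatar.png']
  events.push('parse finish')
}

const graphql = async (ctx) => {
  // resolvers only run here, after every file has already arrived
  events.push(`graphql sees ${ctx.files.length} file(s)`)
}
```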

We might have to introduce a new files multipart form field, with a list of object paths to where all the files live in the operation. Then operations and files need to be the first two fields in the multipart form, so they can be parsed quickly and as soon as they are done, an operations object can be reconstructed with placeholder buffers or streams, each waiting for their file to be encountered in the remaining form parse.
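The placeholder idea could be sketched roughly like this (every name here is hypothetical, not an actual API): once the early `files` field announces the expected object paths, each path gets a promise that resolves when the parser actually encounters that file, so the reconstructed operations object can be handed to resolvers before the upload finishes streaming in.

```javascript
// Hypothetical sketch of the placeholder mechanism described above.
const pendingFiles = new Map()

// Called after the early `files` field is parsed, once per expected path.
// The returned promise is placed into the operations object as a
// placeholder for the file.
function expectFile(objectPath) {
  let resolveFile
  const promise = new Promise((resolve) => {
    resolveFile = resolve
  })
  pendingFiles.set(objectPath, resolveFile)
  return promise
}

// Called by the form parser when the file field with this path arrives.
function fileArrived(objectPath, file) {
  const resolveFile = pendingFiles.get(objectPath)
  if (resolveFile) resolveFile(file)
}
```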


du5rte commented on May 18, 2024

By the time you have parsed the full list of files to pass into the resolvers, the upload has virtually finished anyway

I've tried a couple of approaches, first with koa-multer using memoryStorage. I'm only dealing with avatar pictures, but you're probably right that bigger files can start to take up a lot of memory.

import multer from "koa-multer";

app.use(multer({
    storage: multer.memoryStorage(),
}).any())

I guess this is what you mean by blocking the GraphQL server and resolvers: while the middleware is parsing and reading the file stream, by the time the request reaches the resolver it has already handled the upload.

import objectPath from "object-path"

export function processRequest(request, options) {
  // Parse the multipart form request
  const { body, files } = request

  // Decode the GraphQL operation(s). This is an array if batching is
  // enabled.
  const operations = JSON.parse(body.operations)

  // Check if files were uploaded
  if (files.length) {
    // File field names contain the original path to the File object in the
    // GraphQL operation input variables. Relevant data for each uploaded
    // file now gets placed back in the variables.
    const operationsPath = objectPath(operations)

    files.forEach((file) => {
      const {
        originalname: name,
        mimetype: type,
        size,
        fieldname,
        buffer
      } = file

      operationsPath.set(fieldname, { name, type, size, buffer })
    })
  }

  return operations
}


My second approach was with busboy. I haven't used streams much, so I'm not sure how performant this is, but it works; it's also not the prettiest code.

It does the minimal work to parse the form and skips reading the file stream, leaving that for the resolver to handle. A downside is that you don't know how big the file is, but maybe that could be sent from the front end?
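A rough sketch of sending the size from the front end (the shape of `variables.avatar` here is purely an assumption, not an existing API): since the server can't know a file's size without consuming the stream, the client could report it in the mutation variables alongside the upload.

```javascript
// Hypothetical client-side sketch: report the file's size (known to the
// browser's File object) in the GraphQL variables, so the server has it
// without reading the stream.
function buildOperations(query, file) {
  return {
    query,
    variables: {
      avatar: {
        name: file.name,
        type: file.type,
        size: file.size // reported by the client, not measured server-side
      }
    }
  }
}
```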

import Busboy from "busboy"

export async function busboyPromise(ctx) {
  return new Promise(function(resolve, reject) {
    const results = {
      fields: {},
      files: []
    }

    const busboy = new Busboy({ headers: ctx.req.headers });

    busboy.on('file', function(fieldname, stream, filename, encoding, mimetype) {
      // Store the ReadStream without consuming it
      results.files.push({
        filename,
        mimetype,
        encoding,
        fieldname,
        stream
      })

      stream.resume();
    });
    busboy.on('field', function(fieldname, val, fieldnameTruncated, valTruncated, encoding, mimetype) {
      results.fields[fieldname] = val
    });
    busboy.on('error', reject);
    busboy.on('finish', function() {
      resolve(results)
    });

    ctx.req.pipe(busboy);
  });
}


import objectPath from "object-path"

export async function processRequest(ctx, options) {
  // Parse the multipart form request
  const { fields, files } = await busboyPromise(ctx)

  // Place on request for further info
  ctx.req.files = files

  // Decode the GraphQL operation(s). This is an array if batching is
  // enabled.
  const operations = JSON.parse(fields.operations)

  // Check if files were uploaded
  if (files.length) {
    // File field names contain the original path to the File object in the
    // GraphQL operation input variables. Relevant data for each uploaded
    // file now gets placed back in the variables.
    const operationsPath = objectPath(operations)

    files.forEach((file) => {
      const {
        filename: name,
        mimetype: type,
        // size isn't known until the stream is read,
        // but it could be transferred from the client mutation
        fieldname,
        stream
      } = file

      operationsPath.set(fieldname, { name, type, stream })
    })
  }

  return operations
}

I feel there's probably a cleaner way than using busboy, but the end result would be something like this: don't save the file to disk or memory storage, and let the resolver handle it.

import fs from "fs"
import path from "path"
import { GraphQLNonNull, GraphQLString } from "graphql"

export default {
  type: new GraphQLNonNull(GraphQLString),
  args: {
    avatar: { type: UploadInput }
  },
  resolve(root, { avatar }, ctx, info) {
    console.log(avatar)

    avatar.stream.pipe(fs.createWriteStream(path.resolve(__dirname, avatar.name)))

    return `New avatar saved`;
  }
};


