Comments (17)

stephenplusplus commented on August 15, 2024

forever: false, which prevents sending a Keep-Alive header, is only applied to file downloads. We added that recently (PR) as a fix for this issue. The problem we were having was that, on slower or throttled connections, we weren't receiving the proper end event from the HTTP request stream.
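
For context, forever maps to socket reuse in the underlying HTTP agent. Here's a minimal sketch of the mechanism, assuming the standard Node.js http(s) behavior the request library builds on (the host below is only illustrative):

// Minimal sketch, not library source: forever: true amounts to sending
// requests through an agent with keepAlive enabled, so the Keep-Alive
// header goes out and the TCP socket is reused for later requests.
const https = require('https')

const keepAliveAgent = new https.Agent({ keepAlive: true })

https.get({ host: 'www.googleapis.com', path: '/', agent: keepAliveAgent }, res => {
  res.resume() // drain the response so the socket returns to the agent's pool
})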

All other API requests made through the Storage API should be using the Keep-Alive header. The default configuration for all requests is here: https://github.com/googleapis/nodejs-common/blob/22383ec89b2866b52850594efab60bd5484a81cc/src/util.js#L30-L37
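
As a rough sketch of that defaults pattern (illustrative only, not the contents of the linked file), options like forever are set once and inherited by every request the client makes:

// Sketch of the defaults pattern, assuming the request library's API;
// the option values here are illustrative, not the library's actual settings.
const request = require('request')

const baseRequest = request.defaults({
  forever: true, // send Keep-Alive and reuse sockets across calls
  pool: { maxSockets: Infinity } // don't cap concurrent sockets per host
})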

It's possible for users to change all of our settings with request interceptors. That would look like:

const Storage = require('@google-cloud/storage')
const gcs = new Storage({ ... })

gcs.interceptors.push({
  request: reqOpts => {
    reqOpts.forever = true // re-enable the Keep-Alive header for every request
    return reqOpts
  }
})

What do you think: should we revert the forever: false fix and, if users have problems, recommend they set the request interceptor like the one above?

danrasmuson commented on August 15, 2024

I can confirm that

gcs.interceptors.push({
  request: reqOpts => {
    reqOpts.forever = true
    return reqOpts
  }
})

increased the speed of my uploads from ~20-30 seconds to 2-3 seconds.

dmho418 commented on August 15, 2024

In my case the GCP Console was very helpful for debugging this issue.
If you want to know whether it has anything to do with reusing connections, go to
GCP Console > APIs & Services > Google Cloud Functions API > Quotas.
Check the plots and compare the number of socket connections and DNS resolutions to the number of function invocations.

dmho418 commented on August 15, 2024

Thanks for the tip!
Adding the interceptor solved the problem. Now each connection is being reused for 1000+ invocations.
I'm not an expert in Node.js, so I'm not going to suggest any code changes.
Still, we could mention the interceptors somewhere in the docs or samples.

joemanfoo commented on August 15, 2024

My suggestion is that, if this is a requirement, the API should manage it itself based on the action being performed. For example, if I have a download operation to fetch data, do stuff with it, and then upload it, I have to keep straight in my code what the library requires for downloading (turn off keep-alive) and then turn it back on for uploading.

I currently have Cloud Functions that are timing out due to this issue; uploading to Storage now requires updating the interceptor. With a previous version (1.1.1, for example) it just worked, but with 1.5.x the reqOpts.forever change is causing uploads to time out on about 30% of function invocations.

stephenplusplus commented on August 15, 2024

Thanks for the input.

and then turn it back on for uploading.

The only time we disable keepAlive is for file downloads. If you want it to remain off for downloads, you don't have to manage anything. If you want it on for downloads as well, you can assign the override like I wrote in my last post.

If we did have some methods using keepAlive and others not, you could get specific with the interceptor:

gcs.interceptors.push({
  request: reqOpts => {
    if (reqOpts.method === 'POST' && reqOpts.uri.includes('{somethingFromTheApiUrl}')) {
      reqOpts.forever = true
    }
    return reqOpts
  }
})

And to go even further, you can assign interceptors on all levels of the hierarchy:

gcs.interceptors.push({ request: reqOpts => { /* ... */ return reqOpts } })
bucket.interceptors.push({ request: reqOpts => { /* ... */ return reqOpts } })
file.interceptors.push({ request: reqOpts => { /* ... */ return reqOpts } })
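
For example, here's a sketch of scoping the keep-alive override to a single file's requests instead of the whole client (the bucket and file names are hypothetical):

const bucket = gcs.bucket('my-bucket') // hypothetical bucket name
const file = bucket.file('large-upload.bin') // hypothetical file name

file.interceptors.push({
  request: reqOpts => {
    reqOpts.forever = true // reuse sockets for this file's requests only
    return reqOpts
  }
})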

joemanfoo commented on August 15, 2024

Thanks @stephenplusplus. I'm having a terrible issue where uploads from my Cloud Functions to Cloud Storage are timing out nearly 90% of the time now. This started around December 6th.
[screenshot: collagebuildertimeouts]

I thought perhaps it was my setting keepAlive to false to fix the download issue, but even with updated code that creates a new storage object, I'm still getting more timeouts than successes.

And I haven't a clue who, what, or where to turn to for help, other than shelling out $300 for support with Google.

joemanfoo commented on August 15, 2024

Thanks @dmho418 for the hint. However, my issue is that the Cloud Function I'm using to upload content to GCS just times out.

I built a very simple test function:

const functions = require('firebase-functions')
const admin = require('firebase-admin')
admin.initializeApp(functions.config().firebase)

exports.uploadTest = functions.database.ref('/upload-test').onWrite(event => {
  const fbBucket = admin.storage().bucket()
  // download a file
  let file = 'inventory/uploads/127470647675333/1507506068973-leg-tc-1.JPG'
  console.log('downloading test file...')
  return fbBucket.file(file).download({ destination: '/tmp/1507506068973-leg-tc-1.JPG' })
    // upload the same file back to a new destination
    .then(() => {
      console.log('uploading test file...')
      return fbBucket.upload('/tmp/1507506068973-leg-tc-1.JPG', { destination: 'inventory/uploads/127470647675333/test-1507506068973-leg-tc-1.JPG' })
    }).catch(error => {
      console.log(error)
      return error
    })
})

If I point this at a file smaller than 5MB, the function executes and finishes successfully. If I point it at an image larger than 5MB, the function fails. This behavior started manifesting on December 7th; however, the change to the Google environment likely took place earlier.
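
One hypothesis worth testing, as a guess rather than anything confirmed in this thread: bucket.upload switches to a resumable upload for larger files, which is roughly where a 5MB boundary could come from. A minimal sketch of forcing a simple upload to check that:

// Hedged sketch: resumable: false forces a single-request upload instead
// of the resumable session, to test whether the resumable flow is what times out.
return fbBucket.upload('/tmp/1507506068973-leg-tc-1.JPG', {
  destination: 'inventory/uploads/127470647675333/test-1507506068973-leg-tc-1.JPG',
  resumable: false
})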

There's an open ticket on the issue tracker, https://issuetracker.google.com/issues/70555688, opened by someone else with the same timeout issue; however, they say they have been able to download and upload payloads larger than 10MB.

The Firebase bucket() code uses the Storage API under the hood, and replacing the bucket code with direct use of the API doesn't change the outcome.
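
In other words, both handles below end up in the same client code, so interceptors and transport behavior apply equally to each (a sketch; the direct bucket name is hypothetical):

// Via the Firebase Admin SDK, which wraps @google-cloud/storage
const adminBucket = admin.storage().bucket()

// Via the Storage API directly
const Storage = require('@google-cloud/storage')
const gcs = new Storage()
const directBucket = gcs.bucket('my-project.appspot.com') // hypothetical bucket name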

stephenplusplus commented on August 15, 2024

I'm sorry, I was mistaken. I forgot we had a request in April 2017 to not send the Keep-Alive header from Cloud Functions in all of our APIs (googleapis/google-cloud-node#2254).

This seems to be a unique issue for GCF, so we'll have to see where we get in the Google issue tracker: https://issuetracker.google.com/issues/70555688

For now, I'll call this blocked on our part (the client library). Feel free to provide any extra information, especially if anyone else is having the same problem, or has an idea where we might look for an answer.

shaibt commented on August 15, 2024

Having the same issue with Firebase Functions:
Downloading a <1MB file to a function and then uploading it again after simple processing (same size): about ~10% of my functions time out on an upload that usually takes ~300ms.
Intuitively, this seems to happen more often when a function "resource" is recycled, i.e. it's already "warm" (I can deduce this from the function's start-up time).
Problems started around early December.
I never attempted to override the default keep-alive config.

shaibt commented on August 15, 2024

Hey @stephenplusplus,
Do you know if anything has moved on this, or on googleapis/nodejs-bigquery#41?

fhinkel commented on August 15, 2024

@stephenplusplus Would it make sense to make the options @danielrasmuson suggests the default?

stephenplusplus commented on August 15, 2024

@fhinkel I'm not sure. The suggested fix for this issue would reverse the fix for googleapis/google-cloud-node#2254.

victor5114 commented on August 15, 2024

Thanks for your help on this issue, @stephenplusplus.

I have the following use case: I read a stream from a remote file stored on Cloud Storage, process it through multiple transforms, and finally write the processed files into two different buckets.

const fieldStream = remoteFileSrc
    .createReadStream()
    .on('error', logErr)
    .pipe(parserStream)
    .pipe(dropStream);

fieldStream
    .on('error', logErr)
    .pipe(csvStream1)
    .pipe(writeStream1);

fieldStream
    .on('error', logErr)
    .pipe(pseudoStream) // This transform slows down the process
    .pipe(csvStream2)
    .pipe(writeStream2);

Locally, I get the two files correctly written to both destination buckets, but I'm getting timeouts when I run this code in my Cloud Functions environment, no matter which option I choose for the reqOpts.forever = true/false parameter.

{ Error: ESOCKETTIMEDOUT
    at ClientRequest.<anonymous> (/user_code/node_modules/@google-cloud/storage/node_modules/request/request.js:816:19)
    at ClientRequest.g (events.js:292:16)
    [...]
    at Timer.listOnTimeout (timers.js:214:5) code: 'ESOCKETTIMEDOUT', connect: false }

It looks like the pseudoStream transform slows down the whole process, as it needs to access other remote resources for each chunk processed through the stream, thus causing the timeout. The only workaround I've come up with so far is to process smaller files (~5MB instead of ~10MB) to avoid the timeout.
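
If the slow transform is simply starving the upload past its socket timeout, one option to try is raising the per-request timeout with the same interceptor mechanism discussed earlier in this thread. This is a hedged sketch; the 10-minute value is arbitrary:

gcs.interceptors.push({
  request: reqOpts => {
    reqOpts.timeout = 600000 // ms; raises the socket timeout for every request
    return reqOpts
  }
})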

stephenplusplus commented on August 15, 2024

Hey @victor5114, would you mind opening a new issue for that? I don't believe this is related (or at least, there could be an unrelated way to solve it).

stephenplusplus commented on August 15, 2024

Our library is now built on a lot of new internals, especially in terms of the transport libraries. Is anyone still seeing a performance drop in GCF using a newer version of Storage?

jkwlui commented on August 15, 2024

Closing this stale issue, with the recommendation to try newer versions of the library and their new transport implementations. @dmho418 Please feel free to reopen if you still experience the problem.
