Giter Club home page Giter Club logo

Comments (6)

jsvd avatar jsvd commented on August 16, 2024 1

In the meantime the fix proposed for cleaning up metrics resources is still valid and can be done regardless of the fix for the input register leaks

from logstash.

yaauie avatar yaauie commented on August 16, 2024

Fly-by thoughts:

JavaPipeline#clear_pipeline_metrics is only called for JavaPipeline#shutdown, and does not occur when a pipeline fails to start. IIRC we want for a pipeline to leave its metrics in-place when it stops "normally" or exceptionally, but we want for them to be cleared when the pipeline has been shut down.

I think that the "easiest" way to handle this would be to also clear them as a part of starting up, but we would need to be careful to ensure that if a pipeline has overlap during a restart that we keep ordering correct.

Alternatively, we could change the default id generation algorithm to somehow fingerprint the source of the plugin.

from logstash.

edmocosta avatar edmocosta commented on August 16, 2024

Hey @yaauie, thank you for the input! I'm wondering if changing the JavaPipeline#shutdown method, adding a reloading parameter to avoid the shutdown early return would solve this issue?, E.g:

def shutdown(reloading: false)
 return if finished_execution? && !reloading
end

Then on the PipelineAction::Reload#execute:

old_pipeline.shutdown(reloading: true)

My initial tests seems to be working with this code change, but I'm still unsure if it can have any other impact on the reloading process. WDYT?

from logstash.

yaauie avatar yaauie commented on August 16, 2024

🤔 The comments lead me to believe that we guard on JavaPipeline#finished_execution? to ensure we don't have a race with two threads starting and stopping the pipeline simultaneously, and that we have a chance to block until a running pipeline has fully shut down. But because it bails from the shutdown method entirely it fails to do the cleanup that needs to happen whether or not the to-be-shutdown pipeline has already completed execution.

I think that if we change the scope of the guard a little, we can get the behaviour we want in both cases without needing to add the reloading control variable:

diff --git a/logstash-core/lib/logstash/java_pipeline.rb b/logstash-core/lib/logstash/java_pipeline.rb
index 85f182fdb..0e4bd5c44 100644
--- a/logstash-core/lib/logstash/java_pipeline.rb
+++ b/logstash-core/lib/logstash/java_pipeline.rb
@@ -446,15 +446,16 @@ module LogStash; class JavaPipeline < AbstractPipeline
   # this method is intended to be called from outside the pipeline thread
   # and will block until the pipeline has successfully shut down.
   def shutdown
-    return if finished_execution?
-    # shutdown can only start once the pipeline has completed its startup.
-    # avoid potential race condition between the startup sequence and this
-    # shutdown method which can be called from another thread at any time
-    sleep(0.1) while !ready?
-
-    # TODO: should we also check against calling shutdown multiple times concurrently?
-    stop_inputs
-    wait_for_shutdown
+    unless finished_execution?
+      # shutdown can only start once the pipeline has completed its startup.
+      # avoid potential race condition between the startup sequence and this
+      # shutdown method which can be called from another thread at any time
+      sleep(0.1) while !ready?
+
+      # TODO: should we also check against calling shutdown multiple times concurrently?
+      stop_inputs
+      wait_for_shutdown
+    end
     clear_pipeline_metrics
   end # def shutdown
 

from logstash.

edmocosta avatar edmocosta commented on August 16, 2024

Hi @yaauie, If I'm not mistaken, It would only solve the pipeline metrics issue, given the clear_pipeline_metrics was moved out of the guard, but the resources leak problem would still happen, as successfully started input plugins wouldn't be closed on subsequent failed reloads.

For example:

Let's assume the following pipeline started successfully and both plugins are running:

input {
  jdbc {
    id => "db_1"
    type => "jdbc"
     ...
   }

   jdbc {
     id => "db_2"
     type => "jdbc"
      ...
    }
}

Then, the user changes the jdbc { id => "db_2" } settings, making it invalid (wrong password, for example).
On the first reload attempt, the shutdown method gets finished_execution? => false, cleaning the resources and metrics properly.

But then, the user changes it again, but the configuration is still invalid. This time, when Logstash tries to reload the pipeline configuration, the shutdown method will get finished_execution? => true, as the pipeline being reloaded/shut down is the same that previously failed to start, skipping the resources cleaning up, and letting the jdbc { id => "db_1" } connections open (as no stop_inputs are invoked).

from logstash.

jsvd avatar jsvd commented on August 16, 2024

The scenario I was able to reproduce consistently has to do with resources created during the register method of input plugins.
This didn't reproduce when resources were created by filters, or if they were created after register during start.

To reproduce I took simple inputs like heartbeat and generator and added a resource creation (connection) during register, and added an conditional exception to one of them. With this I could trigger the resource pile up.

The proposed solution fixes this scenario, and I don't see any problems it could create. In this case we're sort of abusing stop_inputs to perform the cleanup, even though inputs never ended up starting in the first place, but it works. A more structured fix would be to differentiate this situation and call close on each input (as opposed to stop).

from logstash.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.