Giter Club home page Giter Club logo

Comments (16)

pofider avatar pofider commented on August 21, 2024 1

Thank you for your time on this PR. I reviewed your code and done some more thinking regarding this topic now. It looks to me that this type of solution is not safe in the edge cases. It can still be polluting the system with temporary files If the process fails in the middle of the conversion or if the temp file is for a moment blocked with something else.

I'm sorry I gave wrong recommendation in my previous comment. At this moment I think it is better to create a temp directory inside the tmpDir at the startup and have global option cleanTmpFiles enabling the reaper on this.

What you think?

from phantom-html-to-pdf.

pofider avatar pofider commented on August 21, 2024

Yes, we should probably provide it in this package. The html can be deleted right after and the pdf after the stream is read.

Anybody is welcome to PR this.

Right now you can use reaper package to remove the old files like we do here
https://github.com/jsreport/jsreport-core/blob/master/lib/reporter.js#L317

from phantom-html-to-pdf.

bjrmatos avatar bjrmatos commented on August 21, 2024

I'm sorry I gave wrong recommendation in my previous comment. At this moment I think it is better to create a temp directory inside the tmpDir at the startup and have global option cleanTmpFiles enabling the reaper on this.

👍 on this, also i think that the default for the cleanTmpFiles option should be false to preserve original behaviour.

from phantom-html-to-pdf.

 avatar commented on August 21, 2024

Sure. What if the entire process dies before the reaper has time to remove the files? Would it be possible to remove the dependency on tmp files entirely?

from phantom-html-to-pdf.

pofider avatar pofider commented on August 21, 2024

What if the entire process dies before the reaper has time to remove the files?

In this case reaper cleans up the tmp files after it starts again.

Would it be possible to remove the dependency on tmp files entirely?

No. I believe phantomjs needs to store the pdf into file so we need tmp files.

from phantom-html-to-pdf.

bjrmatos avatar bjrmatos commented on August 21, 2024

Would it be possible to remove the dependency on tmp files entirely?

i have an idea to remove dependency of temporary files but when using the electron-pdf recipe 😆, i'm afraid that because the way of phantomjs works, this will not possible at least for now.

from phantom-html-to-pdf.

 avatar commented on August 21, 2024

It looks like phantomjs can read from stdin and write to stdout. That may be a way to remove the dependency on temp files. Could just pipe the html to the child process and capture the output in a stream. Thoughts?

from phantom-html-to-pdf.

pofider avatar pofider commented on August 21, 2024

See the docs, you can only render to file
http://phantomjs.org/api/webpage/method/render.html

We could possibly remove the temp files on html. However I was using stdio to pass html to phantomjs originally and it was indeterministicly cutting it. Environment variables could work here, but we need the temp files because of the pdf anyway.

from phantom-html-to-pdf.

pofider avatar pofider commented on August 21, 2024

I forgot to note there are methods to render into memory
http://phantomjs.org/api/webpage/method/render-base64.html
http://phantomjs.org/api/webpage/method/render-buffer.html

However searching the phantomjs issues shows it is not in the 1.9 and probably not even in the latest release
https://github.com/ariya/phantomjs/search?q=renderBuffer&type=Issues&utf8=%E2%9C%93

from phantom-html-to-pdf.

 avatar commented on August 21, 2024

You can use render to output to stdout page.render('/dev/stdout', { format: 'pdf' });

Then capture that in a stream for the pdf.

from phantom-html-to-pdf.

bjrmatos avatar bjrmatos commented on August 21, 2024

@ianloverink yes, but it is not cross-platform, and windows support is a big concern for this package and other products built with it.

there is other way to make it possible, using named pipes (windows has support for named pipes and seems like phantomjs supports writing the pdf to a named pipe), but the implementation will be more complex and likely would require to use some c/c++ dependency (like https://github.com/avz/node-mkfifo, which unfortunately doesn't support creating named pipes on windows)

from phantom-html-to-pdf.

 avatar commented on August 21, 2024

Seems to have support for windows here:
https://github.com/ariya/phantomjs/blob/1.9/src/webpage.cpp#L867

from phantom-html-to-pdf.

bjrmatos avatar bjrmatos commented on August 21, 2024

seems like it could work (on windows, they remove the temp file and send it to stdout/stderr), but i'm not sure to use stdout/stderr as the default mechanism to transfer the pdf, i think that transfering big chunks of data (large pdf files) via stdout/stderr will not scale (but i could be wrong), using the disk is the most performant way IMO (I think this package scales very well, since it uses the disk to transfer the pdf between the node.js and phantomjs processes), also it will make the phanthom-server strategy more complex that it needs to be.

i'm 👎 on this, maybe it could be added as an option for the dedicated-process strategy but i don't think that a new option that its going to work for only one strategy will worth the effort.

let's wait for @pofider 's comments.

from phantom-html-to-pdf.

bjrmatos avatar bjrmatos commented on August 21, 2024

@pofider have you tried before to remove the temp file when the stream is closed or ended? (close, end events of stream) that way we clean the temp file after the user has been consumed the stream. i think it could work, have you found any downside to not do that?

that way we clean up temp files.

i still think that using the reaper will be necessary (and the recomended way) to be really sure that all temp files are removed in case of something unexpected. but with this solution at least the most common case will be covered.

from phantom-html-to-pdf.

bjrmatos avatar bjrmatos commented on August 21, 2024

initial step to get rid of temp files ->#49

from phantom-html-to-pdf.

pofider avatar pofider commented on August 21, 2024

To summary this..

1 (stdout). Using stdout would need a performance comparison on linux. Windows benefit is that we don't have to deal with temp cleanup. The stdout approach would take time to implement. It gives more security for linux - the pdf file doesn't leak somewhere where it could be potentially seen by other process.

2 (stream events). Deleting temp files in stream events is implemented in both
#48 and #49. The downside is that it doesn't handle process crashes or other edge cases and it can be potentially buggy because nodejs stream events handling can be complex (for me). On the other hand in nicely handles the main case.

3 (reap). Third option is to use reaper. This is very simple and straight forward like shown here. It works without any issues for years in production for me.

I don't think we need to have a combination of 2 and 3 at once. I like 3 more since it is complete and proven solution.

The option 1 sounds very promising. However it is not sure it will work as expected and it will take time to implement.

As for me, I can quickly add option 3 with reaper. I don't have currently (next weeks) time to investigate the stdout option, but I can give hints as I've already tried it this morning.

I'll wait for your opinions and then we can decide based on it.

from phantom-html-to-pdf.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.