
Comments (9)

aj-bayanat commented on May 18, 2024

Yep, you nailed it! I mounted only the big eng model file (not the directory) and it works!

from stirling-pdf.

aj-bayanat commented on May 18, 2024

Sounds like a good fix. By the way, when you convert a PDF that was first run through OCR to Word, the OCR layer disappears and the Word document contains only an image. Is this intended?
Also, amazing app! You should post it on r/selfhosted. A PDF manipulator is one of the most requested apps I've seen over the years. The closest is hrconvert, but that doesn't work very well and its UI is dated.


Frooodle commented on May 18, 2024

Weird. Are you able to give me the PDF to test against, or is it private? I can't reproduce this on my side.

Maybe it's how the bigger eng data works. I will test with that as well tonight or tomorrow.


aj-bayanat commented on May 18, 2024

Since I am using this at work, that PDF is private; however, it happens with all the PDFs that I have tried. I will try the fast model and let you know.

Edit: Same issue with the fast model. Is there any way I can send you debug logs or similar? Just point me in the right direction.


Frooodle commented on May 18, 2024

I think I found the issue.
Mounting that directory removes some needed files that are already there, so you end up missing them.
I will change the Dockerfile this weekend to ensure those files are kept on mount.


Frooodle commented on May 18, 2024

I have a fix: create a temp folder during the build and copy everything from it to the final folder on container startup.
It does mean you won't be able to delete any of the old files in that folder, but you can add new ones fine now.
I am also renaming eng.traineddata to English-Lite.traineddata.
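
For anyone curious how a copy-on-startup step like this can work, here is a minimal sketch in Python. It is only an illustration, not the actual change (which lives in the Docker build and startup logic). The function name `restore_bundled_files` and the directory layout are hypothetical, and skipping files that already exist in the target is an assumption made here so that user-mounted files are never overwritten.

```python
import shutil
import tempfile
from pathlib import Path


def restore_bundled_files(staging_dir: Path, target_dir: Path) -> list[str]:
    """Copy bundled files from staging_dir into target_dir.

    Files whose names already exist in target_dir are left untouched
    (assumption: user-provided files take precedence over bundled ones).
    Returns the names of the files that were copied, in sorted order.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(staging_dir.iterdir()):
        dest = target_dir / src.name
        if src.is_file() and not dest.exists():
            shutil.copy2(src, dest)  # copy2 also preserves timestamps
            copied.append(src.name)
    return copied


if __name__ == "__main__":
    # Simulate the two folders with throwaway directories.
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp) / "staging"  # hypothetical: filled at build time
        target = Path(tmp) / "target"    # hypothetical: the mounted folder
        staging.mkdir()
        target.mkdir()
        (staging / "English-Lite.traineddata").write_text("bundled")
        (target / "deu.traineddata").write_text("user supplied")  # made-up user file

        print(restore_bundled_files(staging, target))
        # prints ['English-Lite.traineddata']
```

Because the bundled files are put back on every startup, deleting one of them from the mounted folder does not stick, which matches the limitation mentioned above.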


Frooodle commented on May 18, 2024

It's not ideal, but the backend for PDF to Word conversion is LibreOffice, so I can't change how it handles the OCR layer, sadly.
You would have to raise an issue with LibreOffice directly to get that fixed.
The OCR tech I'm using in the backend has different ways of rendering the text
here
So I will try this use case and see if I can get it working. I will track the issue here: https://github.com/Frooodle/Stirling-PDF/issues/118


Frooodle commented on May 18, 2024

Also, thanks for the comments! I did post on Reddit when I first started this app:
https://www.reddit.com/r/selfhosted/comments/10pexhn/new_browserbased_pdf_editor_github_link/
I plan to post again when I release V1.0.0 (once I finally add PDF cropping, PDF signing, and improved PDF image importing).

Feel free to make a post for me though :') haha


Frooodle commented on May 18, 2024

Fixed with extra language support in the latest patch.

