Exported annotations on cropped PDFs don't align (paper2remarkable, open issue)

gjjvdburg commented on June 16, 2024

Exported annotations on cropped PDFs don't align

from paper2remarkable.

Comments (19)

GjjvdBurg commented on June 16, 2024 1

Done!

GjjvdBurg commented on June 16, 2024 1

Thanks for sharing your approach @gwtaylor! This seems like a good intermediate solution to have around while #95 is unfinished, and I'm sure it'll help people who are dealing with the same issue.

FWIW, I did indeed mean a soft-crop + export, so good to know that that does work. Thanks!

sternj commented on June 16, 2024 1

For reference, this is how the PDF library thinks about annotations. My largely uninformed theory is that unipdf is deciding the page height based on some metadata field that isn't changed in here. I'm going to do some more hunting; it seems like the dimensions of each PDF page are decided individually, specifically (it seems) by the MediaBox. I think it's done here, since rmapi appears to put in the original PDF pages as "background" pages. I'm not entirely sure how this all connects yet, but my suspicion is that the recomputation being done might be causing the issues.

sternj commented on June 16, 2024 1

Revisiting my previous assessment, the salient code in rmapi is the end of this function. I'm concerned by the reference to the transformation matrix, since it means that the math here might be more complex than I was thinking. However, it seems like the annotations are being translated down and to the right, rather than being uneven. I'm trying to figure out how annotations are drawn on the PDF, because it seems like they're being translated opposite the cropping.

sternj commented on June 16, 2024 1

Confirming that the displacement of the annotations is directly related to how the box was created, I removed the upper and lower cropping and the displacement was only horizontal. What on earth are things drawn on top of?

sternj commented on June 16, 2024 1

Just like a normal person, I have found and started reading the PDF standard. What rmapi does is use a path, which is defined in section 8.5.2. The path is created relative to the coordinates in "user space". What worries me is that the objects in the PDF itself might be defined in terms of user space, which isn't something p2r manipulates from what I can tell. That said, if that's the case, I can't tell for the life of me how any of this works-- if coordinates are absolute or quasi-absolute, why does cropping work at all?

sternj commented on June 16, 2024 1

So, to summarize:

paths are defined by their coordinates in "User Space"
User Space is defined as... something. It seems to be an invariant, but is mapped onto the device by an object called a transformation matrix
User space doesn't seem to be affected by changes to things like the CropBox or MediaBox
The visible page's origin is not (0,0) in user space

Idea: perhaps the reMarkable itself is defining coordinates based on its own screen, in which case rmapi would just need a minor edit to make the MediaBox (or whatever box) start at its (0,0).
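
To make the idea above concrete, here is a minimal sketch of the translation it implies: shifting a point given in user space so that the lower-left corner of the visible box becomes the origin. The function name `to_box_origin` and the sample numbers are hypothetical, and this is not what rmapi or the reMarkable actually does; it only illustrates the arithmetic under the assumption that the mismatch is a pure translation.

```shell
#!/bin/bash
# Hypothetical helper (not part of rmapi or p2r): shift a user-space point
# so that the lower-left corner of the visible box becomes (0,0).
# ARGUMENTS: x y box_x0 box_y0 (all in PDF points)
to_box_origin () {
  awk -v x="$1" -v y="$2" -v bx="$3" -v by="$4" \
    'BEGIN { printf "%g %g\n", x - bx, y - by }'
}

# A stroke point at (120, 700) on a page whose visible box starts at (50, 60)
to_box_origin 120 700 50 60   # prints "70 640"
```

If the displacement really is a plain translation, it should equal the box origin on every page; an uneven displacement would point to scaling as well.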

Thank you for letting your email inboxes be filled by my comments! I think I know where to look now. More info to come.

sternj commented on June 16, 2024 1

Well, I tried messing around with this in a few different ways only to conclude that there's something else going on other than just mismatches in the coordinate system. I reached out to Remarkable Support who informed me that the tablet expects something that's shaped like A4, which means there's some additional scaling happening (which is also reflected in the rmapi code). They suggested using "print to pdf" from Chrome, which doesn't solve the issue at hand, but the equivalent from Firefox does work, so... progress I guess? I spent a few hours trying to figure out what those web browsers are actually doing when they "print to PDF" but so far what I've found is a rather inscrutable event loop.

What's weird to me (and sufficiently weird to remarkable support that they've escalated it to the devs) is

Annotations show up properly on the desktop app
Annotations are irregular for irregularly positioned PDFs exported from Chrome, but not Firefox.

I've made some good progress reverse engineering the PDF renderer from Xochitl itself, so I'm also digging into that and seeing what transformations it does to actually display the PDF.

I also reached out to the person who originally wrote the annotation code in rmapi, but he seems to be quite busy in his real life and his understanding largely relies on the tacit assumption that Xochitl makes that a PDF is shaped like an A4.

Once I'm out of this current push at work I'm going to try to figure out in a more systematic way what makes Firefox-rendered PDFs different from Chrome-rendered PDFs or p2r-rendered PDFs.

Long story short, this... is really weird, even by PDF standards. More to come as I read more /MediaBox entries and specify the proper arity of various Qt functions.

GjjvdBurg commented on June 16, 2024

What are you using for exporting annotations? If it's rMapi the bug might be in their code. When p2r crops the pdf the dimension of the pages will be changed, so the calculation of the annotation position must take that into account. This isn't something that can be fixed in p2r unfortunately.

reini1305 commented on June 16, 2024

I'm using the remarkable app itself on Mac to do the export.
I'm afraid that the bug is in their export code...

GjjvdBurg commented on June 16, 2024

Yeah, that's quite likely. What p2r does is essentially a "hard crop" since the page dimensions are changed. The reMarkable itself I think only adjusts the view when you use the crop function, leaving the page size unchanged.

There are a number of tools to export annotations, including rMapi, remarks, and rM2svg (see this page for an overview). I don't know if any support PDFs with different page sizes, but it might be worth a look.

I'll close this for now since the issue isn't in p2r.

reini1305 commented on June 16, 2024

Thanks. I think it would be good to note this somewhere or possibly change the default option to avoid user frustration :)

gwtaylor commented on June 16, 2024

I'm wondering if anyone has found a workflow to deal with this issue yet, or found success with any of the tools that @GjjvdBurg suggested?

I am using the (convenient) reMarkable app functionality to export to PDF. I'm experiencing the same behaviour as described: annotations don't align to the cropped PDF.

Yes, I could disable cropping, but it is a nice-to-have! So if anyone discovered a workflow that supports both cropping and PDF export, that would be great to know.

I'm an rMapi user and tried its geta command, but that has the same behaviour as the rM app on the cropped PDFs.

GjjvdBurg commented on June 16, 2024

I'm not aware of any work-arounds, but I do have an idea of how we can fix this. As far as I recall the issue is that we set the bounding box of the PDF, which is sort of a "hard" crop, and that results in the annotations not being aligned. However, the reMarkable supports adjusting the view as well (a "soft" crop if you will).

So, the idea is to add an option to p2r to use a soft crop if the user prefers. I just added my work-in-progress code to the repo in #95. If you're feeling up for it, you could consider contributing to the project by working on that PR (I'm happy to advise, but haven't found the time to dig into this myself lately).

I'll reopen this issue as I think it can be solved using the soft crop. As a sanity check: have you tried using the reMarkable's crop functionality and seeing how that affects the positions of the annotations? If that doesn't actually work then my PR won't help either of course.

gwtaylor commented on June 16, 2024

Hi @GjjvdBurg, thanks for your reply and for drafting #95. I found a workaround, though I think your proposed solution is more elegant than mine.

To answer your question about the reMarkable's crop functionality, I did attempt to use "Reset View" in the UI, after the file had been hard cropped and annotated, but that didn't make a difference. I think that was because after the hard crop, the view was by default "zoomed out" as far as it would go. But perhaps you are asking whether I took a file that was not hard cropped, cropped it using the reMarkable "Adjust View" tool (i.e. soft crop), annotated it and exported it as PDF? Yes, that certainly works.

Here is my workaround:

1. Using the reMarkable software, "Export to SVG". This will create a folder of SVG files, one per page. Interestingly, these don't suffer from the misalignment issue.
2. Convert each SVG file to a PDF using rsvg-convert.
3. Assemble the PDFs into a single PDF using pdftk.

Both of these command-line tools are available through Homebrew via the packages librsvg and pdftk-java, respectively.

Here is a bash script that automates steps 2 and 3, after you have exported the SVGs using the reMarkable software:

#!/bin/bash

# Convert a folder of SVG files exported by reMarkable software to a single PDF
# ARGUMENTS:
#   A directory containing SVGs, 1 file per page
# It is expected that this function be run from the parent of the
# directory of SVGs
# This function will create a PDF with the same name as the directory and a
# .pdf extension
# EXAMPLE
#   p2r2pdf Liao_et_al_-_Efficient_Graph_Generation_With_Graph_Recurrent_Attention_Networks_2019
#
#   This results in Liao_et_al_-_Efficient_Graph_Generation_With_Graph_Recurrent_Attention_Networks_2019.pdf
p2r2pdf () {
  echo "Processing $1"
  local folder=$1
  # read number of svgs into variable
  # xargs trims whitespace
  local nsvg
  nsvg=$(ls "$folder" | wc -l | xargs)
  echo "$nsvg files found"
  local prefix="$folder - page "
  local filepath="$folder/$prefix"
  local pdfs=()
  # bash does not expand variables inside {1..N}, so use an arithmetic loop
  for ((i = 1; i <= nsvg; i++))
  do
      local filename="$filepath$i.svg"
      echo "Converting $filename"
      rsvg-convert -f pdf -o "${filename%svg}pdf" "$filename"
      pdfs+=("${filename%svg}pdf")
  done
  local finalpdf="${folder}.pdf"
  echo "Building final PDF: $finalpdf"
  # the quoted array keeps the spaces in "<folder> - page N.pdf" intact
  pdftk "${pdfs[@]}" cat output "$finalpdf"
  echo "Removing intermediate PDFs"
  rm "${pdfs[@]}"
}

So far it looks good, except for one tiny issue. When it exports SVGs, the reMarkable software embeds the original PDF as an image and the annotations as strokes. Therefore the resulting PDF will have crisp (vectorized) annotations but the background is blurry (rasterized).

sternj commented on June 16, 2024

I'm interested in what's happening here-- How does the remarkable calculate where annotations should be when exported compared to where they should be when being drawn on the page?

sternj commented on June 16, 2024

Worth noting that this issue also occurs with the geta subcommand of rmapi, whose source can be inspected.

sternj commented on June 16, 2024

So I've established that the MediaBox attribute does have an effect on alignment, but it doesn't fix the misalignment. I have also confirmed that it is the CropBox that causes the misalignment. I think that it might be a translation error between ghostscript and PIL format, but I'm not entirely sure yet.
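
For anyone wanting to check the boxes on their own files, here is a rough sketch of one way to compare them. It assumes the `pdfinfo` tool from poppler-utils, which was not used in this thread; its `-box` option prints the page boxes of the first page. The helper name `crop_offset` is hypothetical, and it only reports how far the CropBox origin sits from the MediaBox origin.

```shell
#!/bin/bash
# Hypothetical helper: read `pdfinfo -box` style output on stdin and print
# the offset of the CropBox origin relative to the MediaBox origin.
crop_offset () {
  awk '
    $1 == "MediaBox:" { mx = $2; my = $3 }
    $1 == "CropBox:"  { cx = $2; cy = $3 }
    END { printf "%g %g\n", cx - mx, cy - my }
  '
}

# Usage (assumes poppler-utils is installed; reports the first page only):
#   pdfinfo -box cropped.pdf | crop_offset
```

Comparing this offset with the observed displacement of the annotations would be one way to test whether the CropBox origin alone explains the shift.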

sternj commented on June 16, 2024

The issue isn't in the cropping itself but how the bounding box is being computed-- removing all reference to the margins in get_bbox surfaces this issue. I'm also seeing that the translations are uneven-- boxes towards the top of the page are translated upwards more than boxes on the bottom of the page. I think it's something in get_bbox-- @GjjvdBurg do you remember what the intuition behind that was?
