Converting markdown like [Header](#Header) to LaTeX

Anchor links and conversion to LaTeX about nbconvert HOT 14 CLOSED

jupyter commented on August 21, 2024

Anchor links and conversion to LaTeX

from nbconvert.

Comments (14)

mpacer commented on August 21, 2024

> My recommendation for the nbconvert docs would be the following:
>
> • if the underlying nbconvert code is changed then we should be more detailed in the nbconvert docs
> • if there is no new nbconvert code to accomplish this, I would recommend linking to the pandoc resource for explanation with perhaps a couple of sentences in the nbconvert docs.

So right now, this works in a minimal case using standard pandoc code, but only within a single cell. That's such limited functionality, I'm not sure if it merits inclusion as of yet.

That said, the auto-identifier formatting might be weird for people… but that leads to a slightly different issue…

@Carreau Do we currently support anything like tab-completions for these kinds of selectors? I don't think we do. Or should that be a new issue? Should that be a second aim after getting cross references to work across the entire document? I feel like for autoidentifiers, if we are going to mention it in the documentation, given how tricky they can be to figure out how to write correctly, we should have some way to make it easier to automatically complete them (at least at the cell level).

OK, all that said, it looks like in the long run the syntax can follow pandoc's, but support for the feature is going to vary dramatically between now and then (if I can figure out how to make it work). So the way forward with documentation might be a partial note on this now, with a more elaborate exploration later, once it actually works across cells?

takluyver commented on August 21, 2024

Discussed this with Michael today. We can relatively easily preserve these links by using a pandoc filter to convert. We have such a filter in bookbook already (this is actually a bit more complex than we need for nbconvert, since it also deals with references between notebooks).

There will be some performance penalty in using a filter, whether we use pandoc twice to turn markdown -> JSON and JSON -> LaTeX, or run pandoc once and let it invoke a Python subprocess for the filter. @michaelpacer is investigating what this is like.
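
For illustration, the "pandoc twice" route could look roughly like the sketch below. It is not the bookbook filter and not what nbconvert ships. The names `normalise_fragment`, `rewrite_anchor_links`, `run_pandoc` and `markdown_to_latex` are hypothetical, the normalisation is a crude stand-in for pandoc's auto-identifier rules, and the AST walk is hand-rolled so that the snippet needs only the standard library plus a `pandoc` executable on the PATH.

```python
import json
import subprocess


def normalise_fragment(fragment):
    """Crude stand-in for pandoc's auto-identifier rules (an assumption):
    lowercase the fragment and turn whitespace into hyphens."""
    return "-".join(fragment.lower().split())


def rewrite_anchor_links(node):
    """Return a copy of a pandoc JSON AST in which every internal link target
    (one starting with '#') is normalised; everything else is left untouched."""
    if isinstance(node, list):
        return [rewrite_anchor_links(item) for item in node]
    if isinstance(node, dict):
        copy = {key: rewrite_anchor_links(value) for key, value in node.items()}
        if copy.get("t") == "Link":
            # The last entry of a Link's contents is the pair [url, title].
            url, title = copy["c"][-1]
            if url.startswith("#"):
                copy["c"][-1] = ["#" + normalise_fragment(url[1:]), title]
        return copy
    return node


def run_pandoc(text, source_format, target_format):
    """Run the pandoc executable once (assumes pandoc is on the PATH)."""
    result = subprocess.run(
        ["pandoc", "--from", source_format, "--to", target_format],
        input=text, capture_output=True, text=True, check=True,
    )
    return result.stdout


def markdown_to_latex(markdown):
    """Two pandoc calls with a Python rewrite of the JSON AST in between."""
    ast = json.loads(run_pandoc(markdown, "markdown", "json"))
    return run_pandoc(json.dumps(rewrite_anchor_links(ast)), "json", "latex")
```

The rewrite step is a pure function over the JSON structure, so it can be exercised without pandoc installed; only `markdown_to_latex` shells out, and that is where the extra cost of the second pandoc call would come from.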

mpacer commented on August 21, 2024

Did a quick and dirty approach to this, and on a benchmark set of documents we're seeing an increase to 1.5× for calling pandoc twice from within Python and 2.5× for invoking Python from within pandoc. Either way, we're using the pandocfilters library.

Now, to make the solutions not quick and dirty.

minrk commented on August 21, 2024

I suspect this is a pandoc issue, rather than an IPython one. What version of pandoc do you have?

flying-sheep commented on August 21, 2024

pandoc 1.13.1.

jankatins commented on August 21, 2024

I have the same issue when I want to convert a notebook to PDF via the notebook UI. The notebook contains a TOC (which I manually inserted via a markdown cell and [headline](#headlineLink) links).

Could that be a problem due to the way the cells are processed: one by one instead of a complete document? E.g. pandoc sees only the link, but can't find the anchor for that link because it is in another cell?
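
One way to check this hypothesis on a given notebook is to list the links whose target header lives in a different markdown cell. The sketch below is only a rough diagnostic under stated assumptions: `cross_cell_links` and `simple_identifier` are hypothetical helpers, the regular expressions only recognise ATX-style `#` headers and inline `[text](#anchor)` links (a `#` comment inside a code sample in a markdown cell would be mistaken for a header), and the identifier matching is a simplification of what pandoc does.

```python
import re

# ATX-style headers ("# Title", "## Title ##") and inline links to "#anchor".
HEADER_RE = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*#*[ \t]*$", re.MULTILINE)
LINK_RE = re.compile(r"\[[^\]]*\]\(#([^)\s]+)\)")


def simple_identifier(text):
    """Rough stand-in for identifier matching: lowercase, whitespace to hyphens."""
    return "-".join(text.lower().split())


def cross_cell_links(notebook):
    """Return (cell_index, anchor) pairs for links whose header is in another cell.

    `notebook` is a dict in nbformat's layout: a "cells" list whose entries
    carry "cell_type" and "source" (a string or a list of strings).
    """
    markdown_cells = [
        (index, "".join(cell["source"]))
        for index, cell in enumerate(notebook["cells"])
        if cell["cell_type"] == "markdown"
    ]
    defined_in = {}  # identifier -> index of the first cell defining it
    for index, source in markdown_cells:
        for match in HEADER_RE.finditer(source):
            defined_in.setdefault(simple_identifier(match.group(1)), index)
    problems = []
    for index, source in markdown_cells:
        for match in LINK_RE.finditer(source):
            target_cell = defined_in.get(simple_identifier(match.group(1)))
            if target_cell is not None and target_cell != index:
                problems.append((index, match.group(1)))
    return problems
```

If every broken link in the PDF shows up in this list and no same-cell link does, that would support the idea that per-cell processing is the cause.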

takluyver commented on August 21, 2024

That could well be the issue.

stsievert commented on August 21, 2024

I am having this same issue. I can replicate this when I include the section I'm linking to in a different cell. When I'm linking to a section in the same cell, the issue goes away.

I have also found that it doesn't work when I include math in the titles. When I convert to LaTeX with nbconvert, I see these lines in the output:

\protect\hyperlink{proof-for-ux24Vux5fux5cux257Bpux5fnux5cux257Dux24}{the
appendix}, by induction we can show that
% ...
\subsection{\texorpdfstring{Proof for
\(V_{p_n}\)}{Proof for V\_\{p\_n\}}}\label{proof-for-vux5fpux5fn}

mpacer commented on August 21, 2024

Is the consensus that this is a pandoc issue and not an nbconvert issue? Based on @stsievert's success within a cell and @JanSchulz's hypothesis about the parsing granularity, it sounds like it might be an interaction, but one that could be somewhat alleviated if we were to parse the entire notebook at once instead of at the cell level.

I'm assuming there's a reason for not doing that (possibly having to do with passing the contents of individual cells to pandoc vs. concatenating entire notebooks before passing data to pandoc), but what is it? At least in the context of passing something to pandoc, couldn't we make this an option?

takluyver commented on August 21, 2024

I think various things could be improved if we were to pass entire notebooks to pandoc, but we also lose quite a bit of the customisation we want, at least if we do it the simple way (convert entire notebook to markdown and then pass into latex). I think it may make sense to experiment with some alternative pathways from notebook to PDF, one of which would rely more heavily on pandoc, but at least for the time being, I don't think we should try to replace the workings of the existing LatexExporter.

mpacer commented on August 21, 2024

@takluyver We talked about this a bit today and @minrk suggested instead looking into pandoc's intermediate representation format, which can still mark up everything as separate pieces (and apparently is quite like nbformat) but will still be able to resolve the cross references globally.
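
A minimal sketch of what that could mean in practice: keep one pandoc JSON AST per cell, concatenate their blocks into a single document, and check links against headers across the merged result. It assumes the JSON layout used by pandoc 1.18 and later (a top-level object with a "blocks" list), and `merge_cell_asts`, `iter_nodes` and `unresolved_links` are hypothetical names. Whether pandoc's LaTeX writer then produces working cross references is the thing that would still need testing.

```python
def merge_cell_asts(cell_asts):
    """Concatenate per-cell pandoc ASTs, keeping the first cell's other keys."""
    if not cell_asts:
        raise ValueError("expected at least one per-cell AST")
    merged = {key: value for key, value in cell_asts[0].items() if key != "blocks"}
    merged["blocks"] = [block for ast in cell_asts for block in ast["blocks"]]
    return merged


def iter_nodes(node):
    """Yield every tagged element (a dict with a "t" key) in the AST."""
    if isinstance(node, list):
        for item in node:
            yield from iter_nodes(item)
    elif isinstance(node, dict):
        if "t" in node:
            yield node
        for value in node.values():
            yield from iter_nodes(value)


def unresolved_links(ast):
    """List internal link targets that match no header identifier in `ast`."""
    nodes = list(iter_nodes(ast["blocks"]))
    # Header contents are [level, [identifier, classes, keyvals], inlines].
    header_ids = {node["c"][1][0] for node in nodes if node["t"] == "Header"}
    # The last entry of a Link's contents is the pair [url, title].
    targets = [node["c"][-1][0] for node in nodes if node["t"] == "Link"]
    return [t for t in targets if t.startswith("#") and t[1:] not in header_ids]
```

Each per-cell AST would come from something like `pandoc --from markdown --to json`, and the merged document could be rendered with `pandoc --from json --to latex`. A link that is unresolved within its own cell's AST can become resolvable once the cells are merged, which is the global view the per-cell conversion lacks.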

mpacer commented on August 21, 2024

Also, I'm pretty sure the only reason any of this worked at all is because pandoc does some automatic reference/identifier name conversion as part of its handling of headers: http://pandoc.org/MANUAL.html#extension-auto_identifiers

specifically:
• Remove all formatting, links, etc.
• Remove all footnotes.
• Remove all punctuation, except underscores, hyphens, and periods.
• Replace all spaces and newlines with hyphens.
• Convert all alphabetic characters to lowercase.
• Remove everything up to the first letter (identifiers may not begin with a number or punctuation mark).
• If nothing is left after this, use the identifier "section".
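
The rules above can be approximated in a few lines of Python, which may help when guessing what identifier a header will get. This is a sketch of the listed steps, not pandoc's own implementation: `auto_identifier` is a hypothetical helper, it expects the header text rather than a parsed header (so links and footnotes are assumed to be stripped already), and details may differ between pandoc versions.

```python
def auto_identifier(header_text):
    """Approximate the auto-identifier rules listed above for plain header text."""
    # Remove punctuation except underscores, hyphens and periods.
    kept = "".join(
        ch for ch in header_text
        if ch.isalnum() or ch in "_-." or ch.isspace()
    )
    # Replace spaces and newlines with hyphens, then lowercase.
    lowered = "-".join(kept.split()).lower()
    # Remove everything up to the first letter.
    for index, ch in enumerate(lowered):
        if ch.isalpha():
            return lowered[index:]
    # Nothing left: fall back to the fixed identifier.
    return "section"
```

For example, `auto_identifier("3. Applications")` gives `applications`, and a header consisting only of digits falls back to `section`.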

If we want to support this feature, we should probably add something about this to the documentation, or at least point to the pandoc resource explaining it (ping @willingc, which do you think should be the approach).

Also, we can specify references explicitly, by giving them a unique CSS-style id attribute.

Headers can be assigned attributes using this syntax at the end of the line containing the header text:
{#identifier .class .class key=value key=value}
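
As a small illustration of the explicit form, the helpers below build a header line with a chosen identifier and a link that points at it. `header_with_id` and `link_to` are hypothetical names used only for this example. The attribute syntax is pandoc's, so whether the notebook's own markdown renderer honours it is a separate question that this sketch does not answer.

```python
def header_with_id(level, title, identifier):
    """Build an ATX header line carrying an explicit pandoc identifier."""
    return "{} {} {{#{}}}".format("#" * level, title, identifier)


def link_to(text, identifier):
    """Build an inline markdown link to that identifier."""
    return "[{}](#{})".format(text, identifier)


# A header containing math gets a predictable identifier instead of an
# automatically derived one.
header_line = header_with_id(2, "Proof for $V_{p_n}$", "proof-vpn")
link_line = link_to("the appendix", "proof-vpn")
```

An explicit identifier sidesteps the need to predict what the automatic conversion will produce, which is the awkward part for headers with math or punctuation.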

willingc commented on August 21, 2024

> If we want to support this feature, we should probably add something about this to the documentation, or at least point to the pandoc resource explaining it (ping @willingc, which do you think should be the approach).

My recommendation for the nbconvert docs would be the following:

• if the underlying nbconvert code is changed then we should be more detailed in the nbconvert docs
• if there is no new nbconvert code to accomplish this, I would recommend linking to the pandoc resource for explanation with perhaps a couple of sentences in the nbconvert docs.

willingc commented on August 21, 2024

Excellent @michaelpacer. We could perhaps try to optimize further next week too.
