
Comments (14)

mpacer commented on August 21, 2024

> My recommendation for the nbconvert docs would be the following:
>
>   • if the underlying nbconvert code is changed then we should be more detailed in the nbconvert docs
>   • if there is no new nbconvert code to accomplish this, I would recommend linking to the pandoc resource for explanation with perhaps a couple of sentences in the nbconvert docs.

So right now, this works in a minimal case using standard pandoc code, but only within a single cell. That functionality is so limited that I'm not sure it merits inclusion yet.

That said, the auto-identifier formatting might be weird for people… but that leads to a slightly different issue…

@Carreau Do we currently support anything like tab-completions for these kinds of selectors? I don't think we do. Or should that be a new issue? Should that be a second aim after getting cross references to work across the entire document? I feel like for autoidentifiers, if we are going to mention it in the documentation, given how tricky they can be to figure out how to write correctly, we should have some way to make it easier to automatically complete them (at least at the cell level).

OK, all that said, it looks like in the long run the syntax can follow pandoc's, but support for the feature is going to vary dramatically between now and then (if I can figure out how to make it work). So the way forward with documentation might be a partial note on this now, with a more elaborate exploration later, once it actually works across cells?

from nbconvert.

takluyver commented on August 21, 2024

Discussed this with Michael today. We can relatively easily preserve these links by using a pandoc filter to convert. We have such a filter in bookbook already (this is actually a bit more complex than we need for nbconvert, since it also deals with references between notebooks).

There will be some performance penalty in using a filter, whether we run pandoc twice (markdown -> JSON, then JSON -> LaTeX) or run pandoc once and let it invoke a Python subprocess for the filter. @michaelpacer is investigating what this is like.
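To make the filter idea concrete, here is a minimal sketch of what such a link-rewriting step might look like when applied to pandoc's JSON representation. It is not the bookbook filter; the function names are hypothetical, and it assumes the AST shape used by pandoc 1.16 and later, where a Link node's content is `[attr, inlines, [url, title]]`. It also assumes the link fragment already matches the label pandoc generates for the header.

```python
def stringify(inlines):
    """Flatten pandoc inline nodes to plain text (handles Str, Space, and simple containers)."""
    parts = []
    for node in inlines:
        if node["t"] == "Str":
            parts.append(node["c"])
        elif node["t"] in ("Space", "SoftBreak"):
            parts.append(" ")
        elif isinstance(node.get("c"), list):
            # Recurse into containers such as Emph or Strong.
            parts.append(stringify([n for n in node["c"] if isinstance(n, dict)]))
    return "".join(parts)


def rewrite_internal_links(node):
    """Recursively replace Link nodes that target '#id' with raw LaTeX hyperref commands."""
    if isinstance(node, list):
        return [rewrite_internal_links(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Link":
            _attr, inlines, (url, _title) = node["c"]
            if url.startswith("#"):
                latex = r"\hyperref[%s]{%s}" % (url[1:], stringify(inlines))
                return {"t": "RawInline", "c": ["latex", latex]}
        return {key: rewrite_internal_links(value) for key, value in node.items()}
    return node
```

In the "call pandoc twice" variant, a function like this would sit between `pandoc -t json` and `pandoc -f json -t latex`; in the other variant, pandoc would invoke it as a filter subprocess.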


mpacer commented on August 21, 2024

I took a quick-and-dirty approach to this. On a benchmark set of documents, we're seeing an increase to 1.5× for calling pandoc twice from within Python and 2.5× for invoking Python from within pandoc. Either way, we're using the pandocfilters library.

Now, to make the solutions not quick and dirty.


minrk commented on August 21, 2024

I suspect this is a pandoc issue, rather than an IPython one. What version of pandoc do you have?


flying-sheep commented on August 21, 2024

pandoc 1.13.1.


jankatins commented on August 21, 2024

I have the same issue when I convert a notebook to PDF via the notebook UI. The notebook contains a TOC (which I inserted manually via a Markdown cell and [headline](#headlineLink) links).

Could the problem be due to the way the cells are processed: one by one instead of as a complete document? E.g. pandoc sees only the link, but can't find the anchor for that link because it is in another cell?
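The hypothesis can be illustrated with a small sketch that does not call pandoc at all. It assumes a simplified slug rule (lowercase, whitespace to hyphens) in place of pandoc's real identifier algorithm, and the helper names are hypothetical. A link resolves only if a matching header is present in the same chunk of Markdown.

```python
import re

# ATX-style headers ("# Title") and internal links ("[text](#target)").
HEADER = re.compile(r"^#{1,6}[ \t]+(.*?)[ \t]*$", re.MULTILINE)
LINK = re.compile(r"\[[^\]]*\]\(#([^)\s]+)\)")


def slug(title):
    """Simplified stand-in for pandoc's header identifier rule (an assumption)."""
    return re.sub(r"\s+", "-", title.strip().lower())


def dangling_links(markdown):
    """Return internal link targets with no matching header in this Markdown text."""
    anchors = {slug(title) for title in HEADER.findall(markdown)}
    return [target for target in LINK.findall(markdown) if target not in anchors]


cells = ["See [Results](#results) for details.", "# Results\nSome text."]

# Converted cell by cell, the first cell has a link with no visible anchor.
per_cell = [dangling_links(cell) for cell in cells]

# Converted as one document, the same link resolves.
whole_document = dangling_links("\n\n".join(cells))
```

Here `per_cell` reports the `results` target as dangling for the first cell, while `whole_document` is empty.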


takluyver commented on August 21, 2024

That could well be the issue.


stsievert commented on August 21, 2024

I am having this same issue. I can replicate this when I include the section I'm linking to in a different cell. When I'm linking to a section in the same cell, the issue goes away.

I have also found that it doesn't work when I include math in the titles. When I convert to LaTeX with nbconvert, I see these lines in the output:

\protect\hyperlink{proof-for-ux24Vux5fux5cux257Bpux5fnux5cux257Dux24}{the
appendix}, by induction we can show that
% ...
\subsection{\texorpdfstring{Proof for
\(V_{p_n}\)}{Proof for V\_\{p\_n\}}}\label{proof-for-vux5fpux5fn}


mpacer commented on August 21, 2024

Is the consensus that this is a pandoc issue and not an nbconvert issue? Based on @stsievert's success within a cell and @JanSchulz's hypothesis about the parsing granularity, it sounds like it might be an interaction, but one that could be somewhat alleviated if we were to parse the entire notebook at once instead of at the cell level.

I'm assuming there's a reason for not doing that (possibly having to do with passing the contents of individual cells to pandoc vs. concatenating entire notebooks before passing data to pandoc), but what is that reason? At least in the context of passing something to pandoc, couldn't we make this an option?


takluyver commented on August 21, 2024

I think various things could be improved if we were to pass entire notebooks to pandoc, but we would also lose quite a bit of the customisation we want, at least if we do it the simple way (convert the entire notebook to Markdown and then convert that to LaTeX). I think it may make sense to experiment with some alternative pathways from notebook to PDF, one of which would rely more heavily on pandoc, but at least for the time being, I don't think we should try to replace the workings of the existing LatexExporter.


mpacer commented on August 21, 2024

@takluyver We talked about this a bit today, and @minrk suggested instead looking into pandoc's intermediate representation format, which can still mark up everything as separate pieces (and apparently is quite like nbformat) but would still be able to resolve the cross references globally.


mpacer commented on August 21, 2024

Also, I'm pretty sure the only reason any of this worked at all is that pandoc does some automatic reference/identifier name conversion as part of its handling of headers: http://pandoc.org/MANUAL.html#extension-auto_identifiers

specifically:
• Remove all formatting, links, etc.
• Remove all footnotes.
• Remove all punctuation, except underscores, hyphens, and periods.
• Replace all spaces and newlines with hyphens.
• Convert all alphabetic characters to lowercase.
• Remove everything up to the first letter (identifiers may not begin with a number or punctuation mark).
• If nothing is left after this, use the identifier "section".
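The rules above can be approximated in a few lines. This is a sketch, not pandoc's implementation: the function name is hypothetical, and it assumes the heading has already been reduced to plain text, so the first two rules (stripping formatting, links, and footnotes) are not handled.

```python
import re


def auto_identifier(heading_text):
    """Approximate pandoc's auto_identifiers rule for a plain-text heading."""
    # Remove all punctuation except underscores, hyphens, and periods.
    text = re.sub(r"[^\w\s.-]", "", heading_text)
    # Replace runs of spaces and newlines with a single hyphen.
    text = re.sub(r"\s+", "-", text.strip())
    # Convert alphabetic characters to lowercase.
    text = text.lower()
    # Drop everything before the first letter.
    for index, char in enumerate(text):
        if char.isalpha():
            return text[index:]
    # Nothing usable is left, so fall back to the default identifier.
    return "section"
```

For example, `auto_identifier("3. Applications")` gives `applications`, so a link to that header would be written as `[Applications](#applications)`.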

If we want to support this feature, we should probably add something about this to the documentation, or at least point to the pandoc resource explaining it (ping @willingc, which do you think should be the approach).

Also, we can specify references explicitly, by giving them a unique CSS-style id attribute.

Headers can be assigned attributes using this syntax at the end of the line containing the header text:

{#identifier .class .class key=value key=value}
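As a rough illustration of how that attribute block breaks down, the sketch below pulls the identifier, classes, and key/value pairs out of a header line. The function name is hypothetical, and it deliberately ignores quoted values that contain spaces, which pandoc's own parser accepts.

```python
import re

# A trailing {...} block at the end of the header line.
ATTR_BLOCK = re.compile(r"\{([^{}]*)\}\s*$")


def parse_header_attributes(line):
    """Split a trailing attribute block into (identifier, classes, key/value pairs)."""
    identifier, classes, pairs = None, [], {}
    match = ATTR_BLOCK.search(line)
    if match:
        for token in match.group(1).split():
            if token.startswith("#"):
                identifier = token[1:]
            elif token.startswith("."):
                classes.append(token[1:])
            elif "=" in token:
                key, value = token.split("=", 1)
                pairs[key] = value.strip('"')
    return identifier, classes, pairs
```

A header written as `## Proof {#proof-vpn}` would then be linkable as `[the appendix](#proof-vpn)`, which sidesteps guessing the automatically generated identifier.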


willingc commented on August 21, 2024

> If we want to support this feature, we should probably add something about this to the documentation, or at least point to the pandoc resource explaining it (ping @willingc, which do you think should be the approach).

My recommendation for the nbconvert docs would be the following:

  • if the underlying nbconvert code is changed then we should be more detailed in the nbconvert docs
  • if there is no new nbconvert code to accomplish this, I would recommend linking to the pandoc resource for explanation with perhaps a couple of sentences in the nbconvert docs.


willingc commented on August 21, 2024

Excellent @michaelpacer. We could perhaps try to optimize further next week too.

