htdebeer / paru

Control pandoc with Ruby and write pandoc filters in Ruby

Home Page: https://heerdebeer.org/Software/markdown/paru/

License: GNU General Public License v3.0

Ruby 97.56% TeX 0.07% HTML 1.44% JavaScript 0.06% Dockerfile 0.18% Lua 0.69%

pandoc pandoc-filter ruby

paru's People

professosaurus rriemann iandol phitsc aarondufall herculosh tomrobison crguezl lygaret

paru's Issues

Accessing element.inner_markdown truncates at first space, occasionally

Hello!

I wrote to you a few years ago asking about using EPUB as an input method for paru, and I've been using it ever since you implemented the convert_file method.

Inspired by the code in this related issue #9, I'm using a paru filter to fine-tune the latex output of a document and it's proving really successful.

This is the relevant snippet from my filter:

  # Change the output so chapters and sections aren't numbered
  with 'Header' do |header|
    header.inner_markdown = header.children.first.inner_markdown if header.has_children?
    if header.level == 1
      header.markdown = "\\chapter\*\{\\texorpdfstring\{\{#{header.inner_markdown}\}\}\{#{header.inner_markdown}\}\}"
    end
    if header.level == 2
      header.markdown = "\\section\*\{\\texorpdfstring\{\{#{header.inner_markdown}\}\}\{#{header.inner_markdown}\}\}"
    end
  end

In general, this works absolutely fine. However, in some instances, the resulting output includes only the first word of that header, rather than the full header text.

For example, in the result file I would see something like this

\section*{\texorpdfstring{{London, 
}}{London, 
}}

or 

\section*{\texorpdfstring{{The 
}}{The
}}

when the expected output would have been

\section*{\texorpdfstring{{London, February 1907
}}{London, February 1907
}}

and

\section*{\texorpdfstring{{The Orient Express, August 1905
}}{The Orient Express, August 1905
}}

I can't seem to find any commonality in the instances where this occurs, but when there is a comma, that bit of punctuation is always preserved, and otherwise the string gets truncated at the first space.

Am I accessing the wrong method with .inner_markdown?

I've been looking at the input carefully and it's not that there is a child node in there I don't think – the source HTML from the EPUB input in this instance was <h2>The Orient Express, August 1905</h2>, for example.

This issue doesn't occur in every instance of a heading, but when it does occur in a specific heading it will happen every time I run paru, but as I said I can't quite ascertain what is causing it.

Do you have any ideas? Am happy to provide more examples and information to help debug this issue.

Thanks!

Trouble configuring paru to pass pandoc an external metadata file

I was looking at commit ec42eb7 and how it's added support for pandoc --metadata-file, but I can't seem to get this to work with paru.

I assumed the rubyesque way of setting this option would be with an underscore instead of a hyphen: metadata_file 'filename.yaml'

file = 'test.md'
Paru::Pandoc.new do
  from 'markdown-smart'
  to 'markdown-smart'
  standalone true
  output 'test-1.md'
  metadata_file 'metadata.yaml'
end.convert_file file

However, I get an error when I try this.

NoMethodError: undefined method 'metadata_file' for #<Paru::Pandoc:0x00007fc55d30ec68> did you mean? metadata-file

When I try and use metadata-file in my options instead, I get a ruby syntax error because of the hyphen.

Is there a bug in this implementation, or am I not configuring this option correctly in my example above?
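
The name mapping the reporter expected can be sketched in a few lines. This is a hypothetical helper, not paru's implementation: the name option_flag and its behaviour are assumptions made for illustration only.

```ruby
# Hypothetical helper (not part of paru): turn a Ruby-style option name
# into the hyphenated long flag that pandoc expects on the command line.
def option_flag(name, value = true)
  flag = "--#{name.to_s.tr('_', '-')}"
  # Boolean switches carry no value; everything else becomes --flag=value.
  value == true ? flag : "#{flag}=#{value}"
end

puts option_flag(:metadata_file, "metadata.yaml")
puts option_flag(:standalone)
```

With a mapping like this, metadata_file would end up as --metadata-file=metadata.yaml, which is what the example above is trying to achieve.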

Replacing outer_markdown does not always work (e.g. replacing with embedded LaTeX)

When replacing an element's outer_markdown:

- Sometimes it works (see the attached sample code), e.g. when I try to replace an image object with **Bold Text**.
- Sometimes it does not completely work, e.g. when I try to replace it with # Chapter, the resulting file omits the # and includes only the word “Chapter”.
- When I try to replace it with an embedded LaTeX command, it reports an error:

/var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/node.rb:103:in `each': undefined method `each' for nil:NilClass (NoMethodError)
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/markdown.rb:87:in `outer_markdown='
	from testFilter.rb:10:in `block (2 levels) in <main>'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter.rb:289:in `with'
	from testFilter.rb:7:in `block in <main>'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter.rb:268:in `instance_eval'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter.rb:268:in `block in filter'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/ast_manipulation.rb:101:in `each_depth_first'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/ast_manipulation.rb:102:in `block in each_depth_first'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/node.rb:104:in `block in each'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/node.rb:103:in `each'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/node.rb:103:in `each'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/ast_manipulation.rb:102:in `each_depth_first'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/ast_manipulation.rb:102:in `block in each_depth_first'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/node.rb:104:in `block in each'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/node.rb:103:in `each'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/node.rb:103:in `each'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/ast_manipulation.rb:102:in `each_depth_first'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter.rb:266:in `filter'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter.rb:228:in `run'
	from testFilter.rb:6:in `<main>'
pandoc: Error running filter testFilter.rb

You can run the example code with:
pandoc "test.markdown" --filter testFilter.rb -o output.markdown

Again, I’m not 100% sure how to write a replacing filter. I also tried to replace it with a Paru object (e.g. RawBlock or RawInline), but that did not work either.

Replace_with_Markdown.zip

Support new data-directory in pandoc

Ian mentioned that pandocomatic has trouble finding files in the data-directory (pandocomatic issue #64) with pandoc 2.7, because pandoc now implements the XDG Base Directory Specification and the default data-directory changed from ~/.pandoc to ~/.local/share/pandoc. In the release notes I read that the new location takes precedence.
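
A minimal sketch of that lookup order on *nix and macOS, following the description in pandoc's manual: the XDG location is preferred, and ~/.pandoc is used only when the XDG directory does not exist and the legacy one does. The function name pandoc_data_dir and its parameters are assumptions for illustration, not paru or pandocomatic API.

```ruby
# Sketch only: `env` and `home` are parameters so the lookup is easy to test.
def pandoc_data_dir(env = ENV, home = Dir.home)
  xdg_base = env["XDG_DATA_HOME"]
  xdg_base = File.join(home, ".local", "share") if xdg_base.nil? || xdg_base.empty?

  xdg_dir = File.join(xdg_base, "pandoc")
  legacy_dir = File.join(home, ".pandoc")

  # Fall back to the legacy directory only when the XDG one is absent.
  return legacy_dir if !Dir.exist?(xdg_dir) && Dir.exist?(legacy_dir)

  xdg_dir
end
```

Passing the environment and home directory explicitly keeps the function free of global state, so a tool can check both locations without touching the real user directory.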

More of one argument

Hi!

Guess it's not a bug :), simply my low ruby skills. How can I pass arguments from command line to a filter?
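
One possible approach, sketched under assumptions: pandoc's --filter option takes only the program name and passes the target format as the sole argument, so extra settings are often handed over through environment variables (or document metadata) instead. The variable name MY_FILTER_PREFIX and the helper filter_settings are invented for this example.

```ruby
# Hypothetical sketch: collect filter settings from the environment.
# MY_FILTER_PREFIX is an invented variable name, not something paru defines.
def filter_settings(env = ENV)
  { prefix: env.fetch("MY_FILTER_PREFIX", "") }
end

settings = filter_settings
warn "prefix is #{settings[:prefix].inspect}"
```

A call could then look like MY_FILTER_PREFIX=draft- pandoc --filter ./my_filter.rb input.md, where my_filter.rb is a placeholder file name.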

image.attr.has_keys? fails as it expects a map but gets an Array

The following code snippet fails at the line if image.attr.has_key? "width" with this error:

gems/paru-0.3.0.1/lib/paru/filter/attr.rb:72:in `has_key?': undefined method `key_exists?' for #<Array:0x00005604c3b8bc80>

require "paru/filter"

Paru::Filter.run do
  with "Image" do |image|
    if image.attr.has_key? "width"
      STDERR.puts "has width"
    end
  end
end
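
For context, pandoc's JSON AST stores an element's attributes as a three-part array: an identifier, a list of classes, and a list of [key, value] pairs. That makes a hash-style lookup fail unless the pairs are searched explicitly. The helper below is an illustrative sketch on plain arrays; attr_value is an invented name and this is not paru's Attr class.

```ruby
# Pandoc serialises attributes as [id, [classes], [[key, value], ...]].
# Look up a key in the pair list; returns nil when the key is absent.
def attr_value(attr, key)
  _id, _classes, pairs = attr
  pair = pairs.find { |k, _v| k == key }
  pair && pair[1]
end

image_attr = ["fig1", ["wide"], [["width", "50%"]]]
warn "has width" unless attr_value(image_attr, "width").nil?
```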

Make paru compatible with pandoc version 2.7

There are some bigger changes to pandoc we should support:

2.7:
- New data-directory (#46)
- New option --ipynb-output=all|none|best
2.3:
- New option --metadata-file

Check update pandoc to 2.10

Pandoc has just reached release 2.10. There seem to be some changes to pandoc-types, so check whether those changes need to be propagated to paru or have other implications for paru.

Pry break Paru filters

This is quite bizarre, but when I try to use the excellent Pry (0.10.4) to inspect a filter, I get a Paru error.

test.md

A minimal **pandoc** example.

noop filter

#!/usr/bin/env ruby
require 'pry'
require 'paru/filter'

binding.pry
Paru::Filter.run do 

end

Output:

👉  pandoc -t json -F noop test.md
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.8/lib/paru/filter/document.rb:61:in `rescue in from_JSON': Unable to read document. (Paru::FilterError)

Most likely cause: Paru expects a pandoc installation that has been
compiled with pandoc-types >= 1.17.0.5. You can
check which pandoc-types have been compiled with your pandoc installation by
running `pandoc -v`.

Original error message: 765: unexpected token at ''
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.8/lib/paru/filter/document.rb:57:in `from_JSON'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.8/lib/paru/filter.rb:237:in `document'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.8/lib/paru/filter.rb:264:in `filter'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.8/lib/paru/filter.rb:228:in `run'
	from /Users/ian/.pandoc/filters/noop:8:in `<main>'
pandoc: Error running filter /Users/ian/.pandoc/filters/noop
Filter returned error status 1

What is strange is that pry should stop execution at the run() — if you comment out binding.pry then there is no error. I also tried the new binding.irb and that also has problems.
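
A possible explanation, not confirmed in this report: pandoc pipes the document to the filter's STDIN, and an interactive prompt reads from that same stream, so the input may already be empty by the time the filter runs. That would be consistent with the unexpected token at '' message. The sketch below only shows how a reader can fail with a clearer message in that situation; read_pandoc_json is an invented name, not paru API.

```ruby
require "json"

# Read a pandoc JSON document from `io`, raising a descriptive error when
# the stream is empty (for instance because something else read it first).
def read_pandoc_json(io = $stdin)
  text = io.read.to_s
  if text.strip.empty?
    raise ArgumentError, "no JSON on input; was STDIN already consumed?"
  end
  JSON.parse(text)
end
```

For debugging, writing diagnostics to STDERR with warn leaves STDIN and STDOUT untouched, which is what the other filters in these reports do.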

Some more Pandoc 2 details...

Hi, the first version of the epic changelog for Pandoc 2 is now available:

jgm/pandoc@25590b1

and I saw this:

Set `PANDOC_READER_OPTIONS` in environment where filters are run. This contains a JSON representation of `ReaderOptions`, so filters can access it.

Worth noting, though doesn't need any explicit paru support I suppose? There is also a detailed section of API changes, which may be useful for paru...
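
A filter can pick this variable up with the standard library alone. The sketch below assumes nothing about the keys inside the JSON; reader_options is an invented helper name.

```ruby
require "json"

# Return the reader options pandoc exposes to filters as a Hash, or {} when
# the variable is absent (e.g. an older pandoc, or the filter run by hand).
def reader_options(env = ENV)
  raw = env["PANDOC_READER_OPTIONS"]
  return {} if raw.nil? || raw.empty?
  JSON.parse(raw)
end

warn "reader options: #{reader_options.keys.inspect}"
```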

0.2.4.7 error running hello world in Ruby V2.0

If I try to run the "hello world" paru example with 0.2.4.7 (installed via gem) I get this error:

👉  test2
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter.rb:22:in `require_relative': /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter/document.rb:64: syntax error, unexpected tIDENTIFIER, expecting keyword_do or '{' or '(' (SyntaxError)
Most likely cause: Paru expects a pandoc installation that has been
                               ^
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter/document.rb:66: syntax error, unexpected tIDENTIFIER, expecting keyword_do or '{' or '('
check which pandoc-types have been compi...
           ^
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter/document.rb:66: syntax error, unexpected tIDENTIFIER, expecting keyword_do or '{' or '('
check which pandoc-types have been compiled with your pand...
                             ^
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter/document.rb:76: syntax error, unexpected tCONSTANT, expecting keyword_do or '{' or '('
pandoc-types API version used in document (ve...
                ^
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter/document.rb:76: syntax error, unexpected keyword_in, expecting keyword_end
...andoc-types API version used in document (version = #{versio...
...                               ^
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter/document.rb:77: syntax error, unexpected tIDENTIFIER, expecting keyword_do or '{' or '('
smaller than the version of pandoc-types used by paru
            ^
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter/document.rb:77: syntax error, unexpected tIDENTIFIER, expecting keyword_do or '{' or '('
smaller than the version of pandoc-types used by paru
                                             ^
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter/document.rb:81: syntax error, unexpected keyword_end, expecting ')'
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter/document.rb:85: syntax error, unexpected keyword_end, expecting ')'
/Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter/document.rb:136: syntax error, unexpected keyword_end, expecting ')'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter.rb:22:in `<module:Paru>'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru/filter.rb:19:in `<top (required)>'
	from /Library/Ruby/Site/2.0.0/rubygems/core_ext/kernel_require.rb:55:in `require'
	from /Library/Ruby/Site/2.0.0/rubygems/core_ext/kernel_require.rb:55:in `require'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru.rb:21:in `<module:Paru>'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.7/lib/paru.rb:19:in `<top (required)>'
	from /Library/Ruby/Site/2.0.0/rubygems/core_ext/kernel_require.rb:133:in `require'
	from /Library/Ruby/Site/2.0.0/rubygems/core_ext/kernel_require.rb:133:in `rescue in require'
	from /Library/Ruby/Site/2.0.0/rubygems/core_ext/kernel_require.rb:40:in `require'
	from ../test2:2:in `<main>'

What I think is happening is that you are using the indent-aware <<~ for your HEREDOC errors (https://github.com/htdebeer/paru/blob/master/lib/paru/filter/document.rb#L61), which was only introduced in Ruby V2.3, and macOS uses V2.0 by default. I think it would be better to try to keep V2.0 compatibility, especially as you aren't really using the indentation feature of <<~ anyway
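
A sketch of one way to keep indented message text while staying compatible with older Ruby versions: use the <<- form, which has been available much longer, and strip the common indentation by hand. The helper name strip_indent and the message text are made up for this example; this is not the code in document.rb.

```ruby
# Remove the common leading indentation from every line, which is roughly
# what the squiggly heredoc (<<~) does in Ruby 2.3 and later.
def strip_indent(text)
  indent = text.scan(/^[ \t]*(?=\S)/).map(&:length).min || 0
  text.gsub(/^[ \t]{#{indent}}/, "")
end

message = strip_indent(<<-EOS)
    Most likely cause: the input could not be parsed.
      Check the output of pandoc -v.
EOS

puts message
```

Relative indentation is preserved, so the second line stays indented by two spaces after stripping.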

Pandoc V2.8

Hi Huub, hope you are well. This was just a small heads-up that the upcoming V2.8 of Pandoc will include a few command-line option changes:

https://github.com/jgm/pandoc/blob/master/changelog.md

There is a nice new feature to call YAML defaults, and I hope this will work alongside pandocomatic (I haven't tested anything yet though):
https://github.com/jgm/pandoc/blob/master/MANUAL.txt#L1423

Inline filter possible?

I'm looking for a way to embed filters "inline", for example:

 converter = Paru::Pandoc.new do
  from "textile"
  to "html"
  filter do
    ...
  end
end

Is this possible? Or is there another way?

Add today's date filter not working

Hi, I tried using the new example filter, which I renamed addToday in my local install, and I get the following NoMethodError:

/Users/ian/.pandoc/filters/addToday:8:in `block in <main>': undefined method `[]=' for #<Paru::PandocFilter::Meta:0x007fb7d612bc00> (NoMethodError)
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.3/lib/paru/filter.rb:132:in `instance_eval'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.3/lib/paru/filter.rb:132:in `block in filter'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.3/lib/paru/filter/ast_manipulation.rb:92:in `each_depth_first'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.3/lib/paru/filter.rb:130:in `filter'
	from /Library/Ruby/Gems/2.0.0/gems/paru-0.2.4.3/lib/paru/filter.rb:108:in `run'
	from /Users/ian/.pandoc/filters/addToday:7:in `<main>'
pandoc: Error running filter /Users/ian/.pandoc/filters/addToday
Filter returned error status 1

filter:

#!/usr/bin/env ruby
## Add today's date to the metadata

require "paru/filter"
require "date"

Paru::Filter.run do 
	metadata["date"] = Paru::PandocFilter::MetaString.new(Date.today.to_s)
end

Paru does not work on windows?

When I try to run

require 'paru/pandoc'

converter = Paru::Pandoc.new do
	from "markdown"
	to "html"
end

result = converter << "hello *world*"

on Windows, I get the message:

$ ruby test_paru.rb
pandoc: \
: openFile: invalid argument (Invalid argument)
C:/Ruby24/lib/ruby/gems/2.4.0/gems/paru-0.2.5.9/lib/paru/pandoc.rb:299:in `run_converter': error while running: (Paru::Error)

pandoc  --from=markdown \
        --to=html

Pandoc responded with:

pandoc: \
: openFile: invalid argument (Invalid argument)

        from C:/Ruby24/lib/ruby/gems/2.4.0/gems/paru-0.2.5.9/lib/paru/pandoc.rb:153:in `convert'
        from test_paru.rb:8:in `<main>'
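
The output above shows the command rendered with backslash line continuations, which POSIX shells understand but the Windows command interpreter does not. Whether that is the actual cause is not stated here. As a sketch under that assumption, the command can be passed as an argument array so that no shell parses it; pandoc_argv and convert are invented names, not paru's methods.

```ruby
require "open3"

# Build the command as an array: each element reaches pandoc as one argument,
# so no shell-specific quoting or line continuation is involved.
def pandoc_argv(from, to, extra = [])
  ["pandoc", "--from=#{from}", "--to=#{to}", *extra]
end

# Requires pandoc on the PATH.
def convert(input, from, to, extra = [])
  output, status = Open3.capture2(*pandoc_argv(from, to, extra), stdin_data: input)
  raise "pandoc exited with status #{status.exitstatus}" unless status.success?
  output
end
```

Because arguments are never re-split, an option such as --output=my file.html stays a single argument even though it contains a space.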

After fixing running paru on windows, tests with paths with spaces fail on windows

After fixing running paru on windows, tests with paths with spaces fail on windows. Just run rake test on Windows.

Don't regenerate date footers for all yard docs on each small fix?

This isn't really important, but if you look at the diff of each commit, every HTML file has a changed date, and this makes looking for the actual change in the git commit history really difficult. The easiest option is to just remove the date in the erb footer:

https://stackoverflow.com/questions/10959850/how-to-customize-diff-git-to-ignore-yard-date-generation

Make pandoc2yaml.rb and do-pandoc.rb from the examples executables

Make pandoc2yaml.rb and do-pandoc.rb from the examples executables, so that users who want nothing more than these scripts, or who are inexperienced with Ruby, can use them easily.

Allowing the use of stop! at start of filter

As we discussed by email, if I use a conditional to test a metadata key exists at the start of a filter, stop! does not actually stop, but carries on.

Paru::Filter.run do
  stop! unless metadata.key?('institute')
  #do something with metadata key, this gets called even if 'institute' key never existed
end

The simple fix is to just wrap the rest of the filter in an if statement, but it seems to be Ruby "style" to return early in this fashion...
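
One way an early exit like this can be made to work for a block evaluated with instance_eval is Ruby's catch/throw. The class below is a hypothetical mini runner written for this illustration; it is not paru's implementation, and the names MiniFilter and note are invented.

```ruby
# Hypothetical mini runner (not paru's code): `stop!` throws a symbol that
# the runner catches, which abandons the rest of the block.
class MiniFilter
  attr_reader :log

  def initialize
    @log = []
  end

  def self.run(&block)
    filter = new
    catch(:stop) { filter.instance_eval(&block) }
    filter
  end

  def stop!
    throw :stop
  end

  def note(message)
    @log << message
  end
end

result = MiniFilter.run do
  has_institute = false
  note "before"
  stop! unless has_institute
  note "after" # never reached in this example
end

puts result.log.inspect
```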

Setting metadata Loops the filter run 7 times

Hi, I'm testing a metadata filter and I notice that the filter code is triggered 7 times.

---
title: 'My Title'
author: John Doe
...
Minimal **example**.

#!/usr/bin/env ruby
require 'paru/filter'

testkey = 'author'

Paru::Filter.run do
  if metadata.has_key?(testkey)
    nau = nil
    au = metadata[testkey]
    if au.is_a?(String)
        warn "It's a string"
        nau = [Hash["name" => au]]
    elsif au.is_a?(Array)
        warn "It's an array"
        if au[0].is_a?(String)
          nau = Array.new(au.length) {Hash.new}
          au.each_index {|i| nau[i] = Hash["name" => au[i]]}
        end
    elsif au.is_a?(Hash)
        warn "It's a hash"
    else
      warn "Who know's what it is?"
    end
    if not nau.nil?
      metadata[testkey] = nau
    end
  end
end

👉  pandoc -s -f markdown -t markdown -F testFilter test.md
It's a string
It's an array
It's an array
It's an array
It's an array
It's an array
It's an array
---
author:
- name: John Doe
title: My Title
---

Minimal **example**.

As you can see the warn output is 7 lines long. Using byebug, when the metadata on line 32 is set I see:

[26, 35] in /Users/ian/.dotfiles/pandoc/filters/noop2
   26:     elsif au.is_a?(Hash)
   27:         warn "It's a hash"
   28:     else
   29:       warn "Who know's what it is?"
   30:     end
   31:     if not nau.nil?
=> 32:       metadata[testkey] = nau
   33:     end
   34:   end
   35: end
(byebug) n

[96, 105] in /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/paru-0.2.5f/lib/paru/filter/ast_manipulation.rb
    96:             #   tree
    97:             #
    98:             # @yield node
    99:             def each_depth_first(&block)
   100:                 yield self
=> 101:                 each {|child| child.each_depth_first(&block)} if has_children?
   102:             end
   103:
   104:         end
   105:     end

And this leads to another run through the filter. The longer the document, the more superfluous loops through the filter.
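
The each_depth_first listing above yields every node in turn, so a block that is evaluated once per node runs as many times as there are nodes. The sketch below mimics that traversal with a hypothetical Node struct (not paru's classes) for a document shaped like the test input; counting its nodes gives seven, which is consistent with the seven warnings, though the exact node layout is an assumption.

```ruby
# Hypothetical stand-in for an AST node; only the traversal matters here.
Node = Struct.new(:name, :children) do
  def each_depth_first(&block)
    yield self
    children.each { |child| child.each_depth_first(&block) }
  end
end

# Assumed shape of "Minimal **example**." as a tree.
doc = Node.new("Document", [
  Node.new("Para", [
    Node.new("Str", []),
    Node.new("Space", []),
    Node.new("Strong", [Node.new("Str", [])]),
    Node.new("Str", [])
  ])
])

calls = 0
doc.each_depth_first { |_node| calls += 1 }

# A simple guard makes document-level work happen only once.
done = false
once = 0
doc.each_depth_first do |_node|
  next if done
  done = true
  once += 1
end

puts "per-node calls: #{calls}, guarded calls: #{once}"
```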

Selector "classes" regex is too restrictive, disallows the valid underscore character

The Selector class matcher uses the regular expression \.[a-zA-Z-]+, which prevents selectors with underscores. Technically, any character can be used in a CSS class name as long as it's escaped properly in the stylesheet rule; but the _ character is valid without being escaped, and indeed underscores are commonly used in modern CSS namespacing conventions such as BEM. There's an alternative regex in this StackOverflow answer.
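
The difference is easy to check in isolation. In the sketch below the first pattern is the one quoted above, anchored for the comparison; the second is a commonly cited pattern for unescaped CSS class names and is an assumption here, since it may not be the one the linked answer proposes.

```ruby
# The quoted pattern, anchored so that partial matches do not count.
CURRENT_CLASS = /\A\.[a-zA-Z-]+\z/
# Assumed alternative: optional leading hyphen, then letters or underscore,
# then letters, digits, underscores or hyphens.
RELAXED_CLASS = /\A\.-?[_a-zA-Z]+[_a-zA-Z0-9-]*\z/

%w[.note .classed_blockquote .block__element--modifier].each do |selector|
  current = !(selector =~ CURRENT_CLASS).nil?
  relaxed = !(selector =~ RELAXED_CLASS).nil?
  puts "#{selector}: current=#{current} relaxed=#{relaxed}"
end
```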

Update pandoc-api-version for Pandoc 2

Pandoc 2 should be getting close to release (there is one outstanding issue, and documentation to finish).

pandoc-types has been upgraded to 1.17.1 — and thus paru seems currently broken with the latest Pandoc 2 nightlies:

👉  pandocomatic -b 'Lu.md'
pandoc	--standalone \
	--filter=/Users/ian/.pandoc/filters/removeHR \
	--filter=/Users/ian/.pandoc/filters/authorRemoveHash \
	--bibliography=/Users/ian/.pandoc/Core.json \
	--csl=/Users/ian/.pandoc/csl/neuron.csl \
	--from=markdown \
	--to=docx \
	--reference-doc=/Users/ian/.pandoc/templates/custom.docx \
	--dpi=300 \
	--output=./Lu.docx
Error running filter /Users/ian/.pandoc/filters/removeHR:
Error in $['pandoc-api-version'][3]: expected Int, encountered Null
Pandocomatic needed 1.2 seconds to convert 'Lu.md'.

I tried to find how this works, but can't see why this bug is triggering (how is pandoc-api-version generated?).

👉  pandoc --version
pandoc 2.0
Compiled with pandoc-types 1.17.1, texmath 0.9.4.1, skylighting 0.3.3
Default user data directory: /Users/ian/.pandoc
Copyright (C) 2006-2017 John MacFarlane
Web:  http://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.
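
The error message points at a null in the fourth position of pandoc-api-version, which suggests the filter sent back a four-part version even though pandoc-types 1.17.1 has three parts. That reading is a guess, not something confirmed here. The sketch below shows a pass-through on the JSON document that keeps the version array exactly as received; passthrough is an invented name, not paru API.

```ruby
require "json"

# Parse a pandoc JSON document, check the version array, and serialise it
# again without padding or truncating "pandoc-api-version".
def passthrough(json_text)
  doc = JSON.parse(json_text)
  version = doc.fetch("pandoc-api-version")
  unless version.is_a?(Array) && version.all? { |part| part.is_a?(Integer) }
    raise ArgumentError, "unexpected pandoc-api-version: #{version.inspect}"
  end
  JSON.generate(doc)
end

input = '{"pandoc-api-version":[1,17,1],"meta":{},"blocks":[]}'
puts passthrough(input)
```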

Citation-Abbreviations not being called

Hi Huub, long time! Hope all is good with you.

There is a small naming bug with V2 Pandoc options:

https://pandoc.org/MANUAL.html#citation-rendering

Should be citation_abbreviations but in https://github.com/htdebeer/paru/blob/master/lib/paru/pandoc_options_version_2.yaml#L116 it is citation_abbreviation and so we get the error:

The pandoc option 'citation_abbreviations' (with value '/Users/ian/.pandoc/cite-abbr.json') is not recognized by paru. This option is skipped.

I'll create a pull request...

--toc-depth is missing in Pandoc 2 command line YAML

👉  pandoc -h
pandoc [OPTIONS] [FILES]
...
                        --toc, --table-of-contents
                        --toc-depth=NUMBER
...

https://github.com/htdebeer/paru/blob/paru-for-pandoc2/lib/paru/pandoc_options_version_2.yaml

I'm not sure what the default value is...

Add class to table

undefined method empty - metadata.rb:49

Hi, I'm getting this error:

Erro: JSON parse error: Error in $: Incompatible API versions: encoded with [1,20] but attempted to decode with [1,21].
/var/lib/gems/2.7.0/gems/paru-0.4.0.1/lib/paru/filter/metadata.rb:49:in `initialize': undefined method `empty?' for nil:NilClass (NoMethodError)
	from /var/lib/gems/2.7.0/gems/paru-0.4.0.1/lib/paru/filter.rb:272:in `new'
	from /var/lib/gems/2.7.0/gems/paru-0.4.0.1/lib/paru/filter.rb:272:in `filter'
	from /var/lib/gems/2.7.0/gems/paru-0.4.0.1/lib/paru/filter.rb:244:in `run'
	from ./filtro.rb:6:in `<main>'
Error running filter filtro.rb:
Filter returned error status 1

Can you help me?

convert issue on Windows (?)

I don't know if this is a bug or not but testing the "hello world" test script on Windows 10 it complains with this:

pandoc:
: openFile: invalid argument (Invalid argument)

Ruby version: ruby 2.4.0p0 (2016-12-24 revision 57164) [i386-mingw32]

Tell me if you need more information.

Thanks in advance!

Upgrade to pandoc 1.18

Upgrade to pandoc 1.18. In particular, process the added command-line options and the changes to the JSON format.

metadata.delete does report error?

In a previous version I could delete metadata entries by typing:

metadata.delete("date")

As of version paru-0.2.4.6 it no longer works, and I get:

/var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/meta_map.rb:185:in `select': undefined method `has_key?' for []:Array (NoMethodError)
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/meta_map.rb:139:in `has?'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/meta_map.rb:150:in `delete'
	from bin/removeMetadataFilter.rb:6:in `block in <main>'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter.rb:268:in `instance_eval'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter.rb:268:in `block in filter'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/ast_manipulation.rb:101:in `each_depth_first'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/ast_manipulation.rb:102:in `block in each_depth_first'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/node.rb:104:in `block in each'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/node.rb:103:in `each'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/node.rb:103:in `each'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter/ast_manipulation.rb:102:in `each_depth_first'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter.rb:266:in `filter'
	from /var/lib/gems/2.3.0/gems/paru-0.2.4.8/lib/paru/filter.rb:228:in `run'
	from bin/removeMetadataFilter.rb:4:in `<main>'

Of course I’m not 100% sure if this is a correct way of removing metadata (I have metadata in latex/pdf output but remove them for HTML output)

I have attached a test example - you can execute:
pandoc "test.markdown" --filter testFilter.rb -o test.html

Metadata_Remove.zip

howto: capturing a bunch of paragraphs

Hi after a long time!
I'm stuck trying to capture this block of paragraphs to generate custom HTML output. For example, I want to capture all the markdown code between :::alert and :::

:::alert
A simple **alert**

with a few paragraphs and...

other stuff like lists:

* One
* Two
:::

and produce some custom HTML output like this:

<div class="alert">
<p>A simple <strong>alert</strong></p>

<p>with a few paragraphs and...</p>

<p>other stuff like lists:</p>

<ul><li>One</li><li>Two</li></ul>
</div>

Is that possible with Paru?

Thanks in advance!

Paru documentation

I found this Paru documentation (https://heerdebeer.org/Software/markdown/paru/#frequently-asked-questions) very helpful, but I'm stuck when extracting the metadata from the markdown file (section 2.1):

$ ./pandoc2yaml.rb:32:in `values_at': no implicit conversion of String into Integer (TypeError)

I googled a bit (https://stackoverflow.com/questions/20790499/no-implicit-conversion-of-string-into-integer-typeerror) but I didn't find a way to work around that issue.
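
One way to get this TypeError is to call values_at with string keys on an Array rather than a Hash. That would happen if the parsed JSON were in the older array layout that pandoc emitted before 1.18 instead of the newer object layout. Whether this is what happens in pandoc2yaml.rb is an assumption. The sketch below accepts both layouts; extract_meta is an invented name.

```ruby
require "json"

# Accept both JSON layouts: an object with a "meta" key (pandoc >= 1.18)
# and the older two-element array whose first item wraps the metadata.
def extract_meta(json_text)
  doc = JSON.parse(json_text)
  case doc
  when Hash then doc.fetch("meta")
  when Array then doc.first.fetch("unMeta")
  else raise ArgumentError, "unrecognised pandoc JSON layout"
  end
end

new_style = '{"pandoc-api-version":[1,17,0,5],"meta":{"title":{"t":"MetaString","c":"T"}},"blocks":[]}'
old_style = '[{"unMeta":{"title":{"t":"MetaString","c":"T"}}},[]]'

puts extract_meta(new_style).keys.inspect
puts extract_meta(old_style).keys.inspect
```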

undefined method `epub_stylesheet'

I'm having this issue with paru:

C:/abc/methods.rb:44:in `block in gen_epub': undefined method
 `epub_stylesheet' for #<Paru::Pandoc:0x0000000002a08be0> (NoMethodError)
Did you mean?  epub_chapter_level
        from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/paru-0.3.0.0/lib/paru/pandoc.rb:137:in `instance_eval'
        from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/paru-0.3.0.0/lib/paru/pandoc.rb:137:in `configure'
        from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/paru-0.3.0.0/lib/paru/pandoc.rb:107:in `initialize'

What is going on here?

The selector "Code" does not work.

When I try to run the filter

#!/usr/bin/env ruby

require "paru/filter"

Paru::Filter.run do 
    with "Code" do |n|
        warn n.string
    end
end

on the text

This is a *line* with some `code` in it

I get the error

selector.rb:105:in `expect_pandoc_type': Expected a Pandoc type, got 'Code' instead (Paru::SelectorParseError)

Using pandoc within a filter is confusing smart conversion of quotes

Using the pattern suggested in #54, I have setup a filter to be able to inject an HTML class to a blockquote element in HTML.

filter.rb:

require 'paru/filter'
require 'paru/pandoc'

def html_convert(string)
  Paru::Pandoc.new do
    from 'markdown'
    to 'html'
  end << string
end

def classed_blockquote(blockquote_class, contents)
  Paru::PandocFilter::RawBlock.new([
    'html',
    "<blockquote class='#{blockquote_class}'>\n#{html_convert(contents)}</blockquote>"
  ])
end

Paru::Filter.run do
  with 'Div.classed_blockquote' do |div|
    content = div.inner_markdown
    blockquote_class = div.attr['blockquote_class']
    div.parent.replace(div, classed_blockquote(blockquote_class, content))
  end
end

The intention here, as an example, would be to apply a class which is related to CSS presentation of poetry. Given the following markdown:

test.md

:::{.classed_blockquote blockquote_class='hanging_indent'}

‘Let 'em talk 'bout what they think they see

Let ’em talk 'bout how they see us be

'Cause baby, we got nothin' to prove

We earned our scars and put in our time
:::

I expect this output:

<blockquote class='hanging_indent'>
<p>‘Let ’em talk ’bout what they think they see</p>
<p>Let ’em talk ’bout how they see us be</p>
<p>‘Cause baby, we got nothin’ to prove</p>
<p>We earned our scars and put in our time</p>\n</blockquote>

However, I think related to my use of inner_markdown to access the node's children, the result I'm getting turns the initial quote mark the other way around, no matter what I do, even if I escape it as "\‘":

<blockquote class='hanging_indent'>
<p>’Let ’em talk ’bout what they think they see</p>
<p>Let ’em talk ’bout how they see us be</p>
<p>‘Cause baby, we got nothin’ to prove</p>
<p>We earned our scars and put in our time</p>\n</blockquote>

I've tried a few ways of individually parsing the nodes along the lines of the below snippet, as I thought perhaps due to the nesting I was confusing pandoc, but still suffer from the same issue.

children = div.children
children.each do |node|
  html_convert(node.inner_markdown)
end

Any time I attempt to convert a node to HTML from within a filter, the quote mark ends up turned the wrong way around. This doesn't happen if I simply run the markdown above without the filter (though of course the output paragraphs are wrapped in <div class="classed_blockquote" data-blockquote_class="hanging_indent"> instead).

I'm obviously approaching this in a way that is confusing pandoc's implementation of --smart, likely due to the nesting of elements or the use of 'inner_markdown', but I'm not sure how else to access a node's contents.

Should I be individually converting them to_ast/json and then transforming that into my result HTML?

Request: filter that split document onto single HTML files

Hi!
That's not an issue. I started to write my own filter based on this: https://github.com/jdittrich/SplitMarkdownFilter/blob/master/writeSplitPandocJSON.js The problem is that my Ruby skills are very limited. At the moment I'm stuck figuring out how to capture, with Paru, all the markdown content between level-one headers.

I do appreciate any help.
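
The grouping step can be sketched on pandoc's JSON AST with plain Ruby, without paru. In that format a header block looks like {"t": "Header", "c": [level, attr, inlines]}. The helper names below are invented, and writing each group to its own HTML file is left out.

```ruby
# True for a block that is a level-one header in pandoc's JSON AST.
def level_one_header?(block)
  block["t"] == "Header" && block["c"][0] == 1
end

# Start a new group at every level-one header; anything before the first
# header forms its own leading group.
def split_into_chapters(blocks)
  blocks.slice_before { |block| level_one_header?(block) }.to_a
end

blocks = [
  { "t" => "Header", "c" => [1, ["one", [], []], [{ "t" => "Str", "c" => "One" }]] },
  { "t" => "Para", "c" => [{ "t" => "Str", "c" => "First." }] },
  { "t" => "Header", "c" => [2, ["sub", [], []], [{ "t" => "Str", "c" => "Sub" }]] },
  { "t" => "Header", "c" => [1, ["two", [], []], [{ "t" => "Str", "c" => "Two" }]] },
  { "t" => "Para", "c" => [{ "t" => "Str", "c" => "Second." }] }
]

chapters = split_into_chapters(blocks)
puts chapters.map(&:length).inspect
```

Each group could then be wrapped in its own document (same meta and pandoc-api-version, different blocks) and passed to pandoc separately.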

documentation: using filters with pandoc

Hello,

I just discovered this neat project and will look into it more closely in the future. I already have one question: is it possible to use paru filters with the original pandoc command, e.g. pandoc --filter=my-paru-filter.rb?

Feature Request: with_startup

Hello,

you provide the following example code:

Paru::Filter.run do 
    metadata.delete "pandoc"
end

I added a print statement and noted that this line is actually called many times. It would be nice if it could be executed only once and if one can control if it is executed before other with "selector" blocks or after.

Manipulating metadata in a filter is cumbersome / impossible

It is not always possible to add or change the metadata in a filter. For example, I can delete a key-value from a MetaMap, but I cannot add one. Adding a simple string involves creating a MetaString, which is more complex than just setting a string. And I am not sure how to set MetaInline or MetaBlock easily, if at all.

Cannot select InlineMath in Table

Hello,

I want to write a filter to replace keywords with variables. Keywords use the same syntax as Pandoc template variables, i.e. $keyword$.

My filter-replace.rb:

#!/usr/bin/env ruby
require 'paru/filter'

Paru::Filter.run do
  with "Math" do |str|
    STDERR.puts str.inner_markdown.inspect
  end
end

My markdown file for testing:

# Title

In Paragraph: $replace.in_para$.

  In Table
  ---------
  $replace.in_table$

However, the selector filters only for the first key word in the paragraph, not the second in the table. How can I get both?

META-Metadata from Pandocomatic?

I'm writing a filter where I want to change my modification based on whether the pandoc output format is docx or latex. It would be good if pandocomatic could somehow pass this information to paru for use in filters. As far as I can tell you strip the pandocomatic_ field before invoking pandoc (I can't see it in metadata anyway). The easiest way to solve this is to keep pandocomatic_: and add the to: field from the pandocomatic.yaml template to the document metadata before pandoc gets it. That way a filter would have this info available to use.

Filter applied for a specific output format

Is it possible to target a specific output format from within a filter? Let me explain: for some reason, I want to change the URIs of all the images in a document, but only for the HTML output, not for the PDF output (keeping the original image URIs in that case).
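
Pandoc passes the name of the output format to a JSON filter as its first command-line argument, so a filter can branch on it. The sketch below shows only that decision on plain strings; the helper names and the /static/images prefix are invented for illustration, and how the new URI is written back to the image node is not shown.

```ruby
# True when the target format name starts with "html" (e.g. html, html5).
def html_target?(argv = ARGV)
  argv.first.to_s.start_with?("html")
end

# Rewrite the URI only for HTML output; "/static/images" is a made-up prefix.
def image_uri_for(uri, argv = ARGV)
  return uri unless html_target?(argv)
  "/static/images/#{File.basename(uri)}"
end

puts image_uri_for("figures/plot.png", ["html"])
puts image_uri_for("figures/plot.png", ["latex"])
```

For PDF output produced through LaTeX, the format passed to the filter is typically latex, so the original URI is kept in that case.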

documentation: add minimum ruby version

Hello,

I experienced problems with paru because some packaged ruby version is too old and does not support require_relative. I think the minimum ruby version can be specified in the gemspec file.

According to https://www.rubydoc.info/gems/require_relative/1.0.3 , require_relative needs 1.9.2 or later.

The gem require_relative provides a backport for older Ruby versions.
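
RubyGems lets a gemspec declare the oldest supported interpreter through required_ruby_version. A minimal sketch of what such a declaration could look like; the gem name, version, summary, and author below are placeholders, not paru's actual gemspec, and the 1.9.2 bound is taken from the link above.

```ruby
require "rubygems"

# Hypothetical, stripped-down gemspec; only required_ruby_version matters here.
spec = Gem::Specification.new do |s|
  s.name    = "example-gem"
  s.version = "0.0.1"
  s.summary = "Illustrates required_ruby_version"
  s.authors = ["Example Author"]
  # require_relative needs Ruby 1.9.2 or later.
  s.required_ruby_version = ">= 1.9.2"
end
```

With such a constraint, gem install refuses to install the gem on an older Ruby instead of failing later at require time.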

add_today.rb filter hangs running via pandocomatic

Hi, using the new metadata.yaml version of the add_today.rb, my pandocomatic compiles are hanging, and I have to do a CTRL+C to force break:

👉  pandocomatic --debug "SNN-Attention-V1.4.md"
^Cpandoc: Error running filter /Users/ian/.pandoc/filters/add_today.rb
user interrupt
/Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/paru-0.2.4.8/lib/paru/pandoc.rb:161:in `read': Interrupt
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/paru-0.2.4.8/lib/paru/pandoc.rb:161:in `block in convert'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/paru-0.2.4.8/lib/paru/pandoc.rb:158:in `popen'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/paru-0.2.4.8/lib/paru/pandoc.rb:158:in `convert'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/pandocomatic-0.1.4.1/lib/pandocomatic/command/convert_file_command.rb:157:in `pandoc'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/pandocomatic-0.1.4.1/lib/pandocomatic/command/convert_file_command.rb:92:in `convert_file'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/pandocomatic-0.1.4.1/lib/pandocomatic/command/convert_file_command.rb:59:in `run'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/pandocomatic-0.1.4.1/lib/pandocomatic/command/command.rb:87:in `execute'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/pandocomatic-0.1.4.1/lib/pandocomatic/command/convert_file_multiple_command.rb:81:in `block in execute'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/pandocomatic-0.1.4.1/lib/pandocomatic/command/convert_file_multiple_command.rb:80:in `each'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/pandocomatic-0.1.4.1/lib/pandocomatic/command/convert_file_multiple_command.rb:80:in `execute'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/pandocomatic-0.1.4.1/lib/pandocomatic/pandocomatic.rb:108:in `run'
	from /Users/ian/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/pandocomatic-0.1.4.1/bin/pandocomatic:3:in `<top (required)>'
	from /Users/ian/.rbenv/versions/2.4.1/bin/pandocomatic:22:in `load'
	from /Users/ian/.rbenv/versions/2.4.1/bin/pandocomatic:22:in `<main>'

other paru filters are working.

EDIT: this filter also contains a <<~ HEREDOC, but it still hangs after changing it to <<-. This was tested on Ruby 2.4.1 (installed via rbenv).

#!/usr/bin/env ruby
## Add today's date to the metadata
require "paru/filter"
require "date"

Paru::Filter.run do 
    metadata.yaml <<-YAML
---
date: #{Date.today.to_s}
...
    YAML
end

Allow multiple versions of pandoc / prepare for pandoc 2

Issue created in reaction to the pandocomatic issue: Broken with Pandoc 2.

Currently pandoc 2 is in the making. This new major version promises some API changes. Although it might be a while before it is released, it is good to be prepared for when it is. Particularly because it is not unlikely that some users will keep on using the 1.x range while others start using the 2.x range.

I suggest adding an environment variable, PARU_PANDOC_PATH, that can be used to choose which version of pandoc to use. If no such environment variable exists, or if the path does not resolve, try to use the pandoc executable in the system's PATH. Paru's pandoc API will change with the pandoc version used: some pandoc CLI options in 1.x are not there in 2.x and vice versa.
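
The lookup order suggested above could be sketched as follows. This is an illustration of the proposal only, not paru's implementation; the method name pandoc_executable is hypothetical.

```ruby
# Prefer PARU_PANDOC_PATH when it points at an executable file; otherwise
# fall back to plain "pandoc", which is then looked up via the system PATH.
def pandoc_executable(env = ENV)
  candidate = env["PARU_PANDOC_PATH"]
  if candidate && !candidate.empty?
    expanded = File.expand_path(candidate)
    return expanded if File.file?(expanded) && File.executable?(expanded)
  end
  "pandoc"
end
```

Passing the environment in as a parameter keeps the lookup easy to test with a plain Hash instead of the real ENV.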

Pandoc 2.10 and Underlines

Trying to compile a document using Pandoc 2.10 and the new underline type, I am getting an error when running a Paru filter:

pandocomatic.yaml template:

templates:
  test:
    pandoc:
      from: markdown
      to: html5
      filter:
        - ./noop.rb
    metadata:
      lang: 'EN-GB'

noop.rb filter:

#!/usr/bin/env ruby

require 'paru/filter'

Paru::Filter.run do
  stop!
end

test.md

---
title: "Underline test"
pandocomatic_:
  use-template: test
---

# Abstract #

[Lørem ipsum dolør sit amet]{.underline} , eu ipsum movet vix, veniam låoreet posidonium te eøs, eæm in veri eirmod. Sed illum minimum at, est mægna alienum mentitum ne. Amet equidem sit ex. Ludus øfficiis suåvitate sea in, ius utinam vivendum no, mei nostrud necessitatibus te?

The error I get is the following:

➜ pandocomatic -b -c pandocomatic.yaml test.md
pandoc	--from=markdown \
	--to=html5 \
	--filter=noop.rb
Error running filter noop.rb:
Error in $.blocks[1].c[0]: mempty
Error running pandoc => error while running:

pandoc	--from=markdown \
	--to=html5 \
	--filter=noop.rb

Pandoc responded with:

Error running filter noop.rb:
Error in $.blocks[1].c[0]: mempty

If I change the class of the [inline]{} to something other than underline, then it compiles without issue...

➜ pandoc -v
pandoc 2.10
Compiled with pandoc-types 1.21, texmath 0.12.0.2, skylighting 0.8.5
Default user data directory: /Users/ian/.local/share/pandoc or /Users/ian/.pandoc
Copyright (C) 2006-2020 John MacFarlane
Web:  https://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.
➜ pandocomatic -v
Pandocomatic version 0.2.7.2
© 2014—2020 Huub de Beer <[email protected]>

Pandocomatic is free software; pandocomatic is released under the GPLv3.

For more information about pandocomatic run 'pandocomatic --help' or read its
documentation at https://heerdebeer.org/Software/markdown/pandocomatic/.

➜ gem list | grep paru
paru (0.4.1)

insert_code_block.rb can't modify first element of the document

The insert_code_block.rb filter does not allow modifying the first block element of the inserted document. I attached a zip with a simple demo (it adds a silly text in capitals at the beginning of every paragraph). Check out the PDF to see the issue.

issue_insert_code_block.zip

howto: generate tables with captions

Hello,

I wrote a filter that replaces some ::paru paragraph with a markdown table. Preparing the table as markdown does not feel like a very clean solution.

Is there a way to let paru directly convert a ruby 2D array/Hash to a table? Is there support to set a caption?

Best,
Robert
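
Until there is a direct way, the markdown route described above can at least be wrapped in a small helper. A minimal sketch, assuming pandoc's pipe table syntax and a caption paragraph starting with "Table:"; the method name markdown_table is hypothetical and not part of paru.

```ruby
# Hypothetical helper: render a header row and data rows as a pandoc pipe
# table, optionally followed by a "Table:" caption paragraph.
def markdown_table(header, rows, caption: nil)
  # Escape literal pipes so they do not start a new column.
  escape = ->(cell) { cell.to_s.gsub("|") { "\\|" } }
  lines = []
  lines << "| " + header.map(&escape).join(" | ") + " |"
  lines << "|" + header.map { "---" }.join("|") + "|"
  rows.each { |row| lines << "| " + row.map(&escape).join(" | ") + " |" }
  lines << "" << "Table: #{caption}" if caption
  lines.join("\n") + "\n"
end
```

The resulting string can then be handed to the replaced paragraph in the same way the markdown workaround does today.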

`--pdf-engine-opt` can occur more than once

The pandoc option --pdf-engine-opt should be allowed to be used multiple times. This issue originates from a pandocomatic issue.
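
One way to model a repeatable option is to accept an array as its value and emit the flag once per element. The sketch below is a hypothetical stand-alone helper (pandoc_args), not paru's actual option handling.

```ruby
# Hypothetical helper: turn an options hash into pandoc command-line
# arguments, repeating an option once for every element of an array value.
def pandoc_args(options)
  options.flat_map do |name, value|
    flag = "--#{name.to_s.tr('_', '-')}"
    case value
    when true then [flag]                               # boolean switch
    when false, nil then []                             # option switched off
    when Array then value.map { |v| "#{flag}=#{v}" }    # repeatable option
    else ["#{flag}=#{value}"]
    end
  end
end
```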

Thank you!

Hi,

I just wanted to say "Thank You" for this project. I just used it to implement an include filter for pandoc, and with paru it was a dead-simple task.

Thanks again, and keep up the good work!

Steffen

Filter selection fails to match in certain scenarios

I'm not sure why, but in certain circumstances, a filter will fail to match an element that it should match.

For example, given this markdown source:

# Chapter One

## A location

This is chapter one. It has a bunch of text.

# Chapter Two

## Another location

This is chapter two. It has some more text.

## A Subsequent location

The end.

The following filter will fail to match the first two H2 headings:

require 'paru/filter'

Paru::Filter.run do
  with 'Header' do |header|
    warn "Header is #{header.inner_markdown}"
    if header.level == 1
      header.markdown = "\\chapter\*\{#{header.inner_markdown.strip}\}\\label\{#{header.attr.id}\}"
    end
    if header.level == 2
      header.markdown = "\\section\*\{#{header.inner_markdown.strip}\}\\label\{#{header.attr.id}\}"
    end
    if header.level == 3
      header.markdown = "\\subsection\*\{#{header.inner_markdown.strip}\}\\label\{#{header.attr.id}\}"
    end
  end
end

I guess the manipulation of a given node's markdown is resulting in the following node being skipped over by the filter, as the nodes only seem to be passed over when they are immediately following another node that has matched.

It's definitely related to the 'markdown' method manipulation, as if I simply run a filter like this one below, every heading node is listed:

Paru::Filter.run do
  with 'Header' do |header|
    warn "Header is #{header.inner_markdown}"
  end
end

At the moment I'm resorting to a relatively hacky check of the type of the next node in the index, and if it is also a Header I manipulate it as well. But ideally I would expect this filter to match every header node in the document.

Should I be manipulating the output of those nodes in a different way? Or is there a bug that results in a node in the index being skipped when a node has been altered?
