
Contributors: conradolandia, dlardi, suhlig, tajmone


pp's Issues

Add Railroad Diagrams Support

I was looking into the various diagram tools supported by PP, and it seems that none of them allows creating railroad diagrams.

I've searched for libraries that can generate railroad diagram images from plain text, and found a few that seem like good candidates.

Ideally, it would be great if there were a Haskell library (or some bindable binary) that could be integrated directly into PP, without requiring external dependencies (or installing languages).

If I have understood correctly, PlantUML is integrated into PP even though it is in Java (I've noticed a reference to .stack-work/PlantumlJar_c.c in the pp.cabal source file). Is it done via a C wrapper, or is it a different PlantUML binary version altogether?

Could the RRDiagram Java library (mentioned above) be integrated natively into PP?

What do you think? Is this type of diagram worth building into PP, or should it rely on some external tool?

Do you have some suggestions for Railroad Diagrams CLI tools or scripts?

I find railroad diagrams very useful, especially for visualizing regular expressions, and they might be quite useful, in general, in the context of code documentation.

Reduce Number of Tildes Produced by `!src()`

I'm working on a PP tutorial, in GFM markdown, and I'm having lots of trouble trying to represent a code block which itself contains a fenced code block produced by the !src() macro. Here is the tutorial in question:

https://github.com/tajmone/markdown-guide/blob/2f8b50a/pp/tutorial_01.md#our-first-usage-test

Because of the excessive number of tildes (70 tildes):

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.c}

... wrapping !src()'s output inside a fenced code block seems to fail in GitHub's preview. I've tried to enclose the block within fences using an even higher number of tildes, but it didn't work. Maybe there is a limit to the number of tildes (or backticks) that different markdown parsers take into account, or maybe it's just too hard to keep track of such a high number of tildes. Whatever the reason, I found it hard to handle.

Three tildes would be a much easier amount to handle: it would only require four or more tildes to wrap the example in an outer fenced code block, so that parsers don't mistake the inner tildes for the end of the outer block.

Referencing pandoc's guide:

http://pandoc.org/MANUAL.html#fenced-code-blocks

In addition to standard indented code blocks, pandoc supports fenced code blocks. These begin with a row of three or more tildes (~) and end with a row of tildes that must be at least as long as the starting row.

Then follows an example wrapped in seven tildes:

~~~~~~~
if (a > 3) {
  moveShip(5 * gravity, DOWN);
}
~~~~~~~

Then it states:

If the code itself contains a row of tildes or backticks, just use a longer row of tildes or backticks at the start and end:

and as an example it wraps a 10-tilde block in 16 tildes:

~~~~~~~~~~~~~~~~
~~~~~~~~~~
code including tildes
~~~~~~~~~~
~~~~~~~~~~~~~~~~

70 tildes are far too many, much more than required, and they pose a problem when the output needs to be enclosed in another code block, because that would require a row of more than 70 tildes/backticks. I'm not sure whether other built-in PP macros also output fenced-code tildes, but I suggest a consistent reduction of their number wherever they occur (4-10 would be a more manageable number).
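To illustrate the proposal, here is a sketch of how a lower count would nest (the C snippet is just a placeholder of my own): if !src() emitted only four tildes, an outer fence of five would already be enough to contain it, since markdown parsers only close a fenced block on a row at least as long as the opening row.

```markdown
~~~~~

~~~~ {.c}
int main(void) { return 0; }
~~~~

~~~~~
```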

PP Tutorial

Hi Christophe,

just wanted to let you know that I've published a first draft of my PP tutorial:

https://github.com/tajmone/markdown-guide/tree/master/pp

It could still be extended, but it's a good starting point for absolute novices. (It might not be over-polished right now; I didn't have time to fully revise it, but I thought it was best to just publish it.)

In the future (time permitting) I'm planning to add more examples and also some custom scripts (eg: for redirecting source code to external syntax highlighters, etc.).

The tutorial is public domain, so feel free to borrow from it at will. I've decided to make the project public domain because I don't want people to struggle with quotations, or to constantly feel obliged to quote the source. Just take and use what you like.

B.R.

Download from Homepage is missing tag.sh

As advised by the docs I downloaded pp from http://cdsoft.fr/pp/pp.tgz and extracted it; yet make failed with

/bin/bash: ./tag.sh: No such file or directory
make: *** No rule to make target `.stack-work/PlantumlJar_c.c', needed by `/Users/suhlig/Downloads/pp/.stack-work/install/x86_64-osx/lts-9.1/8.0.2/bin/pp'.  Stop.

Looks like the tarball does not include tag.sh. Cloning the git repo worked fine, so I guess the tarball is incomplete?

Include CSV file as markdown table

It would be nice if pp could include a CSV file as a markdown table, the way iA Writer does.

Markdown tables are painful to write manually. On the other hand, spreadsheet editors like Numbers or Excel make it easy to create tables with complex formulas. A long-time standard for exporting these tables is the comma-separated value format (csv). You can compose a table in Excel, and then have all the calculations exported in a plain-text csv. And now you can reference a csv file in iA Writer the same way as images:

Using the first prototype simulating said behavior, we knew this was the way forward. It’s hard to describe how transclusion feels other than it just feels right. Beginning with a simple, straightforward syntax kept reaping dividends.
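Until such a feature exists, this can be roughly approximated in user land by shelling out to awk. A hedged sketch (the `csvtable` macro name and the awk one-liner are my own untested assumptions; it handles only simple comma-separated files without quoted fields, and requires awk on the PATH):

```
!define(csvtable)
(!exec(awk -F',' '{ printf "|"; for (i = 1; i <= NF; i++) printf " %s |", $i; print ""; if (NR == 1) { printf "|"; for (i = 1; i <= NF; i++) printf " --- |"; print "" } }' !1))

!csvtable(data.csv)
```

The first CSV row becomes the header, followed by the `| --- |` separator row that GFM pipe tables require.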

plantuml.jar no longer available

When I download pp.tgz and run make, I receive the following error:

wget http://sourceforge.net/projects/plantuml/files/plantuml.jar -O .stack-work/Plantuml.jar
--2018-02-27 11:54:30--  http://sourceforge.net/projects/plantuml/files/plantuml.jar
Resolving sourceforge.net (sourceforge.net)... 216.105.38.13
Connecting to sourceforge.net (sourceforge.net)|216.105.38.13|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://sourceforge.net/projects/plantuml/files/plantuml.jar [following]
--2018-02-27 11:54:31--  https://sourceforge.net/projects/plantuml/files/plantuml.jar
Connecting to sourceforge.net (sourceforge.net)|216.105.38.13|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-02-27 11:54:33 ERROR 404: Not Found.

Makefile:133: recipe for target '.stack-work/Plantuml.jar' failed
make: *** [.stack-work/Plantuml.jar] Error 8
make: *** Deleting file '.stack-work/Plantuml.jar'

Why pp?

@CDSoft, do you mind explaining why someone should use your pp preprocessor over gpp and pandoc-gpp?
What are the unique features?
Why did you start pp instead of continuing to use gpp?
When should someone use gpp (in cases where pp is not intended to be as powerful)?

Question About PP Errors & Warnings

I'm working on a binary flat-file CMS that uses PP and pandoc to create HTML documentation (or websites) from markdown source files.

I need to fine-tune the handling of PP/Pandoc errors and warnings.

Could you provide me some details on how PP reports errors and warnings?

I'm currently capturing the Exit Code and STDERR messages separately from the PP and pandoc processes.

What values can I expect for the Exit Code? (ie: is it just a boolean error/no-error value, or are there error-specific values returned?)

Are there only fatal errors, or can I also expect non-blocking warnings? (ie: warnings for which my app should not abort operations, but should inform the user.)

Could you provide some simple examples of how to raise errors and warnings, both by messing up the command-line arguments and the markdown source macros, so I can test whether my app handles them correctly? (eg: trying to redefine a built-in macro.)

Is there a test file for errors and warnings in PP's source?
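For my own testing, one way to provoke a fatal error might be calling a built-in macro with the wrong arity, which (judging from the arity errors quoted elsewhere on this tracker) makes pp abort with a message on STDERR; whether the exit code is then non-zero is exactly what I'd like to confirm:

```
!dot(only-one-argument)
```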

PS: I've looked at the code in ErrorMessages.hs, but I couldn't gather much info from it due to my lack of Haskell knowledge.

Include Partial Files Functionality

It would be nice to have a macro to include only specific lines from an external file. I was thinking of something along the lines of AsciiDoctor's include directive.

It allows specifying multiple lines and/or line ranges. Something like:

!include(FILENAME)(10-15,18,21)

AsciiDoctor also allows to include file regions defined in the source doc via tags:

... region tags are just paired open/close tags in commented lines (using the target language's native commenting convention), formatted in a way that AsciiDoctor can recognize during parsing. The commented lines are then omitted from the final emitted text, as well as from line-counting operations.

This last feature might be excessive, but it proves the point that when dealing with external source code files there is often a need to select specific parts of the code from the same file, instead of having to either split the file into multiple parts or copy and paste those parts into the markdown/rst source.

This would make maintenance of code documentation much easier, especially the tagged-regions feature, since it guarantees that the targeted include region stays the same even if the code changes. It would, for example, allow commenting on a long source file by including specific line ranges, and having the doc reflect any changes made to the original file.

Currently, I'm achieving this via a PP macro that invokes AsciiDoctor on the external source file. A small CLI binary could be created to achieve the same without requiring users to install Ruby and AsciiDoctor. But having this functionality built into PP would improve speed, because it could be handled without PP temporarily handing over control to an external tool/script.

Besides adding a new macro for this purpose, some of the existing built-in macros that deal with code and external files could (optionally) integrate this feature.

These features would also augment PP's support for literate programming.

What's your view on this? Is it worth building in, or should it be handled as a custom macro via third-party tools/scripts?
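In the meantime, here is a minimal user-land sketch of the line-range part, via a macro that shells out to sed (the `includelines` name is hypothetical and the snippet is untested; the second argument is passed straight through as a sed address, so multiple ranges would need sed's `;`-separated address form):

```
!define(includelines)(!exec(sed -n '!2p' !1))

!includelines(src/main.c)(10,15)
```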

Are there logical operators?

Say I have two different symbols: S1 and S2.
If S1 or S2 is defined, then output TEXT.

How do I do that, while writing TEXT only once?
I know I can use this:

!ifdef(S1)(TEXT)
!ifdef(S2)(TEXT)

But if TEXT gets very large or there are many symbols, then something like this would be much better:
!ifdef(S1|S2)(TEXT)

Similarly, this question applies to other operators as well, notably the AND operator. The same problem occurs when using !ifeq.
Or is there some other way of achieving the above example? At least I can't see one.
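For what it's worth, a sketch of how OR can be expressed today with nesting, assuming !ifdef takes an else branch as described in the pp documentation, and defining TEXT once as a macro so it isn't repeated (untested):

```
!define(TEXT)(the shared output, written only once)

!ifdef(S1)(!TEXT)(!ifdef(S2)(!TEXT))
```

The AND case would nest in the then-branch instead: `!ifdef(S1)(!ifdef(S2)(!TEXT))`. Still not as readable as a hypothetical `!ifdef(S1|S2)(TEXT)`, of course.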

Question !def

I am using pp to process md files.
In my markdown I am using multiple !def one under another like that:
!def{X}{Y}{Z}
!def{A}{B}{C}
...
...

After pp processes this file, in place of the lines with !def I get multiple empty lines.
Is this intended? How can I prevent it?
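A possible workaround sketch, assuming !quiet discards the output of its argument as it appears to do in other pp macro examples (untested):

```
!quiet(
!def{X}{Y}{Z}
!def{A}{B}{C}
)
```

This should swallow the newlines between the definitions along with any other output.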

Feature Request: !rem() Macro

A new built-in macro, !rem(COMMENT TEXT), that does nothing and discards COMMENT TEXT.

Useful for organizing macro code with comments, memos, and copyright/license information.

Why? Because markdown doesn't support comments, and using HTML comments (<!-- ... -->) inside macros will cause them to be emitted when the macros are preloaded; this pollutes the final output with comments that should stay bound to the macros themselves, and causes problems with non-HTML output.

Since the last parameter supports fences by default, !rem could be used to create nice headers in macro files, with version info, links, etc. Just like with any shared code snippet.
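To illustrate the proposal, a macros-file header might then look like this (hypothetical, since !rem does not exist yet):

```
!rem(
====================================
my-macros.pp, version 1.0
License: MIT
====================================
)
```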

!env() broken in PP 1.3 for Windows

After the update to v1.3, !env() stopped working in many cases. Often, when it is invoked (eg: !env(Path)) it reports:

pp: Arity error: cmd expects 1 argument
CallStack (from HasCallStack):
  error, called at src\ErrorMessages.hs:40:27 in main:ErrorMessages

(I've tested it with scripts that previously worked with v1.2)

It's not the letter-casing bug creeping back in (#5): it always produces the error, regardless of casing.

I've also tested it with custom-defined environment variables. It doesn't always raise this error; it depends on the contents of the variable. Could it be a problem with handling strings containing spaces or other special characters (like those found in paths)?

Or maybe it's because the %PATH% is too long? (but it used to work before)

Could you add a for-construct?

Would it be possible to add support for for-loops? E.g.,

!for(example)(!code.gui)
### Example: GUI - ${example.title}
```scala
!example.src
```
!endfor

This would be fed with data from a YAML file, e.g.,

code:
  gui:
    traditional:
      title: Traditional Approach
      src: |
        def ...
        val ...

    alternative:
      title: Alternative Approach
      src: |
        def ...
        val ...
  
  controller:
     ...

The result would then be like:

### Example: GUI - ${Traditional Approach}
```scala
        def ...
        val ...
```

### Example: GUI - ${Alternative Approach}
```scala
        def ...
        val ...
```

Apart from the for-construct itself, I am requesting support for:

  • YAML input, including repeated nodes. Maybe this would be possible with yml2env, a "command-line, cross-platform and single-binary tool to read environment vars from YAML files" (https://github.com/EngineerBetter/yml2env), as referenced in #9 (see also #25)
  • field access, as in !code.gui. Maybe that would introduce too much special syntax; then hopefully a notation like !field(code)(gui) would be feasible.

Add !execwait Macro

In some edge cases, when chaining multiple !exec calls to external tools, an !exec macro call won't work unless the previous !exec task has finished (if accessing the same file, the previously called tool might still have a blocking handle on it; if generating file contents/temporary files, the file might not be ready for consumption).

For this reason, it would be good to have an !exec variant that does not carry on parsing the document until the invoked task has exited. A possible name: !execwait.

If this new macro could also capture (cross-platform) the exit code of the invoked command in !exitcode (a read-only built-in symbol), it would be great.

Of course, all this could be achieved via scripts, but when dealing with lots of small tasks it would be handier to have a built-in solution, and in some cases it might simplify cross-platform macros.

`!env(VARNAME)` Fails under Windows

I've tried all sorts of combinations, but I can't get !env() working (Win10 x64).

None of these worked:

!env(PATH)
!env(%PATH%)
!env(%%PATH%%)
!env($PATH)
!env($PATH$)

If I understood correctly, it should allow access to the env vars of the current session (eg: within CMD):

!env(VARNAME)
pp preprocesses and emits the value of the process environment variable VARNAME.

It would be really nice to have this feature working under Windows.

Implementing `!exec()` and `!rawexec()` for Windows

On Windows, trying to use !exec() or !rawexec() raises the following error (and exits PP):

pp: Error while executing `sh C:\Users\PK6A24~1.DIC\AppData\Local\Temp\pp.4118467.sh`: sh: createProcess: does not exist (No such file or directory)
CallStack (from HasCallStack):
  error, called at src\Preprocessor.hs:543:19 in main:Preprocessor

I can't help noticing that the error reports a filepath using short names (PK6A24~1.DIC is the short name of that folder), and the reference to sh (a Unix shell?).

I couldn't tell from the PP docs whether this command is intended for Linux only or for any shell (including the Windows CMD), but it would be nice to see it implemented on Windows, also for script portability.

NOTE: I have Win 10 x64, and I'm running PP in a CMD shell with normal privileges.

Add !append Macro

Feature request:

!append(SYMBOL)(TEXT)

... which would allow incrementally defining symbols. It would be useful for creating ad hoc text for documentation pages, exploiting conditional statements and OS-specific commands to build SYMBOL step by step; in the final document it can then be emitted with a single !SYMBOL call.

Currently this is difficult to achieve via custom macros, because it would involve nesting temporary macro definitions (previous TEXT, new TEXT containing the previous one, etc.), and so far I haven't managed to find a way to achieve it without running into infinite loops or crashes.

Unlike literal code, the aim of this macro is to append TEXT which might contain other SYMBOLS or MACROS.
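To illustrate the proposed semantics (hypothetical, since !append does not exist), combined with the conditionals and OS-specific commands mentioned above:

```
!define(INSTALL)()
!append(INSTALL)(Download the package. )
!ifeq(!os)(windows)
(!append(INSTALL)(Then run setup.exe.))
(!append(INSTALL)(Then run make install.))

!INSTALL
```

The final !INSTALL call would emit the accumulated text, with any embedded symbols or macros expanded.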

PP-Macros Library

I finally managed to publish my first collection of pp-macros:

The pandoc template preview is built using the pp macros, so it also serves as a good example.

There are other macros almost ready; they just need some polishing before I push them.

Hopefully, this could be the beginning of a collaborative effort to create a library of cross-platform, cross-format pp macros. At the moment, the macros are for HTML output only, and a couple of them work on Windows only (because of the !cmd macro), but I'll soon try to fix them so they are OS-aware and use the right script macro.

As with all projects ... creating and polishing the minimum README files, drafting some documentation, and checking that all third-party licenses were honored took up more time than actually writing the macros or the template. But now that this is out of the way, progress should be faster.

Release to Hackage

Would you mind uploading this to Hackage? That would make it particularly easy to install on NixOS.

Extend `!undef` to Accept Multiple Parameters/Symbols

I've discovered that if I create a macro that takes optional parameters, and the macro is then invoked without the last optional parameters (eg: \3 and \4) at all (ie: not even with empty values), the corresponding parameters of a previous macro call might creep in.

For example:

!firstmacro(param1)(param2)(param3)(param4)
!secondmacro(param1)(param2)

... where in !secondmacro params \3 and \4 are optional, and checked with !ifdef(3)(...) and !ifdef(4)(...), and in the above example the macro will see the 3rd and 4th params previously passed to !firstmacro.

A solution would be to always call the macro with empty values for unused parameters:

!firstmacro(param1)(param2)(param3)(param4)
!secondmacro(param1)(param2)()()

But to make life easier in actual usage, a more elegant solution seems to be undefining, at the end of each custom macro, all the possible parameter symbols used by it:

!undef(4) !undef(3) !undef(2) !undef(1)

I think that !undef's syntax should allow an unlimited number of parameters, to simplify destroying multiple symbols at once:

!undef(4)(3)(2)(1)

... it would be friendlier and neater.

Add macro for assigning the expansion of some text to a symbol

This is sort of a follow-up on #29.

To be able to specify a required argument in code-block format, I recently wrote a macro whose first argument is optional, handled by checking the definedness of what would be the last argument if the optional argument were present:

!foo[(OPTIONAL)](REQUIRED_A)(REQUIRED_B)
~~~~~~~~~~~~~~~~~~~~~~~~
REQUIRED_C
~~~~~~~~~~~~~~~~~~~~~~~~

However, inside the definition of !foo I have to repeatedly say !ifdef(4)(!2)(!1) etc. when referring to any of the arguments. What I would like to be able to do is assign the text of a parameter to a symbol; however, !def(__req_a)(!ifdef[4][!2][!1]) doesn't work, because inside a definition the number symbols must refer to the arguments of the macro being defined.

The solution would be a macro !assign(SYMBOL)(TEXT) which makes !SYMBOL expand to the expansion of TEXT at the time !assign was called. Then one could say things like !assign(__foo)(!ifeq[3][yes][!1][!2]) and thereafter use !__foo instead of !ifeq(3)(yes)(!1)(!2), which hopefully makes the code more readable and more maintainable.

Valid Macro's Name Chars?

I've looked at the source code but couldn't find any references (RegEx, etc) to what constitutes a valid macro name.

Which characters can be used in a macro's name? I assume the usual a-zA-Z0-9_-, but what about special chars (like: $#@;: and so on)?

If I've understood correctly, macro names are case sensitive; therefore !hello and !Hello would be two distinct macros.

And (I assume) user-defined macros can't take the name of a built-in macro (at least not with the same letter casing).

Are these last two assumptions correct?

Changing graph type does not regenerate graphs

Example (I added a \ before the tilde lines due to GitHub code formatting issues):

Let's say I have a \dot graph that generates x.png:

\dot(x)()
\~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
digraph {
    A -> B
    B -> C
    C -> D
}
\~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If I change \dot to \neato (or any other graph type), without changing the body of the graph, the graph will not be regenerated.

\neato(x)()
\~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
digraph {
    A -> B
    B -> C
    C -> D
}
\~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The above snippet will not trigger a regeneration.

I suspect that this check in dpp.c only checks whether the block has changed or whether the file doesn't exist. Information about the command used needs to be saved in the block as well.


EDIT: it seems that the block created by saveblock is passed directly to the GraphViz executable, so storing command information inside it would be problematic. A possible suggestion would be to create an additional "dpp metadata file" that contains the command information, and to check that file for differences. I think it's also a future-proof solution in case additional parameters (such as forcing a specific size for images) are added to dpp commands.


EDIT2: I just noticed that dpp.c is unused in pp.hs, as the entire functionality was reimplemented in Haskell. I skimmed through pp.hs and the problem seems to be the same: only the block forwarded to the diagram-generating runtime is saved and checked for differences, not the command name itself.

i18n: IT Localization

Is it possible to add support for the Italian locale to pp?

Here's my adaptation of the locale date-format settings for Italian, based on the French ones in Preprocessor.hs:

-- italian locale date format
myLocale "it" = TimeLocale {
                    wDays = [("domenica","dom")
                            ,("lunedì","lun")
                            ,("martedì","mar")
                            ,("mercoledì","mer")
                            ,("giovedì","gio")
                            ,("venerdì","ven")
                            ,("sabato","sab")],
                    months = [("gennaio","gen")
                             ,("febbraio","feb")
                             ,("marzo","mar")
                             ,("aprile","apr")
                             ,("maggio","mag")
                             ,("giugno","giu")
                             ,("luglio","lug")
                             ,("agosto","ago")
                             ,("settembre","set")
                             ,("ottobre","ott")
                             ,("novembre","nov")
                             ,("dicembre","dic")],
                    amPm = ("AM","PM"),
                    knownTimeZones = [],
                    dateTimeFmt = "%a %e %b %Y, %H:%M:%S %Z",
                    dateFmt = "%d/%m/%y",
                    timeFmt = "%H:%M:%S",
                    time12Fmt = "%I:%M:%S %p"
                }

In Italian, weekday and month names are lowercase, unlike in English.

I've adapted the dateTimeFmt definition to the Italian way of representing long dates.

Sorry if my contribution has to go through copy-&-paste, but I don't have Haskell installed, nor can I work my way into that language, so I'd rather avoid messing up the source by committing changes I don't fully understand and can't test thoroughly.

How to define a macro that accepts link_attributes?

Thank you for the help and for the tool.

I am trying to define a macro that accepts link_attributes (thinking of width=30%), but how do I do that?

The following macro:

!define(figure)
(
\begin{figure}
\includegraphics{!2}
\caption{!1}
\end{figure}
)

Naturally, it breaks the preprocessing if it is called with:
!img(Exports and Imports to and from Denmark & Norway from 1700 to 1780)(source/figures/exports-imports.png { width = 20% } )

  • How do you suggest I should go about adding this functionality to the above macro?
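One possible direction, sketched and untested: pass the attributes as a separate optional third parameter and emit them only when defined, so the image-path argument never has to carry braces or spaces. The parameter layout, and the translation of width=30% into width=0.3\textwidth on the LaTeX side, are my own assumptions:

```
!define(figure)
(
\begin{figure}
\includegraphics[!ifdef(3)(!3)]{!2}
\caption{!1}
\end{figure}
)

!figure(Exports and Imports to and from Denmark & Norway from 1700 to 1780)(source/figures/exports-imports.png)(width=0.3\textwidth)
```

When the third argument is omitted, this emits an empty options list, \includegraphics[]{...}, which LaTeX accepts.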

R support?

Are there any plans for R programming-language support? R allows for very complex graphics capabilities and would be a nice addition!

Thanks

Q: Handling OS Awareness in Macros?

Some of the macros in the library I'm building rely on OS-specific macros like !cmd.

I would like to make the macros library cross-platform, and create a custom macro like !clicommand that would use the appropriate command invocation macro for the host OS (!bash or !cmd).

How can I check the OS using pp macros? Are there some env-vars I could evaluate that would allow establishing with 100% certainty which OS the macro is being run under?

Or should I set an environment variable from the calling batch/shell script (eg: GuestOS) that can then be queried from the macros? (ie: I know that the macros library will be invoked by different scripts on Linux and Windows.)

Which solution would be best for both worlds (*nix & Win)?

When dealing with external command line tools that are available on both Linux and Windows (eg: Highlight), it would be convenient to have an agnostic command line invocation macro capable of behaving as either !bash or !cmd, according to host OS.
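A sketch of that agnostic macro using pp's built-in !os symbol (the same check used by other macros quoted on this tracker); untested, and it assumes the command string passed as !1 is acceptable to both !cmd and !bash:

```
!define(clicommand)
(!ifeq(!os)(windows)
(!cmd(!1))
(!bash(!1)))

!clicommand(highlight --version)
```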

!usermacros: Omit Macros Whose Name Starts with Underscore

When building a set of macros, I sometimes also add some "internal" macros, ie: macros not intended to be used by end users, but only by the other macros, for secondary tasks. Usually I give these internal macros names starting with an underscore, to distinguish them from the actual end-user macros.

I was thinking that it would be nice if !usermacros and !userhelp ignored macros whose names start with an "_". This would help keep the auto-generated documentation clean, without showing certain macros. My guess is that usually no one would use such a naming convention unless it was some sort of internal-usage stuff, so it shouldn't interfere with most existing macro collections.

Intricate Interactions With External Tools

(EDITED: fixed links to point to a specific commit in the master branch, because the dev branch was merged in and deleted.)

Today I struggled quite hard to create a macro that takes a block of code, passes it to an external source-highlighting tool, and then emits the final raw HTML back into the doc. Maybe something can be done to make such cases easier...

Here is the final macro (still in a dev branch):

and here is some test code to see how it actually works:

The syntax of the macro is:

!raw{!Highlight(LANG)(OPTIONS)
~~~~~
CODE
~~~~~
}

... taking the following parameters:

  • LANG (mandatory) — The language of the source (eg: HTML, Python, etc.).
  • OPTIONS (can be empty) — Further options to pass to Highlight during invocation. If none desired, just pass empty value.
  • CODE (mandatory) — The block of source code to syntax-color.

NOTE: This macro creates and deletes a temporary file (named "_pp-tempfileX.tmp", where X is a numeric counter) in the macros folder (/pp/macros/) for each macro call in the document, to temporarily store the code to highlight. The X counter is reset at each PP invocation.

This is how I managed to create the macro:

!define(Highlight)(
!add(HLCounter)
!quiet[!lit(!env(PP_MACROS_PATH)_pp-tempfile!HLCounter.tmp)()(\3)]
!quiet[!flushlit]
<pre class="hl"><code class="\1">!exec[highlight.exe -f -S \1 --no-trailing-nl --validate-input !ifdef(2)(\2) !env(PP_MACROS_PATH)_pp-tempfile!HLCounter.tmp]</code></pre>
!ifeq[!os][windows]
[!exec(DEL !env(PP_MACROS_PATH)_pp-tempfile!HLCounter.tmp)]
[!exec(rm !env(PP_MACROS_PATH)_pp-tempfile!HLCounter.tmp)]
)

!define(HLCounter)(0)

The problem I encountered was that I couldn't find any direct way to pass the CODE parameter to the Highlight application by feeding it via STDIN. So I resorted to a temporary file stored in the macros folder (its path is already in an env-var, because some macros need it to find CSS definitions). It's not the most elegant solution, but it works fine (the temp files are deleted by the macro itself).

As for the counter ... the problem was that there is no macro to reset a literate file, and even if I used !flushlit, at each invocation of this macro the new CODE parameter would be appended to that of the previous invocation; with each invocation of the macro, the source code piled up. This is why I resorted to a counter, so each invocation of the macro uses a different filename for the !lit macro.

It would be useful to have some macro that resets/destroys a literate file in PP's memory, to avoid situations like this one. Ideally, I would have liked the macro to use the same file at each invocation, but to forget about it afterwards.

The file deletion works on Windows; I haven't had a chance to test it on Linux yet. I'm using a simple check: if the OS is Windows, use the CMD DEL command, else use the Bash rm command to delete the file. I assumed this should work for both Linux and Mac. Does it?

Was there a simpler way to accomplish this task, that I might have overlooked?

Are there any PP enhancements and new features that could make similar interactions with external apps easier, without having to rely on a temporary file?

It would be really great if there were a way to have pp call an external shell/cmd tool, invoke it with options, and pass it some text so that the other app sees it coming from STDIN. Some piping is required, and I have no idea how easy or difficult this would be in Haskell.

Macros to List All Built-In and User-Defined Macros

Issue #40 made me think that if the \ syntax were to be dropped, I'd need to do some RegEx search-&-replace on all my project folders to adapt to the new syntax. In SublimeText this is rather easy to achieve using file patterns (covering both markdown files and PP definition files; I use "*.pp" for them).

The painful part would be to write the RegEx to capture all the built-in macros and the custom definitions.

This made me think that a built-in macro that emits a list of all the built-in macros, and another that emits a list of all the user-defined macros, could be handy: not only for the above-mentioned task, but also for maintaining documentation on PP or custom macros.

Something like this:

A !list-builtin-macros, that would emit:

def
define
[.. etc ...]
src

(covering both short and long version of each macro syntax)

And a !list-user-macros, that would emit:

MyMacro
AnotherMacro
[.. whatever ...]

The latter could be invoked after importing all the macro definition files, and would then emit the full list of user-defined macros (for example, in the project I'm currently working on, I have quite a lot of custom macro definitions, and they are invoked hundreds of times in the markdown sources).

Both lists could then be used to easily create the needed search and replace RegEx:

\\(def|define|...list of macros...|src)

... with a replace pattern:

!$1

... but I can envisage other uses also.
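With the two lists in hand, the rewrite itself could be a one-liner. A sketch with GNU sed (the sample file and the three-macro list are hypothetical):

```shell
# Build a tiny sample, then rewrite \macro calls to !macro calls.
printf '%s\n' 'Use \define(x) and \src here' > sample.md
sed -E -i 's/\\(def|define|src)\b/!\1/g' sample.md
cat sample.md
```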

Add PowerShell Support

I think it would be worth adding support for PowerShell via a built-in macro (!ps, !powershell or !posh) as an alternative to !cmd.

PowerShell is far more powerful and flexible than CMD, ships with many Unix-like commands (ls, etc.), and there are lots of third-party scripts available to emulate Linux commands (touch, etc.). This could simplify creating cross-platform macros, because it would allow some common command syntax to be shared (eg: an ls command would also work with !ps, whereas !cmd requires DIR).

PowerShell scripts are less clunky and more user-friendly than batch files, and — most important — the PowerShell environment doesn't suffer from CMD's lack of full Unicode/UTF-8 compliance (many CMD commands break Unicode in piping operations, as not all of them are fully Unicode-compliant yet, and there is the usual code-page encoding nightmare).

Installation in Ubuntu complains about stack

How can I circumvent this compile error?

#### converting .stack-work/Plantuml.jar to C
stack tools/blob.hs .stack-work/Plantuml.jar
/bin/bash: stack: comando não encontrado
Makefile:127: recipe for target '.stack-work/PlantumlJar_c.c' failed
make: *** [.stack-work/PlantumlJar_c.c] Error 127

I can't find stack with apt; is there a place where I can obtain it?

Latex \dot vs Built-in Macro \dot

In my markdown I use raw inline LaTeX commands, like this: \(\dot V\). I guess this is pandoc's markdown extension "tex_math_single_backslash".
Now, when using pp, I get an error message, because pp treats \dot as a built-in macro.
This is the error message:

pp: Arity error: dot expects 2 or 3 arguments
CallStack (from HasCallStack):
  error, called at src/ErrorMessages.hs:49:27 in pp-1.12-JZaGJL3q7GE2Y6zH7xiUqv:ErrorMessages

How can I overcome this issue?
Can the "\" notation be disabled, so that only the "!" notation is used?
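If I read the newer pp docs correctly, later releases added a !macrochars builtin that restricts which characters introduce a macro call. Assuming your version has it, something like this near the top of the document should make pp leave \dot alone (check your version's manual first):

```
!macrochars(!)
```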

Question About !comment Macro

Hi, I wanted to ask whether the !comment macro has any impact on a macro's performance — ie: in my macro definitions I'd like to keep a comment block with a description and usage instructions inside the definition itself, like this:

!define(Highlight)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!comment
````````````````````````````````````````````````
USAGE:

	!Highlight(LANG)(EXTRA HIGHLIGHT OPTIONS)(CODE CLASS)
	~~~~~~~~~~~~~~~~
	BLOCK OF SOURCECODE
	~~~~~~~~~~~~~~~~
````````````````````````````````````````````````
<pre class="hl"><code class="!ifdef(3)(\1 \3)(\1)">\sh
````````````````````````````````````
cat <<EOF | highlight -f -S \1 --no-trailing-nl --validate-input !ifdef(2)(\2)
\4
EOF
````````````````````````````````````</code></pre>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The question is whether PP stores the defined macro with all comments stripped off (ie: as if I had written it without the comments), or whether the comments are kept as part of the macro definition and parsed again at each occurrence/invocation of the macro in the doc.

I'm sure this wouldn't have a significant performance hit anyway; I'm just curious about it.

Build failure on macOS 10.13.1

stack build
pp-2.1.3: configure (lib + exe)
Configuring pp-2.1.3...
pp-2.1.3: build (lib + exe)
Preprocessing library pp-2.1.3...
Cabal-simple_mPHDZzAJ_1.24.2.0_ghc-8.0.2: can't find source for Tag in src,
.stack-work, .stack-work/dist/x86_64-osx/Cabal-1.24.2.0/build/autogen

--  While building package pp-2.1.3 using:
      /Users/tolink/.stack/setup-exe-cache/x86_64-osx/Cabal-simple_mPHDZzAJ_1.24.2.0_ghc-8.0.2 --builddir=.stack-work/dist/x86_64-osx/Cabal-1.24.2.0 build lib:pp exe:pp --ghc-options " -ddump-hi -ddump-to-file"
    Process exited with code: ExitFailure 1
make: *** [/Users/tolink/Downloads/pp-2.1.3/.stack-work/install/x86_64-osx/lts-9.11/8.0.2/bin/pp] Error 1

!cmd() Not Working on v1.2

The documentation says that !bat() is deprecated in favour of !cmd(), but with the latest build (PP v1.2) the former works and the latter fails.

Is it due to the fact that the latest build doesn't implement it?

Could you release binaries for the updated version?

Thanks

Calling macro B with arguments of macro A as arguments?

Is it possible to have one macro call another macro with the arguments of the outer macro as arguments of the inner macro, something like !def(A)(!B(!1)) followed by !A(C), where C is passed on to !B? It seems this causes an infinite loop. What am I doing wrong? I'm guessing that !1 in !B(!1) doesn't expand into the first argument of !A but tries to expand into the first argument of !B.
Compare LaTeX where \newcommand{\StrEmph}[1]{\emph{\textbf{#1}}}\StrEmph{C} gives you a bold italic C.

Add "md" and "rst" output formats

Currently PP supports md and rst as input formats (dialects), and html, pdf, odt, epub and mobi as output formats (formats).

I propose that md and rst also be added to the output formats, and maybe even gfm (for the GitHub markdown dialect). The reason is that we often reuse the same page (or snippet) for building HTML documentation as well as the repo's README (or even the Wiki pages). The CDSoft website could be an example of such usage, since it seems to share some common documents with its GitHub repos.

So it wouldn't be rare to also "convert" a document from markdown to markdown (or from ReST to ReST, or ReST to markdown). In such cases, macros should be aware of the output format and behave accordingly. An example would be a macro that does nothing if the output is markdown or ReST, or that produces markdown instead of HTML, etc.

Other times it might have to do with importing different snippets depending on the output --- for example, blocks containing reference links definitions.

I think that gfm might be a candidate too, because in PP's workflow we'd usually take advantage of pandoc-flavored markdown in the source document (because of its added benefits), but the popularity of GitHub will often dictate converting the final README to GFM --- which both offers added features (like task lists) and lacks some others (like footnotes).

Add !space macro

Currently PP strips leading and trailing whitespace from bracketed/braced parameters. This can cause some struggle with parameter-based conditional text in contexts where spaces can't be represented as HTML or Unicode entities (eg: inside HTML tags, or inside verbatim text).

I propose adding a built-in macro that emits blank space: !sp[ace].

Additionally, it could take an optional parameter specifying a number of spaces (eg: !sp(8) to emit eight spaces). This could be handy in verbatim blocks.
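As an aside, a parameterized run of spaces is easy to produce in shell-based macros today; a sketch of what !sp(8) would expand to, using printf's field width:

```shell
# Emit exactly eight spaces (bracketed here only to make them visible).
printf '[%8s]\n' ''
```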

Examples

Let's say I want to create a !PRE macro that encloses some text within <pre> tags, with an optional first parameter for specifying a class:

!def(PRE)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<pre!ifne(\1)()( class="\1")>\2</pre>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

\PRE()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Some   verbatim   text.
  Spacing preserved.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

\PRE(someclass)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Some   verbatim   text.
  Spacing preserved.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The problem here is that the leading space in ( class="\1") is lost in the final output (ie: "<preclass="):

<preclass="someclass">Some verbatim text. Spacing preserved.
</pre>

So, the easiest workaround seems to be this:

!def(PRE)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<!ifeq(\1)()(pre)(pre class="\1")>\2</pre>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

... which works, but isn't really elegant because of the repetition of "pre".

The !space macro would solve the problem in a more elegant way:

!def(PRE)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<pre!ifne(\1)()(!spclass="\1")>\2</pre>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

One could argue that in this example it would have been possible to just add a space after the "<pre" in the definition of the first example:

!def(PRE)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<pre !ifne(\1)()( class="\1")>\2</pre>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

... but this would produce a <pre > tag when no class parameter is specified, and might cause problems if the final HTML is to be parsed by some script that has strict expectations — and in other contexts spaces might be more critical.

Add Escape String Macro

I've encountered an issue when using the !bash macro under Windows (Git Bash):

Git Bash for Windows accepts both Windows- and Unix-like paths (ie: the dir separators / and \). This makes it possible to invoke Windows commands and console apps from Bash and Bash scripts.

But in the context of a !bash macro, when I emit a path from an env var (!env(SOME_PATH_VAR)) under Windows' Git Bash, the path's backslashes need escaping!

It would be useful to have a built-in macro that handles escaping path strings — eg:

  • !esc(D:\some\path\) » D:\\some\\path\\

... escaping whatever needs to be escaped in a Windows path being passed to !sh or !bash!

... and maybe its counterpart, to unescape strings.

I think this is worth implementing because (as you advised me) using Bash instead of Windows' CMD allows not only creating cross-platform macros, but also finer and better support tools. In many cases, a PP macro will need to access a path already set in an env var by either the system or some other app.

One can resort to weak quoting, or use some shell command to achieve this, but a built-in macro would be easier, less verbose, and maybe add less overhead. My guess is that there could be many contexts in which such a macro comes in handy (including Win's CMD context).
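For illustration, here is the escaping the proposed !esc would perform, expressed with sed as one might do inside a !sh macro today (the variable name is hypothetical):

```shell
# Double every backslash in a Windows-style path.
SOME_PATH_VAR='D:\some\path\'
escaped=$(printf '%s' "$SOME_PATH_VAR" | sed 's/\\/\\\\/g')
printf '%s\n' "$escaped"   # D:\\some\\path\\
```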

What's your view on this?

Please add `-latex` and `-context` as formats

Now that Pandoc supports several ways of generating a PDF there are actually four different scenarios involving production of PDF and/or HTML:

  1. PDF via LaTeX
  2. PDF via ConTeXt
  3. PDF via HTML and wkhtmltopdf
  4. HTML for other purposes (usually the web)

I have found that there are situations where the -html and -pdf formats don't suffice, namely when you want to generate HTML and PDF from the same source and haven't yet decided which means of PDF generation to use. HTML for wkhtmltopdf and HTML for the web may require different content, e.g. you may want to include a form in the web version but not in the PDF version.
For that reason I wonder if you might consider adding -latex and -context formats so that one can differentiate between the four possibilities in a uniform way?
(You can of course already use !ifdef(latex)() if you want!) While it's true that the same person would probably not consider both LaTeX and ConTeXt at the same time, so that a -tex format would suffice, it is probably wise not to make any rash assumptions in that area. What's more important, an explicit !latex(...) or !context(...) makes the source clearer (and FWIW there is a long-standing issue to add a plain TeX writer, opened by jgm himself: jgm/pandoc#1541).

Accessing Pandoc Header / YAML Variables

Feature request: make PP aware of Pandoc variables defined with the pandoc_title_block and/or yaml_metadata_block extensions.

Rationale: this would allow granular control of conditional PP macros through pandoc variables set on a per-file basis (either in the file's pandoc or YAML header blocks) or globally per project by including YAML files on a per-folder basis (ie: common variables are stored in a YAML file in the project's root, and in each folder, allowing values to be overridden by concatenation).

This solution integrates better with pandoc than setting variables via command-line parameters, and it also lets PP share pandoc's template variables — PP could even conditionally change some variables before invoking pandoc.

It would be particularly useful in projects relying on automated scripting.

Generated figure attributes (width, height, id and class)

Problem

I have come across a situation where I'd like to resize the figures generated by pp.

As far as I can see, pp generates Pandoc image syntax after generating the figures.

I think Pandoc's basic image syntax is limited, as it makes it impossible to choose figure alignment and scale.

Proposed solution

Pandoc supports width and height attributes for images. Look for the link_attributes extension in the docs here.

For example, the following Pandoc code:

![Caption](path/image.png){#some_id width=20%}

Generates the following LaTeX code:

\begin{figure}[htbp]
\centering
\includegraphics[width=0.20000\textwidth]{path/image.png}
\caption{Caption}\label{some_id}
\end{figure}

You can try this online, and play around with other outputs.

I propose extending pp's syntax as such:

\twopi(path/image.png)(Caption){#some_id width=20%}
\~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
digraph {
    O -> A
    O -> B
}
\~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The above example will generate the following Pandoc syntax:

![Caption](path/image.png){#some_id width=20%}

In short, what I propose is:

  • Add an optional { ... } field to graph generating functions such as \dot and \twopi, called attribute field.
  • The attribute field, if present, must follow the (...) caption field.
  • The attribute field, if present, will generate a valid attribute list at the end of generated Pandoc image syntax.
    • It is sufficient to simply copy the whole attribute field (including the curly braces) after the path specification in the Pandoc image inclusion code.

Macros Definitions Files: Thoughts and Questions

I'm currently prototyping a custom flat-file static CMS that uses PP » Pandoc (+ pandoc templates) to generate HTML documentation, website-style, from markdown source files. The obvious advantage of this approach lies in PP macros being very flexible and highly customizable.

I wanted to share my experience with this project, as it might prompt suggestions and shed light on potential uses and new features — I propose a new feature in the Conclusions. Please, if you see better ways to approach this task, I'd really appreciate your feedback. Pandoc has already become part of various CMS projects (like Pandocomatic), and I think PP has great integration potential in this field.

Hopefully, my open-source project will be ready and published on GitHub in the near future, so others can benefit from it and re-adapt it to their own uses. Its main goal is to create resource-documentation projects that can be built into websites viewable via GitHub Pages, as well as clonable repos containing coding resources + HTML docs.

YAML Settings Files Hierarchy

The project relies on a YAML settings file being present in each folder, plus a YAML file (or YAML header) for each single document (in the form filename.yaml). The CMS loads the PP macros into memory, in order to prevent redundant disk accesses; it also keeps each folder's YAML settings file in memory for the duration of that folder's processing. The CMS then loads the source files into memory as well, and merges them with the pre-loaded macro definitions and YAML files before feeding them to PP and/or Pandoc.

YAML files can easily be merged with the markdown source files in memory and then fed to PP or Pandoc via STDIN. When YAML blocks contain a variable already defined in a previous YAML block, Pandoc operates on a left-biased principle, so the first definition is the one that wins — therefore YAML order matters, and YAML blocks are chained thus:

  1. The source doc's YAML header
  2. The associated filename.yaml
  3. The folder's YAML settings file

NOTE: By loading PP macro definitions ahead of the YAML blocks, PP macros can be placed in YAML definitions, allowing the creation of dynamic/conditional YAML variables. This is great if you plan, for example, to also output the documentation in a different format using different Pandoc templates.

Handling Macros-Definitions

As for the macro definitions, it's not currently possible to feed macro definitions (in a non-emitting manner) to PP via STDIN — ie: separately from the source file to be processed.

So I've come across two possible solutions.

  1. from CLI — via -import=FILE option
  2. source injection — inject before source doc via !quiet(TEXT) macro

The 1st approach has the disadvantage that the custom macro file(s) have to be loaded from disk every time a file is built — their content can't be passed via STDIN!

The 2nd approach has the advantage that the macro definitions can be loaded from file(s) into memory just once, when the CMS is launched, and then injected at the beginning of the markdown document inside the !quiet() macro. Example:

    !quiet
    ~~~~~~~~~~
    [macros definitions from memory]
    ~~~~~~~~~~

… this should be much faster — but more memory-expensive! — because it would avoid lots of redundant disk accesses.

NOTE 1: This approach requires all macros to be defined in a single file (instead of multiple modules being imported from a core file using !import macros). But having all macro definitions in a single file is somewhat more bothersome to maintain.

NOTE 2: I could still keep the macro definitions in multiple files with the second approach — if they follow a standard naming convention (eg: a *.pp extension), they can be sequentially loaded into memory by the CMS and merged into a single data block to inject into the !quiet() macro.

NOTE 3: The !quiet() macro-injection approach prevents the use of pandoc-style headers (inside source docs) because they would no longer be at the beginning of the source file — a small price to pay. That is, unless the CMS could parse the source doc and inject the macro definitions after the pandoc headers (which requires some work and might slow things down). YAML headers and blocks, on the other hand, are not affected by this.
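Assembled in a shell, the injection could look like this (macros.pp and doc.md are placeholders; the final pipe into pp and pandoc is elided, so only the stream assembly is shown):

```shell
# Build the stdin stream: !quiet-wrapped macro definitions, then the doc.
printf '!def(X)(42)\n' > macros.pp
printf 'Value: !X\n'   > doc.md
{
  printf '!quiet\n~~~~~~~~~~\n'
  cat macros.pp
  printf '~~~~~~~~~~\n'
  cat doc.md
}   # in the CMS this stream would be piped straight into pp
```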

Conclusions

While working on the CMS prototype, I've realized that the following proposed behaviour might simplify using PP in similar contexts. I'm not sure whether this is even possible, but it would be great if PP could accept two independent STDIN streams:

  • one for the macro definitions — ie: nothing is emitted, but definitions and env vars are preserved
  • the other for the source file to process

Basically, if there were an alternative to the -import=FILE option capable of accepting a STDIN stream (with no text emitted) and then carrying on with the rest of PP's invocation line, it would allow feeding macro definitions to PP without resorting to the injection techniques mentioned above.

Something along these lines:

pp -import-stdin 

... where PP waits for a first STDIN stream (which it treats as it would -import=FILE), and after that waits for a second STDIN stream, which it processes regularly (as by default).

It could also be used in conjunction with source files from disk, still allowing the macros to be fed via STDIN:

pp -import-stdin somefile.md

... where PP first gets the macro definitions from STDIN (supplied from memory by some app), and then processes a file from disk, as usual. This prevents redundant disk accesses for the macro definitions.

Other areas of PP improvement might relate to YAML blocks. In complex projects involving Pandoc, YAML headers and files seem the natural solution for handling settings inheritance and overriding. Any built-in macros supporting direct work with YAML definitions, blocks, or files could make a big difference — this is a potential area of interest.

For example: functionality to merge YAML definitions of the same variable from different files into a cumulative array of values, instead of having pandoc ignore the subsequent definitions. Keywords, for instance, could be inherited and merged down the YAML hierarchy, building up a more specific subset with each subfolder level.

!rename( SYMBOL )( NEW NAME )

It's just an idea that flashed into my mind, but I think it has some interesting potential uses:

!ren[ame]( SYMBOL )( NEW NAME )

This would allow changing the behavior of "higher-level" user macros by swapping the names of some of the macros they use internally.

For example, a !format(TEXT) macro could carry out some text manipulation via other macros:

!def(justifyLeft)( [...] )
!def(justifyRight)( [...] )

!def(format)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!justify(!1)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

At any point the behavior of !format could be changed by !rename:

!rename( justifyLeft )( justify )
!format( SOME TEXT )

!rename( justify )( justifyLeft )
!rename( justifyRight )( justify )
!format( SOME TEXT )

... this approach can avoid using lots of !if / !ifdef checks all over the place.
