fletcher / peg-multimarkdown

This project forked from jgm/peg-markdown

An implementation of MultiMarkdown in C, using a PEG grammar - a fork of jgm's peg-markdown. No longer under active development - see MMD 5.

License: Other

C 85.50% Objective-C 0.07% Visual Basic 0.19% Shell 1.28% Makefile 3.41% Roff 9.37% Batchfile 0.18%

peg-multimarkdown's Introduction

Title: peg-multimarkdown User's Guide
Author: Fletcher T. Penney
Base Header Level: 2

Note: This project has been deprecated in favor of MultiMarkdown-6.

Introduction

Markdown is a simple markup language used to convert plain text into HTML.

MultiMarkdown is a derivative of Markdown that adds new syntax features, such as footnotes, tables, and metadata. Additionally, it offers mechanisms to convert plain text into LaTeX in addition to HTML.

peg-multimarkdown is an implementation of MultiMarkdown derived from John MacFarlane's peg-markdown. It makes use of a parsing expression grammar (PEG), and is written in C. It should compile for most any (major) operating system.

Thanks to work by Daniel Jalkut, MMD no longer requires GLib2 as a dependency. This should make it easier to compile on various operating systems.

Installation

Mac OS X

On the Mac, you can either use an installer to install the program for you, or compile it yourself from scratch. If you know what that means, follow the instructions below in the Linux section. Otherwise, definitely go for the installer!

You can also install MultiMarkdown with the package manager MacPorts with the following command:

sudo port install multimarkdown

Or using homebrew:

brew install multimarkdown

NOTE: I don't maintain either of these ports/packages and can't vouch that they are up to date or working properly. That said, I have started using homebrew to install the latest development build on my machine, while using make in my working directory while editing:

brew install multimarkdown --HEAD

If you don't know what any of that means, just grab the installer.

If you want to compile for yourself, be sure you have the Developer Tools installed, and then follow the directions for [Linux].

If you want to make your own installer, you can use the make mac-installer command after compiling the multimarkdown binary itself.

Windows

The easiest way to get peg-multimarkdown running on Windows is to download the installer from the downloads page. It is created with the help of BitRock's software.

If you want to compile this yourself, the process is the same as building peg-markdown for Windows. The instructions are on the peg-multimarkdown wiki (https://github.com/fletcher/peg-multimarkdown/wiki/Building-for-Windows). I was able to compile for Windows fairly easily using Ubuntu Linux following those instructions. I have not tried to actually compile on a Windows machine.

As a shortcut, if running on a Linux machine you can use:

make windows

This creates the multimarkdown.exe binary. You can then install this manually.

The make win-installer command is what I use to package up the BitRock installer into a zipfile. You probably won't need it.

Linux

You can either download the source from peg-multimarkdown, or (preferably) you can use git:

git clone git://github.com/fletcher/peg-multimarkdown.git

You can run the update_submodules.sh script to update the submodules if you want to run the test commands, download the sample files and the Support directory, or compile the documentation.

Then, simply run make to compile the source.

You can also run some test commands to verify that everything is working properly. Of note, it is normal to fail one test in the Markdown tests, but the others should pass. You can then install the binary wherever you like.

make
make test
make mmd-test
make latex-test
make compat-test

NOTE: As of version 3.2, the tests including obfuscated email addresses will also fail due to a change in how random numbers are generated.

FreeBSD

If you want to compile manually, you should be able to follow the directions for Linux above. However, apparently MultiMarkdown has been put in the ports tree, so you can also use:

cd /usr/ports/textproc/multimarkdown
make install

(I have not tested this myself, and cannot guarantee that it works properly. Come to think of it, I don't even know which version of MMD they use.)

Usage

Once installed, you simply do something like the following:

multimarkdown file.txt --- process text into HTML.
multimarkdown -c file.txt --- use a compatibility mode that emulates the original Markdown.
multimarkdown -t latex file.txt --- output the results as LaTeX instead of HTML. This can then be processed into a PDF if you have LaTeX installed. You can further specify the LaTeX Mode metadata to customize output for compatibility with memoir or beamer classes.
multimarkdown -t odf file.txt --- output the results as an OpenDocument Text Flat XML file. This requires the plugin to be installed in your copy of OpenOffice; it is available at the downloads page. LibreOffice includes this plugin by default.
multimarkdown -t opml file.txt --- convert the MMD text file to an MMD OPML file, compatible with OmniOutliner and certain other outlining and mind-mapping programs (including iThoughts and iThoughtsHD).
multimarkdown -h --- display help and additional options.
multimarkdown -b *.txt --- -b or --batch mode can process multiple files at once, converting file.txt to file.html or file.tex as directed. Using this feature, you can convert a directory of MultiMarkdown text files into HTML files, or LaTeX files with a single command without having to specify the output files manually. CAUTION: This will overwrite existing files with the html or tex extension, so use with caution.
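
Because batch mode overwrites existing files, it can help to check the targets before running it. The following is a minimal sketch, not part of the distribution: `batch_target` and `would_overwrite` are hypothetical helper names, and the name mapping assumes the last extension is simply replaced, as in the file.txt to file.html example above. Only the HTML and LaTeX cases described here are covered.

```sh
# batch_target FILE FORMAT
# Print the output name batch mode is described as using:
# file.txt -> file.html (html) or file.tex (latex).
# Assumption: the last extension of FILE is replaced.
batch_target() {
    base=${1%.*}
    case $2 in
        html)  printf '%s.html\n' "$base" ;;
        latex) printf '%s.tex\n' "$base" ;;
        *)     return 1 ;;   # other formats are not covered by this sketch
    esac
}

# would_overwrite FORMAT FILE...
# List the targets that already exist and would therefore be replaced.
would_overwrite() {
    fmt=$1
    shift
    for f in "$@"; do
        target=$(batch_target "$f" "$fmt") || continue
        [ -e "$target" ] && printf '%s\n' "$target"
    done
    return 0
}
```

If `would_overwrite latex *.txt` prints nothing, no existing .tex file is in the way of `multimarkdown -b -t latex *.txt`.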

Note: Several convenience scripts are available to simplify things:

mmd			=> multimarkdown -b
mmd2tex		=> multimarkdown -b -t latex
mmd2odf		=> multimarkdown -b -t odf
mmd2opml	=> multimarkdown -b -t opml

mmd2pdf		=> Unsupported script to try and run latex/xelatex.
			   You can direct questions to the discussion list, but
			   I may or may not respond.  It works for me, so I share
			   it with those who are interested but make no
			   guarantees.

Why create another version of MultiMarkdown?

Maintaining a growing collection of nested regular expressions was going to become increasingly difficult. I don't plan on adding much (if any) in the way of new syntax features, but it was a mess.
Performance on longer documents was poor. The nested Perl regular expressions were slow, even on a relatively fast computer. Performance on something like an iPhone would probably have been miserable.
The reliance on Perl made installation fairly complex on Windows. That didn't bother me too much, but it is a factor.
Perl can't be run on an iPhone/iPad, and I would like to be able to have MultiMarkdown on an iOS device, and not just regular Markdown (which exists in C versions).
I was interested in learning about PEGs and revisiting C programming.
The syntax has been fairly stable, and it would be nice to be able to formalize it a bit --- which happens by definition when using a PEG.
I wanted to revisit the syntax and features and clean things up a bit.
Did I mention how much faster this is? And that it could (eventually) run on an iPhone?

What's different?

"Complete" documents vs. "snippets"

A "snippet" is a section of HTML (or LaTeX) that is not a complete, fully-formed document. It doesn't contain the header information to make it a valid XML document. It can't be compiled with LaTeX into a PDF without further commands.

For example:

# This is a header #

And a paragraph.

becomes the following HTML snippet:

<h1 id="thisisaheader">This is a header</h1>

<p>And a paragraph.</p>

and the following LaTeX snippet:

\part{This is a header}
\label{thisisaheader}


And a paragraph.

It was not possible to create a LaTeX snippet with the original MultiMarkdown, because it relied on having a complete XHTML document that was then converted to LaTeX via an XSLT document (requiring a whole separate program). This was powerful, but complicated.

Now, I have come full-circle. peg-multimarkdown will now output LaTeX directly, without requiring XSLT. This allows the creation of LaTeX snippets, or complete documents, as necessary.

To create a complete document, simply include metadata. You can include a title, author, date, or whatever you like. If you don't want to include any real metadata, including "format: complete" will still trigger a complete document, just like it used to.

NOTE: If the only metadata present is Base Header Level then a complete document will not be triggered. This can be useful when combining various documents together.
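
The rule above can be modeled with a small shell function. This is only a sketch of the stated rule, not the parser itself: `triggers_complete_doc` is a hypothetical name, and the metadata handling is simplified (keys are lowercased and stripped of spaces, and the block ends at the first blank line).

```sh
# triggers_complete_doc FILE
# Exit status 0 if the metadata in FILE would trigger a complete document
# under the rule described above, 1 otherwise.
triggers_complete_doc() {
    awk '
        # A first line that does not look like "Key: value" means no metadata.
        NR == 1 && !/^[A-Za-z0-9][^:]*:/ { exit }
        # The first blank line ends the metadata block.
        /^[ \t]*$/ { exit }
        # Unindented "Key:" lines start an entry; indented lines continue one.
        /^[^ \t][^:]*:/ {
            key = tolower($0)
            sub(/:.*/, "", key)
            gsub(/[ \t]/, "", key)
            if (key != "baseheaderlevel") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

A file whose only metadata line is Base Header Level returns 1 here, matching the note above, while a file starting with "format: complete" returns 0.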

The old approach (even though it was hidden from most users) was a bit of a kludge, and this should be more elegant, and more flexible.

Creating LaTeX Documents

LaTeX documents are created a bit differently than under the old system. You no longer have to use an XSLT file to convert from XHTML to LaTeX. You can go straight from MultiMarkdown to LaTeX, which is faster and more flexible.

To create a complete LaTeX document, you can process your file as a snippet, and then place it in a LaTeX template that you already have. Alternatively, you can use metadata to trigger the creation of a complete document. You can use the LaTeX Input metadata to insert a \input{file} command. You can then store various template files in your texmf directory and call them with metadata, or with embedded raw LaTeX commands in your document. For example:

LaTeX Input:		mmd-memoir-header  
Title:				Sample MultiMarkdown Document  
Author:				Fletcher T. Penney  
LaTeX Mode:			memoir  
LaTeX Input:		mmd-memoir-begin-doc  
LaTeX Footer:		mmd-memoir-footer

This would include several template files in the order that you see. The LaTeX Footer metadata inserts a template at the end of your document. Note that the order and placement of the LaTeX Input statements is important.

The LaTeX Mode metadata allows you to specify that MultiMarkdown should use the memoir or beamer output format. This places subtle differences in the output document for compatibility with those respective classes.

This system isn't quite as powerful as the XSLT approach, since it doesn't alter the actual MultiMarkdown to LaTeX conversion process. But it is probably much more familiar to LaTeX users who are accustomed to using \input{} commands and doesn't require knowledge of XSLT programming.

I recommend checking out the default LaTeX Support Files that are available on github. They are designed to serve as a starting point for your own needs.

Note: You can still use this version of MultiMarkdown to convert text into XHTML, and then process the XHTML using XSLT to create a LaTeX document, just like you used to in MMD 2.0.

Footnotes

Footnotes work slightly differently than before. This is partly on purpose, and partly out of necessity. Specifically:

Footnotes are anchored based on number, rather than the label used in the MMD source. This won't show a visible difference to the reader, but the XHTML source will be different.
Footnotes can be used more than once. Each reference will link to the same numbered note, but the "return" link will only link to the first instance.
Footnote "return" links are a separate paragraph after the footnote. This is due to the way peg-markdown works, and it's not worth the effort to me to change it. You can always use CSS to change the appearance however you like.
Footnote numbers are surrounded by "[]" in the text.

Raw HTML

Because the original MultiMarkdown processed the text document into XHTML first, and then processed the entire XHTML document into LaTeX, it couldn't tell the difference between raw HTML and HTML that was created from plaintext. This version, however, uses the original plain text to create the LaTeX document. This means that any raw HTML inside your MultiMarkdown document is not converted into LaTeX.

The benefit of this is that you can embed one piece of the document in two formats --- one for XHTML, and one for LaTeX:

<blockquote>
<p>Release early, release often!</p>
<blockquote><p>Linus Torvalds</p></blockquote>
</blockquote>

<!-- \epigraph{Release early, release often!}{Linus Torvalds} -->

In this section, when the document is converted into XHTML, the blockquote sections will be used as expected, and the epigraph will be ignored since it is inside a comment. Conversely, when processed into LaTeX, the raw HTML will be ignored, and the comment will be processed as raw LaTeX.

You shouldn't need to use this feature, but if you want to specify exactly how a certain part of your document is processed into LaTeX, it's a neat trick.

Processing MultiMarkdown inside HTML

In the original MultiMarkdown, you could use something like <div markdown=1> to tell MultiMarkdown to process the text inside the div. In peg-multimarkdown, you can do this, or you can use the command-line option --process-html to process the text inside all raw HTML.

Math Support

MultiMarkdown 2.0 supported ASCIIMathML embedded within MultiMarkdown documents. This syntax was then converted to MathML for XHTML output, and then further processed into LaTeX when creating LaTeX output. The benefit of this was that the ASCIIMathML syntax was pretty straightforward. The downside was that only a handful of browsers actually support MathML, so most of the time it was only useful for LaTeX. Many MMD users who are interested in LaTeX output already knew LaTeX, so they sometimes preferred native math syntax, which led to several hacks.

MultiMarkdown 3.0 does not have built in support for ASCIIMathML. In fact, I would probably have to write a parser from scratch to do anything useful with it, which I have little desire to do. So I came up with a compromise.

ASCIIMathML is no longer supported by MultiMarkdown. Instead, you can use LaTeX to code for math within your document. When creating a LaTeX document, the source is simply passed through, and LaTeX handles it as usual. If you desire, you can add a line to your header when creating XHTML documents that will allow MathJax to appropriately display your math.

Normally, MathJax and LaTeX support using \[ math \] or \( math \) to indicate that math is included. MMD stumbled on this due to some issues with escaping, so instead we use \\[ math \\] and \\( math \\). See an example:

latex input:	mmd-article-header  
Title:			MultiMarkdown Math Example  
latex input:	mmd-article-begin-doc  
latex footer:	mmd-memoir-footer  
xhtml header:	<script type="text/javascript"
	src="http://localhost/~fletcher/math/mathjax/MathJax.js">
	</script>
			
			
An example of math within a paragraph --- \\({e}^{i\pi }+1=0\\)
--- easy enough.

And an equation on its own:

\\[ {x}_{1,2}=\frac{-b\pm \sqrt{{b}^{2}-4ac}}{2a} \\]

That's it.

You would, of course, need to change the xhtml header metadata to point to your own installation of MathJax.

Note: MultiMarkdown doesn't actually do anything with the code inside the brackets. It simply strips away the extra backslash and passes the LaTeX source unchanged, where it is handled by MathJax if it's properly installed, or by LaTeX. If you're having trouble, you can certainly email the MultiMarkdown Discussion List, but I do not provide support for LaTeX code.
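
To make the "strips away the extra backslash" behavior concrete, here is a rough shell model of it. It is an illustration only, not the C code that does the work: `strip_math_escapes` is a hypothetical name, and the sketch only looks at the four delimiters used in the example above.

```sh
# strip_math_escapes
# Read text on stdin and turn the MMD math delimiters \\[ \\] \\( \\)
# into the plain LaTeX/MathJax forms \[ \] \( \).
# Everything between the delimiters is passed through untouched.
strip_math_escapes() {
    sed -e 's/\\\\\([][()]\)/\\\1/g'
}
```

For example, `printf '%s\n' '\\[ x^2 \\]' | strip_math_escapes` prints `\[ x^2 \]`.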

Acknowledgments

Thanks to John MacFarlane for peg-markdown. Obviously, this derivative work would not be possible without his work. Additionally, he was very gracious in giving me some pointers when I was getting started with trying to modify his software, and he continues to update peg-markdown with the various edge cases MultiMarkdown users have found. Hopefully both programs are better as a result.

Thanks to Daniel Jalkut for his work on enabling MultiMarkdown to run without relying on GLib2. This makes it much more flexible!

Thanks to John Gruber for the original Markdown. 'Nuff said.

And thanks to the many contributors and users of the original MultiMarkdown that helped me refine the syntax and search out bugs.

peg-multimarkdown's Issues

New Footnote Style

It seems from MMD2 -> MMD3, reverse footnote links were moved outside of the actual footnote's paragraph.

Was that intentional/what was the reasoning there?

Need to add beamer support to XSLT

This is planned before 3.0 is finalized.

Requires glib2

The glib2 library is required to build this project, as it makes use of gStrings. This was a requirement of the original peg-markdown, and I hope to remove it as I think it unnecessarily complicates building and using this program.

I welcome help with this as I am not an expert C programmer!

Inline images converted to `\begin{figure}...\end{figure}` blocks

$ cat inlineFigure.txt 
This is a formatted ![image][] and a [link][] with attributes.

[image]: http://path.to/image "Image title" width=40px height=400px
[link]:  http://path.to/link.html "Some Link" class=external
         style="border: solid black 1px;"

$ cat inlineFigure.txt | multimarkdown -t latex
This is a formatted \begin{figure}
\begin{center}
\includegraphics[keepaspectratio,width=40pt,height=400pt]{http://path.to/image}
\end{center}
\caption[image]{Image title}
\label{image}
\end{figure}
 and a \href{http://path.to/link.html}{link}\footnote{\href{http://path.to/link.html}{http:\slash \slash path.to\slash link.html}} with attributes.

ASCIIMathML

I've really been trying to put a lot of thought into what to do with this. Key points:

Most people don't use math in any fashion
People who are more comfortable with LaTeX tend to use LaTeX for their math.
A few people do use ASCIIMath to include simple formulas - I'm not sure how much goes to HTML and how much goes to LaTeX output.
MMD 2 uses someone else's code to convert text to mathml, and then another person's work to convert mathml to latex.
It would take a lot of work for me to rewrite this to be compatible with MMD 3 natively.

So I welcome ideas, and will continue to think about a reasonable solution.

Poor support for image dimensions in ODF

It's relatively easy to insert an image into ODF using fixed dimensions, but harder to get a scaled image without knowing the exact aspect ratio of the image.

For example, in LaTeX or HTML, one can specify that image should be scaled to 50% of the width, and have it automatically calculate the proper height. This does not work in ODF, at least not that I can find.

You have to manually adjust the image to fit your desired constraint. It's easy to do, simply hold down the shift key while adjusting the image size, and it will likely snap to match the specified dimension.

I welcome suggestions on a better way to do this.

F-

Language metadata conflict

In my case, the metadata Language: French, translated to \def\language{french}, causes issues. A variable with the same name is probably defined in babel or a related package.

I must say that I'm not using the plain mmd header file, so the error might be caused by the specific packages in my preamble, but there is nothing really fancy there. Only the recommended packages for writing in French.

Maybe a more smartypants-specific metadata name would be appropriate.

Folder delimiters for file paths in metadata are escaped.

Example:

$ cat meta.txt 
Latex Input:    path/to/file

Text

$ multimarkdown -t latex meta.txt 
\input{path\slash to\slash file}
Text
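
Until the escaping is fixed, the output can be repaired after the fact. The filter below is a workaround sketch, not a fix in the program itself: `fix_input_paths` is a hypothetical name, and it assumes each \input{...} command starts its own line, as in the output above.

```sh
# fix_input_paths
# Read LaTeX on stdin and turn "\slash " back into "/" on lines that
# start with \input{. Other lines are left alone, since \slash is the
# intended output in ordinary text.
fix_input_paths() {
    sed -e '/^\\input{/ s|\\slash *|/|g'
}
```

With this, `multimarkdown -t latex meta.txt | fix_input_paths` would be expected to give \input{path/to/file}.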

mac-installer: libglib-2.0.0.dylib install vs runtime location

On my system, the mac installer puts libglib-2.0.0.dylib in /usr/local, but at run time, the file with the same name in /sw/lib is used, even though this folder is not in my path:

$ echo $PATH
/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/texbin:/usr/X11/bin:/usr/local/CrossPack-AVR/bin:/usr/local/git/bin:/usr/texbin:/Developer/usr/bin

This prevented multimarkdown from running on my system since the /sw/lib file was outdated. Copying the libglib-2.0.0.dylib distributed with peg-multimarkdown v3.0a5 over the one in /sw/lib solved the problem. However, I think the installation package should not rely on a default installation of Fink.

Extra blank lines at beginning and end of `verbatim` environment

This leaves large unnecessary space before and after verbatim code in LaTeX output.

$ cat pre.txt 
This

    is a test

$ cat pre.txt | ../peg-multimarkdown/multimarkdown -t latex
This

\begin{verbatim}

is a test

\end{verbatim}
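
As a stopgap, the extra blank lines can be removed by post-processing the LaTeX. This is a sketch under assumptions, not a patch to the generator: `trim_verbatim_blank_lines` is a hypothetical name, and it assumes \begin{verbatim} and \end{verbatim} each start a line. Blank lines between code lines are kept.

```sh
# trim_verbatim_blank_lines
# Read LaTeX on stdin; drop blank lines directly after \begin{verbatim}
# and directly before \end{verbatim}.
trim_verbatim_blank_lines() {
    awk '
        index($0, "\\begin{verbatim}") == 1 { print; inside = 1; leading = 1; held = 0; next }
        index($0, "\\end{verbatim}") == 1   { inside = 0; held = 0; print; next }
        inside && /^[ \t]*$/ {
            if (!leading) held++    # remember blanks that may sit between code lines
            next
        }
        {
            while (held > 0) { print ""; held-- }   # blanks were interior: restore them
            leading = 0
            print
        }
    '
}
```

Usage would be along the lines of `multimarkdown -t latex pre.txt | trim_verbatim_blank_lines`.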

paragraph level elements parsed inside tables

MMD parsing inside tables causes issues. For example, this simple but reasonable example triggers a block quote inside a table, which causes errors in latex.

$ cat tableQuote.txt 
| Item   | Price ($) |
|--------|-----------|
| Small  | < 1       |
| Large  | > 10      |

$ multimarkdown -t latex tableQuote.txt 
\begin{table}[htbp]
\begin{minipage}{\linewidth}
\setlength{\tymax}{0.5\linewidth}
\centering
\small
\begin{tabular}{@{}ll@{}} \\ \toprule
Item &Price (\$)  \\
\midrule
Small &$<$ 1  \\
Large &\begin{quote}10
\end{quote} \\

\bottomrule

\end{tabular}
\end{minipage}
\end{table}

README.markdown is pushed with conflicts

Hi there,

I was going through the readme and it seems it was pushed without being cleaned from a conflict that git should have warned about.
Check the end of "Why create another version of MultiMarkdown?" section and up to the end of the file.

Just issuing this in case you haven't noticed.

Automatic linebreaks in metadata for LaTeX not translated

In a few rare instances, such as the return address in a letter, I would use multiline metadata destined for LaTeX output:

Return address: 123 Main st
                Some City, ST  12345

In the LaTeX output, the 2 spaces followed by the newline would be converted to '\\' to indicate a linebreak.

This feature is not yet implemented in MMD 3, and sadly is somewhat complicated to try and do. Will keep working on it, but as always welcome ideas.

HTML header not working

Example on page 20 of MMD 3.0 manual:

 HTML header: <script type="text/javascript" src="http://example.net/mathjax/MathJax.js"> </script>

Generates (in head of html file):

 <meta name="htmlheader" content="&lt;script type=&quot;text/javascript&quot; src=&quot;http://example.net/mathjax/MathJax.js&quot;&gt; &lt;/script&gt;"/>

Expected

 <script type="text/javascript" src="http://example.net/mathjax/MathJax.js"> </script>

hyphen in labels

[continued from the MMD Discussion Group]

labels with non ascii (but legal latex/html)
characters are transformed in some weird way. Please notice the - that gets converted to null.

There are limitations to what can (easily) be translated. I improved
the print_raw_element function so that the "null" doesn't appear, but
the hyphen will simply be left out.

The hyphen, however, remains in the link, which gets broken.

$ cat hyphen.txt 
Link to [](#figure-1).

![figure-1][]

[figure-1]: file.png

$ cat hyphen.txt | multimarkdown -t latex
Link to \autoref{figure-1}.

\begin{figure}
\begin{center}
\includegraphics[keepaspectratio,width=\textwidth, height=.75\textheight]{file.png}
\end{center}
\label{figure1}
\end{figure}
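
The mismatch above goes away if the link target and the label are cleaned by the same rule. The function below is a shell model of that idea, not the actual print_raw_element code: `clean_label` is a hypothetical name, and the rule (keep only letters and digits, lowercase them) is inferred from the figure1 and thisisaheader examples in this document.

```sh
# clean_label TEXT
# Print TEXT reduced to lowercase letters and digits, as a model of how
# labels appear to be generated.
clean_label() {
    printf '%s\n' "$1" | tr -cd '[:alnum:]\n' | tr '[:upper:]' '[:lower:]'
}
```

Applying it to both sides gives \autoref{figure1} and \label{figure1}, so the reference resolves.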

MathJax and latex compatibility for math environments

MathJax supports common math environments such as \begin{eqnarray} ... \end{eqnarray}, but when used with the plain straightforward "escape-the-first-backslash-in-[" approach, you end up with invalid latex code.

The following code\\[\begin{eqnarray}
 2^{\log_2 x} & = & e^{\log 2\frac{\log x}{\log 2}}\\
              & = & e^{\log x}\\
              & = & x
\end{eqnarray}\\] ...

is correct with MathJax, but the resulting \[\begin{eqnarray} ... \end{eqnarray}\] generated using -t latex is invalid latex. Currently, the possible workarounds are:

write the equation inside an HTML comment (<!-- ... -->), but then it is not visible in HTML (consistency issue);
write only plain displaymath, but this is very limiting (no equation numbering, no arrays, etc.) and sad, considering that MathJax supports it so well --- unfortunate to have better math support for the web than for LaTeX, isn't it?
post-process and strip the enclosing \\[ and \\] when the content starts with \begin

I personally use the third approach, but I think mmd should output valid code whenever possible, without requiring user scripts just to allow latex to compile.

May I suggest simply testing whether the content in \\[ ... \\] starts with \begin, and, if so, strip the enclosing \\[ and \\] in print_latex_element()?
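
For reference, the post-processing workaround can be sketched as a filter over the generated LaTeX. This is one possible version, not necessarily the script referred to above: `strip_display_wrappers` is a hypothetical name, and it assumes the \[ sits on the same line as \begin{...} and the \] on the same line as \end{...}.

```sh
# strip_display_wrappers
# Read LaTeX on stdin; remove a \[ that directly precedes \begin{env}
# and a \] that directly follows \end{env}. Plain \[ ... \] is kept.
strip_display_wrappers() {
    sed -e 's/\\\[ *\(\\begin{[A-Za-z*]*}\)/\1/g' \
        -e 's/\(\\end{[A-Za-z*]*}\) *\\]/\1/g'
}
```

Usage would be something like `multimarkdown -t latex file.txt | strip_display_wrappers > file.tex`.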

Thanks

Segmentation fault

The new version of MultiMarkdown segfaults on my machine (see below). How can I help you debug this?

Greetings,
Oliver

oliver$ sw_vers 
ProductName:    Mac OS X
ProductVersion: 10.6.7
BuildVersion:   10J869
oliver$ multimarkdown --version
peg-multimarkdown version 3.0b11
[…]
oliver$ mmd2tex mainmatter.mdml 
/usr/local/bin/mmd2tex: line 22:  5717 Segmentation fault      multimarkdown -b -t latex "$1"
oliver$

markdown=1 attribute not supported

Currently, there is no way to tell peg-multimarkdown to process text inside of an html block.

I will probably add this as either a command line option, or support for the markdown=1 attribute that many other versions support.

Parsing of non integer units in properties

Parsing of image (maybe other) properties stops at a decimal point, resulting in no units being recognized, and the remaining part of the property inserted as plain text after the image. An error message is also output stating No units:<value before decimal point>

Example:

$ cat image.txt 
![animage][]

[animage]: path/to/image.png "caption" width=3.5in

$ multimarkdown -t latex image.txt 
No units: 3
\begin{figure}
\begin{center}
\includegraphics[keepaspectratio,keepaspectratio,width=3pt,height=0.75\textheight]{path/to/image.png}
\end{center}
\caption{caption}
\label{animage}
\end{figure}


.5in
$
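
The expected behavior is that the whole number, including the decimal part, is read before the unit. A small shell model of that split is shown below. It only illustrates the parsing rule, not the grammar used by the program, and `split_dimension` is a hypothetical name.

```sh
# split_dimension VALUE
# Print "NUMBER UNIT" for values such as 3.5in, 40px or 75%.
# Prints nothing if VALUE is not a number followed by an optional unit.
split_dimension() {
    printf '%s\n' "$1" |
        sed -n 's/^\([0-9]*\.\{0,1\}[0-9][0-9]*\)\([a-z%]*\)$/\1 \2/p'
}
```

Here `split_dimension 3.5in` prints `3.5 in`, rather than stopping at the decimal point.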

Anyone Interested in RTF - here's your chance to contribute!

Since RTF is a text-file based format, it is possible to include it as an output format in MMD 3.0.

It would require adding the appropriate output functions to "teach" MMD how to output RTF.

It really wouldn't be too hard, I am just not interested in doing it myself. That said, I am willing to pull commits from others to help coordinate efforts if anyone is interested in adding it.

My suggestion would be to break RTF support into "regular Markdown" which should be built off the "original" branch:

https://github.com/fletcher/peg-multimarkdown/tree/original

This would allow it to be compatible with peg-markdown and pulled into John MacFarlane's project (like he pulled in my ODF changes).

Then, the MMD-specific syntax features could be created in a separate branch (like I keep the ODF work split into two branches).

So --- for everyone begging for RTF support --- here's your chance to make it happen!

Crash on footnotes referenced more than once

This is ok:

$ cat footnotes.txt 
a note[^footnote].

[^footnote]: footnote content
$ cat footnotes.txt | multimarkdown -t latex
a note\footnote{footnote content}.

But this is not:

$ cat footnotes2.txt 
a note[^footnote], then another one[^footnote].

[^footnote]: footnote content
$ cat footnotes2.txt | multimarkdown -t latex
multimarkdown(92359) malloc: *** error for object 0x305f20: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
multimarkdown(92359) malloc: *** error for object 0x841f0f: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Bus error

P.S: I'm not using the latest release, but a personal fork. Maybe this is not relevant in current mainstream version...

Math support in ODF

I figured out how to embed MathML in an OpenDocument file. There are several tools to convert LaTeX to MathML, including blahtex.

I'll probably start by writing a perl utility that can take a MMD-created ODF, and then parse the math "bits" out and put them back in as MathML.

Even better would be if I could figure out how to compile blahtex into multimarkdown as a library to do the processing internally.... Not sure how long that will take to get around to. Anyone with any particular skills in that area who wants to help - let me know!

F-

Malloc error when using Format: complete in header

If I include the header line "Format: complete" in my document, the mmd3b1 parser generates:

multimarkdown(52212) malloc: *** error for object 0x100827400: incorrect checksum for
freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
Abort trap

Am I right that this directive is no longer necessary with MMD3?

Deep logical divisions not treated as logical divisions in latex

Hi,

Logical divisions below \subsubsection (\paragraph, etc) have output formatting applied (\itshape, etc) instead of being converted to real logical divisions. In articles, for example, where the highest level is not \part, but \section, this can be a problem (visually, and for \autoref).

HTML output seems ok, though (h4, h5, h6...).

Thanks

Enabling build on 32 bit mac

I had to tweak the makefile to get this to compile on my mac, but I really didn't know what I was doing. I added -arch i386 to the CFLAGS variable, which works for me but probably breaks this for everybody else.

I welcome a fix!

Inline math delimiter "not robust"

One thing I forgot to let you know, because my own scripts have been correcting it for a while, is that \( ... \) is not robust and causes problems depending on where it is in the document. The problem arises when there is math in section titles and there is a table of contents (or in captions with a list of figures, etc.). See section 2.3 of this reference for details.

Try:

Latex Input: mmd-memoir-header
Title: something
Latex Header Level: 2
Latex Input: mmd-memoir-begin-doc
Latex Footer: mmd-memoir-footer

# Some \\(math\\) title #

and content

There are two easy solutions for this:

use \protect\( and \protect\) for left and right delimiters, respectively;
use $ for both delimiters

I would suggest that you systematically replace \( and \) with $ because the resulting code is cleaner and closer to most common hand-written latex. That's what my scripts do.
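
A filter of this kind can be very small. The version below is a sketch, not the script mentioned above: `dollar_math` is a hypothetical name, and it assumes the input is the generated LaTeX, where inline math appears as \( ... \) as in the title example.

```sh
# dollar_math
# Read LaTeX on stdin and replace the inline delimiters \( and \) with $.
# Caveat: a \\ line break directly followed by a parenthesis would also
# be touched, so check tables and arrays after running it.
dollar_math() {
    sed -e 's/\\[()]/$/g'
}
```

Usage would be along the lines of `multimarkdown -t latex file.txt | dollar_math > file.tex`.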

SmartyPants emulation only works for English

The original MMD made use of several international variants of SmartyPants. I'll need to include this somehow in the new version.

Figure dimensions units `%` discarded

Units in % are discarded from the figure (although acceptable in plain MD) and a paragraph containing only a printable % character follows. Example:

$ cat percent.txt 
![][figure]

[figure]: fig.png "caption" width=75%

$ cat percent.txt | multimarkdown
<p><img src="fig.png" id="figure" alt="" title="caption" width="75" /></p>

<p>%</p>

$ cat percent.txt | multimarkdown -t latex
\begin{figure}[htbp]
\centering
\includegraphics[keepaspectratio,width=75pt,height=0.75\textheight]{fig.png}
\caption{caption}
\label{figure}
\end{figure}


\%
macjo:test Jo$

Speaking of this, couldn't the % unit be converted to a fraction of \textwidth for latex? I think this makes the most sense for documents that should be converted to both html and latex. Indeed: cm/in are not valid in html, and the old equation 1px = 1pt doesn't make sense with today's screen resolutions, where html figures become tiny. So I think

[figure]: fig.png "caption" width=75%

should be converted to

\includegraphics[keepaspectratio,width=0.75\textwidth,height=0.75\textheight]{fig.png}
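
The arithmetic behind this proposal is just a division by 100. The sketch below shows it as a shell function. `latex_width` is a hypothetical name, and this only models the suggested conversion, not anything the program currently does.

```sh
# latex_width VALUE
# Print the LaTeX width for VALUE: a percentage becomes a fraction of
# \textwidth, anything else is passed through unchanged.
latex_width() {
    case $1 in
        *%) LC_ALL=C awk -v p="${1%?}" 'BEGIN { printf "%g\\textwidth\n", p / 100 }' ;;
        *)  printf '%s\n' "$1" ;;
    esac
}
```

So `latex_width 75%` prints `0.75\textwidth`, which is the value used in the \includegraphics line above.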

Your thoughts?

visual enhancement in latex figures

Detail: The recommended (at least by the IEEE) syntax inside figure environments is to use \centering rather than \begin{center}...\end{center} because it leaves less blank space between the figure and its caption.

Glossaries

Glossaries seem to have gotten much more complicated with the new glossaries package, and I think it might just be out of the scope of MMD for the time being. We'll see. There is a "glossary" branch where I have started some of this, but it's kind of a mess at the moment....

Wishlist: Command line switch to disable output of Metadata definitions at beginning of the .tex file

One nice-to-have for the beta: would it be possible to add a command line switch to disable the output of the Metadata definitions at the beginning of the file in LaTeX mode? I'm talking about these lines:

\def\mytitle{This is my title}
\def\myauthor{John Doe}
…

That would be swell.
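
Until such a switch exists, one possible workaround is to filter the generated file afterwards. The following is a minimal sketch in Python, assuming the metadata block consists only of leading lines of the form `\def\my...{...}` as in the excerpt above. `strip_metadata_defs` is a hypothetical helper and not part of multimarkdown.

```python
import re

# Matches lines such as \def\mytitle{This is my title}
# (assumption: every metadata definition starts with \def\my).
METADATA_DEF = re.compile(r"^\\def\\my[A-Za-z]+\{.*\}\s*$")

def strip_metadata_defs(tex):
    """Drop the leading block of \\def\\my...{...} lines (and blank lines) from LaTeX output."""
    lines = tex.splitlines(keepends=True)
    index = 0
    # Only the block at the very top is removed; later \def lines are kept.
    while index < len(lines) and (
        METADATA_DEF.match(lines[index]) or not lines[index].strip()
    ):
        index += 1
    return "".join(lines[index:])
```

Because only the leading block is dropped, any `\def` the author writes further down in the document is left alone.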

trying to get eglib to work

I created a new branch - eglib. It almost compiles without any external dependency (I think).

Inside the eglib directory, do:
./autogen.sh
make

Then run make in the peg-multimarkdown directory.

It seems as though I just need to fix the Makefile?

Parsing in captions

Captions are not parsed as MMD syntax, and every special character is escaped. As a result, the final caption always prints the raw content of the MMD source literally, so it seems impossible to put any formatting or, more importantly, links in captions.

$ cat caption.txt 
![animage][]

[animage]: image.png "some [#stuff][], other \cite{stuff}, and more <!--\cite{stuff}-->."


$ multimarkdown -t latex caption.txt 
\begin{figure}
\begin{center}
\includegraphics[keepaspectratio,width=\textwidth, height=.75\textheight]{image.png}
\end{center}
\caption{some [\#stuff][], other $\backslash$cite\{stuff\}, and more $<$!--$\backslash$cite\{stuff\}--$>$.}
\label{animage}
\end{figure}

Image’s alt attribute is not a (fig)caption

Hello, I understand that HTML5 brings a lot of freshness to our old wooden table, but unfortunately transforming the alt attribute of an <img> element into a <figcaption> is not correct, and not just in some abstract, nerdy way.

The alt attribute is supposed to be used as a replacement for the image, in case the image cannot be displayed.

Read more about its usage: http://dev.w3.org/html5/spec/Overview.html#alt

Or imagine something like this: you have an article where you describe what good business relationships look like. You add an illustration of two shaking hands. The alt attribute of that illustration isn’t “Two shaking hands” and should not be “Copyright Getty Images 2011” either. It should represent a mental connection between seeing an image and understanding its context inside the article – so “Agreement is a great tool in business relationships” would be much better.

Now, <figure> and <figcaption> are used only when their content is specifically mentioned in the article. So it could be a graph, a listing of code or an image. But it doesn’t mean that every image with a caption automatically is a figure.

MultiMarkdown is great stuff and I appreciate the amount of work you put into it. I just want to make sure HTML5 is supported by modern tools the way the spec says it should be.

Tables with only a caption crash with `Bus error`

Tables with a caption but without label cause this error on my system:

$ cat tableCaptionOnly.txt 
[Caption only]
| Item   | Price ($) |
|--------|-----------|
| Small  | 1         |
| Large  | 10        |

$ multimarkdown -t latex tableCaptionOnly.txt 
Bus error

Using this build from the development branch:

commit 4a6a8919843bf0cf181111832025033bbf48f947
Author: Fletcher T. Penney <[email protected]>
Date:   Mon Jan 24 23:05:31 2011 -0500

    update tests

Beamer and tables

There seems to be an issue where a table needs to be followed by another block-level element to "trigger" the recognition that the next header starts a new header section.

This can be an empty comment:

Some table source

<!-- -->

### Next Slide Title ####

I haven't been able to figure out why this occurs yet...

Syntax errors in links cause `Bus error`s

Some simple syntax errors in links cause Bus errors. Since many kinds of links differ only slightly in syntax, I think it should be "expected" that users make such mistakes once in a while (at least I do), and an error message would make debugging a long document much easier than a puzzling Bus error. I found that sometimes omitting # is the problem, but I don't know whether other cases can trigger the Bus error.

Examples:

This
$ cat label_issues.txt
Reference to [#label-1].

$ cat label_issues.txt | multimarkdown -t latex
Reference to ~\cite{label-1}.

and that
$ cat label_issues.txt
Reference to [label-1].

$ cat label_issues.txt | multimarkdown -t latex
Reference to \href{file.png}{label--1}\footnote{\href{file.png}{file.png}}.

are correct. But, while this is correct
$ cat label_issues.txt
Reference to [][#label-1].

$ cat label_issues.txt | multimarkdown -t latex
Reference to ~\cite{label-1}.

this is not
$ cat label_issues.txt
Reference to [][label-1].

$ cat label_issues.txt | multimarkdown -t latex
Bus error

Same applies for this:
$ cat label_issues.txt
Reference to [](#label-1).

![label-1][]

[label-1]: file.png

$ cat label_issues.txt | multimarkdown -t latex
Reference to \autoref{label-1}.

\begin{figure}
\begin{center}
\includegraphics[keepaspectratio,width=\textwidth, height=.75\textheight]{file.png}
\end{center}
\label{label1}
\end{figure}

and that:

$ cat label_issues.txt 
Reference to [](label-1).

![label-1][]

[label-1]: file.png

$ cat label_issues.txt | multimarkdown -t latex
Bus error
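
Until the parser reports these cases itself, a small pre-flight check can help locate them in a long document. The sketch below is in Python and only covers the two crashing forms reported above (an empty link text followed by a target that does not start with `#`). Whether other inputs can trigger the Bus error is unknown, and `find_suspicious_links` is a hypothetical helper, not part of multimarkdown.

```python
import re

# Empty link text "[]" followed by a reference "[...]" or inline "(...)" target.
# The lookbehind keeps image syntax ("![...]") out of the match.
EMPTY_TEXT_LINK = re.compile(r"(?<!!)\[\]([\[(])([^\])]*)[\])]")

def find_suspicious_links(text):
    """Return (line number, link) pairs for empty-text links whose target lacks '#'."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in EMPTY_TEXT_LINK.finditer(line):
            target = match.group(2)
            if not target.startswith("#"):
                hits.append((lineno, match.group(0)))
    return hits
```

Running it over the source file before calling multimarkdown gives a line number to look at instead of a bare crash.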

RE: MultiMarkdown 3.0a7 released; Ready for beta?

Hi Fletcher

Somehow this table doesn't get recognized by MMD3:

[US-Produktivitätszuwachs durch Einführung der IT][tab:produktivitaetszuwachsit]
|                               | 1974–1990 | 1991–1995 | 1996–2000 |
| ----------------------------- | --------: | --------: | --------: |
Computer (Pro-Kopf-Wachstum)    |      30.4 |      54.6 |      56.3 |
Dampfkraft (Pro-Kopf-Wachstum)  |       3.8 |       2.4 |      23.6 |
Elektrizität (Pro-Kopf-Wachstum)|           |      28.2 |      47.0 |

Unfortunately I couldn't work out why, I suspect the problem lies somewhere in the text of the first column.

Greetings,
Oliver

PS: This table is merely an example (e.g. the year ranges only apply to the first row). If you came here through some search engine and want more information on the contents of the table see http://books.google.com/books?id=oLBHAAAAYAAJ

{\itshape ...} vs \textit{...}

Hi,

*...* should be converted to \textit{...} rather than {\itshape ...}, because \textit adds the italic correction automatically: a little extra space where required to keep, for example, a sloped f from bumping into the top of an upright I.

Thanks,

Need to add support to memoir features in XSLT

This is planned before 3.0 is finalized.

Using XSLT to modify HTML "de-obfuscates" email

Whenever an XSLT is applied to HTML output, it de-obfuscates the HTML entities, which puts email addresses back out into plain-text.

Anyone else with experience using XSLT who can help find a fix - much appreciated!
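
As a stopgap while a fix inside the XSLT itself is missing, the addresses could be re-encoded after the transformation. The sketch below is in Python and deliberately simplified: it uses plain decimal entities rather than whatever encoding the original obfuscation produces, the e-mail pattern is a rough assumption, and both function names are hypothetical.

```python
import re

# Rough pattern for an address, optionally preceded by "mailto:" (an assumption,
# not a full RFC 5322 matcher).
EMAIL = re.compile(r"(?:mailto:)?[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def to_entities(text):
    """Encode every character as a decimal HTML entity."""
    return "".join(f"&#{ord(ch)};" for ch in text)

def reobfuscate_emails(markup):
    """Replace each plain-text address in the XSLT output with its entity form."""
    return EMAIL.sub(lambda m: to_entities(m.group(0)), markup)
```

Browsers decode the entities again when rendering, so the link keeps working while the address no longer appears verbatim in the file.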

Crash with Tables

I have been experiencing crashes in peg-mmd with tables, and have managed to isolate a repeatable case.

A document with only this table will cause the crash:

[April Schedule]  
   | Chair of Meeting | Cleanup after Meeeting |  
-- | ---------------- | ---------------------- |  
03 | Lee Smith        | -                      |  
10 | Lee Smith        | -                      |  
17 | Lee Smith        | -                      |  
24 | -                | -                      |

This however, works fine:

   | Chair of Meeting | Cleanup after Meeeting |  
-- | ---------------- | ---------------------- |  
03 | Lee Smith        | -                      |  
10 | Lee Smith        | -                      |  
17 | Lee Smith        | -                      |  
24 | -                | -                      |  
[April Schedule]

This also works fine:

[April Schedule]  
   | Chair of Meeting | Cleanup |  
-- | ---------------- | ------- |  
03 | Lee Smith        | -       |  
10 | Lee Smith        | -       |  
17 | Lee Smith        | -       |  
24 | -                | -       |

I hope this helps track it down. I can provide a crash log, if that would help.

Thanks,
Rob

Mac Installer limited to 64 bit 10.6 machines?

It appears that the current binary won't run on OS X 10.5 machines, or on 32-bit machines.

I'm working on a few ways to try and create an installer for older machines/OS installs. What I would really like is to figure out how to make a single binary that would also run on older machines without losing performance gains of being 64 bit (if they even exist).

Anyone out there smarter than me with advice to offer - I'll take it! ;)

Wrap for CPAN?

I just noticed that MultiMarkdown.pl doesn't handle nested definition lists very well, but peg-multimarkdown does it perfectly. I'll file a bug report for MultiMarkdown.pl, but since this is already here, and I'm writing Perl apps that use Text::MultiMarkdown from CPAN, I'm wondering if anyone has the TUITs to wrap peg-multimarkdown in a CPAN distribution. I'd love to be able to install peg-multimarkdown and then install Text::PegMultiMarkdown from CPAN and just use it in my Perl apps…

Is this on a to-do list, perhaps?

Thanks,

David

Visual enhancement: spacing between a table and its caption

The following line in the latex output for tables:

\begin{tabular}{@{}ll@{}} \\ \toprule

should be

\begin{tabular}{@{}ll@{}} \toprule

to reduce extra blank space between captions and tables.

\bottomrule in middle of tables with multiple sections

Hi,

Nit-picking: the \bottomrule before Category B should be a \midrule for better spacing and line width.

$ cat tableParts.txt 
[caption][label]
| Category | Item   | Price ($) |
|----------|--------|-----------|
| A        | Small  | 1         |
|          | Large  | 10        |

| B        | Small  | 100       |
|          | Large  | 1000      |

$ multimarkdown -t latex tableParts.txt 
\begin{table}[htbp]
\begin{minipage}{\linewidth}
\setlength{\tymax}{0.5\linewidth}
\centering
\small
\caption{caption}
\label{label}
\begin{tabular}{@{}lll@{}} \\ \toprule
Category &Item &Price (\$)  \\
\midrule
A &Small &1  \\
&Large &10  \\

\bottomrule
B &Small &100  \\
&Large &1000  \\

\bottomrule

\end{tabular}
\end{minipage}
\end{table}
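
The requested output can be approximated by post-processing the generated LaTeX. The following is a minimal sketch in Python, assuming each rule sits on its own line as in the output above. `fix_inner_bottomrules` is a hypothetical helper and says nothing about how the converter itself should be changed.

```python
def fix_inner_bottomrules(tex):
    """Turn every \\bottomrule except the last one in each tabular into \\midrule."""
    out = []
    pending = None  # index in `out` of the most recent bottomrule in this tabular
    for line in tex.splitlines():
        stripped = line.strip()
        if stripped == "\\bottomrule":
            if pending is not None:
                # A later bottomrule exists, so the earlier one was an inner rule.
                out[pending] = out[pending].replace("\\bottomrule", "\\midrule")
            pending = len(out)
        elif stripped.startswith("\\end{tabular}"):
            # The last bottomrule before the end of the table stays as it is.
            pending = None
        out.append(line)
    return "\n".join(out)
```

Applied to the table above, the rule before Category B becomes `\midrule` and the final one remains `\bottomrule`.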

latex figure labels generated from captions

Description

Figure labels (in LaTeX) are generated from the captions rather than from the MultiMarkdown label (see example below). As a result, \autorefs are never resolved and '??' always gets printed in the PDF.

Example

Input

(slightly modified from Sample-Document/sample.txt)

As an example, here is an image from my website---[Nautilus Star](#nautilusstar).  If you have a local copy of the image, you can include the image in a pdf.

![Nautilus Star][]

[Nautilus Star]: Nautilus_Star.png "Nautilus Star Caption" width=307px height=250px

Expected result:

As an example, here is an image from my website---Nautilus Star (\autoref{nautilusstar}). If you have a local copy of the image, you can include the image in a pdf.

\begin{figure}
\begin{center}
\includegraphics[keepaspectratio,width=307pt,height=250pt]{Nautilus_Star.png}
\end{center}
\caption{Nautilus Star Caption}
\label{nautilusstar}
\end{figure}

Actual result:

As an example, here is an image from my website---Nautilus Star (\autoref{nautilusstar}). If you have a local copy of the image, you can include the image in a pdf.

\begin{figure}
\begin{center}
\includegraphics[keepaspectratio,width=307pt,height=250pt]{Nautilus_Star.png}
\end{center}
\caption{Nautilus Star Caption}
\label{nautilusstarcaption}
\end{figure}

Configuration

peg-multimarkdown v3.0a5
Mac OSX 10.6.5
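
The expected behaviour amounts to deriving the label from the bracketed reference name and ignoring the caption. A minimal sketch in Python follows. The normalization rule (lowercase, ASCII letters and digits only) is a simplifying assumption that happens to reproduce the labels in this report, and both function names are hypothetical.

```python
import re

def label_from_text(text):
    """Lowercase and keep only ASCII letters and digits (simplified label rule)."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

def figure_label(reference_name, caption):
    """Build the \\label for a figure from its reference name."""
    # The caption is deliberately ignored: the label has to match what a link
    # such as [Nautilus Star](#nautilusstar) points at.
    return label_from_text(reference_name)
```

With the sample input, `figure_label("Nautilus Star", "Nautilus Star Caption")` yields the label that `\autoref{nautilusstar}` expects, whereas normalizing the caption yields the unresolved `nautilusstarcaption`.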

"Skipped" levels not converted to OPML

When converting an MMD text file to OPML with the mmd binary, each level only contains its direct children. For example:

# First Level #

## Second Level  ##

### Third Level ###

## Another Second Level ##

#### Fourth Level ####

When this is converted to OPML, the "Fourth Level" item will be deleted, since it skips a level from its parent, "Another Second Level".

It's possible to fix this, but it's going to take a more complicated algorithm than what I currently have and it's not a high priority for me to fix at the moment.

As always, suggestions welcome.
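
One possible approach is to attach each header to the nearest preceding header with a smaller level, instead of requiring the level to be exactly one deeper. The sketch below shows this in Python with a stack. It is an illustration of that idea only, not the algorithm the converter uses, and `build_outline` with its dictionary layout is hypothetical.

```python
def build_outline(headers):
    """Nest (level, title) pairs; each header attaches to the nearest shallower header."""
    root = {"title": None, "level": 0, "children": []}
    stack = [root]  # path from the root to the most recently added header
    for level, title in headers:
        # Pop until the top of the stack is shallower than this header.
        # The root has level 0, so it is never popped for levels >= 1.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"title": title, "level": level, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root
```

For the example above, "Fourth Level" ends up as a child of "Another Second Level" rather than being dropped. Whether a skipped level should also produce an empty placeholder item in the OPML is a separate design choice that this sketch leaves open.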

memory leaks

I was able to fix a few memory leaks, but there are still more. I used John MacFarlane's setup for valgrind and was able to fix most of the leaks in the regular code, but there are still a bunch in the parser itself when I test against a MMD file.

This is being worked on in the development branch.

See the wiki for parts of the parser that seem to have leaks. Please feel free to help me fix the errors!