asciidoctor / asciimath

Asciimath parser

License: MIT License

Ruby 99.19% CSS 0.81%

asciimath's Issues

Expressions containing `|` are not parsed correctly

Expressions like R(alpha_(K+1)|x) are not parsed correctly. | is treated as both a left and a right paren that can be closed by any other right or left paren. This is not correct: | should only close, and be closed by, a matching |.
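
A minimal sketch of the pairing rule requested above, not the gem's parser: delimiters are matched through a lookup table in which | can only be closed by another |. The names BRACKET_PAIRS and closes? are hypothetical.

```ruby
# Hypothetical lookup table: each opening delimiter maps to the only
# delimiter allowed to close it. A vertical bar pairs only with itself.
BRACKET_PAIRS = {
  '(' => ')',
  '[' => ']',
  '{' => '}',
  '|' => '|'
}.freeze

# Returns true when `closer` is a valid match for `opener`.
def closes?(opener, closer)
  BRACKET_PAIRS[opener] == closer
end

closes?('(', ')') # => true
closes?('|', '|') # => true
closes?('(', '|') # => false
closes?('|', ')') # => false
```

With a rule like this, the lone | in R(alpha_(K+1)|x) has no partner and would have to be handled as an ordinary operator rather than as a bracket.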

Fix warning of uninitialized instance variable

Fix the following warning caused by an uninitialized instance variable:

lib/asciimath/parser.rb:76: warning: instance variable @push_back not initialized

This shows up when running the main Asciidoctor test suite.
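
A minimal sketch of the usual fix for this kind of warning, assuming the variable is read before it is first assigned: give it a value in the constructor. SketchTokenizer is a hypothetical stand-in, not the gem's tokenizer.

```ruby
# Hypothetical tokenizer skeleton; only the @push_back handling matters here.
class SketchTokenizer
  def initialize(tokens)
    @tokens = tokens.dup
    @push_back = nil # assigned up front so reads never hit an unset variable
  end

  # Returns the pushed-back token if there is one, else the next token.
  def next_token
    if @push_back
      token = @push_back
      @push_back = nil
      token
    else
      @tokens.shift
    end
  end

  # Makes `token` the next value returned by next_token.
  def push_back(token)
    @push_back = token
  end
end
```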

Support for MathML output for HTML backend?

After doing some reading, it seems that Asciidoctor does have support for STEM macro to MathML output, but it's only supported if you have the asciimath gem and are using the DocBook backend. I see no reason why MathML cannot be generated for use in the HTML backend if we standardize a variable like stem-output.

I want MathML/HTML support as people with browsers that support MathML shouldn't have to download multiple images (Asciidoctor Mathematical) or execute JavaScript (MathJax).

P.S.: If this issue is in the wrong repository and should be in asciidoctor instead, feel free to move it.

Percent sign should be an operator (<mo>)

Problem description

Given example:

AsciiMath.parse("40%").to_mathml

Expected result:

<math><mn>40</mn><mo>%</mo></math>

Actual result:

<math><mn>40</mn><mi>%</mi></math>

That is it includes <mi>%</mi> instead of <mo>%</mo>.

Rationale

It is a suffix operator or part of number notation, but certainly not an identifier.
https://github.com/asciimath/asciimathml converts percent signs to <mo>%</mo>.

Can you provide a fix?

I think so.
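
A rough sketch of the expected classification, assuming a simplified token model (digits, letters, everything else). The helper names are hypothetical and the sketch does no XML escaping.

```ruby
# Hypothetical classifier: picks a MathML token element for a raw token.
def mathml_tag_for(token)
  case token
  when /\A\d+(\.\d+)?\z/ then 'mn'  # numbers
  when /\A[[:alpha:]]+\z/ then 'mi' # identifiers
  else 'mo'                         # everything else, including '%'
  end
end

# Wraps each token in its element; tokens are assumed to be XML-safe.
def to_mathml_sketch(tokens)
  body = tokens.map do |t|
    tag = mathml_tag_for(t)
    "<#{tag}>#{t}</#{tag}>"
  end.join
  "<math>#{body}</math>"
end

to_mathml_sketch(%w[40 %]) # => "<math><mn>40</mn><mo>%</mo></math>"
```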

Crash when italics symbol `ii` used independently

In AsciiMath 1.x this asciimath expression worked:

E_{{ii}}

In 2.x this crashes:

asciimath 'E_{{ii}}'

=>

NoMethodError: undefined method `parent' for nil:NilClass
  .rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/asciimath-2.0.1/lib/asciimath/ast.rb:109:in `add'
  .rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/asciimath-2.0.1/lib/asciimath/ast.rb:228:in `initialize'
  .rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/asciimath-2.0.1/lib/asciimath/ast.rb:35:in `new'
  .rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/asciimath-2.0.1/lib/asciimath/ast.rb:35:in `unary'
  .rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/asciimath-2.0.1/lib/asciimath/parser.rb:617:in `parse_simple_expression'
  .rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/asciimath-2.0.1/lib/asciimath/parser.rb:546:in `parse_intermediate_expression'

It crashes because it considers ii the italics operator and instantiates it as one Unary object (ii), but it is actually a set of two Identity objects (i and i).

The workaround in 2.x is to add a space in between, i.e. 'E_{{i i}}':

asciimath 'E_{{i i}}'

<math><msub><mi>E</mi><mrow><mo>{</mo><mrow><mi>i</mi><mi>i</mi></mrow><mo>}</mo></mrow></msub></math>

AsciiMath.org supports E_{{ii}} without problem.

This problem probably affects many :symbol entries that utilize only letters in def self.add_default_parser_symbols(b).

tilde(x) produces latex \~{x} instead of \tilde{x}

Hi!

When running asciimath latex 'tilde(x)' I expect the output to be \tilde{x}. This is what my latex compiler outputs as x with a tilde over it and is what is documented at https://asciimath.org

The current output is \~{x} instead, which produces a regular x for me. If it's possible to add some configuration to the latex preamble that converts it to x with a tilde over it then I'd appreciate that information. Otherwise, I'd like this asciimath converter to output \tilde{x} instead.

Thank you.

function names should use <mi> tag

Hello,
If an expression contains a function (e.g. sin, cos, ...), it should be rendered with an <mi> tag instead of <mo>.

Example:

sin(x)
which is encoded as

<math><mo>sin</mo><mrow><mo>(</mo><mi>x</mi><mo>)</mo></mrow></math>

but it should be something like:

<math xmlns="http://www.w3.org/1998/Math/MathML"> <mstyle displaystyle="true"> <mrow> <mi>sin</mi> <mrow> <mo>(</mo> <mi>x</mi> <mo>)</mo> </mrow> </mrow> </mstyle> </math>

Special characters get double encoded in truffleruby

When running on truffleruby, the to_mathml method encodes special characters twice.

Consider the following input text:

(::AsciiMath.parse '+-x').to_mathml

This produces:

<math><mo>&amp;#xB1;</mo><mi>x</mi></math>

instead of the expected:

<math><mo>&#xB1;</mo><mi>x</mi></math>

It's possible this is a bug in truffleruby itself.
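
The doubled output is what escaping the same text twice would produce. Whether that is what actually happens on truffleruby is not established by this report; the standalone sketch below only shows the effect, and escape_once is a hypothetical helper, not the gem's code.

```ruby
# Escapes XML-special characters and non-ASCII code points once.
def escape_once(text)
  text.each_char.map do |ch|
    case ch
    when '&' then '&amp;'
    when '<' then '&lt;'
    when '>' then '&gt;'
    else ch.ord > 127 ? format('&#x%X;', ch.ord) : ch
    end
  end.join
end

once  = escape_once("\u00B1") # => "&#xB1;"
twice = escape_once(once)     # => "&amp;#xB1;"
```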

Request normal font shift

We are using AsciiMath to enter expressions with units, metanorma/bipm-si-brochure#54, and that requires the option of a font shift to normal, upright font.

This is not present in most instances of Asciimath, but as I found in #7, https://runarberg.github.io/ascii2mathml/ (now offline) had rm (taken from TeX, although it properly refers to serifed font, not upright style).

The option to force upright font is necessary, we feel, and we request rm as an addition to this gem's Asciimath:

asciimath/lib/asciimath/parser.rb :

b.add('sfbi', :normal, :unary)

asciimath/lib/asciimath/latex.rb :

:normal => "\\mathrm", (or: \\mathup)

asciimath/lib/asciimath/markup.rb :

b.add(:sans_serif_bold_italic, :normal, :font)

How to use Russian Cyrillic letters in asciimath?

I want to use Russian Cyrillic letters in [asciimath] blocks in Asciidoctor, but I noticed that this causes asciimath to choke with the following error:

invalid byte sequence in UTF-8

Here is simple example:

[asciimath]
++++
Скорость=(Расстояние)/(Время)
++++
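
One common way this kind of error arises in Ruby in general is cutting a UTF-8 string at a byte offset instead of a character offset. The report does not establish that this is the cause in the gem; the sketch below only illustrates the difference, using core String methods.

```ruby
text = 'Скорость'

text.valid_encoding?        # => true
text.bytesize > text.length # => true (each Cyrillic letter takes 2 bytes in UTF-8)

# Cutting by bytes can split a character and leave an invalid string.
broken = text.byteslice(0, 1)
broken.valid_encoding?      # => false

# Cutting by characters keeps the string valid.
first = text[0]             # => "С"
first.valid_encoding?       # => true
```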

From asciimath to latex: Translate `del` as `\partial` instead of `\del`

Currently del translates to \del, which isn't defined on my compiler (unlike \partial).

Create interfaces for extensions and customization

The changes we've been working on recently should make it pretty easy to make the parser and the renderers extensible. The code is already there; we just need to create some facilities for users to use it, and document it.

My proposal is the following:

Make SymbolsTable render-agnostic. The idea is that SymbolsTable would have a column for MathML, a column for LaTeX and a column for HTML, as well as a column for the AsciiMath expression and a column for the Ruby symbol that represents it:

AsciiMath | Symbol | MathML | HTML | LaTeX
aleph | :aleph | ℵ | ℵ | \aleph
... | ... | ... | ... | ...

If a cell is left empty, the renderers would use a (render-specific) default strategy to render the symbol. This would allow users to create extensions by using custom symbol tables.

Want the parser and the renderers to handle a custom symbol of yours? Simply create a row for it in your symbols table.
Create an optional parameter for AsciiMath.parse that represents the symbols table that should be used by the parser. Its default value should be the symbols table currently used by the parser.
Create an optional parameter that represents the ColorTable that should be used by the parser. Its default value should be the HTML standard color names.
Create an optional parameter in MarkupBuilder.initialize that represents which SymbolsTable should be used when rendering the markup. Its default value should be the default SymbolsTable used by each renderer.
Create an optional parameter in MarkupBuilder.initialize that represents a map between RGB values and color names.
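
A rough sketch of what such a table could look like, assuming one row per symbol and a block as the renderer-specific default strategy. SymbolRow, SketchSymbolsTable, and their methods are hypothetical names, not the gem's API.

```ruby
# Hypothetical row type: one AsciiMath symbol and its per-renderer output.
SymbolRow = Struct.new(:asciimath, :symbol, :mathml, :html, :latex, keyword_init: true)

class SketchSymbolsTable
  def initialize
    @rows = {}
  end

  # Adds a row; any renderer cell may be left empty (nil).
  def add(asciimath, symbol, mathml: nil, html: nil, latex: nil)
    @rows[symbol] = SymbolRow.new(asciimath: asciimath, symbol: symbol,
                                  mathml: mathml, html: html, latex: latex)
    self
  end

  # Returns the cell for `renderer`, or yields the row to a
  # renderer-specific default strategy when the cell is empty.
  def render(symbol, renderer)
    row = @rows.fetch(symbol)
    row[renderer] || yield(row)
  end
end

table = SketchSymbolsTable.new
table.add('aleph', :aleph, mathml: "\u2135", html: "\u2135", latex: '\\aleph')
table.add('ker', :ker) # no cells filled in

aleph_latex = table.render(:aleph, :latex) { |row| "\\text{#{row.asciimath}}" }
ker_latex   = table.render(:ker, :latex)   { |row| "\\text{#{row.asciimath}}" }
```

Here aleph_latex comes from the table cell, while ker_latex falls back to the block because its LaTeX cell was left empty.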

@pepijnve What do you think?

Update AST.adoc

phi/varphi option

There is a split in which character is used for phi and which for varphi.

Unicode has U+03C6 as phi and U+03D5 as the variant of phi
TeX has them the other way around; so did Unicode before 1998
Asciimath, MathJax, AsciiMathML all follow TeX as a default, and they all provide an option to give Unicode instead of TeX behaviour (fixphi = false, true by default): see e.g. asciimath/asciimathml#14, mathjax/MathJax#353, http://math.chapman.edu/~jipsen/asciimathjax/, http://asciimath.org
I recommend that this gem follow suit, and have a fixphi option, true by default, which assigns phi and varphi the values the other way around by default
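
A minimal sketch of the requested option, following the mapping described above. phi_symbols is a hypothetical helper; only the fixphi name and its proposed default come from this issue.

```ruby
# Code points per the Unicode standard.
UNICODE_PHI        = "\u03C6" # GREEK SMALL LETTER PHI
UNICODE_PHI_SYMBOL = "\u03D5" # GREEK PHI SYMBOL (the variant form)

# With fixphi true (the proposed default) the TeX convention applies;
# with fixphi false the Unicode naming applies.
def phi_symbols(fixphi: true)
  if fixphi
    { 'phi' => UNICODE_PHI_SYMBOL, 'varphi' => UNICODE_PHI }
  else
    { 'phi' => UNICODE_PHI, 'varphi' => UNICODE_PHI_SYMBOL }
  end
end
```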

Buggy tokenisation of text(), font()

The contents of text() and font() are being parsed with the same tokenisation as the rest of AsciiMath.

As a result, text("K") is being rendered as <mtext><mtext>K</mtext></mtext>: text renders as <mtext>...</mtext>, and "K" as <mtext>K</mtext>.

Similarly, other text within text() and font() is being tokenised, which introduces whitespaces: text(“K”) (with smart quotes) is being rendered as <mtext><mrow><mi>\u201C</mi><mi>K</mi><mi>\u201D</mi></mrow></mtext>

And of course, if any of the text happens to be an AsciiMath instruction, it is treated as an AsciiMath instruction: text(int x) renders as <mtext><mrow><mo>∫</mo><mi>x</mi></mrow></mtext>. In a text we were about to go live with, text("AA") is ending up being rendered as ∀!

This is completely wrong. The contents of text() and font() are text, and must be treated as text. The culprit is the code:

def parse_simple_expression(tok, depth)
      t1 = tok.next_token
...
when :unary, :font
          s = parse_simple_expression(tok, depth)
          {:type => t1[:type], :s => s, :operator => t1[:value]}

That is only valid where t1[:value] is sqrt. For text and :font, a different tokeniser needs to kick in, which grabs all text between the following :lparen token and its matching :rparen token, and returns it to :s as just text. Something like:

when :unary_text, :font
          s = parse_text_expression(tok)
....

def parse_text_expression(tok)
  ret = ""
  while (t1 = tok.next_token)
    case t1[:type]
    when :lparen, :lrparen
      ret = ""
    when :rparen, :rlparen
      break
    else
      ret << t1.text # doesn't exist yet: get the literal token, not its symbol lookup
    end
  end
  ret
end
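
As a standalone illustration of "grab everything up to the matching paren as plain text", here is a sketch that works on the raw source string instead of the token stream. raw_text_argument is a hypothetical helper, and it assumes the character at open_index is an opening parenthesis.

```ruby
# Returns [raw_text, index_after_closing_paren] for the parenthesized
# argument starting at open_index, or nil if the parenthesis is unclosed.
def raw_text_argument(source, open_index)
  depth = 0
  (open_index...source.length).each do |i|
    case source[i]
    when '(' then depth += 1
    when ')'
      depth -= 1
      return [source[(open_index + 1)...i], i + 1] if depth.zero?
    end
  end
  nil
end

raw_text_argument('text(int x) + 1', 4) # => ["int x", 11]
raw_text_argument('text(f(x))', 4)      # => ["f(x)", 10]
```

Nothing inside the parentheses is looked up in the symbol table, so int and AA would stay literal text.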

Fix the spacing of some standard AsciiMath operators

Some standard AsciiMath operators - such as Sin - don't really have a direct LaTeX equivalent. As of now, the solution I came up with was to output \text{Sin}, which means the output for Sin x and "Sin" x is the same.

The rendered output looks like the following:

One could then expect that Sin x (which is outputted as \text{Sin} x) would be rendered as:

However, this is not the case. This is how \text{Sin} x is rendered in LaTeX:

This could be fixed by outputting a \; between \text{Sin} and x (as in \text{Sin} \; x), but this is a bodge, a hack. This isn't a robust strategy. For example, if we were to adopt this strategy, the expression Sin (x) would be rendered as:

This is incorrect. The correct output would be:

It gets worse when dealing with infix operators - such as and, or and if. I'm not quite sure on how to deal with this. @davidfarmer @pepijnve I'd like your input.

Keep in mind I'm doing my best to make the LaTeX output as human-readable as possible. This could be easily fixed by using low-level TeX hacks, but as of now, my intention is to avoid them.

Optionally leave Unicode characters unescaped in MathML output

This is a feature request. I can provide implementation, but I wanted to discuss things first.

MathML does not enforce any particular encoding, UTF-8 is legal, and non-ASCII characters can be used without escaping them [W3C].

However, this gem always encodes non-ASCII characters using numeric XML character references (e.g. &#xE9;):

asciimath/lib/asciimath/mathml.rb

Lines 234 to 249 in 3a4bbab

    def append_escaped(text)
      text.each_codepoint do |cp|
        if cp == 38
          @mathml << "&amp;"
        elsif cp == 60
          @mathml << "&lt;"
        elsif cp == 62
          @mathml << "&gt;"
        elsif cp > 127
          @mathml << "&#x#{cp.to_s(16).upcase};"
        else
          @mathml << cp
        end
      end
    end
  end

This is safer as it never depends on parent document's encoding, but on the other hand it hampers readability.

My suggestion is to add :escape_non_ascii option to Expression#to_mathml method which would disable this kind of escaping (of course <, >, and & will be escaped anyway). This option should default to false.

Perhaps a similar option could be added to the Expression#to_html method.
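
A standalone sketch of how such a switch could behave. escape_xml_text is a hypothetical function, not the gem's method, and the keyword is deliberately required here so the sketch takes no position on the default.

```ruby
# `escape_non_ascii` is the option name proposed above.
def escape_xml_text(text, escape_non_ascii:)
  out = +''
  text.each_codepoint do |cp|
    case cp
    when 38 then out << '&amp;'
    when 60 then out << '&lt;'
    when 62 then out << '&gt;'
    else
      if cp > 127 && escape_non_ascii
        out << "&#x#{cp.to_s(16).upcase};"
      else
        out << [cp].pack('U') # append the character itself as UTF-8
      end
    end
  end
  out
end

escaped   = escape_xml_text("caf\u00E9 & <b>", escape_non_ascii: true)
# => "caf&#xE9; &amp; &lt;b&gt;"
unescaped = escape_xml_text("caf\u00E9 & <b>", escape_non_ascii: false)
# => "café &amp; &lt;b&gt;"
```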

Releasing the recent changes to RubyGems

I think we're getting closer to the point of releasing the recent work we've done to RubyGems. That would finally allow asciidoctor-mathematical and asciidoctor-latex to support AsciiMath.

@pepijnve Is there something you'd like to review before releasing the changes?

stack level too deep

The following line in my asciidoctor document produces the trace below:
stem:[TD_t(s_t, a_t)=R(s_t, a_t)+\gamma \max_aQ(s_{t+1},a)-Q(s_t, a_t)]

asciidoctor-pdf -v -r asciidoctor-mathematical -a mathematical-format=svg WAD.adoc --trace
Traceback (most recent call last):
9775: from /usr/local/bin/asciidoctor-pdf:23:in `<main>'
9774: from /usr/local/bin/asciidoctor-pdf:23:in `load'
9773: from /var/lib/gems/2.7.0/gems/asciidoctor-pdf-1.6.0/bin/asciidoctor-pdf:27:in `<top (required)>'
9772: from /var/lib/gems/2.7.0/gems/asciidoctor-2.0.15/lib/asciidoctor/cli/invoker.rb:113:in `invoke!'
9771: from /var/lib/gems/2.7.0/gems/asciidoctor-2.0.15/lib/asciidoctor/cli/invoker.rb:113:in `each'
9770: from /var/lib/gems/2.7.0/gems/asciidoctor-2.0.15/lib/asciidoctor/cli/invoker.rb:130:in `block in invoke!'
9769: from /var/lib/gems/2.7.0/gems/asciidoctor-2.0.15/lib/asciidoctor/convert.rb:189:in `convert_file'
9768: from /var/lib/gems/2.7.0/gems/asciidoctor-2.0.15/lib/asciidoctor/convert.rb:189:in `open'
... 9763 levels...
4: from /var/lib/gems/2.7.0/gems/asciimath-2.0.2/lib/asciimath/ast.rb:99:in `each'
3: from /var/lib/gems/2.7.0/gems/asciimath-2.0.2/lib/asciimath/latex.rb:385:in `block in is_very_small'
2: from /var/lib/gems/2.7.0/gems/asciimath-2.0.2/lib/asciimath/latex.rb:385:in `is_very_small'
1: from /var/lib/gems/2.7.0/gems/asciimath-2.0.2/lib/asciimath/latex.rb:385:in `all?'
/var/lib/gems/2.7.0/gems/asciimath-2.0.2/lib/asciimath/ast.rb:99:in `each': stack level too deep (SystemStackError)

Replacing the {t+1} with t prevents the error from occurring.
This occurs only when using asciidoctor-pdf. With asciidoctor it works fine.

Crash on truffleruby if input text contains spaces

The AsciiMath.parse method crashes on truffleruby if the input text contains spaces.

The following works:

AsciiMath.parse 'a>b'

The following does not work:

AsciiMath.parse 'a > b'

Instead, it fails with the following exception:

asciimath-1.0.7/lib/asciimath/parser.rb:473:in `parse_expression':
  undefined method `[]' for nil:NilClass (NoMethodError)

AST Redesign

An issue for discussing the current AST redesign effort. The current proposal is the following:

Symbols should be identified by tokens (such as :aleph) instead of characters (such as ℵ). That way each builder can store an internal table that maps these tokens to their intended representation (like what I did with LatexBuilder::CONSTANTS).
"Operators" (:operator) and "identifiers" (:identifier) should not be distinguished. Their syntax is the same. I understand that this distinction is pretty relevant for the MathML and HTML builders, but those could deal with it by storing internal reference tables (an array or a set that stores all symbol identifiers that should be treated as operators).
"Operations" (:unary, :binary, :trinary) should be unified and expressed as Hashes that store an array of operands. Each builder would then be responsible for segregating them appropriately. The order of the operands should be the same as the order in which they are expressed in asciimath.
Operations listed in asciimath.org (such as hat or vec) should not be expressed in terms of lower-level operations (such as :over or :under). Each builder would be responsible for converting the more specific representation to the more general one (the one expressed in terms of lower-level primitives) as necessary.

Matrices with complex expressions are not recognised correctly

For instance in s'_i = {(- 1, if s_i > s_(i + 1)),( + 1, if s_i <= s_(i + 1)):} the rhs of the equality is not recognised as a matrix. As a result the output is not correct.

Matrix requirement on bracketing too strict

((a),(b),(c)) is correctly rendered as a matrix. However ((a, b, c)) should also be recognised as a matrix, per http://asciimath.org, and is not: it is being rendered as literal ((a, b, c)) instead.

I'm doing a PR to address this.

Add "ker" as a standard symbol?

I think we could add the ker symbol to the symbols table. It's pretty common in Algebra, just like dim (which is included in the symbols table).

The expression ker would then be parsed as symbol('ker', :ker). Here is an example of its use in context:

ker T = { v in V : T(v) = 0_W }

Anomalies in symbols table

This refers to the symbols table in AST.adoc .

For example, the first of these two lines presumably should end in sinh:

sinh | :sinh |   | Sinh
Sinh | :Sinh |   | Sinh

There are some other similar examples (Cosh, Tanh, Coth, Sech, Csch)

And the capitalization at the end of this line does not make sense to me:

arcsin | :arcsin |   | ARCsin

Some rows in that table have math which does not make sense to me:
Lim (what is lim with a capital L?)
FLoor (F and L are capital)
Norm, Ceil (again, why capital?)
same for the 7 lines from Sqrt to UNder

Is there an official reference to what should be in that table? That might help
me understand which of the above are errors and which are intentional. If this
is meant as an alternate representation of the data in the AMsymbols of
ASCIIMathML.js , then that explains some of my questions.

Add support for `cancel`

Add support for augmented matrices

Matrices containing a vbar in the same column of each row like [[a,b,|,c],[d,e,|,f]] should result in a column line. At the moment this case is not treated specially and results in a vbar operator in each cell instead.
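
A sketch of the detection step, assuming the matrix is already available as an array of rows of cell strings. split_augmented is a hypothetical helper; it treats bars as column lines only when they sit at the same index in every row, and it expects at least one row.

```ruby
# Returns [rows_without_bars, line_positions]. Each position p means
# "draw a vertical line before column p" of the stripped matrix.
def split_augmented(rows)
  bar_columns = (0...rows.first.length).select do |c|
    rows.all? { |row| row[c] == '|' }
  end
  return [rows, []] if bar_columns.empty?

  stripped = rows.map do |row|
    row.reject.with_index { |_, c| bar_columns.include?(c) }
  end
  # Each removed bar shifts the columns after it one place to the left.
  positions = bar_columns.each_with_index.map { |c, k| c - k }
  [stripped, positions]
end

split_augmented([%w[a b | c], %w[d e | f]])
# => [[["a", "b", "c"], ["d", "e", "f"]], [2]]
```

Rows whose bars do not line up are returned unchanged, so they would still render as vbar operators.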

Consider supporting an alternative to AsciiMath

In a previous issue was written:

I'm currently studying markup languages that can serve as a viable alternative to LaTeX in the academic document preparation space.

Let me describe some thoughts about this.

First, there is no existing viable alternative. But I think it is possible to invent one,
which I started doing a couple years ago. I had a student working on a project,
which started out as a way to convert AsciiMath to LaTeX. (There exists such a
converter, but it does not output LaTeX in a form a human would write.) After
working for a while, we decided to rethink the AsciiMath syntax and create a
new (but similar-looking) math markup language. We got pretty far, but then the
student graduated and I have not actively worked on that project (but I continue
to think about it and intend to return to it).

I named the project "Space Math" and started sketching a retro 1950's logo.
The name comes from the critical role that the space character plays. As
mentioned elsewhere, you can disambiguate function application and implied
multiplication: f(x) means function application because there is no space
between the f and the (x).

We also found the need to introduce some Python-like syntax for expressions
that naturally take up multiple lines. The markup is intended to be human
readable and human writable, for example:

abs(x) = cases:
    x if x >= 0
   -x if x < 0

Note also the abs keyword. This is preferable to |.| because out of context
the meaning of |A| is ambiguous. The use of multiple lines was considered a
deal-breaker for AsciiMath, but for me it is a deal-breaker that there is no good
way to write the above construction, or multiline equations or derivations,
in AsciiMath.

I don't claim that we totally figured out everything, but we definitely were on the
right track and were able to handle a lot of things that didn't work well in
AsciiMath.

I can dig up my old material if you think this is worth considering.

I should also mention that my use case for this was the PreTeXt authoring system,
a project in which I am actively involved: https://pretextbook.org .

Escaped space not converted to no-break space

An escaped space (a backslash followed by a space) should be converted to a no-break space. However, currently it's being left unprocessed.

Here's an example:

{ x \ : \ x in A ^^ x in B }

Expected output:

<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>{</mo><mi>x</mi><mo>&#xA0;</mo><mo>:</mo><mo>&#xA0;</mo><mi>x</mi><mo>&#x2208;</mo><mi>A</mi><mo>&#x2227;</mo><mi>x</mi><mo>&#x2208;</mo><mi>B</mi><mo>}</mo></mrow></math>

Actual output:

<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>{</mo><mi>x</mi><mi>\</mi><mi>:</mi><mi>\</mi><mi>x</mi><mo>&#x2208;</mo><mi>A</mi><mo>&#x2227;</mo><mi>x</mi><mo>&#x2208;</mo><mi>B</mi><mo>}</mo></mrow></math>
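
A simplified sketch of recognising the escape before ordinary tokenisation. scan_escaped_spaces is a hypothetical helper: it only separates backslash-space pairs from whitespace-delimited pieces and silently drops a backslash that is not followed by a space.

```ruby
NBSP_REFERENCE = '&#xA0;'

# Reports each backslash-space pair as a no-break space operator and
# every other whitespace-delimited piece as-is.
def scan_escaped_spaces(input)
  input.scan(/\\ |[^\s\\]+/).map do |piece|
    piece == '\\ ' ? [:operator, NBSP_REFERENCE] : [:other, piece]
  end
end

pieces = scan_escaped_spaces('x \ : \ x')
# => [[:other, "x"], [:operator, "&#xA0;"], [:other, ":"],
#     [:operator, "&#xA0;"], [:other, "x"]]
```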

LaTeX output support

The title of this issue is a bit misleading 😁️. Parsing LaTeX is definitely out of scope. However, I believe we should consider LaTeX as an output format. In other words, maybe we should implement a LatexBuilder class.

Besides making the library more useful for other projects, that would allow asciidoctor-latex and asciidoctor-mathematical to transpile asciimath blocks to LaTeX, bringing asciimath support to the LaTeX and PDF backends.

Looking at the official asciimath grammar specification, it's clear that asciimath could be unambiguously transpiled to LaTeX.

I'm available to work on this, but I don't quite understand what the parsed asciimath AST looks like. If you guys could help me with that, I'd be glad to contribute to the project.

accent attribute needs to be respected in processing, for Word

hat p currently renders into MathML as <mover><mi>p</mi><mo>^</mo></mover>. When this is converted into Word, the result is

instead of the desired

Inspecting the Word OOXML XSLT, it turns out that Word expects to see <mover accent="true"><mi>p</mi><mo>^</mo></mover> in order to render the equation with its <m:acc> tag. The hat operator of Asciimath is recognised as being of class :accent, but that class is not being propagated into the Asciimath parse.
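
A small sketch of the output side of this, assuming the builder is told whether the operator is of class :accent. mover is a hypothetical helper, not the gem's builder method.

```ruby
# Wraps `base` and `mark` in <mover>, adding accent="true" on request.
def mover(base, mark, accent: false)
  attrs = accent ? ' accent="true"' : ''
  "<mover#{attrs}>#{base}#{mark}</mover>"
end

mover('<mi>p</mi>', '<mo>^</mo>', accent: true)
# => "<mover accent=\"true\"><mi>p</mi><mo>^</mo></mover>"
```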

Parsing of matrices of a single column fails since 2.0.0

Parsing of this worked prior to 2.0.0.

((L - W + 1),(2))

Trace:

NoMethodError: undefined method `each' for #<AsciiMath::AST::Number:0x000055cc8328f1b8>
  /opt/hostedtoolcache/Ruby/2.6.6/x64/lib/ruby/gems/2.6.0/gems/asciimath-2.0.0/lib/asciimath/parser.rb:669:in `block in convert_to_matrix'
  /opt/hostedtoolcache/Ruby/2.6.6/x64/lib/ruby/gems/2.6.0/gems/asciimath-2.0.0/lib/asciimath/parser.rb:660:in `map'
  /opt/hostedtoolcache/Ruby/2.6.6/x64/lib/ruby/gems/2.6.0/gems/asciimath-2.0.0/lib/asciimath/parser.rb:660:in `convert_to_matrix'
  /opt/hostedtoolcache/Ruby/2.6.6/x64/lib/ruby/gems/2.6.0/gems/asciimath-2.0.0/lib/asciimath/parser.rb:600:in `parse_simple_expression'
  /opt/hostedtoolcache/Ruby/2.6.6/x64/lib/ruby/gems/2.6.0/gems/asciimath-2.0.0/lib/asciimath/parser.rb:545:in `parse_intermediate_expression'
  /opt/hostedtoolcache/Ruby/2.6.6/x64/lib/ruby/gems/2.6.0/gems/asciimath-2.0.0/lib/asciimath/parser.rb:519:in `parse_expression'
  /opt/hostedtoolcache/Ruby/2.6.6/x64/lib/ruby/gems/2.6.0/gems/asciimath-2.0.0/lib/asciimath/parser.rb:508:in `parse'
  /opt/hostedtoolcache/Ruby/2.6.6/x64/lib/ruby/gems/2.6.0/gems/asciimath-2.0.0/lib/asciimath/parser.rb:770:in `parse'

Will submit a corresponding PR.

Use objects instead of Hashes and Arrays to represent the AST

In order to enable more complex interpretation of the AST it can be helpful to be able to inspect the rest of the tree while generating output for one node. For instance to know if something is a function application or just an identifier it helps to be able to check sibling nodes.

As an example

+ f
+-+ (
  + x
  + )

is the application of function f to value x, and it might be desirable to render a &#x2061; (function application) between f and (.

On the other hand, in

+ sin
+-+ (
  + f
  + )

f is not a function.

In order to distinguish between the two we need to be able to 'see' that in the first case f is followed by a paren expression while in the second case it's the only element inside a paren expression and is not followed by some other identifier.

My hope is that proper AST objects that provide tree traversal methods will enable this type of more sophisticated reasoning in the code.
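
A rough sketch of the kind of traversal this would allow, assuming nodes know their parent and can ask for their next sibling. SketchNode and its methods are hypothetical, not the gem's AST classes, and the :seq and :paren markers are made up for the example.

```ruby
# Hypothetical node type with parent and sibling links.
class SketchNode
  attr_reader :value, :children
  attr_accessor :parent

  def initialize(value, children = [])
    @value = value
    @children = children
    @parent = nil
    children.each { |child| child.parent = self }
  end

  # The node that follows this one under the same parent, if any.
  def next_sibling
    return nil unless parent

    siblings = parent.children
    siblings[siblings.index { |n| n.equal?(self) } + 1]
  end

  def paren_group?
    value == :paren
  end

  # An identifier directly followed by a parenthesized group is read
  # as a function application in this sketch.
  def function_application?
    value.is_a?(String) && !!next_sibling&.paren_group?
  end
end

f = SketchNode.new('f')
SketchNode.new(:seq, [f, SketchNode.new(:paren, [SketchNode.new('x')])])
f.function_application? # => true

inner_f = SketchNode.new('f')
sin = SketchNode.new('sin')
SketchNode.new(:seq, [sin, SketchNode.new(:paren, [inner_f])])
inner_f.function_application? # => false
sin.function_application?     # => true
```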

Add support for `color`

Avoid outputting `mfenced` tags

mfenced will be deprecated in the next revision of MathML. Since mfenced is 100% equivalent to the expanded <mrow><mo>left_fence</mo>content<mo>right_fence</mo></mrow> form we should generate that instead.

LaTeX subsup behaviour

A follow up to #25. As pointed out by @davidfarmer, the LaTeX processor renders the expression z_12^34 as z_12^34 instead of z_{12}^{34}. This is an error.

I'm working on a patch to place curly braces around the arguments of _ and ^. However, to improve the readability of the output, the current algorithm does not place the curly braces around the following:

Single characters or numbers. That means a_b and a_2 get rendered as a_b and a_2 instead of a_{b} and a_{2} respectively. However, a_(bc) and a_12 get rendered as a_{b c} and a_{12} respectively.
Symbols. That means a_(aleph) gets rendered as a_\aleph instead of a_{\aleph}.
Sections of text. That means a_"some text" gets rendered as a_\text{some text} instead of a_{\text{some text}}.

The same goes for ^. Everything else is outputted inside curly braces. @davidfarmer Would this work for you?
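
The rules above can be sketched as follows, assuming the renderer already knows whether an argument is plain characters, a symbol, or a section of text. script_argument and subsup are hypothetical helpers, not the patch itself.

```ruby
# Formats the LaTeX argument of _ or ^.
# `kind` is :plain (letters or digits), :symbol, or :text.
def script_argument(latex, kind)
  case kind
  when :symbol, :text then latex
  else latex.length == 1 ? latex : "{#{latex}}"
  end
end

# `sub` and `sup` are [latex, kind] pairs.
def subsup(base, sub, sup)
  "#{base}_#{script_argument(*sub)}^#{script_argument(*sup)}"
end

subsup('z', ['12', :plain], ['34', :plain]) # => "z_{12}^{34}"
script_argument('b', :plain)                # => "b"
script_argument('b c', :plain)              # => "{b c}"
script_argument('\\aleph', :symbol)         # => "\\aleph"
script_argument('\\text{some text}', :text) # => "\\text{some text}"
```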

Standardize color descriptors across renderers

A follow up to #33 (comment).

My idea is to create a standard for valid color descriptors, which would be used by all renderers. Invalid color descriptors could then be detected at the parsing level. Also, that would allow for greater flexibility at the rendering level, since each renderer would be able to translate those standard color descriptors accordingly.

That would mean the expression color x y would not be represented as a binary operation, but rather as a special operator whose first argument is always a valid color descriptor.

`underset` not recognized

This AsciiMath expression:

underset(_)(hat A) = hat A exp j vartheta_0

is being rendered as this MathML fragment:

<math>
  <mi>u</mi>
  <mi>n</mi>
  <mi>d</mi>
  <mi>e</mi>
  <mi>r</mi>
  <mi>s</mi>
  <mi>e</mi>
  <mi>t</mi>
  <mfenced open="(" close=")"/>
  <mfenced open="(" close=")">
    <mover>
      <mi>A</mi>
      <mo>^</mo>
    </mover>
  </mfenced>
  <mo>=</mo>
  <mover>
    <mi>A</mi>
    <mo>^</mo>
  </mover>
  <mi>exp</mi>
  <mi>j</mi>
  <msub>
    <mi>&#x3D1;</mi>
    <mn>0</mn>
  </msub>
</math>

Which gets rendered like this:

The correct rendering (as done by MathJax) is:

Migrate away from travis-ci.org

travis-ci.org is going to be shut down on December 31:

https://docs.travis-ci.com/user/migrate/open-source-repository-migration#frequently-asked-questions

Q. When will the migration from travis-ci.org to travis-ci.com be completed?
A. In an effort to ensure that all of our users - whether you build open-source, public or private repositories - receive regular feature updates, security patches and UX/UI enhancements, we are announcing that travis-ci.org will be officially closed down completely no later than December 31st, 2020, allowing us to focus all our efforts on bringing new features and fixes to travis-ci.com and all of our awesome users like yourself on the travis-ci.com domain.

Q. What will happen to travis-ci.org after December 31st, 2020?
A. Travis-ci.org will be switched to a read-only platform, allowing you to see your jobs build history from all repositories previously connected to travis-ci.org.

Also, note that travis-ci.com, while being the most logical place to move to, has build time limits for public repositories:

https://blog.travis-ci.com/2020-11-02-travis-ci-new-billing

For those of you who have been building on public repositories (on travis-ci.com, with no paid subscription), we will upgrade you to our trial (free) plan with a 10K credit allotment (which allows around 1000 minutes in a Linux environment).

Most of Asciidoctor ecosystem has already moved to GitHub Actions.

Gap analysis against asciimath/asciimathml

I have noted that expressions like a' do not correctly translate to the prime symbol, but leave the ' token as a quote. Given that https://github.com/asciimath/asciimathml/blob/master/ASCIIMathML.js, the Javascript from AsciiMath itself, does recognise the prime, I want to submit this list of symbols unrecognised currently for discussion, before I submit another PR:

{input:":'",  tag:"mo", output:"\u2235",  tex:"because", ttype:CONST},
{input:":|:", tag:"mo", output:"|", tex:null, ttype:CONST},
{input:"|:", tag:"mo", output:"|", tex:null, ttype:LEFTBRACKET},
{input:":|", tag:"mo", output:"|", tex:null, ttype:RIGHTBRACKET},
{input:"'",   tag:"mo", output:"\u2032",  tex:"prime", ttype:CONST},
{input:"/_\\",  tag:"mo", output:"\u25B3",  tex:"triangle", ttype:CONST},
{input:"|:", tag:"mo", output:"|", tex:null, ttype:LEFTBRACKET},
{input:"abs",   tag:"mo", output:"abs",  tex:null, ttype:UNARY, rewriteleftright:["|","|"]},
{input:"cancel", tag:"menclose", output:"cancel", tex:null, ttype:UNARY},
{input:"ceil",   tag:"mo", output:"ceil",  tex:null, ttype:UNARY, rewriteleftright:["\u2308","\u2309"]},
{input:"class", tag:"mrow", ttype:BINARY},
{input:"color", tag:"mstyle", ttype:BINARY},
{input:"divide",   tag:"mo", output:"-:", tex:null, ttype:DEFINITION},
{input:"floor",   tag:"mo", output:"floor",  tex:null, ttype:UNARY, rewriteleftright:["\u230A","\u230B"]},
{input:"frown",  tag:"mo", output:"\u2322", tex:null, ttype:CONST},
{input:"gt=", tag:"mo", output:"\u2265", tex:"geq", ttype:CONST},
{input:"id", tag:"mrow", ttype:BINARY},
{input:"lt=", tag:"mo", output:"\u2264", tex:"leq", ttype:CONST},
{input:"mbox", tag:"mtext", output:"mbox", tex:null, ttype:TEXT},
{input:"norm",   tag:"mo", output:"norm",  tex:null, ttype:UNARY, rewriteleftright:["\u2225","\u2225"]},
{input:"overarc", tag:"mover", output:"\u23DC", tex:"overparen", ttype:UNARY, acc:true},
{input:"overset", tag:"mover", output:"stackrel", tex:null, ttype:BINARY},
{input:"setminus", tag:"mo", output:"\\", tex:null, ttype:CONST},
{input:"tilde", tag:"mover", output:"~", tex:null, ttype:UNARY, acc:true},
{input:"underset", tag:"munder", output:"stackrel", tex:null, ttype:BINARY},

These are also in the Javascript, as case variants:

{input:"Arccos",  tag:"mo", output:"Arccos", tex:null, ttype:UNARY, func:true},
{input:"Arcsin",  tag:"mo", output:"Arcsin", tex:null, ttype:UNARY, func:true},
{input:"Abs",   tag:"mo", output:"abs",  tex:null, ttype:UNARY, notexcopy:true, rewriteleftright:["|","|"]},
{input:"Cos",  tag:"mo", output:"Cos", tex:null, ttype:UNARY, func:true},
{input:"Cosh", tag:"mo", output:"Cosh", tex:null, ttype:UNARY, func:true},
{input:"Cot",  tag:"mo", output:"Cot", tex:null, ttype:UNARY, func:true},
{input:"Csc",  tag:"mo", output:"Csc", tex:null, ttype:UNARY, func:true},
{input:"Ln",   tag:"mo", output:"Ln",  tex:null, ttype:UNARY, func:true},
{input:"Log",  tag:"mo", output:"Log", tex:null, ttype:UNARY, func:true},
{input:"Sec",  tag:"mo", output:"Sec", tex:null, ttype:UNARY, func:true},
{input:"Sin",  tag:"mo", output:"Sin", tex:null, ttype:UNARY, func:true},
{input:"Sinh", tag:"mo", output:"Sinh", tex:null, ttype:UNARY, func:true},
{input:"Tan",  tag:"mo", output:"Tan", tex:null, ttype:UNARY, func:true},
{input:"Tanh", tag:"mo", output:"Tanh", tex:null, ttype:UNARY, func:true},

I do think this gem should natively support prime ("'"). How many other of these it should support, I'd like to discuss with you...

frac()() does not appear to be implemented

root(x)(y) is implemented, and there is code to handle it: append_root

However, while the symbol frac is recognised as a binary symbol, there is no corresponding code to process :frac: under

asciimath/lib/asciimath/markup.rb

Line 380 in d3fd7ac

if (symbol = resolve_symbol(node.operator))

I'm assuming all that needs to be done is:

when ::AsciiMath::AST::BinaryOp
  if (symbol = resolve_symbol(node.operator))
    case symbol[:type]
    when :over
      append_underover(node.operand2, nil, node.operand1)
    when :under
      append_underover(node.operand2, node.operand1, nil)
    when :root
      append_root(node.operand2, node.operand1)
    when :frac
      append_fraction(node.operand1, node.operand2)
    when :color
      append_color(node.operand1.to_hex_rgb, node.operand2)
    end
  end
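
To make the proposed branch concrete, here is a minimal standalone sketch of the same dispatch. It is not the gem's code: `BinaryOp`, `SYMBOL_TYPES`, and `render_binary` are hypothetical stand-ins, and the operands are assumed to be already-rendered MathML fragments. The operand order for :root mirrors the append_root(node.operand2, node.operand1) call above.

```ruby
# Hypothetical stand-in for ::AsciiMath::AST::BinaryOp; operands are
# already-rendered MathML fragments to keep the sketch self-contained.
BinaryOp = Struct.new(:operator, :operand1, :operand2)

# Hypothetical symbol table mapping operator names to a type, loosely
# modelled on what resolve_symbol returns in the snippet above.
SYMBOL_TYPES = { 'root' => :root, 'frac' => :frac }.freeze

def render_binary(node)
  case SYMBOL_TYPES[node.operator]
  when :root
    # root(n)(x): <mroot> takes the base first, then the index.
    "<mroot>#{node.operand2}#{node.operand1}</mroot>"
  when :frac
    # frac(a)(b): <mfrac> takes the numerator first, then the denominator.
    "<mfrac>#{node.operand1}#{node.operand2}</mfrac>"
  else
    raise ArgumentError, "unhandled binary operator: #{node.operator}"
  end
end

puts render_binary(BinaryOp.new('frac', '<mi>a</mi>', '<mi>b</mi>'))
```

Raising on an unknown operator makes a missing branch, such as the absent :frac case, fail loudly instead of silently producing no output.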

asciidoctor-pdf fails with "stack level too deep" for certain expressions

First of all, I'm new to the Ruby ecosystem, so I'm sorry if the fix for this issue is trivial or if I should have reported it elsewhere. If that is indeed the case, please let me know which repo I should file this issue under and I'll happily do so.

Steps to reproduce

Contents of test.adoc file (the actual expressions that caused the issue are slightly more complex, these are just the minimal expressions with which I could reproduce it):

:stem: asciimath

asciimath:[R(alpha_(K+1)|x)]

asciimath:[x ((y) ^ (1-x))]

`asciidoctor-pdf` invocation

asciidoctor-pdf --trace --require asciidoctor-mathematical test.adoc

Expected result

The expression gets parsed and rendered without issues.

Actual result

asciidoctor-pdf fails with "stack level too deep".

Trace

Traceback (most recent call last):
        9775: from /home/username/.asdf/installs/ruby/2.7.3/bin/asciidoctor-pdf:23:in `<main>'
        9774: from /home/username/.asdf/installs/ruby/2.7.3/bin/asciidoctor-pdf:23:in `load'
        9773: from /home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciidoctor-pdf-1.6.0/bin/asciidoctor-pdf:27:in `<top (required)>'
        9772: from /home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciidoctor-2.0.15/lib/asciidoctor/cli/invoker.rb:113:in `invoke!'
        9771: from /home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciidoctor-2.0.15/lib/asciidoctor/cli/invoker.rb:113:in `each'
        9770: from /home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciidoctor-2.0.15/lib/asciidoctor/cli/invoker.rb:130:in `block in invoke!'
        9769: from /home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciidoctor-2.0.15/lib/asciidoctor/convert.rb:189:in `convert_file'
        9768: from /home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciidoctor-2.0.15/lib/asciidoctor/convert.rb:189:in `open'
         ... 9763 levels...
           4: from /home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciimath-2.0.2/lib/asciimath/latex.rb:385:in `block in is_very_small'
           3: from /home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciimath-2.0.2/lib/asciimath/latex.rb:385:in `is_very_small'
           2: from /home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciimath-2.0.2/lib/asciimath/latex.rb:385:in `all?'
           1: from /home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciimath-2.0.2/lib/asciimath/ast.rb:99:in `each'
/home/username/.asdf/installs/ruby/2.7.3/lib/ruby/gems/2.7.0/gems/asciimath-2.0.2/lib/asciimath/ast.rb:99:in `each': stack level too deep (SystemStackError)

Remarks

Each of the expressions can independently cause the issue.

With the first one, I won't get the error if the expression is either:
alpha_(K+1)|x (adding parentheses around the expression throws the error), or
R(alpha_(K)|x) (adding any operator after K throws the error).

Similar results with the second one: (y) ^ (1-x) and x ((y) ^ (x)) don't throw any errors.

I was also able to reproduce this issue with the docker-asciidoctor container if that's of any help.

If there's any more information required from my end, I'll gladly provide it.

Transfer project to asciidoctor organization

This gem was written for the asciidoctor project. It would be preferable to transfer ownership of the repository to that organisation.

@mojavelinux are you ok with that?

Conformance tests with JavaScript reference implementation

I am thinking of adding some conformance checks which will ensure that converting a given formula with the AsciiMath gem gives the same result as converting it with AsciiMath's original JavaScript implementation. Unless there are some differences which have been introduced on purpose, of course…

The idea is to prepare a list of example AsciiMath formulas, the longer the better, then convert every single one of them with both implementations, and then compare the results. The whole process could be written as follows:

for each formula in list_of_example_formulas
  ruby_retval = convert_with_ruby(formula)
  js_retval = convert_with_js(formula)
  assert_equal js_retval, ruby_retval
end

This kind of test requires a JavaScript runner in the development environment (Node.js or mini_racer).

This is easy to implement and I can help with that. Hopefully, these tests will help detect bugs like #58 early.
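
The comparison loop described above could be sketched in Ruby roughly as follows. This is only an illustration: `conformance_mismatches` is a hypothetical helper, and both converters are injected as callables so the sketch runs without any JavaScript runner. The stubs below stand in for the two real implementations.

```ruby
# Returns one [formula, js_output, ruby_output] triple for every formula
# whose two outputs differ. Both converters are passed in as callables so
# the sketch does not depend on a particular JavaScript runner.
def conformance_mismatches(formulas, ruby_converter:, js_converter:)
  formulas.each_with_object([]) do |formula, mismatches|
    ruby_retval = ruby_converter.call(formula)
    js_retval = js_converter.call(formula)
    mismatches << [formula, js_retval, ruby_retval] unless js_retval == ruby_retval
  end
end

# Stub converters (assumptions) that disagree only on the percent sign.
ruby_stub = ->(formula) { formula == '40%' ? '<mi>%</mi>' : "<mi>#{formula}</mi>" }
js_stub   = ->(formula) { formula == '40%' ? '<mo>%</mo>' : "<mi>#{formula}</mi>" }

mismatches = conformance_mismatches(%w[x y 40%], ruby_converter: ruby_stub, js_converter: js_stub)
mismatches.each do |formula, js, ruby|
  puts "#{formula}: JavaScript gave #{js}, Ruby gave #{ruby}"
end
```

Collecting all mismatches rather than stopping at the first one gives a full picture of where the two implementations diverge in a single run. In a real suite the Ruby callable would presumably wrap AsciiMath.parse(formula).to_mathml, and the JavaScript callable would call into whichever runner is chosen.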

LaTeX and mathml issues in parser_spec

I only became aware of this project recently, so apologies if these comments are
premature.

In parser_spec.rb, which I assume you are using to test your parser,
there are some constructions which are not optimal.

z_12^34 should be z_{12}^{34} in latex.

Writing f\left( x \right) in latex does not look good. If you compare that to f(x)
you will see that the version with the right and left has too much space after the "f".
(It seems tricky to decide when the object in parentheses is large,
so that the left and right are needed. I'm interested to hear if you have
a way to deal with that.)

In MathML, mfenced is deprecated. See https://developer.mozilla.org/en-US/docs/Web/MathML/Element/mfenced
But more importantly, in MathML you can (and should) distinguish between
function application and implied multiplication. That can be done with the
invisible function application character (U+2061). If your parser knows that f(x)
is "the function f at the argument x", then you can output it that way.
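
As a rough illustration of that suggestion, the sketch below builds MathML for f(x) with explicit <mo> fences instead of mfenced and places U+2061 between the function name and its argument. `function_application_mathml` is a hypothetical helper, not part of the gem, and it assumes both inputs are already-escaped identifier text.

```ruby
# U+2061 FUNCTION APPLICATION, written as a numeric character reference.
APPLY_FUNCTION = '&#x2061;'

# Renders "name applied to argument" with <mrow>/<mo> fences rather than
# the deprecated <mfenced>. Both inputs are assumed to be escaped already.
def function_application_mathml(name, argument)
  "<mrow><mi>#{name}</mi><mo>#{APPLY_FUNCTION}</mo>" \
    "<mrow><mo>(</mo><mi>#{argument}</mi><mo>)</mo></mrow></mrow>"
end

puts function_application_mathml('f', 'x')
```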

Avoid using \left and \right in the LaTeX output whenever possible

A follow up to #25. As stated by @davidfarmer:

Writing f\left( x \right) in latex does not look good. If you compare that to f(x)
you will see that the version with the right and left has too much space after the "f".
(It seems tricky to decide when the object in parentheses is large,
so that the left and right are needed. I'm interested to hear if you have
a way to deal with that.)

I mostly agree. This is definitely a thing we could improve. We'll just have to establish a set of rules to decide whether or not \left and \right should be used.

There are at least some cases where you don't need the left and right around
function arguments: one character, multiple characters with only +, -, and multiplication between them. Not obvious to me if the default should
be to have left and right, or the default should be to not have those.

This is definitely a case where \left and \right could be avoided. I personally think it's easier to think about this in terms of when \left and \right should be avoided rather than when they should be used. So yes, \left and \right should be the default.

@davidfarmer Would the case mentioned above be sufficient for you?
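
One possible reading of that rule, sketched in Ruby: keep \left and \right as the default and drop them only when the argument consists of single alphanumeric characters joined by +, -, or multiplication. `simple_argument?` and `parenthesize` are hypothetical names, the check works on the already-generated LaTeX string rather than on the AST, and treating plain juxtaposition, *, \cdot and \times as multiplication is an assumption.

```ruby
# Matches single alphanumeric characters joined by +, -, *, \cdot, \times,
# or plain juxtaposition (read here as implied multiplication).
SIMPLE_ARGUMENT = /\A[[:alnum:]](?:\s*(?:[+\-*]|\\cdot|\\times)?\s*[[:alnum:]])*\z/

def simple_argument?(latex)
  SIMPLE_ARGUMENT.match?(latex.strip)
end

# \left and \right stay the default; plain parentheses are used only for
# arguments that pass the "small" check above.
def parenthesize(latex)
  if simple_argument?(latex)
    "(#{latex})"
  else
    "\\left( #{latex} \\right)"
  end
end

puts parenthesize('x')
puts parenthesize('\frac{a}{b}')
```

Anything the pattern does not recognise, such as fractions, exponents, or nested parentheses, falls back to \left and \right, which matches the "avoid only when safe" framing.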

Fix behaviour of ubrace, obrace with superscript, subscript

Issue raised in metanorma/metanorma#235:

In AsciiMath, ubrace followed by a subscript and obrace followed by a superscript are meant to render the subscript/superscript text under and over the expression, as with, say, lim. So:

ubrace(((0.5, 0, 0.5),(0, 0.5, 0.5),(0, 0, 1)))_("Adjustment to texture space")

should render with the text set beneath the brace [image], as you can verify at http://asciimath.org

The current behaviour of the gem, which looks like the following, is incorrect:

[image]

I've kludged some behaviour in PR #68 in markup.rb to enable that, adding :underover => true to :underbrace and :overbrace, and doing an ad hoc if branch to process it. I defer to you about how to do it properly.
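
The flag-based idea can be sketched outside the gem as follows. This is not the code from PR #68: `UNDEROVER_SYMBOLS` and `render_subscript` are hypothetical, and the base and subscript are assumed to be already-rendered MathML fragments.

```ruby
# Hypothetical symbol table: :underover marks operators whose subscript
# should be stacked under the base instead of set beside it.
UNDEROVER_SYMBOLS = {
  'ubrace' => { underover: true },
  'obrace' => { underover: true }
}.freeze

# base and sub are already-rendered MathML fragments (an assumption made
# to keep the sketch free of AST types).
def render_subscript(operator, base, sub)
  if UNDEROVER_SYMBOLS.fetch(operator, {})[:underover]
    "<munder>#{base}#{sub}</munder>"
  else
    "<msub>#{base}#{sub}</msub>"
  end
end

puts render_subscript('ubrace', '<mi>A</mi>', '<mtext>note</mtext>')
puts render_subscript('x', '<mi>x</mi>', '<mn>1</mn>')
```

Keeping the decision in the symbol table means other operators can opt in later without another ad hoc branch.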

def append_escaped(text)
  text.each_codepoint do |cp|
    if cp == 38
      @mathml << "&amp;"
    elsif cp == 60
      @mathml << "&lt;"
    elsif cp == 62
      @mathml << "&gt;"
    elsif cp > 127
      @mathml << "&#x#{cp.to_s(16).upcase};"
    else
      @mathml << cp
    end
  end
end
end # closes the enclosing definition, which starts outside this excerpt