catcombo / jira2markdown

Convert text from JIRA markup to Markdown using parsing expression grammars

License: MIT License

Python 100.00%

youtrack text-converter markdown jira

jira2markdown's Issues

Add conversion for sub/super-script

We use ^superscript^ and ~subscript~ in our projects.

Could you please add translation according to the following rules:

If Unicode conversion of the complete marked text is possible, then do it: https://gist.github.com/molomby/9bc092e4a125f529ae362de7e46e8176
If no Unicode conversion is possible, then wrap the text in <sup> and <sub> tags
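
The two rules above could be sketched roughly as follows. This is only an illustration: the name `render_script` and the deliberately small character tables are assumptions made for the example, not part of jira2markdown, and the linked gist covers many more characters.

```python
# Partial lookup tables (assumption: digits and a few symbols only).
SUPERSCRIPT = dict(zip("0123456789+-=()n", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾ⁿ"))
SUBSCRIPT = dict(zip("0123456789+-=()", "₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎"))


def render_script(text: str, table: dict, tag: str) -> str:
    """Use Unicode only if every character maps; otherwise fall back to an HTML tag."""
    if text and all(ch in table for ch in text):
        return "".join(table[ch] for ch in text)
    return f"<{tag}>{text}</{tag}>"


print(render_script("2", SUPERSCRIPT, "sup"))    # fully convertible
print(render_script("abc", SUPERSCRIPT, "sup"))  # falls back to <sup>abc</sup>
```

Converting all-or-nothing avoids mixed output in which only some characters of a marked span are raised or lowered.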

Thanks!

Examples of overriding various markup to "plain text"?

First, thanks for a great library! 🎉

I'd like to be able to convert more elements to "plain text", for example, when it sees {color:red}Red Text{color}, I'd like the conversion to only have Red Text.

I'm not familiar with pyparsing, so I'm wondering if you have suggestions on the "best" way to override the various classes in the markup directory.

Perhaps you can provide some examples of how to convert some of the various classes of markup to "just text"?
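
One possible stopgap, which does not touch the markup classes at all, is to strip the `{color}` tags with a regular expression before calling the converter. This is a minimal sketch under that assumption; `strip_color_tags` is a hypothetical helper, not a jira2markdown function.

```python
import re

# Matches both the opening tag ({color:red}, {color:#ff0000}) and the closing {color}.
COLOR_TAG = re.compile(r"\{color(?::[^}]*)?\}")


def strip_color_tags(jira_text: str) -> str:
    """Remove Jira color tags and keep only the enclosed text."""
    return COLOR_TAG.sub("", jira_text)


print(strip_color_tags("{color:red}Red Text{color}"))
```

The cleaned string can then be passed to `convert()` as usual.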

Ordered lists always use "1."

Why do all elements in an ordered list start with "1." instead of incrementing? This is not how they show up in Jira.

| Jira | Markdown |
| --- | --- |
| `# a # numbered # list` | `1. a 1. numbered 1. list` |
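
Most Markdown renderers number the items sequentially regardless of the literal numbers, so the repeated `1.` is usually visible only in the raw text. If incrementing numbers are wanted in the raw Markdown as well, the converted text could be post-processed. The sketch below assumes that list items start with `1. ` and that nesting is expressed by leading spaces; `renumber_ordered_lists` is a hypothetical helper.

```python
import re

ITEM = re.compile(r"^(\s*)1\. ")


def renumber_ordered_lists(markdown: str) -> str:
    """Replace repeated '1.' markers with incrementing numbers per indent level."""
    counters = {}  # indent width -> last number used at that level
    result = []
    for line in markdown.split("\n"):
        match = ITEM.match(line)
        if match:
            indent = len(match.group(1))
            # Leaving a nested list: forget the counters of deeper levels.
            for deeper in [level for level in counters if level > indent]:
                del counters[deeper]
            counters[indent] = counters.get(indent, 0) + 1
            line = f"{match.group(1)}{counters[indent]}. {line[match.end():]}"
        elif line.strip() and not line[0].isspace():
            # A non-indented, non-list line ends every open list.
            counters.clear()
        result.append(line)
    return "\n".join(result)


print(renumber_ordered_lists("1. a\n1. numbered\n1. list"))
```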

Emphasis text

Hi @catcombo

Regarding emphasis, the README shows that +foo+ will be rendered as foo.

However, I found that it renders as <u>foo</u>. The unit tests confirm this behavior:

class TestUnderline:
    def test_basic_conversion(self):
        assert convert("inside +some long+ text") == "inside <u>some long</u> text"

I personally think it should be rendered as *foo* (or _foo_), as stated in YouTrack Markdown.

I'll try to provide a PR.

Thanks!

Example usage does not execute

I installed the module per the instructions:

pip install jira2markdown

I then copied and pasted the usage code into a Python file.

from jira2markdown import convert

convert("Some *Jira text* formatting [example|https://example.com].")
# >>> Some **Jira text** formatting [example](https://example.com).

# To convert user mentions, provide a mapping from Jira internal account IDs
# to usernames as the second argument to the convert function
convert("[Winston Smith|~accountid:internal-id] woke up with the word 'Shakespeare' on his lips", {
    "internal-id": "winston",
})
# >>> @winston woke up with the word 'Shakespeare' on his lips

When I run this application, I get the following output:

Traceback (most recent call last):
  File "C:\OneDrive\Powershell-Scripts\jira2markdown-sample.py", line 1, in <module>
    from jira2markdown import convert
  File "C:\OneDrive\Powershell-Scripts\jira2markdown.py", line 27, in <module>
    from jira2markdown import convert
ImportError: cannot import name 'convert' from partially initialized module 'jira2markdown' (most likely due to a circular import) (C:\OneDrive\Powershell-Scripts\jira2markdown.py)

I'm running Python 3.3.9 on Windows. I also tried running this using Python 3.8.10 in Ubuntu. I get a similar error.
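
The traceback shows Python importing C:\OneDrive\Powershell-Scripts\jira2markdown.py, a file in the script's own directory, rather than the installed package. Python searches the script's directory first, so a local file with the same name as a package takes precedence over it. The check below is a minimal sketch for spotting such a clash; `find_shadowing_paths` is a hypothetical helper.

```python
from pathlib import Path


def find_shadowing_paths(package_name: str, script_dir: str) -> list:
    """Return local files that Python would import instead of the installed package."""
    directory = Path(script_dir)
    candidates = [
        directory / f"{package_name}.py",           # a module with the same name
        directory / package_name / "__init__.py",   # a local package with the same name
    ]
    return [path for path in candidates if path.is_file()]


# Example: inspect the current working directory.
for path in find_shadowing_paths("jira2markdown", "."):
    print(f"Rename or remove: {path}")
```

If the helper reports a file, renaming it (for example to `jira2markdown_sample.py`) would be the first thing to try.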

Converting Jira lists with CRLF line-breaks adds erroneous whitespace to subsequent text

When jira2markdown's convert() function is used on Jira lists with CRLF (carriage return + line feed) line breaks, the resulting Markdown contains erroneous whitespace in the text that follows the last list item.

See below for a visual example of the conversion issue.

from jira2markdown import convert
jira_text = 'Line Before List: Sample text words words words:\r\n * Bulleted Item 1: Sample text words words words\r\n * Bulleted Item 2: Sample text words words words\r\n\r\nLine After List: Sample text words words words\r\nLine After List: Sample text words words words'

print(jira_text)

Input (jira_text printed):

Line Before List: Sample text words words words:
 * Bulleted Item 1: Sample text words words words
 * Bulleted Item 2: Sample text words words words

Line After List: Sample text words words words
Line After List: Sample text words words words

Input (jira_text with line-breaks visualized):

Line Before List: Sample text words words words:\r\n
 * Bulleted Item 1: Sample text words words words\r\n
 * Bulleted Item 2: Sample text words words words\r\n
\r\n
Line After List: Sample text words words words\r\n
Line After List: Sample text words words words

md_text = convert(jira_text)

Expected Output (md_text printed):

Line Before List: Sample text words words words:
- Bulleted Item 1: Sample text words words words
- Bulleted Item 2: Sample text words words words

Line After List: Sample text words words words
Line After List: Sample text words words words

Expected Output (md_text with line-breaks visualized):

Line Before List: Sample text words words words:\r\n
- Bulleted Item 1: Sample text words words words\r\n
- Bulleted Item 2: Sample text words words words\r\n
\r\n
Line After List: Sample text words words words\r\n
Line After List: Sample text words words words\r\n

print(md_text)

Actual Output (md_text printed):

Line Before List: Sample text words words words:
- Bulleted Item 1: Sample text words words words
- Bulleted Item 2: Sample text words words words
  
  Line After List: Sample text words words words
  Line After List: Sample text words words words

Actual Output (md_text with line-breaks visualized):

Line Before List: Sample text words words words:\r\n
- Bulleted Item 1: Sample text words words words\n
- Bulleted Item 2: Sample text words words words\n
  \n
  Line After List: Sample text words words words\n
  Line After List: Sample text words words words

As shown, the conversion ends up replacing:

\r\n in the list with \n
\r\n\r\n at the end of the list with \n \n
\r\n after the list with \n
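
Until this is fixed, one possible workaround is to normalize the line endings before conversion and restore them afterwards. This is a sketch only: it assumes the extra whitespace is tied to the CRLF input, which has not been verified against the library, and `convert_crlf_safe` is a hypothetical wrapper that takes the converter as an argument.

```python
from typing import Callable


def convert_crlf_safe(jira_text: str, convert: Callable[[str], str], restore_crlf: bool = True) -> str:
    """Convert with LF-only input, then optionally restore CRLF line breaks."""
    normalized = jira_text.replace("\r\n", "\n").replace("\r", "\n")
    markdown = convert(normalized)
    return markdown.replace("\n", "\r\n") if restore_crlf else markdown


# Demonstration with a stand-in converter; in real use, pass jira2markdown.convert.
print(repr(convert_crlf_safe("a\r\nb", lambda text: text)))
```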

Copy-and-Pasteable Snippet to replicate the issue:

from jira2markdown import convert

# Input with CRLF line-breaks 
jira_text = 'Line Before List: Sample text words words words:\r\n * Bulleted Item 1: Sample text words words words\r\n * Bulleted Item 2: Sample text words words words\r\n\r\nLine After List: Sample text words words words\r\nLine After List: Sample text words words words'

# Print input with line-breaks rendered
print("\njira_text:\n" + jira_text)

# Print input with line-breaks represented, not rendered
print("\nrepr(jira_text):\n" + repr(jira_text))

md_text = convert(jira_text)

# Print output with line-breaks rendered
print("\nmd_text:\n" + md_text)

# Print output with line-breaks represented, not rendered
print("\nrepr(md_text):\n" + repr(md_text))

Add underline support

It would be nice to add underline support.
Until then, you can use the following code:

import jira2markdown
from jira2markdown.elements import MarkupElements
from jira2markdown.markup.text_effects import QuotedElement, Underline

text = "This is a +test+"


class CustomUnderline(QuotedElement):
    TOKEN = "+"
    QUOTE_CHAR = "<u>"
    END_QUOTE_CHAR = "</u>"


elements = MarkupElements()
elements.replace(Underline, CustomUnderline)

print(jira2markdown.convert(text, elements=elements))

Add ??citation?? conversion as <q> tag

Allow new version of pyparsing

jira2markdown pins pyparsing to < 3.0.0. Many newer tools use pyparsing > 3.0.0, which creates pip conflicts when jira2markdown is installed alongside them.

If nothing breaks, please allow pyparsing > 3.0.0.

Enhancement: Add logic for when Jira Editor wraps text effects (`~`, `_`, `+`, etc) in curly braces `{ }`

I haven't been able to find Atlassian documentation explaining why this happens, but in real-world usage the Jira Editor sometimes wraps text effects (~, _, +, etc.) in curly braces { }.

For instance instead of having this:

This is *strong*

This is +inserted+

This is _emphasis_

it will sometimes have this:

This is {*}strong{*}

This is {+}inserted{+}

This is {_}emphasis{_}

which renders in Jira exactly the same as the first.

jira2markdown isn't aware of this syntax, which causes erroneous Markdown output.

I can provide additional info and/or submit a PR later on. I feel like following something similar to the implementation of the color conversion in text_effects.Color() would be a good way forward.
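
As an interim workaround, the braces could be removed before conversion. The sketch below assumes the editor only ever wraps a single effect character, and the set of characters in the pattern is a guess based on the examples above; `unwrap_text_effects` is a hypothetical helper, not part of jira2markdown.

```python
import re

# A single text-effect character wrapped in curly braces, e.g. {*} or {_}.
WRAPPED_EFFECT = re.compile(r"\{([*_+~^-])\}")


def unwrap_text_effects(jira_text: str) -> str:
    """Turn {*}strong{*} into *strong* so the usual rules can match."""
    return WRAPPED_EFFECT.sub(r"\1", jira_text)


print(unwrap_text_effects("This is {*}strong{*}"))
```

Macros such as `{color}` or `{noformat}` are left untouched because they contain more than one character between the braces.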

Add github image support

Currently no image attributes (width, height) are supported; it would be nice to support these.
Here is an example of how this could be done:

import re

import jira2markdown
from jira2markdown.elements import MarkupElements
from jira2markdown.markup.base import AbstractMarkup
from jira2markdown.markup.images import Image
from pyparsing import (
    Combine,
    Optional,
    ParserElement,
    ParseResults,
    PrecededBy,
    Regex,
    StringStart,
    Word,
    printables, )


class CustomImage(AbstractMarkup):
    def action(self, tokens: ParseResults) -> str:
        url = tokens.url
        attr_str = self._create_attribute_dict(tokens.attrs)
        return f'<img src="{url}" {attr_str} />'

    @property
    def expr(self) -> ParserElement:
        return (StringStart() | PrecededBy(Regex(r"\W", flags=re.UNICODE), retreat=1)) + Combine(
            "!"
            + Word(printables + " ", min=3, exclude_chars="|!").set_results_name("url")
            + Optional("|")
            + Word(printables + ",", exclude_chars="!").set_results_name("attrs")
            + Optional("!")
        ).set_parse_action(self.action)

    @staticmethod
    def _create_attribute_dict(attrs_str: str) -> str:
        attrs = {}
        for attr in attrs_str.split(','):
            key, value = attr.split('=')
            attrs[key] = value
        attr_str = " ".join([f'{k}="{v}"' for k, v in attrs.items()])
        return attr_str


elements = MarkupElements()
elements.replace(Image, CustomImage)
markdown_text = jira2markdown.convert("!image.png|width=200,height=400!", elements=elements)
print(markdown_text)

This will print

<img src="image.png" width="200" height="400" />

P.S. ChatGPT to the rescue for helping to find the correct parsing

Can't convert a link in a table in markup language.

Hi, I found that I can't convert a link in a table using jira2markdown.convert().

Here's my code:

import jira2markdown

markup = '||aa||bb||cc||dd||\n|row1|row1-1|row1-2|row1-3[mylink|https://www.google.com]|'
result = jira2markdown.convert(markup)
print(result) # shows '|aa|bb|cc|dd|\n|-|-|-|-|\n|row1|row1-1|row1-2|row1-3|\n'

As you can see from the result above, the link [mylink|https://www.google.com] disappears after the conversion.

I use Python 3.8.17 on macOS Monterey 12.6 to run the above code.

I use the latest version (0.3.4) of jira2markdown.

This error doesn't occur when I run it on macOS Ventura 13.4.1.

How can I run jira2markdown correctly on macOS Monterey 12.6?

What are the dependent libraries of jira2markdown?
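
Since the result differs between two machines, comparing the installed package versions on both may help. The standard library can report a package's version and declared dependencies; the sketch below wraps this in a hypothetical helper `describe_package`.

```python
from importlib.metadata import PackageNotFoundError, requires, version


def describe_package(name: str):
    """Return the installed version and declared requirements, or None if not installed."""
    try:
        return {"version": version(name), "requires": requires(name) or []}
    except PackageNotFoundError:
        return None


# Run this in both environments and compare the output.
for package in ("jira2markdown", "pyparsing"):
    print(package, describe_package(package))
```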

license

Greetings,

This is great software. My only issue is there is no license. Could you kindly add a license so I can hopefully make use of your code?

AttributeError

Hi,

First, thanks for this repo, which is exactly what I was looking for!

I installed it with pip (Python 3.8.10).

When running it, I got:

Traceback (most recent call last):
  File "/home/pyd/Code/Jira-to-wikijs/test.py", line 1, in <module>
    from jira2markdown import convert
  File "/home/pyd/.local/lib/python3.8/site-packages/jira2markdown/__init__.py", line 1, in <module>
    from .parser import convert  # noqa
  File "/home/pyd/.local/lib/python3.8/site-packages/jira2markdown/parser.py", line 7, in <module>
    ParserElement.set_default_whitespace_chars("")
AttributeError: type object 'ParserElement' has no attribute 'set_default_whitespace_chars'

My test file is very simple; I used the example you provided:

from jira2markdown import convert
print(convert("Some *Jira text* formatting [example|https://example.com]."))

Do you know what could possibly have gone wrong?
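
`set_default_whitespace_chars` is one of the snake_case names that pyparsing introduced in version 3.0, so this error often means that an older pyparsing 2.x is installed next to a jira2markdown release that expects 3.x. The following sketch shows one way to check; `has_snake_case_api` is a hypothetical helper, and treating the major version as the only criterion is a simplification.

```python
def has_snake_case_api(pyparsing_version: str) -> bool:
    """Return True for pyparsing 3.0 and later, which provide the snake_case names."""
    major = int(pyparsing_version.split(".")[0])
    return major >= 3


# In the affected environment one would run:
#   import pyparsing
#   print(pyparsing.__version__, has_snake_case_api(pyparsing.__version__))
print(has_snake_case_api("2.4.7"))
```

If the check returns False, upgrading pyparsing is the likely remedy.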

Table Header Separator `-` is not compliant with YouTrack Markdown `---`

When converting tables into Markdown the resulting tables have single-hyphen (minus-sign) separators (-) which do not follow the YouTrack Markdown implementation of triple-hyphen (minus-sign) separators (---):

Separate the header row from the rest of the table with three or more minus signs (---).

Some Markdown processors render the resulting single-hyphen separated tables but others do not.
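
Until the converter emits three hyphens itself, the separator rows could be widened in a post-processing step. This sketch assumes separator rows look exactly like `|-|-|-|`, as in the converter output quoted in other issues; `widen_table_separators` is a hypothetical helper.

```python
import re

# A row made only of single-hyphen cells, e.g. |-|-|-|
SEPARATOR_ROW = re.compile(r"^\|(?:-\|)+$")


def widen_table_separators(markdown: str) -> str:
    """Replace single-hyphen table separators with triple-hyphen ones."""
    lines = []
    for line in markdown.split("\n"):
        if SEPARATOR_ROW.match(line):
            line = line.replace("-", "---")
        lines.append(line)
    return "\n".join(lines)


print(widen_table_separators("|aa|bb|\n|-|-|\n|row1|row2|"))
```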

`{noformat}` should accept the same parameters as `{panel}`

Hello,
Thanks for this very convenient tool!

When parameters are used with the {noformat} tag, there are some mis-conversions.
As documented by https://jira.atlassian.com/secure/WikiRendererHelpAction.jspa?section=advanced:

All the optional parameters of {panel} macro are valid for {noformat} too.

The current implementation does not recognize the {noformat} tag at all when there is a parameter.
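
A lossy stopgap is to drop the parameters before conversion so that the tag is recognized again. The sketch below assumes that discarding the panel parameters (title, border style, and so on) is acceptable; `strip_noformat_params` is a hypothetical helper.

```python
import re

# An opening {noformat:...} tag that carries parameters.
NOFORMAT_WITH_PARAMS = re.compile(r"\{noformat:[^}]*\}")


def strip_noformat_params(jira_text: str) -> str:
    """Rewrite {noformat:key=value|...} as a plain {noformat} tag."""
    return NOFORMAT_WITH_PARAMS.sub("{noformat}", jira_text)


print(strip_noformat_params("{noformat:title=Log|borderStyle=solid}text{noformat}"))
```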

Add {panel} translation as block quote

ListIndent missing abstract method _generateDefaultName

I believe pyparsing requires ListIndent to implement _generateDefaultName, per the following error:

...
    temp = jira2markdown.convert(text)
  File "/home/.conda/envs/operations/lib/python3.10/site-packages/jira2markdown/parser.py", line 18, in convert
    markup << elements.expr(inline_markup, markup, usernames, elements)
  File "/home/.conda/envs/operations/lib/python3.10/site-packages/jira2markdown/elements.py", line 64, in expr
    return MatchFirst([
  File "/home/.conda/envs/operations/lib/python3.10/site-packages/jira2markdown/elements.py", line 65, in <listcomp>
    element(inline_markup=inline_markup, markup=markup, usernames=usernames).expr
  File "/home/.conda/envs/operations/lib/python3.10/site-packages/jira2markdown/markup/lists.py", line 85, in expr
    + ListIndent(self.indent_state, self.tokens)
TypeError: Can't instantiate abstract class ListIndent with abstract method _generateDefaultName

Jira Mixed Nested Lists Conversion Issue

Jira supports two different syntax styles for mixed nested lists, but jira2markdown only recognizes and converts one of them:

First version (works in jira2markdown): each nested bullet begins with the same bullet character(s) (# or *) as its parent bullet:

# a
# numbered
#* with
#* nested
#* bullet
# list

* a
* bulleted
*# with
*# nested
*# numbered
* list

Second version (does not work in jira2markdown): each nested bullet uses only the bullet character (# or *) of its own type:

# a
# numbered
** with
** nested
** bullet
# list

* a
* bulleted
## with
## nested
## numbered
* list
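
One way to handle the second style without changing the parser is to rewrite it into the first style before conversion. The sketch below tracks the bullet type seen at each depth and rebuilds every marker from its parents; `normalize_mixed_lists` is a hypothetical helper, and it assumes that markers consist only of `#` and `*` followed by a space.

```python
import re

LIST_LINE = re.compile(r"^([#*]+) (.*)$")


def normalize_mixed_lists(jira_text: str) -> str:
    """Rewrite markers such as '**' under '#' into the parent-prefixed form '#*'."""
    stack = []  # bullet character currently open at each depth
    result = []
    for line in jira_text.split("\n"):
        match = LIST_LINE.match(line)
        if not match:
            stack = []  # any non-list line ends the current list
            result.append(line)
            continue
        marker, body = match.groups()
        depth = len(marker)
        parents = stack[: depth - 1]
        # If levels were skipped, fall back to the characters of the marker itself.
        parents += list(marker[len(parents) : depth - 1])
        stack = parents + [marker[-1]]
        result.append("".join(stack) + " " + body)
    return "\n".join(result)


print(normalize_mixed_lists("# a\n# numbered\n** with\n** nested\n# list"))
```

Lists already written in the first style pass through unchanged.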

I can provide additional info and/or submit a PR later on.

Indentation breaks list recognition

Hello,

Thanks for the library; it works like a charm in general.
I am using jira2markdown version 0.2.1 in a Poetry project.
If you use the visual editor in JIRA, you often get a space in front of the list symbol. It is nonetheless rendered correctly in JIRA.

❯ python
Python 3.10.5 (main, Jun 23 2022, 17:15:25) [Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import jira2markdown
>>> jira2markdown.convert('* Blue\n* Green')
'- Blue\n- Green'
>>> jira2markdown.convert(' * Blue\n * Green')
' \\* Blue\n \\* Green'
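
A possible workaround is to strip the leading whitespace in front of list markers before conversion. This is a sketch under the assumption that indentation before a marker never carries meaning in the input; `dedent_list_markers` is a hypothetical helper, not part of jira2markdown.

```python
import re

# Leading spaces or tabs directly followed by a list marker and a space.
INDENTED_MARKER = re.compile(r"^[ \t]+(?=[*#-]+ )", re.MULTILINE)


def dedent_list_markers(jira_text: str) -> str:
    """Remove indentation in front of list markers such as ' * Blue'."""
    return INDENTED_MARKER.sub("", jira_text)


print(repr(dedent_list_markers(" * Blue\n * Green")))
```

The cleaned text can then be passed to `jira2markdown.convert()`.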
