catcombo / jira2markdown
Convert text from JIRA markup to Markdown using parsing expression grammars
License: MIT License
We use ^superscript^ and ~subscript~ in our projects.
Could you please add translation of these to <sup> & <sub> tags?
Thanks!
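A minimal sketch of the requested mapping, done as a regex post-processing step rather than inside the library. The function name `convert_sup_sub` is hypothetical, and the patterns are an assumption about what counts as a span (non-empty, single-line, no leading or trailing space). Note that `~` also appears in Jira user mentions such as `[~accountid:...]`, so this would need to run only on text where mentions have already been handled.

```python
import re

# Hypothetical helper, not part of jira2markdown.
SUPERSCRIPT = re.compile(r"\^([^\s^](?:[^^\n]*[^\s^])?)\^")
SUBSCRIPT = re.compile(r"~([^\s~](?:[^~\n]*[^\s~])?)~")


def convert_sup_sub(text: str) -> str:
    """Rewrite ^text^ as <sup>text</sup> and ~text~ as <sub>text</sub>."""
    text = SUPERSCRIPT.sub(r"<sup>\1</sup>", text)
    text = SUBSCRIPT.sub(r"<sub>\1</sub>", text)
    return text


print(convert_sup_sub("We use ^superscript^ and ~subscript~"))
```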
First, thanks for a great library!
I'd like to be able to convert more elements to "plain text". For example, when it sees {color:red}Red Text{color}, I'd like the conversion to only have Red Text.
I'm not familiar with pyparsing, so I'm wondering if you have suggestions on the "best" way to override the various classes in the markup directory.
Perhaps you can provide some examples of how to convert some of the various classes of markup to "just text"?
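One possible workaround that avoids pyparsing entirely is to strip the color macro before calling the converter. This is a sketch under that assumption, not the library's extension mechanism; `strip_color` is a hypothetical name.

```python
import re

# Matches both the opening macro ({color:red}, {color:#ff0000}) and the closing {color}.
COLOR_MACRO = re.compile(r"\{color(?::[^}]*)?\}", re.IGNORECASE)


def strip_color(text: str) -> str:
    """Remove Jira {color} macros, keeping only the enclosed text."""
    return COLOR_MACRO.sub("", text)


print(strip_color("{color:red}Red Text{color}"))
```

The cleaned string can then be passed to the converter as usual.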
Why do all elements in an ordered list start with "1." instead of incrementing? This is not how they show up in Jira.
Jira | Markdown |
---|---|
|
|
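For context, CommonMark uses only the first item's number to set the start of an ordered list, so a list written as "1." on every line still renders as 1, 2, 3 in compliant renderers. If the raw Markdown itself needs incrementing numbers, a post-processing pass along these lines could be used. It is a sketch that assumes top-level items are emitted exactly as `1. ` at the start of a line; `renumber_ordered_list` is a hypothetical name.

```python
def renumber_ordered_list(markdown: str) -> str:
    """Renumber top-level '1. ' items consecutively; nested items are left alone."""
    out = []
    counter = 0
    for line in markdown.split("\n"):
        if line.startswith("1. "):
            counter += 1
            out.append(f"{counter}. {line[3:]}")
            continue
        # An unindented, non-blank line ends the current top-level list.
        if line.strip() and not line.startswith((" ", "\t")):
            counter = 0
        out.append(line)
    return "\n".join(out)


print(renumber_ordered_list("1. a\n1. b\n1. c"))
```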
Hi @catcombo
Regarding emphasis, the README shows that +foo+ will be rendered as foo.
However, I found that it renders as <u>foo</u>. The unit tests confirm this:
class TestUnderline:
    def test_basic_conversion(self):
        assert convert("inside +some long+ text") == "inside <u>some long</u> text"
I personally think it should be rendered as *foo* (or _foo_), as stated in YouTrack Markdown.
I'll try to provide a PR.
Thanks!
I installed the module per the instructions:
pip install jira2markdown
I then copied and pasted the usage code into a Python file.
from jira2markdown import convert
convert("Some *Jira text* formatting [example|https://example.com].")
# >>> Some **Jira text** formatting [example](https://example.com).
# To convert user mentions provide a mapping Jira internal account id to username
# as a second argument to convert function
convert("[Winston Smith|~accountid:internal-id] woke up with the word 'Shakespeare' on his lips", {
    "internal-id": "winston",
})
# >>> @winston woke up with the word 'Shakespeare' on his lips
When I run this application, I get the following output:
Traceback (most recent call last):
  File "C:\OneDrive\Powershell-Scripts\jira2markdown-sample.py", line 1, in <module>
    from jira2markdown import convert
  File "C:\OneDrive\Powershell-Scripts\jira2markdown.py", line 27, in <module>
    from jira2markdown import convert
ImportError: cannot import name 'convert' from partially initialized module 'jira2markdown' (most likely due to a circular import) (C:\OneDrive\Powershell-Scripts\jira2markdown.py)
I'm running Python 3.3.9 on Windows. I also tried running this using Python 3.8.10 in Ubuntu. I get a similar error.
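The traceback shows the import resolving to a local file named jira2markdown.py in the script's own directory, which then tries to import from itself. A small check like the following can flag that situation. It is a sketch, and `shadows_package` is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def shadows_package(script_path: str, package: str) -> bool:
    """Return True when a script's file name matches the package it tries to import.

    PureWindowsPath accepts both backslash and forward-slash separators.
    """
    return PureWindowsPath(script_path).stem == package


# Paths taken from the traceback above.
print(shadows_package(r"C:\OneDrive\Powershell-Scripts\jira2markdown.py", "jira2markdown"))
print(shadows_package(r"C:\OneDrive\Powershell-Scripts\jira2markdown-sample.py", "jira2markdown"))
```

Renaming or removing the local jira2markdown.py should let Python find the installed package instead.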
When using jira2markdown's convert() function on Jira lists with carriage return + line feed (CRLF) line breaks, the resulting Markdown adds erroneous whitespace to the text after the last list item.
See below for a visual example of the conversion issue.
from jira2markdown import convert
jira_text = 'Line Before List: Sample text words words words:\r\n * Bulleted Item 1: Sample text words words words\r\n * Bulleted Item 2: Sample text words words words\r\n\r\nLine After List: Sample text words words words\r\nLine After List: Sample text words words words'
print(jira_text)
Input (jira_text printed):
Line Before List: Sample text words words words:
* Bulleted Item 1: Sample text words words words
* Bulleted Item 2: Sample text words words words
Line After List: Sample text words words words
Line After List: Sample text words words words
Input (jira_text with line-breaks visualized):
Line Before List: Sample text words words words:\r\n
* Bulleted Item 1: Sample text words words words\r\n
* Bulleted Item 2: Sample text words words words\r\n
\r\n
Line After List: Sample text words words words\r\n
Line After List: Sample text words words words
md_text = convert(jira_text)
Expected Output (md_text printed):
Line Before List: Sample text words words words:
- Bulleted Item 1: Sample text words words words
- Bulleted Item 2: Sample text words words words
Line After List: Sample text words words words
Line After List: Sample text words words words
Expected Output (md_text with line-breaks visualized):
Line Before List: Sample text words words words:\r\n
- Bulleted Item 1: Sample text words words words\r\n
- Bulleted Item 2: Sample text words words words\r\n
\r\n
Line After List: Sample text words words words\r\n
Line After List: Sample text words words words\r\n
print(md_text)
Actual Output (md_text printed):
Line Before List: Sample text words words words:
- Bulleted Item 1: Sample text words words words
- Bulleted Item 2: Sample text words words words
Line After List: Sample text words words words
Line After List: Sample text words words words
Actual Output (md_text with line-breaks visualized):
Line Before List: Sample text words words words:\r\n
- Bulleted Item 1: Sample text words words words\n
- Bulleted Item 2: Sample text words words words\n
\n
Line After List: Sample text words words words\n
Line After List: Sample text words words words
As shown, the conversion ends up replacing:
\r\n in the list with \n
\r\n\r\n at the end of the list with \n \n
\r\n after the list with \n
Copy-and-Pasteable Snippet to replicate the issue:
from jira2markdown import convert
# Input with CRLF line-breaks
jira_text = 'Line Before List: Sample text words words words:\r\n * Bulleted Item 1: Sample text words words words\r\n * Bulleted Item 2: Sample text words words words\r\n\r\nLine After List: Sample text words words words\r\nLine After List: Sample text words words words'
# Print input with line-breaks rendered
print("\njira_text:\n" + jira_text)
# Print input with line-breaks represented, not rendered
print("\nrepr(jira_text):\n" + repr(jira_text))
md_text = convert(jira_text)
# Print output with line-breaks rendered
print("\nmd_text:\n" + md_text)
# Print output with line-breaks represented, not rendered
print("\nrepr(md_text):\n" + repr(md_text))
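Until the converter handles CRLF input itself, one possible workaround is to normalize line endings before conversion and restore them afterwards. The sketch below assumes that LF-only input does not trigger the whitespace problem, which has not been verified here. The wrapper name `convert_preserving_crlf` is hypothetical, and the converter is passed in as a parameter so the sketch does not depend on the library being importable.

```python
from typing import Callable


def convert_preserving_crlf(text: str, convert_fn: Callable[[str], str]) -> str:
    """Run convert_fn on LF-normalized text, then restore CRLF if the input used it."""
    uses_crlf = "\r\n" in text
    normalized = text.replace("\r\n", "\n")
    result = convert_fn(normalized)
    return result.replace("\n", "\r\n") if uses_crlf else result


# Stand-in converter for demonstration only; in practice pass jira2markdown.convert.
def fake_convert(s: str) -> str:
    return s.replace(" * ", "- ")


print(repr(convert_preserving_crlf("Before:\r\n * Item\r\n\r\nAfter", fake_convert)))
```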
Would be nice to add underline support.
To do it now you can use the following code:
import jira2markdown
from jira2markdown.elements import MarkupElements
from jira2markdown.markup.text_effects import QuotedElement, Underline
text = "This is a +test+"
class CustomUnderline(QuotedElement):
    TOKEN = "+"
    QUOTE_CHAR = "<u>"
    END_QUOTE_CHAR = "</u>"
elements = MarkupElements()
elements.replace(Underline, CustomUnderline)
print(jira2markdown.convert(text, elements=elements))
jira2markdown pins pyparsing to < 3.0.0. Many newer tools use pyparsing > 3.0.0, which creates pip conflicts if jira2markdown is used.
If nothing breaks, please allow pyparsing > 3.0.0.
I haven't been able to find the Atlassian documentation explaining why this happens, but in real-world usage the Jira editor sometimes wraps text effects (~, _, +, etc.) in curly braces { }.
For instance instead of having this:
This is *strong*
This is +inserted+
This is _emphasis_
it will sometimes have this:
This is {*}strong{*}
This is {+}inserted{+}
This is {_}emphasis{_}
which renders in Jira exactly the same as the first.
jira2markdown isn't aware of this syntax, which causes erroneous Markdown output.
I can provide additional info and/or submit a PR later on. I feel like following something similar to the implementation of the color conversion in text_effects.Color() would be a good way forward.
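As a stopgap, the braces can be removed before conversion so the text falls back to the plain form. This is a pre-processing sketch, not the parser-level fix proposed above. `unwrap_braced_tokens` is a hypothetical name, and the set of wrapped characters is an assumption based on the examples in this report.

```python
import re

# A single text-effect character wrapped in braces, e.g. {*}, {+}, {_}.
BRACED_TOKEN = re.compile(r"\{([*_+~^-])\}")


def unwrap_braced_tokens(text: str) -> str:
    """Turn {*}strong{*} into *strong*, and likewise for the other tokens."""
    return BRACED_TOKEN.sub(r"\1", text)


for sample in ("This is {*}strong{*}", "This is {+}inserted{+}", "This is {_}emphasis{_}"):
    print(unwrap_braced_tokens(sample))
```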
Currently no image attributes (width, height) are supported; it would be nice to support these.
An example of how this can be done:
import re
import jira2markdown
from jira2markdown.elements import MarkupElements
from jira2markdown.markup.base import AbstractMarkup
from jira2markdown.markup.images import Image
from pyparsing import (
    Combine,
    Optional,
    ParserElement,
    ParseResults,
    PrecededBy,
    Regex,
    StringStart,
    Word,
    printables,
)

class CustomImage(AbstractMarkup):
    def action(self, tokens: ParseResults) -> str:
        url = tokens.url
        attr_str = self._create_attribute_dict(tokens.attrs)
        return f'<img src="{url}" {attr_str} />'

    @property
    def expr(self) -> ParserElement:
        return (StringStart() | PrecededBy(Regex(r"\W", flags=re.UNICODE), retreat=1)) + Combine(
            "!"
            + Word(printables + " ", min=3, exclude_chars="|!").set_results_name("url")
            + Optional("|")
            + Word(printables + ",", exclude_chars="!").set_results_name("attrs")
            + Optional("!")
        ).set_parse_action(self.action)

    @staticmethod
    def _create_attribute_dict(attrs_str: str) -> str:
        attrs = {}
        for attr in attrs_str.split(','):
            key, value = attr.split('=')
            attrs[key] = value
        attr_str = " ".join([f'{k}="{v}"' for k, v in attrs.items()])
        return attr_str

elements = MarkupElements()
elements.replace(Image, CustomImage)
markdown_text = jira2markdown.convert("!image.png|width=200,height=400!", elements=elements)
print(markdown_text)
This will print
<img src="image.png" width="200" height="400" />
P.S. ChatGPT to the rescue for helping to find the correct parsing
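The attribute parsing in the example above assumes every attribute has the form key=value. Jira image markup also allows bare flags such as thumbnail and spaces after commas (for example align=right, vspace=4), which would make attr.split('=') fail. A more tolerant variant of that step might look like this; it is a sketch, and `image_attrs_to_html` is a hypothetical name.

```python
from html import escape


def image_attrs_to_html(attrs_str: str) -> str:
    """Convert 'width=200,height=400' or 'thumbnail' into an HTML attribute string."""
    parts = []
    for raw in attrs_str.split(","):
        item = raw.strip()
        if not item:
            continue  # tolerate empty input and stray commas
        key, sep, value = item.partition("=")
        if sep:
            parts.append(f'{escape(key.strip())}="{escape(value.strip())}"')
        else:
            parts.append(escape(key))  # bare flag without a value
    return " ".join(parts)


print(image_attrs_to_html("width=200,height=400"))
```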
Hi, I found that I can't convert a link in a table using jira2markdown.convert().
Here's my code:
import jira2markdown
markup = '||aa||bb||cc||dd||\n|row1|row1-1|row1-2|row1-3[mylink|https://www.google.com]|'
result = jira2markdown.convert(markup)
print(result) # shows '|aa|bb|cc|dd|\n|-|-|-|-|\n|row1|row1-1|row1-2|row1-3|\n'
As you can see from the result above, the link [mylink|https://www.google.com] disappears after the conversion.
I use Python 3.8.17 on macOS Monterey 12.6 to run the above code, with the latest version (0.3.4) of jira2markdown.
This error doesn't occur when I run it on macOS Ventura 13.4.1.
How can I run jira2markdown correctly on macOS Monterey 12.6?
What are the dependent libraries of jira2markdown?
Greetings,
This is great software. My only issue is there is no license. Could you kindly add a license so I can hopefully make use of your code?
Hi,
First, thanks for this repo, which is what I was looking for!
I installed it with pip (Python 3.8.10).
When running it I got:
Traceback (most recent call last):
  File "/home/pyd/Code/Jira-to-wikijs/test.py", line 1, in <module>
    from jira2markdown import convert
  File "/home/pyd/.local/lib/python3.8/site-packages/jira2markdown/__init__.py", line 1, in <module>
    from .parser import convert # noqa
  File "/home/pyd/.local/lib/python3.8/site-packages/jira2markdown/parser.py", line 7, in <module>
    ParserElement.set_default_whitespace_chars("")
AttributeError: type object 'ParserElement' has no attribute 'set_default_whitespace_chars'
My test file is very simple; I used the example you provided:
from jira2markdown import convert
print(convert("Some *Jira text* formatting [example|https://example.com]."))
Do you know what could possibly have gone wrong?
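A plausible cause is an older pyparsing in the environment: to my knowledge the snake_case method names such as set_default_whitespace_chars were introduced in pyparsing 3.0.0, and earlier releases only have the camelCase spellings. A quick way to check the installed version is sketched below. The helper names are hypothetical, and the 3.0.0 threshold is stated as an assumption.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def major_version(version_string: str) -> int:
    """Extract the leading major number from a version string like '2.4.7'."""
    return int(version_string.split(".")[0])


def pyparsing_has_snake_case_api() -> Optional[bool]:
    """True if pyparsing >= 3 is installed, False if older, None if not installed."""
    try:
        return major_version(version("pyparsing")) >= 3  # assumed threshold: 3.0.0
    except PackageNotFoundError:
        return None


print(pyparsing_has_snake_case_api())
```

If this prints False, upgrading pyparsing in the same environment would be the first thing to try.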
When converting tables into Markdown, the resulting tables have single-hyphen (minus-sign) separators (-), which do not follow the YouTrack Markdown implementation of triple-hyphen (minus-sign) separators (---):
Separate the header row from the rest of the table with three or more minus signs (---).
Some Markdown processors render the resulting single-hyphen separated tables but others do not.
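As a workaround on the output side, the separator row can be widened after conversion. The sketch assumes the separator row has exactly the shape `|-|-|...|` with no spaces; `widen_table_separators` is a hypothetical name.

```python
import re

# A whole line made only of "|-" cells, e.g. "|-|-|-|-|".
SEPARATOR_ROW = re.compile(r"^\|(?:-\|)+$", re.MULTILINE)


def widen_table_separators(markdown: str) -> str:
    """Replace single-hyphen table separator cells with triple-hyphen ones."""
    return SEPARATOR_ROW.sub(lambda m: m.group(0).replace("-", "---"), markdown)


print(widen_table_separators("|aa|bb|\n|-|-|\n|row1|row1-1|\n"))
```

Only lines consisting entirely of separator cells are touched, so hyphens inside ordinary cells such as row1-1 are preserved.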
Hello,
Thanks for this very convenient tool!
When parameters are used with the {noformat} tag, there are some mis-conversions.
As documented by https://jira.atlassian.com/secure/WikiRendererHelpAction.jspa?section=advanced:
All the optional parameters of {panel} macro are valid for {noformat} too.
The current implementation does not recognize the {noformat} tag at all when there is a parameter.
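Until parameters are supported, one option is to drop them before conversion so the tag is recognized in its plain form. This loses the panel styling (title, borders, colors), which Markdown code blocks cannot express anyway. It is a sketch with a hypothetical function name.

```python
import re

# Opening tag with parameters, e.g. {noformat:title=Log|borderStyle=solid}.
NOFORMAT_WITH_PARAMS = re.compile(r"\{noformat:[^}]*\}")


def strip_noformat_params(text: str) -> str:
    """Rewrite {noformat:...} as a bare {noformat} tag."""
    return NOFORMAT_WITH_PARAMS.sub("{noformat}", text)


print(strip_noformat_params("{noformat:title=Log|borderStyle=solid}\nsome output\n{noformat}"))
```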
I believe pyparsing is requiring ListIndent to implement _generateDefaultName, per the following error received:
...
    temp = jira2markdown.convert(text)
  File "/home/.conda/envs/operations/lib/python3.10/site-packages/jira2markdown/parser.py", line 18, in convert
    markup << elements.expr(inline_markup, markup, usernames, elements)
  File "/home/.conda/envs/operations/lib/python3.10/site-packages/jira2markdown/elements.py", line 64, in expr
    return MatchFirst([
  File "/home/.conda/envs/operations/lib/python3.10/site-packages/jira2markdown/elements.py", line 65, in <listcomp>
    element(inline_markup=inline_markup, markup=markup, usernames=usernames).expr
  File "/home/.conda/envs/operations/lib/python3.10/site-packages/jira2markdown/markup/lists.py", line 85, in expr
    + ListIndent(self.indent_state, self.tokens)
TypeError: Can't instantiate abstract class ListIndent with abstract method _generateDefaultName
Jira supports two different syntax styles for mixed nested lists, but jira2markdown only identifies/converts one of them:
First version (works in jira2markdown): each nested bullet begins with the same bullet character(s) (#'s or *'s) as its parent bullet:
# a
# numbered
#* with
#* nested
#* bullet
# list
* a
* bulleted
*# with
*# nested
*# numbered
* list
Second version (does not work in jira2markdown): each nested bullet uses only the bullet character (#'s or *'s) of its own type:
# a
# numbered
** with
** nested
** bullet
# list
* a
* bulleted
## with
## nested
## numbered
* list
I can provide additional info and/or submit a PR later on.
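One way to bridge the two styles without touching the parser is to rewrite the second style into the first before conversion, by remembering the most recent bullet prefix at each depth. This is a sketch under the assumption that every list line starts at column 0 with a run of # or * followed by a space; `normalize_nested_bullets` is a hypothetical name.

```python
import re

LIST_LINE = re.compile(r"^([*#]+)( .*)$")


def normalize_nested_bullets(text: str) -> str:
    """Rewrite '**' under '#' as '#*' (and '##' under '*' as '*#')."""
    out = []
    prefixes = {}  # depth -> most recent normalized bullet prefix at that depth
    for line in text.split("\n"):
        match = LIST_LINE.match(line)
        if not match:
            prefixes.clear()  # any non-list line ends the current list
            out.append(line)
            continue
        bullets, rest = match.groups()
        depth = len(bullets)
        parent = prefixes.get(depth - 1)
        if depth > 1 and parent is not None:
            # Keep this item's own type (last char) but inherit the parent's prefix.
            bullets = parent + bullets[-1]
        prefixes[depth] = bullets
        for deeper in [d for d in prefixes if d > depth]:
            del prefixes[deeper]
        out.append(bullets + rest)
    return "\n".join(out)


print(normalize_nested_bullets("# a\n# numbered\n** with\n** nested\n# list"))
```

Input already written in the first style passes through unchanged, since the inherited prefix matches what is already there.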
Hello,
Thanks for the library, it works like a charm in general.
I am using jira2markdown version 0.2.1 in a Poetry project.
If you use the visual editor in JIRA, you often get a blank (space) in front of the list symbol. It is nonetheless rendered correctly in JIRA.
❯ python
Python 3.10.5 (main, Jun 23 2022, 17:15:25) [Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import jira2markdown
>>> jira2markdown.convert('* Blue\n* Green')
'- Blue\n- Green'
>>> jira2markdown.convert(' * Blue\n * Green')
' \\* Blue\n \\* Green'
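A simple pre-processing step can remove the leading blank before the list symbol so that the input matches the form that converts correctly in the session above. This is a sketch with a hypothetical function name. It would also dedent bullet-like lines inside code or noformat blocks, so it is only suitable where that is acceptable.

```python
import re

# Leading spaces or tabs directly before a run of list markers and a space.
INDENTED_BULLET = re.compile(r"^[ \t]+(?=[*#-]+ )", re.MULTILINE)


def dedent_list_markers(text: str) -> str:
    """Strip whitespace in front of Jira list markers at the start of a line."""
    return INDENTED_BULLET.sub("", text)


print(repr(dedent_list_markers(" * Blue\n * Green")))
```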