nonintanon / markdownsharp

Automatically exported from code.google.com/p/markdownsharp

C# 41.64% Perl 21.42% PHP 16.54% HTML 20.40%

markdownsharp's Issues

Problems on img

What steps will reproduce the problem?
1. Trying to insert an image in the text using ref or inline
2. Output does not close the img tag

What is the expected output? What do you see instead?
Expected output should be <img src="..." title="..." />
but I get
<img src="..." title="..."</p>

What version of the product are you using? On what operating system?
Latest (July)

Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 23 Oct 2011 at 6:55

HTML Encoding user inputted value and then running it through MarkdownSharp's Transform method will essentially re-encode text within code blocks.

What steps will reproduce the problem?
1. Reference MarkdownSharp.dll in a web project.
2. Server.HtmlEncode(userInput)
3. markdown.Transform(encodedUserInput)
4. You will notice that the already encoded text gets re-encoded within code 
tags.

What is the expected output?
I would expect to be able to encode user input so that it can be safely stored 
in the database without having to worry about it getting pulled out and 
rendered incorrectly, at which point users visiting the site could become 
vulnerable to XSS attacks. I would then expect MarkdownSharp not to re-encode 
the values in the code blocks since they are already encoded, or at least offer 
a boolean switch to disable re-encoding of tags in code blocks.

What do you see instead?
MarkdownSharp re-encodes the code within the code tags that are rendered to the 
client.

Please provide any additional information below.
I have fixed the issue myself and added an option to the MarkdownSharp source 
that will disable the encoding of code blocks. I have attached the modified 
Markdown.cs file to this issue. Scan the file for _encodeCodeBlocks and you 
will find every area that I made changes to as that is the boolean option I 
added to change the behavior.

If this ever opens up in the future for contribution I would be interested.

Original issue reported on code.google.com by [email protected] on 17 Mar 2011 at 8:17

Attachments:

Markdown.cs

_amps Regex

private static Regex _amps = new Regex(@"&(?!#?[xX]?([0-9a-fA-F]+|\w+);)", 
RegexOptions.ExplicitCapture | RegexOptions.Compiled);

The idea of this regular expression seems to be to find all occurrences of an
ampersand that is not already part of an HTML escape sequence.

Looking at the group "([0-9a-fA-F]+|\w+)", it seems that [0-9a-fA-F] is nothing
more than a subset of \w, so the group is equivalent to "\w+". The group cannot
be captured anyway due to RegexOptions.ExplicitCapture.

Looking at the remaining regex, more becomes obvious: "[xX]?\w". As far as I
know, HTML escape sequences contain either an 'x' to signify a hex sequence or
a '#' to signify a decimal sequence, which would allow us to write "[#xX]?".
That is not the same as the original, but in fact reduces the number of false
positives (e.g. &#x77; is not a valid HTML escape sequence, if I am not
mistaken).

Following these considerations, the pattern @"&(?![#xX]?\w+;)" should work at
least as well as the original, while being easier to read and understand in my
opinion.
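
To compare the two patterns on a few inputs, here is a minimal sketch. The class and method names (AmpsDemo, EscapeOriginal, EscapeProposed) and the sample strings are illustrative assumptions, not MarkdownSharp code; only the two patterns are taken from the report. Note that &#x77; is a valid hexadecimal character reference in HTML, and both patterns leave it untouched on these samples.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AmpsDemo
{
    // Pattern quoted in the report.
    private static readonly Regex Original = new Regex(
        @"&(?!#?[xX]?([0-9a-fA-F]+|\w+);)", RegexOptions.ExplicitCapture);

    // Pattern proposed in the report.
    private static readonly Regex Proposed = new Regex(@"&(?![#xX]?\w+;)");

    // Replace every ampersand that does not start an escape sequence.
    public static string EscapeOriginal(string text) => Original.Replace(text, "&amp;");
    public static string EscapeProposed(string text) => Proposed.Replace(text, "&amp;");

    public static void Main()
    {
        string[] samples = { "AT&T", "a &amp; b", "&#x77;", "&#119;", "x & y" };
        foreach (string sample in samples)
        {
            Console.WriteLine("{0} -> {1} | {2}",
                sample, EscapeOriginal(sample), EscapeProposed(sample));
        }
    }
}
```

On these samples the two patterns produce identical output, which is consistent with the report's point that the character class is redundant next to \w.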

Original issue reported on code.google.com by [email protected] on 31 Dec 2009 at 6:32

Please release the product periodically

Could you please release the product periodically? For example, I am planning 
to publish a PowerShell script that uses MarkdownSharp.dll (just a few lines of 
code and the link to this project, I am not planning to distribute the DLL 
itself). It would be much easier for users to get the latest DLL version than 
to get the sources and build the DLL. Besides, getting the latest sources is 
really tricky for those who do not have Mercurial.

Thank you for the useful tool. Please make it easier to use (to get, actually).

Original issue reported on code.google.com by nightroman on 18 Sep 2011 at 7:08

_emptyElementSuffix null when passing empty MarkdownOptions

What steps will reproduce the problem?
1. Call public Markdown(MarkdownOptions options)
2. Transform text that generates empty html element (eg. <br />)
3. Inspect invalid html

What is the expected output? What do you see instead?
Expect html with '<br />', seeing '<br'

Please provide any additional information below.

This unit-test demonstrated the problem:
        [Test]
        public void BreaklinesWithEmptyOptions()
        {
            string input = "Bla bla:  \r\nFoo";
            string expected = "Bla bla:<br />\nFoo";
            var markdownSharp = new Markdown(
                new MarkdownOptions{}
            );
            string actual = markdownSharp.Transform(input);
            Assert.AreEqual(expected, actual);
        }

It's due to bad option handling. Supposedly, _emptyElementSuffix is initialized
to a sane default in this line:
private string _emptyElementSuffix = " />";

But that gets overridden with 'null' if one foolishly passes an options object
without setting 'EmptyElementSuffix'.
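
One way to avoid this kind of override is to fall back to the default whenever the option is null. The sketch below illustrates the idea with hypothetical stand-in types (DemoOptions, DemoRenderer); it is not the actual MarkdownSharp constructor, and only the EmptyElementSuffix name and the " />" default come from the report.

```csharp
using System;

// Hypothetical stand-in for an options object; not the real MarkdownOptions.
public class DemoOptions
{
    // Stays null unless the caller sets it explicitly.
    public string EmptyElementSuffix { get; set; }
}

// Hypothetical stand-in for the class that consumes the options.
public class DemoRenderer
{
    private const string DefaultSuffix = " />";
    private readonly string _emptyElementSuffix;

    public DemoRenderer(DemoOptions options)
    {
        // Keep the default instead of copying a null over it.
        _emptyElementSuffix = options?.EmptyElementSuffix ?? DefaultSuffix;
    }

    public string HardBreak() => "<br" + _emptyElementSuffix;
}

public static class EmptySuffixDemo
{
    public static void Main()
    {
        Console.WriteLine(new DemoRenderer(new DemoOptions()).HardBreak());
        Console.WriteLine(
            new DemoRenderer(new DemoOptions { EmptyElementSuffix = ">" }).HardBreak());
    }
}
```

With this fallback an empty options object still yields a closed tag, while an explicit setting is respected.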

Original issue reported on code.google.com by [email protected] on 27 Jan 2011 at 1:29

Merged into: #38

Add support for HTML5 semantic elements

Elements such as <header>, <footer>, <aside>, <article>, <section>, and 
<hgroup> are all useful when using Markdown. It would be nice to have an 
addition to the spec that supports these elements. The exact syntax should 
certainly be debated a bit. I am hoping to begin work on a Ruby library (or 
additions to existing ones) to add HTML5 support and I would like to have us 
agree on a syntax.

Original issue reported on code.google.com by [email protected] on 17 Oct 2010 at 5:22

Strange behaviour of the ordered/unordered lists

Passing this text

    - lion
    - tiger
    - cougar
    - smilodon
    - Garfield




    1. London
    2. Paris
    3. Barcelona
    4. Kraków


I'm getting this beauty

    <ul>
    <li>lion</li>
    <li>tiger</li>
    <li>cougar</li>
    <li>smilodon</li>
    <li><p>Garfield</p>

    <ol>
    <li>London</li>
    <li>Paris</li>
    <li>Barcelona</li>
    <li>Kraków</li>
    </ol></li>
    </ul>

... although I am expecting something more conventional:

    <ul>
    <li>lion</li>
    <li>tiger</li>
    <li>cougar</li>
    <li>smilodon</li>
    <li>Garfield</li>
    </ul>


    <ol>
    <li>London</li>
    <li>Paris</li>
    <li>Barcelona</li>
    <li>Kraków</li>
    </ol>


I am not sure whether it is a bug or just a feature. Trying to solve it alone 
but if someone knows the right direction I'd be *very* grateful.

Thanks in advance

Original issue reported on code.google.com by [email protected] on 25 Jan 2011 at 12:57

Outdent Performance

Outdent is the sole user of the static _outDent (who would have guessed?), 
doing a replace on the whole match. _outDent however has a capturing group 
inside (which previously held a "\t|[ ]{1,_tabWidth}" and now only consists 
of "[ ]{1,_tabWidth}"). Simply making this group non-capturing (or removing 
the grouping entirely, since it serves no more purpose) improves 
performance a bit.

Another slight performance gain can be achieved by implementing the 
function without regular expressions at all (see attached file for my best 
try).

Although my fastest solution might be overkill for such a negligible 
performance gain, I think the regex should be changed in any case, since it 
hurts me to see performance lying on the streets :)
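
For illustration only, a regex-free outdent could look roughly like the sketch below. This is not the attached Outdent.cs; the names (OutdentDemo, OutdentWithRegex, OutdentManually) and the tab width of 4 are assumptions. It strips up to TabWidth leading spaces from every line, which is what the "[ ]{1,_tabWidth}" pattern does when anchored at line starts.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class OutdentDemo
{
    private const int TabWidth = 4; // assumed value

    // Regex variant without any group, as suggested in the report.
    private static readonly Regex OutdentRegex =
        new Regex("^[ ]{1," + TabWidth + "}", RegexOptions.Multiline);

    public static string OutdentWithRegex(string block) =>
        OutdentRegex.Replace(block, "");

    // Manual variant: walk the text once, skipping leading spaces per line.
    public static string OutdentManually(string block)
    {
        var output = new StringBuilder(block.Length);
        int i = 0;
        while (i < block.Length)
        {
            // Skip at most TabWidth spaces at the start of the line.
            int skipped = 0;
            while (skipped < TabWidth && i < block.Length && block[i] == ' ')
            {
                i++;
                skipped++;
            }

            // Copy the rest of the line, including its newline.
            while (i < block.Length)
            {
                char c = block[i++];
                output.Append(c);
                if (c == '\n') break;
            }
        }
        return output.ToString();
    }
}
```

Both variants should return the same string for "\n"-terminated lines; any speed difference would need to be measured with the project's own benchmark.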

Perf Original:
input string length: 475
8000 iterations in 5222 ms (0,65275 ms per iteration)
input string length: 2356
2000 iterations in 5003 ms (2,5015 ms per iteration)
input string length: 27737
180 iterations in 4944 ms (27,4666666666667 ms per iteration)
input string length: 11075
375 iterations in 4852 ms (12,9386666666667 ms per iteration)
input string length: 88607
45 iterations in 4693 ms (104,288888888889 ms per iteration)
input string length: 354431
12 iterations in 5125 ms (427,083333333333 ms per iteration)

Perf Non-Capturing Group:
input string length: 475
8000 iterations in 5195 ms (0,649375 ms per iteration)
input string length: 2356
2000 iterations in 4894 ms (2,447 ms per iteration)
input string length: 27737
180 iterations in 4843 ms (26,9055555555556 ms per iteration)
input string length: 11075
375 iterations in 4727 ms (12,6053333333333 ms per iteration)
input string length: 88607
45 iterations in 4635 ms (103 ms per iteration)
input string length: 354431
12 iterations in 4881 ms (406,75 ms per iteration)

Perf w/o regex:
input string length: 475
8000 iterations in 5146 ms (0,64325 ms per iteration)
input string length: 2356
2000 iterations in 4860 ms (2,43 ms per iteration)
input string length: 27737
180 iterations in 4806 ms (26,7 ms per iteration)
input string length: 11075
375 iterations in 4651 ms (12,4026666666667 ms per iteration)
input string length: 88607
45 iterations in 4565 ms (101,444444444444 ms per iteration)
input string length: 354431
12 iterations in 4831 ms (402,583333333333 ms per iteration)

Original issue reported on code.google.com by [email protected] on 13 Jan 2010 at 1:17

Attachments:

Outdent.cs

Making the current implementations easy to change

While reading issue 32, I started to wonder whether a similar approach should be
taken with all the transformation methods in the library (DoAnchors, DoImages,
DoLists, etc.).

The reason I'm suggesting it is because people already want to modify the 
behaviours of the current transformation methods (see issue 30 and issue 35). 
And right now they can't make these modifications without changing the source 
code directly.

Here's what I have in mind: there would be an ITransform interface (or maybe
different interfaces for block-level and inline transformation methods):
public interface ITransform {
    string Transform(string text);
}
And interfaces for the default transformation methods in the library, like:
public interface ITransformAnchors : ITransform {}

A class that implements ITransformAnchors would be responsible for transforming
the anchor elements:
public class DefaultAnchorTransformer : ITransformAnchors {
    public string Transform(string text) {
        // do reference-style links
        // inline-style links. and so on...
        return text; // placeholder so the sketch compiles
    }
}

So if an end developer wants to add target="_blank" or rel="nofollow"
attributes to the anchors (as in issues 30 and 35), they would pass in
their own class that implements ITransformAnchors, and the Markdown class
would use that instead.

Or if they want to do more transformations (like adding support for tables etc) 
then they would send an ITransform and Markdown class would run that type's 
Transform() method when it's done with the rest.

Another similar approach would be having a Transformation class :
public class Transformation {
    public string Name { get; set; }
    public Func<string, string> Transform { get; set; }
    public ElementType Type { get; set; } // anchor, image etc
    // and maybe another property for order.
}
And the same idea would apply again.
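
To make the idea concrete, here is a small self-contained sketch of a pluggable pipeline. ITransform mirrors the interface proposed above; NoFollowTransform and TransformPipeline are hypothetical names invented for this illustration, and nothing here is existing MarkdownSharp API. The extra step post-processes already rendered HTML and adds rel="nofollow" to anchors that have no rel attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// The interface proposed in this issue.
public interface ITransform
{
    string Transform(string text);
}

// Hypothetical extra step: add rel="nofollow" to anchors lacking a rel attribute.
public class NoFollowTransform : ITransform
{
    private static readonly Regex AnchorOpen =
        new Regex(@"<a\s+(?![^>]*\brel=)", RegexOptions.IgnoreCase);

    public string Transform(string text) =>
        AnchorOpen.Replace(text, "<a rel=\"nofollow\" ");
}

// Hypothetical runner that applies the registered steps in order.
public class TransformPipeline
{
    private readonly List<ITransform> _steps = new List<ITransform>();

    public TransformPipeline Add(ITransform step)
    {
        _steps.Add(step);
        return this;
    }

    public string Run(string text)
    {
        foreach (ITransform step in _steps)
        {
            text = step.Transform(text);
        }
        return text;
    }
}
```

A caller would register its own steps, for example new TransformPipeline().Add(new NoFollowTransform()).Run(html), without touching the library source.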


Any opinions?

Original issue reported on code.google.com by [email protected] on 9 Oct 2010 at 1:46

Patch for /MarkdownSharp/Markdown.cs

spelling mistake in comment

Original issue reported on code.google.com by [email protected] on 18 Jul 2011 at 12:00

Merged into: #46

Attachments:

Markdown.cs.patch

Option to render URLs with rel="nofollow"

The site I'm developing will suffer from questionable behaviour by people with 
commercial interests if we let them post URLs without rel="nofollow" on all 
HTML anchors generated in Markdown.

I've attached a patch (generated by Git) of the changes I made to Markdown.cs 
in implementing support for this.

I haven't changed the default behaviour.  Just set the NoFollowLinks property 
to true if you want to opt into this behaviour.

Version number bumped too.

Original issue reported on code.google.com by drewnoakes on 6 Oct 2010 at 11:27

Attachments:

MarkdownSharp-NoFollowLinks.patch

Speed-up for nested block pattern matching

Before change:

input string length: 475
4000 iterations in 3814 ms (0.9535 ms per iteration)
input string length: 2356
1000 iterations in 4215 ms (4.215 ms per iteration)
input string length: 27737
100 iterations in 5908 ms (59.08 ms per iteration)
input string length: 11075
1 iteration in 25 ms
input string length: 88607
1 iteration in 278 ms
input string length: 354431
1 iteration in 2386 ms

After:

input string length: 475
4000 iterations in 3756 ms (0.939 ms per iteration)
input string length: 2356
1000 iterations in 4196 ms (4.196 ms per iteration)
input string length: 27737
100 iterations in 4753 ms (47.53 ms per iteration)
input string length: 11075
1 iteration in 23 ms
input string length: 88607
1 iteration in 190 ms
input string length: 354431
1 iteration in 1027 ms

with all unit tests passing.

So a moderate speed-up.

Change to:

        private static Regex _blocksNested = new Regex(string.Format(@"
                (                       # save in $1
                    ^                   # start of line (with /m)
                    <({0})              # start tag = $2
                    \b                  # word break
                    (?>.*\n)*?          # any number of lines, minimally matching
                    </\2>               # the matching end tag
                    [ \t]*              # trailing spaces/tabs
                    (?=\n+|\Z)          # followed by a newline or end of document
                )", _blockTags1),
                RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled);

        private static string _blockTags2 = "p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|script|noscript|form|fieldset|iframe|math";
        private static Regex _blocksNestedLiberal = new Regex(string.Format(@"
               (                        # save in $1
                    ^                   # start of line (with /m)
                    <({0})              # start tag = $2
                    \b                  # word break
                    (?>.*\n)*?          # any number of lines, minimally matching
                    .*</\2>             # the matching end tag
                    [ \t]*              # trailing spaces/tabs
                    (?=\n+|\Z)          # followed by a newline or end of document
                )", _blockTags2),
                RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled);

The important part is:

(?>.*\n)*?

instead of:

(.*\n)*?
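
As a rough illustration of why the swap is safe, the sketch below runs a simplified version of the pattern (a single hard-coded tag name, no comments) with both group styles. The class and method names are assumptions made for this example. Because "." never matches "\n", the greedy ".*" already consumes the whole line, so making the group atomic should only stop the engine from retrying shorter line matches that cannot succeed anyway.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AtomicGroupDemo
{
    // End of the simplified pattern: closing tag, trailing blanks, end of line.
    private const string Tail = @"</\1>[ \t]*(?=\n+|\Z)";

    // Variant with the ordinary capturing group.
    private static readonly Regex Capturing =
        new Regex(@"^<(div)\b(.*\n)*?" + Tail, RegexOptions.Multiline);

    // Variant with the atomic group from the change.
    private static readonly Regex Atomic =
        new Regex(@"^<(div)\b(?>.*\n)*?" + Tail, RegexOptions.Multiline);

    private static string FirstBlock(Regex regex, string input)
    {
        Match match = regex.Match(input);
        return match.Success ? match.Value : null;
    }

    public static string WithCapturing(string input) => FirstBlock(Capturing, input);
    public static string WithAtomic(string input) => FirstBlock(Atomic, input);

    public static void Main()
    {
        string input = "<div>\nhello\n</div>\n\nafter";
        Console.WriteLine(WithCapturing(input) == WithAtomic(input));
    }
}
```

The timings quoted above remain the real evidence for the speed-up; this sketch only checks that both variants pick the same block on a small input.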

Original issue reported on code.google.com by [email protected] on 4 Jan 2010 at 10:22

Added target="_blank" support to links

What version of the product are you using? On what operating system?
v1.12, VS2008 Professional on Server 2008 64-bit.

Please provide any additional information below.
I had a requirement to allow users to specify target="_blank" on links, and
noticed this functionality used to be part of MarkdownSharp.  I have added
it back in again.

Note, when running the tests, some tests (e.g. code-inside-list) produce
different output to what's expected.  Having said that, I re-ran the tests
with the downloaded v1.12 source and got the same results!

Have attached a .patch file.

Original issue reported on code.google.com by [email protected] on 8 Mar 2010 at 11:34

Attachments:

Markdown.patch

Unable to load the solution in VS2005

What steps will reproduce the problem?
1. Open Visual Studio 2005 Professional Edition
2. Open the file "MarkdownSharp.sln".
3. Error reported: "The selected file is a solution file, but was created by a 
newer version of this application and cannot be opened."

What is the expected output? What do you see instead?
- Should be compatible with previous versions of VS.

What version of the product are you using? On what operating system?
- Visual Studio 2005 Professional Edition
- Windows Vista Business SP2

Original issue reported on code.google.com by [email protected] on 12 Jul 2010 at 11:12

How to contribute

Please write up a short wiki page on how to contribute to the project. Do
you require the use of Resharper or similar tools, how should the code be
formatted, do you prefer patches or regular commits, etc? If you use
Resharper or something similar, bundling a .resharper file with the
solution would help others with the tool to contribute without breaking the
code formatting standard.

Original issue reported on code.google.com by asbjornu on 26 Dec 2009 at 2:26

ReverseTransform

I'd love a function that does an HTML -> Markdown conversion.

I'm working on a patch, but it's slow going, as my knowledge of RegEx is
pretty basic.

Original issue reported on code.google.com by [email protected] on 4 Jan 2010 at 5:41

Patch (MarkdownOptions)

* Added a MarkdownOptions class under the same namespace.
* Added a constructor that accepts MarkdownOptions so the end developer can
keep an instance of it and pass it whenever a new Markdown instance with
the same options is needed.
* There shouldn't be any performance difference since none of the Regex
patterns or method implementations were modified.

Original issue reported on code.google.com by [email protected] on 24 Feb 2010 at 6:44

Attachments:

MarkdownOptions.patch

RegexOptions.Multiline expression changes

Instead of using the lookahead and end-of-file comparison

(?=\n+|\Z)

I changed it to the plain old

$

which, when used in combination with RegexOptions.Multiline, does the same
thing as the expression above and seems to cut execution time by 14-20%.
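
A quick way to sanity-check the claimed equivalence is to count matches with both endings on the same input. The sketch below is an illustration with assumed names (EndOfLineDemo, CountWithLookahead, CountWithDollar) and an invented sample; it covers "\n" line endings only and is not a substitute for running the test suite.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EndOfLineDemo
{
    // Ending used before the patch.
    private static readonly Regex WithLookahead =
        new Regex(@"</div>[ \t]*(?=\n+|\Z)", RegexOptions.Multiline);

    // Ending used after the patch.
    private static readonly Regex WithDollar =
        new Regex(@"</div>[ \t]*$", RegexOptions.Multiline);

    public static int CountWithLookahead(string input) =>
        WithLookahead.Matches(input).Count;

    public static int CountWithDollar(string input) =>
        WithDollar.Matches(input).Count;

    public static void Main()
    {
        // First and last closing tags end a line; the middle one does not.
        string input = "<div>a</div>  \n\n<div>b</div> trailing\n<div>c</div>";
        Console.WriteLine("{0} {1}", CountWithLookahead(input), CountWithDollar(input));
    }
}
```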

patched:
{{{
input string length: 475
performed 1000 iterations in 1025 (1.025 ms per iteration)
input string length: 2356
performed 500 iterations in 2267 (4.534 ms per iteration)
input string length: 27737
performed 100 iterations in 6246 (62.46 ms per iteration)
}}}

un-patched:
{{{
input string length: 475
performed 1000 iterations in 1289 (1.289 ms per iteration)
input string length: 2356
performed 500 iterations in 2755 (5.51 ms per iteration)
input string length: 27737
performed 100 iterations in 7236 (72.36 ms per iteration)
}}}

Original issue reported on code.google.com by nberardi on 28 Dec 2009 at 4:53

Attachments:

markdown-multiline-patch.patch

Escaping Performance

EncodeBoldItalics and its brethren, which escape sensitive characters, are
called very often. Therefore they seemed to lend themselves to a little
performance nudge.

Seizing the opportunity, I also encapsulated the whole escape mechanisms 
into their own (internal static) class.

Since _escapeTable accesses happen all over the place, I attached both 
Markdown.cs and the (new) Escapes.cs.

Now, the juicy bits: the gain is about 10%.

Before changes:
input string length: 475
8000 iterations in 5222 ms (0,65275 ms per iteration)
input string length: 2356
2000 iterations in 5003 ms (2,5015 ms per iteration)
input string length: 27737
180 iterations in 4944 ms (27,4666666666667 ms per iteration)
input string length: 11075
375 iterations in 4852 ms (12,9386666666667 ms per iteration)
input string length: 88607
45 iterations in 4693 ms (104,288888888889 ms per iteration)
input string length: 354431
12 iterations in 5125 ms (427,083333333333 ms per iteration)

After changes:
input string length: 475
8000 iterations in 4761 ms (0,595125 ms per iteration)
input string length: 2356
2000 iterations in 4558 ms (2,279 ms per iteration)
input string length: 27737
180 iterations in 4403 ms (24,4611111111111 ms per iteration)
input string length: 11075
375 iterations in 4461 ms (11,896 ms per iteration)
input string length: 88607
45 iterations in 4280 ms (95,1111111111111 ms per iteration)
input string length: 354431
12 iterations in 4520 ms (376,666666666667 ms per iteration)

Original issue reported on code.google.com by [email protected] on 13 Jan 2010 at 3:31

Attachments:

Embed test cases as resources and drive tests with data

With the attached patch, NUnit is upgraded to version 2.5.3 (documentation
added as well to help intellisense) and all the file-based tests are
converted to data driven tests via the TestCaseSourceAttribute. See the
following for documentation on the attribute:

http://nunit.com/index.php?p=testCaseSource&r=2.5

All the test files are also included as embedded resources instead of being
addressed directly in the file system, alleviating the need for every
developer to configure the physical path to the "mdtest-1.1" directory.

The patch also adds an exclude pattern to the "mdtest-1.1" directory such
that files generated during execution of the MarkdownSharpTests console
application won't be included in source control.

Original issue reported on code.google.com by asbjornu on 5 Jan 2010 at 1:13

Attachments:

data-driven-tests.diff

The Markdown class should implement an interface to make mocking easier

What steps will reproduce the problem?

Public methods are not virtual so some mocking frameworks are unable to easily 
mock the Markdown class for unit testing.

What is the expected output? What do you see instead?

N/A

What version of the product are you using? On what operating system?

1.12, Windows

Please provide any additional information below.

Using Visual Studio refactoring tools, simply use Create Interface.  This will 
create an IMarkdown (by default) interface and set the class to implement it.  
Users can then mock this interface for unit tests, supplying an instance of the 
class for production code.
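
As a sketch of how this would be used, the example below shows an interface of the shape the refactoring would likely produce, a hand-written test double, and a consumer that depends only on the interface. IMarkdown is the name suggested above; FakeMarkdown and CommentRenderer are hypothetical and exist only for this illustration.

```csharp
using System;

// Interface as the "Create Interface" refactoring might generate it (assumed shape).
public interface IMarkdown
{
    string Transform(string text);
}

// Hand-written test double; a mocking framework could generate the equivalent.
public class FakeMarkdown : IMarkdown
{
    public string LastInput { get; private set; }

    public string Transform(string text)
    {
        LastInput = text;
        return "<p>stub</p>";
    }
}

// Hypothetical consumer that only knows about the interface.
public class CommentRenderer
{
    private readonly IMarkdown _markdown;

    public CommentRenderer(IMarkdown markdown)
    {
        _markdown = markdown;
    }

    public string Render(string comment) =>
        "<div class=\"comment\">" + _markdown.Transform(comment) + "</div>";
}
```

In production code the real Markdown class would be passed in instead of the fake, provided it is declared as implementing the interface.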

Original issue reported on code.google.com by [email protected] on 10 Jun 2010 at 6:19

Invalid auto-newlines in multi-line ordered lists

What steps will reproduce the problem?
Transform the following text with the AutoNewlines option set to true:

1. Line1
   Line2

   Line3
   Line4

2. Suspendisse id sem consectetuer libero luctus adipiscing

Expected output:
<ol>
<li><p>Line1<br>
Line2</p>

<p>Line3<br>
Line4</p></li>
</ol>

Actual output:
<ol>
<li><p>Line1<br
Line2</p>

<p>Line3<br
Line4</p></li>
</ol>

Original issue reported on code.google.com by [email protected] on 22 Dec 2010 at 7:18

Changing the images/links before they're rendered as HTML

I'm using markdownsharp for Markdown support in an OS .NET wiki engine. The 
wiki engine also supports Creole and Media Wiki format and the parsers I use 
for those (from creoleparser.codeplex.com) support link manipulation before the 
images/links are written to HTML. This is critical for the local links in the 
wiki engine as they have to be rewritten for the HTML.

I've attached a patch and the Markdown.cs file (as HG's patch file isn't that
great compared to diffmerge) with the changes. It's really just two new events
and two complementary EventArgs classes. It passes the test suite.

Add it or don't add it, I don't mind either way :] But I thought it might be a 
useful addition. The patch version is in the Wiki source at 
http://hg.shrinkrays.net/roadkill

Regards
Chris

Original issue reported on code.google.com by [email protected] on 7 Mar 2011 at 4:53

Attachments:

markdown.patch.zip

Patch for /MarkdownSharp/Markdown.cs

spelling mistake

Original issue reported on code.google.com by [email protected] on 18 Jul 2011 at 12:02

Attachments:

Markdown.cs.patch

Don't version control the 'bin' folder

Instead of using the "bin" folder to contain library files, I find it
better to have them in a "lib" or "thirdparty" folder on the solution level
of the project so the "bin" folder doesn't need to be version controlled.

Having the "bin" folder version controlled is a bit messy since Visual
Studio creates and deletes folders and files within it without regard for
any possible contained .svn folders holding metadata about the versioned
files. I find that the best thing to do is to ignore the "bin" folder,
since a simple build of the solution imho shouldn't result in versioned
files being changed, folders added and otherwise things that SVN clients
would like to commit as changes.

Attached is a patch that creates the "lib" folder and ignores and excludes
the "bin" (as well as the "obj") folder.

Original issue reported on code.google.com by asbjornu on 4 Jan 2010 at 7:45

Attachments:

lib.diff

Can't have preformatted code immediately following a list (spec bug, not implementation bug)

Sorry if this isn't the place to raise this sort of issue.

Hopefully I'm missing something here but the spec doesn't seem to allow 
preformatted code blocks to come directly after a list (because it reads 
the 4 spaces as a new paragraph within the last list item).

Markdown:

Here's an unordered list:

* asdf
* qwerty

    // Here's some performatted code:
    function() { }



Desired HTML:

<p>Here's an unordered list:</p>
<ul>
<li>asdf</li>
<li>qwerty</li>
</ul>
<pre><code>// Here's some performatted code:
function() { }</code></pre>



Actual HTML (produced by both MarkdownSharp and the Markdown Dingus):

<p>Here's an unordered list:</p>
<ul>
<li>asdf</li>
<li><p>qwerty</p>
<p>// Here's some performatted code:
function() { }</p></li>
</ul>

Original issue reported on code.google.com by [email protected] on 16 Jan 2010 at 7:33

The TestLinkEmails() test fails

String lengths are both 14. Strings differ at index 12.
  Expected: "<p><a href="&#"
  But was:  "<p><a href="ma"
  -----------------------^

Original issue reported on code.google.com by [email protected] on 13 Jul 2011 at 7:46

URL encoding

What steps will reproduce the problem?
1. Add a hyperlink to 
'http://docs.jquery.com/Tutorials:Introducing_$(document).ready()'

What is the expected output? What do you see instead?
A link to be produced, but it's not parsed.

The link isn't URL encoded which is the issue.

Original issue reported on code.google.com by [email protected] on 17 Jan 2010 at 9:33

Additional unit tests

I ran code coverage over your default test suite, and found a few holes in 
corner cases.  I have patches for some of those:

1. Not testing the empty-string case (Transform with null/empty)
2. Handling reference links with bold/italic titles wasn't tested
3. Handling images with empty link ids wasn't tested
4. Handling images with invalid link ids wasn't tested
5. Some normalization/special character cases weren't completely tested (sub 
character, carriage return at end of file)

I also saw the options/config file stuff and options constructor aren't covered 
at all by default.  Maybe this is by design?  I didn't try out the tests you 
have.

And a minor code issue found by examining coverage:

1. LinkEvaluator checks for null in the matches returned - is this necessary?  
I'm getting partial coverage on it because of that.  I don't think other match 
evaluators check for null.  Or if it is necessary, maybe other ones should too?

Original issue reported on code.google.com by [email protected] on 7 Nov 2011 at 10:37

Attachments:

mdtest-1.1-additional.zip

code block on second line

You cannot have a code block on the second line of a string (with an empty
first line)

change the first line of the _codeBlock regex to (?:\n\n|\A\n?)

the optional new line character solves this issue and doesn't break any
existing tests.

Original issue reported on code.google.com by [email protected] on 2 Apr 2010 at 11:21

Provide means of setting options programmatically

I would love to see something like this:

var m = new Markdown(
    LinkEmails = true, 
    StrictBoldItalic = true
    (etc)
);

string s = m.Transform("blah blah");

This should be pretty easy to implement.  I suspect adding / organizing 
unit test coverage for this will be more work than actually implementing 
the functionality.

Assuming there aren't any overly negative reactions to this, I might put a 
patch together this week.
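
For reference, C# object initializers already give a syntax close to the one wished for above, as long as the options are public settable properties. The sketch uses a hypothetical DemoMarkdown class; only the option names LinkEmails and StrictBoldItalic are taken from this issue, and the real class may expose them differently.

```csharp
using System;

// Hypothetical stand-in for the Markdown class with two settable options.
public class DemoMarkdown
{
    public bool LinkEmails { get; set; }
    public bool StrictBoldItalic { get; set; }

    public string Describe() =>
        "LinkEmails=" + LinkEmails + ", StrictBoldItalic=" + StrictBoldItalic;
}

public static class OptionsDemo
{
    // Object initializer: properties are set right after construction.
    public static DemoMarkdown Create() =>
        new DemoMarkdown
        {
            LinkEmails = true,
            StrictBoldItalic = true
        };

    public static void Main()
    {
        Console.WriteLine(Create().Describe());
    }
}
```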

Original issue reported on code.google.com by [email protected] on 4 Jan 2010 at 2:41

input normalization

Attached diff includes code for combining newline normalization and Detab 
function into a function "NormalizeInput". It also ensures that the input 
ends in at least two newlines.

As expected, this change yields a performance gain, although not a big one:

Benchmark before changes:
input string length: 475
7000 iterations in 4816 ms (0,688 ms per iteration)
input string length: 2356
2000 iterations in 5040 ms (2,52 ms per iteration)
input string length: 27737
180 iterations in 5131 ms (28,5055555555556 ms per iteration)
input string length: 11075
300 iterations in 3951 ms (13,17 ms per iteration)
input string length: 88607
40 iterations in 4288 ms (107,2 ms per iteration)
input string length: 354431
10 iterations in 4260 ms (426 ms per iteration)

Benchmark after changes:
input string length: 475
7000 iterations in 4688 ms (0,669714285714286 ms per iteration)
input string length: 2356
2000 iterations in 4968 ms (2,484 ms per iteration)
input string length: 27737
180 iterations in 4953 ms (27,5166666666667 ms per iteration)
input string length: 11075
300 iterations in 3840 ms (12,8 ms per iteration)
input string length: 88607
40 iterations in 4226 ms (105,65 ms per iteration)
input string length: 354431
10 iterations in 4243 ms (424,3 ms per iteration)

Original issue reported on code.google.com by [email protected] on 11 Jan 2010 at 11:21

Attachments:

Normalize.diff

Null string will throw NullReference exception in Transform().

A null string will throw a NullReferenceException in Transform(). The attached
patch adds a null check at the beginning of this method, eliminating
this issue.

I'm not sure how an empty string will behave, or what the intended behaviour
is when an empty string is passed.

Original issue reported on code.google.com by [email protected] on 29 Dec 2009 at 2:14

Attachments:

ap-nullcheck.patch

Include HTML/JavaScript component

It would be nice to have the HTML/JavaScript component included, so that any
modifications made to the Markdown standard in markdownsharp can be
paralleled in the JavaScript version.

Original issue reported on code.google.com by nberardi on 28 Dec 2009 at 3:31

Tidyness test

You have the Tidyness test listed as passing. I've run the 1.006 code and the 
quoted list comes back as indented (2 spaces on each line) for me, which 
isn't as per the test. Is this one of the examples you're counting as passing 
but with a 2 white-space difference?

Original issue reported on code.google.com by [email protected] on 3 Jan 2010 at 5:54

Running tests in MonoDevelop under MacOS

Patch for BaseTest.cs to make tests run in MonoDevelop under MacOS.  This
required switching the slash direction in the filename and using a relative
path to the MarkdownSharpTests directory.

I am *brand new* to C#, so it took me a while to get MonoDevelop set up and
figure out how to get the tests running; I figure this simple patch may
assist others.

Original issue reported on code.google.com by drewgstephens on 31 Dec 2009 at 12:05

Attachments:

Test-MacOS-MonoDevelop.patch

Split Markdown.cs into several partial class files

As discussed in Issue #1, Markdown.cs is a bit hard to maintain and
understand at its current 1593 lines. To remedy this, the attached patch
splits Markdown.cs into several separate files by using the "partial"
keyword. As a result of this, Markdown.cs is down to the imho much more
comfortable 519 lines.

The external structure of the Markdown class remains unchanged, but each
partial class takes care of a separate part of the transformation process,
making it easier to open the right file regarding a specific issue. While
incremental search is brilliant, I still find short code files to be much
easier to work with than large ones.

The patch also does a bit of R# tweaking with regard to making all Regex
objects "readonly" and all instance methods that are called in a static
context "static".

Original issue reported on code.google.com by asbjornu on 7 Jan 2010 at 8:58

Attachments:

split.diff

Detab function performance and \t matching in regexes

The original Detab function uses a regex (_deTab) to convert tabs to the 
appropriate number of spaces. This approach appears to generate many 
strings unnecessarily and to involve a lot of overhead.

Preliminary performance benchmark was:
input string length: 475
6000 iterations in 4691 ms (0,781833333333333 ms per iteration)
input string length: 2356
1500 iterations in 4489 ms (2,99266666666667 ms per iteration)
input string length: 27737
150 iterations in 5159 ms (34,3933333333333 ms per iteration)
input string length: 11075
300 iterations in 4506 ms (15,02 ms per iteration)
input string length: 88607
35 iterations in 4268 ms (121,942857142857 ms per iteration)
input string length: 354431
10 iterations in 4857 ms (485,7 ms per iteration)

Doing this fairly crucial part "manually" (many input lines match _deTab at 
least once; see the attached diff file for the exact code) yields a 
performance gain of about 10%:
input string length: 475
6000 iterations in 4187 ms (0,697833333333333 ms per iteration)
input string length: 2356
1500 iterations in 4011 ms (2,674 ms per iteration)
input string length: 27737
150 iterations in 4464 ms (29,76 ms per iteration)
input string length: 11075
300 iterations in 4087 ms (13,6233333333333 ms per iteration)
input string length: 88607
35 iterations in 3922 ms (112,057142857143 ms per iteration)
input string length: 354431
10 iterations in 4451 ms (445,1 ms per iteration)
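
The exact code is in the attached diff; the following is only a sketch of what a single-pass, regex-free detab can look like. The class name, the tab width of 4, and the choice to reset the column only on '\n' are assumptions, not taken from the patch.

```csharp
using System.Text;

public static class DetabSketch
{
    // Assumed tab width; Markdown implementations conventionally use 4.
    private const int TabWidth = 4;

    // Expands each tab to the next tab stop in one pass over the input,
    // tracking the current column instead of running a regex per line.
    public static string Detab(string text)
    {
        var output = new StringBuilder(text.Length);
        int column = 0;

        foreach (char c in text)
        {
            if (c == '\n')
            {
                output.Append(c);
                column = 0;              // a new line starts at column 0
            }
            else if (c == '\t')
            {
                int spaces = TabWidth - (column % TabWidth);
                output.Append(' ', spaces);
                column += spaces;
            }
            else
            {
                output.Append(c);
                column++;
            }
        }

        return output.ToString();
    }
}
```

A single StringBuilder is the only allocation besides the result, which is where a saving over the regex version would be expected to come from.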

It is also now fairly obvious that every '\t' has been removed from the 
input by this point, so all "[ \t]" (or similar) parts of the regular 
expressions can have their tab matching stripped out. Apart from the gain 
in readability, the savings from this vary considerably with the input:

input string length: 475
6000 iterations in 4114 ms (0,685666666666667 ms per iteration)
input string length: 2356
1500 iterations in 3826 ms (2,55066666666667 ms per iteration)
input string length: 27737
150 iterations in 4239 ms (28,26 ms per iteration)
input string length: 11075
300 iterations in 3884 ms (12,9466666666667 ms per iteration)
input string length: 88607
35 iterations in 3719 ms (106,257142857143 ms per iteration)
input string length: 354431
10 iterations in 4264 ms (426,4 ms per iteration)

All unit tests pass the same after these modifications.

Original issue reported on code.google.com by [email protected] on 10 Jan 2010 at 1:43

Attachments:

detab.diff

Test code-inside-list fails

What steps will reproduce the problem?
1. Download v113
2. Build
3. Run Tests

What is the expected output? What do you see instead?

code-inside-list.html does not match. It repeats <pre><code> before 
"indented-12-spaces"

What version of the product are you using? On what operating system?
v112 (current version)

Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 23 Dec 2010 at 11:21

Make All Static Regexes Readonly and Compiled

I am submitting this patch, which makes all static Regex fields readonly 
and compiled.

I changed all

private static Regex foo = new Regex("...", ...);

to

private static readonly Regex foo = new Regex("...", ... | 
RegexOptions.Compiled);

The numbers on the project home didn't match the numbers on my machine, but 
I wouldn't imagine that they would since they were done on a different 
machine.  

475 string length
I observed a 30% increase in the time to complete the task when compiled. I 
think this has to do with the time it takes to compile the regexes on the 
first hit of the Markdown class. Needs more testing.

2356 string length
I observed a 10% decrease in the time to complete the task when compiled.

27737 string length
I observed a 10.5% decrease in the time to complete the task when 
compiled.

I ran these numbers many times, and the numbers only slightly varied.
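
One way to check whether the first-hit compilation cost explains the small-input result is to time the regex only after a warm-up call. The sketch below does that with Stopwatch; the pattern is a made-up stand-in rather than one of MarkdownSharp's regexes, and the class and method names are hypothetical.

```csharp
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class CompiledRegexTiming
{
    // Hypothetical pattern for illustration only.
    private const string Pattern = @"\*\*(.+?)\*\*";

    public static readonly Regex Interpreted =
        new Regex(Pattern, RegexOptions.Singleline);

    public static readonly Regex Compiled =
        new Regex(Pattern, RegexOptions.Singleline | RegexOptions.Compiled);

    public static string Bold(Regex regex, string input)
    {
        return regex.Replace(input, "<strong>$1</strong>");
    }

    // Returns the elapsed milliseconds for the given number of runs.
    // The warm-up call keeps one-time setup cost out of the measurement.
    public static long Time(Regex regex, string input, int iterations)
    {
        Bold(regex, input);

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            Bold(regex, input);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

Comparing `Time(Interpreted, ...)` with `Time(Compiled, ...)` for the same input then shows steady-state cost only; timing the very first call separately would show the compilation overhead.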

Original issue reported on code.google.com by nberardi on 26 Dec 2009 at 8:31

Attachments:

markdownsharp-compiled-regex.patch

Include NUnit and Log4Net With Source Code

Include NUnit and Log4Net references with source code.

Original issue reported on code.google.com by nberardi on 26 Dec 2009 at 8:08

Shorten code files

It would be nice and much easier to contribute to the project if code files
weren't several thousand lines long. Separating issues into sub classes or
at least into separate code files with 'partial' increases readability a
lot and makes it much easier to navigate the project.

Original issue reported on code.google.com by asbjornu on 26 Dec 2009 at 2:24

Unit test refactoring

I'm proposing the following changes:

1. Follow the Four-Phase Test pattern 
(http://xunitpatterns.com/Four%20Phase%20Test.html)  (we don't have any 
teardown, so we're left with three phases, but it's still worth doing)

2. Remove duplication of var m = new Markdown() in every test.

3. Introduce variables for input, expected and actual.  This separates the 
data from the logic of the test, which should make it easier to read.  For 
most tests in this patch, you only have to read the first two lines to see 
what they're doing.
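
A rough sketch of the shape being proposed, written without NUnit so that it compiles on its own. `FakeMarkdown` is a hypothetical stand-in for the real Markdown class, and the plain exception takes the place of `Assert.AreEqual`; the attached patch may look different.

```csharp
using System;

// Hypothetical stand-in for the class under test.
public sealed class FakeMarkdown
{
    public string Transform(string text)
    {
        return "<p>" + text.Trim() + "</p>\n";
    }
}

public class SimpleTests
{
    // Setup: one shared instance replaces "var m = new Markdown()" in every
    // test. Under NUnit this would typically be assigned in a [SetUp] method.
    private readonly FakeMarkdown m = new FakeMarkdown();

    public void Paragraph()
    {
        // Data, separated from the logic of the test.
        string input = "This is a paragraph.";
        string expected = "<p>This is a paragraph.</p>\n";

        // Exercise.
        string actual = m.Transform(input);

        // Verify (Assert.AreEqual(expected, actual) under NUnit).
        if (expected != actual)
        {
            throw new Exception("expected: " + expected + " but was: " + actual);
        }
    }
}
```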

See attached patch.  I look forward to hearing any comments.

Original issue reported on code.google.com by [email protected] on 4 Jan 2010 at 2:34

Attachments:

testRefactoring.patch

HTML block detection regex from Markdown PHP -- is this correct?

Please try this patch to implement the HashHTMLBlocks() algorithm from
Markdown PHP.

It *seems* to work, but results in two different unit test failures
compared to the HashHTMLBlocks() we inherited from Markdown.pl 1.0.1.

Did I translate the regex right? Can someone double check my work here please?

The two unit test failures we have are related to faults in the
HashHTMLBlocks() routine, so I'd like to pull in the one from MarkdownPHP
if possible. I just need to verify whether I got anything wrong here.

Original issue reported on code.google.com by [email protected] on 6 Jan 2010 at 7:25

Attachments:

markdown-php-html-block-regex.patch

Is MarkdownSharp Threadsafe?

I cannot find any statement anywhere on whether it is thread safe.

What steps will reproduce the problem?
1. Use a single instance of Markdown in an ASP.NET app
2. Open many pages at once, in each page format some text

What is the expected output? What do you see instead?
I intermittently get the following exception: 
System.Collections.Generic.Dictionary`2.get_Item(TKey key) +12681071
   MarkdownSharp.Markdown.FormParagraphs(String text) in D:\Development\OpenSource\MarkdownSharp\MarkdownSharp\Markdown.cs:437
   MarkdownSharp.Markdown.RunBlockGamut(String text) in D:\Development\OpenSource\MarkdownSharp\MarkdownSharp\Markdown.cs:389
   MarkdownSharp.Markdown.Transform(String text) in D:\Development\OpenSource\MarkdownSharp\MarkdownSharp\Markdown.cs:363

What version of the product are you using? On what operating system?
markdownsharp-20100703-v113.7z

Please provide any additional information below.
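
The trace ends in a Dictionary lookup inside FormParagraphs, which is consistent with instance state being mutated by several requests at once, although nothing here confirms that. Assuming that is the cause, one workaround is to avoid sharing an instance across threads. The sketch below shows the pattern with `ThreadLocal<T>`; `StatefulTransformer` is a hypothetical stand-in, not the MarkdownSharp class.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for a transformer that keeps per-call state in an
// instance field, which is what makes a shared instance unsafe.
public sealed class StatefulTransformer
{
    private readonly Dictionary<string, string> _blocks =
        new Dictionary<string, string>();

    public string Transform(string text)
    {
        _blocks.Clear();
        string key = "K" + _blocks.Count;
        _blocks[key] = text.Trim();
        return "<p>" + _blocks[key] + "</p>";
    }
}

public static class PerThreadTransformer
{
    // One lazily created instance per thread, so no instance state is shared.
    private static readonly ThreadLocal<StatefulTransformer> Instance =
        new ThreadLocal<StatefulTransformer>(() => new StatefulTransformer());

    public static string Transform(string text)
    {
        return Instance.Value.Transform(text);
    }
}
```

Creating a new instance per request would work as well and is simpler; the per-thread variant only matters if construction turns out to be expensive.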

Original issue reported on code.google.com by [email protected] on 21 Sep 2011 at 4:19

Allow extensions: inline patterns

It would be nice to support extensions, like Markdown in Python does[1].

I created a patch that enables inline patterns to be added to a Markdown 
object:

    class WikiWords : IInlinePattern { /* ...  */ }

    var markdown = new Markdown();
    markdown.ExtendWith(new WikiWords());
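
To give an idea of what such a pattern could contain, here is a self-contained sketch. The single `Apply` member, the `/wiki/` URL scheme, and the `InlinePipeline` host class are all assumptions made for illustration; the interface in the attached patch may be shaped differently.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical shape of the interface.
public interface IInlinePattern
{
    string Apply(string text);
}

// Turns CamelCase WikiWords into links.
public sealed class WikiWords : IInlinePattern
{
    private static readonly Regex WikiWord =
        new Regex(@"\b(?:[A-Z][a-z]+){2,}\b", RegexOptions.Compiled);

    public string Apply(string text)
    {
        return WikiWord.Replace(
            text,
            m => "<a href=\"/wiki/" + m.Value + "\">" + m.Value + "</a>");
    }
}

// Minimal stand-in for the host that runs registered patterns in order.
public sealed class InlinePipeline
{
    private readonly List<IInlinePattern> _patterns = new List<IInlinePattern>();

    public void ExtendWith(IInlinePattern pattern)
    {
        _patterns.Add(pattern);
    }

    public string Run(string text)
    {
        foreach (var pattern in _patterns)
        {
            text = pattern.Apply(text);
        }
        return text;
    }
}
```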


[1] http://www.freewisdom.org/projects/python-markdown/Writing_Extensions

Original issue reported on code.google.com by JanW.Boer on 9 Apr 2010 at 7:17

Attachments:

InlinePatternsExtensions.patch

Patch (Refactoring)

- Removed all references to System.Collections (ArrayList).
- Dictionary initialization with C# 3.0 syntax (we already rely on C# 3.0 as 
we're using "var" so that's a non-issue in terms of dependencies).
- The Pair structure was used to store tokens in a dirty way: its first 
string recorded whether the token was text or a tag. I used an enum to hold 
that value instead. I considered KeyValuePair<TokenType,string> too, but 
decided against it because a Token structure describes the intent more clearly.
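
A sketch of the kind of type this describes, with a deliberately naive tokenizer to show it in use. The member names and the tokenizer are illustrative assumptions and are not copied from Markdown.patch.

```csharp
using System.Collections.Generic;

public enum TokenType { Text, Tag }

public struct Token
{
    public readonly TokenType Type;
    public readonly string Value;

    public Token(TokenType type, string value)
    {
        Type = type;
        Value = value;
    }
}

public static class TokenSketch
{
    // Naive tokenizer, only to show the Token type in use: everything from
    // '<' to the next '>' is a tag, everything else is text.
    public static List<Token> Tokenize(string html)
    {
        var tokens = new List<Token>();
        int pos = 0;

        while (pos < html.Length)
        {
            int open = html.IndexOf('<', pos);
            int close = open < 0 ? -1 : html.IndexOf('>', open);

            if (open < 0 || close < 0)
            {
                // No complete tag left: the remainder is text.
                tokens.Add(new Token(TokenType.Text, html.Substring(pos)));
                break;
            }

            if (open > pos)
            {
                tokens.Add(new Token(TokenType.Text, html.Substring(pos, open - pos)));
            }

            tokens.Add(new Token(TokenType.Tag, html.Substring(open, close - open + 1)));
            pos = close + 1;
        }

        return tokens;
    }
}
```

Compared with a Pair of two strings, the enum makes a check such as `token.Type == TokenType.Tag` self-explanatory at the call site.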

Original issue reported on code.google.com by [email protected] on 29 Dec 2009 at 9:37

Attachments:

Markdown.patch

Links_inline_style failing test

Adding an optional literal closing parenthesis (i.e. \)?) to the $3
capturing parens trivially makes this test pass, but doesn't actually fix
the underlying problem. Attached is a patch adding a test that this "fix" fails.

Original issue reported on code.google.com by drewgstephens on 31 Dec 2009 at 4:04

Attachments:

linksInlineStyle-test.patch

Enable distributed version control

Hosting this project on a site that supports git (or another distributed
version control system) would make it much easier to contribute, since each
developer would have his or her own branch to commit changes to, instead of
just one master with numerous patches being applied to it almost randomly.

Contributing is so much easier when the version control is distributed. I
prefer git, but as far as I know, it should be possible to enable
distributed version control (through Mercurial) on Google Code as well.

Original issue reported on code.google.com by asbjornu on 4 Jan 2010 at 8:23

Formatting inside a list

What steps will reproduce the problem?
1. Attempt to use italic and bold formatting inside a list
2. This is a bullet point list, not numbering.

What is the expected output? What do you see instead?
This might not be in the markdown spec, but I'd expect formatting to work 
inside lists.

Original issue reported on code.google.com by [email protected] on 17 Jan 2010 at 9:35
