dtext_rb's People

Contributors

albertc5, evazion, r888888888, type-kun

dtext_rb's Issues

Issues with @ symbol processing

The @ symbol breaks the state machine when DText markers appear at the end of the text. Once in the mention state, the parser keeps scanning for a word break, so it skips over any closing markers, such as closing style tags.

This issue is demonstrated at http://testbooru.donmai.us/comments/16

Not sure what should or shouldn't be allowed when @ appears at the beginning of a word, but if @ appears in any other location then it should probably not enter the mention state.
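The rule suggested above can be sketched as a small check. This is only an illustration in Ruby, not the Ragel grammar the parser actually uses; the helper name mention_start? is hypothetical, and "beginning of a word" is assumed here to mean the start of the text or a position right after whitespace.

```ruby
# Hypothetical helper: decide whether the "@" at `index` may open a mention.
# Assumption: a word begins at the start of the text or right after whitespace.
def mention_start?(text, index)
  return false unless text[index] == "@"

  index.zero? || text[index - 1].match?(/\s/)
end

puts mention_start?("hi @bob", 3)  # "@" follows a space
puts mention_start?("foo@bar", 3)  # "@" sits inside a word
```

Under this assumption, an @ inside a word (as in an email address) would never enter the mention state.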

Add Support For Hash (#) Only Links

With the inclusion of headers with IDs, things like creating a table of contents are now possible.

Example:
http://danbooru.donmai.us/wiki_pages/37251

However, these hash links only work on the wiki pages themselves. If used from the Wiki section of the post search page instead, those hash links will take the user away from the post search page and to the wiki page itself, which would be an undesired behavior.

A solution would be to add support for hash-only links.

Example:

"Example text":[#dtext-link_to_below]

h1#link_to_below.Title Heading

...which would create...

<a href="#dtext-link_to_below">Example text</a>

<h1 id="dtext-link_to_below">Title Heading</h1>
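As a rough illustration of the requested link syntax, the conversion could look like the Ruby sketch below. It is not how the Ragel-based parser works; the constant HASH_LINK, the method render_hash_links, and the set of characters allowed in the anchor are all assumptions made for this example, and only the link itself is escaped and converted.

```ruby
require "cgi"

# Assumed syntax: "title":[#anchor], with letters, digits, "_" and "-" in the anchor.
HASH_LINK = /"([^"\n]+)":\[#([A-Za-z0-9_-]+)\]/

# Hypothetical helper: turn hash-only links into same-page anchors.
def render_hash_links(text)
  text.gsub(HASH_LINK) do
    title = CGI.escapeHTML(Regexp.last_match(1))
    anchor = Regexp.last_match(2)
    %(<a href="##{anchor}">#{title}</a>)
  end
end

puts render_hash_links('"Example text":[#dtext-link_to_below]')
```

Because the href contains only the fragment, the browser would stay on the current page, which is the behavior wanted on the post search page.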

Named links can produce invalid nested links

The input "post #1234":http://example.com produces an invalid nested link:

<p>
  <a class="dtext-link dtext-external-link" href="http://example.com">
    <a class="dtext-link dtext-id-link dtext-post-id-link" href="/posts/1234">post #1234</a>
  </a>
</p>

The only kinds of markup that should be allowed inside named links are the basic formatting tags: [b], [i], [s], [u].
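One way to picture this restriction is a title renderer that honors only the four basic tags and treats everything else as literal text. The Ruby sketch below is an assumption-laden illustration, not the parser's code: render_link_title and BASIC_TAGS are hypothetical names, the mapping of [b] and [i] to strong and em is assumed, and the sketch does not check that tags are balanced.

```ruby
require "cgi"

# Assumed mapping from DText formatting tags to HTML elements.
BASIC_TAGS = { "b" => "strong", "i" => "em", "s" => "s", "u" => "u" }.freeze

# Hypothetical helper: render a named link's title, allowing only basic formatting.
# Anything else (post #1234, URLs, wiki links) stays plain text.
def render_link_title(title)
  CGI.escapeHTML(title).gsub(%r{\[(/?)([bisu])\]}) do
    "<#{Regexp.last_match(1)}#{BASIC_TAGS[Regexp.last_match(2)]}>"
  end
end

puts render_link_title("[b]post[/b] #1234")
```

With a renderer like this, "post #1234" inside a named link would stay plain text instead of becoming a second, nested link.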

Add an ability to use <hr> tags

This will help with being able to visually separate different elements, for example artist commentaries from different sources.

Allow [code] styling to be inline

Not sure if this used to be a thing with the old DText parser, since the Help:Blacklists wiki had [code] blocks inline. For now, I've replaced those with quotation marks, but it would be nice to be able to do code styling inline.

Regen dtext.c

Danbooru is still missing the recent updates to DText. dtext.c needs to be regenerated to include the latest changes to dtext.rl.

Does this file need to be in version control? It's a generated file after all. As long as one has ragel installed, it will be regenerated automatically as part of the build if it doesn't already exist.

Multiple memory leaks

The parser has several memory leaks:

  • The first leak is in the basic_wiki_link / aliased_wiki_link parsers. When parsing e.g. [[Hatsune Miku]], we call g_utf8_strdown to lowercase the tag, but g_utf8_strdown allocates a string that is never freed.

  • The second leak is in the header_with_id parser. When parsing e.g. h1#id. title, a string is allocated for id which is freed with g_string_free(id_name, false). Passing false here is wrong; it causes g_string_free to free only the GString object, not the underlying char * holding the actual string.

  • The third leak is in free_machine. The stack variable is freed with g_array_free(stack, FALSE), but again passing false here causes it to free only the GArray struct, not the underlying array. Also, g_array_free is not thread safe; the docs suggest using g_array_unref instead.

  • The fourth leak is in parse_file. A GOptionContext is allocated but never freed.

The first two leaks are high severity. An attacker can consume all available memory by exploiting these leaks on a high-traffic post or wiki page.

The third leak causes memory to be leaked every time the parser is invoked, but only by a small amount. This is mitigated by the use of the Unicorn worker killer gem on Danbooru, which restarts worker processes every 5000-10000 requests.

The fourth leak only occurs when using cdtext from the commandline, so it doesn't really matter.

[nodtext] Tags Require a Preceding Character in Order to Work Properly

I posted an example of this on Danbooru.

http://danbooru.donmai.us/forum_topics/9127?page=152#forum_post_125569

Basically, if the opening [nodtext] tag starts at the beginning of a line without any preceding characters, then the closing [/nodtext] tag will not re-enable DText parsing.

Edit:

I'm guessing this is because an extra BLOCK_P is being pushed onto the stack from the main function (though not the inline function), so when the nodtext function call checks the dstack, it sees a BLOCK_P instead of the BLOCK_NODTEXT.

Ignore trailing brackets when parsing links

【http://www.example.com】
「http://www.example.com」

Links like the above are common in Pixiv commentaries. The closing brackets should not be included as part of the link.

More generally, most (if not all) closing punctuation characters should be treated as boundary characters. Certain other punctuation like the ideographic full stop (。) should be too.
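A minimal Ruby sketch of this boundary rule follows. It is not the parser's implementation; extract_links and TRAILING_BOUNDARY are hypothetical names, and the boundary set is assumed to be the Unicode closing-punctuation categories (Pe, Pf) plus the ideographic full stop.

```ruby
# Assumed boundary set: closing punctuation (Pe, Pf) and the ideographic full stop.
TRAILING_BOUNDARY = /[\p{Pe}\p{Pf}。]+\z/

# Hypothetical helper: find bare URLs and drop trailing boundary characters.
def extract_links(text)
  text.scan(%r{https?://\S+}).map { |url| url.sub(TRAILING_BOUNDARY, "") }
end

puts extract_links("【http://www.example.com】")
puts extract_links("「http://www.example.com」")
```

One trade-off of this assumption: Pe also contains the ASCII ")", so a URL that legitimately ends in a parenthesis would be trimmed too. A real implementation would probably need to balance brackets inside the URL.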

Block-level [tn] tags are never closed

The input [tn]blah[/tn] produces the output <p class="tn">blah. Omitting the </p> tag is allowed in HTML5 under certain contexts (see "Tag omission" at MDN), so technically this works, but only by accident.

Unable to install gem on Wheezy: G_OPTION_FLAG_NONE undeclared

While attempting to run bundle install on the most recent Danbooru on Debian Wheezy, it fails with the following in the output:

make "DESTDIR="
compiling rb_dtext.c
compiling dtext.c
ext/dtext/dtext.rl: In function ‘main’:
ext/dtext/dtext.rl:1418:27: error: ‘G_OPTION_FLAG_NONE’ undeclared (first use in this function)
ext/dtext/dtext.rl:1418:27: note: each undeclared identifier is reported only once for each function it appears in
make: *** [dtext.o] Error 1

make failed, exit code 2

Curiously, running gem install dtext_rb -v '1.4.4', as Bundler suggests, completes without any trouble.
