remark's People

Contributors

arobase-che, ben-eb, benabel, brendo, canrau, chjj, christianmurphy, chriswren, eush77, greenkeeperio-bot, hamms, ianstormtaylor, ikatyang, inokawa, isaacs, kitsonk, lepture, mike-north, minrk, mithgol, remcohaszing, rokt33r, s0, selfcontained, spl, talatkuyuk, tmcw, vhf, wataru-chocola, wooorm

remark's Issues

Wrong position output when passing an empty string

var mdast = require('mdast');
var emptyString = "";
var ast = mdast.parse(emptyString);
console.log(JSON.stringify(ast));
/*
{
    "type": "root",
    "children": [],
    "position": {
        "start": {
            "line": 1,
            "column": 1
        }
    }
}
*/
// position.end is undefined

Example : http://requirebin.com/?gist=ad3c34ef897338867009

Expected:

{
    "type": "root",
    "children": [],
    "position": {
        "start": {
            "line": 1,
            "column": 1
        },
        "end": {
            "line": 1,
            "column": 1
        }
    }
}

Actual:

position.end is undefined
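
Until the parser fills this in, a caller could patch the node after parsing. The following is a minimal workaround sketch, assuming the node shape shown above; `ensureEndForEmpty` is a hypothetical helper and not part of mdast.

```javascript
// Hypothetical helper (not an mdast API): when a node has a start position
// but no end, copy the start into the end. This is only appropriate for
// empty input, where the node spans no characters.
function ensureEndForEmpty(node) {
  const position = node.position;
  if (position && position.start && !position.end) {
    position.end = {line: position.start.line, column: position.start.column};
  }
  return node;
}

const root = ensureEndForEmpty({
  type: 'root',
  children: [],
  position: {start: {line: 1, column: 1}}
});
console.log(JSON.stringify(root.position.end));
```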

Could we add more code samples in manpages?

https://github.com/wooorm/mdast/blob/master/doc/mdastplugin.3.md

On this page, I am trying very hard to understand how to create and manipulate plugins. While I know the concepts have been organized in a very able way, you are introducing a lot of new terms, such as attacher, transformer, and completer.

There is but one code example on creating a plugin, and it only implements the transformer. Code samples are to coders as pictures are to anyone else in a manual. They put things in perspective. I am having a very hard time figuring out how to implement the definitions above, and if more snippets were given, I am sure this could be simplified greatly.

Example:

To access all files once they are transformed, create a completer. A completer is invoked before files are compiled, written, and logged, but after reading, parsing, and transforming. Thus, a completer can still change files or add messages.

Where does one create a signature? A simple example would be worth another 5 paragraphs of description.

Should be able to expose style information

  • Probably not by default;
  • Information like which emphasis markers are used, asterisks or underscores;
  • This would highly benefit the creation of something mdlint-like.

Is there an "encode" method to insert escaped text into the AST?

I sometimes have to insert text-as-is into the AST, e.g. I need to insert (Taylor, Stouffer, & Meehl, 2011) in a way that this exact text turns up in the markdown rendered to HTML. For this I need to insert something like \(Taylor, Stouffer, & Meehl, 2011\). Can mdast do this for me? Or should I use something like markdown-escape?
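
For reference, the escaped form in the example can be produced with a small standalone function. This is only a sketch: `escapeMarkdown` is a hypothetical helper, not an mdast API, and the set of escaped characters (the punctuation that classic Markdown allows to be backslash-escaped) is an assumption.

```javascript
// Hypothetical helper, not part of mdast: backslash-escape the punctuation
// characters that classic Markdown treats as escapable.
function escapeMarkdown(text) {
  return text.replace(/[\\`*_{}\[\]()#+\-.!]/g, '\\$&');
}

console.log(escapeMarkdown('(Taylor, Stouffer, & Meehl, 2011)'));
// prints: \(Taylor, Stouffer, & Meehl, 2011\)
```

Whether the stringifier would escape a text node's value again on output is not something this sketch assumes either way.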

Make it easier for plugins to add tokenizers to the parser

Looking here and here, it seems like I need to have intimate knowledge of how the parser works in order to define regular expressions to tokenize. The use case is detecting and linking URLs (auto-linking) and @mentions. Some of the URLs I'd like to turn into special node types – such as "twitter", which another plugin could render as HTML for an embedded tweet.

Ideally I could write a plugin that only has to specify a regular expression, a function which returns the node, and some rules about scope (for example, I wouldn't want to create a link for a URL that is already inside a link).
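
The three ingredients named above (a regular expression, a node factory, and a scope rule) can be sketched as a plain tree walk that runs after parsing rather than inside the tokenizer. Everything here is hypothetical: `splitText`, the `mention` node type, and the `ignore` list are illustrations, not mdast APIs.

```javascript
// Hypothetical helper: split text nodes wherever `pattern` (a global regex)
// matches, replacing each match with the node returned by `createNode`.
// Nodes whose type is listed in `ignore` are returned untouched, so text
// inside an existing link is never split.
function splitText(node, pattern, createNode, ignore) {
  if (!Array.isArray(node.children) || ignore.includes(node.type)) {
    return node;
  }
  const children = [];
  for (const child of node.children) {
    if (child.type !== 'text') {
      children.push(splitText(child, pattern, createNode, ignore));
      continue;
    }
    let last = 0;
    for (const match of child.value.matchAll(pattern)) {
      if (match.index > last) {
        children.push({type: 'text', value: child.value.slice(last, match.index)});
      }
      children.push(createNode(match));
      last = match.index + match[0].length;
    }
    if (last < child.value.length) {
      children.push({type: 'text', value: child.value.slice(last)});
    }
  }
  return Object.assign({}, node, {children});
}

const tree = {
  type: 'paragraph',
  children: [
    {type: 'text', value: 'hi @bob and '},
    {type: 'link', children: [{type: 'text', value: '@alice'}]}
  ]
};
const result = splitText(
  tree,
  /@(\w+)/g,
  (match) => ({type: 'mention', name: match[1]}),
  ['link']
);
console.log(result.children.map((child) => child.type));
```

Working on the finished tree avoids any dependency on parser internals, at the cost of not seeing the original source positions.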

Positions of fenced vs. unfenced code

Hi. I'm in the middle of switching from marked to mdast for parsing in my mockdown library. I've run into a slight snag, however, which is that mdast gives the start position of a fenced code block as the line where the backquotes are, but gives the start position of an indented code block as the line where the actual code starts.

When I was using marked, this wasn't a problem because I could detect the absence of a lang property to know that a code block was indented rather than fenced, and the presence of the attribute (even if null) to know when I need to offset the code's line position by 1. But mdast creates the property with a null value on indented blocks as well as on fenced ones, so there is no way for me to know whether to offset the line number.

Well, technically, there is: I can count the number of lines in the code node's value, and compare this to the number of lines in the node's position range, and if it's 2 less, I know it's a fenced code block and can offset the start position of the code accordingly.

This seems a bit fragile, though, so I was wondering if there can be some official way to do this. That is, to either be able to tell the two kinds of code blocks apart (e.g. via a fenced property), or to have the position of a code block be registered as the position where the code starts, rather than the position where the code's block wrapper starts.

Heck, just allowing an empty string for lang when it's a fenced block without a language would work for me. The main point is just to have an officially supported way to be able to know what line number the actual code of a code node begins on, whether the block is indented or fenced.

Thanks!
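
The line-counting workaround described above can be sketched as follows. It assumes only the `value` and `position` fields mentioned in the issue; `codeStartLine` is a hypothetical helper, and it inherits the fragility the issue points out.

```javascript
// Hypothetical helper: return the line on which the code itself begins.
// If the node spans exactly two more lines than its value contains, the
// extra lines are taken to be the opening and closing fences.
function codeStartLine(node) {
  const valueLines = node.value.split('\n').length;
  const spanLines = node.position.end.line - node.position.start.line + 1;
  const fenced = spanLines - valueLines === 2;
  return fenced ? node.position.start.line + 1 : node.position.start.line;
}

const fenced = {
  type: 'code',
  value: 'a\nb',
  position: {start: {line: 3, column: 1}, end: {line: 6, column: 4}}
};
const indented = {
  type: 'code',
  value: 'a\nb',
  position: {start: {line: 3, column: 1}, end: {line: 4, column: 6}}
};
console.log(codeStartLine(fenced), codeStartLine(indented)); // prints: 4 3
```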

Paragraph `mdast.stringify` creates line-breaks on return

When invoking mdast.stringify on a paragraph node and all of its child nodes, it renders the original paragraph with line breaks. Example:

This is a markdown paragraph with a [link](http://this-page-intentionally-left-blank.org) to something silly.

On stringifying this, one gets:

This is a markdown paragraph with a 
[link](http://this-page-intentionally-left-blank.org)
 to something silly.

Link parser lowercases identifiers

When I parse [][@TayEA11], the resulting AST is

{
  "type": "root",
  "children": [
    {
      "type": "paragraph",
      "children": [
        {
          "type": "linkReference",
          "identifier": "@tayea11",
          "referenceType": "full",
          "children": [],
          "position": {
            "start": {
              "line": 1,
              "column": 1
            },
            "end": {
              "line": 1,
              "column": 13
            },
            "indent": []
          }
        }
      ],
      "position": {
        "start": {
          "line": 1,
          "column": 1
        },
        "end": {
          "line": 1,
          "column": 13
        },
        "indent": []
      }
    }
  ],
  "position": {
    "start": {
      "line": 1,
      "column": 1
    },
    "end": {
      "line": 1,
      "column": 13
    }
  }
}

Is there a setting that keeps the casing of identifiers?

Transformer should not rely on mutated object

Take the following abbreviated sample of an embedded plugin:

// This will only return the first element in the .md
const processor = mdast().use(function (mdst, opt) {
  function transformer(ast, file) {
    ast.children = ast.children.slice(0, 1);
  }
  return transformer;
});
return processor.process(data);

In this example, the transformer method is expected to mutate the incoming parameters, ast and file. This had me confused for quite a while, as it is commonly considered best practice to keep parameters immutable. Because I expected transformer to return the transformed objects and did not see that in any of your plugins, I was thrown a bit. The transformer's return value isn't actually used.

A better approach would be something like this:

// This will only return the first element in the .md
const processor = mdast().use(function (mdst, opt) {
  function transformer(ast, file) {
    var mutatedAst = ast.children.slice(0, 1);
    return mutatedAst;
  }
  return transformer;
});
return processor.process(data);

While I understand two parameters are in play, they should probably be returned grouped together as an object. The point is that one should not expect the user to mutate incoming parameters without even returning a result; avoiding that is a basic principle of functional programming.


I know correcting this would probably break other plugins: perhaps you could schedule it for the next major release?
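
In the meantime, a pure function can be adapted to the mutating contract with a small wrapper. This is a sketch only: `fromPure` is a hypothetical adapter, and the one thing it assumes about mdast is what the samples above show, namely that a transformer receives `(ast, file)` and that its return value is ignored.

```javascript
// Hypothetical adapter: wrap a pure function that returns a new tree so it
// can be used where a mutating transformer is expected. The returned tree's
// properties are copied onto the original `ast` object.
function fromPure(pureTransform) {
  return function transformer(ast, file) {
    const result = pureTransform(ast, file);
    if (result && result !== ast) {
      for (const key of Object.keys(ast)) {
        delete ast[key];
      }
      Object.assign(ast, result);
    }
  };
}

// Keep only the first element, without mutating inside the pure function.
const keepFirst = fromPure((ast) =>
  Object.assign({}, ast, {children: ast.children.slice(0, 1)})
);

const ast = {type: 'root', children: [{type: 'heading'}, {type: 'paragraph'}]};
keepFirst(ast);
console.log(ast.children.length); // prints: 1
```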

Refactor breaks in CommonMark

They’re currently added as an escape node ({type: 'escape', value: '\n'}), but should be added as {type: 'break'}.

This should be accompanied by a stringify option to either use CommonMark style, or trailing-space style.
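
The node change itself can be illustrated with a tree rewrite, assuming the two node shapes quoted above. `escapesToBreaks` is a hypothetical helper and does not reflect how the parser would implement it; it also drops any `position` on the replaced node to keep the sketch short.

```javascript
// Hypothetical helper: return a copy of the tree in which every
// {type: 'escape', value: '\n'} node is replaced by {type: 'break'}.
function escapesToBreaks(node) {
  if (node.type === 'escape' && node.value === '\n') {
    return {type: 'break'};
  }
  if (Array.isArray(node.children)) {
    return Object.assign({}, node, {
      children: node.children.map(escapesToBreaks)
    });
  }
  return node;
}

const paragraph = {
  type: 'paragraph',
  children: [
    {type: 'text', value: 'foo'},
    {type: 'escape', value: '\n'},
    {type: 'text', value: 'bar'}
  ]
};
console.log(escapesToBreaks(paragraph).children.map((child) => child.type));
```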

Cannot distinguish `|---|` and `|:---|`

mdast parses an un-aligned table column (|---|) as left-aligned, the same as |:---|. This makes it impossible to emulate GitHub's Markdown renderer, which centers the header of an un-aligned column and left-aligns its body by leaving the text-align style unspecified:

|un-aligned(center)|center|left|
|---|:---:|:---|
|Lorem ipsum dolor sit amet|Lorem ipsum dolor sit amet|Lorem ipsum dolor sit amet|
|un-aligned(left)|center|left|

un-aligned(center) center left
Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet
un-aligned(left) center left
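
One way to keep the distinction is to record "no alignment" separately from "left" when reading the delimiter row. The sketch below is a standalone illustration; `parseAlignRow` is a hypothetical function and using `null` for an un-aligned column is an assumption, not mdast's current output.

```javascript
// Hypothetical helper: read a table delimiter row and return one entry per
// column: 'left', 'right', 'center', or null when no colon is present.
function parseAlignRow(row) {
  return row
    .trim()
    .replace(/^\||\|$/g, '')
    .split('|')
    .map((cell) => {
      const value = cell.trim();
      const left = value.startsWith(':');
      const right = value.endsWith(':');
      if (left && right) return 'center';
      if (left) return 'left';
      if (right) return 'right';
      return null;
    });
}

console.log(parseAlignRow('|---|:---:|:---|')); // prints: [ null, 'center', 'left' ]
```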

Should expose footnote definitions as a node.

An object instead of an array:

   "footnotes": {
-    "1": [
-      {
-        "type": "paragraph",
-        "children": [
-          {
-            "type": "text",
-            "value": "A footnote."
-          }
-        ]
-      }
-    ]
+    "1": {
+      "type": "footnoteDefinition",
+      "id": "1",
+      "children": [
+        {
+          "type": "paragraph",
+          "children": [
+            {
+              "type": "text",
+              "value": "A footnote"
+            }
+          ]
+        }
+      ]
+    }
   }
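
A consumer that has to handle both shapes during a transition could normalise the old one into the proposed one. This is a sketch under the assumption that the shapes are exactly those in the diff above; `toFootnoteDefinitions` is a hypothetical helper.

```javascript
// Hypothetical helper: convert the old shape (id -> array of nodes) into the
// proposed shape (id -> footnoteDefinition node). Entries that are already
// nodes are passed through unchanged.
function toFootnoteDefinitions(footnotes) {
  const result = {};
  for (const id of Object.keys(footnotes)) {
    const value = footnotes[id];
    result[id] = Array.isArray(value)
      ? {type: 'footnoteDefinition', id: id, children: value}
      : value;
  }
  return result;
}

const oldShape = {
  '1': [
    {type: 'paragraph', children: [{type: 'text', value: 'A footnote.'}]}
  ]
};
console.log(JSON.stringify(toFootnoteDefinitions(oldShape), null, 2));
```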

Avoid using peerDependencies

I'm trying to force mdast-react to use 0.26.2 or newer because of the recently-fixed parsing bugs. Doing so results in a peerDependencies error:

~/src/mdast-react〉npm install
npm ERR! Darwin 14.3.0
npm ERR! argv "node" "/usr/local/bin/npm" "install"
npm ERR! node v0.12.6
npm ERR! npm  v2.12.1
npm ERR! code EPEERINVALID

npm ERR! peerinvalid The package mdast does not satisfy its siblings' peerDependencies requirements!
npm ERR! peerinvalid Peer [email protected] wants mdast@>=0.22.0
npm ERR! peerinvalid Peer [email protected] wants mdast@>=0.22.0
npm ERR! peerinvalid Peer [email protected] wants mdast@>=0.25.0
npm ERR! peerinvalid Peer [email protected] wants mdast@>=0.24.0
npm ERR! peerinvalid Peer [email protected] wants mdast@>=0.22.0

npm ERR! Please include the following file with any support request:
npm ERR!     /Users/tmcw/src/mdast-react/npm-debug.log

Combined with npm moving away from peerDependencies, it would be awesome to use normal ol' dependencies rather than peerDependencies to string mdast packages together.

Fix CLI-settings

Currently, it’s impossible to pass nested objects or arrays because the parsing system is way too simple. This should be changed to accepting just JSON.

Something like mdast . -s 'foo: {bar: "baz"}'?
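
The value in that example is not strict JSON (bare keys, no outer braces), so a parser for it would need to be slightly lenient. A naive sketch of one way to do that follows; `parseSetting` is a hypothetical function, not the CLI's implementation, and the key-quoting regex is deliberately simple.

```javascript
// Hypothetical helper: accept relaxed JSON such as `foo: {bar: "baz"}` by
// wrapping it in braces and quoting bare keys before calling JSON.parse.
// Caveat: the regex is naive and would corrupt a string value that itself
// contains something shaped like `, key:`.
function parseSetting(input) {
  const trimmed = input.trim();
  const wrapped = trimmed.startsWith('{') ? trimmed : '{' + trimmed + '}';
  const quoted = wrapped.replace(
    /([{,]\s*)([A-Za-z_$][\w$]*)\s*:/g,
    '$1"$2":'
  );
  return JSON.parse(quoted);
}

console.log(parseSetting('foo: {bar: "baz"}'));
console.log(parseSetting('list: [1, 2, 3]'));
```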

Parse error in bullet with space before newline

I encountered this:

> mdast.parse('- \n')

TypeError: Cannot read property 'length' of null
  at Parser.tokenizeList (/Users/mizchi/sandbox/mdast/lib/parse.js:299:25)
  at Parser.tokenizeBlock (/Users/mizchi/sandbox/mdast/lib/parse.js:1572:28)
  at Parser.parse (/Users/mizchi/sandbox/mdast/lib/parse.js:1225:14)
  at Object.parse (/Users/mizchi/sandbox/mdast/lib/parse.js:1733:50)
  at repl:1:8
  at REPLServer.replDefaults.eval (/Users/mizchi/.nodebrew/node/v0.10.33/lib/node_modules/coffee-script/lib/coffee-script/repl.js:33:42)
  at repl.js:239:12
  at Interface.<anonymous> (/Users/mizchi/.nodebrew/node/v0.10.33/lib/node_modules/coffee-script/lib/coffee-script/repl.js:66:9)
  at Interface.emit (events.js:117:20)
  at Interface._onLine (readline.js:202:10)
  at Interface._line (readline.js:531:8)
  at Interface._ttyWrite (readline.js:760:14)
  at ReadStream.onkeypress (readline.js:99:10)
  at ReadStream.emit (events.js:117:20)
  at emitKey (readline.js:1095:12)
  at ReadStream.onData (readline.js:840:14)
  at ReadStream.emit (events.js:95:17)
  at ReadStream.<anonymous> (_stream_readable.js:764:14)
  at ReadStream.emit (events.js:92:17)
  at emitReadable_ (_stream_readable.js:426:10)
  at emitReadable (_stream_readable.js:422:5)
  at readableAddChunk (_stream_readable.js:165:9)
  at ReadStream.Readable.push (_stream_readable.js:127:10)
  at TTY.onread (net.js:528:21)

Add website

mdast should have a cool website!

Maybe http://mdast.md? http://mdast.js.org (free)?, or just at GitHub (free)?

Also: should be good looking and useful.

Can't parse html tag correctly

  • can parse <div>div</div> and <pre>pre</pre>
  • can't parse <a>foo</a> and <span>foo</span>

It looks like inline tags can't be parsed.

coffee> mdast.parse('<a>foo</a>').children[0]
{ type: 'paragraph',
  children: 
   [ { type: 'html',
       value: '<a>',
       position: [Object] },
     { type: 'text',
       value: 'foo',
       position: [Object] },
     { type: 'html',
       value: '</a>',
       position: [Object] } ],
  position: 
   { start: { line: 1, column: 1 },
     end: { line: 1, column: 11 } } }

Create mdast-html

One of the major things to do is create a plug-in which compiles an mdast AST into HTML.

This plug-in would be a great way to test how applicable the AST is for heavy duty transpiling into another language.

Nested tasklist

Here is a trivial difference.

- [x] aaa
  - [ ] bbb
  - [ ] ccc
  • aaa
    • bbb
    • ccc

It looks like mdast doesn't handle nested task lists.

Ways to use global mdast plugins with CLI?

What is the preferred way of using globally installed plugins with CLI?

$ echo "# hello" | mdast -u mdast-html
# hello

<stdin>
        1:1  error    Error: Cannot find module 'mdast-html'

It just worked before. I found two ways of working around it.

  • Including $(npm root -g) in $NODE_PATH:
$ echo "# hello" | env NODE_PATH="$(npm root -g):$NODE_PATH" mdast -u mdast-html
  • Specifying full path to a plugin:
$ echo "# hello" | mdast -u "$(npm root -g)/mdast-html"

Both ways are somewhat clumsy. Is there a simpler way of doing that or some relevant configuration option?

Fix demo

The current demo is horrible. It’s slow, not that useful, and more.

  • It should be good looking;
  • It should use a faster editor;
  • It should be user-friendly.

Add `style` properties on nodes

Currently, only global stringification settings, such as bullet, are supported. I’d like to extend stringification style to per-node settings. Thus, a list-item can have a style.bullet = '*' property.

Something like:

  • heading nodes have an enum headingStyle property set to "atx",
    "atx-closed", or "setext";
  • table nodes have a boolean looseTable property;
  • table nodes have a boolean spacedTable property;
  • code nodes have a nullable enum fenceMarker property set to "`" or "~";
  • code nodes have a boolean fences property;
  • listItem nodes have an enum listItemBullet property set to *, -,
    +, ., or ).
  • listItem nodes have a nullable listItemIndex property set to an integer;
  • horizontalRule nodes have an enum ruleMarker property set to *, -, or
    _.
  • horizontalRule nodes have a boolean ruleRepetition property;
  • horizontalRule nodes have a boolean ruleSpaces property;
  • strong and emphasis nodes have an enum emphasisMarker property
    set to _ or *.

These should be overwritten when a setting is given to mdast (this allows
mdast to fix code-style), but should themselves overwrite the default values
noted in mdast.process().

Supersedes GH-30.
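
The precedence described above (an explicit setting wins over a per-node style, which wins over the default) can be sketched as a lookup. `resolveStyle` and the `style` object on the node are hypothetical and follow the `style.bullet` example given earlier in this issue.

```javascript
// Hypothetical helper: pick the value to use for a style key.
// Order: explicit setting, then the node's own style, then the default.
function resolveStyle(key, node, settings, defaults) {
  if (settings && key in settings) return settings[key];
  if (node.style && key in node.style) return node.style[key];
  return defaults[key];
}

const defaults = {bullet: '-'};
const item = {type: 'listItem', style: {bullet: '*'}, children: []};

console.log(resolveStyle('bullet', item, {}, defaults)); // prints: *
console.log(resolveStyle('bullet', item, {bullet: '+'}, defaults)); // prints: +
console.log(resolveStyle('bullet', {type: 'listItem'}, {}, defaults)); // prints: -
```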

Want a "don't merge HTML nodes" option

Sometimes merged HTML nodes get in my way when transforming the AST into a virtual DOM.

We can't just split a seemingly-merged HTML node by /\n\n/ because doing so breaks <div>text\n\n</div>[1] into <div>text and </div>. Though I'm fine with nodes whose value is a simple tag (<div>, </div>) or a balanced fragment (<div>text</div>), something like <div>text is not very acceptable.

[1] it can be obtained by parsing this Markdown document:

<div>text

</div>

Watching files

Hi, thanks for your work. I'm trying to use mdast-lint, and am thinking it'd be wonderful to have something like a --watch option built into mdast.

GitHub-flavored markdown html incompatibility

FYI, mdast does not parse HTML the way GitHub itself does. More specifically, it doesn't parse invalid HTML the same way GitHub does, or at least invalid HTML comments. If you have an HTML comment containing --, GitHub ignores this invalidity and still treats the overall comment as HTML and doesn't turn it into a paragraph.

I would say this is a bug rather than a feature, since no user-facing tool I've tried (e.g. Marked 2, MarkdownPad, MacDown, etc.) ever insists on HTML being valid HTML and reverting it to a paragraph otherwise. Likewise, of the parsers I've tried, mdast seems to be unique in this respect.

Store all links in central place, not just referenced links

This would make sure just one reference is created when stringifying with referenceLinks: true:

[a link][link] and [another link](http://example.com)

[link]: http://example.com

Yields:

[a link][1] and [another link][2]

[1]: http://example.com
[2]: http://example.com
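
The deduplication being asked for amounts to keying definitions by URL. A standalone sketch, assuming a flat list of links with `text` and `url` fields (both field names and `collectDefinitions` are hypothetical, not mdast's data model):

```javascript
// Hypothetical helper: assign one numeric identifier per distinct URL, so
// links that share a URL also share a single definition.
function collectDefinitions(links) {
  const byUrl = new Map();
  const references = links.map((link) => {
    if (!byUrl.has(link.url)) {
      byUrl.set(link.url, String(byUrl.size + 1));
    }
    return {text: link.text, identifier: byUrl.get(link.url)};
  });
  const definitions = Array.from(byUrl, ([url, identifier]) => ({
    identifier,
    url
  }));
  return {references, definitions};
}

const {references, definitions} = collectDefinitions([
  {text: 'a link', url: 'http://example.com'},
  {text: 'another link', url: 'http://example.com'}
]);
console.log(references.map((ref) => `[${ref.text}][${ref.identifier}]`).join(' and '));
console.log(definitions.map((def) => `[${def.identifier}]: ${def.url}`).join('\n'));
```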

Extending grammar

How would one extend the parser's grammar? I understand that I can create a plugin and create a parser that inherits from mdast's parser, but writing the tokenizer and whatever else is needed is unclear.

Do you mind helping me out with one example?

Let's say I have some custom markdown that looks like this:

+++small

SOME TEXT CONTENT

+++

How would one add this grammar to the parser such that content enclosed in +++ is marked as children? For example:

{
  type: MY_CUSTOM_TYPE, // captured by enclosing +++
  size: 'small',
  children: [{
    type: 'text',
    ....
  }]
}

I'm open to ideas if you have a better idea for how the ast should look. You're certainly more expert than I am. :)

Thanks for your time.
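
Independent of how the parser is extended, the matching step for the `+++` syntax can be sketched in isolation. The following does not use mdast's tokenizer interface at all; `tokenizeCustomBlocks` and the `customBlock` type name are hypothetical, and the content is kept as a single text node rather than being parsed further.

```javascript
// Hypothetical standalone matcher: find blocks opened by `+++name` on its own
// line and closed by `+++` on its own line, and return one node per block.
function tokenizeCustomBlocks(source) {
  const pattern = /^\+\+\+(\w+)\n([\s\S]*?)\n\+\+\+$/gm;
  const nodes = [];
  for (const match of source.matchAll(pattern)) {
    nodes.push({
      type: 'customBlock',
      size: match[1],
      children: [{type: 'text', value: match[2].trim()}]
    });
  }
  return nodes;
}

const source = '+++small\n\nSOME TEXT CONTENT\n\n+++';
console.log(JSON.stringify(tokenizeCustomBlocks(source), null, 2));
```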

Lifecycle events for plugins

Hey! Great work on mdast, it's really rad. I'm using it to set up a build system for the Node.js documentation WG. As part of that effort, I started building count-docula, which currently consumes mdast and presents its own CLI. If possible, I'd love to make count-docula just another plugin that mdast consumes.

What count-docula is currently doing:

  • Given a directory, it collects every markdown file within that directory.
    • This duplicates work from mdast's CLI.
  • For each markdown file, the plugin looks for three directives (import, export, and anchor.)
    • Anchors are user-defined ids that are assigned to the closest parent block element — they're there so that heading text can be changed independent of links, and so that links can be tracked and verified across documents.
    • Once all anchors are found, then all exports are determined. These are links that will be made available when "importing" the current document.
    • Finally, the import directives are hit.
      • Importantly, import directives are able to bring in documents from outside the original working set.
  • The plugin artificially blocks process from completing (using a function passed as an option) until all documents have been visited, and their anchors, exports, and imports declared.
    • Warnings are added at this stage for unknown|duplicate reference link definitions, bad imports, and bad exports.
  • Once all documents have been visited & resolved, the plugin continues to the "render" or "test" task.
    • The test task augments lint with a test checking to see that no documents in the original working set are "orphaned" — only one document in the original working set may have no incoming links.
      • Otherwise, this step replicates much of mdast's CLI machinery.
    • The build task accepts a template for rendering the document into, but otherwise works the same as mdast's CLI machinery.

In order to turn count-docula into a plugin:

  • mdast's plugin API would need a lifecycle event for "the CLI has collected all of the docs in this dir." That event may be asynchronous, so mdast should delegate to the plugin before continuing (via a callback or other method.)
  • The directory set API may have to be capable of adding new source md document paths and making the resulting ASTs available to the plugin.

Something like:

module.exports = function attacher(md, opts) {
  // `onDocsCollected` and the `workingSet` methods are the proposed API;
  // none of them exist in mdast today.
  md.onDocsCollected((workingSet, next) => {
    // workingSet is an "array-ish" set of all of the `File` objects that
    // mdast's cli found.
    workingSet.parseEach(({filename, ast}, done) => {
      // search for documents to import from the AST
      workingSet.add('some/new/path')
      done()
    }, function(err) {
      workingSet.forEach(({filename, ast}) => {
        // resolve all of the links.
        // if the workingSet's files were parsed, use those asts
        // instead of parsing again. Otherwise parse them.
      })
      // let `mdast` know that the workingSet's files are ready to be
      // rendered / tested / etc.
      next()
    })
  })
}

Of course, there's zero pressure to do this — or if you'd like I would be happy to take a stab at implementing it. A workingSet API seems like a natural place to add meta information for other plugins, as well — for example, providing a template/framing API for mdast-html.

Thanks again, and great work on mdast!
