eemeli / yaml

YAML parser and stringifier for JavaScript

Home Page: https://eemeli.org/yaml

License: ISC License

JavaScript 0.70% TypeScript 99.30%

yaml yaml-parser yaml-editor

yaml's Issues

Sequence items must not have preceding content on the same line

It seems that the parser is unable to parse nested arrays in some cases.

Version:

1.0.0-rc.7

Steps to reproduce:

const YAML = require('yaml').default

const yamlContent = 
`
content:
  arrayOfArray:
  -
    - first: John
      last: Black
    - first: Brian
      last: Green
  -
    - first: Mark
      last: Orange
  -
    - first: Adam
      last: Grey
`

console.log(YAML.parse(yamlContent))

Result:

YAMLSemanticError: Sequence items must not have preceding content on the same line

Expected result:

Parsed JS object with nested arrays.

SyntaxError: Map keys must be unique; "null" is repeated

version: 1.0.0-beta.5

input:

{ ? : 123 }

output:

{ null: 123, null: null }

SyntaxError: Map keys must be unique; "null" is repeated

expected:

{ null: 123 } without error.

Document the schema tag object structure

It's not enough to just say "look at the code", esp. as the code in question is not well commented.

Invalid serializing of literal "-"

This fails:

YAML.parse(YAML.stringify(YAML.parse(`
a: "-"
`)));

with YAMLSemanticError: Sequence items must not have preceding content on the same line. This is because stringify() serializes it like so:

a: -

SyntaxError: Sequence items are not allowed on the same line with map keys

First of all, thanks for the great parser so that I can use it to add YAML support in Prettier.

(from prettier/prettier#4563 (comment))

version: 1.0.0-beta.5

input:

aliases:
  - docker:
      - image: circleci/node:8.11.2
  - key: repository-{{ .Revision }}

output:

[ { type: "DOCUMENT", contents: [ 
    { type: "MAP", items: [
        { type: "PLAIN", strValue: "aliases" },
        { type: "MAP_VALUE", node: {
            type: "SEQ", items: [
                { type: "SEQ_ITEM", node: { 
                    type: "MAP", items: [
                        { type: "PLAIN", strValue: "docker" },
                        { type: "MAP_VALUE", node: {
                            type: "SEQ", items: [
                                { type: "SEQ_ITEM", node: {
                                    type: "MAP", items: [
                                        { type: "PLAIN", strValue: "image" },
                                        { type: "MAP_VALUE", node: {
                                            type: "PLAIN", strValue: "circleci/node:8.11.2"
    } } ] } } ] } } ] } } ] } } ] },
    { type: "SEQ", items: [
        { type: "SEQ_ITEM", node: {
            type: "MAP", items: [
                { type: "PLAIN", strValue: "key" },
                { type: "MAP_VALUE", node: {
                    type: "PLAIN", strValue: "repository-{{ .Revision }}"
} } ] } } ] } ] } ]

SyntaxError: Sequence items are not allowed on the same line with map keys (4:3)
  2 |   - docker:
  3 |     - image: circleci/node:8.11.2
> 4 |   - key: repository-{{ .Revision }}
    |   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  5 |

expected:

[ { type: "DOCUMENT", contents: [
    { type: "MAP", items: [
        { type: "PLAIN", strValue: "aliases" },
        { type: "MAP_VALUE", node: {
            type: "SEQ", items: [
                { type: "SEQ_ITEM", node: {
                    type: "MAP", items: [
                        { type: "PLAIN", strValue: "docker" },
                        { type: "MAP_VALUE", node: {
                            type: "SEQ", items: [
                                { type: "SEQ_ITEM", node: {
                                    type: "MAP", items: [
                                        { type: "PLAIN", strValue: "image" },
                                        { type: "MAP_VALUE", node: {
                                            type: "PLAIN", strValue: "circleci/node:8.11.2"
                } } ] } } ] } } ] } }
                { type: "SEQ_ITEM", node: {
                    type: "MAP", items: [
                        { type: "PLAIN", strValue: "key" },
                        { type: "MAP_VALUE", node: {
                            type: "PLAIN", strValue: "repository-{{ .Revision }}"
} } ] } } ] } } ] } ] } ]

with no error.

Missing error.source for #6

version:
1.0.0-beta.6

input:

abc: 123
def

output:

YAMLSemanticError {
  name: "YAMLSemanticError",
  message: "Implicit map keys need to be followed by map values", 
  source: undefined
}

expected:

error.source exists.

Maintain the loc info from input text? (CRLF)

version: 1.0.0-rc.7

yaml/src/cst/parse.js

Line 7 in c04ab2c

if (src.indexOf('\r') !== -1) src = src.replace(/\r\n?/g, '\n')

It's surprising that the loc info does not match the input text when the input uses CRLF line endings:

const YAML = require("yaml").default;

const text = "a:\r\n  123\r\n";
const cst = YAML.parseCST(text);

const plain = cst[0].contents[0].items[1].node;
plain.strValue; //=> "123"

text.slice(plain.valueRange.start, plain.valueRange.end); //=> " 12"
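One possible way to recover original positions is sketched below. It is a minimal sketch, not the library's implementation: `toOriginalOffset` is a hypothetical helper that walks the original text and maps an offset in the normalized text (after `\r\n` has been collapsed to `\n`) back to an offset in the input.

```javascript
// Hypothetical helper (not part of yaml): map an offset in CRLF-normalized
// text back to the corresponding offset in the original text.
function toOriginalOffset(original, normalizedOffset) {
  let orig = 0
  let norm = 0
  while (norm < normalizedOffset && orig < original.length) {
    // "\r\n" collapses to a single "\n", so it consumes two original characters
    if (original[orig] === '\r' && original[orig + 1] === '\n') orig += 2
    else orig += 1
    norm += 1
  }
  return orig
}

const text = 'a:\r\n  123\r\n'
// Same normalization as the line quoted from src/cst/parse.js
const normalized = text.replace(/\r\n?/g, '\n')
const start = normalized.indexOf('123')
const end = start + 3

// Slicing the original text with mapped offsets yields the scalar again
const slice = text.slice(toOriginalOffset(text, start), toOriginalOffset(text, end))
console.log(JSON.stringify(slice))
```

A lone `\r` is replaced by `\n` without changing the length, so only `\r\n` pairs shift the offsets.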

Document.toString result is different from the input document

I have an application that needs to manipulate a YAML document, add or remove some entries, and then render the new document back to a file while keeping all the comments. The problem is that the rendered output drops the line breaks and changes the scope of the comments.

Reproducer:

const doc = yaml.parseDocument(
  '# This comment is ok\n' +
  'entryA:\n' +
  '  - foo\n' +
  '\n' +
  'entryB:\n' +
  '  - bar # bar comment\n' +
  '\n' +
  '# Ending comment\n'
)
console.log(doc.toString())

Result:

# This comment is ok
entryA:
  - foo
entryB:
  - bar # bar comment # Ending comment

When I add a second ending comment, the behavior changes:

doc = yaml.parseDocument(
  '# This comment is ok\n' +
  'entryA:\n' +
  '  - foo\n' +
  '\n' +
  'entryB:\n' +
  '  - bar # bar comment\n' +
  '\n' +
  '# Ending comment\n' +
  '# Ending comment 2\n'
);
console.log(doc.toString())

Result:

# This comment is ok
entryA:
  - foo
entryB:
  - bar # bar comment
  # Ending comment
  # Ending comment 2

The indentation of the comments is important to me, because I am giving my users instructions on how to configure their application using the YAML file. If my comments end up in the wrong scope, my users will think that their input belongs in that scope, which would be wrong.
Is there a way to maintain the scope of my comments and avoid the removal of line breaks?

I was expecting this:

# This comment is ok
entryA:
  - foo

entryB:
  - bar # bar comment

# Ending comment
# Ending comment 2

$ref syntax or other?

Following this project with interest!

Are you planning to support resolving and loading external yaml files somehow?

False positive for checkKeyLength

version:
master (6f30489)

Input:

{x:
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
12345678901234567890123456789012
}

Output:

SemanticError: The "undefine...ndefined" key is too long

Expected:
No error.

Is there a way to sort object keys when stringifying?

Is there an option or way of sorting keys when stringifying objects? Right now it seems random (as keys are randomly hashed in memory in the browser and Node I believe).

When stringifying JS objects, it is very important to get a predictable YAML that can be diffed and easily read by a human, which translates into having the keys from an object sorted.

Example:

console.log( YAML.stringify({ c: 30, a: 10, b: 20 }) );
// would print the following every time:
a: 10
b: 20
c: 30
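Until such an option exists, one workaround is to sort the keys before handing the object to the stringifier. This is a rough sketch under that assumption: `sortKeysDeep` is a hypothetical helper, and it relies on JavaScript engines keeping insertion order for non-integer string keys (integer-like keys such as "1" are always listed first regardless).

```javascript
// Hypothetical helper: return a copy of `value` whose plain-object keys are
// inserted in sorted order, recursing into arrays and nested plain objects.
function sortKeysDeep(value) {
  if (Array.isArray(value)) return value.map(sortKeysDeep)
  const isPlainObject =
    value !== null &&
    typeof value === 'object' &&
    Object.getPrototypeOf(value) === Object.prototype
  if (!isPlainObject) return value
  const sorted = {}
  for (const key of Object.keys(value).sort()) {
    sorted[key] = sortKeysDeep(value[key])
  }
  return sorted
}

const sorted = sortKeysDeep({ c: 30, a: 10, b: { z: 1, y: 2 } })
// JSON.stringify is used here only to show the resulting key order
console.log(JSON.stringify(sorted))
```

The sorted copy could then be passed to YAML.stringify() in place of the original object.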

Support Node Stream input

This should not be too difficult to implement at a document level; not sure if it would make sense at deeper levels.

Add option to parse !!map as a Map rather than an Object

Based on discussion in #46. In JS Object keys are forced to string, while Map keys are not, and both are arguably good representations of a YAML !!map.

This also requires stringifier support, that's in issue #49.
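The difference in key handling can be seen with plain JavaScript, independent of this library. A small illustration (the variable names are only for this example):

```javascript
// Object keys are coerced to strings
const asObject = {}
asObject[1] = 'number key'
asObject[true] = 'boolean key'

// Map keys keep their original type
const asMap = new Map()
asMap.set(1, 'number key')
asMap.set(true, 'boolean key')

const objectKeyTypes = Object.keys(asObject).map(key => typeof key)
const mapKeyTypes = [...asMap.keys()].map(key => typeof key)

console.log(objectKeyTypes) // every key has become a string
console.log(mapKeyTypes) // number and boolean are preserved
```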

Distinguish SemanticError and SyntaxError

From the perspective of a formatter, we don't care about semantic errors like #3; we should format the document anyway, so it would be better to have a way to distinguish the two kinds of errors.
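On the consumer side, the distinction could be used roughly as follows. This sketch assumes each error carries a `name` of either "YAMLSemanticError" or "YAMLSyntaxError", as in the error output quoted in other issues here; `partitionErrors` and the sample error objects are made up for illustration.

```javascript
// Hypothetical helper: split a list of errors so that a formatter can bail
// out on syntax errors only and ignore semantic ones.
function partitionErrors(errors) {
  const syntax = []
  const semantic = []
  for (const error of errors) {
    if (error.name === 'YAMLSemanticError') semantic.push(error)
    else syntax.push(error)
  }
  return { syntax, semantic }
}

// Hand-written stand-ins for errors reported by a parser
const sample = [
  { name: 'YAMLSemanticError', message: 'Map keys must be unique; "null" is repeated' },
  { name: 'YAMLSyntaxError', message: 'Flow sequence contains an unexpected ,' }
]

const { syntax, semantic } = partitionErrors(sample)
console.log(syntax.length, semantic.length)
```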

FlowChar: node.setOrigRanges is not a function

version: 1.0.0-rc.8

const YAML = require('yaml').default

YAML.parseCST('{ : }\r\n').setOrigRanges()

Expected:

no error.

Actual:

TypeError: node.setOrigRanges is not a function
    at items.forEach.node (yaml/dist/cst/FlowCollection.js:144:21)

yaml/src/cst/FlowCollection.js

Lines 113 to 119 in 2163323

setOrigRanges(cr, offset) {
  offset = super.setOrigRanges(cr, offset)
  this.items.forEach(node => {
    offset = node.setOrigRanges(cr, offset)
  })
  return offset
}

yaml/src/cst/FlowCollection.js

Line 32 in 2163323

this.items = [{ char, offset: start }]

Doesn't quote colons

Not sure about the exact spec (if more special characters need to be quoted and maybe in other combinations), but at least this library doesn't quote single colons, which leads it to generate invalid YAML:

$ npm init
...
$ npm install yaml
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN [email protected] No description
npm WARN [email protected] No repository field.

+ [email protected]
added 1 package in 0.825s

$ node -p "require('yaml').stringify({ key: ':' })"
key: :

# PyYaml doesn't like it:

$ python3 -c 'import yaml; from io import StringIO; print(yaml.load(StringIO("key: :")))'
yaml.scanner.ScannerError: mapping values are not allowed here
  in "<file>", line 1, column 6

Missing error `Expected flow map to end with }`

version:
1.0.0-beta.6

input:

output:

no error

expected:

Expected flow map to end with }

Stringify method replaces Date objects with an empty object

const yaml = require('yaml');
const myDate = new Date();
const yamlStr = yaml.stringify(myDate);
console.log(myDate);
console.log(yamlStr);

output:

2018-12-22T07:29:51.234Z

{}
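A possible stopgap, assuming ISO 8601 strings are acceptable in the output, is to convert Date instances before stringifying. `datesToISOStrings` below is a hypothetical helper, not something the library provides:

```javascript
// Hypothetical helper: replace every Date in a value with its ISO 8601 string,
// recursing into arrays and objects.
function datesToISOStrings(value) {
  if (value instanceof Date) return value.toISOString()
  if (Array.isArray(value)) return value.map(datesToISOStrings)
  if (value !== null && typeof value === 'object') {
    const out = {}
    for (const [key, item] of Object.entries(value)) {
      out[key] = datesToISOStrings(item)
    }
    return out
  }
  return value
}

// Same instant as in the output above (month 11 is December)
const myDate = new Date(Date.UTC(2018, 11, 22, 7, 29, 51, 234))
const converted = datesToISOStrings({ created: myDate, tags: [myDate] })
console.log(converted.created)
```

The converted value contains only strings, so it no longer depends on how the stringifier treats Date objects.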

Wrong output produced by empty value with explicit key in flow sequence

version:
master (6f30489)

Input:

[? 123]

Output:

[123, { "": null }]

Expected:

[{ 123: null }]

SyntaxError: Sequence items are not allowed on the same line with map keys

Sorry for the bad title, I'm not sure how to name this issue.

version:
1.0.0-beta.6

input:

(extract from https://github.com/prettier/prettier/blob/d0cd112/.circleci/config.yml)

aliases:
  - restore_cache:
      - v1-yarn-cache
  - save_cache:
      paths:
        - ~/.cache/yarn
  - &restore_deps_cache
    keys:
      - v1-deps-cache-{{ checksum "yarn.lock" }}

output:

SyntaxError: Sequence items are not allowed on the same line with map keys (7:3)
  5 |       paths:
  6 |         - ~/.cache/yarn
> 7 |   - &restore_deps_cache
    |   ^^^^^^^^^^^^^^^^^^^^
> 8 |     keys:
    | ^^^^^^^^^
> 9 |       - v1-deps-cache-{{ checksum "yarn.lock" }}
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

expected:

no error.

P.S. It parses correctly if there's no indentation:

aliases:
- restore_cache:
    - v1-yarn-cache
- save_cache:
    paths:
      - ~/.cache/yarn
- &restore_deps_cache
  keys:
    - v1-deps-cache-{{ checksum "yarn.lock" }}

TypeScript Type declarations

TypeScript definition files or something in DefinitelyTyped would be great. Do you have any plans for them?

I may create some definition files myself on DefinitelyTyped - as I would really like to use this package. Unless you have any objections? It would probably only be basic though.

Great package and great work on the documentation by the way! :)

Range in characters is sub-optimal for error reporting to humans

As YAML is line-delimited (can't be strung together minified like JSON) it makes at least some sense to be able to report errors (both syntactically and semantically) as a line number and column number pair.

Is it simple to add a helper routine to achieve this? Happy to PR with a bit of pointing in the right direction.

My end use-case is to turn JSON pointer references to schema violations back into line number, column number pairs.
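Such a helper could look roughly like this. It is a sketch only: `offsetToLineCol` is a hypothetical name, and it assumes a character offset into the source string, with 1-based line and column numbers.

```javascript
// Hypothetical helper: convert a character offset into a 1-based
// { line, col } pair by counting newlines before the offset.
function offsetToLineCol(src, offset) {
  let line = 1
  let lineStart = 0
  for (let i = 0; i < offset && i < src.length; i++) {
    if (src[i] === '\n') {
      line += 1
      lineStart = i + 1
    }
  }
  return { line, col: offset - lineStart + 1 }
}

const src = 'abc: 123\ndef\n'
const pos = offsetToLineCol(src, src.indexOf('def'))
console.log(pos)
```

For repeated lookups on a large document, precomputing the array of line-start offsets and doing a binary search would avoid rescanning the text each time.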

List like [[":,"]] is not parsed properly

I wrote a property-based check for the yaml library using fast-check (https://github.com/dubzzz/fast-check) and it uncovered the following error:

> const YAML = require('yaml');
undefined
> YAML.stringify([[':,']])
'- - :,\n'
> YAML.parse('- - :,\n')
YAMLSyntaxError: Document is not valid YAML (bad indentation?)

I don't think this issue will impact us anytime soon, but you might want to be aware of it.

Here's my code (TypeScript) if you want to experiment more. Right now it will trigger mostly on the [[":,"]] bug and so will not make a lot of progress. I can only continue testing once that bug is fixed.

import fc = require('fast-check');
import YAML = require('yaml');
import deepEqual = require('deep-equal');
import { Arbitrary } from 'fast-check';

// Unfortunately it's hard to generate fully recursive type definitions
// so we'll unroll by hand a couple of times

function makeRecord<T, U>(a: Arbitrary<T>) {
  return fc.record({
    key1: a,
    key2: a
  });
}

const arbitraryJson0 = fc.oneof<string|number|boolean|null>(
  fc.string(),
  fc.integer(),
  fc.float(),
  fc.boolean(),
  fc.constant(null),
);


const arbitraryJson1 = fc.oneof<any>(
  arbitraryJson0,
  fc.array(arbitraryJson0),
  makeRecord(arbitraryJson0)
);

const arbitraryJson2 = fc.oneof<any>(
  arbitraryJson1,
  fc.array(arbitraryJson1),
  makeRecord(arbitraryJson1),
);

const arbitraryJson3 = fc.oneof<any>(
  arbitraryJson2,
  fc.array(arbitraryJson2),
  makeRecord(arbitraryJson2),
);


// Property #1: validate that stringifying and parsing are each other's inverse
// Do a manual sampling and test so that we can continue even if we find failing cases.
function isReversible(original: any) {
    const rendered = YAML.stringify(original);
    try {
      const parsed = YAML.parse(rendered);
      return deepEqual(original, parsed);
    } catch(e) {
      // Throwing is also bad
      return false;
    }
}

const N = 10000;

fc.assert(fc.property(arbitraryJson3, isReversible), {
  numRuns: N
});

Support YAML.load('pathToYmlFile.yml')

Awesome package! I'm using it in one of my projects and was wondering if you would be willing to support something like YAML.load(), which would read the file from fs and parse it in one step?

If you are not against this approach, I can help by sending in a PR.

Thanks.

Using custom tags as keys?

Thanks for this library, this looks awesome and I'm going to play with it a bit.

Can we use custom tags as keys? i.e.:

foo:
    !bar: Hello

Duplicate comment in MAP items

This only happens if there's no trailing newline.

Version: 1.0.0-rc.7

Input

a:
  # 123

Output

[
  { type: "PLAIN" },
  { type: "MAP_VALUE", comment: "123" },
  { type: "COMMENT", comment: "123" }
]

Expected

[
  { type: "PLAIN" },
  { type: "MAP_VALUE", comment: "123" }
]

Original post: prettier/prettier#4861

Define CST (and AST) acronyms in the README

I was clear about the meaning of AST, but hadn't actually come across the acronym CST before and had to look it up. Possibly linking to wikipedia (though that isn't a computer-science focussed article) or maybe here might be helpful?

Happy to PR if you think this is worth adding.

Duplicate comment when converting back from document

The YAML output contains a duplicated comment when a parsed document is converted back with toString().
Example:

test1:
  foo: #123
    bar: 1

After parseDocuments and doc.toString()

test1:
  foo: #123
    #123
    bar: 1

Alias & merge nodes are resolved too early

At the moment, alias and merge nodes are resolved during YAML.parseDocuments, and they cannot be created in a YAML.Document. Instead, new Node types should be added for them, and they should be modifiable and creatable for new Documents.

Collections should provide a default format for the scalars they contain

It should be possible to e.g. parse [ 0x01, 0x02, 0x03 ] as a sequence of hexadecimal integers, and to:

- Automatically assign added items to have the same format, such that after seq.items.push(4), the sequence would be stringified as [ 0x01, 0x02, 0x03, 0x04 ].
- Re-format it as a sequence of octal integers [ 0o1, 0o2, 0o3 ].
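To make the intended output concrete, here is a small sketch of the formatting step only. `formatInt` and `formatFlowSeq` are hypothetical names and say nothing about how the library would store the format; non-negative integers are assumed.

```javascript
// Hypothetical formatter for non-negative integers in a given notation
function formatInt(n, format) {
  if (format === 'HEX') return '0x' + n.toString(16).padStart(2, '0')
  if (format === 'OCT') return '0o' + n.toString(8)
  return String(n)
}

// Hypothetical flow-sequence renderer that applies one format to all items
function formatFlowSeq(items, format) {
  return '[ ' + items.map(n => formatInt(n, format)).join(', ') + ' ]'
}

const items = [1, 2, 3]
items.push(4) // a newly added item picks up the collection's format
const asHex = formatFlowSeq(items, 'HEX')
const asOct = formatFlowSeq([1, 2, 3], 'OCT')
console.log(asHex)
console.log(asOct)
```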

Consider adding typings

either by exposing your public interface as a .d.ts file, or just renaming all your files to .ts =)

Thanks!

Indentation level is not taken into account for trailing comments

This is split off from #28, which was reported by @douglasmuraoka.

With input like this:

a1:
a2:
  b1: c1
  b2: c2
#comment

the comment is attached to the b-level collection, rather than the root a-level collection. This means that when re-stringifying, the comment will be indented as well.

To fix, the indentation level of such trailing comments should be taken into account when determining which collection to attach them to.
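The rule could be sketched like this, assuming the parser knows the stack of currently open collections and the indentation of each. `pickCommentOwner` and the `{ name, indent }` records are hypothetical and only illustrate the selection logic.

```javascript
// Hypothetical selection rule: attach a trailing comment to the innermost
// open collection whose indentation does not exceed the comment's.
// `openCollections` is ordered from outermost to innermost.
function pickCommentOwner(openCollections, commentIndent) {
  for (let i = openCollections.length - 1; i >= 0; i--) {
    if (openCollections[i].indent <= commentIndent) return openCollections[i]
  }
  // Fall back to the outermost collection (undefined if none is open)
  return openCollections[0]
}

// The example above: the a-level map at indent 0, the b-level map at indent 2
const open = [
  { name: 'a-level', indent: 0 },
  { name: 'b-level', indent: 2 }
]

const ownerOfUnindented = pickCommentOwner(open, 0)
const ownerOfIndented = pickCommentOwner(open, 2)
console.log(ownerOfUnindented.name, ownerOfIndented.name)
```

With this rule, the unindented #comment in the example would belong to the root collection and would be re-stringified without extra indentation.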

Stringify of values with double and single quotes can create invalid results

This object when passed through stringify():

{ "x": "{\"module\":\"database\",\"props\":{\"databaseType\":\"postgresql\"},\"extra\":{},\"foo\":\"bar'\"}" }

Will result in the following (invalid) output:

x: "{\"module\":\"database\",\"props\":{\"databaseType\":\"postgresql\"},\"extra\\
  ":{},\"foo\":\"bar'\"}

Remove the single quote from the "bar" at the end and everything will stringify okay.

Allow disabling automatic line wrapping.

Summary

Stringification (of unmodified nodes) applies its own conventions on desired line length.

Example

In:

metadata:
  maintainer: "[email protected]"

settings:
  kubeContext: "my.cloud"
  serviceAccount: "tiller"
  tillerVersion: "--canary-image"
  slackWebhook: "https://hooks.slack.com/services/XXXXXXXXXX/XXXXXXXXX/XXXXXXXXXXXXXXXXXXXXXXXX"

Code:

const yaml = YAML.parseDocument(input, {
  keepCstNodes: true,
})

// I would add some mutations on individual nodes here, but in this example there aren't any

process.stdout.end(yaml.toString())

Out:

metadata:
  maintainer: "[email protected]"
settings:
  kubeContext: "my.cloud"
  serviceAccount: "tiller"
  tillerVersion: "--canary-image"
  slackWebhook: "https://hooks.slack.com/services/XXXXXXXXX/XXXXXXXXX/XXXXXXXXXXXXXXXXXXXXX\
    XXX"

Use Case

I want to make use of this library to automate adjusting specific fields (Pairs) in a yaml file as part of a CI pipeline. However, I want the changes made to be as minimal as possible, so as to avoid confusion on what is actually different. Thus, I would like yaml to keep the current style (including max line length) intact and only output those nodes differently which were adjusted.

Scalar formatting should be retained & exposed

Scalar values may be formatted in a number of different ways, even for a single tag. The source format should be retained and exposed at the object level of the API, and used (when appropriate) as the default when re-stringifying.

Poll: Comments & spaces among document directives

YAML documents may have a directives section at their start, separated from the rest of the document by a --- line. The two valid directives are %YAML for defining an explicit YAML version and %TAG for defining tag prefixes. These directives may of course have comment or empty lines before, between, and after them, and they may have comments on their own lines as well.

At the moment, yaml extracts all of these comments into a single doc.commentBefore string value. When stringifying, this comment is printed before all of the directive lines. The current plan is to add to this a doc.spaceBefore boolean value, which if true would get you a blank line just after the comment.

Now, here's my question: Is this okay? Or do you have a use case for which this simplification doesn't really work? The tricky part here is that yaml currently doesn't really keep the original document's directives when stringifying; instead this metadata is generated on the spot from the doc.version and doc.tagPrefixes values, which are not represented internally as nodes, so they don't carry comment and spacing data.

Please react to this with a 👍 if you see no reason to change the current situation, or add a comment if you'd like a different API.

Duplicate << keys shouldn't be marked as error when merge=true

version: 1.0.0-rc.7

var YAML = require("yaml").default;

YAML.parse(`
.anchors:
  - &anchor1
    key: value
  - &anchor2
    another: prop
foo:
  bar: baz
  <<: *anchor1
  <<: *anchor2
`, { merge: true }) //=> YAMLSemanticError: Map keys must be unique; "<<" is repeated

Expected: no error

Original post: prettier/prettier#4919

Empty lines should not be discarded

This is split off from #28, which was reported by @douglasmuraoka.

Empty lines can help semantically separate blocks of content, and so they should not be discarded as is currently done.

Another crazy bug the automatic tester found

Hi Eemeli,

It's me again! :)

Just ran the autotester on 1.1.0 and it came up with the following:

{"key1":{"key1":{"key1":"","key2":""},"key2":[]},"key2":[[">####################################\"##########################'####\\P#"]]}
--[ roundtrips to ]--->
{"key1":{"key1":{"key1":"","key2":""},"key2":[]},"key2":[[">####################################\"##########################'####\\  #"]]}

Repro:

const YAML = require('yaml');
const assert = require('assert');


const input = {"key1":{"key1":{"key1":"","key2":""},"key2":[]},"key2":[[">####################################\"##########################'####\\P#"]]};

const stringified = YAML.stringify(input);
console.log("------------\n" + stringified + "\n-----------------_");
const parsed = YAML.parse(stringified);

console.log(JSON.stringify(parsed));

assert.deepEqual(input, parsed);

YAML output:

key1:
  key1:
    key1: ""
    key2: ""
  key2:
    []
key2:
  - - ">####################################\"##########################'####\\
      \P#"

Failed to parse empty string

import YAML from 'yaml';

YAML.parse('');

The above code throws an error. However, it should return an empty object {} or null.

Classes in bundled code breaks IE11 compatibility

I've been having a really hard time debugging an issue with IE11 and it turns out the offending party is yaml, since your babel config apparently doesn't transform classes. Is there a special reason for this? I was surprised since it's the first dependency I've come across with this issue. 80% of our customers use IE11 (yeah, I know 🙄) so it's hard to ignore.

Anyway, I'll just write out yaml for now. Thank you for an otherwise great library.

Collection formatting should be retained & exposed

Collection values may be formatted as either block or flow collections. The source format should be retained and exposed at the object level of the API, and used (when appropriate) as the default when re-stringifying.

Error: Flow sequence contains an unexpected ,

version:
1.0.0-beta.6

input:

[ , ]
---
[ 123,, ]

output:

YAMLSyntaxError: Flow sequence contains an unexpected ,
YAMLSyntaxError: Flow sequence contains an unexpected ,

expected:

[ null ]
---
[ 123, null ]

P.S. There's no error for its flowMap version but its YAMLMap#items is also empty.

Missing error.source for 'SyntaxError: Document is not valid YAML (bad indentation?)'

I'm not sure if this is intended, but if so, is it safe to rely on error.source === null meaning that the error range is the entire document?

Support stringifying JavaScript Map as !!map

A Map should obviously be a !!map.

Following discussion in #46

TypeError: Cannot read property 'type' of undefined

version: 1.0.0-beta.5

input:

abc: 123
def

output:

TypeError: Cannot read property 'type' of undefined
    at YAMLMap.resolveBlockMapItems (./node_modules/yaml/dist/schema/Map.js:152:34)
    at YAMLMap.parse (./node_modules/yaml/dist/schema/Map.js:71:14)
    at Object.resolve (./node_modules/yaml/dist/schema/failsafe.js:22:34)
    at Tags.resolveNode (./node_modules/yaml/dist/Tags.js:129:29)
    at Tags.resolveNodeWithFallback (./node_modules/yaml/dist/Tags.js:153:22)
    at Document.resolveNode (./node_modules/yaml/dist/Document.js:302:22)
    at ./node_modules/yaml/dist/Document.js:116:35
    at Array.forEach (<anonymous>)
    at Document.parse (./node_modules/yaml/dist/Document.js:114:16)

expected:

A syntax error.

Empty blockValues are resolved as null

version:
master (6f30489)

Input:

Output:

null

Expected:

''

Space gets duplicated

Like #56, not an issue that we encountered in the wild but one that my property-based test uncovered.

> YAML.stringify([{"key1":[],"key2":"!\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"#\"\\ '"}])
'- key1:\n    []\n  key2: "!\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"#\\"\\\\\n    \\ \'"\n'

> YAML.parse('- key1:\n    []\n  key2: "!\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"\\"#\\"\\\\\n    \\ \'"\n')
[ { key1: [],
    key2: '!""""""""""""""""""""""""""""""""""#"\\  \'' } ]
                                                   ^~~~ extra space inserted here

I updated the test script to return true in case of parsing errors, and passed some arguments to fc.string() to control for the length. Had to run it a couple of times to come up with this error.

import fc = require('fast-check');
import YAML = require('yaml');
import deepEqual = require('deep-equal');
import { Arbitrary } from 'fast-check';

// Unfortunately it's hard to generate fully recursive type definitions
// so we'll unroll by hand a couple of times

function makeRecord<T, U>(a: Arbitrary<T>) {
  return fc.record({
    key1: a,
    key2: a
  });
}

const arbitraryJson0 = fc.oneof<string|number|boolean|null>(
  fc.string(0, 100),
  fc.integer(),
  fc.float(),
  fc.boolean(),
  fc.constant(null),
);


const arbitraryJson1 = fc.oneof<any>(
  arbitraryJson0,
  fc.array(arbitraryJson0),
  makeRecord(arbitraryJson0)
);

const arbitraryJson2 = fc.oneof<any>(
  arbitraryJson1,
  fc.array(arbitraryJson1),
  makeRecord(arbitraryJson1),
);

const arbitraryJson3 = fc.oneof<any>(
  arbitraryJson2,
  fc.array(arbitraryJson2),
  makeRecord(arbitraryJson2),
);


// Property #1: validate that stringifying and parsing are each other's inverse
// Do a manual sampling and test so that we can continue even if we find failing cases.
const N = 100000;

fc.assert(fc.property(arbitraryJson3, fc.context(), (original, context) => {
  const rendered = YAML.stringify(original);
  try {
    const parsed = YAML.parse(rendered);
    context.log('Parsed as ' + JSON.stringify(parsed));
    return deepEqual(original, parsed);
  } catch(e) {
    // Throwing is also bad but not what we're looking for right now
    return true;
  }
}), {
  numRuns: N
});

Long strings that start with spaces aren't rendered properly (and lose newlines)

I'm not well-versed enough in the YAML spec to tell you what the proper terms are to use in this bug report. It seems to be in "block mode" (?) rendering of long strings starting with spaces.

They don't need to contain newlines to trigger the issue, but if they do the newlines are lost.

The issue is in rendering, not parsing, as all YAML engines seem to parse the string the same (which is different from the original string).

Summary

If the renderer switches to some particular rendering mode marked by ">-2" (seems to be triggered by the string starting with a space) then if word wrap occurs the word wrap is interpreted as a literal newline. Actual newlines that are in this block will be lost upon parsing as well.

A string that looks like this:

" very long string that will cause wrapping. \n a newline"

Will render like this:

field: >-2
    very long string that will cause
   wrapping.
   a newline

And when parsed back will be parsed as:

" very long string that will cause \n wrapping. a newline"

So the wrap is interpreted as a newline and the actual newline is lost.

Repro

Here is the reproduction (still occurs with version 1.0.3).

const YAML = require('yaml');
const assert = require('assert');

const obj = {
    Field: ' very long line that starts with a space. very long line that starts with a space.\nstart on a new line'
};

const yamlified = YAML.stringify(obj);
console.log(yamlified);
const parsed = YAML.parse(yamlified);

assert.deepEqual(obj, parsed);

Output:

Field: >-2
   very long line that starts with a space. very long line that starts with a
  space.
  start on a new line

Assertion fails:

AssertionError [ERR_ASSERTION]: { Field: ' very long line that starts with a space. very long line that starts with a space.\nstart on a new line' } deepEqual { Field: ' very long line that starts with a space. very long line that starts with a\nspace. start on a new line' }

NOTE: We're normally using this library in yaml-1.1 mode (because we're generating output for AWS CloudFormation), but this issue also crops up in YAML 1.2 mode as the repro shows.

Indentation of array elements is not standard

Block sequences (i.e., arrays) need not be indented. As the 1.1 spec says, the dash is considered part of the indentation. So this is the correct, idiomatic way of representing arrays:

array:
- one
- two
- three

The 1.2 spec does not explain this very well (it's not very well written as specs go), but this is the style used on the current yaml.org front page.

The current serialization adds one indentation level too many:

array:
  - one
  - two
  - three

This is admittedly a frequent point of confusion among YAML users; some people do indent. But as evidence of implementations that do it right, look at Ruby's standard yaml library. At least there should be a configuration option.
