hsyaml's Issues

Add instance Functor Doc

I was looking at the result of decodeNode and wanted to get rid of the source locations via void.
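A minimal sketch of what the request amounts to, using stand-in types rather than HsYAML's own: the `Doc`, `Node` and `Pos` definitions below are simplified assumptions made for illustration. With a Functor instance on both the document wrapper and the node type, stripping locations becomes a single `fmap void`.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Data.Functor (void)

-- Stand-in for the document wrapper; only its shape matters here.
newtype Doc n = Doc n
  deriving (Show, Eq, Functor)

-- Hypothetical, simplified node type parameterised over a location annotation.
data Node loc
  = Scalar loc String
  | Sequence loc [Node loc]
  deriving (Show, Eq, Functor)

-- Hypothetical source position.
data Pos = Pos { posLine :: Int, posColumn :: Int }
  deriving (Show, Eq)

-- Replace every location annotation inside the document with ().
stripLocations :: Doc (Node loc) -> Doc (Node ())
stripLocations = fmap void

main :: IO ()
main = print (stripLocations (Doc (Sequence (Pos 1 0) [Scalar (Pos 1 2) "a"])))
```

Without the instance on `Doc`, the same effect needs an explicit pattern match to unwrap and rewrap the document.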

test-case 9KAX

https://matrix.yaml.io/details/9KAX.html doesn't pass as of HsYAML-0.1.2.0 specifically because of

---
!!map
&a8 !!str key8: value7

which yaml2token decodes as

<stdin>:1:0: BeginDocument  | 
<stdin>:1:0: DirectivesEnd  |  ---
<stdin>:1:3: BeginNode      | 
<stdin>:1:3: BeginScalar    | 
<stdin>:1:3: EndScalar      | 
<stdin>:1:3: EndNode        | 
<stdin>:1:3: Break          |     \n

<stdin>:2:0: Unparsed       |  !!map
<stdin>:2:5: Unparsed       |       \n

<stdin>:3:0: Unparsed       | "&a8 !!str key8: value7"
<stdin>:3:22: Unparsed       |                        \n

<stdin>:4:0: EndDocument    | 

In other words, parsing chokes on the !!map token.

TODO: review YAML 1.2 spec

Decoding different yaml types for a single node is tricky

In some cases you might have yaml like:

list-o-things:
  - "Im text"
  - 2
  - name: complex-case
    extra-info: stuff

In yaml / aeson that is pretty simple to handle:

newtype Thing = Thing Text

instance FromJSON Thing where
  parseJSON (String a) = pure $ Thing a
  parseJSON (Object v) =   -- v is a HashMap and quite easy to work with

In HsYAML there doesn't seem to be a way to do this without getting deep into library internals.

instance FromYAML Thing where
  parseYAML (Scalar _ (SStr a)) = pure $ Thing a
  parseYAML (Mapping _ _ _) = -- withMap would be redundant, but how else to write the parser?
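To make the desired behaviour concrete, here is a small sketch that dispatches on the node shape directly. It deliberately uses a simplified stand-in `Node` type and a plain `Either String` result instead of HsYAML's `Node` and `Parser`; the names `StrNode`, `IntNode`, `MapNode` and `parseThing` are assumptions made for illustration only.

```haskell
-- Simplified stand-in for a YAML node (not HsYAML's real Node type).
data Node
  = StrNode String
  | IntNode Integer
  | MapNode [(String, Node)]
  deriving (Show, Eq)

-- One Haskell type covering the three shapes in the example list.
data Thing
  = TextThing String
  | NumberThing Integer
  | ComplexThing String String -- name, extra-info
  deriving (Show, Eq)

-- Dispatch on the node constructor; each branch is an ordinary parser.
parseThing :: Node -> Either String Thing
parseThing (StrNode s) = Right (TextThing s)
parseThing (IntNode n) = Right (NumberThing n)
parseThing (MapNode kvs) = do
  name  <- field "name"
  extra <- field "extra-info"
  Right (ComplexThing name extra)
  where
    field k = case lookup k kvs of
      Just (StrNode s) -> Right s
      Just _           -> Left ("field " ++ show k ++ " is not a string")
      Nothing          -> Left ("missing field " ++ show k)

main :: IO ()
main = mapM_ (print . parseThing)
  [ StrNode "Im text"
  , IntNode 2
  , MapNode [("name", StrNode "complex-case"), ("extra-info", StrNode "stuff")]
  ]
```

The open question in the issue is how to get the mapping branch this cheaply with the real types, where keys are themselves nodes rather than strings.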

Data.YAML.Compat

I'm adding support for HsYAML to hlint and it would be useful for us if HsYAML and/or HsYAML-aeson exported a compatibility module with an identical interface to the yaml library. Would such an interface be useful to others as well?

Decimal numbers have rounding error

The aeson package uses Scientific which avoids numeric instability from floating point numbers. This package uses Double, and that causes its behavior to diverge from that of aeson when dealing with decimal numbers. Here's an example ghci session that illustrates the issue:

λ fromRight "error" (Data.Aeson.eitherDecode "{\"foo\": 0.1}") :: Data.Aeson.Value
Object (fromList [("foo",Number 0.1)])
λ fromRight "error" (Data.YAML.Aeson.decode1Strict "foo: 0.1") :: Data.Aeson.Value
Object (fromList [("foo",Number 0.1000000000000000055511151231257827021181583404541015625)])

Would you be willing to accept a pull request that solves this problem? If so, any pointers as to what solution you'd prefer?
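The divergence can be reproduced with base alone. The sketch below parses the same literal twice: once exactly into a `Rational` and once through `Double`. The helper names `parseExact` and `parseViaDouble` are made up for this illustration, and `readFloat` only accepts unsigned literals, so this is not a complete number parser.

```haskell
import Data.Ratio ((%))
import Numeric (readFloat)

-- Parse an unsigned decimal literal exactly, without going through Double.
parseExact :: String -> Maybe Rational
parseExact s = case readFloat s of
  [(r, "")] -> Just r
  _         -> Nothing

-- Parse the same literal as a Double, then expose the value actually stored.
parseViaDouble :: String -> Maybe Rational
parseViaDouble s = case reads s :: [(Double, String)] of
  [(d, "")] -> Just (toRational d)
  _         -> Nothing

main :: IO ()
main = do
  print (parseExact "0.1" == Just (1 % 10))     -- exact parse keeps 1/10
  print (parseViaDouble "0.1" == Just (1 % 10)) -- the Double is only close to 1/10
```

Any fix along these lines would need the scalar's original text (or an exact representation of it) to survive until the conversion to aeson's number type.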

Empty lines fail to round trip

It seems like empty lines do not have a corresponding Event:

a: 1

# comment
b: 2

Results in (after passing through yaml-test yaml2yaml):

a: 1
# comment
b : 2

Would it be possible to add a new event or comment attribute to preserve empty lines?

Inline trailing comments fail to round trip

It seems like the comment event is missing some information to indicate if it is inline or standalone:

key: "value" # a comment

Results in (after passing through YE.writeEvents YT.UTF8 . map eEvent . rights . YE.parseEvents):

key: "value"
# a comment

The events are:

MappingStart Nothing Nothing Block
Scalar Nothing Nothing Plain "key"
Scalar Nothing Nothing DoubleQuoted "value"
Comment " a comment"
MappingEnd

Would it be possible to add an attribute to the Comment event to preserve its line position?
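One possible shape for such an attribute, sketched with stand-in types: `CommentPlacement`, the two-constructor `Event` and `render` below are hypothetical and much simpler than HsYAML's event model. The point is only that a writer can keep an inline comment on its line once the event records where the comment sat.

```haskell
-- Hypothetical: where a comment sits relative to the preceding content.
data CommentPlacement = Inline | Standalone
  deriving (Show, Eq)

-- Simplified stand-in for an event stream (not HsYAML's Event type).
data Event
  = Content String
  | Comment CommentPlacement String
  deriving (Show, Eq)

-- Render events line by line, attaching inline comments to the line before them.
render :: [Event] -> String
render [] = ""
render (Content c : Comment Inline t : rest) = c ++ " #" ++ t ++ "\n" ++ render rest
render (Content c : rest) = c ++ "\n" ++ render rest
render (Comment _ t : rest) = "#" ++ t ++ "\n" ++ render rest

main :: IO ()
main = do
  putStr (render [Content "key: \"value\"", Comment Inline " a comment"])
  putStr (render [Content "key: \"value\"", Comment Standalone " a comment"])
```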

Bad #if breaks build on ghc-8.4.2

This #if:

#if !MIN_VERSION_mtl(2,2,2) || (__GLASGOW_HASKELL__ == 804 && __GLASGOW_HASKELL_PATCHLEVEL1__ < 2)

helps if you are on the very first release of ghc-8.4.1, but if you are on ghc-8.4.2 patchlevel 1 or ghc-8.4.3 patchlevel 1 it breaks the build.

Serializing `[Event]`s ought not emit invalid YAML streams

See also #15 (comment)

We should have two modes: one which throws an exception when an invariant is broken by the stream; and one mode which silently fixes up the stream; this could either be a separate phase (i.e. a separate "stream transformer" function) or be integrated into the printing phase.
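The two modes could look roughly like the sketch below. It works on a reduced stand-in `Event` type and checks a single invariant, namely that collection start and end events nest properly; `validate` and `repair` are hypothetical names, and a real implementation would have to cover further invariants (document boundaries, mapping key/value pairing, and so on).

```haskell
-- Reduced stand-in for an event stream (not HsYAML's Event type).
data Event = SeqStart | SeqEnd | MapStart | MapEnd | Scalar String
  deriving (Show, Eq)

-- The end event that closes a given start event, if it is a start event.
closer :: Event -> Maybe Event
closer SeqStart = Just SeqEnd
closer MapStart = Just MapEnd
closer _        = Nothing

isEnd :: Event -> Bool
isEnd e = e == SeqEnd || e == MapEnd

-- Strict mode: report the first nesting violation.
validate :: [Event] -> Either String ()
validate = go []
  where
    go [] [] = Right ()
    go open [] = Left (show (length open) ++ " unclosed collection(s)")
    go open (e : es)
      | Just c <- closer e = go (c : open) es
      | isEnd e = case open of
          (c : rest) | c == e -> go rest es
          _                   -> Left ("unexpected " ++ show e)
      | otherwise = go open es

-- Lenient mode: drop end events that match nothing, then close what is still open.
repair :: [Event] -> [Event]
repair = go []
  where
    go open [] = open -- pending closers, innermost first
    go open (e : es)
      | Just c <- closer e = e : go (c : open) es
      | isEnd e = case open of
          (c : rest) | c == e -> e : go rest es
          _                   -> go open es
      | otherwise = e : go open es

main :: IO ()
main = do
  let broken = [MapStart, Scalar "k", SeqStart, MapEnd]
  print (validate broken)
  print (repair broken)
  print (validate (repair broken))
```

Keeping `repair` as a separate stream transformer, as suggested above, means the printer itself only ever needs the strict check.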

Hyphens in key names (dumper)

I've been trying out dumping YAML from the structure we use to represent pandoc command-line options (with HsYAML-aeson).

I'm getting this:

abbreviations: null
ascii: false
"base-header-level": 1
"cite-method": Citeproc
etc.

Keys with hyphens are quoted. This isn't necessary in YAML, so the quotes are overkill and undesirable. Can the code that determines when a key name needs quoting be tweaked to make it less aggressive? (I could take a look if you point me to the right place in the code.)

I'm assuming this is an issue with HsYAML itself rather than HsYAML-aeson.
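As an illustration of a less aggressive rule, here is a deliberately conservative predicate: it only approves keys made of ASCII letters, digits, hyphens and underscores that start with a letter and are not words commonly resolved to null or booleans. The name `safeUnquoted` and the reserved-word list are assumptions for this sketch; this is not the check HsYAML performs, and many other strings that are valid plain scalars are rejected by it.

```haskell
import Data.Char (isAlpha, isAlphaNum, isAscii, toLower)

-- True when the key can be emitted as a plain (unquoted) scalar under this
-- conservative rule; False means "quote it to be safe".
safeUnquoted :: String -> Bool
safeUnquoted [] = False
safeUnquoted key@(c : cs) =
  isAsciiLetter c
    && all okRest cs
    && map toLower key `notElem` reserved
  where
    isAsciiLetter x = isAscii x && isAlpha x
    okRest x = isAscii x && (isAlphaNum x || x == '-' || x == '_')
    -- Words that some schemas resolve to null or booleans rather than strings.
    reserved = ["null", "true", "false", "yes", "no", "on", "off", "y", "n"]

main :: IO ()
main = mapM_ (\k -> putStrLn (k ++ ": " ++ show (safeUnquoted k)))
  ["abbreviations", "base-header-level", "cite-method", "null", "a: b"]
```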

[EDIT: fixed YAML nomenclature]

Export getDoc

It's just slightly inconvenient to have to pattern match on Doc.

test-case S98Z

https://matrix.yaml.io/details/S98Z.html doesn't pass as of HsYAML-0.1.2.0

empty block scalar: >
 
  
   
 # comment

I think this test-case is illicit under the YAML 1.2 specification, as there are leading empty lines that contain more spaces than the first non-empty line; but this needs a proper refutation.

GHC.Generics instances for {From,To}YAML

It would be great if HsYAML would be able to generate the instances automagically through generics, like aeson does, so that there would be no need to tediously mirror data declarations with trivial instances.

Laziness of parser

If I do

case decode inp of
  Right (opt :: Opt : _) -> doSomethingWith opt
  Right [] -> oneKindOfError
  Left (pos, err) -> anotherKindOfError

and inp is something like

---
foo: bar
...

baz

then I get an error when it hits baz, even though it can successfully parse the one YAML document I'm asking for. Is this expected? Could the parser be made lazier, so it returns the result after having parsed the first YAML document in the stream, and doesn't worry about the rest unless I ask for it?

Alternatively, could there be an option telling it to ignore non-YAML content after the first YAML document, if you just use decode1?
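Until something like that exists in the library, a caller could trim the input before decoding. The sketch below is such a workaround, not a library feature: `firstDocument` is a hypothetical helper that keeps everything up to and including the first line consisting solely of the document end marker. It does not recognise a marker followed by a comment and does not handle a stream that separates documents only with `---`.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Keep the input up to and including the first "..." document end marker.
-- If there is no marker, the input is returned unchanged (modulo a final newline).
firstDocument :: String -> String
firstDocument input =
  case break isDocEnd (lines input) of
    (doc, marker : _) -> unlines (doc ++ [marker])
    (doc, [])         -> unlines doc
  where
    -- A marker line is exactly "..." apart from trailing whitespace.
    isDocEnd l = dropWhileEnd isSpace l == "..."

main :: IO ()
main = putStr (firstDocument "---\nfoo: bar\n...\n\nbaz\n")
```

The trimmed text can then be passed to the decoder, so the trailing `baz` never reaches the parser.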

dlist dependency needs to be updated?

When installing on gentoo with dlist-0.8.0.2 I get the following error. It is fixed by updating the dependency to dlist-0.8.0.4.

Preprocessing library for HsYAML-0.1.1.2..
Building library for HsYAML-0.1.1.2..
[1 of 7] Compiling Util ( src/Util.hs, dist/build/Util.o )
[2 of 7] Compiling Data.YAML.Token.Encoding ( src/Data/YAML/Token/Encoding.hs, dist/build/Data/YAML/Token/Encoding.o )
[3 of 7] Compiling Data.YAML.Token ( src/Data/YAML/Token.hs, dist/build/Data/YAML/Token.o )

src/Data/YAML/Token.hs:28:1: error:
Data.DList: Can't be safely imported! The module itself isn't safe.

xref gentoo-haskell/gentoo-haskell#814

Mapping field order is not preserved

Encoding the following mapping

encode [mapping ["a" .= (1 :: Int), "d" .= (2 :: Int), "c" .= (3 :: Int), "b" .= (4 :: Int)]]

results in:

a: 1
b: 4
c: 3
d: 2
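The sorted output is what one would expect if the mapping is stored in a key-ordered `Data.Map`, which is an assumption here about the underlying representation. The sketch below shows the effect in isolation with the containers package: the association list keeps insertion order, the map does not.

```haskell
import qualified Data.Map as Map

-- The fields in the order they were written in the encode call.
fields :: [(String, Int)]
fields = [("a", 1), ("d", 2), ("c", 3), ("b", 4)]

main :: IO ()
main = do
  print (map fst fields)                 -- insertion order: a, d, c, b
  print (Map.keys (Map.fromList fields)) -- key order: a, b, c, d
```

Preserving the written order would therefore need an order-preserving container, or a separate record of the key order, somewhere between `mapping` and the event writer.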

Consider standard Haskell license, or document the reasons for divergence

HsYAML is particularly interesting to me, as it compiles with GHCJS, while the normal yaml does not. That would be useful in HLint, for a web version. However, I strive to avoid GPL dependencies where possible, since in my experience they reduce adoption and don't increase contributions. My current thought is to have an explicit GHCJS Cabal flag + preprocessor flag, and do a compile-time switch on YAML library. That's a bit grim, and I hate writing grim code for legal reasons.

If you would consider moving to the more usual Haskell licenses of BSD/MIT/Apache that would be great - but it's your code, so entirely your choice. If not, perhaps put in the README that the project is deliberately GPL (including the reason, if you feel comfortable sharing) so that people know not to ask?

The HsYAML family needs (an) active maintainer(s)

If I understood Herbert (@hvr) correctly, he will not be available for maintenance work in the near future.
I stepped in to keep stuff buildable, but I have no genuine interest in the HsYAML family (at least not so far). I wouldn't do more than keep the package afloat on the Haskell ecosystem and merge bugfixes.

There seem to be a couple of active developers, though, who want to hang on and see improvements. If you are one of them, maybe this is your call to step forward and volunteer as maintainer?

Applications in the comments or via email to me. (@andreasabel)

CC: @jgm @mightybyte @TristanCacqueray @vaibhavsagar @vijayphoenix

Puzzling build failure

I've switched to using HsYAML in pandoc, but my Travis build has a mysterious failure for ghc 8.4.1.

Building library for HsYAML-0.1.1.1..
[1 of 7] Compiling Util             ( src/Util.hs, dist/build/Util.o )
src/Util.hs:12:7: error: Not in scope: ‘liftEither’
   |
12 |     ( liftEither
   |       ^^^^^^^^^^

Other ghc versions work fine. As far as I can see, HsYAML should work for both newer and older versions of mtl. So I don't understand where this error could be coming from. Any thoughts?
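For context, `liftEither` only exists in newer mtl releases (it was added in mtl 2.2.2), so code that must build against older versions usually carries a local fallback. The sketch below shows such a fallback under a different name, `liftEitherCompat`, chosen here to avoid clashing with the real export; it is not the definition used in HsYAML's Util module.

```haskell
import Control.Monad.Except (MonadError, throwError)

-- Local equivalent of liftEither for mtl versions that do not export it.
liftEitherCompat :: MonadError e m => Either e a -> m a
liftEitherCompat = either throwError return

main :: IO ()
main = do
  print (liftEitherCompat (Right 1) :: Either String Int)
  print (liftEitherCompat (Left "boom") :: Either String Int)
```

A build failure like the one above would arise if the conditional that selects between the import and the fallback picks the import while the installed mtl lacks the function.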

Performance could be better

One pandoc user has run into an issue with a large (100k line) bibliography in YAML format (for details see jgm/pandoc#6084). Prior to pandoc 2.8 (when we used the yaml package), this was handled fairly quickly, but now that we use HsYAML it takes 18 seconds to read the bibliography. I confirmed that the slowdown is due to HsYAML, by loading the file in a GHCI session as b and trying

GHCI> :set +s
GHCI> let x = decodeNode b in x `seq` 3 -- this is just to ensure it's evaluated
(25.28 secs, 82,135,579,376 bytes)

What are the performance expectations for HsYAML? Have you made efforts to optimize here? aeson claimed decoding speeds of 46M/sec on a slower machine than mine; this file is 3M. I wouldn't expect that YAML parsing could be as fast as JSON parsing, but it would be nice to get in the 4M/sec range (10x slower than aeson).

EDIT: 82G allocated with 1G max residency seems an awful lot to parse a 3M file!

Profiling reports these as the biggest cost centers:

applyParser                  Data.YAML.Token src/Data/YAML/Token.hs:220:1-30             32.5    0.4
*>.\                         Data.YAML.Token src/Data/YAML/Token.hs:(435,5)-(439,67)      9.4   25.3
^.                           Data.YAML.Token src/Data/YAML/Token.hs:73:1-30               9.3    0.8
&                            Data.YAML.Token src/Data/YAML/Token.hs:567:1-44              4.3    4.0
<|>.decideParser             Data.YAML.Token src/Data/YAML/Token.hs:(599,7)-(609,95)      3.9    5.2
nextIf.consumeNextIf         Data.YAML.Token src/Data/YAML/Token.hs:(791,5)-(817,52)      3.1    0.4
prefixErrorWith.\            Data.YAML.Token src/Data/YAML/Token.hs:(913,5)-(917,95)      2.7    7.8
prefixErrorWith.\.reply      Data.YAML.Token src/Data/YAML/Token.hs:913:9-49              2.4    0.0
append                       Data.DList      src/Data/DList.hs:34:1-46                    1.8    2.7
/                            Data.YAML.Token src/Data/YAML/Token.hs:572:1-68              1.8    0.1
reject.\                     Data.YAML.Token src/Data/YAML/Token.hs:673:5-67              1.7    8.1
*>                           Data.YAML.Token src/Data/YAML/Token.hs:(434,3)-(439,67)      1.6    0.0
returnReply                  Data.YAML.Token src/Data/YAML/Token.hs:(387,1)-(390,52)      1.5    7.9

Heap profiling shows that the DLists account for a lot of the allocation.

test-case X38W

https://matrix.yaml.io/details/X38W.html doesn't pass as of HsYAML-0.1.2.0

{ &a [a, &b b]: *b, *a : [c, *b, d]}

fails on the first [

  ""  BeginDocument
  ""  BeginNode
  ""  BeginMapping
  "{"  Indicator
   " "  White
    ""  BeginPair
    ""  BeginNode
    ""  BeginProperties
    ""  BeginAnchor
    "&"  Indicator
     "a"  Meta
      ""  EndAnchor
      ""  EndProperties
      ""  BeginScalar
      ""  EndScalar
      ""  EndNode
      ""  BeginNode
      ""  BeginScalar
      ""  EndScalar
      ""  EndNode
      ""  EndPair
      " "  White
       ""  EndMapping
       ""  EndNode
       "Unexpected '['"  Error
       "[a, &b b]: *b, *a : [c, *b, d]}"  Unparsed
                                      "\n"  Unparsed
  ""  EndDocument

needs investigation
