jsoneventparser's Introduction

JSON Event Parser - a pure Python, event-based JSON parser.

Copyright (C) 2021 J. Férard https://github.com/jferard

License: GPLv3

Summary

JSON Event Parser is a toy project written to convert JSON files to XML without loading the whole JSON object into memory. It relies on three elements:

  • a straightforward lexer, based on RFC 8259;
  • a straightforward parser, based on the lexer above and the same RFC 8259;
  • a simple converter from JSON to XML.

Each of these elements is a generator which can be used in a simple for loop. The lexer provides tokens, the parser provides other tokens, and the converter provides lines of an XML file.
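
The chained-generator design described above can be sketched with three toy stages. This is only an illustration: toy_lexer, toy_parser and toy_converter are hypothetical names, not the project's API, and they handle nothing more than a flat JSON array of integers and escape-free strings.

```python
import io
import re
from xml.sax.saxutils import escape

# Hypothetical toy grammar: brackets, commas, integers, strings without escapes.
TOKEN_RE = re.compile(r'\s*(\[|\]|,|-?\d+|"[^"\\]*")')


def toy_lexer(source):
    """Yield raw lexemes, reading the source one line at a time."""
    for line in source:
        line = line.rstrip()
        pos = 0
        while pos < len(line):
            match = TOKEN_RE.match(line, pos)
            if match is None:
                raise ValueError(f"unexpected input: {line[pos:]!r}")
            yield match.group(1)
            pos = match.end()


def toy_parser(lexemes):
    """Turn lexemes into (event, value) pairs; commas are dropped."""
    for lexeme in lexemes:
        if lexeme == "[":
            yield "start_array", None
        elif lexeme == "]":
            yield "end_array", None
        elif lexeme == ",":
            continue
        elif lexeme.startswith('"'):
            yield "string", lexeme[1:-1]
        else:
            yield "number", int(lexeme)


def toy_converter(events, root="root", list_item="li"):
    """Yield one line of XML per event."""
    for event, value in events:
        if event == "start_array":
            yield f"<{root}>"
        elif event == "end_array":
            yield f"</{root}>"
        else:
            yield f"<{list_item}>{escape(str(value))}</{list_item}>"


# Each stage pulls from the previous one, so only one line of input
# and one event are in flight at any time.
source = io.StringIO('[1, "two",\n 3]')
xml_lines = []
for xml_line in toy_converter(toy_parser(toy_lexer(source))):
    xml_lines.append(xml_line)
    print(xml_line)
```

Because every stage is lazy, nothing is read from the source until the final for loop asks for the first line of XML.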

JSON Event Parser is slow, but it has no dependencies.

If you want a full-featured iterative parser, have a look at ijson, which relies on YAJL (a pure Python backend is also available).

Usage

Command line

  usage: json_event_parser.py [-h] [-hd HEADER] [-r ROOT] [-li LIST_ITEM] [-t]
                              [-f]
                              [infile] [outfile]
  
  Convert an JSON file to an XML file.
  
  positional arguments:
    infile                a JSON file to convert
    outfile               the output file
  
  optional arguments:
    -h, --help            show this help message and exit
    -hd HEADER, --header HEADER
                          the header line
    -r ROOT, --root ROOT  the root tag
    -li LIST_ITEM, --list-item LIST_ITEM
                          the list item tag (default is <li> as in HTML
    -t, --typed           tags are typed
    -f, --formatted       format the XML (use with caution: huge files may be
                          generated because of spaces)

As a library

Parse a JSON file

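# Module name taken from the script file, json_event_parser.py.
from json_event_parser import JSONParser

with open("path/to/json/file", "r", encoding="utf-8") as source:
    for token, value in JSONParser(source):
        print(token, value)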

Print the XML counterpart of a JSON file

from json_event_parser import JSONAsXML

with open("path/to/json/file", "r", encoding="utf-8") as source:
    print("\n".join(JSONAsXML(source, typed=True)))

Tests

$ python3.8 -m pytest --cov-report term-missing --cov=json_event_parser \
  && python3.8 -m pytest --cov-report term-missing --cov-append --doctest-modules json_event_parser.py --cov=json_event_parser \
  && flake8 json_event_parser.py

jsoneventparser's Issues

Handle unicode surrogates

If code_point is between 0xD800 and 0xDBFF, it is a high surrogate. We should not call chr on it, but set high_code_point = code_point and code_point = 0 instead. When index is 8, use the formula: 0x10000 + (high_code_point - 0xD800) * 0x400 + (code_point - 0xDC00), where code_point should be a low surrogate, that is between 0xDC00 and 0xDFFF.
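
The formula in this issue could be sketched as a standalone helper. The function name combine_surrogates is hypothetical and not part of the project. The range checks follow the bounds given above.

```python
def combine_surrogates(high_code_point: int, code_point: int) -> str:
    """Combine a UTF-16 surrogate pair into the single character it encodes."""
    if not 0xD800 <= high_code_point <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high_code_point:#06x}")
    if not 0xDC00 <= code_point <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {code_point:#06x}")
    return chr(
        0x10000 + (high_code_point - 0xD800) * 0x400 + (code_point - 0xDC00)
    )


# The JSON escape sequence "\ud83d\ude00" is the pair (0xD83D, 0xDE00).
smiley = combine_surrogates(0xD83D, 0xDE00)
print(hex(ord(smiley)))
```

For this pair the arithmetic is 0x10000 + 0x3D * 0x400 + 0x200 = 0x1F600.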