pydsdl's Introduction

PyDSDL

PyDSDL is a Cyphal DSDL compiler front-end implemented in Python. It accepts a DSDL namespace at the input and produces a well-annotated abstract syntax tree (AST) at the output, evaluating all constant expressions in the process. All DSDL features defined in the Cyphal Specification are supported. The library should, in theory, work on any platform and with any Python implementation.

Read the docs at pydsdl.readthedocs.io.

import sys
import pydsdl
try:
    types = pydsdl.read_namespace(target_directory, lookup_directories)
except pydsdl.InvalidDefinitionError as ex:
    print(f'{ex.path}:{ex.line}: Invalid DSDL: {ex.text}', file=sys.stderr)
    sys.exit(1)
else:
    for t in types:
        print(t)  # Process the type -- generate code, analyze, etc.

pydsdl's People

Contributors

aasmune, bbworld1, clyde-johnston, coderkalyan, pavel-kirienko, thirtytwobits

pydsdl's Issues

Combinatorial explosion

The following benign-looking definition is parsed very quickly (some milliseconds), but any attempt to determine its bit length set (e.g., using @assert _offset_ <...> or to generate code) takes a very, very long time (possibly hours):

uavcan.register.Value.0.1[<=10] b

The referenced type uavcan.register.Value.0.1 has a very complex layout (thousands of bit length options), so some delay is expected. However, a delay of more than several seconds is highly undesirable, and anything over ~20 seconds is unacceptable. What makes things worse is that most of the computation is performed in the native context, so the user can't interrupt it with SIGINT/CTRL+C.

The bit length set computation logic should be reviewed some day; it should be highly optimizable.

The problem does not affect any of the standard definitions.
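
To get a feel for how quickly the work can grow, here is a rough model. It is an assumption made for illustration, not a description of PyDSDL's actual algorithm: if the element type has n distinct bit lengths, a naive enumeration for a variable-length array of up to k elements visits every multiset of element lengths. The helper naive_combination_count is hypothetical.

```python
import math


def naive_combination_count(n_element_lengths: int, max_elements: int) -> int:
    """Count the multisets of element lengths for arrays holding 0..max_elements items."""
    return sum(
        math.comb(n_element_lengths + i - 1, i)  # Multisets of size i drawn from n options.
        for i in range(max_elements + 1)
    )


print(naive_combination_count(2, 2))      # 1 + 2 + 3 = 6
print(naive_combination_count(1000, 10))  # Far too many to enumerate one by one.
```

Under this model, an element type with thousands of length options and a capacity of 10 is already out of reach for brute force, which is consistent with the delay reported above.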

emit namespaces as a type

Define a Namespace class and provide it with the parsed output. Namespaces should contain a full name, short name, path, list of types contained within, a list of nested namespaces, and namespace documentation (see #21).
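
A minimal sketch of what such a class might look like. Every attribute name here is hypothetical, chosen only to mirror the list above; none of this is an existing pydsdl API.

```python
import dataclasses
import pathlib
import typing


@dataclasses.dataclass
class Namespace:
    """Hypothetical container for one parsed DSDL namespace."""

    full_name: str                                  # E.g., "uavcan.node"
    path: pathlib.Path                              # Directory the namespace was read from.
    types: typing.List[typing.Any] = dataclasses.field(default_factory=list)
    nested: typing.List["Namespace"] = dataclasses.field(default_factory=list)
    doc: str = ""                                   # Namespace documentation, if any.

    @property
    def short_name(self) -> str:
        # The short name is the last component of the full name.
        return self.full_name.split(".")[-1]


root = Namespace("uavcan", pathlib.Path("uavcan"))
root.nested.append(Namespace("uavcan.node", pathlib.Path("uavcan/node")))
print(root.nested[0].short_name)  # node
```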

pydsdl appears to be broken by python 3.10

When importing pydsdl using python 3.10 I get:

ImportError while loading conftest '/Volumes/workspace/github/thirtytwobits/nunavut/conftest.py'.
conftest.py:19: in <module>
    import pydsdl
.tox/local/lib/python3.10/site-packages/pydsdl/__init__.py:28: in <module>
    from ._namespace import read_namespace as read_namespace
.tox/local/lib/python3.10/site-packages/pydsdl/_namespace.py:13: in <module>
    from . import _dsdl_definition
.tox/local/lib/python3.10/site-packages/pydsdl/_dsdl_definition.py:11: in <module>
    from . import _parser
.tox/local/lib/python3.10/site-packages/pydsdl/_parser.py:11: in <module>
    import parsimonious
.tox/local/lib/python3.10/site-packages/pydsdl/third_party/parsimonious/__init__.py:9: in <module>
    from parsimonious.grammar import Grammar, TokenGrammar
.tox/local/lib/python3.10/site-packages/pydsdl/third_party/parsimonious/grammar.py:14: in <module>
    from parsimonious.expressions import (Literal, Regex, Sequence, OneOf,
.tox/local/lib/python3.10/site-packages/pydsdl/third_party/parsimonious/expressions.py:13: in <module>
    from six.moves import range
E   ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()

dependency cleanup

  1. Per erikrose/parsimonious#231, provide a way to avoid the regex package dependency, and then update parsimonious to the latest version.
  2. At the same time, remove six.py, since the latest upstream no longer needs it.

Symlinked namespace directories are mishandled

Reported as:

yakut compile https://github.com/UAVCAN/public_regulated_data_types/archive/refs/heads/master.zip
Traceback (most recent call last):
  <...>
  File "<...>/pydsdl/_namespace.py", line 422, in _ensure_no_namespace_name_collisions
    raise RootNamespaceNameCollisionError("The name of this namespace conflicts with %r" % b, path=a)
pydsdl._namespace.RootNamespaceNameCollisionError: /private/var/folders/z9/y477y8hd1gn8ss5__l_x70t00000gn/T/yakut-dsdl-fg59acm6/public_regulated_data_types-master/reg: The name of this namespace conflicts with '/var/folders/z9/y477y8hd1gn8ss5__l_x70t00000gn/T/yakut-dsdl-fg59acm6/public_regulated_data_types-master/reg'

Notice the difference: /private/var/ <--> /var/. This issue appears to affect only macOS users.

The fix should be to use os.path.realpath along with abspath here:

https://github.com/UAVCAN/pydsdl/blob/7d93f00158d2145a2997c37fc16710ae291220a6/pydsdl/_namespace.py#L135-L137

@coderkalyan can you look into this, please?
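
The difference between the two functions can be reproduced with a temporary symlink. This is only a sketch of the idea, not the actual patch; the helper name canonical is made up for the example.

```python
import os
import tempfile


def canonical(path: str) -> str:
    """Absolute path with every symlink resolved."""
    return os.path.realpath(os.path.abspath(path))


with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "real")
    os.mkdir(real)
    link = os.path.join(tmp, "link")
    os.symlink(real, link)  # Plays the role of /var -> /private/var on macOS.

    print(os.path.abspath(link) == os.path.abspath(real))  # False: abspath keeps the symlink.
    print(canonical(link) == canonical(real))              # True: both name the same directory.
```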

Provide comments as data in output.

If we treat comments as data (like Python's __doc__) and provide them as part of the pydsdl output, it will be trivial to use pydsdlgen as an autodoc system for DSDL.

Python 3.11 support

TL;DR: update parsimonious to work around the removal of getargspec from the inspect module.
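
For context: inspect.getargspec was removed in Python 3.11, and inspect.getfullargspec is the standard-library replacement. The sketch below only illustrates that replacement; it does not show how parsimonious itself was changed, and positional_arg_names is a hypothetical helper.

```python
import inspect


def positional_arg_names(func):
    """Names of the positional parameters, obtained without inspect.getargspec."""
    return inspect.getfullargspec(func).args


def example(a, b, *rest, key=None):
    return a, b, rest, key


print(positional_arg_names(example))   # ['a', 'b']
print(hasattr(inspect, "getargspec"))  # False on Python 3.11 and newer.
```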

Fix Codacy integration

The Codacy analysis stays in the pending state forever. I might have damaged its integration when migrating from Travis to AppVeyor.

It might be sensible to switch from Codacy to PyLint, though, for simplicity.

CoarseOrientation has only one namespace_component, which causes test_namespace.py to crash; no parsing takes place.

From dsdl_definition.py, lines 44-55:

        relative_directory, basename = [str(x) for x in os.path.split(relative_path)]   # type: str, str
        print (basename)
        # Parsing the basename, e.g., 434.GetTransportStatistics.0.1.uavcan
        basename_components = basename.split('.')[:-1]
        print (len(basename_components))
        str_regulated_port_id = None    # type: typing.Optional[str]
        if len(basename_components) == 4:
            str_regulated_port_id, short_name, str_major_version, str_minor_version = basename_components
        elif len(basename_components) == 3:
            short_name, str_major_version, str_minor_version = basename_components
        else:
            raise FileNameFormatError('Invalid file name', path=self._file_path)
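
The same logic can be exercised in isolation to see which file names it accepts. This is a standalone sketch: ValueError stands in for FileNameFormatError, parse_basename is a hypothetical helper, and the versionless name "CoarseOrientation.uavcan" is an assumed example of the kind of file that triggers the error.

```python
import typing


def parse_basename(basename: str) -> typing.Tuple[typing.Optional[str], str, str, str]:
    """Split '[port_id.]ShortName.major.minor.ext' into its components."""
    components = basename.split(".")[:-1]  # Drop the file extension.
    if len(components) == 4:
        port_id, short_name, major, minor = components
        return port_id, short_name, major, minor
    if len(components) == 3:
        short_name, major, minor = components
        return None, short_name, major, minor
    # ValueError stands in for pydsdl's FileNameFormatError.
    raise ValueError(f"Invalid file name: {basename!r}")


print(parse_basename("434.GetTransportStatistics.0.1.uavcan"))
print(parse_basename("Note.0.1.uavcan"))
try:
    parse_basename("CoarseOrientation.uavcan")  # Only one component before the extension.
except ValueError as ex:
    print(ex)
```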

Support mypy

Your weird Python import scheme in this project breaks any ability to use mypy when you depend on pydsdl. Being unable to verify type safety seems like a pretty big miss for a project that is dedicated to data types.

Use of pathlib, os.path, and manual path transformation is brittle

We mix our use of pathlib, os.path, and split/join manipulation of paths as strings and lists of strings. This causes brittleness when edge cases are encountered or when supporting different filesystems like Windows.

This task is to clean up all path handling to use pathlib consistently, use os.path only when specifically necessary, and to limit transformations between Path objects and strings.

Also keep an eye on the difference between PurePath and concrete paths. Use of Path.resolve or Path.absolute should be limited to concrete paths, since the path names are first of all abstractions for the DSDL namespaces and are not necessarily compatible with canonical names in a filesystem. In unit tests, it is important to distinguish between valid filenames and valid DSDL tree structures. For example, assert expected_file.samefile(actual_file) is correct regardless of canonical names, whereas assert expected_path == actual_path is only true for pure, relative Path objects that have not been canonicalized.


Examples include:

os.path.join(
    os.path.split(self._root_namespace_path)[-1],
    os.path.relpath(self._file_path, self._root_namespace_path),
)

should be

self._root_namespace_path.name / self._file_path.relative_to(self._root_namespace_path)

this...

relative_directory, basename = [str(x) for x in os.path.split(relative_path)]

should be

relative_directory = str(relative_path.parent)
basename = str(relative_path.name)

In _test.py:

assert d.file_path == Path(_DIRECTORY.name, "uavcan", "test", "5000.Message.1.2.dsdl").resolve()

should be:

assert d.file_path.samefile(Path(_DIRECTORY.name, "uavcan", "test", "5000.Message.1.2.dsdl"))

etc
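
The samefile-versus-equality distinction described above can be checked with a throwaway directory. A small sketch; the helper name compare and the "sub" directory are made up for the example.

```python
import pathlib
import tempfile
import typing


def compare(root: pathlib.Path) -> typing.Tuple[bool, bool]:
    """Return (paths compare equal, paths name the same file) for two spellings of one file."""
    (root / "sub").mkdir()
    actual = root / "5000.Message.1.2.dsdl"
    actual.touch()
    # pathlib keeps ".." components, so this spelling differs from `actual`.
    expected = root / "sub" / ".." / "5000.Message.1.2.dsdl"
    return expected == actual, expected.samefile(actual)


with tempfile.TemporaryDirectory() as tmp:
    print(compare(pathlib.Path(tmp)))  # (False, True)
```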

Support symlinked namespaces

Create bug.py:

import pydsdl
import pathlib
import sys

target_directory = pathlib.Path("reg")
lookup_directories = pathlib.Path("uavcan")

try:
    types = pydsdl.read_namespace(target_directory, lookup_directories)
except pydsdl.InvalidDefinitionError as ex:
    print(f'{ex.path}:{ex.line}: Invalid DSDL: {ex.text}', file=sys.stderr)
    exit(1)
else:
    for t in types:
        print(t)  # Process the type -- generate code, analyze, etc.

Create symlinks to the top-level types:

mkdir reg
cd reg
mkdir udral
cd udral
ln -s ../../../public_regulated_data_types/reg/udral/physics physics
ln -s ../../../public_regulated_data_types/reg/udral/service service

Then run:

python bug.py

Expected: the DSDL is parsed and printed. Actual:

/workspace/github/thirtytwobits/public_regulated_data_types/reg/udral/physics/acoustics/Note.0.1.uavcan:None: Invalid DSDL: Invalid name for namespace component: '..'
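
One plausible explanation, stated here as an assumption because the report does not identify the cause: the definition's path is resolved through the symlink while the root namespace path is not, so the relative path between them climbs out of the root and yields '..' components. The sketch below reproduces that effect with the standard library only; relative_components is a hypothetical helper, and the "work" directory is made up for the example.

```python
import os
import tempfile


def relative_components(file_path: str, root: str) -> list:
    """Path components of file_path relative to root."""
    return os.path.relpath(file_path, root).split(os.sep)


with tempfile.TemporaryDirectory() as tmp:
    tmp = os.path.realpath(tmp)  # Avoid surprises if the temp directory is itself a symlink.
    real = os.path.join(tmp, "public_regulated_data_types", "reg", "udral", "physics")
    os.makedirs(real)
    root = os.path.join(tmp, "work", "reg")
    os.makedirs(os.path.join(root, "udral"))
    link = os.path.join(root, "udral", "physics")
    os.symlink(real, link)
    definition = os.path.join(link, "Note.0.1.uavcan")

    # Unresolved path: stays inside the root namespace.
    print(relative_components(definition, root))
    # Resolved path: the first component is '..'.
    print(relative_components(os.path.realpath(definition), root))
```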

tagged unions are modeled wrong in the AST

Unions in the AST are modeled as if they were structs, which confounds symmetrical parsing logic: each field is treated as a data member instead of an alternate representation of a single data member. Tagged unions should become structure types containing a single field that is an anonymous-union compound type. This compound type's fields would be the alternate representations of that field.

Proposed:

[image: diagram of the proposed AST model]

Unexpected exception if the namespace is empty

https://forum.uavcan.org/t/viper-quadcopter/816/15?u=pavel.kirienko

2020-05-27 07:38:36,776  5548 INFO     pyuavcan._cli._main: Unhandled exception: list index out of range
Traceback (most recent call last):
  File "/home/alex/.local/lib/python3.8/site-packages/pyuavcan/_cli/_main.py", line 24, in main
    exit(_main_impl())
  File "/home/alex/.local/lib/python3.8/site-packages/pyuavcan/_cli/_main.py", line 50, in _main_impl
    result = args.func(args)
  File "/home/alex/.local/lib/python3.8/site-packages/pyuavcan/_cli/_main.py", line 149, in execute
    return cmd.execute(args, subsystems)
  File "/home/alex/.local/lib/python3.8/site-packages/pyuavcan/_cli/commands/dsdl_generate_packages.py", line 118, in execute
    gpi_list = self._generate_dsdl_packages(source_root_namespace_dirs=inputs,
  File "/home/alex/.local/lib/python3.8/site-packages/pyuavcan/_cli/commands/dsdl_generate_packages.py", line 186, in _generate_dsdl_packages
    gpi = pyuavcan.dsdl.generate_package(root_namespace_directory=ns,
  File "/home/alex/.local/lib/python3.8/site-packages/pyuavcan/dsdl/_compiler.py", line 179, in generate_package
    composite_types = pydsdl.read_namespace(root_namespace_directory=str(root_namespace_directory),
  File "/home/alex/.local/lib/python3.8/site-packages/pydsdl/_namespace.py", line 161, in read_namespace
    list(set(map(lambda t: t.root_namespace, target_dsdl_definitions)))[0],
IndexError: list index out of range
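
The failing expression takes element [0] of a list built from an empty set. A guard along the following lines would avoid the IndexError. This is a hypothetical helper, not the actual fix; whether an empty namespace should yield None, an empty result, or a dedicated error is a design decision left open here.

```python
import typing


def single_root_namespace(root_namespaces: typing.Iterable[str]) -> typing.Optional[str]:
    """Return the one root namespace shared by all definitions, or None if there are none."""
    unique = set(root_namespaces)
    if not unique:
        return None  # Empty namespace: nothing was found, so there is nothing to index.
    if len(unique) > 1:
        raise ValueError(f"Expected one root namespace, found {sorted(unique)}")
    return next(iter(unique))


print(single_root_namespace([]))                    # None
print(single_root_namespace(["uavcan", "uavcan"]))  # uavcan
```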

Support multi-package builds

When building complex software where several packages contribute DSDL types, the simplified check _ensure_no_namespace_name_collisions gets in the way, because different types may be contributed to a single namespace from different folder structures.

We should fix this check to only be concerned with conflicting types within namespaces and not the namespace folders themselves.
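
A sketch of a type-level check, under the simplifying assumption that a conflict means the same full name and version being defined in more than one file. The real rules are likely richer than this; find_conflicting_types and the package paths are hypothetical.

```python
import collections
import typing

TypeKey = typing.Tuple[str, int, int]  # (full name, major version, minor version)


def find_conflicting_types(
    definitions: typing.Iterable[typing.Tuple[TypeKey, str]],
) -> typing.Dict[TypeKey, typing.List[str]]:
    """Map each type defined in more than one file to the files that define it."""
    files_by_type = collections.defaultdict(list)
    for key, path in definitions:
        files_by_type[key].append(path)
    return {key: paths for key, paths in files_by_type.items() if len(paths) > 1}


definitions = [
    # Two packages contribute different types to the same root namespace: not a conflict.
    (("reg.udral.physics.acoustics.Note", 0, 1), "pkg_a/reg/udral/physics/acoustics/Note.0.1.dsdl"),
    (("reg.custom.Thing", 1, 0), "pkg_b/reg/custom/Thing.1.0.dsdl"),
    # A third package redefines an existing type: a genuine conflict.
    (("reg.custom.Thing", 1, 0), "pkg_c/reg/custom/Thing.1.0.dsdl"),
]
print(find_conflicting_types(definitions))
```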

v1.0 API stabilization

As we are approaching v1.0, I would like @thirtytwobits and @aasmune to report here if they are aware of any usability issues in the current API. I am inclined to release v1.0 sometime next week.

Slow import

The package is slow to import. This is bad because it affects the performance of CLI tools that rely on it, among other things. Import profiling (set -x PYTHONPROFILEIMPORTTIME 1) shows that most of the time is spent importing pydsdl._parser:

import time:       363 |        363 |                   pydsdl._error
import time:       440 |        440 |                           numbers
import time:       799 |       1239 |                         _decimal
import time:       189 |       1427 |                       decimal
import time:       237 |        237 |                       math
import time:      1711 |       3374 |                     fractions
import time:       199 |        199 |                     unicodedata
import time:       490 |       4062 |                   pydsdl._expression._primitive
import time:       377 |       4802 |                 pydsdl._expression._any
import time:       531 |        531 |                   pydsdl._expression._operator
import time:       581 |       1111 |                 pydsdl._expression._container
import time:       261 |       6173 |               pydsdl._expression
import time:       448 |        448 |               pydsdl._bit_length_set
import time:       342 |       6961 |             pydsdl._serializable._serializable
import time:       694 |        694 |             pydsdl._serializable._primitive
import time:       264 |        264 |             pydsdl._serializable._void
import time:       405 |        405 |             pydsdl._serializable._array
import time:       149 |        149 |               pydsdl._port_id_ranges
import time:       186 |        186 |                 pydsdl._serializable._name
import time:       327 |        512 |               pydsdl._serializable._attribute
import time:      1055 |       1716 |             pydsdl._serializable._composite
import time:       260 |      10297 |           pydsdl._serializable
import time:       212 |        212 |                     __future__
import time:      1397 |       1608 |                   six
import time:        85 |         85 |                       _ast
import time:       282 |        366 |                     ast
import time:       191 |        556 |                   parsimonious.utils
import time:       321 |       2485 |                 parsimonious.exceptions
import time:        47 |         47 |                     six.moves
import time:       387 |        387 |                     parsimonious.nodes
import time:       468 |        901 |                   parsimonious.expressions
import time:     15945 |      16845 |                 parsimonious.grammar
import time:       194 |      19524 |               parsimonious
import time:     37296 |      56819 |             pydsdl._parser
import time:       409 |      57228 |           pydsdl._dsdl_definition
import time:       805 |      71175 |         pydsdl._namespace
import time:       436 |      71610 |       pydsdl

Without further analysis I predict that the culprit is right here:

https://github.com/UAVCAN/pydsdl/blob/cb35ad3e8c0886d44facdfac14489b9ba7999d44/pydsdl/_parser.py#L140-L142

We parse the grammar definition at the time of package initialization. This is not great. Consider implementing lazy initialization:

@functools.lru_cache(None)
def _get_grammar() -> parsimonious.Grammar:
    ...

Regular definitions take too long to process

This is not a duplicate of #23. It is known that the addition of delimited serialization made things slower, but I did not expect the new version to stretch the PyUAVCAN test suite from about 14 minutes to about an hour. I guess I will have to whip out the profiler now?

Implement a spec-adhering parser backend

Currently, PyDSDL uses regular expressions to parse definitions, and eval() to process initialization expressions. This approach is simple and works acceptably, but the resulting parser is also capable of properly processing definitions that are not well-formed according to the specification; this is undesirable for a reference implementation. Therefore, a stricter parser is needed.

Option for skipping assertion checks

This will be useful for quick compilation. Offset checks may involve extensive computations which slow parsing down considerably. My 4 GHz workstation needs over one second to compile 91 standard data types.

Automatically detect zero-cost types

On a little-endian IEEE 754-compliant machine the following definition (comments removed) can be (de-)serialized in C/C++ using memcpy (possibly DMA-assisted). In Python, serialization can be delegated to the standard struct module.

truncated uint64 unique_id

uavcan.si.unit.mass.Scalar.1.0 mass

uavcan.si.unit.electric_charge.Scalar.1.0 design_capacity

uavcan.si.unit.voltage.Scalar.1.0[2] design_cell_voltage_min_max

uavcan.si.unit.electric_current.Scalar.1.0 discharge_current
uavcan.si.unit.electric_current.Scalar.1.0 discharge_current_burst
uavcan.si.unit.electric_current.Scalar.1.0 charge_current
uavcan.si.unit.electric_current.Scalar.1.0 charge_current_fast
uavcan.si.unit.electric_current.Scalar.1.0 charge_termination_treshold
uavcan.si.unit.voltage.Scalar.1.0          charge_voltage

uint16 cycle_count

void8
uint8 series_cell_count

uint7 state_of_health_pct
void1

Technology.0.1 technology  # This is an enumeration

This is based on https://forum.uavcan.org/t/future-zero-cost-serialization-constraint/469, but this capability is just an implementation detail that does not require any support from the Specification.

PyDSDL should detect such zero-cost types automatically. Probably it makes sense to express the availability of zero-cost (de-)serialization as a function of platform properties:

  • Zero-cost (de-)serialization is not possible.
  • ZCS is possible for little-endian IEEE 754-compliant platforms with a conventional memory model.
  • ZCS is possible for little-endian platforms with a conventional memory model (when floating point fields are not present).
  • ZCS is possible for any platform with a conventional memory model (when the largest field is not larger than one byte).
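
As a small illustration of the struct-based approach mentioned above, the sketch below handles three adjacent fields from the definition (uint16 cycle_count, void8, uint8 series_cell_count) in isolation. It assumes they start on a byte boundary; the function names are hypothetical.

```python
import struct

# "<" = little-endian, no padding; H = uint16 cycle_count; x = void8; B = uint8 series_cell_count.
LAYOUT = struct.Struct("<HxB")


def serialize_tail(cycle_count: int, series_cell_count: int) -> bytes:
    """Pack the three fields in a single call, with no per-bit work."""
    return LAYOUT.pack(cycle_count, series_cell_count)


def deserialize_tail(data: bytes) -> tuple:
    """Unpack the fields; the void8 padding byte is skipped automatically."""
    return LAYOUT.unpack(data)


encoded = serialize_tail(5, 12)
print(encoded)                    # b'\x05\x00\x00\x0c'
print(deserialize_tail(encoded))  # (5, 12)
```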
