anmolitor / protoc-gen-elm
An Elm plugin for protoc
Home Page: https://www.npmjs.com/package/protoc-gen-elm
License: BSD 3-Clause "New" or "Revised" License
I think the way optional fields are generated is a bit strange. For example, with the following message definition:
message Foo {
optional uint32 n = 1;
}
instead of simply generating a record type with n : Maybe Int, a new type type Foo_N = Foo_N Int is created (or, in the 3.0.0 beta: type N = N Int), and the field in the message is n : Maybe Foo_N.
There might be a good reason for doing it like this, but right now I can't see it. From where I stand it seems like an unnecessary piece of complexity that makes the data more difficult to work with. Perhaps you could shed some light on this? Thank you.
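For context, here is roughly what working with the wrapped representation looks like in practice (a sketch; the helper name getN is made up for illustration):

```elm
-- generated (current behaviour)
type Foo_N
    = Foo_N Int

type alias Foo =
    { n : Maybe Foo_N }

-- reading the field requires unwrapping the extra constructor
getN : Foo -> Maybe Int
getN foo =
    Maybe.map (\(Foo_N value) -> value) foo.n
```

With a plain n : Maybe Int field, foo.n would suffice.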
.proto files can be documented by leaving comments in the file, which are picked up by protoc and exposed as SourceCodeInfo. We could use that info to move the comments to fitting places in the generated Elm code.
Example:
// A simple message
message Simple {
// a field containing a string
string field = 1;
}
{-| A simple message -}
type alias Simple = {
-- a field containing a string
field : String
}
Currently Int64 (and its variants) are not supported and generate Int32 encoders and decoders instead.
This is obviously not compatible with servers in other languages that use these data types.
I opened a PR in elm-protocol-buffers eriktim/elm-protocol-buffers#9 which adds encoders and decoders
for the Int64 and UInt64 data types (defined in non-official libraries).
The PR will probably change to handwritten implementations instead, but this code generator could generate the conversions to the library datatypes automatically. The downside would be additional dependencies you would need to add to your codebase.
The easiest way to implement this (and what I will probably do) would be to just use the elm-protocol-buffers representation (with no utility functions on it).
If you have an opinion on this, feel free to comment.
I have recently thought about our current solution to handle mutually recursive protobuf modules/namespaces.
The examples below all reference the following protobuf file
// file: test.proto
package test;
enum Outer { A = 0 }
message Scope {
enum Inner { B = 0 }
message InnerMsg {
Outer outer = 1;
Inner inner = 2;
}
InnerMsg msg = 1;
}
The original protoc-gen-elm would have generated one big module:
type Test__Outer = Test__Outer__A
type Test__Scope__Inner = Test__Scope__Inner__B
type alias Test__Scope__InnerMsg = { outer : Test__Outer , inner : Test__Scope__Inner }
type alias Test__Scope = { msg : Test__Scope__InnerMsg }
We moved from this one big module to one module per protobuf namespace, mostly because of the ugly names - pattern matching on Test__Scope__Inner__B is very verbose, for example. So the current implementation generates a module for the top-level namespace and a separate module for each inner namespace, for example the inside of message Scope.
// file: Test.elm
import Test.Scope
type Outer = A
type alias Scope = {
msg : Test.Scope.InnerMsg
}
// file: Test/Scope.elm
import Test
type Inner = B
type alias InnerMsg = {
outer : Test.Outer,
inner : Inner
}
Except this does not work, because the modules are mutually recursive and the Elm compiler does not allow that.
So we do something similar, but we keep an _Internals module as the one big file, just like before, since a single file does not have issues with mutual recursion.
This approach comes with the disadvantage of generating the same union type multiple times. In contrast to type aliases for records, we cannot rename the union type constructors. So we need to generate mapping functions between the union types, which is not really user friendly at all, since the user just gets a type error with no explanation of how to fix it.
To mitigate this I had the following idea:
What if we generated:
// file: Internals.elm
import Test.Unions
import Test.Scope.Unions
type alias Test__Outer = Test.Unions.Outer
type alias Test__Scope__Inner = Test.Scope.Unions.Inner
type alias Test__Scope__InnerMsg = { outer : Test__Outer , inner : Test__Scope__Inner }
type alias Test__Scope = { msg : Test__Scope__InnerMsg }
// file: Test/Scope.elm
import Internals
import Test.Scope.Unions
type alias Inner = Test.Scope.Unions.Inner
type alias InnerMsg = Internals.Test__Scope__InnerMsg
// file: Test.elm
import Internals
import Test.Unions
type alias Outer = Test.Unions.Outer
type alias Scope = Internals.Test__Scope
// file: Test/Scope/Unions.elm
type Inner = B
// file: Test/Unions.elm
type Outer = A
So far so good. Now, when you want to pattern match on one of the generated union types, you need to import the .Unions file of the corresponding file you are importing as well, which is likely fine. We might even generate one file per union type so you can do
import Test.Scope.Inner as Inner
import Test.Scope exposing (Inner)
x : Inner
x = Inner.B
An issue arises when oneof declarations enter the mix.
Consider the following proto file:
package order;
message Succeeded {
string id = 1;
string description = 2;
}
message Failed {
string error = 1;
}
message OrderStatus {
oneof status {
Succeeded succeeded = 1;
Failed failed = 2;
}
}
The oneof declaration generates a file Order.OrderStatus.Status which declares this union type:
type Status =
Succeeded Internals.Order__Succeeded
| Failed Internals.Order__Failed
and Internals, as usual, has another declaration of the same union type:
// Internals.elm
type Order__OrderStatus__Status =
Order__OrderStatus__Status__Succeeded Order__Succeeded
| Order__OrderStatus__Status__Failed Order__Failed
We would like to just write
// Internals.elm
type alias Order__OrderStatus__Status = Order.OrderStatus.Status.Status
but we cannot, since the module Order.OrderStatus.Status imports from Internals.
I propose the following solution:
// Order.OrderStatus.Status.elm
type Status a b =
Succeeded a
| Failed b
// Internals.elm
type alias Order__OrderStatus__Status = Order.OrderStatus.Status.Status Order__Succeeded Order__Failed
By introducing a generic parameter, we have effectively sidestepped the problem.
We can also generate
// Order.OrderStatus.elm
type alias Status = Order.OrderStatus.Status.Status Succeeded Failed
which should make the usage easier.
I have checked, and Elm does not seem to limit the amount of type parameters a type can have so this should scale without issues as well.
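To illustrate, pattern matching on the parameterized union would look like this (a sketch under the proposed scheme; handleStatus is made up for illustration):

```elm
import Order.OrderStatus.Status exposing (Status(..))

-- works for any concrete instantiation, e.g.
-- Status Internals.Order__Succeeded Internals.Order__Failed
handleStatus : Status succeeded failed -> String
handleStatus status =
    case status of
        Succeeded _ ->
            "order succeeded"

        Failed _ ->
            "order failed"
```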
I'm getting name clashes in certain situations with nested messages and oneof. Consider the following message structure, which is a relatively common pattern in some of my protobuf definitions:
message OrderStatus {
message Processing {}
message Succeeded {}
message Failed {
string error_text = 1;
}
oneof status {
Processing processing = 1;
Succeeded succeeded = 2;
Failed failed = 3;
}
}
This will generate record types for each nested message, named with the outer message name as a prefix as a form of namespacing (type alias OrderStatus_Failed = { errorText : String } etc.).
It will also generate a custom type for status, with variant names also prefixed with the outer message name (type OrderStatus_Status = OrderStatus_Failed OrderStatus_Failed | ...).
The problem is that custom types in Elm don't create a namespace; all variants are declared at module level. So the definition above creates value constructors with identical names in the same module. Compare this to Rust (prost), where each enum variant is scoped to its declaring enum and so can be disambiguated from a struct in the same module that has the same name as the enum variant.
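To make the clash concrete: in Elm, a record type alias also introduces a constructor function with the same name, so the generated module ends up with two module-level values named OrderStatus_Failed (a sketch of the problematic output, not verbatim generated code):

```elm
type alias OrderStatus_Processing =
    {}

type alias OrderStatus_Succeeded =
    {}

-- this alias also introduces a constructor function
-- OrderStatus_Failed : String -> OrderStatus_Failed
type alias OrderStatus_Failed =
    { errorText : String }

-- the variant OrderStatus_Failed collides with that constructor
-- at module level, so this module does not compile
type OrderStatus_Status
    = OrderStatus_Processing OrderStatus_Processing
    | OrderStatus_Succeeded OrderStatus_Succeeded
    | OrderStatus_Failed OrderStatus_Failed
```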
I don't know what would be the best way to address this. I see a couple of possibilities:
- prefix the variant names with the full type name (type OrderStatus_Status = OrderStatus_Status_Failed OrderStatus_Failed | ...)
- generate a separate module for oneofs
I don't know if any of these will guarantee no name clashes, and either way it's going to be a breaking change. I can work around this for now by giving different names to the variants, but it would be interesting to see if we could find a foolproof way to avoid name clashes like this.
It seems like no matter what integer type I specify in my proto files, the generated code will always try to encode/decode values as if they had the type int32. This is problematic, as decoding/encoding might end up yielding the wrong result. This seems to be especially true for sint32 (which is how I stumbled upon this).
This seems like an oversight, as the readme implies that using all 32-bit integer types should be supported, and elm-protocol-buffers does have encoders and decoders defined for all 32-bit integer types.
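To illustrate the failure mode: sint32 uses ZigZag encoding on the wire (-1 is encoded as the varint 1), so decoding a sint32 field with an int32 decoder silently yields a different number. A sketch using elm-protocol-buffers directly (assuming primitive encoders/decoders can be round-tripped standalone, outside a message):

```elm
import Protobuf.Decode as Decode
import Protobuf.Encode as Encode

-- encode -1 with the sint32 encoder (ZigZag), then decode the
-- resulting bytes with the int32 decoder: the value comes back
-- wrong, which is what currently happens for sint32 fields
wrongRoundTrip : Maybe Int
wrongRoundTrip =
    Encode.sint32 (-1)
        |> Encode.encode
        |> Decode.decode Decode.int32
```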
At first glance, it seems like a simple fix would be to change the implementations of Generator.Message.toDecoder and Generator.Message.toEncoder to use a decoder directly based on the field's literal data type in the proto files, instead of its inferred Elm primitive type. See line 222. It seems like this branch could just be replaced with the following:
( _, Primitive _ dataType _ ) ->
C.fqFun Meta.Decode.moduleName dataType
and then similar changes for lines 294, 328, 341 and 350.
I have tested these changes locally, and it does seem to work, but there may be corner cases I don't know about. If you think this seems like a feasible solution, I'll gladly submit a pull request.
Hello, I'm learning about protobuf and how to use it in Elm. I was able to generate Elm code for the following proto file:
syntax = "proto3";
message Request {
repeated string users = 1;
}
message Response {
repeated string users = 1;
}
Now, using the generated code I see that I can use request and response decoders interchangeably.
encodedRequest =
{ users = [ "x", "y" ] }
|> Proto.encodeRequest
|> Protobuf.Encode.encode
decodedRequest = Protobuf.Decode.decode Proto.decodeRequest encodedRequest
decodedResponse = Protobuf.Decode.decode Proto.decodeResponse encodedRequest
isThisCorrect = decodedRequest == decodedResponse
Is this expected behaviour? My expectation was to be able to distinguish between a decoded Request and a decoded Response. Or do I have to add additional fields to each message to be able to tell the difference between them?
Could you add a line to the readme about specifying the plugin path directly? I can't seem to get the invocation right.
$ protoc --version
libprotoc 3.19.6
$ protoc --plugin=$(which protoc-gen-elm) --elm-out=. proto/api.proto
Unknown flag: --elm-out
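For reference, protoc also accepts a name=path form of the --plugin flag, where the name must match the *_out flag (protoc derives the expected plugin name protoc-gen-elm from --elm-out). Something like the following might be the missing piece (an untested sketch):

```shell
# bind the plugin name "protoc-gen-elm" explicitly to the binary path
protoc --plugin=protoc-gen-elm=$(which protoc-gen-elm) --elm-out=. proto/api.proto
```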
Has there been any luck supporting something like https://github.com/SafetyCulture/grpc-web-devtools?
Given the following two files:
// in imported_enum.proto
enum SomeEnum {
OptionA = 0;
OptionB = 1;
}
import "imported_enum.proto";
message Msg {
SomeEnum someEnum = 1;
}
An .elm file is generated for the second file that fails to compile. Specifically:
defaultMsg : Msg
defaultMsg =
{ someEnum = SomeEnum_OptionA }
this part fails to compile, since SomeEnum_OptionA is not in scope. Instead, Proto.ImportedEnum.SomeEnum_OptionA should be generated.
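For reference, the compiling output would presumably qualify the constructor with the module generated for the import (a sketch of the expected output):

```elm
import Proto.ImportedEnum

defaultMsg : Msg
defaultMsg =
    { someEnum = Proto.ImportedEnum.SomeEnum_OptionA }
```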
I'd like to introduce alternate constructors to reduce the number of lines required to create and encode a message.
Example .proto
message ParentMessage {
oneof msg {
ChildMessage ChildMessage = 1;
}
}
message ChildMessage {
oneof msg {
GrandChildMessage GrandChildMessage = 1;
}
}
message GrandChildMessage {}
How I currently have to structure my code:
{}
|> grandChildMessage
|> (\msg_ -> { msg = Just msg_ })
|> childMessage
|> (\msg_ -> { msg = Just msg_ })
|> encodeParentMessage
How I would like to structure my code:
{}
|> justGrandChildMessage
|> justChildMessage
|> encodeParentMessage
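A sketch of what the generated helpers could look like (names and record shapes are assumptions based on the example above, modeling the oneof fields with Maybe as in the pipeline code):

```elm
type alias GrandChildMessage =
    {}

type alias ChildMessage =
    { msg : Maybe GrandChildMessage }

type alias ParentMessage =
    { msg : Maybe ChildMessage }

-- wrap a message into the oneof field of its parent
justGrandChildMessage : GrandChildMessage -> ChildMessage
justGrandChildMessage child =
    { msg = Just child }

justChildMessage : ChildMessage -> ParentMessage
justChildMessage child =
    { msg = Just child }
```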
@anmolitor What do you think?