
Comments (14)

hellerve commented on May 9, 2024

I have a proposal for this in a feature branch that works with interfaces. Let me describe it and see whether it does what we need, is okay to look at, and makes sense.

It registers an interface format of type (λ [String a] String) that specializes on the second argument. A motivating example:

> (Float.format @"%.4f" 10.0f)
=> "10.0000"
> (Int.format @"%x" 255)
=> "ff"

Why do we need to define this interface for any type when it is only a thin wrapper around snprintf? Well, I’m glad you asked. This is needed to preserve type hygiene; otherwise we’d have to introduce some sort of catch-all type into the interop, and that doesn’t sound like a great idea.

There are some open questions and concerns from my side (feel free to add to the list):

  • is this good enough? While I feel like this interface is nice to have, it might become tedious to print more complicated expressions. This can be partially solved by a simple macro, if we made, say, an interface like this: (fmt "%x" 255 " %.4f" 100.0f). I didn’t add that macro, because I couldn’t settle on a good API.
  • having user-defined strings in *printf functions is a common security hole. We’ll have to make sure that we document that having a potential attacker define the format string is insecure.
  • what about compound types? Since there is no ready-made solution from *printf for structs, I’m not sure how to best design such an interface. Keep in mind that we already have str, and we are able to format the properties of a type by themselves, but maybe a syntax for structs would be interesting? That’s kind of a moonshot, though, I think, because it would complicate the implementation a lot.
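
To make the "already-split" macro idea from the first bullet concrete, here is a minimal Python sketch. It is only an illustration: the name `fmt_pairs` is hypothetical, and Python's `%` operator stands in for the per-type format functions. Each format string is applied to exactly one value, so no parsing of the format string is needed.

```python
def fmt_pairs(*parts):
    """Format alternating (spec, value) arguments and join the results."""
    if len(parts) % 2 != 0:
        raise ValueError("expected alternating format strings and values")
    specs = parts[0::2]
    values = parts[1::2]
    # Each spec is applied to exactly one value, mirroring one call to a
    # per-type format function per pair.
    return "".join(spec % (value,) for spec, value in zip(specs, values))


print(fmt_pairs("%x", 255, " %.4f", 100.0))
```

The trade-off is that the caller has to split the format string by hand, which is exactly the tedium the bullet worries about.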

I think that’s all!

Cheers

from carp.

jiegec commented on May 9, 2024

I think we should add variadic functions first?


hellerve commented on May 9, 2024

The problem is that in this case that wouldn’t help, because printf is not only variadic, but also has a list of catch-all types as parameters. Unless I’m missing something profound, there is no valid signature for printf that doesn’t use the varargs ellipsis.


jiegec commented on May 9, 2024

What about implementing a printf-like macro that expands to (concat (Int.fmt "%d" a) (Char.fmt "%c" b))? Macros can have varargs (and maybe compile-time type checks, which some C/C++ compilers already do, showing a warning when the types mismatch).


hellerve commented on May 9, 2024

I think we talked about that idea on the chat! I do agree that this sounds good, but have one major concern:

It would require introspection and parsing of the string, which is not currently possible inside the macro. We could do it at runtime, but that would be a terribly costly idea. Why would we need that? Because it needs to split the format strings on the % characters; and it’d also have to take care of the %% special case and such. That’s why I talked about an already-split fmt macro in my first comment. I would also try to optimize it, but bear in mind that it would generate a string for every argument and then merge them, which sounds like a fairly slow operation, even without benchmarking it.
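
As a rough illustration of the splitting step described above, here is a small Python sketch. The function name `split_format` is made up for this example and the code is not taken from any branch. It cuts a format string right before each unescaped %, so every chunk holds at most one conversion, and it leaves %% inside the chunk it appears in.

```python
def split_format(s):
    """Split s into chunks that each contain at most one conversion.

    A chunk ends right before the next unescaped '%', so each chunk can be
    handed to a single-argument formatter. '%%' stays inside its chunk.
    """
    chunks = []
    start = 0          # index where the current chunk begins
    seen_spec = False  # does the current chunk already hold a conversion?
    i = 0
    while i < len(s):
        if s[i] == "%":
            if s[i + 1:i + 2] == "%":
                i += 2  # escaped percent: skip both characters
                continue
            if seen_spec:
                # a second conversion starts here, so close the chunk
                chunks.append(s[start:i])
                start = i
            seen_spec = True
        i += 1
    chunks.append(s[start:])
    return chunks


print(split_format("a %d b %s c"))
print(split_format("100%% of %d"))
```

Doing this at macro expansion time would require the same primitives on dynamic strings: indexing, slicing, and character comparison.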

As I said, I’m not sure the format interface solves our problems. But I think it is a step in the right direction.


jiegec commented on May 9, 2024

Oh, I didn't join the gitter chat and missed those conversations.


hellerve commented on May 9, 2024

That’s alright, it’s good that we document it here.

To recap, a macro that analyzes the string would be awesome, but sounds like a lot of work. I’m not opposed to work per se, but I would like to make sure we are not overlooking a simpler version that satisfies our requirements, which of course also include a good, clean, intuitive API.

In general, I also think that while it’s important to think of a solid system to implement, it is empowering to realize that the system might change again before 1.0 if we find a better way to do things.


eriksvedang commented on May 9, 2024

What (dynamic) functions would a macro that analyses the format string need access to? (roughly)

Those are probably pretty handy functions in general, so it might be worth adding them and giving it a try.


hellerve commented on May 9, 2024

We would have to traverse the string and be able to split and merge it. I can sketch a function that does what we want and see what I’ll need for it.

In general, I’d opt for the functions being consistent with the ones we’ll implement in #94, to reduce confusion, even if the internal implementation will probably be quite different.


eriksvedang commented on May 9, 2024

Yes agreed. I'll create a module, something like Dynamic.String so they can avoid clashing with the existing ones.


hellerve commented on May 9, 2024

I’ve attached an obviously untested prototype. This might work, but mostly it highlights which functions we need:

(defdynamic fmt-internal [s args]
  (let [idx (String.index-of s \%)
        len (String.count s)]
    (cond
      (= idx -1) s ; no more splits found, just return string
      (= \% (String.char-at s (inc idx))) ; this is an escaped %
        (list 'String.append
              (String.substring s 0 (+ idx 2))
              (fmt-internal (String.substring s (+ idx 2) len) args))
      (= 0 (count args)) ; we need to insert something, but have nothing
        (macro-error "error in format string: not enough arguments for format string")
      ; okay, this is the meat:
      ; get the next % after our escaper (index is relative to idx + 1)
      (let [next (String.index-of (String.substring s (inc idx) len) \%)]
        (if (= -1 next) ; no further %, so format the whole rest at once
          (list 'fmt s (car args))
          (let [offs (+ (inc idx) next) ; absolute index of the next %
                slice (String.substring s 0 offs)]
            (list 'String.append (list 'fmt slice (car args))
                                 (fmt-internal (String.substring s offs len)
                                               (cdr args)))))))))

(defmacro fmt [s :rest args]
  (fmt-internal s args))

The functions we need are thus (substring str from to), (index-of str char), and (char-at str index). One thing about this is problematic: the appending actually happens at runtime, because the fmt call might not be evaluable at macro expansion time. That leads to a fairly high number of appends and possibly many tiny strings in memory. I’m also not at all sure about the soundness of the algorithm, but it might work.

I also attached a proof of concept/hopefully faithful transcription in Python for people to run if they want to test the behaviour:

def fmt(s, args):
    idx = s.find("%")

    if idx == -1:
        return s
    if s[idx+1] == "%":
        return s[:idx+2] + fmt(s[idx+2:], args)
    if not args:
        raise Exception("error in format string: not enough arguments for format string")

    nxt = s[idx+1:].find("%")
    if nxt == -1:
        return s + "(arg={})".format(args[0])
    slc = s[:nxt+1+idx]
    return slc + "(arg={})".format(args[0]) + fmt(s[nxt+1+idx:], args[1:])

The args are interspersed differently, of course, but it should work.


eriksvedang commented on May 9, 2024

Excellent, that's a very good start! I'll try to make this convenient to write using the dynamic String functions. Will write here again when those are in place and work well enough.


jiegec commented on May 9, 2024

We can force the format string to be a string literal, as in Rust's std::format!.

And we can add a 'varargs of the same type' function like:

// Force the type check on Carp side
// (register concat (λ [:rest String] String)) (omitting the ref for now)
// (String.concat str1 str2 str3)
// expands to
// (String.concat_internal [str1 str2 str3])
void String_concat_internal(Array args) {
  ...
}
// then (format "%d %s" 6 s) can be expanded to
// (String.append (Int.to_string 6) " " s)

Alternatively, we can use a precompiled version of the format string and its arguments, as in Rust's std::fmt::Arguments, and pass it to internal functions to avoid additional costs.
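
A rough Python sketch of the precompiled idea, loosely in the spirit of std::fmt::Arguments. The names `compile_format` and `render` are hypothetical, and the parser makes one simplifying assumption: a conversion ends at the first alphabetic character, so length modifiers such as %ld are not handled. The format string is parsed once into pieces, the argument count is checked up front, and everything is written into a single buffer instead of appending many small strings.

```python
import io


def compile_format(fmt):
    """Parse fmt once into ("lit", text) and ("spec", "%...") pieces."""
    pieces = []
    literal = []  # characters of the literal run collected so far
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch != "%":
            literal.append(ch)
            i += 1
            continue
        if fmt[i + 1:i + 2] == "%":
            literal.append("%")  # "%%" is an escaped percent sign
            i += 2
            continue
        # Assumption: the conversion ends at the first alphabetic character.
        j = i + 1
        while j < len(fmt) and not fmt[j].isalpha():
            j += 1
        if j == len(fmt):
            raise ValueError("incomplete conversion at index %d" % i)
        if literal:
            pieces.append(("lit", "".join(literal)))
            literal = []
        pieces.append(("spec", fmt[i:j + 1]))
        i = j + 1
    if literal:
        pieces.append(("lit", "".join(literal)))
    return pieces


def render(pieces, args):
    """Write all pieces into one buffer, consuming one argument per spec."""
    needed = sum(1 for kind, _ in pieces if kind == "spec")
    if needed != len(args):
        raise ValueError("expected %d arguments, got %d" % (needed, len(args)))
    out = io.StringIO()
    remaining = iter(args)
    for kind, text in pieces:
        out.write(text if kind == "lit" else text % (next(remaining),))
    return out.getvalue()


pieces = compile_format("%d %s")
print(render(pieces, (6, "carp")))
```

Because the pieces are computed once, the same parsed form can be reused for many calls, and an argument count mismatch is reported before any output is produced.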


hellerve commented on May 9, 2024

I’ve added #154, which purportedly fixes this. I hope this is what we want.
