Comments (57)
Yeah, I'd love to have it. Not so easy to implement, because it significantly complicates lexing strings (and strings inside of strings, ad infinitum), but should be quite do-able. Once we have it for strings, we should have it for regexes too.
There's a draft of it, using the lexer, over on the "interpolation" branch, but we'll probably have to implement it in the parser, in order to handle interpolated strings nested to an arbitrary depth. So, leaving it out for 0.2
from coffeescript.
One way to at least modularize the string lexing complication is to make it so that strings enclosed by ' ' are NOT interpolated, and strings enclosed by " " are.
from coffeescript.
I think it is too much of a security hole. Why not just use the underscore template function and add it to the String prototype?
See: http://gist.github.com/277712 and
http://github.com/documentcloud/underscore
var a, b;
a = "dog";
b = ['f','o','x']
String.prototype.template = function(context) { return _.template(this, context || self); }
// and call it like this:
"The <%= a %> jumps over the <%= b.join('') %>!".template(self);
// use global variables, or:
"The <%= a %> jumps over the <%= b.join('') %>!".template({ a: "cow", b: ['m','o','o','n' ]});
// put variables in a context.
from coffeescript.
We're keeping it a rule to not add functions to the JavaScript runtime, and certainly not to core object prototypes. What exactly is the security hole you're concerned about? String interpolation would be converted at compile time into a big array.join.
from coffeescript.
I'd love some help with implementing string interpolation -- otherwise this ticket will have to remain in limbo for the time being. There seem to be two ways to go about it, neither of which is particularly satisfactory.
The first (recommended, I think) way is to have the parser deal with all aspects of interpolation. This means that the lexer no longer knows what tokens are part of a string, and what tokens are part of the code, and is reduced to sending down a much simpler set of tokens: word, number, whitespace, etc. The parser decides the division between string and code based on the overall structure, as it parses. Doing this would add so much complication to the parser that it's completely infeasible if we want to keep CoffeeScript small and tidy -- something that is only possible because of CoffeeScript's smart Lexer/Rewriter. For a simple comparison, CoffeeScript's grammar.y is 469 lines, at the moment. Ruby's parse.y is 10,492 lines.
The second alternative would be to perform the string interpolation entirely within the lexer, producing an equivalent stream of tokens that the grammar is already able to understand. In order to make this work, we'd need to build a simple state machine into the lexer, maintaining a stack of interpolation levels, and lexing in the correct mode (either "code" or "string"). The same mechanism could be used for regex and heredoc interpolation. I'm not sure if this is possible entirely within the lexer, although it seems likely that it could be.
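A rough sketch of what that lexer-side state machine might look like, written in JavaScript for illustration. The function name splitInterpolation, the part objects, and the mode names are all made up for this example, and the #{ ... } delimiter is only one of the syntaxes proposed in this thread. It keeps a stack of modes ("code", double-quoted string, single-quoted string) so that braces and quotes inside an interpolation don't end it early:

```javascript
// Split the body of a string literal (outer quotes already removed) into
// literal text parts and code parts. Hypothetical helper, not CoffeeScript's lexer.
function splitInterpolation(body) {
  const parts = [];
  const stack = []; // frames opened inside the top-level string: "code", "dq", "sq"
  let text = "";
  let code = "";
  for (let i = 0; i < body.length; i++) {
    const ch = body[i];
    const next = body[i + 1];
    const mode = stack.length ? stack[stack.length - 1] : "top";
    const opensInterp = ch === "#" && next === "{";

    if (mode === "top") {
      if (ch === "\\" && next !== undefined) {
        text += ch + next; // keep escapes such as \#{ as literal text
        i++;
      } else if (opensInterp) {
        parts.push({ type: "text", value: text });
        text = "";
        stack.push("code");
        i++; // skip the "{"
      } else {
        text += ch;
      }
      continue;
    }

    // The brace that closes the outermost interpolation ends the code part.
    if (mode === "code" && ch === "}" && stack.length === 1) {
      stack.pop();
      parts.push({ type: "code", value: code });
      code = "";
      continue;
    }

    code += ch;
    if (mode === "code") {
      if (ch === "{") stack.push("code");
      else if (ch === "}") stack.pop();
      else if (ch === '"') stack.push("dq");
      else if (ch === "'") stack.push("sq");
    } else if (ch === "\\" && next !== undefined) {
      code += next; // escaped character inside a nested string
      i++;
    } else if (mode === "dq" && opensInterp) {
      code += next; // nested interpolation inside a nested double-quoted string
      i++;
      stack.push("code");
    } else if ((mode === "dq" && ch === '"') || (mode === "sq" && ch === "'")) {
      stack.pop();
    }
  }
  if (stack.length) throw new Error("unterminated interpolation");
  parts.push({ type: "text", value: text });
  return parts;
}

console.log(splitInterpolation('1 + #{join({a:1})} + etc'));
```

The nested code is returned as raw source, so a real lexer would still have to lex each code part again (which is where the nested strings get split in turn). Whether this stays small enough to live in the lexer is exactly the open question.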
Advice?
from coffeescript.
Would it make it any simpler if we only allowed interpolation to depth 1? It seems that 2 levels is only just about useful, and 3 is just insane, e.g.
"#{a: 1 ? "a#{a ? v : "#{a}}b"}"}"
from coffeescript.
For a quick implementation, you can simply say that quotes aren't allowed unescaped in the embedded code. Then the lexer doesn't have to do anything special; it passes the string through wholesale. And then in the code generator you have a mini-compiler that looks for embedded strings and recursively runs the lexer, compiler, and generator on the embedded code.
This is how I have it working in CoffeePot.
Then later, allowing arbitrary code in the embedded section only requires making the lexer smarter about parsing out strings.
from coffeescript.
Ah --- cool. I just wrote a recursive string interpolator:
function exterpolate(s) {
  var y = "\"" + s + "\"", z
  function expand(a) {
    return a.replace(/{{(.*)}}/g, function(a,b) { return "\" + (" + b + ") + \""})
  }
  while(true) {
    z = expand(y)
    if(z == y) return (y)
    y = z
  }
}
input = ("1 + {{true ? \"aa{{1+1}}bb\" : \"ccdd\" }}")
output = exterpolate(input)
log(input) // 1 + {{true ? \"aa{{1+1}}bb\" : \"ccdd\" }}
log(output) // "1 + " + (true ? "aa" + (1+1) + "bb" : "ccdd" ) + ""
log(eval(output)) // 1 + aa2bb
It seems to work ok. The regex is not quite right for more than one interpolation per level, and it only handles double quotes.
from coffeescript.
Closing this one as a "wontfix" for the time being. After several stabs at it, we really need to do it right -- not permitting double quotes within an interpolation is not a sustainable answer, and neither is only allowing a single level of interpolation. As much as I love it in Ruby (where it has different semantics than string concatenation), in CoffeeScript the following two expressions would be functionally identical:
name: "Moe"
string: "Hello #{name}!"
string: "Hello "+name+"!"
It would only save you a grand total of a single character.
Any speed advantage of array.join-ing strings over concatenating them may be optimized away soon. This article mentions optimizations for string concatenation in Firefox 3.6 that should make it just as fast as any alternative way:
http://hacks.mozilla.org/2010/01/javascript-speedups-in-firefox-3-6/
from coffeescript.
String interpolation is definitely a good feature. If it's not possible to do then fair enough, but it would be good.
I've rewritten the recursive JS above to create a Ruby parser that works ok for nested interpolations and for both " and '.
def expand string
  ret = string
  ["\"", "'"].each do |char|
    regx = /(#{char}.*)\{\{([^{}]*)\}\}(.*#{char})/
    ret = ret.gsub(regx) do |match|
      "#{$1}#{char} + (#{$2}) + #{char}#{$3}"
    end
  end
  ret
end

def exterpolate string
  while true
    expanded = expand(string)
    return expanded if expanded == string
    string = expanded
  end
end

coffee =<<EOF
x: "1 + 2"
y: "1 + {{x}}"
z: "2 + {{ fn ? 3 : 2}}"
k: '123{{true ? a : 'vc{{if x then "{{x}}"}}'}}'
EOF

["COFFEE", coffee, "JS", exterpolate(coffee)].each {|x| puts x}
which outputs:
COFFEE
x: "1 + 2"
y: "1 + {{x}}"
z: "2 + {{ fn ? 3 : 2}}"
k: '123{{true ? a : 'vc{{if x then "{{x}}"}}'}}'
JS
x: "1 + 2"
y: "1 + " + (x) + ""
z: "2 + " + ( fn ? 3 : 2) + ""
k: '123' + (true ? a : 'vc' + (if x then "" + (x) + "") + '') + ''
from coffeescript.
Fantastic, and a very neat trick -- if you want to spearhead this, and contribute a patch, go right ahead. A couple of concerns:
- Using text transformations like this would mean that we can't syntax highlight the inner code, but that's probably a fine tradeoff to make.
- You should be able to escape the interpolation delimiters with \, something that your current regex doesn't account for.
- If we use mustache-style delimiters, then CoffeeScript will be forever unable to use mustache.js as a library without ugly escaping, because their templates will already be valid CoffeeScript interpolations. I like it too, but it might be better to stick with #{ ... }, or the proposed ECMAScript Harmony ${ ... }. See here:
http://wiki.ecmascript.org/doku.php?id=strawman:string_interpolation
from coffeescript.
Yeh - it's a little bit brittle with the regexes as it is. E.g. the following breaks:
"${"}"}"
There's a few other ones I should think.
Any votes on syntax ?
from coffeescript.
It is possible to look at the previous character in a regex ?!
from coffeescript.
Wow, I love that article on interpolation. My favorite part is:
Define an Interpolator that allows SQL libraries, document.write, etc. to specify context-dependent escaping.
If we could build that into the language somehow that would be awesome.
from coffeescript.
Before I found coffeescript I was working on my own language. I updated my example literal syntax. See if you think my idea of Interpolated strings would work for CoffeeScript too. http://github.com/creationix/jack/blob/master/data_types.jack#L29
from coffeescript.
Interesting. It reminds me of an idea I had a while back:
name: "John"
verb: "Meet"
age: 17
salary: 40
puts "$verb $name. He is $age and his salary is \$${salary}k"
# => "Meet John. He is 17 and his salary is $40k"
from coffeescript.
Here's an interesting page :
http://google-caja.googlecode.com/svn/changes/mikesamuel/string-interpolation-29-Jan-2008/trunk/src/js/com/google/caja/interp/index.html
from coffeescript.
OK - I've got it working a little better, but it seems that it's quite hard to do escaping properly, because Ruby 1.8's regex engine doesn't support lookbehinds (i.e. if we want to avoid matching groups). Either we can depend on the oniguruma gem + binary, or do a hack suggested by Gray, which is to reverse the string and use lookaheads. Any suggestions/ideas?
from coffeescript.
Well I made a little progress using the 'reverse' method - which is a bit a head funk. I'm stuck on the following though:
1 + ${join({a:1})}
The regex needs to be able to count the curly braces inside ? I'm beginning to think that we need a simple parser/tokenizer instead. Thoughts ?
from coffeescript.
Yep. This brings us full circle back to here:
http://github.com/jashkenas/coffee-script/issues#issue/28/comment/117194
from coffeescript.
I've written some pseudo-code. We need 4 states: code, string, interp, string_interp.
My trick of recursively interpolating until there is no difference should mean we don't need to go any deeper.
from coffeescript.
I've written some Ruby code: http://gist.github.com/287890
Here's the output
$ ruby string.rb
in: x: "1 + 2"
out: x: "1 + 2"
in: x: "1 + \"2"
out: x: "1 + \"2"
in: y: "1 + #{x}"
out: y: ("1 + ")+(x)+("")
in: y: "1 + \#{x}"
out: y: "1 + \#{x}"
in: z: "2 + #{fn ? 3 : 2} + etc"
out: z: ("2 + ")+(fn ? 3 : 2)+(" + etc")
in: "1 + #{x}"
out: ("1 + ")+(x)+("")
in: z: "1 + #{join({a:1})} + etc"
out: z: ("1 + ")+(join({a:1}))+(" + etc")
in: z: '1 + #{join({a:1})}'
out: z: ('1 + ')+(join({a:1}))+('')
in: z: '1 + #{if x then join {a: "hello"} }'
out: z: ('1 + ')+(if x then join {a: "hello"} )+('')
in: z: '1 + #{ {a: "hel#{X}lo"}.keys }'
out: z: ('1 + ')+( {a: ("hel")+(X)+("lo")}.keys )+('')
in: z: '#{"#{"#{"#{"#{"#{"hello!"}"}"}"}"}"}'
out: z: ('')+(("")+(("")+(("")+(("")+(("")+("hello!")+(""))+(""))+(""))+(""))+(""))+('')
in: z: "#{{}"
out: z:
in: z: "hello: #{}}"
out: z: ("hello: ")+()+("}")
from coffeescript.
Looks pretty good.
- That second-to-last example isn't compiling any output when I run it. Looks like it should be a single closed brace.
- It would be nice to only wrap generated portions in parens when we really need to.
- It would be nice to support the full ECMA Harmony syntax, including this: "Hello $planet", I think.
- Instead of tracking "prev" and "prevprev" etc by hand, it might be easier for you to take a tack more similar to what Rewriter#scan_tokens does -- write a method that takes a block, yielding the current character, the previous and next characters (to whatever length you need), and the index, for each iteration.
from coffeescript.
- 2nd last is invalid as it's code with an unclosed brace.
- Yes it would, it might be hard...
- Agree $var would be nice
from coffeescript.
I still think it is a Security Hole if there eventually is a Client Side CoffeeScript interpreter. I believe that at Netscape they were considering putting basic variable parsing into Strings for the original JavaScript 0.9 spec (in the early 1990's) but threw it out because they did not want users to be able to fill in a standard Input Form values with something like:
"My favorite color is ${document.bgColor}"
"I like to look at other people's ${document.forms[0].password.value}"
String interpolation is much safer to do on the Server Side than on the Client Side, unless it is done with templates:
"This is a ${template}".template({template: "String!"}).
Otherwise it can act like an embedded #{ eval() }
Just my 2 cents.
from coffeescript.
As long as you're properly escaping the values there is no security hole. I wonder if we could figure out a way to do caja style interpolations without breaking the "no special function" rule.
from coffeescript.
"caja style interpolations" <= pray do tell ?
from coffeescript.
I'm still not completely sold on the difficulty/payoff tradeoff of implementing interpolation, but it is in no way a security hole.
Server-side versus client-side has nothing to do with it. If you're interpolating SQL strings on the server side without properly validating them, you get SQL injection. If you're interpolating unvalidated HTML strings on the client side, you have XSS.
The point is that strings are going to be interpolated unsafely one way or the other: whether you're just concatenating fragments of HTML together or have nicer interpolation sugar for it, you still need to be aware of your inputs. The main use case here is not injecting arbitrary user input into HTML (that always needs to be escaped), but programmer convenience. For example:
$('#image_counter').text "You have ${images.length} uploaded images."
Caja-style escaping is still an open problem in JavaScript. I don't know of any small client-side libraries that provide all of the different kinds of escaping you need. It would definitely break the "no special functions" rule to have it in core CoffeeScript, but would make for a great library.
from coffeescript.
Maybe just have it convert the string into an array of text and value pairs. Then people could use their own library to do interpolation (Naive case would be just to do a join on the array). As the caja page points out, the problem with the nice sugar is the required use of eval to get the current scope. CoffeeScript can do that half for the programmer and let them worry about the other half.
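To make the idea concrete, here is a small sketch in JavaScript. It assumes (purely for illustration) that the compiler would turn "Hello ${name}!" into an array with literal text at even indexes and evaluated values at odd indexes, and hand that array to whatever library the programmer chooses. The names render and escapeHtml are hypothetical:

```javascript
// Join an alternating [text, value, text, value, ...] array.
// escapeValue is applied to the values only; the literal text is trusted.
function render(parts, escapeValue = String) {
  return parts
    .map((part, i) => (i % 2 === 0 ? part : escapeValue(part)))
    .join("");
}

// One possible escaper a library could plug in for HTML contexts.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/>/g, "&gt;")
    .replace(/</g, "&lt;");
}

const images = [1, 2, 3];
// Naive case: a plain join.
console.log(render(["You have ", images.length, " uploaded images."]));
// Context-aware case: values are escaped before joining.
console.log(render(["<p>", "<script>", "</p>"], escapeHtml));
```

Because the values are already evaluated in the caller's scope before render sees them, no eval is needed on the library side.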
@weepy, It was your link above to the google caja page I'm talking about.
On the other hand ruby style interpolations with no smart escaping at all shouldn't be hard to implement. It's just more prone to sql injection and xss since the programmer has to explicitly escape all unsafe input themselves.
from coffeescript.
CoffeeScript doesn't have to do any eval to provide interpolation. See the code above.
from coffeescript.
yes, that's my point, CoffeeScript can get past the need for eval.
from coffeescript.
What do people feel about the syntax? I vote for using # rather than $, since # is rather underused and $ can be used in a normal variable name.
from coffeescript.
It's a toss-up. I'm more used to #{} as well, but ${} is the proposal for ECMA Harmony, and they both have conflicts with regular strings that'll need escaping, if you implement direct variable substitution:
"Go $team!" vs. "He made $100."
"Destination: #city" vs. "Bachelor #1"
Edit: Holy moly look at that. Writing out numbers like #1 conflicts with Github Issues.
from coffeescript.
For what it's worth, I'm going the ECMA Harmony style route for Jack, but I'm also using caja style smart interpolations. It may be better to use ruby syntax if you're doing ruby style interpolations.
from coffeescript.
Creationix -- where are you getting the escaping parsers for all of the interpolation types that Caja wants to support? Are you going to write them yourself?
from coffeescript.
I'm basically implementing the library they describe in the paper from scratch. I may include parsers for html, xml, json and possibly postgresql, but for the most part it's ok for those to be packaged in a separate library.
from coffeescript.
I've updated the gist; here's the output:
in: x: "1 + 2"
out: x: "1 + 2"
in: x: "1 + \"2"
out: x: "1 + \"2"
in: y: "1 + #{x}"
out: y: "1 + "+(x)+""
in: y: "1 + #myvar + #someothervar \#{helo}"
out: y: "1 + "+(myvar)+"+ "+(someothervar)+"\#{helo}"
in: y: "1 + \#{x}"
out: y: "1 + \#{x}"
in: z: "2 + #{fn ? 3 : 2} + etc"
out: z: "2 + "+(fn ? 3 : 2)+" + etc"
in: "1 + #{x}"
out: "1 + "+(x)+""
in: z: "#{join({a:1})} + etc"
out: z: ""+(join({a:1}))+" + etc"
in: z: '#{join({a:1})}'
out: z: ''+(join({a:1}))+''
in: z: '1 + #{if x then join {a: "hello"} }'
out: z: '1 + '+(if x then join {a: "hello"} )+''
in: z: '1 + #{ {a: "hel#{X}lo"}.keys }'
out: z: '1 + '+( {a: "hel"+(X)+"lo"}.keys )+''
in: z: '#{"#{"#{"#{"#{"#{"hello!"}"}"}"}"}"}'
out: z: ''+(""+(""+(""+(""+(""+("hello!")+"")+"")+"")+"")+"")+''
in: z: "#{{}"
out: z:
in: z: "hello: #{}}"
out: z: "hello: "+()+"}"
in: '"#{x ? y : "#{z || '#{1 + "#{2}" + 1}'}" }"'
out: '"'+(x ? y : ""+(z || ''+(1 + ""+(2)+"" + 1)+'')+"" )+'"'
from coffeescript.
I would recommend doing some static analysis and optimizing the generated code when possible.
For example, remove all the empty strings in the output and remove the parens when there is only one item inside them.
By the way, I got my initial prototype of Jack up. The string interpolation part is mostly done there. No nested code yet, but the generator part is pretty polished if you want to see what I mean about compacting the output.
http://static.creationix.com/jack/public/index.html
from coffeescript.
For example
''+(""+(""+(""+(""+(""+("hello!")+"")+"")+"")+"")+"")+''
is just:
"hello!"
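A minimal sketch of that kind of clean-up at code-generation time, in JavaScript. It assumes the string has already been split into parts of the hypothetical shape { type: "text" | "code", value } where text values are raw source text (escapes intact); emitConcat is an illustrative name, not anything in CoffeeScript or Jack. Empty literals are dropped, and parens are omitted around bare identifiers:

```javascript
// Emit a JavaScript concatenation expression from text/code parts.
function emitConcat(parts) {
  const simple = /^[A-Za-z_$][\w$]*$/; // a bare identifier needs no parens
  const terms = [];
  for (const part of parts) {
    if (part.type === "text") {
      // Skip empty literals such as the trailing "" in "1 + " + (x) + "".
      if (part.value !== "") terms.push({ literal: true, src: '"' + part.value + '"' });
    } else {
      const src = part.value.trim();
      terms.push({ literal: false, src: simple.test(src) ? src : "(" + src + ")" });
    }
  }
  // Keep one leading "" when the expression would otherwise start with code,
  // so that + is still string concatenation rather than numeric addition.
  if (terms.length === 0 || !terms[0].literal) terms.unshift({ literal: true, src: '""' });
  return terms.map((t) => t.src).join(" + ");
}

console.log(emitConcat([
  { type: "text", value: "1 + " },
  { type: "code", value: "x" },
  { type: "text", value: "" },
]));
```

This sketch does not fold nested string literals together, so it would not reduce the deeply nested example all the way down to a single literal; that would need real static analysis of the code parts.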
from coffeescript.
I've put together an improved version based on a state machine: http://gist.github.com/291087
Currently it doesn't support interpolation in heredocs (it's not quite clear how these would be expanded). It also leads to the question of how this might be best integrated.
from coffeescript.
Hey Weepy. This would be something best handled as a pass in the rewriting stage. Take a look at rewriter.rb. You would add a method called interpolate_strings, and call it at an appropriate point within rewrite.
The rough structure could look something like this:
def interpolate_strings
  scan_tokens do |prev, token, post, i|
    next 1 unless token[0] == :STRING
    # Do your magic here...
  end
end
Return the number of tokens you'd like to move (forward or backward) in the token stream from the block -- the next 1 bit. If you're inserting a bunch of new tokens, you might move backwards to scan over them again. This would allow the recursive-ish interpolation of nested strings, but in an iterative fashion.
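To illustrate the "return how far to move" idea, here is a toy version in JavaScript rather than Ruby. scanTokens is a hypothetical stand-in for Rewriter#scan_tokens, the token shapes and the CODE token type are invented for the example, and the regex only handles interpolations without nested braces. Each STRING token is split at its first unescaped #{ ... }, and the scan then lands on the leftover tail so any further interpolations are handled on later iterations:

```javascript
// Walk the token array; the callback returns how many tokens to move.
function scanTokens(tokens, visit) {
  let i = 0;
  while (i < tokens.length) {
    i += visit(tokens[i - 1], tokens[i], tokens[i + 1], i);
  }
  return tokens;
}

function interpolateStrings(tokens) {
  return scanTokens(tokens, (prev, token, post, i) => {
    if (token[0] !== "STRING") return 1;
    // head, then the first unescaped #{code}, then everything after it
    const match = /^"(.*?)(?<!\\)#\{([^{}]*)\}(.*)"$/s.exec(token[1]);
    if (!match) return 1;
    const [, head, code, tail] = match;
    tokens.splice(
      i, 1,
      ["STRING", `"${head}"`], ["+", "+"], ["CODE", code], ["+", "+"], ["STRING", `"${tail}"`]
    );
    return 4; // skip head, +, CODE, + and rescan the tail string next
  });
}

console.log(interpolateStrings([["STRING", '"a#{b}c#{d}"']]));
```

A real pass would re-lex the code portion into proper tokens instead of emitting a single CODE token, which is where moving backwards over newly inserted tokens would come in.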
How does that sound?
from coffeescript.
One more thing -- for what it's worth, here's the source code for Ruby's HTML-escaping method -- it's a one liner, if you feel like making HTML-safe interpolations possible.
def html_escape(s)
  s.to_s.gsub(/&/, "&amp;").gsub(/\"/, "&quot;").gsub(/>/, "&gt;").gsub(/</, "&lt;")
end
from coffeescript.
soz - what do u mean? Have a particular syntax for making safe html interpolation?
from coffeescript.
Just for the record - I am still planning to do this --- I've just been extremely busy recently. Hope to get to it next week.
from coffeescript.
Hey Weepy. Don't know if you're still working on this, but it would be awfully lovely to be able to use it for the code generation portion of the CoffeeScript self-compiler. There's lots of muck like this in there right now:
intro + ' += ' + step + ' : ' + idx + ' -= ' + step + ')'
Which cries out for better interpolation. Having the efficiency of an Array join would be nice too.
from coffeescript.
Hiya - it's been on pause due to bizzyness - hopefully I'll take a look today. I suppose I'm also slightly intimidated by the unknown :P (i.e. not yet looked at the nitty gritty of how CS works). Should I be trying to integrate into the node version now ?
from coffeescript.
The node version is now the latest master. With CoffeeScript 0.3.2, Node.js is the default engine. Pass --narwhal if you'd like to continue using Narwhal/Rhino.
from coffeescript.
There hasn't been any activity on this ticket for about a month. I decided to give it a go tonight and here is the result. String interpolation works only in double-quoted strings. You can either use $identifier to substitute a variable or ${expression} to inject an expression.
list: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
puts "values: ${list.join(', ')}, length: ${list.length}."
# outputs 'values: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, length: 10.'
from coffeescript.
Alright. String interpolation is now on master, thanks to Stan's patch. The code generation in nodes.coffee now uses it extensively. Closing this ticket.
from coffeescript.
Nicely done!
Can I make a couple of suggestions:
- I think it's better to use # than $. $ is already an allowed variable character and used a lot in various frameworks. # is barely used other than for comments.
- It might be useful to be able to use @ and . in the simple interpolated variables, e.g.
"my #@var #x.prop"
from coffeescript.
We were just following ECMA Harmony's proposed syntax:
http://wiki.ecmascript.org/doku.php?id=strawman:string_interpolation
But I agree that $ is ugly. Perhaps we should use backticks. Since backticks already indicate the interpolation of raw JavaScript into CoffeeScript, perhaps they could also indicate the interpolation of raw CoffeeScript into a string literal. So this (from nodes.coffee):
"${@idt()}return ${@expression.compile(o)};"
Would become this:
"`@idt()`return `@expression.compile(o)`;"
If you want backticks in your string, you can escape them. Objections?
from coffeescript.
Why not reserve the backtick in strings for literal JavaScript substitution in the future? If they are meant for JavaScript in Coffee, why would they mean anything different in strings?
If we need to get a new syntax, I'd much rather go with Ruby's/weepy's #{...} and #identifier, or perhaps #(2 + 2), which also gives you a hint of the produced output, i.e., the explicit parentheses.
from coffeescript.
It might be familiarity, but I have trouble following the interpolation in the backtick version.
Also, I don't know if it's a big deal, but you can't use any variables starting with a $ without using the {}, and you can't use double-quoted strings within the expression interpolation:
$example: 2
"this will fail: $$example" # produces "this will fail: $$example"
"this works: ${$example}" # produces "this works: 2"
"this will compile error: ${\"even if escaped\"}"
from coffeescript.
Ok. Let's leave it as Harmony-style for the time being. A quick stab at backticks demonstrated that if the interpolation delimiters are symmetrical, it makes them harder to escape.
As for grayrest's concerns: $$dollars should become "$" + dollars; as it does currently. The naked dollars are only for simple identifiers; anything fancier (including property accesses) should use the full ${ ... }.
The second concern is a limitation of the current implementation, which uses regexes to lex entire strings at a chunk, instead of walking them character-by-character and counting interpolation boundaries. If we can switch to the latter without adding too much complexity to the lexer, it would be a welcome improvement. Otherwise, using single quoted strings within interpolations isn't too much of a burden.
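For reference, the regex-at-a-chunk approach and the $$dollars rule can be sketched roughly like this in JavaScript. This is an illustration under assumptions, not the code on master: compileDollarString is a made-up name, it takes the already-isolated body of a double-quoted string, and it does not support braces nested inside ${ ... }:

```javascript
// Turn the body of a "..." string into a JavaScript concatenation expression.
// Handles \-escapes, $identifier and ${expression}.
function compileDollarString(body) {
  const pattern = /\\[\s\S]|\$([A-Za-z_]\w*)|\$\{([^{}]*)\}/g;
  const terms = [];
  let literal = "";
  let last = 0;
  const flush = () => {
    if (literal !== "") terms.push('"' + literal + '"');
    literal = "";
  };
  for (const m of body.matchAll(pattern)) {
    literal += body.slice(last, m.index);
    last = m.index + m[0].length;
    if (m[1] !== undefined) {
      flush();
      terms.push(m[1]); // naked $identifier
    } else if (m[2] !== undefined) {
      flush();
      terms.push("(" + m[2] + ")"); // full ${ expression }
    } else {
      literal += m[0]; // escaped character stays in the literal
    }
  }
  literal += body.slice(last);
  flush();
  // Start with a string so that + always concatenates.
  if (terms.length === 0 || terms[0][0] !== '"') terms.unshift('""');
  return terms.join(" + ");
}

console.log(compileDollarString("$$dollars")); // "$" + dollars
console.log(compileDollarString("He made $100.")); // "He made $100."
```

Since a naked identifier here cannot start with $, the first $ in $$dollars is left as literal text, which gives the "$" + dollars result described above. A character-by-character lexer would replace the [^{}]* part with real boundary counting.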
from coffeescript.
Yeah, both of those come from reading the source. I only brought up the first because identifiers starting with $ are still simple identifiers and I just wanted to get your thoughts on it before it came up as a bug.
As for the second, I'll give it a go this afternoon.
from coffeescript.
There is a new version on master where you can have "this will compile: ${"even if not escaped"}". The string tokenizer was rewritten to match strings one character at a time, which allows Coffee to be smart about double-quoted strings inside double-quoted strings.
from coffeescript.
For those interested, there is a new commit on master which will allow you to nest interpolations. You can get crazy strings like ${"Hello $name ${", from $me"}"} to produce something close to "Hello " + name + " from " + me.
from coffeescript.