sablon's Introduction

Sablon

Sablon is a document template processor for Word docx files. It leverages Word's built-in formatting and layout capabilities to make template creation easy and efficient.

Installation

Add this line to your application's Gemfile:

gem 'sablon'

Usage

require "sablon"
template = Sablon.template(File.expand_path("~/Desktop/template.docx"))
context = {
  title: "Fabulous Document",
  technologies: ["Ruby", "HTML", "ODF"]
}
template.render_to_file File.expand_path("~/Desktop/output.docx"), context

Writing Templates

Sablon templates are normal Word documents (.docx) sprinkled with MailMerge fields to perform operations. The following section uses the notation «=title» to refer to Word MailMerge fields.

A detailed description about how to create a template can be found here

Content Insertion

The most basic operation is to insert content. The contents of a context variable can be inserted using a field like:

«=title»

It's also possible to call a method on a context object using:

«=post.title»

NOTE: The dot operator can also be used to perform a hash lookup. This means that it's not possible to call methods on a hash instance. Sablon will always try to make a lookup instead.

This works for chained method calls and nested hash lookup as well:

«=buyer.address.street»
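The lookup rule can be pictured with a small plain-Ruby sketch. This is not Sablon's implementation: the lookup helper, the two structs, and the choice to accept both symbol and string keys are assumptions made for illustration only.

```ruby
# Hypothetical helper mirroring the rule described above: hashes are
# indexed by key, every other object receives a public method call.
def lookup(receiver, expression)
  expression.split(".").inject(receiver) do |local, name|
    case local
    when nil  then nil
    when Hash then local.key?(name.to_sym) ? local[name.to_sym] : local[name]
    else local.public_send(name) if local.respond_to?(name)
    end
  end
end

Address = Struct.new(:street)
Buyer   = Struct.new(:address)

context = {
  buyer: Buyer.new(Address.new("Main Street 1")),
  stats: { size: 3 }
}

puts lookup(context, "buyer.address.street") # => Main Street 1
puts lookup(context, "stats.size")           # => 3
```

The second call returns the value stored under the size key rather than the result of Hash#size, which is why methods cannot be called on hash instances.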
WordProcessingML

Generally Sablon tries to reuse the formatting defined in the template. However, there are situations where more fine grained control is needed. Imagine you need to insert a body of text containing different formats. If you can't decide the format ahead of processing time (in the template) you can insert WordProcessingML directly.

It's enough to use a simple insertion operation in the template:

«=long_description»

To insert WordProcessingML prepare the context accordingly:

word_processing_ml = <<-XML.gsub("\n", "")
<w:p>
<w:r w:rsidRPr="00B97C39">
<w:rPr>
<w:b />
</w:rPr>
<w:t>this is bold text</w:t>
</w:r>
</w:p>
XML

context = {
  long_description: Sablon.content(:word_ml, word_processing_ml)
}
template.render_to_file File.expand_path("~/Desktop/output.docx"), context

In the example above the entire paragraph will be replaced because not all of the nodes being inserted are valid children of a paragraph (w:p) element. The example below shows inline insertion, where only runs are added: instead of replacing the entire paragraph, only the merge field gets removed.

Important: All text must be wrapped in a run tag for valid inline insertion because WordML is still inserted directly into the document "as is" without any structure transformations other than run properties being merged.

word_processing_ml = <<-XML.gsub("\n", "")
<w:r w:rsidRPr="00B97C39">
<w:rPr>
<w:b />
</w:rPr>
<w:t>this is bold text</w:t>
</w:r>
XML

context = {
  long_description: Sablon.content(:word_ml, word_processing_ml)
}
template.render_to_file File.expand_path("~/Desktop/output.docx"), context

HTML

Similar to WordProcessingML, it's possible to use HTML as input while processing the template. You don't need to modify your templates; a simple insertion operation is sufficient:

«=article»

To use HTML insertion prepare the context like so:

html_body = <<-HTML.strip
<div>
    This text can contain <em>additional formatting</em> according to the
    <strong>HTML</strong> specification. As well as links to external
    <a href="https://github.com/senny/sablon">websites</a>, don't forget
    the "http/https" bit.
</div>

<p style="text-align: right; background-color: #FFFF00">
    Right aligned content with a yellow background color.
</p>

<div>
    <span style="color: #123456">Inline styles</span> are possible as well
</div>

<table style="border: 1px solid #0000FF;">
    <caption>Tables can also be created via HTML</caption>
    <tr>
        <td>Cell 1 only text</td>
        <td>
            <ul>
                <li>List in Table - 1</li>
                <li>List in Table - 2</li>
            </ul>
        </td>
    </tr>
    <tr>
        <td></td>
        <td>
            <table style="border: 1px solid #FF0000;">
                <tr><th>A</th><th>B</th></tr>
                <tr><td>C</td><td>D</td></tr>
            </table>
        </td>
    </tr>
</table>
HTML
context = {
  article: Sablon.content(:html, html_body),
  # Or use html: prefix to make sablon parse the value as HTML.
  # Does the same as above.
  'html:article' => html_body
}
template.render_to_file File.expand_path("~/Desktop/output.docx"), context

There is no 1:1 conversion between HTML and Office Open XML; however, a large number of tags are very similar. HTML insertion is relatively complete, covering several key content structures such as paragraphs, tables and lists. The snippet above showcases some of the capabilities; for a comprehensive example, please see the HTML insertion test fixture here. All HTML element conversions are defined in configuration.rb, with their matching AST classes defined in ast.rb.

Basic conversion of CSS inline styles into matching WordML properties is possible using the style=" ... " attribute in the HTML markup. Not all CSS properties are supported, as only a small subset of CSS styles have a direct Office Open XML equivalent. Styles are passed on to nested elements if the parent can't use them. The currently supported styles are also defined in configuration.rb. Toggle properties that aren't directly supported can be added using the text-decoration: style attribute with the proper XML tag name as the value (e.g. text-decoration: dstrike for w:dstrike). Simple single-value properties that do not need a conversion can be added using the XML property name directly, omitting the w: prefix (e.g. highlight: cyan for w:highlight).

Table, Paragraph and Run property references can be found at:

The full Office Open XML specification used to develop the HTML converter can be found here (3rd Edition).

The example above shows an HTML insertion operation that will replace the entire paragraph. In the same fashion as WordML, inline HTML insertion is possible where only the merge field is replaced as long as only "inline" elements are used. "Inline" in this context does not necessarily mean the same thing as it does in CSS, in this case it means that once the HTML is converted to WordML only valid children of a paragraph (w:p) tag exist. As with WordML all plain text needs to be wrapped in a HTML tag. A simple <span>..</span> tag enclosing all other elements will suffice. See the example below:

inline_html = <<-HTML.strip
    <span>This text can contain <em>additional formatting</em> according to the
    <strong>HTML</strong> specification. As well as links to external
    <a href="https://github.com/senny/sablon">websites</a>, don't forget
    the "http/https" bit.</span>
HTML
context = {
  article: Sablon.content(:html, inline_html)
  # alternative method using special key format
  # 'html:article' => inline_html
}
template.render_to_file File.expand_path("~/Desktop/output.docx"), context

Images (beta)

Images can be added to the document using a placeholder image wrapped in a pair of merge fields set up as «@figure:start» and «@figure:end», where "figure" is the key of the context hash storing the image.

Images are wrapped in an instance of a Sablon::Content class in the same fashion as HTML or WordML strings. An image may be initialized from multiple sources such as file paths, URLs, or any object that exposes a #read method that returns image data. When using a "readable object" if the object doesn't have a #filename method then a filename: '...' option needs to be added to the Sablon.content method call.

By default the inserted image takes the dimensions of the placeholder. The size of an image can also be defined dynamically by specifying width and height with a unit (cm or in) in a properties hash like properties: {height: "2cm", width: "20cm"}.

context = {
  figure: Sablon.content(:image, 'fixtures/images/c3po.jpg'),
  figure2: Sablon.content(:image, string_io_obj, filename: 'test.png'),
  figure3: Sablon.content(:image, string_io_obj, filename: 'test.png', properties: {height: '2cm', width: '2cm'})
  # alternative method using special key format for simple paths and URLs
  # 'image:figure' => 'fixtures/images/c3po.jpg'
}
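A "readable object" as described above can be sketched in plain Ruby. The NamedImageIO class, the file name, and the placeholder bytes below are hypothetical; the sketch only shows an object that answers both #read and #filename.

```ruby
require "stringio"

# Hypothetical wrapper: pairs in-memory image data with a file name so the
# object responds to both #read and #filename.
class NamedImageIO
  attr_reader :filename

  def initialize(data, filename)
    @io = StringIO.new(data)
    @filename = filename
  end

  # Rewind first so repeated reads return the full data each time.
  def read
    @io.rewind
    @io.read
  end
end

png_signature = "\x89PNG\r\n\x1a\n".b # placeholder bytes, not a complete image
image = NamedImageIO.new(png_signature, "logo.png")

puts image.filename      # => logo.png
puts image.read.bytesize # => 8
```

Going by the description above, an object like this could presumably be handed to Sablon.content(:image, ...) without the filename: option, since it already exposes #filename.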

Example: image merge fields example

Additional examples of usage can be found in images_template.docx and in sablon_test.rb.

Conditionals

Sablon can render parts of the template conditionally based on the value of a context variable. Conditional fields are inserted around the content.

«technologies:if»
    ... arbitrary document markup ...
«technologies:endIf»

This will render the enclosed markup only if the expression is truthy. Note that nil, false and [] are considered falsy. Everything else is truthy.
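The truthiness rule can be written down as a short plain-Ruby sketch. The template_truthy? helper is a hypothetical name used for illustration; it is not part of Sablon's API.

```ruby
# Hypothetical helper expressing the rule stated above: nil, false and an
# empty array are falsy, everything else is truthy.
def template_truthy?(value)
  case value
  when nil, false then false
  when Array      then !value.empty?
  else true
  end
end

p template_truthy?(nil)      # => false
p template_truthy?([])       # => false
p template_truthy?(["Ruby"]) # => true
p template_truthy?("")       # => true
p template_truthy?(0)        # => true
```

Note that an empty string and zero count as truthy under this rule, which is where a predicate such as the one shown next becomes useful.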

For more complex conditionals you can use a predicate like so:

«body:if(present?)»
    ... arbitrary document markup ...
«body:endIf»

Finally, you can mix in elsif and else clauses as well.

«body:if(present?)»
    ... arbitrary document markup ...
«body:elsif(nil?)»
    ... arbitrary document markup ...
[additional elsif blocks...]
«body:else»
    ... arbitrary document markup ...
«body:endIf»

Loops

Loops repeat parts of the document.

«technologies:each(technology)»
    ... arbitrary document markup ...
    ... use `technology` to refer to the current item ...
«technologies:endEach»

Loops can be used to repeat table rows or list enumerations. The fields need to be placed within table cells or enumeration items enclosing the rows or items to repeat. Have a look at the example template for more details.
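As a rough illustration of the data a loop consumes, the sketch below builds a context whose technologies entry is an array of objects and then walks it in plain Ruby. The Technology struct and its fields are assumptions for this example; in a template, each item's fields would be reached through «=technology.name» and similar.

```ruby
# Hypothetical item type; any object with reader methods (or a hash) works
# with the dot notation described earlier.
Technology = Struct.new(:name, :kind)

context = {
  technologies: [
    Technology.new("Ruby", "language"),
    Technology.new("HTML", "markup")
  ]
}

# Plain-Ruby counterpart of «technologies:each(technology)» around a row
# that contains «=technology.name» and «=technology.kind».
rows = context[:technologies].map do |technology|
  "#{technology.name} (#{technology.kind})"
end

puts rows
```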

Nesting

It is possible to nest loops and conditionals.

Comments

Sometimes it's necessary to include markup in the template that should not be visible in the rendered output, for example when defining sample numbering styles for HTML insertion.

«comment»
    ... arbitrary document markup ...
«endComment»

Configuration (Beta)

The Sablon::Configuration singleton is a new feature that allows the end user to customize HTML parsing to their needs without needing to fork and edit the source code of the gem. This API is still in a beta state and may be subject to change as future needs are identified beyond HTML conversion.

The example below shows how to expose the configuration instance:

Sablon.configure do |config|
  # manipulate config object
end

The default set of registered HTML tags and CSS property conversions are defined in configuration.rb.

Customizing HTML Tag Conversion

Any HTML tag can be added using the configuration object, even if it needs a custom AST class to handle conversion logic. Simple inline tags that only modify the style of text (e.g. the already supported <b> tag) can be added without an AST class as shown below:

Sablon.configure do |config|
  config.register_html_tag(:bgcyan, :inline, properties: { highlight: 'cyan' })
end

The above tag simply adds a background color to text using the <w:highlight w:val="cyan" /> property.

More complex business logic can be supported by adding a new class under the Sablon::HTMLConverter namespace. The new class will likely subclass Sablon::HTMLConverter::Node or Sablon::HTMLConverter::Collection depending on the needed behavior. The current AST classes serve as additional examples and can be found in ast.rb. When registering a new HTML tag that uses a custom AST class the class must be passed in either by name using a lowercased and underscored symbol or the class object itself.

The block below shows how to register a new HTML tag that adds the following AST class: Sablon::HTMLConverter::InstrText.

module Sablon
  class HTMLConverter
    class InstrText < Node
      # implementation details ...
    end
  end
end
# register tag
Sablon.configure do |config|
  config.register_html_tag(:bgcyan, :inline, ast_class: :instr_text)
end

Existing tags can be overwritten using the config.register_html_tag method or removed entirely using config.remove_html_tag.

# remove tag
Sablon.configure do |config|
  # remove support for the span tag
  config.remove_html_tag(:span)
end

Customizing CSS Style Conversion

The conversion of CSS stored in an element's style="..." attribute can be customized using the configuration object as well. Adding a new style conversion or overriding an existing one is done using the config.register_style_converter method. It accepts three arguments: the name of the AST node the style applies to (as a lowercased and underscored symbol), the name of the CSS property (needs to be a string in most cases) and a lambda that accepts a single argument, the property value. The example below shows how to add a new style that sets the <w:highlight /> property.

# add style conversion
Sablon.configure do |config|
  # register new conversion for the Sablon::HTMLConverter::Run AST class.
  converter = lambda { |v| return 'highlight', v }
  config.register_style_converter(:run, 'custom-highlight', converter)
end
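What such a lambda hands back can be checked in isolation, without touching the configuration object. In the sketch below the first converter is the one from the example above; the second is a hypothetical converter added only to show a value transformation, relying on the fact that w:sz is measured in half-points.

```ruby
# Converter from the example above: returns the WordML property name
# together with the unchanged CSS value.
highlight = lambda { |v| return 'highlight', v }

# Hypothetical converter that transforms the value: "12pt" becomes 24
# because w:sz is expressed in half-points.
font_size = lambda do |v|
  points = Float(v.sub(/pt\z/, ''))
  return 'sz', (points * 2).round
end

p highlight.call('cyan') # => ["highlight", "cyan"]
p font_size.call('12pt') # => ["sz", 24]
```

Using return with two values inside a lambda yields a two-element array, so each converter reports both the property name and its value.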

Existing conversions can be overwritten using the config.register_style_converter method or removed entirely using config.remove_style_converter.

# remove style conversion
Sablon.configure do |config|
  # remove support for conversion of font-size for the Run AST class
  config.remove_style_converter(:run, 'font-size')
end

Executable

The sablon executable can be used to process templates on the command-line. The usage is as follows:

cat <context path>.json | sablon <template path> <output path>

If no <output path> is given, the document will be printed to stdout.
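A JSON context for the executable can be produced with Ruby's standard json library. This is a minimal sketch: the keys reuse the context from the Usage section, and the script and file names in the trailing comment are hypothetical.

```ruby
require "json"

# Same data as the Ruby context in the Usage section, with string keys as
# they appear in JSON.
context = {
  "title" => "Fabulous Document",
  "technologies" => ["Ruby", "HTML", "ODF"]
}

json = JSON.pretty_generate(context)
puts json

# Assumed invocation, with hypothetical file names:
#   ruby make_context.rb | sablon template.docx output.docx
```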

Have a look at this test for examples.

Examples

Using a Ruby script

There is a sample template in the repository, which illustrates the functionality of sablon:

Sablon Template

Processing this template with some sample data yields the following output document. For more details, check out this test case.

Sablon Output

Using the sablon executable

The executable test showcases the sablon executable.

The template

Sablon Output

is rendered using a json context to provide the data. Following is the resulting output:

Sablon Output

Contributing

  1. Fork it ( https://github.com/senny/sablon/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Inspiration

These projects address a similar goal and inspired the work on Sablon:

sablon's Issues

Crash when looping over a non-existent variable

Crash if the expression inside a loop evaluates to something other than an Array or does not exist. Let's treat nil as empty Array and raise a descriptive error if the expression evaluates to a non-nil non-Array value.

EDIT: After thinking about it I propose the following change. nil-values should not be treated as blank array but should also raise an error. Otherwise this might shadow errors, which are hard to track down.
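A minimal sketch of the proposed check, assuming a hypothetical error class and helper name (neither is taken from Sablon's code):

```ruby
# Hypothetical error type for loop expressions that are not arrays.
class LoopValueError < ArgumentError; end

# Returns the value untouched when it is an Array; raises a descriptive
# error for nil and for any other non-Array value, as proposed in the EDIT.
def loop_items(expression, value)
  return value if value.is_a?(Array)

  raise LoopValueError,
        "The expression #{expression.inspect} should evaluate to an Array " \
        "but was: #{value.inspect}"
end

p loop_items("technologies", ["Ruby", "HTML"]) # => ["Ruby", "HTML"]

begin
  loop_items("technologies", nil)
rescue LoopValueError => e
  puts e.message
end
```

Raising for nil as well keeps a misspelled variable name from silently producing an empty loop.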

Use instance_eval to evaluate expressions

This would replace the manual evaluation that is currently done which has the limitation that methods cannot be called on hash objects. Executing things this way would add the possibility for NoMethodErrors which aren't possible in the current LookupOrMethod#evaluate method (shown below). Instead we would catch the NoMethodError and just return nil in that instance.

This would be a huge step forward in allowing expressions of arbitrary complexity to be evaluated like in regular ERB templates although we would still be limited to one-liners.
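The idea can be sketched in a few lines of plain Ruby. The evaluate_expression helper and the two structs are hypothetical stand-ins, not code from the gem.

```ruby
Post       = Struct.new(:title)
RenderData = Struct.new(:post)

# Hypothetical evaluator: runs the expression with the receiver as self
# and swallows NoMethodError, returning nil instead.
def evaluate_expression(receiver, expression)
  receiver.instance_eval(expression)
rescue NoMethodError
  nil
end

data = RenderData.new(Post.new("Fabulous Document"))

p evaluate_expression(data, "post.title")        # => "Fabulous Document"
p evaluate_expression(data, "post.title.upcase") # => "FABULOUS DOCUMENT"
p evaluate_expression(data, "post.missing")      # => nil
```

Because instance_eval executes arbitrary Ruby, this approach assumes that templates come from a trusted source.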

Handling Backwards Compatibility Issues:

For full backwards compatibility with .expression used on a Hash we would need to monkey-patch the Hash class's method_missing method to do a hash lookup. However, that is not something we want floating around in our general namespace.

This would only be an option if the patch can be applied inside the limited scope of instance_eval and nowhere else. method_missing and refine are available in Ruby 2.0 and I think they might do the trick (links below). However, things accessed using send fail for refinements. This might kill the idea, because overriding method_missing doesn't appear possible in my test cases.

While dropping support for Ruby versions below 2.0 is undesirable, support for 1.9.3 ended in 2015 and support for 2.0 ended in 2016. Updating the gemspec and tagging a release (0.0.22) before this change and then (0.1.0) afterwards should be sufficient, as people using old rubies can still use an older version.

Excerpt from operations.rb

    class LookupOrMethodCall < Struct.new(:receiver_expr, :expression)
      def evaluate(context)
        if receiver = receiver_expr.evaluate(context)
          expression.split(".").inject(receiver) do |local, m|
            case local
            when Hash
              local[m]
            else
              local.public_send m if local.respond_to?(m)
            end
          end
        end
      end

....

Non-functional test case (added in case I figure out a fix)

module LookupHash
  refine Hash do
    def method_missing(name, *args, &block)
      fetch(name, nil)
    end
  end
end

InstanceObj = Struct.new(:data)
instance = InstanceObj.new({ key1: 123 })

using LookupHash
begin
  instance.instance_eval("data.key1")
rescue NoMethodError => e
  puts e
end

http://ruby-doc.org/core-2.0.0/BasicObject.html#method-i-method_missing
http://ruby-doc.org/core-2.0.0/doc/syntax/refinements_rdoc.html
https://ruby-doc.org/core-1.9.3/BasicObject.html#method-i-instance_eval

Variables not being replaced in the template

Rendering the template to a file does not work as expected: when I render this template, the single variable is not replaced.

template = Sablon.template(File.expand_path('./template.docx'))
context = {
  please: 'test'
}
template.render_to_file File.expand_path('./output.docx'), context

Please see template attached.
template.docx

JRuby Support

I've been having trouble running Sablon in JRuby. The first problem was that JRuby doesn't support redcarpet (because it has native extensions) so I replaced it with kramdown, which seemed to work fine. Here is the diff for that change. The second problem is much more obscure and seems to be caused by a difference in behavior between the Ruby and JRuby versions of Nokogiri. Sablon cannot properly render looped fields under JRuby. Here is the relevant portion of the stacktrace. I have banged my head against this problem for a while so any advice is welcome.

MailMerge fields not getting filled by data if inserted with Word/Insert tab/Quick Parts/Field/MailMerge field

If I understand correctly, one should be able to create a new Word docx document as a template, insert some mail merge fields, name them correctly, and Sablon is supposed to fill those fields with data, just as described on the gem page.

However, when I add a merge field via Word -> Insert tab -> Quick Parts -> Field -> MergeField (which seems to be the most convenient way to turn any existing, well-formatted doc into a template for Sablon) and name it "myfieldname" or "=myfieldname" (tried both), the fields are not filled with data and no error is thrown. I'm not certain whether Sablon detects them correctly either. In MS Word, the fields display as mail merge fields «myfieldname» or, after pressing Alt+F9, as { MERGEFIELD myfieldname }, which looks fine to me.

The curious part is that if you use the wizard in the Mailings tab -> Start Mail Merge to insert mail merge fields into the template, it works properly: the fields are indeed filled with data by Sablon!

It would seem that the wizard creates different XML tags than Insert -> Field -> MergeField does.

Any suggestions?

Possible bug when there is a fldCharType begin but no end

In my docx files it seems as a result of the AUTOTEXT fields there will be a
<w:fldChar w:fldCharType="begin"/> tag but no <w:fldChar w:fldCharType="end"/>

This causes a nil value to be passed into the ComplexField.new method and then blow up in initialize because n.search is being called when n == nil.

It is possible that my docx files are somehow syntactically invalid but they have no display issues, opening and re-saving them has no effect, nor does a C/P of the contents into a fresh document. I have a fix here if this is actually a bug and not some special case I happen to be running into.

Issue on 'for loop' with shapes inside?

Hello,

When I have a "for loop" with a shape inside, and the loop runs more than once, Word claims that output is corrupted.

Sample input:
image

If loop runs only once, output is as expected (one blue rectangle). If loop is run more than once, Word 2016 complains:
image

But if I try to proceed and let Word recover the document, output is as expected (two rectangles). If I replace the figure with text, everything works fine.

I will try to investigate the issue soon. Just posting it here to force myself to come back later with a follow-up and to ask if this is a known bug.

Thanks again for the great work on the gem.

Inserting merge fields into Word document

This isn't directly related to the library but can you offer some tips on how to insert the required mail merge fields into a Word document without having a data source document?

Supporting non mail merge markup as suggested in #7 would make things easier but I understand why that is difficult to implement.

MailMerge field doesn't work in a Hyperlink field

In my context object, I have a key called 'insights_url' whose value is a dynamic url. I'm trying to use this 'insights_url' mail merge field in my Microsoft Word docx template. I've been trying to do this for days, searching through pages of Google search results, with no answers. I just want a hyperlink that says "VIEW ALL" and the url would be a mail merge field. The mail merge field for 'insights_url' works fine on its own, but I need it to say "VIEW ALL".

I've tried all these steps listed in: http://stackoverflow.com/questions/17428891/add-variable-hyperlink-in-mail-merge-in-word-2013

Basically, create a Hyperlink field, edit the field codes to throw in a mail merge field. It looks like:
{ HYPERLINK "{ MERGEFIELD =insights_url \* MERGEFORMAT}" \* MERGEFORMAT }

So I've tried that, and multiple variations-- using the hyperlink dropdown rather than inserting a field, adding a bookmark after, and more. They always end up with a broken hyperlink that cannot be opened, or are empty. I've also tried to unzip the doc, edit the document.xml.rels file to change the target of a working hyperlink to the mail merge field. However, it seems making any edits and re-zipping, will result in a corrupt file.

Just wondering, is there an easier way to do this? Like support for hyperlinked mail merge fields? Thank you.

Default values (retaining merge fields)

Is it possible to define a default value instead of just removing the merge field when doing insertions? This would be a great feature to speed up debugging a new template for typos, or just highlighting areas in a template that need to be user defined in a clean fashion, i.e. instead of manually typing REPLACE ME everywhere use a merge field and save some effort.

It seems like this would be handled by modifying the code below in operations.rb. While I haven't dug into the code to determine what variables are defined at that scope yet just from a first glance it may be a significant API change.

    class Insertion < Struct.new(:expr, :field)
      def evaluate(context)
        if content = expr.evaluate(context)
          field.replace(Sablon::Content.wrap(expr.evaluate(context)))
        else
          field.remove
        end
      end
    end

can't dup NilClass when render template

Hey,
When I try to render a template I get an error: can't dup NilClass
The error occurs in the file lib/sablon/content.rb, line 59

This code in lib/sablon/parser/mail_merge.rb fixes the problem for me:

def replace_field_display(node, content)
  paragraph = node.ancestors(".//w:p").first
  display_node = node.search(".//w:t").first
  unless display_node.nil?
    content.append_to(paragraph, display_node)
    display_node.remove
  end
end

Support for Legacy Form Fields

Hey @senny,

First thank you very much for such an excellent gem!

My question is about other ways Word docs can be templated, in my instance Text Form Fields. The difference in the generated XML isn't too big as far as I can see, for example for a field with label "Fullname":

<w:r>
	<w:rPr>
		<w:b/>
		<w:sz w:val="36"/>
		<w:szCs w:val="36"/>
		<w:u w:val="single"/>
		<w:lang w:val="en-GB"/>
	</w:rPr>
	<w:fldChar w:fldCharType="begin">
		<w:ffData>
			<w:name w:val="Fullname"/>
			<w:enabled/>
			<w:calcOnExit w:val="0"/>
			<w:textInput/>
		</w:ffData>
	</w:fldChar>
</w:r>
<w:bookmarkStart w:id="1" w:name="Fullname"/>
<w:r>
	<w:rPr>
		<w:b/>
		<w:sz w:val="36"/>
		<w:szCs w:val="36"/>
		<w:u w:val="single"/>
		<w:lang w:val="en-GB"/>
	</w:rPr>
	<w:instrText xml:space="preserve">FORMTEXT </w:instrText>
</w:r>

Do you reckon it's worth me attempting an alternative parser for your gem? Or if you know an easy way for me to convert Form Fields to Merge fields, that would also be great!

Thanks for your help in advance!

Create a configuration object

There are several areas in sablon that would lend themselves well to a general configuration object that could replace current blocks of conditional code. Most notably the style attributes added in #55. This opens the door to easily allow the end user to add/change functionality without needing to change the code itself, especially in cases where it is not practical i.e. a Rails app.

Other possible candidates could be regexp's used in OperationConstruction#consume and Template#render.

Example using nested loops

I have been trying to understand how nested loops work. Not sure what sablon is looking for in the nested loop, i.e. the part of the "each" statement is not in the context passed in.

See below what I mean:

1 <widgets:each(anItem)>
2 <<anItem.details(foo)>>
3 <<=foo.description>>
4 <anItem.details:endEach>
5 <foo:endEach>

What I am seeing is that the inner loop is executing correctly but nothing shows up in the output document for what is expressed on line 3 above.

I am not sure if the syntax I am using is correct or what else I am doing incorrectly.

About Raw HTML

Can this gem render a raw HTML description, such as 'Test' wrapped in a bold tag, so that the text is shown in bold in the Word document?

Implement different error messages for unknown HTML tags and invalid tag placement

This arises from #65, where it is not clear whether the end user is using an unknown tag or a valid tag in an invalid way (e.g. <i></i> outside of a paragraph or div). A better route is to create a separate error message for when the structure itself is invalid but the tag is supported.

Currently the code handles Runs and Paragraphs separately validating the structure as it loops through nodes. Depending on eventual refactors from #62 and/or #48 if we choose to implement inline HTML replacement that may not be the most effective and/or logical way. A separate method to check the final AST structure produced may be better in terms of future code flexibility.

Doesn't work with rubyzip < 1.1.2

OutputStream#write_buffer doesn't accept a parameter in versions below 1.1.2, raising an ArgumentError: wrong number of arguments (given 1, expected 0)
in:
rubyzip-1.0.0/lib/zip/output_stream.rb:55
sablon-0.0.1/lib/sablon/template.rb:21

Please upgrade the gemspec
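The version constraint being requested can be checked with RubyGems' own classes. The sketch below only evaluates the requirement; the gemspec line in the trailing comment is an assumption about how the dependency would be declared, not a quote from the project's gemspec.

```ruby
require "rubygems"

# Requirement matching the title of this issue.
requirement = Gem::Requirement.new(">= 1.1.2")

p requirement.satisfied_by?(Gem::Version.new("1.0.0")) # => false
p requirement.satisfied_by?(Gem::Version.new("1.1.2")) # => true

# Assumed gemspec declaration:
#   spec.add_runtime_dependency "rubyzip", ">= 1.1.2"
```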

Make URLs clickable in the generated File

A URL in the replaced text is not clickable in the generated file, but when I open the docx file, set the cursor at the end of the URL and press space or enter, it becomes clickable.

So is there a way to make the urls clickable in the generated file?

Support For Empty Fields

It takes me a while to add a new mailmerge field and as far as I can tell there is no keyboard shortcut. However, there is a shortcut for adding a new empty field with Cmd+Fn+F9 on OSX or Ctl+F9 on Windows. If this type of field was supported it could significantly speed up the templating process.

ArgumentError: Don't know how to handle node: #<Nokogiri::XML::Element:0x3fc5e96cf3d8 name="i" children=[#<Nokogiri::XML::Text:0x3fc5e96ce104 "short stories">]>

When I use your test code from test/sablon_test.rb I get the error below.
I simplified the code:

@base_path = Pathname.new(File.expand_path("~/Downloads/sablon_test"))

template = Sablon.template(@base_path+"html_temp.docx")

context = {article: Sablon.content(:html, "I am fond of writing <i>short stories</i> and <i>poems</i> in my spare time, <br />and have won several literary contests in pursuit of my <b>passion</b>.")}

template.render_to_file((@base_path+"html2.docx"), context)

ArgumentError: Don't know how to handle node: #<Nokogiri::XML::Element:0x3fc5e96cf3d8 name="i" children=[#<Nokogiri::XML::Text:0x3fc5e96ce104 "short stories">]>

Don't remove false values

It is impossible to use false values in documents because Sablon::Statement::Insertion removes them. Is this intended, or should the content be checked there using the #nil? method instead?
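The difference between the two checks can be shown without any Sablon classes. Both helpers below are hypothetical; returning nil stands in for "remove the merge field".

```ruby
# Truthiness check: false is treated like nil, so the field is removed.
def insert_if_truthy(value)
  value ? value.to_s : nil
end

# nil check: only nil removes the field, false is rendered as text.
def insert_unless_nil(value)
  value.nil? ? nil : value.to_s
end

p insert_if_truthy(false)  # => nil
p insert_unless_nil(false) # => "false"
p insert_unless_nil(nil)   # => nil
```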

Markdown list style (ListBullet) not applied

Apologies in advance if this is not pertinent to the gem: I'm not sure whether it's a bug or I'm just bad at Office 2011.

I'm trying to apply some list styles from markup data.
I've added the ListBullet style to the template; styles.xml source:

<w:style w:type="numbering" w:customStyle="1" w:styleId="ListBullet">
  <w:name w:val="ListBullet"/>
  <w:basedOn w:val="Nessunelenco"/>
  <w:uiPriority w:val="99"/>
  <w:rsid w:val="005547DB"/>
  <w:pPr>
    <w:numPr>
      <w:numId w:val="25"/>
    </w:numPr>
  </w:pPr>
</w:style>

and I see it in the generated document source:

<w:p>
  <w:pPr>
    <w:pStyle w:val="ListBullet"/>
  </w:pPr>
  <w:r>
    <w:t xml:space="preserve">my content</w:t>
  </w:r>
</w:p>

But I don't see any list style in the document; it's just normal text.
Any thoughts?

Expanding the HTML parsing

I am going to expand the HTML parsing capabilities of sablon in my fork and can merge the changes back into your repo if desired. Expanding the HTML capability allows me to leverage the power of the native ERB templating engine instead of trying to reinvent the wheel and port those same features into sablon directly.

Relevant issue in my repo: stadelmanma#9

Checklist added on 07/25/17

  • Implement support for the <span> tag
  • Implement support for a limited set of attributes specified by the style= attribute
    • Do this in a flexible way so it can be relatively easy to add additional styles in the future
    • Allow it to function on both the run and paragraph levels
  • Implement support for the <sup> and <sub> tags
  • Implement support for Hyperlinks via <a> tags
    • I think this will first require a refactor of how we handle a document template. Currently template files are read and written sequentially, instead we would really need more a "live DOM" where all aspects of a word doc are available since hyperlinks are stored in the document.xml.rels file. A "live DOM" approach would also benefit our existing logic in the numbering process and significantly simplify the logic required to support images (see #54).
    • See #83 for an implementation without the refactor mentioned above
  • Implement support for parsing of HTML <table> tags
    • I ran into some issues with this on my fork, where it seemed like not all content was being properly generated inside a table. Lists are the only case that comes to mind initially, but I didn't test images/figures.
    • I implemented the <caption> tag differently on my fork but perhaps I should do it according to the HTML5 spec here using similar logic, although implementing support for caption-side might be hard and I don't know if I can/want to add a table reference like in my fork. MDN, W3 Schools
    • Relevant branches on my fork: support-table-tag, add-th-element
  • Allow inline HTML insertion that only replaces the merge field itself, not the entire run
    • This would only be allowed if the HTML content contained only inline tags (e.g. w:r, w:hyperlink).
    • I'm not sure how best to assert that only inline tags were used. Probably a whitelist based on the allowed child nodes of a w:p tag as defined in the OOXML spec.
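The whitelist idea from the last item could look roughly like this. It is a sketch under assumptions: the constant and method names are hypothetical, and the whitelist holds only the two tag names mentioned above, whereas a complete list would have to come from the OOXML spec.

```ruby
require "set"

# Illustrative whitelist only: these two names come from the discussion
# above; a complete list would have to be taken from the OOXML spec.
INLINE_TAGS = Set.new(%w[w:r w:hyperlink]).freeze

# Returns true when every top-level node name is on the whitelist.
def inline_only?(node_names)
  node_names.all? { |name| INLINE_TAGS.include?(name) }
end

puts inline_only?(%w[w:r w:hyperlink w:r]) # only inline nodes
puts inline_only?(%w[w:r w:p])             # contains a paragraph node
```

With a check like this, the merge field alone would be replaced only when the check passes; otherwise the existing whole-run replacement would stay in effect.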

provide else statement for conditions

I'd like to have an else statement to provide alternate content when a condition is considered falsy, e.g:

«foo.bar:if»
FooBar!
«foo.bar:else»
42
«foo.bar:endIf»
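The requested behaviour can be sketched with a tiny stand-alone renderer. Everything here is hypothetical: it works on an array of plain strings rather than a docx, handles a single non-nested block, and looks up the dotted expression in nested hashes.

```ruby
# Hypothetical mini-renderer for one flat (non-nested) if/else/endIf block.
# Each line is either a field marker or literal content.
def render_conditional(lines, context)
  keep = true
  lines.each_with_object([]) do |line, output|
    case line
    when /\A«([\w.]+):if»\z/
      # Walk the dotted path through nested hashes, e.g. foo.bar
      keys = $1.split(".").map(&:to_sym)
      keep = context.dig(*keys) ? true : false
    when /\A«[\w.]+:else»\z/
      keep = !keep # switch to the alternate branch
    when /\A«[\w.]+:endIf»\z/
      keep = true  # block finished, keep everything after it
    else
      output << line if keep
    end
  end
end

template = ["«foo.bar:if»", "FooBar!", "«foo.bar:else»", "42", "«foo.bar:endIf»"]
puts render_conditional(template, { foo: { bar: true } }).inspect
puts render_conditional(template, { foo: { bar: false } }).inspect
```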

Numbering continues from previous list

First of all, thanks for this awesome gem.
I came across this issue when generating a document today.

Sablon Template

«array_of_item:each(item)»
   «=item.text»
   «item.list:each(list_item)»
        1. «=list_item»
   «item.list:endEach»

«array_of_item:endEach» 

Context

array_of_item =
[ { text: 'List of colours', list: ['Red', 'Orange'] },
  { text: 'List of shapes', list: ['Square', 'Rectangle'] } ]

Output Doc

List of colours
   1. Red
   2. Orange

List of shapes
   3. Square
   4. Rectangle

As you can see in the output document, the numbering in the second list continues from the previous.
I understand that this might be an issue with how word handles numbering, rather than Sablon.
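One way to picture the behaviour is a simplified model, assuming Word keeps one running counter per numbering instance (the w:numId a paragraph points at). The method and data below are hypothetical and model nothing else about Word:

```ruby
# Simplified model, not Word's real renderer: every paragraph carries a
# numId, and the visible number is a running count per numId.
def visible_numbers(paragraphs)
  counters = Hash.new(0)
  paragraphs.map do |num_id, text|
    counters[num_id] += 1
    "#{counters[num_id]}. #{text}"
  end
end

# Both loop iterations reuse the numbering instance from the template.
shared = [[1, "Red"], [1, "Orange"], [1, "Square"], [1, "Rectangle"]]
# Each loop iteration gets its own numbering instance.
separate = [[1, "Red"], [1, "Orange"], [2, "Square"], [2, "Rectangle"]]

puts visible_numbers(shared).inspect
puts visible_numbers(separate).inspect
```

Under that assumption, restarting the list would require each loop iteration to point at a different numbering instance.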

Question/Idea: Implications of Numbering being a singleton in multi-user rails app

Could there be any issues with the above class being a singleton if, say, two users generated a docx file at the same time? It seems like the registered definitions would get fouled up since, if I am thinking correctly, the Rails app would only ever have one instance of Numbering available to work with.

    def reset!
      @numid = 1000
      @definitions = []
    end

    def register(style)
      @numid += 1
      definition = Definition.new(@numid, style)
      @definitions << definition
      definition
    end

Could it potentially be more robust to register a numbering instance on the Context object being passed around and use that instead? You should be able to maintain the same expected behavior if Context.transform is moved out of Document.process and into the template.rb file (likely in the render method).

  • Convert Context from module into class
  • Register a local Processor.Numbering instance to the Context instance
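The two steps above can be sketched as follows. This is a stand-alone illustration: Numbering repeats the reset!/register logic quoted earlier with a hypothetical Definition struct, and RenderContext is a made-up name standing in for a class-based Context.

```ruby
# Sketch only: Numbering mirrors the snippet above; RenderContext is a
# hypothetical class standing in for a class-based Context.
class Numbering
  Definition = Struct.new(:numid, :style)

  attr_reader :definitions

  def initialize
    reset!
  end

  def reset!
    @numid = 1000
    @definitions = []
  end

  def register(style)
    @numid += 1
    definition = Definition.new(@numid, style)
    @definitions << definition
    definition
  end
end

class RenderContext
  attr_reader :data, :numbering

  def initialize(data)
    @data = data
    @numbering = Numbering.new # one instance per render, not a global
  end
end

first = RenderContext.new({ title: "A" })
second = RenderContext.new({ title: "B" })
first.numbering.register("ListNumber")
second.numbering.register("ListBullet")
puts first.numbering.definitions.map(&:style).inspect
puts second.numbering.definitions.map(&:style).inspect
```

Because each render owns its Numbering, two renders running at the same time never see each other's definitions.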

Word to PDF

This is a little off topic, but did you convert the word document into PDF? If so, any library you recommend?

Thanks for a great gem!

Image Support

Hey, this looks amazing! Great job.

Have you looked into Image support? I am looking at using this library and would really want image support. If you don't have anything in the works I can take a stab at it. :)

Is this gem working?

I tried using this gem, but it looks like it is not working. I tried a sample template docx file (Dear «=title») and passed the context = { title: 'sablon' }.

src = Rails.root.join('public','sample.docx')
template = Sablon.template(File.expand_path(src))
context = {title: "Fabulous Document"}
dst = Rails.root.join('public', 'report.docx')
template.render_to_file(File.expand_path(dst), context)
File.open(dst, 'r') do |f|
   send_data f.read.force_encoding('BINARY'), :filename => 'report.docx', :type => "application/docx"
end

Not working on linux

Hi,

I've just tried running sablon on a Linux server, but the output document is empty; on OSX it works. "Sadly", the tests seem to run without errors, except for one skip:

/usr/local/rvm/rubies/ruby-2.1.0/bin/ruby -I"lib:test" -I"/usr/local/rvm/gems/ruby-2.1.0/gems/rake-10.4.2/lib" "/usr/local/rvm/gems/ruby-2.1.0/gems/rake-10.4.2/lib/rake/rake_test_loader.rb" "test/content_test.rb" "test/context_test.rb" "test/executable_test.rb" "test/expression_test.rb" "test/mail_merge_parser_test.rb" "test/processor_test.rb" "test/sablon_test.rb" "test/section_properties_test.rb" 
Run options: --seed 24583

# Running:

.....................................S........

Finished in 2.269466s, 20.2691 runs/s, 24.6754 assertions/s.

46 runs, 56 assertions, 0 failures, 0 errors, 1 skips

You have skipped tests. Run with --verbose for details.

The verbose command doesn't seem to work; I'll try again later. I realize it's hard to work out the problem, but do you have any pointers on where to look?

ruby 2.1.0
Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u1 x86_64 GNU/Linux
nokogiri-1.6.6.2
2.1.0 :004 > Nokogiri::VersionInfo.instance.loaded_parser_version
=> "2.9.2"
2.1.0 :005 > Nokogiri::VersionInfo.instance.compiled_parser_version
=> "2.9.2"

Nested lists support

I would like to try adding support for nested lists, if it's not impossible... any help on how to start? :)

Thanks

Merged field text not being replaced

I ran across an issue and am not sure what is wrong. I have a merge field in a title; the string I want is inserted, but the merge field text is not entirely removed.

The code is below, and the two files it references, template.docx and output.docx, are attached.

template = Sablon.template("template.docx")

context = {
  system_name: "Jeff's Wonderful System"
}

template.render_to_file "output.docx", context

about comment tag

In document.rb, line 174, when /comment/ starts a comment block whenever a key contains "comment". In my opinion, this should be changed to use a unique key as the trigger, since matching any key that contains "comment" is too general. In my project, one of my keys is exactly comment. I renamed it to a_comment, but that still triggers the comment.
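The difference between a loose and an anchored pattern can be sketched as follows. The constant and method names are hypothetical; only the /comment/ pattern is taken from the description above.

```ruby
LOOSE_COMMENT  = /comment/      # matches any expression containing "comment"
STRICT_COMMENT = /\Acomment\z/  # matches only the exact expression "comment"

# Returns true when the expression would start a comment block.
def comment_start?(expression, pattern)
  pattern.match?(expression)
end

%w[comment a_comment comment_count title].each do |expression|
  puts "#{expression}: loose=#{comment_start?(expression, LOOSE_COMMENT)} " \
       "strict=#{comment_start?(expression, STRICT_COMMENT)}"
end
```

With the anchored pattern, a key such as a_comment no longer counts as the start of a comment block.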

Test code not working, OpenOffice jumbled

I've tried using Sablon in my own code with the same code used in the test example and the provided docx template, however, none of the contents seem to be replaced by sablon. Here is the code:

template = Sablon.template(File.expand_path(Rails.root.join('doc', 'translation-template.docx')))
person = OpenStruct.new "first_name" => "Ronald", "last_name" => "Anderson", "address" => {"street" => "Panda Bay 4A"}
item = Struct.new(:index, :label, :rating)
position = Struct.new(:duration, :label, :description)
language = Struct.new(:name, :skill)
context = {
  current_time: Time.now.strftime("%d.%m.%Y %H:%M"),
  author: "Yves Senn",
  title: "Letter of application",
  person: person,
  about_me: "asdf",
  items: [item.new("1.", "Ruby", "★" * 5), item.new("2.", "Java", "★" * 1), item.new("3.", "Python", "★" * 3)],
  career: [position.new("1999 - 2006", "Junior Java Engineer", "Lorem ipsum dolor\nsit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat."),
           position.new("2006 - 2013", "Senior Ruby Developer", "Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo."),
           position.new("2013 - today", "Sales...", nil)],
  technologies: ["HTML", "CSS", "SASS", "LESS", "JavaScript"],
  languages: [language.new("German", "native speaker"), language.new("English", "fluent")],
  training: "At vero eos et accusam et justo duo dolores et ea rebum.\n\nStet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet."
}
template.render_to_file File.expand_path("~/Desktop/output.docx"), context

All I've changed is the filename and the about_me content, and I removed the page properties. In addition, when I open the output document in OpenOffice 4.0.1, the document is all over the place; see the picture of the first page.

Specs:

OSX 10.9.5
ruby-2.1.0, rails 3.2.21
added sablon through gemfile like this
gem 'sablon', :git => "https://github.com/senny/sablon.git"
doing it without git yielded the same results

[Image: screen shot 2015-04-09 at 18 15 02]

Dropping the Gemfile.lock from version control

@senny I was going through and updating our dependencies, and that got me thinking: since this is a gem, should we drop the Gemfile.lock from version control?

According to these links we shouldn't have a Gemfile.lock for ruby gems:
http://yehudakatz.com/2010/12/16/clarifying-the-roles-of-the-gemspec-and-gemfile/
https://stackoverflow.com/questions/4151495/should-gemfile-lock-be-included-in-gitignore

Their logic makes sense, let me know what you think.

ArgumentError (wrong number of arguments (1 for 0)):

I don't know if I'm making a mistake, but after copying and pasting the code from the instructions, I get this error:
ArgumentError (wrong number of arguments (1 for 0)):

my gem file:
http://pastebin.com/jJ92wkjH

require "sablon"
template = Sablon.template(File.expand_path("~/Desktop/template-a.docx"))
context = {
title: "Fabulous Document",
technologies: ["Ruby", "Markdown", "ODF"]
}

error on next line

template.render_to_file File.expand_path("~/Desktop/output.docx"), context

Support for Non-mailmerge fields?

So, as we investigate using this library, we are finding that creating mail merge fields is a PITA. Dang Microsoft.

I imagine our customers editing the text of the merge fields rather than the merge field, which would result in XML like:

            <w:r>
              <w:instrText xml:space="preserve">MERGEFIELD =email.address \* MERGEFORMAT </w:instrText>
            </w:r>
            <w:r>
              <w:fldChar w:fldCharType="separate"/>
            </w:r>
            <w:r>
              <w:rPr>
                <w:noProof/>
              </w:rPr>
              <w:t>«=email.type»</w:t>
            </w:r>
            <w:r>
              <w:fldChar w:fldCharType="end"/>
            </w:r>

Here you can see that the merge field specified is email.address, but the one visible in the Word document is email.type. I am wondering whether it would be a good idea to allow "fake" merge fields, e.g. {{email.type}} or <<email.type>>.

Thoughts?
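As a rough sketch of the idea, plain-text placeholders could be filled with a regular expression. This works on an ordinary string, not on a docx file, and the constant and method names are hypothetical; the {{...}} syntax is the one proposed above.

```ruby
# Hypothetical placeholder syntax from the proposal: {{path.to.value}}
PLACEHOLDER = /\{\{\s*([\w.]+)\s*\}\}/

# Replaces each placeholder with the value found by walking the dotted path
# through nested hashes; unknown paths become an empty string.
def fill_placeholders(text, context)
  text.gsub(PLACEHOLDER) do
    keys = Regexp.last_match(1).split(".").map(&:to_sym)
    context.dig(*keys).to_s
  end
end

context = { email: { address: "jane@example.com", type: "work" } }
puts fill_placeholders("Contact: {{email.address}} ({{ email.type }})", context)
```

In a real document the placeholder text may be split across several runs, which a plain string substitution like this does not handle.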

notation «=content» not working

I tried using cv_template.docx, and the document was generated successfully.
When I create a template myself containing only «=content»,
the value from the context variable is not rendered.

Then I tried copying and pasting «=title» (with "keep source formatting") from cv_template.docx,
and it displays the value from the context variable. Is this a formatting issue? When I type «=title» manually, it doesn't work.

PS: I use Microsoft Office Word for Mac 2011.

single paragraph block inside table cell can result in corrupted markup

If a paragraph block (loop or conditional) is the only content inside a table cell, the output document can get corrupted. This happens only if the block is removed and not replaced (a loop over [] or a non-matching conditional).

Word is still able to render the document but the user is presented with some technical warning dialogs.

The reason is that every table cell needs to contain a paragraph. If the block is removed and not replaced, the cell will be empty.

Is the equal sign before merge fields necessary?

Hi!

We're integrating sablon into our application, but we're facing an issue with the typical Word merge fields workflow. The idea is to follow an approach similar to the one explained in this video: https://www.youtube.com/watch?v=G-pZKyG373s

So, we want to:

  • Create an Excel document with a single row that includes all the merge fields available in our app.

  • Use that Excel document as a template for mail merge, so users can compose a template easily with Word's utility for inserting merge fields, which simplifies the interaction explained in TEMPLATE.md.

The problem is that sablon requires an equal sign before the variable name for simple expressions, and Word doesn't allow special characters in the list of available fields. If you add an equal sign before the field in the Excel template, Word simply ignores it.

I've been reviewing sablon source code, to try to understand why the equal sign is required before the variables, and as I understand, it's only to differentiate these expressions from complex expressions, as you can see in sablon/lib/sablon/processor/document.rb.

def consume(allow_insertion)
  @field = @fields.shift
  return unless @field
  case @field.expression
  when /^=/
    if allow_insertion
      Statement::Insertion.new(Expression.parse(@field.expression[1..-1]), @field)
    end
  when /([^ ]+):each\(([^ ]+)\)/
    block = consume_block("#{$1}:endEach")
    Statement::Loop.new(Expression.parse($1), $2, block)
  when /([^ ]+):if\(([^)]+)\)/
    block = consume_block("#{$1}:endIf")
    Statement::Condition.new(Expression.parse($1), block, $2)
  when /([^ ]+):if/
    block = consume_block("#{$1}:endIf")
    Statement::Condition.new(Expression.parse($1), block)
  when /^comment$/
    block = consume_block("endComment")
    Statement::Comment.new(block)
  end
end

I was thinking that maybe it's possible to refactor that code to something like this:

def consume(allow_insertion)
  @field = @fields.shift
  return unless @field
  case @field.expression
  when /([^ ]+):each\(([^ ]+)\)/
    block = consume_block("#{$1}:endEach")
    Statement::Loop.new(Expression.parse($1), $2, block)
  when /([^ ]+):if\(([^)]+)\)/
    block = consume_block("#{$1}:endIf")
    Statement::Condition.new(Expression.parse($1), block, $2)
  when /([^ ]+):if/
    block = consume_block("#{$1}:endIf")
    Statement::Condition.new(Expression.parse($1), block)
  when /comment/
    block = consume_block("endComment")
    Statement::Comment.new(block)
  else
    if allow_insertion
      Statement::Insertion.new(Expression.parse(@field.expression), @field)
    end
  end
end

I've done a quick test and it seems to work, but I haven't analyzed the library in enough depth to determine whether it could cause side effects (apart from the obvious one, the loss of backward compatibility for previously created templates). What do you think? Would you consider merging it if I send a PR?

Thanks for your work!
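The branch order of the proposed refactor can be explored with a simplified classifier. This is a sketch only: the classify method is hypothetical and returns a symbol instead of building statements, but its patterns are copied from the proposal above.

```ruby
# Simplified classifier mirroring the order of the when branches in the
# proposed refactor; it returns a symbol instead of building statements.
def classify(expression)
  case expression
  when /([^ ]+):each\(([^ ]+)\)/ then :loop
  when /([^ ]+):if\(([^)]+)\)/   then :condition_with_predicate
  when /([^ ]+):if/              then :condition
  when /comment/                 then :comment
  else                                :insertion # no leading "=" required
  end
end

%w[title items:each(item) user.name:if(present?) user.active:if comment comment_text].each do |expression|
  puts "#{expression} -> #{classify(expression)}"
end
```

One thing the sketch makes visible: because the proposal uses the unanchored /comment/ pattern, a plain field such as comment_text is classified as a comment rather than an insertion.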
