htmltoword's Introduction

Ruby Html to word Gem

Note: This repository is no longer maintained. Please see https://github.com/karnov/htmltoword instead.

This simple gem allows you to create MS Word docx documents from simple html documents. This makes it easy to create dynamic reports and forms that can be downloaded by your users as simple MS Word docx files.

Add this line to your application's Gemfile:

gem 'htmltoword'

And then execute:

$ bundle

Or install it yourself as:

$ gem install htmltoword

Usage

Standalone

Using the default word file as template

require 'htmltoword'

my_html = '<html><head></head><body><p>Hello</p></body></html>'
file_name = 'hello.docx' # example value; use any output file name
file = Htmltoword::Document.create my_html, file_name

Using your custom word file as a template, where you can set up your own styles for normal text, h1, h2, etc.

require 'htmltoword'

# Configure the location of your custom templates
Htmltoword.config.custom_templates_path = 'some_path'

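my_html = '<html><head></head><body><p>Hello</p></body></html>'
file_name = 'hello.docx'                      # example value
word_template_file_name = 'my_template.docx'  # example value; a file under custom_templates_path
file = Htmltoword::Document.create my_html, file_name, word_template_file_name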

With Rails

For htmltoword version >= 0.2, an Action Controller renderer is defined, so there is no need to declare the mime-type; you can simply respond to the .docx format. Rails will then look for views with the extension .docx.erb, which provide the HTML that is rendered into the Word file.

# On your controller.
respond_to :docx

# filename and word_template are optional. By default the file is named after your action and the default template provided by the gem is used. The .docx extension in filename and word_template is optional.
def my_action
  # ...
  respond_with(@object, filename: 'my_file.docx', word_template: 'my_template.docx')
  # Alternatively, if you don't want to create the .docx.erb template, you can pass the HTML directly:
  respond_with(@object, content: '<html><body>some html</body></html>', filename: 'my_file.docx')
end

def my_action2
  # ...
  respond_to do |format|
    format.docx do
      render docx: 'my_view', filename: 'my_file.docx'
      # Alternatively, if you don't want to create the .docx.erb template, you can pass the HTML directly:
      render docx: 'my_file.docx', content: '<html><body>some html</body></html>'
    end
  end
end

Example of my_view.docx.erb

<h1> My custom template </h1>
<%= render partial: 'my_partial', collection: @objects, as: :item %>

Example of _my_partial.docx.erb

<h3><%= item.title %></h3>
<p> My html for item <%= item.id %> goes here </p>

For htmltoword version <= 0.1.8

# Add mime-type in /config/initializers/mime_types.rb:
Mime::Type.register "application/vnd.openxmlformats-officedocument.wordprocessingml.document", :docx

# Add docx responder in your controller
def show
  respond_to do |format|
    format.docx do
      file = Htmltoword::Document.create params[:docx_html_source], "file_name.docx"
      send_file file.path, :disposition => "attachment"
    end
  end
end

// OPTIONAL: Use a jQuery click handler to store the markup in a hidden form field before the form is submitted.
// This strategy makes it easy to let users dynamically edit the document that will be turned
// into a docx file, for example by toggling sections of the document.
$('#download-as-docx').on('click', function () {
  $('input[name="docx_html_source"]').val('<!DOCTYPE html>\n' + $('.delivery').html());
});

Configure templates and xslt paths

From version 2.0 you can configure the location of the default and custom templates and xslt files. By default, templates are located under lib/htmltoword/templates and xslt files under lib/htmltoword/xslt.

Htmltoword.configure do |config|
  config.custom_templates_path = 'path_for_custom_templates'
  # If you modify this path, there should be a 'default.docx' file in there
  config.default_templates_path = 'path_for_default_template'
  # If you modify this path, there should be a 'html_to_wordml.xslt' file in there
  config.default_xslt_path = 'some_path'
  # The use of additional custom xslt will come soon
  config.custom_xslt_path = 'some_path'
end

Features

All standard html elements are supported and will create the closest equivalent in wordml. For example, spans create inline elements and divs create block-like elements.

Highlighting text

You can highlight text by wrapping it in a span with the class h and adding a data-style attribute with a color that wordml supports (http://www.schemacentral.com/sc/ooxml/t-w_ST_HighlightColor.html), e.g.:

<span class="h" data-style="green">This text will have a green highlight</span>
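
When these spans are generated from Ruby, a small helper can reject colors that wordml does not define. This is only a sketch: highlight_span and HIGHLIGHT_COLORS are hypothetical names, not part of the gem, and the list is copied from the ST_HighlightColor enumeration linked above.

```ruby
require 'cgi'

# Values of the WordprocessingML ST_HighlightColor enumeration.
HIGHLIGHT_COLORS = %w[
  black blue cyan green magenta red yellow white
  darkBlue darkCyan darkGreen darkMagenta darkRed darkYellow
  darkGray lightGray none
].freeze

# Hypothetical helper (not part of htmltoword): wraps text in the
# <span class="h" data-style="..."> markup described above.
def highlight_span(text, color)
  unless HIGHLIGHT_COLORS.include?(color)
    raise ArgumentError, "unsupported highlight color: #{color}"
  end

  %(<span class="h" data-style="#{color}">#{CGI.escapeHTML(text)}</span>)
end

puts highlight_span('This text will have a green highlight', 'green')
```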

Page breaks

To create a page break, simply add a div with the class -page-break, e.g.:

<div class="-page-break"></div>
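
If the HTML is assembled from several fragments, the marker can be inserted between them in one place. A minimal sketch, assuming each fragment should start on its own page; PAGE_BREAK and join_with_page_breaks are hypothetical names, not part of the gem.

```ruby
# Marker div that is turned into a Word page break (see above).
PAGE_BREAK = '<div class="-page-break"></div>'

# Hypothetical helper: joins HTML fragments so that a page break
# separates each fragment from the next one.
def join_with_page_breaks(sections)
  sections.join(PAGE_BREAK)
end

puts join_with_page_breaks(['<h1>Part one</h1>', '<h1>Part two</h1>'])
```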

Contributing / Extending

Word docx files are essentially just a zipped collection of xml files and resources. This gem contains a standard empty MS Word docx file and a stylesheet to transform arbitrary html into wordml. The basic functioning of this gem can be summarised as:

  1. Transform the input html to wordml.
  2. Unzip empty word docx file bundled with gem and replace its document.xml content with the new transformed result of step 1.
  3. Zip up contents again into a resulting .docx file.
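
Step 1 can be pictured with a toy transform. The sketch below is not the gem's implementation (the gem uses an XSLT stylesheet); it relies on REXML as shipped with Ruby, handles only p elements in well-formed markup, and paragraphs_to_wordml is a hypothetical name. Steps 2 and 3 need a zip library and are left out.

```ruby
require 'rexml/document'

# Main WordprocessingML namespace used in document.xml.
W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'

# Toy stand-in for step 1: turns each <p> of a well-formed html string
# into a <w:p><w:r><w:t> paragraph inside a <w:body> element.
def paragraphs_to_wordml(html)
  source = REXML::Document.new(html)
  body = REXML::Element.new('w:body')
  body.add_namespace('w', W_NS)

  REXML::XPath.each(source, '//p') do |para|
    text_node = body.add_element('w:p').add_element('w:r').add_element('w:t')
    text_node.text = para.text # first text child only; nested markup is ignored
  end

  body.to_s
end

puts paragraphs_to_wordml('<html><body><p>Hello</p></body></html>')
```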

For more info about WordML: http://rep.oio.dk/microsoft.com/officeschemas/wordprocessingml_article.htm

Contributions would be very much appreciated.

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

License

(The MIT License)

Copyright © 2013:

  • Cristina Matonte

  • Nicholas Frandsen

htmltoword's People

Contributors

anitsirc, jeppeliisberg, jharbert, mdh, nickfrandsen, tobiashm

htmltoword's Issues

Ability to set line spacing

This is a feature request to add the ability to set line spacing on a body or paragraph level (similar to #19)

Line spacing is set at the paragraph level: a line spacing of 1 corresponds to the value 240, 1.5 to 360, and 2 to 480.

The following paragraph property seems to work:

<w:pPr>
  <w:spacing w:line="480" />
</w:pPr>
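
The arithmetic behind those values can be written as a small helper. This is a sketch only; line_spacing_ppr is a hypothetical name and nothing like it exists in the gem today.

```ruby
# 240 units of w:line correspond to single line spacing (values above).
SINGLE_LINE = 240

# Hypothetical helper: converts a spacing multiplier (1, 1.5, 2, ...)
# into the paragraph property fragment shown above.
def line_spacing_ppr(multiplier)
  value = (multiplier * SINGLE_LINE).round
  %(<w:pPr><w:spacing w:line="#{value}" /></w:pPr>)
end

puts line_spacing_ppr(1.5)
```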

missing default template/xslt files in the gem install

I'm using RVM with Ruby 2.2.0 and when I try to use the gem I get path issues along the following lines:

Zip::Error (File /Users/jon/.rvm/gems/ruby-2.2.0@custom-gemset/gems/htmltoword-0.2.0/lib/htmltoword/templates/default.docx not found):
  app/controllers/welcome_controller.rb:6:in `block in index'
  app/controllers/welcome_controller.rb:5:in `index'

Looking in the gemset folder for the gem files shows they are not there either.

Setting fonts and sizes

This is a feature request for the ability to set a font and font size for a document, paragraph, or both.

Looking into the WordprocessingML doc, it appears you can either explicitly list the font and size on a run every time, or create a document style and apply that style to the run.

Perhaps we could apply a font-size and font-family style to the body tag and have that body style propagate to all runs under it, or apply a style to a paragraph and have the run use that paragraph's style.

Something like this appears to work:

    <w:p>
      <w:r>
        <w:rPr>
          <w:sz w:val="50"/>
          <w:rFonts w:ascii="Times New Roman" />
        </w:rPr>
        <w:t xml:space="preserve">This paragraph is Times New Roman at 25pt</w:t>
      </w:r>
    </w:p>
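
Note that w:sz is expressed in half-points, which is why 50 gives 25pt. A minimal sketch of generating the run properties from a font name and a point size; run_properties is a hypothetical name, not something the gem provides.

```ruby
require 'cgi'

# Hypothetical helper: builds a <w:rPr> fragment for a font and size.
# w:sz is measured in half-points, so 25pt becomes 50.
def run_properties(font:, points:)
  half_points = (points * 2).round
  %(<w:rPr><w:rFonts w:ascii="#{CGI.escapeHTML(font)}"/><w:sz w:val="#{half_points}"/></w:rPr>)
end

puts run_properties(font: 'Times New Roman', points: 25)
```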

undefined method `gsub' for nil:NilClass

In my action when I call

respond_to do |format|
  format.docx do
    file = Htmltoword::Document.create params[:docx_html_source], "file_name.docx"
    send_file file.path, :disposition => "attachment"
  end
end

I get an error

undefined method `gsub' for nil:NilClass

on

file = Htmltoword::Document.create params[:docx_html_source], "file_name.docx"

line

I use version 0.1.8 of htmltoword on Rails 4.1.1.

compilation error: element html xsltParseStylesheetProcess : document is not a stylesheet

I am probably missing something obvious, but I have not yet succeeded in generating a docx file with this gem.
I'm running Rails 4.2.0 with Ruby 2.2.0.
I've added the gem to my Gemfile and run bundle install.
I noticed that the template and xslt files were not downloaded, so I downloaded them manually and saved them in the lib directory.

I've tried changing the configuration in order to point to the correct lib folder, both in the gem and in my Rails project.

The mentioned error message appears here in my controller:

  def download
    respond_with(@case, content: '<html><body>some html</body></html>')
  end

Any help is greatly appreciated!

Images

Is it possible to support inserting image tags?

Support for Ruby 2.1.2

After I bundle install the gem and load the library, this is what happens:

[1] pry(main)> require 'htmltoword'
NameError: constant Logger::Format not defined
from /srv/scholar/rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/actionpack-1.4.0/lib/action_controller/support/clean_logger.rb:5:in `remove_const'

ruby version: 2.1.2

understanding html layout

I'm having trouble applying a fixed width to a set of divs. When viewed in the Word file, the width does not seem to be applied.

`

Label 1

Label 2
`

Also, would you mind clarifying what params[:docx_html_source] is meant to do?

Thank you!

Centering things

How do I center things when exporting? Is there documentation, or a way to do this and other things found in Word?

Request: debug mode

I'm getting a lot of line breaks in my document. I'm not entirely sure what's being fed into the converter. WickedPDF can enable debug mode with a URL parameter. I'd like to do the same with htmltoword. Something like this:

http://domain.com/controller/export.docx?debug=true

If I have the time, I might be able to contribute this but time is short these days.

Error while starting rails server after adding htmltoword gem

I added gem 'htmltoword', '0.2.0' to the Gemfile of a new Rails 4.2.1 app, but when I start the Rails server an error appears.

/home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/htmltoword-0.2.0/lib/htmltoword/action_controller.rb:41:in `': uninitialized constant ActionController::Responder (NameError)
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/htmltoword-0.2.0/lib/htmltoword.rb:26:in `require'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/htmltoword-0.2.0/lib/htmltoword.rb:26:in `'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/bundler-1.9.5/lib/bundler/runtime.rb:76:in `require'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/bundler-1.9.5/lib/bundler/runtime.rb:76:in `block (2 levels) in require'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/bundler-1.9.5/lib/bundler/runtime.rb:72:in `each'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/bundler-1.9.5/lib/bundler/runtime.rb:72:in `block in require'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/bundler-1.9.5/lib/bundler/runtime.rb:61:in `each'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/bundler-1.9.5/lib/bundler/runtime.rb:61:in `require'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/bundler-1.9.5/lib/bundler.rb:134:in `require'
        from /vagrant/HtmlWordConverter/config/application.rb:7:in `'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/railties-4.2.1/lib/rails/commands/commands_tasks.rb:78:in `require'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/railties-4.2.1/lib/rails/commands/commands_tasks.rb:78:in `block in server'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/railties-4.2.1/lib/rails/commands/commands_tasks.rb:75:in `tap'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/railties-4.2.1/lib/rails/commands/commands_tasks.rb:75:in `server'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/railties-4.2.1/lib/rails/commands/commands_tasks.rb:39:in `run_command!'
        from /home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/railties-4.2.1/lib/rails/commands.rb:17:in `'
        from bin/rails:4:in `require'
        from bin/rails:4:in `'

After removing the gem, the Rails server starts normally. What is the problem? Which versions of Ruby/Rails are supported? I tried Ruby versions 2.0, 2.1.5, and 2.2.2 with Rails 4.2.1. The same problem persists.

NameError: uninitialized constant Zip::File

Hello Guys,

I got a strange error with rubyzip:

file = Htmltoword::Document.create my_html, file_name
NameError: uninitialized constant Zip::File
from /home/..../gems/htmltoword-0.1.8/lib/htmltoword/document.rb:46:in `block in save'

Thanks for any help.

Images support

Hi !

I'd like to know if images can be supported by your gem. I suppose not, because it didn't work for me and I can't see any reference to images or <img> tags in your source code (I only searched for it briefly).

If it is not supported yet, I'd be happy to try to implement it if you can guide me, since I don't know XSLT.

But I imagine some kind of HTML pre-processing to download images and reference them directly from inside the generated .docx "zip" file may do the trick - or maybe not ?

Thanks for your work though, it works nicely for the rest of the document, and a ruby solution is really appreciated here.

Link support

First - thanks for making this thing, it's a big challenge.

I'm wondering about link support.

<a href="#asdf">news</a>

That's just outputting line breaks for the opening and closing A tags.

uninitialized constant Mime::DOCX

With htmltoword version >= 0.2 there's no need to declare the mime-type, but I'm using htmltoword ver 0.5.1. I can't fix this "uninitialized constant Mime::DOCX" error. I need help.

White space being added before and after element

I created a Word document using version 1.8 of htmltoword. It was interpreted fine in OpenOffice, but when I tried to open it with a copy of Microsoft Word, the software reported that the file was corrupted. I upgraded to version 2.0, copied the xslt to what I believe was the correct place, and regenerated the document. This document could be opened in Word, but a lot of preserve-whitespace xml nodes were added, creating undesirable formatting. Is there something I'm missing here? In the end I removed everything complex and tried to render just a paragraph, but excessive white space still appeared above and below the paragraph in the produced Word document. I can attempt to write another xslt, but I'm wondering if there's anything obvious that I've missed.
