
docx's Introduction

What is DocX?

DocX is a .NET library that allows developers to manipulate Word 2007/2010/2013 files, in an easy and intuitive manner. DocX is fast, lightweight and best of all it does not require Microsoft Word or Office to be installed.

NOTE: There is a new Master branch as of Oct. 3, 2017. Please read about the Classic branch if you were using this project before the change.

DocX is the free, open source version of Xceed Words for .NET. Originally written by Cathal Coffey, and later maintained by Przemyslaw Klys, it is now maintained by Xceed. Starting at v1.5.0, this free and open source product is provided under the Xceed Community License agreement (for non-commercial use).

Currently, Xceed Words for .NET differs from DocX in that it:

  • can convert a Word document to PDF
  • adds properties to wrap text around Pictures/Tables/Shapes
  • adds Picture cropping
  • adds Shapes (rectangles for now)
  • adds TextBoxes or Shapes containing Text
  • gets Shapes from Paragraphs
  • gets Charts from Paragraphs and can modify their categories/values
  • adds more Chart configuration properties, such as axis label position and series width
  • is at least two versions ahead of the DocX version
  • has professional technical support included in the subscription
  • can automatically update fields in a document
  • can insert HTML/RTF text (with tags), or an HTML/RTF document, into a Word document
  • can clone lists or tables
  • can add or modify checkboxes
  • can set transparency in pictures
  • can create formatted hyperlinks based on a referenced hyperlink
  • lets you choose the headers/footers of doc1, doc2 or both in the resulting document when joining 2 documents
  • supports automatic and configurable hyphenation
  • can add digital signatures to documents in the .NET Framework environment
  • can add footnotes and endnotes
  • provides ListOptions for List level configurations
  • can modify a Chart's Series Marker and DataPoint styles
  • can insert a document at a specific point in another document
  • can wrap text around Charts
  • can format a Chart Axis' title

What else do I need?

All that you need to install in order to use DocX is the .NET Framework 4.0 or .NET 5+ and Visual Studio 2010 or later, both of which are free.

What are the main features of DocX?

| Edition | DocX | Xceed Words for .NET |
| --- | --- | --- |
| Price | Free | $599.95 |
| License | Xceed Community License | Proprietary |
| Email support | | YES |
| Create new Word documents | YES | YES |
| Modify Word documents | YES | YES |
| Create new PDF documents | | YES |
| Convert Word to PDF | | YES |
| Supports .DOCX from Word 2007 and up | YES | YES |
| Modify multiple documents in parallel for better performance | YES | YES |
| Apply a template to a Word document | YES | YES |
| Join documents, recreate portions from one to another | YES | YES |
| Supports document protection with or without password | YES | YES |
| Set document margins and page size | YES | YES |
| Set line spacing, indentation, text direction, text alignment | YES | YES |
| Wrap text around pictures | | YES |
| Pictures with cropping | | YES |
| Manage fonts and font sizes | YES | YES |
| Set text color, bold, underline, italic, strikethrough, highlighting | YES | YES |
| Set page numbering | YES | YES |
| Create sections | YES | YES |
| Available on .NET 5 for .NET 5/6 applications | YES | YES |
| Update document fields (ex: a table of contents) by calling only one method | | YES |
| Wrap text around tables | | YES |
| Wrap text around shapes | | YES |
| Create shapes (rectangles for now) | | YES |
| Create textboxes or shapes containing text | | YES |
| Get shapes from paragraphs | | YES |
| Get charts from paragraphs and modify their categories/values | | YES |
| Update document fields with 1 method call | | YES |
| Insert html/rtf text (with tags), or html/rtf document, to a Word document | | YES |
| Clone lists or tables | | YES |
| Add or modify checkboxes | | YES |
| Set transparency in pictures | | YES |
| Create formatted hyperlinks based on a referenced hyperlink | | YES |
| Joining 2 documents gives the opportunity to choose which headers/footers to use | | YES |
| More properties to configure Charts | | YES |
| Automatic hyphenation and configurable hyphenation | | YES |
| Digital signatures in .NET Framework | | YES |
| Add footnotes and endnotes | | YES |
| ListOptions for List level configurations | | YES |
| Modify Chart's Series Marker and DataPoint styles | | YES |
| Insert a document at a specific point in another document | | YES |
| Wrap text around Charts | | YES |
| Format Charts Axis' title | | YES |
| Get release ahead | | YES |

Supported Word document elements

  • Add headers or footers which can be the same on all pages, or unique for the first page, or unique for odd/even pages. Can contain images, hyperlinks and more.
  • Insert/Modify paragraphs.
  • Insert/Modify numbered or bulleted lists.
  • Insert/Modify images. Flip, rotate, copy, modify, resize.
  • Insert/Modify tables. Insert/Remove rows, columns, change direction, column width, row height, borders, merge/delete cells.
  • Insert/Modify formatted equations or formulas.
  • Insert/Modify bookmarks.
  • Insert/Modify hyperlinks.
  • Insert/Modify horizontal lines.
  • Insert/Modify charts (bar, line, pie, 3D chart). Set colors, titles, legend, etc.
  • Find, remove or replace text. Supports case sensitivity and regular expressions.
  • Insert/Modify core or custom properties, such as author, address, subject, title, etc.
  • Insert a Table Of Contents. Set title, change formatting.

Why would I use DocX?

DocX makes creating and manipulating documents a simple task. It does not use COM libraries nor does it require Microsoft Office to be installed.

The following blog post from Cathal Coffey compares the code used to create a HelloWorld document using:

  1. Office Interop libraries,
  2. OOXML SDK,
  3. DocX

Advanced Examples

  1. Step by step guide to create an invoice for a company. http://cathalscorner.blogspot.com/2009/08/docx-v1007-released.html
  2. Replace text across many documents in Parallel. http://cathalscorner.blogspot.com/2010/12/replace-text-across-many-documents-in.html
  3. Programmatically manipulate an Image embedded inside a document. http://cathalscorner.blogspot.com/2010/12/programmatically-manipulate-image.html
  4. Converting DocX into (.doc, .pdf, .html) http://cathalscorner.blogspot.com/2009/10/converting-docx-into-doc-pdf-html.html

Do you have an interesting or informative example that you would like to share? If you do, please email me.

License Information

DocX is provided under the Xceed Software, Inc. Community License.

More information can be found in the License page.

A commercial license can be purchased at Xceed.


docx's People

Contributors

chrischip, cponty, cradaydg, delphin0850, dianexceed, ermoll, fitdev, grabzit, henrys22, jafin, janbernloehr, lahma, lucwuyts, martin-schiel, mdum, michalazza, oromand, przemyslawklys, twotsman, victorloktev, vzhikserg, xceedbouchers, xceedsoftware, xwgli, zambiorix

docx's Issues

Just need some help.

I am using DocX and it is a great library, very powerful. I generate dynamic docs with it and it has worked great.

Now I have a block of HTML that I want to add to an existing doc while keeping all the formatting, such as b, br, font size, h1, etc. How can I do that with DocX, or do I need to use some other library?
Thanks

Find the Range of particular Text and Mark that text as hyperlink

Hi Team,

Below mentioned text will be present in our TEST.docx file.

sample example paragraph:
To make your document look professionally produced, Word provides header, footer, cover page, and text box designs that complement each other.
For example, you can add a matching cover page, header, and sidebar. Click Insert and then choose the elements you want from the different galleries.

We want to implement the following requirements using the DocX API:

  1. Load the "TEST.docx" file using the DocX API.
  2. Identify the text "Click Insert" in the document and find its start and end index.
  3. Mark "Click Insert" as a hyperlink element and save the document.

Please let us know if there is any way to achieve this using the DocX API.

Regards,
Arun

.Append() with formatting

Is there a way to append to a paragraph with custom formatting, like with the Formatting class?

Example:
p.Append("string", formatting)

ApplyTemplate replaces all content in a document

Not sure if I am missing something, but when I call DocX.ApplyTemplate("template.dotx", true), everything in the document gets replaced with the content from the template, regardless of whether I set includeContent to true or false. Isn't it supposed to apply styles only and, if content is to be included, append it rather than replace the existing content?

Main document difference between Sections and GetSections

I have little knowledge of the OOXML specs, and don't understand why there is a difference between the Sections property of a document and the GetSections() method.

Sections returns 1 section less than GetSections().
I think it is the last section of the document that is missing.

Is this supposed to work this way?
And if so: why?

Table.SetWidthsPercentage - wrong units?

This function calculates the column widths in points and assigns the results to Cell.Width. However, Cell.Width uses pixels, not points. The widths need to be multiplied by 96 and divided by 72 before calling Table.SetWidths inside this function.

Note - I am using the 1.0.0.22 tag and manually added Table.SetWidthsPercentage, so maybe this is not an issue in master branch.
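
The conversion described above can be sketched as follows. This is only an illustration, assuming 96 pixels per inch and 72 points per inch; the UnitConversion class and the PointsToPixels name are hypothetical and not part of DocX.

```csharp
using System;

// Hypothetical helper, not part of DocX.
public static class UnitConversion
{
    // Assumption: widths are rendered at 96 pixels per inch.
    private const double PixelsPerInch = 96.0;

    // One inch is 72 points.
    private const double PointsPerInch = 72.0;

    // Converts a length in points to pixels (multiply by 96, divide by 72).
    public static double PointsToPixels(double points)
    {
        return points * PixelsPerInch / PointsPerInch;
    }
}
```

Each width computed in points would go through PointsToPixels before being handed to the width setter.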

Alignment change on import of document

On import of a .doc/.docx file, the text alignment of the entire document is changed to Justified.

  • The entire imported document is left aligned.
  • I tried setting each paragraph as left aligned, with no effect

Strings are reversed

The following code generates a document in which the strings are reversed...
using the latest NuGet - version DocX.1.0.0.19

  • reversing strings does not help entirely
  • applying styles such as bold/etc is harder...
public static void DocumentHeading()
{
    Console.WriteLine("\tDocumentHeading()");
    using (DocX document = DocX.Create(@"C:\TEMP\DocumentHeading.docx"))
    {
        foreach (HeadingType heading in (HeadingType[])Enum.GetValues(typeof(HeadingType)))
        {
            string text = string.Format("{0} - The quick brown fox jumps over the lazy dog", heading.EnumDescription());

            Paragraph p = document.InsertParagraph();
            p.AppendLine(text).Heading(heading);
        }

        document.Save();
        Console.WriteLine("\tCreated: docs\\DocumentHeading.docx\n");
    }
}

Images not rendering

Here is my code:
Image img = doc.AddImage(@".\images\test.jpg");
Picture pic = img.CreatePicture();
pic.Height = 100;
pic.Width = 200;
Paragraph p = doc.InsertParagraph("", false);
pic.SetPictureShape(RectangleShapes.rect);
p.InsertPicture(pic);

doc.SaveAs($"{path}\\src\\newDoc.docx");

When executed, the document is saved and the image is placed at the bottom, but it is a blank image.

New to github

I've added a collection for Content Controls and a couple of methods for replacing text in them. It's on my fork, but I'm very new to GitHub and I'm unsure what to do from here. Help, please?

Page breaks and layout size

I wish to add a page break in my document, and to achieve that I'm using the "InsertSectionPageBreak" method.
That actually breaks the flow to the next page, but the page layout size of the new page isn't the same as the first one. In other words, the very first page is an A4-size layout, and if I do NOT add page breaks, every further page follows that size. If I add a page break on the first page, the 2nd and the 3rd pages are Letter-sized, whereas the 4th is A4 again.
Tested with the NuGet release 1.0.0.19.
Any clue?
Very nice job, BTW!

Can't find bookmarks at header or footer

When there are bookmarks in the header or footer and I use document.Bookmarks["bookmk"].SetText("ABC") to set the text at bookmark "bookmk", a NullReferenceException occurs. The Bookmarks collection doesn't contain the bookmarks in the header and footer, only those in the main content.

DocX can't create docx under debian 7

I am trying to run DocX with Mono under Debian 7. I'm using the git build. Building with xbuild is successful and the Hello World example succeeds, but the resulting document cannot be opened.
Running Examples.exe creates docx files, but they can't be opened either.
Some docx files fail with:

Unhandled Exception:
System.NullReferenceException: Object reference not set to an instance of an object
at Novacode.DocX.PopulateDocument (Novacode.DocX document, System.IO.Packaging.Package package) <0xb44bd2e8 + 0x007c7> in :0
at Novacode.DocX.PostLoad (System.IO.Packaging.Package& package) <0xb44cf650 + 0x00267> in :0
at Novacode.DocX.Load (System.String filename) <0xb38f4558 + 0x0018f> in :0
at Examples.Program.CreateInvoice () <0xb36f3bd0 + 0x00033> in :0
at Examples.Program.CreateInvoice () <0xb36f3bd0 + 0x000d7> in :0
at Examples.Program.Main (System.String[] args) <0xb743ced0 + 0x000d3> in :0
[ERROR] FATAL UNHANDLED EXCEPTION: System.NullReferenceException: Object reference not set to an instance of an object
at Novacode.DocX.PopulateDocument (Novacode.DocX document, System.IO.Packaging.Package package) <0xb44bd2e8 + 0x007c7> in :0
at Novacode.DocX.PostLoad (System.IO.Packaging.Package& package) <0xb44cf650 + 0x00267> in :0
at Novacode.DocX.Load (System.String filename) <0xb38f4558 + 0x0018f> in :0
at Examples.Program.CreateInvoice () <0xb36f3bd0 + 0x00033> in :0
at Examples.Program.CreateInvoice () <0xb36f3bd0 + 0x000d7> in :0
at Examples.Program.Main (System.String[] args) <0xb743ced0 + 0x000d3> in :0
fuchsia@debian:~/docx_test/Examples/bin/Release$

Leveled spacing to TOC.

How to add leveled spacing to TOC?
Headings with style Heading1 and Heading2 are being displayed at same level.


Text is reversed | mirrored

When creating a document without readOnly protection (with readOnly protection it works fine), all text in the document is mirrored.
For example:

Start Date: 
07/08/2016 16:31:50

Becomes:
(screenshot of the mirrored text, not preserved)

By the way, adding read only protection, and then removing before the save, doesn't help.

DocX version: 1.0.0.23

.Net Core

Is there support for .NET Core?

My solution runs on Unix.

Potential bug in HelperFunctions::IsSameFile()

On line 641 of internal static bool IsSameFile(Stream streamOne, Stream streamTwo)

            if (streamOne.Length != streamOne.Length)
            {
                // Return false to indicate files are different
                return false;
            }

Is there any reason to check if the length of the same stream does not match? Should not one of those be streamTwo?
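
For illustration, a corrected comparison could look like the sketch below. It is not the DocX implementation: the StreamComparison class and HaveSameContent name are hypothetical, and the sketch assumes both streams are seekable. The point is only that the length check compares streamOne with streamTwo.

```csharp
using System;
using System.IO;

// Hypothetical helper, not the DocX implementation of IsSameFile.
public static class StreamComparison
{
    // Returns true when both seekable streams hold identical bytes.
    public static bool HaveSameContent(Stream streamOne, Stream streamTwo)
    {
        if (streamOne == null) throw new ArgumentNullException(nameof(streamOne));
        if (streamTwo == null) throw new ArgumentNullException(nameof(streamTwo));

        // Compare the lengths of the two different streams.
        if (streamOne.Length != streamTwo.Length)
        {
            // Different lengths mean different content.
            return false;
        }

        // Rewind both streams so the comparison starts at the first byte.
        streamOne.Position = 0;
        streamTwo.Position = 0;

        int byteOne;
        do
        {
            byteOne = streamOne.ReadByte();
            int byteTwo = streamTwo.ReadByte();
            if (byteOne != byteTwo)
            {
                return false;
            }
        } while (byteOne != -1); // ReadByte returns -1 at end of stream

        return true;
    }
}
```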

Missing license info

Hello,

I could not find the license information for this project on GitHub. However, Codeplex states that it's Ms-PL (https://docx.codeplex.com/license). Is this information current and correct? Could you please include it on project's README.md? (I can create a pull request for that).

Thanks

Stacked bar chart.

Using the BarChart() function from the example and just changing the chart grouping to 'Stacked':
// Create chart.
BarChart c = new BarChart();
c.BarDirection = BarDirection.Column;
c.BarGrouping = BarGrouping.Stacked;
c.GapWidth = 400;
c.AddLegend(ChartLegendPosition.Bottom, false);

The multiple series are not set in same line.
Am I missing something?

Footnote editing

While it is possible to get a list of the footnotes, I am not able to add, remove or edit footnotes.

Any plans on adding that?

Provide a copy constructor for DocX

doc.Copy(): create a new DocX object from an existing one.

Can be done pretty easily by copying the underlying document content via memory stream.

Something like this:

DocX doc;
var stream= new MemoryStream();

/* some code to populate the document */

doc.SaveAs(stream);
stream.Seek(0, SeekOrigin.Begin);

DocX doc2= DocX.Load(stream);

https://docx.codeplex.com/workitem/13710

Paragraph.RemoveText calls CreateEdit when trackChanges == false

While using DocX in a plugin for CRM Online, I came across an error because I don't have privilege to get the current user. This shouldn't be an issue if track changes is off (and isn't an issue for Paragraph.InsertText as it checks trackChanges before calling Paragraph.CreateEdit). I'm thinking this may be a bug (not quite bug, but unnecessary), and that Paragraph.RemoveText should be similar to Paragraph.InsertText for calling CreateEdit. Avoiding this call should also avoid the call to get current user.

Paragraph.RemoveText lines of code: 3710 and 3748
Paragraph.InsertText lines of code: 2099 and 2119

DocX doesn't read out the "app.xml"

Unfortunately, DocX currently doesn't read out the "app.xml" file that does exist for a lot of *.docx files in the "/docProps" subfolder (side-by-side with "core.xml", which is being read out).

I would really love to have access to the information like count of words, count of characters etc. that is stored in the "app.xml" - it seems to me, it would be almost the same code as for the "CoreProperties" property on the "DocX" object. Can't be that hard to do! :-) Any hope that'll show up soon?

Iterate words in a document and edit if necessary

With DocX, is there an easy way to loop through all the words in a document and alter them if necessary? Here is the equivalent Interop code, but it requires Office and is very slow:

var words = App.ActiveDocument.Words;
for (int i = 1; i <= words.Count; i++)
{
    var word = words[i];
    var text = word.Text;

    if (text.Equals("\r"))
    {
        continue;
    }

    // This method is some kind of spellchecker. Returns the fixed version of a word.
    var deas = Deasciifier.Deasciify(text);

    if (deas.Equals(text))
    {
        continue;
    }

    word.Select();
    App.Selection.TypeText(deas);
}

HelperFunctions.FormatInput doubles new lines

If one adds a multi-line string using Paragraph.Append, AppendLine, InsertText, ReplaceText, ..., two new lines are actually added to the document for each new line in the string.

The reason is how strings are formatted in HelperFunctions.FormatInput. Environment.NewLine (on Windows) corresponds to '\r\n'. Before 1.0.0.20, only '\n' was treated as a special character adding a new line. However, in 24dd8ee, '\r' was also given special treatment to issue a new line. Thus a standard Environment.NewLine = '\r\n' now causes two line breaks to be added to the document.

A workaround is to use Replace("\r\n", "\n") on a given string before passing it to Append, AppendLine, InsertText, ReplaceText, ... of a Paragraph, however this appears a little awkward.
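
The workaround can be wrapped in a small helper, sketched below. The NewLineNormalizer class and ToLineFeeds name are hypothetical and not part of DocX. Going beyond the workaround as stated, the sketch also maps a lone '\r' to '\n', on the assumption that every line ending should produce exactly one break.

```csharp
using System;

// Hypothetical helper, not part of DocX.
public static class NewLineNormalizer
{
    // Collapses "\r\n" and lone "\r" line endings to a single "\n".
    public static string ToLineFeeds(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        // Replace the two-character sequence first so it yields one "\n",
        // then convert any remaining lone carriage returns.
        return text.Replace("\r\n", "\n").Replace("\r", "\n");
    }
}
```

The normalized string would then be passed to Append, AppendLine, InsertText or ReplaceText in place of the original.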

Setting IndentationBefore and IndentationAfter makes an error when decimalsign is not a dot

I live in Belgium, where the decimal sign is a comma.
The setter for IndentationBefore and IndentationAfter produces an invalid XML string (Word 2016 has a problem reading it; LibreOffice 5 reads it correctly).

This is the xml generated:

<w:ind w:left="706,800005435944" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" />

Solution:
Change lines 785 and 728 from this:
string indentation = ((indentationBefore / 0.1) * 57).ToString();

to this:
string indentation = ((indentationBefore / 0.1) * 57).ToString(CultureInfo.GetCultureInfo("en-GB"));
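
As an alternative to naming a specific culture such as en-GB, the value could be formatted with CultureInfo.InvariantCulture, which always uses '.' as the decimal separator. The sketch below only illustrates that idea; the IndentationFormatting class and ToXmlNumber name are hypothetical and not part of DocX.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of DocX.
public static class IndentationFormatting
{
    // Formats a numeric attribute value with a '.' decimal separator,
    // regardless of the culture of the current thread.
    public static string ToXmlNumber(double value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }
}
```

With such a helper, the expression from the issue would become IndentationFormatting.ToXmlNumber((indentationBefore / 0.1) * 57).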

FontSize in Paragraph.cs does not allow for half-sizes (ex. 7.5)

The FontSize() method in Paragraph.cs does not allow for half sizes, and an exception is thrown erroneously. I believe this is because of the if logic in line 2969 if (fontSize - (int)fontSize == 0).

This should be a quick fix and I'll submit a pull request shortly. Adding to issues list for documentation.
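
One possible shape for such a check is sketched below. It is not the submitted fix: the FontSizeValidation class and its method names are hypothetical. It relies on the fact that WordprocessingML stores font sizes in half-points, so a valid size is one whose doubled value is a whole number.

```csharp
using System;

// Hypothetical helper, not part of DocX.
public static class FontSizeValidation
{
    // True when the size is a whole or half value (e.g. 7, 7.5, 32).
    public static bool IsWholeOrHalf(double fontSize)
    {
        double doubled = fontSize * 2;
        return doubled == Math.Floor(doubled);
    }

    // Converts a point size to the half-point units used in the document XML.
    public static int ToHalfPoints(double fontSize)
    {
        if (!IsWholeOrHalf(fontSize))
        {
            throw new ArgumentException("Value must be a whole or half number, e.g. 32 or 32.5.", nameof(fontSize));
        }

        return (int)(fontSize * 2);
    }
}
```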

Table of Contents

How can I add a table of contents at the beginning of the document? Every time I run it, it gets added to the end.


var p = this.wordDocument.InsertParagraph();
var ToC = this.wordDocument.InsertTableOfContents(this.Title, this.TableSwitch,);
var paragraphT = this.wordDocument.Paragraphs[0].InsertParagraphBeforeSelf(p);

Wiki?

Documentation for this project is spread across many old blog posts in Cathal's Corner.
Wouldn't it be better to start centralizing documentation here in the wiki?

Automatically number headings like standard button does

I'd like the headings to be numbered automatically, just like in Word 2010 when you press the "List with many levels" button and select the option that numbers the headings (the standard Word functionality). Is there a way to accomplish this using DocX?

DocX in China

Looks like there is no support for the Chinese character set:

public Paragraph Font(FontFamily fontFamily)
{
    ApplyTextFormattingProperty
    (
        XName.Get("rFonts", DocX.w.NamespaceName),
        string.Empty,
        new[]
        {
            new XAttribute(XName.Get("ascii", DocX.w.NamespaceName), fontFamily.Name),
            new XAttribute(XName.Get("hAnsi", DocX.w.NamespaceName), fontFamily.Name), // Added by Maurits Elbers to support non-standard characters. See http://docx.codeplex.com/Thread/View.aspx?ThreadId=70097&ANCHOR#Post453865
            new XAttribute(XName.Get("cs", DocX.w.NamespaceName), fontFamily.Name),    // Added by Maurits Elbers to support non-standard characters. See http://docx.codeplex.com/Thread/View.aspx?ThreadId=70097&ANCHOR#Post453865
        }
    );

    return this;
}

Adding support should be as simple as:

new XAttribute(XName.Get("eastAsia", DocX.w.NamespaceName), fontFamily.Name)

Script property not parsed

This code should be added to the Parse function in Formatting.cs:
case "vertAlign":
    var script = option.GetAttribute(XName.Get("val", DocX.w.NamespaceName), null);
    formatting.Script = (Script)Enum.Parse(typeof(Script), script);
    break;

Memory Leaks

It looks like after creating a docx file, memory is not being released.

This is how we use DocX:

using (_doc = DocX.Create(_fileName))
{
    _doc.AddFooters();
    _doc.PageLayout.Orientation = Novacode.Orientation.Landscape;
    _doc.AddProtection(EditRestrictions.readOnly);

    ...
    ...

    _doc.Footers.even.PageNumbers = true;
    _doc.Footers.odd.PageNumbers = true;

    _doc.Save();
}

Set Table width w.r.t. percentage

I have a tested method for this.

/// <summary>
/// Set Table column width by prescribing percent
/// </summary>
/// <param name="widthsPercentage">column width % list</param>
/// <param name="totalWidth">Total table width. Will be calculated if null sent.</param>
public void SetWidthsPercentage(float[] widthsPercentage, float? totalWidth)
{
    if (totalWidth == null)
        totalWidth = this.Document.PageWidth - this.Document.MarginLeft - this.Document.MarginRight; // calculate total table width

    List<float> widths = new List<float>(widthsPercentage.Length); // empty list, will hold actual width
    widthsPercentage.ToList().ForEach(pWidth => { widths.Add(pWidth * totalWidth.Value / 100); }); // convert percentage to actual width for all values in array
    SetWidths(widths.ToArray()); // set actual column width
}

Can be added to Table class

No Images in Footer?

Hi

I need the ability to replace images in the whole document, but I found that the Images property is missing from the footer. It exists in the header and main body, but not in the footer. The footer only has Pictures, but the Picture class hides the actual Image object, so users cannot replace the content.
I think the best way would be to make Image a public property of the Picture class.
