shinjukunian / docx
Convert NSAttributedString / AttributedString to .docx Word files on iOS and macOS
License: MIT License
Hello @shinjukunian,
Thank you for this beautiful implementation. I tried to convert an NSMutableAttributedString that contains two other attributed strings, but the resulting .docx document only contains the first attributed string.
This is what I did:
let rootAttributedString = NSMutableAttributedString()
rootAttributedString.append(NSAttributedString(string: "blah blah blah 1 ... but more text"))
rootAttributedString.append(NSAttributedString(string: "blah blah blah 2 ... more text here also"))
and then
try rootAttributedString.writeDocX(to: myURL)
but the output was a .docx file containing only "blah blah blah 1 ... but more text".
Any ideas how to fix this?
B. Regards,
John
/Users/christopherriner/Downloads/DocX-master/DocX/NSAttributedString+DocX.swift:26:35: Type 'Bundle' has no member 'module'
When I create a docx, I’d like for any images to fit on a single page. I realize I can resize the image on my end, but that isn’t ideal as I’d like the original image to remain untouched.
I’ve attached two docx files that illustrate the issue:
Under-the-Wave-Created-DocX.docx
Under-the-Wave-Created-Word.docx
The first docx file was created using DocX, and you can see that my image, since it is large, is pushed to the next page and also clipped. When I add the original image into Word, you can see that it is resized to fit the page. (Word also – annoyingly – resizes the original image. However, when I unzip the docx, replace the resized image with my original image, and then re-create the docx file, Word still displays the image correctly.)
I’ve only looked quickly, but it seems to me that this is controlled by the wp:extent element. In the DocX package, it looks like the extent is determined by the size of the image. This seems like a great fallback. It would be nice, though, if the image was scaled to fit when the image’s width/height is greater than the page’s usable width/height (i.e. not including margins).
Bonus suggestion #1: It would also be nice if there were a way to set the image’s display width more directly. Perhaps NSTextAttachment could be extended to have a “docxExportWidthFraction” attribute. If that were set to “0.5”, then the display width would be set to 50% of the page’s usable width.
Bonus suggestion #2: When the “docxExportWidthFraction” is set, it might be nice to center the image horizontally. (Though I wouldn’t complain if images were just always centered, rather than left-aligned.)
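The scale-to-fit idea above can be sketched as follows. This is only an illustration under stated assumptions, not DocX's implementation: PointSize, fittedSize and extentAttributes are hypothetical names, and widthFraction stands in for the proposed "docxExportWidthFraction" attribute. The only fixed fact used is that the cx and cy attributes of wp:extent are expressed in EMUs (914,400 per inch, so 12,700 per point).

```swift
import Foundation

/// English Metric Units per typographic point (914,400 EMU per inch / 72 points per inch).
let emuPerPoint = 12_700.0

/// Hypothetical size type in points, used so the sketch does not depend on UIKit/AppKit.
struct PointSize: Equatable {
    var width: Double
    var height: Double
}

/// Hypothetical helper: returns the display size for an image so that it never
/// exceeds the usable page area (page size minus margins), preserving aspect ratio.
/// `widthFraction` mirrors the proposed "docxExportWidthFraction" attribute.
func fittedSize(image: PointSize, usable: PointSize, widthFraction: Double? = nil) -> PointSize {
    guard image.width > 0, image.height > 0 else { return image }
    // Without a fraction, keep the natural size (scale 1.0) unless it is too large.
    let requested = widthFraction.map { usable.width * $0 / image.width } ?? 1.0
    let scale = min(requested, usable.width / image.width, usable.height / image.height)
    return PointSize(width: image.width * scale, height: image.height * scale)
}

/// Converts a size in points to the integer cx/cy values used by wp:extent.
func extentAttributes(for size: PointSize) -> (cx: Int, cy: Int) {
    (cx: Int((size.width * emuPerPoint).rounded()),
     cy: Int((size.height * emuPerPoint).rounded()))
}

// Usage: a 1000 x 500 pt image on US Letter with 1-inch margins (468 x 648 pt usable).
let usableArea = PointSize(width: 468, height: 648)
let fitted = fittedSize(image: PointSize(width: 1000, height: 500), usable: usableArea)
let extent = extentAttributes(for: fitted)
print("cx=\(extent.cx) cy=\(extent.cy)")
```

Because only the extent changes, the original image data would stay untouched in the media folder, which matches the behaviour observed with Word above.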
I think I probably broke this, but I can no longer build and run tests in the master branch. (I usually test DocX in the context of my own application, so it’s rare that I build this way these days).
Anyhow, it looks like this is because DocXStyleConfiguration.swift and styles.xml haven’t been added to the project file.
I have a fix.
Thanks very much for the help closing my issue. I tried it and it works now. I was wondering if I could get your opinion on what I am trying to do. I am looking for a way to generate a Word document in code. Users enter a large amount of data, which I store in Core Data. When they tap a button, my app should take all of that data, plus some images, and generate a Word document that they can export from the app to OneDrive and edit further on a laptop. They do not need to edit anything directly in the app; they just enter the data and the app generates the document. The document would contain text, pictures, and tables, and would be very large. Do you think your library would be helpful for this?
I am working on an application that can import a docx file. On import, text that uses particular Word Styles will be imported in specific ways. For instance, text that uses the "Heading 1” style will be imported as a title.
I’d like to be able to export a docx file that looks exactly the same when it is re-imported into my application. To do this perfectly, I need the exported docx to have styles applied to some text.
I imagine that writeDocX() could take an optional dictionary of Style Information, which would map an identifier (e.g. “Heading1Identifier”) to a dictionary of style information (e.g. [“name”: “Heading 1”, “kind”: “paragraphStyle”, “bold”: “true”]). Then, an attributed string could have an attribute (e.g. NSAttributedString.Key.docXStyleIdentifier) that contained the style identifier value (e.g. “Heading1Identifier”). On export, text with that attribute would have the appropriate style applied.
This may be outside the scope of this project, but I figured it couldn’t hurt to submit it.
When you open a DocX-written file in Word, you'll see "Compatibility Mode" in the title bar. And, if you use File > Save As to create a new version of your file, Word will warn you that "you are about to update the file format, which might result in layout changes."
Compatibility Mode is a file format that allows these docx files to be opened in Word 2010 or earlier. However, it also means that some post-2010 Word features can't be used (unless, of course, you update the file).
Seems like maintaining compatibility with Word 2010 isn't necessary, and it would be preferable for docx files to appear like other "modern" docx files when opened in Word.
I have a fix.
let image1Attachment = NSTextAttachment()
image1Attachment.bounds = CGRect(x: 0, y: 0, width: 40 , height: 300)
image1Attachment.image = image
// wrap the attachment in its own attributed string so we can append it
let image1String = NSAttributedString(attachment: image1Attachment)
// add the NSTextAttachment wrapper to our full string, then add some more text.
fullString.append(image1String)
I have set the image's display bounds, but in the generated docx the image is displayed very large, extends beyond the page width, and only half of it is shown.
When DocX writes images to the media folder during docx creation, it always uses the “png” extension, regardless of the format of that image.
For instance, if I initialize an NSTextAttachment with a file wrapper that points to a JPEG image (e.g. “image.jpg”), then DocX will export it using a name like “rId3.png”. However, the format for that image is still a JPEG:
$ file -I rId3.png
rId3.png: image/jpeg; charset=binary
Instead, it should use an extension that matches the format.
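One way to pick a matching extension is to inspect the leading bytes of the image data. The following is a minimal sketch, assuming the image is available as Data; imageFileExtension is a hypothetical helper and not part of DocX, and only three common formats are covered.

```swift
import Foundation

/// Hypothetical helper: guesses a file extension from the image's magic bytes.
/// Returns nil when the format is not recognised, so the caller can decide on a fallback.
func imageFileExtension(for data: Data) -> String? {
    let bytes = [UInt8](data.prefix(8))
    // PNG signature: 89 50 4E 47 0D 0A 1A 0A
    if bytes.starts(with: [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]) { return "png" }
    // JPEG files start with the SOI marker FF D8 followed by another FF marker byte.
    if bytes.starts(with: [0xFF, 0xD8, 0xFF]) { return "jpeg" }
    // GIF files start with the ASCII text "GIF8" (GIF87a or GIF89a).
    if bytes.starts(with: [0x47, 0x49, 0x46, 0x38]) { return "gif" }
    return nil
}

// Usage: a buffer that begins like a JPEG gets a JPEG extension.
let jpegHeader = Data([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])
print(imageFileExtension(for: jpegHeader) ?? "unknown")
```

Any extension written to the media folder also needs a matching content type entry in the package's [Content_Types].xml, otherwise Word may refuse to open the file.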
Can we support CocoaPods? Thank you very much!
When executing main.swift
import DocX
import SwiftUI
let string = NSAttributedString(string: "This is a string", attributes: [:])
let url = URL(fileURLWithPath:"/Users/lilun/Downloads")
try? string.writeDocX(to: url)
Thread 1:"-[NSConcreteAttributedString writeDocXTo:error:] unrecognized selector sent to instance 0x600000204f60"
2023-05-24 08:12:28.904180+0800 AddDocx[7329:105841] -[NSConcreteAttributedString writeDocXTo:error:]: unrecognized selector sent to instance 0x600000204f60
2023-05-24 08:12:28.904517+0800 AddDocx[7329:105841] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[NSConcreteAttributedString writeDocXTo:error:]: unrecognized selector sent to instance 0x600000204f60'
*** First throw call stack:
(
0 CoreFoundation 0x000000019599f154 __exceptionPreprocess + 176
1 libobjc.A.dylib 0x00000001954be4d4 objc_exception_throw + 60
2 CoreFoundation 0x0000000195a46110 -[NSObject(NSObject) __retain_OA] + 0
3 CoreFoundation 0x00000001959070a0 forwarding + 1600
4 CoreFoundation 0x00000001959069a0 _CF_forwarding_prep_0 + 96
5 AddDocx 0x0000000100005028 main + 604
6 dyld 0x00000001954eff28 start + 2236
)
libc++abi: terminating due to uncaught exception of type NSException
(lldb)
Where is the problem? I am a beginner and my fundamentals are not yet solid, so sorry for the trouble. Thank you.
If you have an attributed string and apply NSAttributedString.Key.underlineStyle to it, the generated docx will not include that underline. This is because DocX requires that foregroundColor must be set, otherwise the <w:u> element won't be output.
Since the w:color value is optional in the docx, I don't think this should be required. I have a fix ready that makes the color argument Optional in the underlineElement function.
That said, I'm not sure using the foregroundColor is correct either, since NSAttributedString.Key.underlineColor exists for the purpose. If existing code relies on this behavior, though, we could have DocX check for the underlineColor and, if that isn't found, then use the foregroundColor if it exists. Thoughts?
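The fallback order proposed above (underlineColor first, then foregroundColor, then no color at all) could look roughly like this. It is a sketch only: underlineXML is a hypothetical function that builds the markup as a plain string with colors passed as hex strings, whereas DocX's own underlineElement function is not shown in this issue and may work differently.

```swift
/// Hypothetical sketch: builds a <w:u> element, treating the color as optional.
/// Colors are passed as RRGGBB hex strings to keep the example free of UIKit/AppKit types.
func underlineXML(style: String = "single",
                  underlineColorHex: String?,
                  foregroundColorHex: String?) -> String {
    // Prefer the explicit underline color, fall back to the text color.
    if let color = underlineColorHex ?? foregroundColorHex {
        return "<w:u w:val=\"\(style)\" w:color=\"\(color)\"/>"
    }
    // w:color is optional, so the underline is still written without it.
    return "<w:u w:val=\"\(style)\"/>"
}

// Usage: no colors set, the underline is still emitted.
print(underlineXML(underlineColorHex: nil, foregroundColorHex: nil))
```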
try DocXWriter.write(pages:[NSAttributedString], to url:URL)
I’m using DocXWriter.write(pages:to:options:) to write an array of attributed strings with page breaks in between, and noticed that a page break isn’t always inserted. In particular, when a string ends with multiple empty paragraphs in a row, the page break isn’t inserted.
If I trim the attributed strings so that they don’t end with any newline characters, everything behaves as expected. This is a fine workaround for me, but I still think this is a confusing bug.
You can repro by adding two empty lines to the end of the “string” in the testMultiPage() or testMultiPageWriter() tests. With that change, the generated docx file won’t contain any page breaks. I believe this is due to the “early out” in ParagraphElement’s buildRuns():
guard subString.length>0 else{return [AEXMLElement]()}
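Until the underlying bug is fixed, the trimming workaround described above can be sketched like this. trimmingTrailingNewlines is a hypothetical helper, not part of DocX; it keeps all attributes of the remaining text and only drops newline characters at the very end of each page string.

```swift
import Foundation

/// Hypothetical helper: returns a copy of the attributed string without trailing newlines.
func trimmingTrailingNewlines(_ input: NSAttributedString) -> NSAttributedString {
    let text = NSString(string: input.string)
    var end = text.length
    // Walk backwards over UTF-16 units while they are newline characters.
    while end > 0,
          let scalar = Unicode.Scalar(text.character(at: end - 1)),
          CharacterSet.newlines.contains(scalar) {
        end -= 1
    }
    return input.attributedSubstring(from: NSRange(location: 0, length: end))
}

// Usage: trim each page before handing the array to DocXWriter.write(pages:to:).
let page = NSAttributedString(string: "Last paragraph of the page\n\n\n")
print(trimmingTrailingNewlines(page).string)
```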
This is easy to repro using these simple tests:
func testAmpersand() {
    let string = "Key & Peele"
    let attributedString = NSAttributedString(string: string)
    testWriteDocX(attributedString: attributedString)
}
func testLessThan() {
    let string = "0 < 1"
    let attributedString = NSAttributedString(string: string)
    testWriteDocX(attributedString: attributedString)
}
If you then run these tests, they will fail with: [DocXTests.DocXTests testAmpersand] : failed - The file…couldn’t be opened because it isn’t in the correct format.
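The failure is consistent with "&" and "<" being written into the XML unescaped, which makes the document invalid. A minimal sketch of the escaping that text content and attribute values need is shown below. xmlEscaped is a hypothetical helper; the issue does not say how DocX actually addresses this, so treat it as an illustration of the technique rather than the library's fix.

```swift
/// Hypothetical helper: escapes the five characters that are reserved in XML
/// text and attribute values.
func xmlEscaped(_ text: String) -> String {
    var result = ""
    result.reserveCapacity(text.utf8.count)
    for character in text {
        switch character {
        case "&": result += "&amp;"
        case "<": result += "&lt;"
        case ">": result += "&gt;"
        case "\"": result += "&quot;"
        case "'": result += "&apos;"
        default: result.append(character)
        }
    }
    return result
}

// Usage: the two strings from the tests above.
print(xmlEscaped("Key & Peele"))
print(xmlEscaped("0 < 1"))
```

The same escaping applies to URLs stored in attributes, for example a hyperlink target whose query string contains an ampersand.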
Is it possible to add author info?
When trying to replace an existing docx file with one written by DocX, I get the following error:
“[filename].zip” couldn’t be copied to “Desktop” because an item with the same name already exists.
To save the file, either provide a different name, or move aside or delete the existing file, and try again.
The reason is that writeDocX_builtin() uses FileManager.default.copyItem() rather than replaceItemAt(). The writing function is all set up to use replaceItemAt() since it already creates the temp folder in a location that’s appropriate for the final URL.
(I will submit a pull request for this shortly)
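The difference between the two calls can be sketched as follows. This is not the code from the pull request, which is not shown here; installFile is a hypothetical helper that replaces an existing file via FileManager's replaceItemAt(_:withItemAt:) and falls back to a plain move when nothing exists at the destination yet.

```swift
import Foundation

/// Hypothetical helper: puts the freshly written temporary file at the destination,
/// replacing an existing file instead of failing with "item already exists".
func installFile(from temporaryURL: URL, to destinationURL: URL) throws {
    let fileManager = FileManager.default
    if fileManager.fileExists(atPath: destinationURL.path) {
        // Replaces the item at the destination with the temporary item.
        _ = try fileManager.replaceItemAt(destinationURL, withItemAt: temporaryURL)
    } else {
        // Nothing to replace, so a simple move is enough.
        try fileManager.moveItem(at: temporaryURL, to: destinationURL)
    }
}

// Usage: overwrite an existing file in a scratch directory.
let scratch = FileManager.default.temporaryDirectory.appendingPathComponent(UUID().uuidString)
try FileManager.default.createDirectory(at: scratch, withIntermediateDirectories: true)
let existing = scratch.appendingPathComponent("document.docx")
let fresh = scratch.appendingPathComponent("document-new.docx")
try "old contents".write(to: existing, atomically: true, encoding: .utf8)
try "new contents".write(to: fresh, atomically: true, encoding: .utf8)
try installFile(from: fresh, to: existing)
print(try String(contentsOf: existing, encoding: .utf8))
```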
If you have a link where the underlying URL includes an ampersand (e.g. "https://example.com/?kw=1&kw=2"), then Word won't be able to open the written docx file.
This is very similar to issue #18, and I should have noticed this when fixing that bug. I have a fix.
Currently, Swift Package Manager is the only way to install this library. However, it isn't always possible to use packages in large projects. Could you add either manual installation instructions or CocoaPods support?
I'm wondering if you can convert docx to NSAttributedString?
I constructed a test case for a very long “book” that consists of 400 “chapters” of 5,000 words each. When I write it as a docx using DocXWriter.write(pages:), it takes a prohibitively long time: about 100 seconds on my computer (a 2021 MacBook Pro M1).
Here’s the test:
func testLongBookString() throws {
    // Two paragraphs / 100 words
    let twoParagraphs =
        """
        This property contains the space (measured in points) added at the end of the paragraph to separate it from the following paragraph. This value is always nonnegative. The space between paragraphs is determined by adding the previous paragraph’s paragraphSpacing and the current paragraph’s paragraphSpacingBefore.
        Specifies the border displayed above a set of paragraphs which have the same set of paragraph border settings. Note that if the adjoining paragraph has identical border settings and a between border is specified, a single between border will be used instead of the bottom border for the first and a top border for the second.
        """
    // Create a "book" that consists of 400 chapters each with 5,000 words
    // (2 million words total).
    let chapterString = String(repeating: twoParagraphs, count: 50)
    let chapterAttributedString = NSAttributedString(string: chapterString)
    let chapters = Array(repeating: chapterAttributedString, count: 400)
    let url = self.tempURL.appendingPathComponent(UUID().uuidString + "_myDocument_\(chapterString.prefix(10))").appendingPathExtension("docx")
    try DocXWriter.write(pages: chapters, to: url)
    try validateDocX(url: url)
}
Profiling revealed that significant time is spent bridging between NSString/String (and allocating Strings) in paragraphRanges.
I have a fix that reduces the time for this test from ~100 seconds to ~1.4s.
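The actual fix is not shown in this issue. As an illustration of the general idea only, paragraph ranges can be computed by staying on the NSString side and working with NSRange values, so that no intermediate Swift String is allocated per paragraph. The function below is a hypothetical sketch, not DocX's paragraphRanges.

```swift
import Foundation

/// Hypothetical sketch: collects the range of every paragraph (including its
/// terminator) using NSString's paragraphRange(for:), without creating substrings.
func paragraphRanges(in string: NSString) -> [NSRange] {
    var ranges = [NSRange]()
    var location = 0
    while location < string.length {
        let range = string.paragraphRange(for: NSRange(location: location, length: 0))
        ranges.append(range)
        // Continue right after the paragraph terminator.
        location = NSMaxRange(range)
    }
    return ranges
}

// Usage: with an attributed string, call it once on the backing string.
let attributed = NSAttributedString(string: "First paragraph\nSecond paragraph\n")
let ranges = paragraphRanges(in: NSString(string: attributed.string))
print(ranges.count)
```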