lalalic / docx4js
a javascript docx parser
Hi,
When I try to open a pptx that contains an attached file or a picture, I get this error:
Error : TypeError: Cannot read property 'prototype' of undefined
at _class._init (C:\Users\User\Documents\Workspace\pikeDocX\node_modules\docx4js\src\openxml\docx\officeDocument.js:10:15)
at _class.Part (C:\Users\User\Documents\Workspace\pikeDocX\node_modules\docx4js\src\openxml\part.js:26:8)
at _class (C:\Users\User\Documents\Workspace\pikeDocX\node_modules\docx4js\lib\openxml\officeDocument.js:39:101)
at new _class (C:\Users\User\Documents\Workspace\pikeDocX\node_modules\docx4js\lib\openxml\docx\officeDocument.js:39:95)
at _class (C:\Users\User\Documents\Workspace\pikeDocX\node_modules\docx4js\src\openxml\document.js:9:23)
at new _class (C:\Users\User\Documents\Workspace\pikeDocX\node_modules\docx4js\lib\openxml\docx\document.js:29:95)
at C:\Users\User\Documents\Workspace\pikeDocX\node_modules\docx4js\src\document.js:177:14
at parse (C:\Users\User\Documents\Workspace\pikeDocX\node_modules\docx4js\src\document.js:174:8)
at C:\Users\User\Documents\Workspace\pikeDocX\node_modules\docx4js\src\document.js:188:7
at FSReqCallback.readFileAfterClose [as oncomplete] (internal/fs/read_file_context.js:61:3)
My code :
import docx4js from 'docx4js'
docx4js.load("test.pptx").then(pptx=>{
console.log(pptx)
}).catch(e=> console.log( "Error : ", e))
Regards
Recently, I upgraded docx4js from 3.1.1, and it threw 'ReferenceError: URL is not defined' when I ran my program.
The whole error message is below:
error occur in addOPLToEs: ReferenceError: URL is not defined
at _class.getDataPartAsUrl (/root/parseOPL/node_modules/docx4js/lib/document.js:74:37)
at OfficeDocument.getRel (/root/parseOPL/node_modules/docx4js/lib/openxml/part.js:77:25)
at Object.pic (/root/parseOPL/node_modules/docx4js/lib/openxml/docx/officeDocument.js:206:55)
at identify (/root/parseOPL/node_modules/docx4js/lib/openxml/docx/officeDocument.js:94:48)
at OfficeDocument.renderNode (/root/parseOPL/node_modules/docx4js/lib/openxml/part.js:193:17)
at /root/parseOPL/node_modules/docx4js/lib/openxml/part.js:215:23
at Array.map (<anonymous>)
at OfficeDocument.renderNode (/root/parseOPL/node_modules/docx4js/lib/openxml/part.js:214:30)
at /root/parseOPL/node_modules/docx4js/lib/openxml/part.js:215:23
at Array.map (<anonymous>)
at OfficeDocument.renderNode (/root/parseOPL/node_modules/docx4js/lib/openxml/part.js:214:30)
at /root/parseOPL/node_modules/docx4js/lib/openxml/part.js:215:23
at Array.map (<anonymous>)
at OfficeDocument.renderNode (/root/parseOPL/node_modules/docx4js/lib/openxml/part.js:214:30)
at /root/parseOPL/node_modules/docx4js/lib/openxml/part.js:215:23
at Array.map (<anonymous>)
I looked at the code, but I cannot figure out why it references the window and URL objects.
I just wanted to read a docx, but I can't even get the load function to work:
import docx4js from "docx4js"
docx4js.load(input).then(docx=>{
console.log('dox',docx)
}).catch(e=>{console.log('err',e)})
input:
ArrayBuffer { byteLength: 54248 }
What may be the problem?
Parsing this file throws an error:
ReferenceError: Blob is not defined
at _class.getDataPartAsUrl (docx4js/lib/document.js:99:61)
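Both the `URL is not defined` and the `Blob is not defined` reports point at `getDataPartAsUrl`, which suggests that code path expects browser globals. A minimal sketch of a Node-side fallback, assuming the part's bytes are available as a Buffer or Uint8Array. `toDataUrl` is a hypothetical helper, not part of docx4js:

```javascript
// Hypothetical helper (not part of docx4js): build a data: URL from raw bytes
// when the browser-only Blob / URL.createObjectURL APIs are unavailable.
function toDataUrl(bytes, mimeType = "application/octet-stream") {
  // Buffer.from accepts a Buffer, Uint8Array, or ArrayBuffer
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// Feature-detect instead of assuming a browser environment.
const hasObjectUrls =
  typeof Blob !== "undefined" &&
  typeof URL !== "undefined" &&
  typeof URL.createObjectURL === "function";

console.log(hasObjectUrls ? "object URLs available" : "falling back to data URLs");
console.log(toDataUrl(new Uint8Array([104, 105]), "text/plain"));
// prints: data:text/plain;base64,aGk=
```

Whether the library can be configured to use such a fallback is not something these reports confirm; the sketch only shows the environment check and the encoding step.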
Good day,
Feature request / query:
I would love to have a find-and-replace for template usage.
For example:
Doc contains the text "{WHO}"
And I want a Node.js function that calls the lib and replaces it with, for example, Bob Martin.
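The replacement step itself can be sketched on a plain string. This is not a docx4js API: `fillTemplate` is a hypothetical helper, and in a real docx a placeholder such as `{WHO}` may be split across several text runs, so the text would need to be collected per paragraph before a function like this is applied:

```javascript
// Hypothetical helper: replace {NAME} placeholders in a plain string.
// Placeholders with no matching key are left untouched.
function fillTemplate(text, values) {
  return text.replace(/\{([A-Za-z0-9_]+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : match
  );
}

console.log(fillTemplate("Dear {WHO}, your order {ID} shipped.", { WHO: "Bob Martin" }));
// prints: Dear Bob Martin, your order {ID} shipped.
```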
ArrayBuffer as input can be useful for a standalone client app.
Hi! 👋
Firstly, thanks for your work on this project! 🙂
Part of pptx slide structure:
<p:blipFill>
<a:blip>
<a:extLst>
<a:ext uri="{79A0055B-C67C-407E-A111-50E730381C1C}">
<a14:useLocalDpi
xmlns:a14="http://schemas.microsoft.com/office/drawing/2010/main"
val="0"/>
</a:ext>
</a:extLst>
</a:blip>
<a:srcRect/>
<a:stretch>
<a:fillRect/>
</a:stretch>
</p:blipFill>
Today I used patch-package to patch docx4js for the project I'm working on.
Here is the diff that solved my problem:
diff --git a/node_modules/docx4js/lib/openxml/drawml/index.js b/node_modules/docx4js/lib/openxml/drawml/index.js
index 5096ddd..53f7dc9 100644
--- a/node_modules/docx4js/lib/openxml/drawml/index.js
+++ b/node_modules/docx4js/lib/openxml/drawml/index.js
@@ -64,6 +64,7 @@ exports.default = function (od) {
url = _n$attribs["r:link"];
if (url) return { url: url };
+ if (!embed) return;
var part = od.$(n).part();
return new _part2.default(part, od.doc).getRel(embed);
},
diff --git a/node_modules/docx4js/src/openxml/drawml/index.js b/node_modules/docx4js/src/openxml/drawml/index.js
index 23f665e..5c15bd6 100644
--- a/node_modules/docx4js/src/openxml/drawml/index.js
+++ b/node_modules/docx4js/src/openxml/drawml/index.js
@@ -17,6 +17,8 @@ export default od=>({
const {attribs:{"r:embed":embed, "r:link":url}}=n
if(url)
return {url}
+ if(!embed)
+ return;
const part=od.$(n).part()
return new Part(part,od.doc).getRel(embed)
},
This issue body was partially generated by patch-package.
When I declare
const ModelHandler = require("docx4js/lib/openxml/docx/model-handler").default;
I have the following error with node 7.0.0
(node:7814) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): ReferenceError: officeDocument is not defined
In document.js line 63, there is a typo on officeDocument.identify
should be OfficeDocument.identify
That fixes the issue, but then I get another error:
TypeError: officeDocument.styles is not a function
See: officeDocument should be OfficeDocument.
This mistake triggers an object-not-found error on invoking parse().
Hi! 👋
Firstly, thanks for your work on this project! 🙂
Today I used patch-package to patch docx4js for the project I'm working on.
Here is the diff that solved my problem:
diff --git a/node_modules/docx4js/lib/openxml/part.js b/node_modules/docx4js/lib/openxml/part.js
index c041ae5..a0d29d9 100644
--- a/node_modules/docx4js/lib/openxml/part.js
+++ b/node_modules/docx4js/lib/openxml/part.js
@@ -91,6 +91,7 @@ var Part = function () {
value: function getRel(id) {
var rel = this.rels("Relationship[Id=\"" + id + "\"]");
var target = rel.attr("Target");
+ if (target === 'NULL') return;
if (rel.attr("TargetMode") === 'External') return { url: target };
switch (rel.attr("Type").split("/").pop()) {
diff --git a/node_modules/docx4js/src/openxml/part.js b/node_modules/docx4js/src/openxml/part.js
index 1a1d690..a37a67d 100644
--- a/node_modules/docx4js/src/openxml/part.js
+++ b/node_modules/docx4js/src/openxml/part.js
@@ -63,6 +63,8 @@ export default class Part{
getRel(id){
var rel=this.rels(`Relationship[Id="${id}"]`)
var target=rel.attr("Target")
+ if(target==='NULL')
+ return;
if(rel.attr("TargetMode")==='External')
return {url:target}
This issue body was partially generated by patch-package.
I want to read the control elements of a Word document and write them into a database (MySQL) according to the control name or type. I do not know which method to use to achieve this. I am also looking at PHPWord, but I don't know whether it can do this. I hope you can suggest an approach, thank you. P.S. I have attached an example Word document with control elements.
phpwordtomysql.docx
If the package is not maintained, please state it in README, it would save people time trying.
I am a newbie in React and I am not able to get this working in React.
I use axios to fetch the file as an "arraybuffer" and can successfully get the data from docx4js.load, but I do not understand how to render it on the screen.
Here is the code if that might help:
import React, { useEffect } from "react";
import axios from "axios";
import docx4js from "docx4js";
const Docx = ({ previewedRecord, token }) => {
useEffect(() => {
axios
.get(previewedRecord.url, {
responseType: "arraybuffer",
headers: {
Authorization: "Bearer " + token
}
})
.then(res => {
docx4js.load(res.data).then(docx => {
console.log("[Docx.jsx] docx:", docx); // I am able to get the data here.
})
.catch(err => {
console.error("[Docx.jsx] err:", err);
});
})
.catch(err => {
console.error("[Docx.jsx] err:", err);
});
}, []);
return <div id="docx" />; // I want the Docx to be rendered here.
};
export default Docx;
Here is the screenshot of the docx returned:
Installed docx4js v. 3.2.9. First try following the documentation:
const docx4js = require("docx4js")
docx4js.load("a.docx")
Uncaught TypeError: docx4js.load is not a function
I had to insert a new level to use the package:
docx4js.docx.load("a.docx")
Promise { <pending> }
Could the documentation be updated?
Thanks!
mario
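The mismatch between `docx4js.load` and `docx4js.docx.load` can be worked around defensively by probing the export shapes seen in these reports. A minimal sketch, assuming only that one of the candidates exposes a `load` function. `resolveLoader` is a hypothetical helper, and the objects below are stand-ins rather than the real module:

```javascript
// Hypothetical helper: find whichever export actually exposes load().
function resolveLoader(mod) {
  const candidates = [mod, mod && mod.default, mod && mod.docx];
  const found = candidates.find((c) => c && typeof c.load === "function");
  if (!found) throw new TypeError("no export with a load() function found");
  return found;
}

// Stand-in objects that mimic the two shapes reported above.
const nested = { docx: { load: (file) => Promise.resolve(`loaded ${file}`) } };
const direct = { load: (file) => Promise.resolve(`loaded ${file}`) };

resolveLoader(nested).load("a.docx").then(console.log);
resolveLoader(direct).load("a.docx").then(console.log);
```

In an application, the result of `require("docx4js")` would be passed to the helper in place of the stand-ins.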
Line 135 in 5feb980
I have some questions:
It would be great if you can help me with those questions.
Edit: I wanted to get some information about the object, but console.log(docx4js) only prints
function _class() { // 26
_classCallCheck(this, _class); // 27
// 28
return _possibleConstructorReturn(this, (_class.__proto__ || Object.getPrototypeOf(_class)).apply(this, arguments));
}
When I call the load function then I get just a cryptic message in the promise reject:
TypeError: this.rels is not a function. (In 'this.rels("[Type$=\"" + type + "\"]")', 'this.rels' is undefined)
Trying to follow the README:
const ModelHandler = require('docx4js/openxml/docx/model-handler').default;
Docx zipped files may be presented as flattened files such as
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
<pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="512">
<pkg:xmlData>
[FILE CONTENT]
</pkg:xmlData>
</pkg:part>
<pkg:part pkg:name="/word/_rels/document.xml.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="256">
<pkg:xmlData>
[FILE CONTENT]
</pkg:xmlData>
</pkg:part>
<pkg:part pkg:name="/word/footnotes.xml" pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.footnotes+xml">
<pkg:xmlData>
[FILE CONTENT]
</pkg:xmlData>
</pkg:part>
<pkg:part pkg:name="/word/media/image1.png" pkg:contentType="image/png" pkg:compression="store">
<pkg:binaryData>
[FILE CONTENT]
</pkg:binaryData>
</pkg:part>
</pkg:package>
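To illustrate the flat layout, here is a sketch that lists the parts declared in such a file. A regular expression is used only to keep the example dependency-free; a real implementation should use an XML parser. `listFlatOpcParts` is a hypothetical helper and nothing here is part of docx4js:

```javascript
// Sketch: list the parts declared in a flat package document.
function listFlatOpcParts(xml) {
  const parts = [];
  const partTag = /<pkg:part\b([^>]*)>/g; // matches opening pkg:part tags only
  let m;
  while ((m = partTag.exec(xml)) !== null) {
    const attrs = m[1];
    const name = /pkg:name="([^"]*)"/.exec(attrs);
    const type = /pkg:contentType="([^"]*)"/.exec(attrs);
    parts.push({ name: name ? name[1] : null, contentType: type ? type[1] : null });
  }
  return parts;
}

const sample = `
<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
  <pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml">
    <pkg:xmlData></pkg:xmlData>
  </pkg:part>
  <pkg:part pkg:name="/word/media/image1.png" pkg:contentType="image/png" pkg:compression="store">
    <pkg:binaryData></pkg:binaryData>
  </pkg:part>
</pkg:package>`;

console.log(listFlatOpcParts(sample).map((p) => p.name));
// prints: [ '/_rels/.rels', '/word/media/image1.png' ]
```

Each listed part corresponds to one entry that would otherwise be a file inside the zipped docx.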
var docx4js =require("docx4js")
docx4js.load("./abc.docx").then(docx=>{
//you can render docx to anything (react elements, tree, dom, and etc) by giving a function
docx.render(function createElement(type,props,children){
return {type,props,children}
})
})
If I run this code, it returns only the type, props, and children. But I need the complete HTML content saved in a variable or a file. Is it possible to save the entire parsed docx file in a variable that holds the HTML content?
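One way to approach this is to give `docx.render` a callback that builds HTML strings instead of plain objects. The sketch below is standalone and makes several assumptions: the tag mapping in `TAGS` is invented for illustration (the real node type names may differ, so log them first), and text is assumed to arrive as string children. `createHtmlElement` is a hypothetical name:

```javascript
// Assumed mapping from node types to HTML tags; adjust after inspecting real output.
const TAGS = { document: "article", p: "p", r: "span", t: "span" };

function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Same (type, props, children) signature as the callback passed to docx.render.
// Children are either raw text or already-rendered nodes of the form {html}.
function createHtmlElement(type, props, children) {
  const tag = TAGS[type] || "div";
  const inner = (children || [])
    .filter((child) => child != null)
    .map((child) => (child.html !== undefined ? child.html : escapeHtml(child)))
    .join("");
  return { html: `<${tag}>${inner}</${tag}>` };
}

const paragraph = createHtmlElement("p", {}, ["a < b"]);
const doc = createHtmlElement("document", {}, [paragraph]);
console.log(doc.html); // <article><p>a &lt; b</p></article>
```

The `html` string of the root node can then be stored in a variable or written to a file.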
I'm working on parsing a docx file to text, but I cannot reliably determine the numbering of a list paragraph from the numId (abstractNumId) and ilvl values alone. It works most of the time, but sometimes the number is wrong. Could you give me some advice?
Looking forward to your reply!
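For reference, the basic counting scheme can be sketched as a counter per numId and level that resets deeper levels whenever a shallower one advances. This sketch ignores start values, level overrides, number formats, and the case where several numId values share one abstractNumId, any of which could explain occasional wrong numbers. `createListCounter` is a hypothetical helper:

```javascript
// Sketch: track list counters per numId and level (ilvl).
function createListCounter() {
  const counters = new Map(); // numId -> array of counts indexed by ilvl
  return function next(numId, ilvl) {
    const levels = counters.get(numId) || [];
    levels[ilvl] = (levels[ilvl] || 0) + 1;
    levels.length = ilvl + 1; // advancing this level resets all deeper levels
    // Levels skipped on the way down are counted as 1.
    for (let i = 0; i < ilvl; i++) levels[i] = levels[i] || 1;
    counters.set(numId, levels);
    return levels.join(".");
  };
}

const next = createListCounter();
console.log(next("1", 0)); // 1
console.log(next("1", 1)); // 1.1
console.log(next("1", 1)); // 1.2
console.log(next("1", 0)); // 2
console.log(next("1", 1)); // 2.1
```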
Uncaught (in promise) ReferenceError: DOCX is not defined in nodejs
var docx4js = require("docx4js");
docx4js.load("my file path").then(function(doc){
console.log(doc);
var nothingFactory=DOCX.createVisitorFactory();
});
When I run npm i docx4js and start the app, I see a "module xmldom not found" error. The package.json in the git repository includes this module, but the package.json published to the npm registry does not.
Publishing an updated version to npm would fix this problem.
Launching the index.html in the dist folder causes a "file not found" error for index.js, and if I copy index.html into any folder containing an index.js, it still does not work because of other errors.
It looks very interesting and could be used for many applications but the documentation is almost non-existent so it makes it very difficult to use.
Would it be possible to adapt it to in-browser use?
Self explanatory man, would really appreciate it
Good day
I know I can change the content of docx.
//you can change content on docx.officeDocument.content, and then save
docx.officeDocument.content("w\\:t").text("hello")
docx.save("~/changed.docx")
But this method doesn't change content of footnotes.
If there's some way to change it, that will be cool!
How can I convert a document containing Drawing Vectors to PNG?
Has anyone done that before? If so, please help me in this case.
Hello,
I'm trying to use this awesome library, but it seems to break on intensive docx files:
Uncaught (in promise) TypeError: ole.find(...) is null
parse http://localhost:3000/core.min.js:9419
getRelOleObject http://localhost:3000/core.min.js:9615
object http://localhost:3000/core.min.js:8955
identify http://localhost:3000/core.min.js:9390
renderNode http://localhost:3000/core.min.js:9658
childElements http://localhost:3000/core.min.js:9683
renderNode http://localhost:3000/core.min.js:9682
childElements http://localhost:3000/core.min.js:9683
renderNode http://localhost:3000/core.min.js:9682
childElements http://localhost:3000/core.min.js:9683
renderNode http://localhost:3000/core.min.js:9682
childElements http://localhost:3000/core.min.js:9683
renderNode http://localhost:3000/core.min.js:9682
render http://localhost:3000/core.min.js:8599
render http://localhost:3000/core.min.js:8376
fileFound http://localhost:3000/core.min.js:42189
promise callback*fileFound http://localhost:3000/core.min.js:42186
drag http://localhost:3000/core.min.js:42169
ondrop http://localhost:3000/:1
I tried and was able to read a word template I used for work, but a complete file (100+ pages) gives me this error.
More specifically, this is what generates a null pointer:
ole.find("!ole10Native").content
--> TypeError: ole.find(...) is null
ole seems to be... the Microsoft Office docx structure?
Is this normal behavior? Does anyone have an idea?
Thanks a lot, have a good day.
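A guard around the failing lookup would avoid the crash when the stream is missing. The sketch below uses a stand-in object with the same `find` call as in the report. `readOleNative` and `makeFakeOle` are hypothetical names, and whether skipping the object is acceptable for rendering is an assumption:

```javascript
// Stand-in for the parsed OLE container; only find() is modelled here.
function makeFakeOle(entries) {
  return { find: (name) => entries[name] || null };
}

// Hypothetical guard: return null instead of throwing when the stream is absent.
function readOleNative(ole) {
  const entry = ole.find("!ole10Native");
  return entry ? entry.content : null;
}

console.log(readOleNative(makeFakeOle({ "!ole10Native": { content: "bytes" } }))); // bytes
console.log(readOleNative(makeFakeOle({}))); // null
```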
Hi, I am trying to use it in MeteorJS and am getting the following error:
Uncaught (in promise) TypeError: require(...).readFile is not a function
TIA.
I would love to see better examples in the README.
This seems like a powerful library, but if it's not documented then it will be used less.
Uncaught (in promise) TypeError: Cannot convert undefined or null to object
at Function.assign (<anonymous>)
at _class._init (officeDocument.js:16:16)
at _class._init (officeDocument.js:6:3)
at _class.Part (part.js:31:8)
at _class (officeDocument.js:2:1)
at new _class (officeDocument.js:2:1)
at _class (document.js:9:23)
at new _class (document.js:2:1)
at parse (document.js:177:14)
at reader.onload (document.js:194:6)
Hi,
is this repo still alive?
I'm having some issues installing this package from a MeteorJS package, using the odd npm.require, about a missing xmldom module.
This could be fixed, but I'm worried about whether this library is still maintained for future bugs.
Hi,
README.md is outdated, or some files are missing from the source. (There are also several punctuation errors in the example code.)
1. There is no createVisitorFactory in docx4js. In fact, the word does not appear anywhere in the project except README.md. I looked through the source code for an alternative, but did not find one.
2. Even the function parse does not take any arguments.
I did not find any way to use this lib.
Uncaught ReferenceError: exports is not defined
at index.js:3
Create a new Word file and call docx.load on it directly; execution jumps to line 16 of openxml and throws an error.
Hi, I used the example:
docx4j.load('./test.docx') // my file path in nodejs
error:
\docx4js.js:16
return DOCX.createVisitorFactory(function(wordModel){
^^^^^^
SyntaxError: Unexpected token return
at exports.runInThisContext (vm.js:53:16)
at Module._compile (module.js:373:25)
at Object.Module._extensions..js (module.js:416:10)
at Module.load (module.js:343:32)
at Function.Module._load (module.js:300:12)
at Function.Module.runMain (module.js:441:10)
at startup (node.js:139:18)
at node.js:968:3
Hi,
Very interesting project. What's the license?
Best,
Prabhu
Error when running npm start. It works well with React v17 and react-scripts v4, but fails to run with React v18 and react-scripts v5.
Compiled with problems:
ERROR in ./node_modules/docx4js/lib/document.js 340:17-45
Module not found: Error: Can't resolve 'fs' in './node_modules/docx4js/lib'
package.json for reference
{
"name": "parser",
"version": "0.1.0",
"private": true,
"dependencies": {
"@testing-library/jest-dom": "^5.16.4",
"@testing-library/react": "^13.3.0",
"@testing-library/user-event": "^13.5.0",
"buffer": "^6.0.3",
"docx4js": "^3.2.20",
"react": "^18.2.0",
"react-dom": "^18.2.0",
"react-scripts": "5.0.1",
"web-vitals": "^2.1.4"
},
"scripts": {
"start": "react-scripts start",
"build": "react-scripts build",
"test": "react-scripts test",
"eject": "react-scripts eject"
},
"eslintConfig": {
"extends": [
"react-app",
"react-app/jest"
]
},
"browserslist": {
"production": [
">0.2%",
"not dead",
"not op_mini all"
],
"development": [
"last 1 chrome version",
"last 1 firefox version",
"last 1 safari version"
]
}
}
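react-scripts v5 uses webpack 5, which no longer polyfills Node core modules such as `fs` automatically. One possible workaround is to set `resolve.fallback.fs` to `false` through a config-override tool, since react-scripts does not expose its webpack config directly. A minimal sketch of such an override function; `withFsFallback` is a hypothetical name, and how it is wired in depends on the tool you choose:

```javascript
// Sketch of a webpack config override: resolve `fs` to an empty module
// in the browser bundle while keeping any existing resolve settings.
function withFsFallback(config) {
  const resolve = config.resolve || {};
  return {
    ...config,
    resolve: {
      ...resolve,
      fallback: { ...(resolve.fallback || {}), fs: false },
    },
  };
}

const patched = withFsFallback({ resolve: { extensions: [".js"] } });
console.log(patched.resolve.fallback); // { fs: false }
```

This only silences the bundling error. Code paths that read from a file path still cannot work in the browser, so the document would have to be passed in another form, such as the ArrayBuffer input used in an earlier report.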