Comments (16)
Hi,
There is something about the xlsx file that is causing the error.
I made a fresh one and the error is gone.
new xlsx:
Test2_edited.xlsx
output csv:
output.zip
from pyexcel.
For Test2.xlsx, it seems to work with the latest tool setup:
>>> import pyexcel, pyexcel.ext.xlsx
>>> pyexcel.save_as(file_name="Test2.xlsx", dest_file_name="Test3.csv")
/../python2.7/site-packages/openpyxl/workbook/names/named_range.py:121: UserWarning: Discarded range with reserved name
warnings.warn("Discarded range with reserved name")
Here's what I got:
$ cat Test3.csv
Run Date,Network,,Day,,Length,,Time,Program,,,,,,Copy,,,,,,,Amount
9/1/2015,N/A,,Tue,,15 ,,19:13,N/A,,,,,,N/A,,,,,,,20.00
Here are my pip packages:
pyexcel==0.2.0
pyexcel-io==0.1.0
pyexcel-xlsx==0.1.0
For Test2_edited.xlsx, it was better:
>>> import pyexcel,pyexcel.ext.xlsx
>>> pyexcel.save_as(file_name="Test2_edited.xlsx", dest_file_name="test4.csv")
>>>
$ cat test4.csv
Run Date,Network,,Day,,Length,,Time,Program,,,,,,Copy,,,,,,,Amount
9/1/2015,N/A,,Tue,,15 ,,19:13,N/A,,,,,,N/A,,,,,,,20.00
By the way, for xlsx you do not need to import pyexcel.ext.xls.
Hi chfw,
Thanks again for pyexcel.
I updated all the pyexcel packages,
pyexcel==0.2.0
pyexcel-io==0.1.0
pyexcel-xls==0.1.0
pyexcel-xlsx==0.1.0
It should be noted that using the original code and the original xlsx file,
#!/usr/bin/env python
import pyexcel
import os
from pyexcel.ext import xlsx
from pyexcel.ext import xls
pyexcel.save_as(file_name="Test2.xlsx", dest_file_name="Test2.csv")
we still get
Traceback (most recent call last):
  File "pyex_001.py", line 7, in <module>
    pyexcel.save_as(file_name="Test2.xlsx", dest_file_name="Test2.csv")
  File "D:\Tools\Python34\lib\site-packages\pyexcel\sources\__init__.py", line 318, in save_as
    sheet.save_to(dest_source)
  File "D:\Tools\Python34\lib\site-packages\pyexcel\sheets\sheet.py", line 20, in save_to
    source.write_data(self)
  File "D:\Tools\Python34\lib\site-packages\pyexcel\sources\file.py", line 80, in write_data
    **self.keywords)
  File "D:\Tools\Python34\lib\site-packages\pyexcel_io\__init__.py", line 296, in save_data
    **keywords)
  File "D:\Tools\Python34\lib\site-packages\pyexcel_io\__init__.py", line 260, in store_data
    writer.write(data)
  File "D:\Tools\Python34\lib\site-packages\pyexcel_io\base.py", line 290, in write
    sheet.write_array(sheet_dicts[name])
  File "D:\Tools\Python34\lib\site-packages\pyexcel_io\base.py", line 252, in write_array
    for r in table:
  File "D:\Tools\Python34\lib\site-packages\pyexcel_io\base.py", line 100, in to_array
    cell_value = self.cell_value(r, c)
  File "D:\Tools\Python34\lib\site-packages\pyexcel_xls\__init__.py", line 99, in cell_value
    cell_type = self.native_sheet.cell_type(row, column)
  File "D:\Tools\Python34\lib\site-packages\xlrd\sheet.py", line 413, in cell_type
    return self._cell_types[rowx][colx]
IndexError: array index out of range
then, excluding the import, as you indicated,
#!/usr/bin/env python
import pyexcel
import os
from pyexcel.ext import xlsx
pyexcel.save_as(file_name="Test2.xlsx", dest_file_name="Test2.csv")
the output is,
D:\Tools\Python34\lib\site-packages\openpyxl-2.2.2-py3.4.egg\openpyxl\workbook\names\named_range.py:121: UserWarning: Discarded range with reserved name
warnings.warn("Discarded range with reserved name")
and, despite the warning, the output is OK!
So it appears that when the import
from pyexcel.ext import xls
is included, the pyexcel-xls library tries to process the xlsx file and ultimately calls xlrd, which is where the error occurs.
Without that import, the xlsx file is handled by openpyxl, which successfully produces the output, but with a warning.
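If the openpyxl warning turns out to be harmless for a given file, it can probably be silenced around the conversion call with the standard `warnings` module. The following is only a sketch: `convert` is a hypothetical stand-in for `pyexcel.save_as(...)` that emits the same message, and `record=True` is used purely so the sketch can show that nothing got through.

```python
import warnings

def convert():
    # Hypothetical stand-in for pyexcel.save_as(...); it emits the same
    # message that openpyxl printed in the output above.
    warnings.warn("Discarded range with reserved name")
    return "converted"

with warnings.catch_warnings(record=True) as caught:
    # Record every warning by default...
    warnings.simplefilter("always")
    # ...but ignore this one specific message (matched from its start).
    warnings.filterwarnings(
        "ignore",
        message="Discarded range with reserved name",
        category=UserWarning,
    )
    result = convert()

print(result)       # the conversion still ran
print(len(caught))  # number of warnings that were not filtered out
```

Filtering on the message rather than on all `UserWarning`s keeps other, possibly meaningful, warnings visible.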
Aha, Python 3. I don't have one at hand; I will try it later. Looking at the traceback, it is related to xlrd.
Can you try the following code instead?
#!/usr/bin/env python
import pyexcel
import os
from pyexcel.ext import xlsx
# from pyexcel.ext import xls <--
pyexcel.save_as(file_name="Test2.xlsx", dest_file_name="Test2.csv")
because pyexcel.ext.xls can also read xlsx (via xlrd), in addition to pyexcel.ext.xlsx, which uses openpyxl.
Yes, xlrd can't handle that file.
Python3 isn't the issue.
OK, I read the following part of your response. openpyxl is OK
If you want to keep xls support along with xlsx, you can do this:
import pyexcel
import os
from pyexcel.ext import xls # set reader for xls, xlsx, and writer for xls
from pyexcel.ext import xlsx # set the reader for xlsx again, and the writer for xlsx
...
So for xlsx, the second import will overwrite the first one. I will try to address this problem in a future version, pyexcel-io v0.2.0.
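The "last import wins" behaviour described here can be illustrated with a tiny registry keyed by file extension. This is a hypothetical sketch, not pyexcel-io's actual code: `READERS` and `register_readers` are made-up names, and the extension lists simply follow the comments on the two imports above.

```python
# Hypothetical stand-in for a plugin registry keyed by file extension.
READERS = {}

def register_readers(plugin_name, extensions):
    """Map each extension to plugin_name, replacing any earlier entry."""
    for ext in extensions:
        READERS[ext] = plugin_name

# Stands in for `from pyexcel.ext import xls`: claims xls and xlsx (xlrd).
register_readers("pyexcel-xls", ["xls", "xlsx"])
# Stands in for `from pyexcel.ext import xlsx`: claims xlsx again (openpyxl).
register_readers("pyexcel-xlsx", ["xlsx"])

print(READERS["xls"])   # pyexcel-xls
print(READERS["xlsx"])  # pyexcel-xlsx
```

Swapping the two `register_readers` calls would leave xlsx mapped to `pyexcel-xls`, which matches the failing import order reported earlier in the thread.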
As you have pointed out, no problem with openpyxl.
The problem is caused by including the import of pyexcel-xls after pyexcel-xlsx.
That caused pyexcel to use xlrd instead of openpyxl.
Here is the relevant code:
https://github.com/pyexcel/pyexcel-xlsx/blob/master/pyexcel_xlsx/__init__.py#L145
https://github.com/pyexcel/pyexcel-xls/blob/master/pyexcel_xls/__init__.py#L227
The real problem is that original xlsx file.
xlrd handles the 2nd one without issue.
The point is that openpyxl handles it but with a warning and xlrd can't handle it at all.
The spreadsheet is at fault. I wouldn't change pyexcel for this one-off occurrence.
Can we conclude that the issue is with xlrd? If so, we should probably contact xlrd via its issue tracker.
Agreed.
Thank you everyone for looking into it! My full script has to be able to handle both xls and xlsx files from clients which is why they were both in there. I appreciate the fix, switching the order of the imports was successful!
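For a script that processes many client files, one option is to guard each conversion so that a single problematic spreadsheet does not abort the whole batch. The sketch below is an assumption about how such a script might look: `convert_all` and `fake_convert` are hypothetical names, and `fake_convert` only imitates the `IndexError` from the traceback above instead of calling pyexcel.

```python
def convert_all(paths, convert):
    """Run convert(path) on each path; return (succeeded, failed) lists."""
    succeeded, failed = [], []
    for path in paths:
        try:
            convert(path)
            succeeded.append(path)
        except IndexError as exc:
            # Same exception type as in the xlrd traceback above.
            failed.append((path, str(exc)))
    return succeeded, failed

def fake_convert(path):
    # Hypothetical stand-in for the real conversion call; it fails on the
    # original file from this thread, as the xlrd-backed reader did.
    if path == "Test2.xlsx":
        raise IndexError("array index out of range")

ok, bad = convert_all(["Test2.xlsx", "Test2_edited.xlsx"], fake_convert)
print(ok)
print(bad)
```

In a real script, `convert` would wrap the `pyexcel.save_as(...)` call, and the `failed` list could be reported back to whoever supplied the files.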
The issue was analyzed, the root cause was found, and a bug was filed, so I am closing this issue. Please follow xlrd issue #167 for further info.
@Rehgan, Here is the sample code to use 'library' parameter