It will be very useful for having a function that will give me a list of all excel she

getting a List of all sheets in the excel file about spark-excel HOT 14 CLOSED

crealytics commented on July 19, 2024

getting a List of all sheets in the excel file

from spark-excel.

Comments (14)

nightscape commented on July 19, 2024

@qfjp in principle, something like the following could work, but unfortunately there is a problem with shading:

import com.crealytics.spark.excel._

val wbr = WorkbookReader(Map("path" -> "/path/to/file.xlsx"), spark.sparkContext.hadoopConfiguration)
// wbr: com.crealytics.spark.excel.WorkbookReader = com.crealytics.spark.excel.DefaultWorkbookReader@53ce41e7

wbr.withWorkbook(w => w.getSheetAt(0))
// java.lang.ClassCastException: class shadeio.poi.xssf.usermodel.XSSFWorkbook cannot be cast to class org.apache.poi.ss.usermodel.Workbook (shadeio.poi.xssf.usermodel.XSSFWorkbook and org.apache.poi.ss.usermodel.Workbook are in unnamed module of loader scala.reflect.internal.util.ScalaClassLoader$URLClassLoader @6d4c7248)
//  at $anonfun$1.apply(<console>:29)
//  at com.crealytics.spark.excel.WorkbookReader$class.withWorkbook(WorkbookReader.scala:15)
//  at com.crealytics.spark.excel.DefaultWorkbookReader.withWorkbook(WorkbookReader.scala:38)
//  ... 51 elided

If you can figure out how to solve the shading problem, that would be the way to go.
One possibility would be to create a def sheetNames: Seq[String] on WorkbookReader.
That should hopefully work as it doesn't expose any shaded classes.
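As a rough illustration of why returning plain strings sidesteps the cast problem, here is a minimal sketch that reads sheet names with Apache POI directly. It is not the spark-excel implementation: the `SheetNames` object and its `sheetNames(path)` method are hypothetical names, it assumes an unshaded POI dependency (`poi-ooxml`) on the classpath, and it only handles local files, not Hadoop paths.

```scala
import java.io.File
import org.apache.poi.ss.usermodel.{Workbook, WorkbookFactory}

// Hypothetical helper, not part of spark-excel. Assumes the caller has added
// an unshaded Apache POI (poi-ooxml) dependency themselves.
object SheetNames {
  def sheetNames(path: String): Seq[String] = {
    // Open read-only (no password) so nothing is written back on close.
    val workbook: Workbook = WorkbookFactory.create(new File(path), null, true)
    try {
      // Copy the names into plain Strings so that no POI type,
      // shaded or unshaded, reaches the caller.
      (0 until workbook.getNumberOfSheets).map(i => workbook.getSheetName(i))
    } finally {
      workbook.close()
    }
  }
}
```

Because the return type is `Seq[String]`, the caller never sees a `Workbook`, so it does not matter which package the POI classes live in.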

We're open to PRs 😄

nightscape commented on July 19, 2024

Hi @antonkulaga,

would you want to retrieve this as a special DataFrame like this?

val df = sqlContext.read
    .format("com.crealytics.spark.excel")
    .option("sheetName", "SHEET_NAMES")
    .load("Worktime.xlsx")

Or just a simple helper function like

ExcelRelation.sheetNames("Worktime.xlsx")

Would you mind giving this a try yourself?

antonkulaga commented on July 19, 2024

@nightscape either way is ok for me. I think just RDD[String] or even Seq[String] will be enough

nightscape commented on July 19, 2024

I'm slightly leaning toward the Seq[String] version.
Do you wanna give it a try?

antonkulaga commented on July 19, 2024

Yes, of course

nightscape commented on July 19, 2024

@antonkulaga are you still interested in this?

antonkulaga commented on July 19, 2024

@nightscape it is nice to have but it is not of any urgency for me

nightscape commented on July 19, 2024

Ok, then I'll close this for the time being.

qfjp commented on July 19, 2024

Has there been any work in this direction in the intervening time? If not, I would still be interested in such a feature.

luzlab commented on July 19, 2024

You can always use a different library to get the sheet names. If you're using python, you can use xlrd. It looks fairly straightforward:

import xlrd

# Note: xlrd 2.0 and later read only legacy .xls files, not .xlsx.
xls = xlrd.open_workbook(r'<path_to_your_excel_file>', on_demand=True)
print(xls.sheet_names())

qfjp commented on July 19, 2024

Unfortunately python isn't really an option, and I would prefer not to mix libraries if I already have one for xlsx. Thanks for the advice though, it might come in handy in the future.

luzlab commented on July 19, 2024

OK. So would you prefer (1) to get the sheet names as a DataFrame or (2) a helper function that returns a list of the sheet names?

qfjp commented on July 19, 2024

@luzlab I don't think I'm ever going to be working with enough sheets to need it as a DataFrame, but it still might be worth it just for consistency's sake?

@nightscape Thanks for the advice, I may take a look at the code later and submit something.

nightscape commented on July 19, 2024

Fixed by #196 and #199
