My file is in s3 and I am trying to run this code from databricks. My s3 location is a

FileNotFoundException when trying to use com.crealytics.spark.excel about spark-excel HOT 15 CLOSED

crealytics commented on July 19, 2024

FileNotFoundException when trying to use com.crealytics.spark.excel

nightscape commented on July 19, 2024

You're mixing things up.
Delete the .option("location",...) part and pass the full URL to load instead.
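
For reference, here is a minimal PySpark-style sketch of that advice. It assumes a release in which the file location is passed to load. The option names useHeader and sheetName are taken from the README of the 0.9.x-era releases and may differ in other versions. The helper names are hypothetical.

```python
def excel_read_options(sheet_name=None):
    """Build reader options; note that there is no "location" entry."""
    # "useHeader" is the header option name in 0.9.x-era releases (assumption).
    options = {"useHeader": "true"}
    if sheet_name is not None:
        options["sheetName"] = sheet_name
    return options


def read_excel(spark, url, sheet_name=None):
    """Read an Excel file; the full URL goes to load(), not into an option."""
    reader = spark.read.format("com.crealytics.spark.excel")
    for key, value in excel_read_options(sheet_name).items():
        reader = reader.option(key, value)
    return reader.load(url)
```

On a cluster this would be called as read_excel(spark, "s3a://my-bucket/path/to/file.xlsx"), where the bucket and key are placeholders.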

kchatha commented on July 19, 2024

Then I get this:

java.lang.IllegalArgumentException: Parameter "location" is missing in options.

nightscape commented on July 19, 2024

Seems you're using an outdated version.

kchatha commented on July 19, 2024

Thanks a lot for your quick response.
I am using spark-excel_2.11-0.9.17.
My Spark version is 2.3.1.
My Scala version is 2.11.

nightscape commented on July 19, 2024

I think something is messed up with your classpath. This is definitely not version 0.9.17 because that option is now called path (see the checkParameter("path")):
https://github.com/crealytics/spark-excel/blob/master/src/main/scala/com/crealytics/spark/excel/DefaultSource.scala

kchatha commented on July 19, 2024

Thanks a lot. I cleaned up the classpath and the location exception is gone, but now I am getting a different exception. The Excel sheet comes from the business side, so I can't modify it manually.
org.apache.spark.sql.AnalysisException: Attribute name "Contract Identification Number" contains invalid character(s) among " ,;{}()\n\t=".

nightscape commented on July 19, 2024

Does this help? http://mail-archives.apache.org/mod_mbox/spark-user/201507.mbox/%3CCAAswR-6D7CGv1varyA0cYWnoH3nSr17gb+hypHORi3APGrhB7A@mail.gmail.com%3E
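
One common workaround, independent of the linked thread, is to rename the columns before writing. The sketch below is plain Python and assumes it is acceptable to replace each character listed in the error message with an underscore. The function names are hypothetical.

```python
import re

# Characters listed in the AnalysisException: space , ; { } ( ) newline tab =
_INVALID_CHARS = re.compile(r"[ ,;{}()\n\t=]")


def sanitize_column_name(name, replacement="_"):
    """Replace every character Spark rejected with `replacement`."""
    return _INVALID_CHARS.sub(replacement, name)


def sanitize_column_names(names):
    """Sanitize a list of column names, preserving their order."""
    return [sanitize_column_name(n) for n in names]
```

With a DataFrame df this can be applied as df.toDF(*sanitize_column_names(df.columns)). Two different headers can collapse to the same name after replacement, so check for duplicates if the sheet has similar column titles.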

kchatha commented on July 19, 2024

Something is weird. I am building a generic framework to extract any CSV or Excel file. With the spark-csv library, I extract the schema and save it, and if a saved schema already exists I apply it on later runs.

With the spark-excel library, however, when I extract the schema I get all the column names, data types and nullable flags, but I also get column names like this:
"StructField(ReportingPeriodEndingDate,StringType,true)_color".

kchatha commented on July 19, 2024

One more question: What should be the format of custom schema?

kchatha commented on July 19, 2024

When I convert excel to CSV, it generates a lot of empty columns.

nightscape commented on July 19, 2024

That strange-looking column should actually be called "ReportingPeriodEndingDate_color" and is a generated column that contains the cell color of the "ReportingPeriodEndingDate" column.
You can turn that off by leaving out the .option("addColorColumns", "true").
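
If the color columns have to stay enabled for some reason, they can also be filtered out after reading. This is a rough sketch in plain Python, assuming the generated columns always follow the "<column>_color" pattern described above. The function name is hypothetical.

```python
COLOR_SUFFIX = "_color"


def split_color_columns(names):
    """Split column names into (data columns, generated color columns).

    A name counts as a color column only if it ends with "_color" and the
    matching base column is also present, so a genuine data column such as
    "favorite_color" is kept as data.
    """
    present = set(names)
    data, color = [], []
    for name in names:
        base = name[: -len(COLOR_SUFFIX)]
        if name.endswith(COLOR_SUFFIX) and base in present:
            color.append(name)
        else:
            data.append(name)
    return data, color
```

With a DataFrame df, the data columns can then be kept with data_cols, _ = split_color_columns(df.columns) followed by df.select(*data_cols).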

kchatha commented on July 19, 2024

If I turn it off, then it doesn't generate a schema. I need to capture the schema.

nightscape commented on July 19, 2024

What do you mean by "it doesn't generate schema"?
After reading from an Excel file to a df: DataFrame, you always have an associated schema with df.schema.

kchatha commented on July 19, 2024

Thanks a lot for your quick response and your help. I was able to solve the problem.
