Comments (23)

vkuznet commented on July 28, 2024

from cmsspark.

sciaba commented on July 28, 2024

If so, how can I tell whether failing to find the code on lxplus is a problem for my job? What would the symptoms be?

vkuznet commented on July 28, 2024

vkuznet commented on July 28, 2024

sciaba commented on July 28, 2024

The second commit just removes the trailing comma if sparkexjar is empty, right? I guess it doesn't change anything from a practical point of view.
The only remaining issue is purely cosmetic: the output still produces

ls: cannot access /usr/lib/spark/examples/lib/spark-examples*: No such file or directory

so I'd add a 2> /dev/null...
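
For illustration only, here is a minimal Python sketch of the same two ideas: look up the examples jar with a glob, which returns an empty list instead of printing an error, and join only the jars that were found so that no trailing comma appears. The function name `build_jars_option`, its parameters, and the placeholder jar path are hypothetical and are not part of CMSSpark.

```python
import glob


def build_jars_option(base_jars, examples_pattern):
    """Return a comma-separated jar list, skipping the examples jar if absent.

    base_jars: jar paths that are always passed (hypothetical input).
    examples_pattern: glob pattern for the Spark examples jar.
    """
    # glob returns [] when nothing matches, so there is no "No such file" noise
    matches = sorted(glob.glob(examples_pattern))
    # keep at most the first matching examples jar
    jars = list(base_jars) + matches[:1]
    # join only non-empty entries, so an empty match leaves no trailing comma
    return ",".join(j for j in jars if j)


if __name__ == "__main__":
    # "/path/to/avro-mapred.jar" is a placeholder, not a real location
    print(build_jars_option(
        ["/path/to/avro-mapred.jar"],
        "/usr/lib/spark/examples/lib/spark-examples*",
    ))
```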

vkuznet commented on July 28, 2024

sciaba commented on July 28, 2024

There is a fundamental problem. If sparkexjar is empty, there is no way for

https://github.com/vkuznet/CMSSpark/blob/d42692cd75c20227b2988c033cec99788170da9c/src/python/CMSSpark/spark_utils.py#L419

to work because

aconv="org.apache.spark.examples.pythonconverters.AvroWrapperToJavaConverter"

won't exist.
I believe this is why I cannot get CMSSpark to work from ithdp-client01.cern.ch when I do

source hadoop-setconf.sh analytix

which is equivalent to

source /cvmfs/sft.cern.ch/lcg/etc/hadoop-confext/hadoop-setconf.sh analytix

In both cases you use the latest version of Spark, which doesn't distribute the examples. So, I cannot even suggest a fix.
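
One way to check this diagnosis on a given node is to look inside whatever jar is available for the converter class. The sketch below is an assumption-laden illustration, not CMSSpark code: `jar_has_class` is a hypothetical helper, and it relies only on the fact that a jar is a zip archive whose class entries follow the package path.

```python
import zipfile

# Class name taken from the thread; the jar path is supplied by the caller.
ACONV = "org.apache.spark.examples.pythonconverters.AvroWrapperToJavaConverter"


def jar_has_class(jar_path, class_name=ACONV):
    """Return True if the jar (a zip archive) contains the given class."""
    # a.b.C is stored in the archive as a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    try:
        with zipfile.ZipFile(jar_path) as jar:
            return entry in jar.namelist()
    except (OSError, zipfile.BadZipFile):
        # missing or unreadable jar: the converter cannot be loaded from it
        return False
```

If this returns False for every jar passed to Spark, the Avro converter named above cannot be resolved at run time.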

sciaba commented on July 28, 2024

I have submitted a ticket:

https://cern.service-now.com/service-portal/view-request.do?n=RQF1130097

vkuznet commented on July 28, 2024

sciaba commented on July 28, 2024

Hi Valentin,
thank you! Could you take a look at the SNOW ticket I created? Zbigniew is willing to distribute the examples jar, and you might want to agree with him on how and where.
I don't know if I'll have time to try your fix very soon, but I'll do my best.

sciaba commented on July 28, 2024

Just by looking at the code I noticed that you are pointing to the old version of the jar; Zbigniew made available the latest version, compatible with 2.3.2, on it-hadoop-client under
/usr/hdp/spark-2/examples/jars/spark-examples_2.11-2.3.2.jar

vkuznet commented on July 28, 2024

sciaba commented on July 28, 2024

The jar file has disappeared. Now I'm 100% stuck. Are you able to use CMSSpark at all?

vkuznet commented on July 28, 2024

sciaba commented on July 28, 2024

vkuznet commented on July 28, 2024

vkuznet commented on July 28, 2024

vkuznet commented on July 28, 2024

sciaba commented on July 28, 2024

It also worked for me from lxplus7, but only after replacing

.write.format("com.databricks.spark.csv")\

with

.write.format("csv")\

Otherwise I was getting

pyspark.sql.utils.AnalysisException: u'path hdfs://analytix/cms/users/asciaba/prova2/2018/10/15 already exists.;'

even if prova2 didn't exist before.
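
As a hedged sketch of how code could pick the format name instead of hard-coding it: Spark 2.0 and later include a built-in `csv` data source, while earlier releases relied on the external `com.databricks.spark.csv` package. The helper `csv_format` below is hypothetical and not part of CMSSpark, and it makes no claim about the cause of the error above.

```python
def csv_format(spark_version):
    """Pick the CSV data source name for a Spark version string like '2.3.2'."""
    major = int(spark_version.split(".")[0])
    # Spark 2.0+ ships a built-in "csv" source; older releases needed
    # the external com.databricks.spark.csv package
    return "csv" if major >= 2 else "com.databricks.spark.csv"


# Assumed usage, with `spark` a SparkSession, `df` a DataFrame and `path`
# an output location (all three are assumptions, not taken from the thread):
#   df.write.format(csv_format(spark.version)).save(path)
```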

sciaba commented on July 28, 2024

Could you try to do the same test but with code that reads Avro files (that is, that calls jm_tables or cmssw_tables)? This is to test if the Avro jar works with Spark 2.3.

vkuznet commented on July 28, 2024

vkuznet commented on July 28, 2024

sciaba commented on July 28, 2024

I could also run my code (which uses Avro files) successfully with and without Yarn. I suppose we can close this issue.
