Comments (23)
from cmsspark.
If so, how can I tell whether not finding the code on lxplus is a problem for my job? What would the symptoms be?
from cmsspark.
The second commit just removes the trailing comma if sparkexjar is empty, right? I guess it doesn't change anything from a practical point of view.
The only remaining issue is purely cosmetic: the script still prints
ls: cannot access /usr/lib/spark/examples/lib/spark-examples*: No such file or directory
so I'd add a 2> /dev/null to the ls command.
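For illustration, the two points above (silencing the ls error and avoiding a trailing comma when sparkexjar is empty) could be sketched in shell roughly as follows. This is only a sketch under assumptions, not the actual run_spark code: the helper names find_first_jar and join_jars and the path /path/to/other.jar are hypothetical, and only the variable name sparkexjar and the examples path come from this thread.

```shell
#!/bin/bash

# Hypothetical helper: print the first file matching a glob pattern, or nothing.
# The pattern is left unquoted on purpose so the shell expands it; this
# simple version does not handle paths containing spaces.
find_first_jar() {
    # 2>/dev/null hides "No such file or directory" when nothing matches;
    # "|| true" keeps the pipeline from failing under "set -o pipefail".
    { ls $1 2>/dev/null || true; } | head -n 1
}

# Hypothetical helper: join the non-empty arguments with commas, so an
# empty entry never leaves a leading or trailing comma behind.
join_jars() {
    local out=""
    local jar
    for jar in "$@"; do
        [ -z "$jar" ] && continue
        out="${out:+$out,}$jar"
    done
    printf '%s\n' "$out"
}

# sparkexjar is empty when the examples jar is not installed.
sparkexjar=$(find_first_jar "/usr/lib/spark/examples/lib/spark-examples*")

# /path/to/other.jar is a placeholder for whatever other jars are passed.
jars=$(join_jars "/path/to/other.jar" "$sparkexjar")
echo "jars=$jars"
```

Note that this only cleans up the output and the jar list; it does not help if the job actually needs classes from the examples jar.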
from cmsspark.
There is a fundamental problem. If sparkexjar is empty, there is no way for
to work because
aconv="org.apache.spark.examples.pythonconverters.AvroWrapperToJavaConverter"
won't exist.
I believe this is why I cannot get CMSSpark to work from ithdp-client01.cern.ch when I do
source hadoop-setconf.sh analytix
which is equivalent to
source /cvmfs/sft.cern.ch/lcg/etc/hadoop-confext/hadoop-setconf.sh analytix
In both cases you use the latest version of Spark, which doesn't distribute the examples. So, I cannot even suggest a fix.
from cmsspark.
I have submitted a ticket:
https://cern.service-now.com/service-portal/view-request.do?n=RQF1130097
from cmsspark.
Hi Valentin,
thank you! Could you take a look at the SNOW ticket I created? Zbigniew is willing to distribute the examples jar, and you might want to agree with him on how and where.
I don't know if I'll have time to try your fix very soon, but I'll do my best.
from cmsspark.
Just by looking at the code, I noticed that you are pointing to the old version of the jar. Zbigniew has made the latest version, compatible with Spark 2.3.2, available on it-hadoop-client under
/usr/hdp/spark-2/examples/jars/spark-examples_2.11-2.3.2.jar
from cmsspark.
The jar file has disappeared. Now I'm 100% stuck. Are you able to use CMSSpark at all?
from cmsspark.
It also worked for me from lxplus7, but only after replacing
.write.format("com.databricks.spark.csv")\
with
.write.format("csv")\
Otherwise I was getting
pyspark.sql.utils.AnalysisException: u'path hdfs://analytix/cms/users/asciaba/prova2/2018/10/15 already exists.;'
even though prova2 did not exist beforehand.
from cmsspark.
Could you run the same test, but with code that reads Avro files (that is, code that calls jm_tables or cmssw_tables)? This is to check whether the Avro jar works with Spark 2.3.
from cmsspark.
I was also able to run my code (which uses Avro files) successfully, both with and without YARN. I suppose we can close this issue.
from cmsspark.