Comments (5)
Yes, you can use
SparkUtils.flattenSchema(df)
from cobrix.
Can we use this if our code base is in Python? I believe we could use something like py4j, but is there an easier way?
We haven't tried this, but you might be able to call the method through the Python-to-JVM gateway (sc._gateway.jvm...), the same way PySpark interacts with the Scala codebase.
Alternatively, the source code for the flattening is not too big, so you can convert it to Python for your own use. If you want, you can also contribute the Python version of the flattening code to Cobrix, and we can include it as one of the examples.
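For anyone weighing the second option, here is a minimal sketch of what a Python port might look like. It is not the Cobrix implementation: the function names `flatten_struct_paths` and `flatten_structs` are made up for illustration, it only expands nested structs (arrays are left as they are), and it assumes field names contain no dots. The path logic works on the plain dict returned by `df.schema.jsonValue()`, so it can be checked without a Spark session.

```python
def flatten_struct_paths(schema_json, prefix=""):
    """Return (dotted_path, flat_name) pairs for every non-struct field.

    `schema_json` is the dict form of a struct schema, e.g. the result of
    `df.schema.jsonValue()`. Hypothetical helper, not part of Cobrix.
    """
    pairs = []
    for field in schema_json["fields"]:
        path = f"{prefix}.{field['name']}" if prefix else field["name"]
        ftype = field["type"]
        if isinstance(ftype, dict) and ftype.get("type") == "struct":
            # Nested struct: recurse and keep extending the dotted path.
            pairs.extend(flatten_struct_paths(ftype, path))
        else:
            # Leaf (primitive, array, map): keep it as a single column.
            pairs.append((path, path.replace(".", "_")))
    return pairs


def flatten_structs(df):
    """Select every leaf field of `df` as a top-level column (structs only)."""
    # Imported here so the path logic above stays usable without PySpark.
    from pyspark.sql.functions import col

    pairs = flatten_struct_paths(df.schema.jsonValue())
    return df.select([col(path).alias(name) for path, name in pairs])
```

With a DataFrame `df` in hand, `flatten_structs(df)` would return a new DataFrame whose nested fields such as `addr.city` become columns named `addr_city`.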
Sorry for the late reply. Can you show a snippet of how the call from the JVM to Python gateway would work?
Regarding contribution, I do not have the bandwidth at the moment.
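I haven't actually tried this with the flattening code specifically. I might check it out and let you know if it works.
I expect something like:
from pyspark.sql import DataFrame
sc = spark.sparkContext
# Scala default arguments are not applied through py4j, so pass any additional flattenSchema parameters explicitly
flat_java_df = sc._gateway.jvm.za.co.absa.cobrix.spark.cobol.utils.SparkUtils.flattenSchema(df._jdf)
# Wrap the returned Java DataFrame (on Spark older than 3.3, pass df.sql_ctx instead of spark)
flat_df = DataFrame(flat_java_df, spark)
This assumes the Cobrix libraries are on the classpath.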