Related Issues (20)
- [Bug] kfp workflow-test in ci/cd test.yml is not reproducible and may select an invalid dir.
- [Feature] Develop ability to run ci/cd testing only on the portion of repo that has changed and its dependencies
- Running out of disk space in ci/cd tests
- [Bug] PII Transform - has portion of the CI/CD testing disabled
- Enhance Code2Parquet module to handle non-code text as well
- Look into the Quay security scanner checks that show a couple of critical severity issues with our images (e.g., as related to the existence of old pyarrow versions)
- [Bug] Remove Test from published library release to pypi
- [Feature] Publish Single Wheel for Doc Quality Transform
- [Feature] Allow a transform to define the file extensions it supports
- [Feature] Allow selected metadata fields to be ignored during tests.
- [Bug] set-versions does not work for all files in the various git folders
- [Feature] Provide an operator that loads files content to parquet
- [Feature] In-Memory / no persistence storage runtime
- [Bug] fasttext==0.9.2 doesn't build/install with GCC v13 compiler
- [Bug] doc_id_transform not published to PYPI
- [Bug] pdf2parquet is now failing ci/cd builds
- [Feature] Publish data-prep-kit core and transforms NIGHTLY into pypi
- [Feature] Add a black list for KFP workflow tests
- [Bug] header_cleanser fails in running in openshift
- [Feature] HTML to Markdown (based on HTML2Parquet trafilatura code)
from data-prep-kit.