Comments (5)
Hi @r3econ, thanks for reaching out.
How big are the rows you are trying to load? Given that you use --connector.csv.maxCharsPerColumn -1, I suppose they are quite large.
When dealing with large rows, here are some things that work:
- Throttle DSBulk. This is best done by setting --engine.maxConcurrentQueries X, where X is a small number. Start with 1: if it works, that's great, but the throughput will be poor. Then try increasing the number little by little to improve DSBulk's throughput without breaking the operation.
- Increase the heap size. This is done by setting the environment variable DSBULK_JAVA_OPTS, e.g. export DSBULK_JAVA_OPTS="-Xms1g -Xmx1g"
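Put together, the two suggestions above could look roughly like the sketch below. The keyspace, table, and file path are hypothetical placeholders, and the print_or_run helper is only an illustration added here so the command can be inspected before it is executed; only the DSBulk options themselves come from this thread.

```shell
#!/bin/sh
# Sketch only: KEYSPACE, TABLE and CSV_FILE are hypothetical placeholders.
KEYSPACE="my_keyspace"
TABLE="my_table"
CSV_FILE="/path/to/large_rows.csv"

# Start at 1 and raise this between runs while the load stays stable.
MAX_QUERIES="${MAX_QUERIES:-1}"

# Heap settings for the DSBulk JVM. 1g is just the example value from the
# comment above; adjust it to the memory available on your machine.
export DSBULK_JAVA_OPTS="${DSBULK_JAVA_OPTS:--Xms1g -Xmx1g}"

# Hypothetical helper: prints the command by default, runs it when RUN=1.
print_or_run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

print_or_run dsbulk load -k "$KEYSPACE" -t "$TABLE" -url "$CSV_FILE" \
  --connector.csv.maxCharsPerColumn -1 \
  --engine.maxConcurrentQueries "$MAX_QUERIES"
```

Running it as-is only prints the command. Once it looks right, RUN=1 executes it, and MAX_QUERIES=2 (then 3, and so on) can be passed on later runs to raise the concurrency step by step.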
Hope that helps!
from dsbulk.
You can also add --dsbulk.log.sources false to lower the heap pressure.
Any updates on this, @r3econ?
Yes, thanks for the info. I managed to get it working by tweaking the parameters.
Glad to hear! Let's close this issue then.