nro / dataframe
DataFrame Library for Java
License: MIT License
Hey Alex,
if I put an index on my DataFrame, I can search for a row like this:
dataFrame.addIndex("idx", "barcode");
DataRow row = dataFrame.findByIndex("idx", "barcode1");
However, in some cases multiple rows match. Ideally it could return something like:
Iterable<DataRow> rows = dataFrame.findByIndex("idx", "barcode1");
Could you maybe add this?
Best, Simon
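To make the request concrete, here is a minimal sketch of an index that keeps every matching row per key instead of a single one. `MultiIndexSketch` and its methods are hypothetical names for illustration only and are not part of the library; rows are represented by their positions to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a multi-valued index; not the library's implementation.
class MultiIndexSketch {
    // Maps an indexed value (e.g. a barcode) to the positions of all rows that carry it.
    private final Map<String, List<Integer>> rowsByKey = new HashMap<>();

    // Registers one row position under the given key.
    void add(String key, int rowPosition) {
        rowsByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(rowPosition);
    }

    // Returns every matching row position, or an empty list if the key is unknown.
    List<Integer> find(String key) {
        return rowsByKey.getOrDefault(key, Collections.emptyList());
    }
}
```

Returning an empty collection for an unknown key would let callers loop over the result without a null check.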
Hi,
I changed the Locale to FRENCH in the NumberUtil class, but it is not applied when copying a column (I tried with the Double type, but I think it is the same for other types).
The only place where it is used is the DataFrame.print function.
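For reference, this is roughly what locale-aware parsing looks like with the JDK alone. It is only a sketch of the behaviour being asked for: `LocaleParseSketch` is a hypothetical helper, and how NumberUtil handles the locale internally is not shown in this report.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper; it does not reproduce the library's NumberUtil.
class LocaleParseSketch {
    // Parses text using the number conventions of the given locale,
    // e.g. a decimal comma for Locale.FRENCH.
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number for " + locale + ": " + text, e);
        }
    }
}
```

With this helper, `LocaleParseSketch.parse("3,5", Locale.FRENCH)` yields 3.5, which is the result one would expect from a column copy that respects the configured locale.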
How can I add all the values to the DataFrame at once instead of one by one using append?
Something like this:
for (int i = 1; i <= 334; i++) {
    df.append(parsed);
}
Hi. I am currently facing an issue with the DataFrame.
I have downloaded a file from a private Amazon S3 bucket, and I am having trouble filtering the rows that satisfy a certain condition.
Here is my code:
`
// This function connects to the private S3 bucket
connection();
S3Object s3object = s3client.getObject(bucketName, sourceFile);
DataFrame file = DataFrame.load(s3object.getObjectContent(), FileFormat.CSV);
// Print the column names and the size
System.out.println(file.getColumnNames().toString());
System.out.println(file.size());
// Select rows where the header column "AreaQ" is greater than 2
file.select("(AreaQ > 2)").print();`
The last line fails with the following exception: "Exception in getValues() with cause = 'NULL' and exception = 'column header name not found 'AreaQ'' de.unknownreality.dataframe.DataFrameRuntimeException: column header name not found 'AreaQ'"
Yet I do have a column named AreaQ with numeric values greater than 2.
Can you help me, please?
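The report does not include the printed column names, so the cause is unknown. One possibility, and it is only an assumption, is that the header as parsed differs from "AreaQ" by an invisible character (a byte order mark, a stray space, a quote) or that the whole header line was read as a single column because of a different separator. A small diagnostic along these lines makes such differences visible; `HeaderCheckSketch` is a hypothetical helper and not part of the library.

```java
// Hypothetical diagnostic helper; not part of the DataFrame library.
class HeaderCheckSketch {
    // Returns the name with every character outside the visible ASCII range
    // (including spaces) replaced by its escaped code point, so hidden
    // characters in a header show up when printed.
    static String reveal(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c > 32 && c < 127) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }
}
```

Applying `reveal` to each entry of `file.getColumnNames()` before printing would show whether the stored header is exactly `AreaQ`.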
Discovered this while adding columns to data frames in a multi-threaded environment. Each thread has its own frame, so concurrency should not be an issue. However, ParserUtil#getParserMap() and ParserUtil#init() are not thread-safe. The lazy initialization of parserMap can lead to unexpected "Parser not found" errors when adding columns to multiple frames in multiple threads. This occurs because the if (parserMap == null) check no longer triggers (since the map has already been created by init()), but the map has not finished initializing.
From just looking at ParserUtil.java, there doesn't seem to be a reason to initialize this lazily. Making the parser map static final and initializing it in a static block seems like a good solution that avoids multi-threading issues.
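A minimal sketch of the suggested fix, under the assumption that the map only needs to be populated once: the registry below is built in a static block and published through a static final field, so no thread can observe it half-filled. `EagerParserRegistry` and its three example parsers are hypothetical stand-ins; the real ParserUtil and its parser types are not reproduced here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for ParserUtil, for illustration only.
final class EagerParserRegistry {
    // Assigned exactly once during class initialization, which the JVM
    // performs under a lock before any thread can use the class.
    private static final Map<Class<?>, Function<String, ?>> PARSERS;

    static {
        Map<Class<?>, Function<String, ?>> map = new HashMap<>();
        map.put(Integer.class, Integer::valueOf);
        map.put(Double.class, Double::valueOf);
        map.put(Boolean.class, Boolean::valueOf);
        // The map is complete before it becomes visible and is never modified afterwards.
        PARSERS = Collections.unmodifiableMap(map);
    }

    private EagerParserRegistry() {
    }

    // Looks up the parser for the given type and applies it to the text.
    static Object parse(Class<?> type, String text) {
        Function<String, ?> parser = PARSERS.get(type);
        if (parser == null) {
            throw new IllegalArgumentException("Parser not found: " + type);
        }
        return parser.apply(text);
    }
}
```

The trade-off is that the map is built even if no parser is ever requested, which seems negligible for a handful of entries.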
In my situation, I need to get the DataFrame size frequently.
Given the current API, I would do this with
df.toList().size()
but the toList method defined in BaseDataFrame is expensive:
@Override
public List<List> toList() {
    ArrayList<List> list = new ArrayList<>();
    for (DataRow row : this) {
        List data = new ArrayList();
        for (int i = 0; i < columns.length; i++) {
            data.add(row.get(i));
        }
        list.add(data);
    }
    return list;
}
Any ideas?
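Another report on this page calls `file.size()` directly on a DataFrame, which suggests a size accessor may already be available. If it is not suitable, the toList code above shows that a data frame can be iterated row by row, so counting during iteration would at least avoid building a list per row. The helper below is a hypothetical sketch of that fallback and works on any Iterable.

```java
// Hypothetical helper; it relies only on the frame being iterable.
class RowCountSketch {
    // Counts elements by walking the iterable once, without copying any row data.
    static int count(Iterable<?> rows) {
        int n = 0;
        for (Object ignored : rows) {
            n++;
        }
        return n;
    }
}
```

This is still linear in the number of rows, so a constant-time size accessor remains preferable when one exists.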
With the introduction of custom value types (#22), all column values can be written to and read from DataStreams. This enables the implementation of a binary file format to improve performance and decrease file sizes.
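To illustrate the idea, here is a rough sketch of a binary round trip for a single numeric column, assuming "DataStreams" refers to the JDK's DataOutputStream and DataInputStream. The layout (a length prefix followed by the raw values) and the class name `BinaryColumnSketch` are made up for this example and are not the format the library defines.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical layout for illustration: an int length followed by the values.
class BinaryColumnSketch {
    // Serializes the column as a 4-byte count plus 8 bytes per value.
    static byte[] write(double[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the count first, then exactly that many values.
    static double[] read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            double[] values = new double[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readDouble();
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each double takes a fixed 8 bytes here and needs no text parsing on load, which is where the performance gain over CSV would come from.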
Apache Commons CSV can be used to improve CSV format support.
This can be done either by including it in the main project or by creating an optional module (similar to the GTF support).
Hi Alexander,
I'm looking at "DataFrame" to use it in one of my pet projects
Siegfried
Introduce a value type abstraction to support custom column types (like temporal column types, #21).
Value types should provide the following methods:
These value types must then be used in the following library parts:
Value types are currently being worked on in branch value-type-abstraction.
Progress: