linkedin / datahub-gma
General Metadata Architecture
License: Apache License 2.0
In the medium to long term, we'd like to move more code over from linkedin/datahub than was in this initial move. This includes the restli DAO and jobs (or at least very easy-to-use libraries for the jobs).
PR#132 changes the intended behavior of `get()` when the parameter `aspectNames` is empty. Instead of throwing a 404, it will return VALUE with no aspects. To accomplish this behavior, the `getInternalNonEmpty` call was changed to `getInternal`. We need to evaluate this behavior with the `getAll()`, `batchGet()`, and `search()` methods to make sure they are aligned with `get()`'s new behavior.
Double-check and update calls to `getInternalNonEmpty` if needed to match the new `get()` behavior.
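The difference between the two lookup styles can be sketched in isolation. This is only an illustration of the semantics described above: the class, the in-memory store, and the method names `lookup` and `lookupNonEmpty` are hypothetical stand-ins, not the real GMA signatures.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

class EmptyAspectLookupSketch {
    // Hypothetical in-memory stand-in for the aspects stored for one entity.
    private static final Map<String, String> STORED = new HashMap<>();
    static {
        STORED.put("Ownership", "owners=[alice]");
    }

    // Lenient lookup: returns whatever was found, possibly an empty map.
    static Map<String, String> lookup(Set<String> aspectNames) {
        Map<String, String> found = new HashMap<>();
        for (String name : aspectNames) {
            if (STORED.containsKey(name)) {
                found.put(name, STORED.get(name));
            }
        }
        return found;
    }

    // Strict lookup: treats an empty result as "not found" (the 404-style behavior).
    static Map<String, String> lookupNonEmpty(Set<String> aspectNames) {
        Map<String, String> found = lookup(aspectNames);
        if (found.isEmpty()) {
            throw new NoSuchElementException("no aspects found");
        }
        return found;
    }

    public static void main(String[] args) {
        // Empty aspectNames: the lenient path yields an empty result instead of failing.
        System.out.println(lookup(Collections.emptySet()));
        try {
            lookupNonEmpty(Collections.emptySet());
        } catch (NoSuchElementException e) {
            System.out.println("strict lookup rejected empty aspectNames");
        }
    }
}
```

Any method that still goes through the strict path would keep failing on an empty `aspectNames`, which is the kind of inconsistency this issue asks to check for.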
Create a roadmap. Ideally we phrase this as the roadmap to a 1.0 (more stable) release.
We haven't just split our code up, but also our documentation. The docs need more cleanup than what I've given them initially here. For now, users should probably read the complete / untouched documentation on linkedin/datahub.
A few methods that are not overridden, such as `getAllWithMetadata`, are missing Javadoc. For methods that are overridden, it would be good to have appropriate Javadoc as well.
Spotless can help here.
Javadoc formatting also includes ensuring that A) it is valid Javadoc (HTML) and B) its references are valid.
`docs/developers.md` says `./gradlew build` should work, but it no longer does out of the box. Before, we were on ea281ea and had no problems.
With current master (ff9a36b) I noticed I had to:
- [...] `.github/workflows/build-and-test.yml`), maybe this can just be updated in the docs?
- Run `TZ=UTC ./gradlew build` to make `:dao-impl:ebean-dao:test` pass; otherwise I got stuff like:
12:01:29.310 [DEBUG] [TestEventLogger] Gradle suite > Gradle test > com.linkedin.metadata.dao.localrelationship.EbeanLocalRelationshipWriterDAOTest.testAddRelationshipWithRemoveAllEdgesToDestination FAILED
12:01:29.311 [DEBUG] [TestEventLogger] javax.persistence.PersistenceException: Data truncation: Incorrect datetime value: '1970-01-01 00:00:01' for column 'lastmodifiedon' at row 1
I think this is 1970/01/01 in my timezone (=GMT+1), hence before 1970 in UTC. I thought the latest commits could maybe have fixed it since they mentioned `lastmodifiedon`, but apparently the issue is still there (tested with a8fb6c9 and ff9a36b).
Tests should work regardless of timezone
Ubuntu 22.04
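The reporter's hypothesis can be checked with a few lines of `java.time`. This is a minimal sketch of the arithmetic only; it does not confirm how the DAO tests actually produce the value. If the column is a MySQL `TIMESTAMP`, whose supported range starts at '1970-01-01 00:00:01' UTC, a value that falls before the epoch after conversion to UTC would be rejected.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

class EpochOffsetSketch {
    // Epoch second of the wall-clock value '1970-01-01 00:00:01'
    // when it is interpreted at the given UTC offset.
    static long epochSecondAt(ZoneOffset offset) {
        return LocalDateTime.of(1970, 1, 1, 0, 0, 1).toEpochSecond(offset);
    }

    public static void main(String[] args) {
        // Interpreted as UTC, the value is one second after the epoch.
        System.out.println(epochSecondAt(ZoneOffset.UTC));        // 1
        // Interpreted at GMT+1, the same wall-clock value lies before the epoch.
        System.out.println(epochSecondAt(ZoneOffset.ofHours(1))); // -3599
    }
}
```

This is consistent with `TZ=UTC` making the test pass, although the fix in the tests themselves would presumably be to avoid depending on the JVM's default timezone.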
From linkedin/datahub
Currently, the "CONTAIN" filter condition is not supported in Search, because the `getQueryBuilderFromCriterion` method in the `SearchUtils` class explicitly does not support it.
```java
// Excerpt from getQueryBuilderFromCriterion in SearchUtils (method signature omitted in the report).
  final Condition condition = criterion.getCondition();
  if (condition == Condition.EQUAL) {
    // URN values are matched whole; they are not split on commas.
    if (criterion.getValue().startsWith("urn:li:")) {
      return QueryBuilders.termsQuery(criterion.getField(), criterion.getValue().trim());
    }
    // Other values are treated as a comma-separated list of terms.
    return QueryBuilders.termsQuery(criterion.getField(), criterion.getValue().trim().split("\\s*,\\s*"));
  } else if (condition == Condition.GREATER_THAN) {
    return QueryBuilders.rangeQuery(criterion.getField()).gt(criterion.getValue().trim());
  } else if (condition == Condition.GREATER_THAN_OR_EQUAL_TO) {
    return QueryBuilders.rangeQuery(criterion.getField()).gte(criterion.getValue().trim());
  } else if (condition == Condition.LESS_THAN) {
    return QueryBuilders.rangeQuery(criterion.getField()).lt(criterion.getValue().trim());
  } else if (condition == Condition.LESS_THAN_OR_EQUAL_TO) {
    return QueryBuilders.rangeQuery(criterion.getField()).lte(criterion.getValue().trim());
  }
  // Every other condition, including CONTAIN, falls through to here.
  throw new UnsupportedOperationException("Unsupported condition: " + condition);
}
```
(even though it exists in the Filter Condition enum).
#### To Reproduce
Steps to reproduce the behavior:
1. Deploy DataHub
2. Issue a Search Query with a specific "filter" criteria that has the condition "CONTAIN". You'll see a server error returned.
![image](https://user-images.githubusercontent.com/17549204/125294210-efdd4080-e2d8-11eb-8a0d-143f87c93453.png)
#### Expected behavior
Contains operator should work for substring of string fields.
As reported by Lal Rishav at Saxo Bank!
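One possible way to support substring matching is to turn the criterion value into a wildcard pattern and hand it to Elasticsearch's `QueryBuilders.wildcardQuery(field, pattern)`. The sketch below shows only the pattern-building step so that it runs without an Elasticsearch dependency. The helper name `toContainPattern` is hypothetical, and using a wildcard query at all is an assumption rather than the project's chosen fix.

```java
class ContainPatternSketch {
    // Builds a wildcard pattern that matches any string containing `value`.
    // Wildcard metacharacters in the input ('*', '?', and the escape
    // character '\') are escaped so that they match literally.
    static String toContainPattern(String value) {
        StringBuilder escaped = new StringBuilder();
        for (char c : value.trim().toCharArray()) {
            if (c == '\\' || c == '*' || c == '?') {
                escaped.append('\\');
            }
            escaped.append(c);
        }
        return "*" + escaped + "*";
    }

    public static void main(String[] args) {
        System.out.println(toContainPattern(" sales ")); // *sales*
        System.out.println(toContainPattern("a*b"));     // *a\*b*
    }
}
```

A `CONTAIN` branch could then pass `toContainPattern(criterion.getValue())` to the wildcard query. Leading wildcards can be expensive on large indices, and the result depends on how the field is analyzed, so this would need to be evaluated against real data.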
dyld: Library not loaded: /usr/local/opt/openssl/lib/libssl.1.0.0.dylib
Steps to reproduce the behavior: use an M1 chip Mac.
Expected behavior: the build should work.
I tried to install OpenSSL 1.0 myself, per the recommendation in #220.
However, I ran into a series of exceptions; here are related posts:
I was not able to install this version. Can we choose another OpenSSL version, since I can successfully install 1.1? Or can someone help me install it? Many thanks.
Now that this code lives in this git repo & we have published jars, we should be able to stop pushing code from internal and make this the source of truth.
We first need to catch up, then we can switch. We're a ways behind.
Some references to the code that splits the value provided in the `Criterion` model by commas include
We should add integration tests for Elasticsearch, or at least a framework for it in GMA so it is easy to write integration tests in DataHub.
We need to start publishing jars so DataHub can use them.
To be more inclusive: https://github.com/github/renaming
Bintray is being deprecated. Awaiting guidance from LinkedIn for a migration path forward.
https://jfrog.com/blog/into-the-sunset-bintray-jcenter-gocenter-and-chartcenter/
Deadline seems to be May 1st.
We should promote warnings to errors, and clean up said errors. There are a few places in our code that are a bit sloppy because these warnings are not enforced (e.g. `rawtypes` and `unchecked` warnings).
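For reference, this is the kind of code those two warnings flag. The sketch is illustrative and not taken from the GMA codebase; the class and method names are made up. With `javac -Xlint:rawtypes,unchecked -Werror`, the raw-typed method would fail the build if its warnings were not suppressed.

```java
import java.util.ArrayList;
import java.util.List;

class LintWarningsSketch {
    // Raw types: javac reports a rawtypes warning for `List` / `ArrayList`
    // and an unchecked warning for the add() call.
    // Suppressed here only so that this sketch still builds under -Werror.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List rawNames() {
        List names = new ArrayList();
        names.add("urn:li:example");
        return names;
    }

    // Parameterized types: no warning, and the element type is checked
    // at compile time.
    static List<String> typedNames() {
        List<String> names = new ArrayList<>();
        names.add("urn:li:example");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(rawNames().size());
        System.out.println(typedNames().get(0));
    }
}
```

Cleaning up would mean replacing the first form with the second wherever possible, and keeping `@SuppressWarnings` only where a raw or unchecked use is unavoidable.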
Post move we probably have a lot of unused dependencies we can remove.
The `BaseQueryDAO` exposes four methods with a `Statement` argument, which is referred to as a "raw graph query statement", e.g.
What query language is expected here? Does it depend on the actual `BaseQueryDAO` implementation?
If that is the case, any code that uses such statement methods becomes implementation-specific, which contradicts the DAO approach.
We are considering using an upstream project datahub. Our team is an AWS shop, and would like to take advantage of AWS hosted solutions like Neptune whenever possible. It would be great to add support for Gremlin (one of the interfaces that Neptune implements) so that we can easily host the graph database for datahub.
Implement a BaseGraphWriterDAO and BaseQueryDAO for gremlin based graph data stores.
Alternatives would be finding 3rd party neo4j SaaS provider, or hosting our own database cluster within AWS. Both of these are something that we would prefer to avoid if possible, for both cost and business reasons.