odpi / egeria-connector-hadoop-ecosystem
Hadoop ecosystem connectors for Egeria: repository proxy connector for Apache Atlas.
License: Apache License 2.0
Hi, I'm trying to set up an Egeria instance and connect to it using the Hadoop connector.
I'm using the documentation at: https://odpi.github.io/egeria-connector-hadoop-ecosystem/getting-started/index.html#/7/1
Currently I have deployed an Apache Atlas instance using Docker:
docker run -p 9026:9026 -p 9027:9027 -p 21000:21000 docker.io/planetf1/apache-atlas:latest
And an Apache Kafka instance using:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
I followed the documentation except for the part that uses truststore.p12.
The only difference is that I use http instead of https.
But when I launch the server using:
curl -k -X POST "http://localhost:8080/open-metadata/admin-services/users/admin/servers/atlas/instance"
I always get this error:
Desktop ➜ curl -k -X POST "http://localhost:8080/open-metadata/admin-services/users/admin/servers/atlas/instance"
{"class":"SuccessMessageResponse","relatedHTTPCode":500,"exceptionClassName":"org.odpi.openmetadata.adminservices.ffdc.exception.OMAGConfigurationErrorException","exceptionCausedBy":"org.odpi.openmetadata.repositoryservices.ffdc.exception.OMRSConfigErrorException","actionDescription":"activateWithSuppliedConfig","exceptionErrorMessage":"OMAG-ADMIN-500-001 Method activateWithSuppliedConfig for OMAG server atlas returned an unexpected exception of org.odpi.openmetadata.repositoryservices.ffdc.exception.OMRSConfigErrorException with message OMRS-CONNECTOR-400-005 The connector to the local repository failed with a org.odpi.openmetadata.repositoryservices.ffdc.exception.OMRSLogicErrorException exception and the following error message: OMRS-REPOSITORY-400-025 Local metadata repository has not initialized correctly because it was unable to create its metadata collection","exceptionErrorMessageId":"OMAG-ADMIN-500-001","exceptionErrorMessageParameters":["atlas","activateWithSuppliedConfig","org.odpi.openmetadata.repositoryservices.ffdc.exception.OMRSConfigErrorException","OMRS-CONNECTOR-400-005 The connector to the local repository failed with a org.odpi.openmetadata.repositoryservices.ffdc.exception.OMRSLogicErrorException exception and the following error message: OMRS-REPOSITORY-400-025 Local metadata repository has not initialized correctly because it was unable to create its metadata collection"],"exceptionSystemAction":"The system is unable to work with the OMAG server. No change was made to the server's configuration document.","exceptionUserAction":"This is likely to be either a configuration, operational or logic error. Look for other errors. Validate the request. If you are stuck, raise an issue."}
I don't understand what "Local metadata repository has not initialized correctly because it was unable to create its metadata collection" means.
The other configuration POST calls do not throw any error; they just return {"class":"VoidResponse","relatedHTTPCode":200}, as the documentation says.
Could someone give me some hint about what is happening? I'm really stuck at this point.
Thank you very much in advance!
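The responses quoted in this issue report their outcome in the relatedHTTPCode field of the JSON body, so a client script can check that field rather than only the class name. Below is a minimal sketch of such a check. The class and method names are hypothetical (not part of Egeria), and a regular expression is used only to keep the example dependency-free; a real client would more likely use a JSON parser.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an Egeria class): reads relatedHTTPCode from a response body.
final class AdminResponseCheck {
    private static final Pattern RELATED_CODE =
            Pattern.compile("\"relatedHTTPCode\"\\s*:\\s*(\\d{3})");

    // Returns the three-digit code if the field is present, otherwise empty.
    static OptionalInt relatedHttpCode(String body) {
        Matcher m = RELATED_CODE.matcher(body);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    // Treats any 2xx relatedHTTPCode as success; a missing field counts as failure.
    static boolean succeeded(String body) {
        OptionalInt code = relatedHttpCode(body);
        return code.isPresent() && code.getAsInt() >= 200 && code.getAsInt() < 300;
    }
}
```

With the two bodies shown in this issue, succeeded(...) would be true for the VoidResponse and false for the response carrying OMAG-ADMIN-500-001.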
eg. metadataCollectionId, createdBy, version, etc.
Atlas does not appear to support searching for empty values. Neither the = "" nor the isNull operator appears to work to achieve this, for example, against an empty description property.
The implication of this is that the Egeria Conformance Test Suite (CTS) cannot be fully passed, as some search scenarios cannot be supported.
CTS currently has issues against the Anchors and ProcessCall types (only):
{
"profileId": 0,
"requirementId": 3,
"testCaseId": "repository-typedef-ProcessCall-null",
"testCaseName": "Repository type definition test case",
"testCaseDescriptionURL": "https://egeria.odpi.org/open-metadata-conformance-suite/docs/repository-workbench/test-cases/repository-typedef-test-case.md",
"testEvidenceType": "UNEXPECTED_EXCEPTION",
"assertionMessage": "Unexpected Exception TypeDefNotSupportedException : OMRS-ATLAS-REPOSITORY-404-001 The typedef \"ProcessCall\" is not supported by repository \"atlas\"",
"conformanceException": {
"exceptionClassName": "org.odpi.openmetadata.repositoryservices.ffdc.exception.TypeDefNotSupportedException",
"errorMessage": "OMRS-ATLAS-REPOSITORY-404-001 The typedef \"ProcessCall\" is not supported by repository \"atlas\""
}
},
{
"profileId": 0,
"requirementId": 3,
"testCaseId": "repository-typedef-Anchors-null",
"testCaseName": "Repository type definition test case",
"testCaseDescriptionURL": "https://egeria.odpi.org/open-metadata-conformance-suite/docs/repository-workbench/test-cases/repository-typedef-test-case.md",
"testEvidenceType": "UNEXPECTED_EXCEPTION",
"assertionMessage": "Unexpected Exception TypeDefNotSupportedException : OMRS-ATLAS-REPOSITORY-404-001 The typedef \"Anchors\" is not supported by repository \"atlas\"",
"conformanceException": {
"exceptionClassName": "org.odpi.openmetadata.repositoryservices.ffdc.exception.TypeDefNotSupportedException",
"errorMessage": "OMRS-ATLAS-REPOSITORY-404-001 The typedef \"Anchors\" is not supported by repository \"atlas\""
}
}
The getting started guide associated with this connector needs to either be removed or updated since it is highly misleading:
Here is the link https://odpi.github.io/egeria-connector-hadoop-ecosystem/getting-started/index.html
Currently this repository is oriented around the build of our original Atlas connector.
There are two further pieces of work on
I think it's highly likely there will be substantial differences for these projects as they are very distinct in terms of
I'd therefore propose we
(The following is premised on substantial differences. If in fact all the connectors will be very similar, then we could skip this, but I think the above is still desirable to create space for the new modules, and I would err towards the flexible approach.)
Part 2 could then include
- renaming build scripts to qualify them as for atlas only
- ideally we only want to rebuild if that particular area of code is modified vs any change in repo
Where we then add new areas we can create additional, selective actions for clarity
Does this make sense? @davidradl @wbittles @cmgrote
When running the egeria-server-chassis-spring.jar file according to the README, the truststore file is required. If it isn't in the local path, the server won't start. Including truststore.p12 from the egeria repo fixes this problem, but it is not documented yet. May be related to odpi/egeria#3705
It seems that there are overlapping dependencies:
[WARNING] asm-5.0.4.jar, asm-3.1.jar define 21 overlapping classes:
[WARNING] - org.objectweb.asm.Type
[WARNING] - org.objectweb.asm.AnnotationVisitor
[WARNING] - org.objectweb.asm.MethodVisitor
[WARNING] - org.objectweb.asm.Attribute
[WARNING] - org.objectweb.asm.FieldWriter
[WARNING] - org.objectweb.asm.signature.SignatureWriter
[WARNING] - org.objectweb.asm.MethodWriter
[WARNING] - org.objectweb.asm.Edge
[WARNING] - org.objectweb.asm.Handler
[WARNING] - org.objectweb.asm.ByteVector
[WARNING] - 11 more...
[WARNING] hadoop-auth-2.9.2.jar, hadoop-core-1.2.1.jar define 20 overlapping classes:
[WARNING] - org.apache.hadoop.security.authentication.server.AuthenticationFilter
[WARNING] - org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler
[WARNING] - org.apache.hadoop.security.authentication.util.KerberosUtil
[WARNING] - org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler
[WARNING] - org.apache.hadoop.security.authentication.client.AuthenticationException
[WARNING] - org.apache.hadoop.security.authentication.client.AuthenticatedURL
[WARNING] - org.apache.hadoop.security.authentication.client.AuthenticatedURL$Token
[WARNING] - org.apache.hadoop.security.authentication.util.Signer
[WARNING] - org.apache.hadoop.security.authentication.client.Authenticator
[WARNING] - org.apache.hadoop.security.authentication.client.KerberosAuthenticator
[WARNING] - 10 more...
[WARNING] jsp-2.1-6.1.14.jar, jasper-runtime-5.5.12.jar define 43 overlapping classes:
[WARNING] - org.apache.jasper.runtime.PerThreadTagHandlerPool$1
[WARNING] - org.apache.jasper.runtime.JspFactoryImpl$PrivilegedGetPageContext
[WARNING] - org.apache.jasper.runtime.PageContextImpl$4
[WARNING] - org.apache.jasper.runtime.JspSourceDependent
[WARNING] - org.apache.jasper.runtime.JspRuntimeLibrary$PrivilegedIntrospectHelper
[WARNING] - org.apache.jasper.runtime.PageContextImpl$2
[WARNING] - org.apache.jasper.Constants
[WARNING] - org.apache.jasper.runtime.ProtectedFunctionMapper$2
[WARNING] - org.apache.jasper.runtime.PageContextImpl
[WARNING] - org.apache.jasper.runtime.PageContextImpl$11
[WARNING] - 33 more...
[WARNING] spring-jcl-5.1.3.RELEASE.jar, commons-logging-1.2.jar, jcl-over-slf4j-1.7.7.jar define 4 overlapping classes:
[WARNING] - org.apache.commons.logging.Log
[WARNING] - org.apache.commons.logging.impl.SimpleLog
[WARNING] - org.apache.commons.logging.impl.NoOpLog
[WARNING] - org.apache.commons.logging.LogFactory
[WARNING] jsr311-api-1.1.1.jar, jersey-bundle-1.19.4.jar, jersey-core-1.8.jar define 55 overlapping classes:
[WARNING] - javax.ws.rs.core.HttpHeaders
[WARNING] - javax.ws.rs.ext.RuntimeDelegate$HeaderDelegate
[WARNING] - javax.ws.rs.DefaultValue
[WARNING] - javax.ws.rs.core.StreamingOutput
[WARNING] - javax.ws.rs.HEAD
[WARNING] - javax.ws.rs.core.Request
[WARNING] - javax.ws.rs.ext.Providers
[WARNING] - javax.ws.rs.core.NewCookie
[WARNING] - javax.ws.rs.core.UriBuilderException
[WARNING] - javax.ws.rs.ext.ContextResolver
[WARNING] - 45 more...
[WARNING] servlet-api-2.5.jar, servlet-api-2.5-20081211.jar, servlet-api-2.5-6.1.14.jar define 42 overlapping classes:
[WARNING] - javax.servlet.http.HttpSessionBindingEvent
[WARNING] - javax.servlet.http.Cookie
[WARNING] - javax.servlet.http.NoBodyResponse
[WARNING] - javax.servlet.ServletContext
[WARNING] - javax.servlet.ServletOutputStream
[WARNING] - javax.servlet.http.HttpSessionListener
[WARNING] - javax.servlet.http.HttpSessionContext
[WARNING] - javax.servlet.FilterChain
[WARNING] - javax.servlet.GenericServlet
[WARNING] - javax.servlet.http.HttpServletRequestWrapper
[WARNING] - 32 more...
[WARNING] eclipselink-2.5.2.jar, javax.persistence-2.1.0.jar define 104 overlapping classes:
[WARNING] - javax.persistence.Convert
[WARNING] - javax.persistence.criteria.Join
[WARNING] - javax.persistence.LockTimeoutException
[WARNING] - javax.persistence.NamedEntityGraph
[WARNING] - javax.persistence.criteria.CriteriaUpdate
[WARNING] - javax.persistence.metamodel.SingularAttribute
[WARNING] - javax.persistence.StoredProcedureQuery
[WARNING] - javax.persistence.AccessType
[WARNING] - javax.persistence.metamodel.Bindable$BindableType
[WARNING] - javax.persistence.metamodel.IdentifiableType
[WARNING] - 94 more...
[WARNING] hadoop-annotations-2.9.2.jar, hadoop-core-1.2.1.jar define 8 overlapping classes:
[WARNING] - org.apache.hadoop.classification.InterfaceStability
[WARNING] - org.apache.hadoop.classification.InterfaceAudience$Private
[WARNING] - org.apache.hadoop.classification.InterfaceAudience$LimitedPrivate
[WARNING] - org.apache.hadoop.classification.InterfaceStability$Evolving
[WARNING] - org.apache.hadoop.classification.InterfaceStability$Stable
[WARNING] - org.apache.hadoop.classification.InterfaceAudience
[WARNING] - org.apache.hadoop.classification.InterfaceAudience$Public
[WARNING] - org.apache.hadoop.classification.InterfaceStability$Unstable
[WARNING] jersey-bundle-1.19.4.jar, jersey-core-1.8.jar define 283 overlapping classes:
[WARNING] - com.sun.jersey.core.spi.factory.MessageBodyFactory$1
[WARNING] - com.sun.jersey.core.impl.provider.entity.DataSourceProvider$ByteArrayDataSource$DSByteArrayOutputStream
[WARNING] - com.sun.jersey.core.header.reader.CookiesParser$MutableCookie
[WARNING] - com.sun.jersey.core.provider.jaxb.AbstractListElementProvider
[WARNING] - com.sun.jersey.core.spi.component.ioc.IoCProviderFactory
[WARNING] - com.sun.jersey.core.impl.provider.header.StringProvider
[WARNING] - com.sun.jersey.core.util.KeyComparatorLinkedHashMap$KeyIterator
[WARNING] - com.sun.jersey.core.impl.provider.entity.SourceProvider$SAXSourceReader
[WARNING] - com.sun.jersey.core.spi.factory.ResponseBuilderImpl
[WARNING] - com.sun.jersey.core.spi.component.ProviderFactory$SingletonComponentProvider
[WARNING] - 273 more...
[WARNING] commons-beanutils-core-1.8.0.jar, commons-beanutils-1.7.0.jar define 82 overlapping classes:
[WARNING] - org.apache.commons.beanutils.ConvertUtilsBean
[WARNING] - org.apache.commons.beanutils.converters.SqlTimeConverter
[WARNING] - org.apache.commons.beanutils.Converter
[WARNING] - org.apache.commons.beanutils.converters.FloatArrayConverter
[WARNING] - org.apache.commons.beanutils.NestedNullException
[WARNING] - org.apache.commons.beanutils.ConvertingWrapDynaBean
[WARNING] - org.apache.commons.beanutils.converters.LongArrayConverter
[WARNING] - org.apache.commons.beanutils.converters.SqlDateConverter
[WARNING] - org.apache.commons.beanutils.converters.BooleanArrayConverter
[WARNING] - org.apache.commons.beanutils.converters.StringConverter
[WARNING] - 72 more...
[WARNING] jersey-server-1.8.jar, jersey-bundle-1.19.4.jar define 514 overlapping classes:
[WARNING] - com.sun.jersey.api.core.ExtendedUriInfo
[WARNING] - com.sun.jersey.server.wadl.WadlGenerator
[WARNING] - com.sun.jersey.server.wadl.generators.resourcedoc.xhtml.XhtmlValueType
[WARNING] - com.sun.research.ws.wadl.Doc
[WARNING] - com.sun.jersey.server.impl.cdi.CDIExtension$ParameterBean
[WARNING] - com.sun.jersey.server.impl.cdi.CDIExtension
[WARNING] - com.sun.jersey.server.impl.container.httpserver.HttpHandlerContainer$Writer
[WARNING] - com.sun.jersey.server.impl.model.parameter.multivalued.StringReaderProviders$TypeFromStringEnum
[WARNING] - com.sun.jersey.server.impl.model.parameter.multivalued.StringReaderProviders$DateProvider$1
[WARNING] - com.sun.jersey.server.impl.cdi.DiscoveredParameter
[WARNING] - 504 more...
[WARNING] woodstox-core-asl-4.4.1.jar, woodstox-core-5.0.3.jar define 204 overlapping classes:
[WARNING] - com.ctc.wstx.dtd.DTDSchemaFactory
[WARNING] - com.ctc.wstx.exc.WstxException
[WARNING] - com.ctc.wstx.msv.GenericMsvValidator
[WARNING] - com.ctc.wstx.sr.ElemAttrs
[WARNING] - com.ctc.wstx.ent.ParsedExtEntity
[WARNING] - com.ctc.wstx.io.MergedStream
[WARNING] - com.ctc.wstx.dtd.DTDIdRefsAttr
[WARNING] - com.ctc.wstx.io.UTF8Writer
[WARNING] - com.ctc.wstx.cfg.OutputConfigFlags
[WARNING] - com.ctc.wstx.sw.XmlWriterWrapper$TextWrapper
[WARNING] - 194 more...
[WARNING] jsp-2.1-6.1.14.jar, jasper-compiler-5.5.12.jar, jasper-runtime-5.5.12.jar define 1 overlapping classes:
[WARNING] - org.apache.jasper.compiler.Localizer
[WARNING] ant-1.6.5.jar, ant-1.9.4.jar define 554 overlapping classes:
[WARNING] - org.apache.tools.ant.taskdefs.WhichResource
[WARNING] - org.apache.tools.ant.types.selectors.PresentSelector
[WARNING] - org.apache.tools.ant.taskdefs.Touch$DateFormatFactory
[WARNING] - org.apache.tools.ant.util.facade.FacadeTaskHelper
[WARNING] - org.apache.tools.ant.helper.DefaultExecutor
[WARNING] - org.apache.tools.ant.DefaultLogger
[WARNING] - org.apache.tools.ant.types.PatternSet$NameEntry
[WARNING] - org.apache.tools.ant.taskdefs.Javadoc$DocletParam
[WARNING] - org.apache.tools.ant.types.selectors.SelectorScanner
[WARNING] - org.apache.tools.ant.taskdefs.Nice
[WARNING] - 544 more...
[WARNING] spring-jcl-5.1.3.RELEASE.jar, commons-logging-1.2.jar define 1 overlapping classes:
[WARNING] - org.apache.commons.logging.LogFactory$1
[WARNING] jsp-api-2.1.jar, jsp-api-2.1-6.1.14.jar define 84 overlapping classes:
[WARNING] - javax.el.MethodInfo
[WARNING] - javax.servlet.jsp.JspFactory
[WARNING] - javax.servlet.jsp.el.ScopedAttributeELResolver
[WARNING] - javax.servlet.jsp.el.ImplicitObjectELResolver$ImplicitObjects$4
[WARNING] - javax.servlet.jsp.el.ExpressionEvaluator
[WARNING] - javax.servlet.jsp.tagext.Tag
[WARNING] - javax.el.ELContextListener
[WARNING] - javax.el.VariableMapper
[WARNING] - javax.servlet.jsp.tagext.TagExtraInfo
[WARNING] - javax.servlet.jsp.tagext.SimpleTag
[WARNING] - 74 more...
[WARNING] commons-logging-1.2.jar, jcl-over-slf4j-1.7.7.jar define 2 overlapping classes:
[WARNING] - org.apache.commons.logging.impl.SimpleLog$1
[WARNING] - org.apache.commons.logging.LogConfigurationException
[WARNING] commons-beanutils-core-1.8.0.jar, commons-collections-3.2.2.jar, commons-beanutils-1.7.0.jar define 10 overlapping classes:
[WARNING] - org.apache.commons.collections.FastHashMap$EntrySet
[WARNING] - org.apache.commons.collections.FastHashMap$KeySet
[WARNING] - org.apache.commons.collections.FastHashMap$CollectionView$CollectionViewIterator
[WARNING] - org.apache.commons.collections.ArrayStack
[WARNING] - org.apache.commons.collections.FastHashMap$Values
[WARNING] - org.apache.commons.collections.FastHashMap$CollectionView
[WARNING] - org.apache.commons.collections.FastHashMap$1
[WARNING] - org.apache.commons.collections.Buffer
[WARNING] - org.apache.commons.collections.FastHashMap
[WARNING] - org.apache.commons.collections.BufferUnderflowException
[WARNING] jersey-json-1.8.jar, jersey-bundle-1.19.4.jar define 79 overlapping classes:
[WARNING] - com.sun.jersey.json.impl.reader.StartElementEvent
[WARNING] - com.sun.jersey.json.impl.provider.entity.JSONWithPaddingProvider
[WARNING] - com.sun.jersey.api.json.JSONConfiguration$Builder
[WARNING] - com.sun.jersey.json.impl.provider.entity.JSONRootElementProvider
[WARNING] - com.sun.jersey.json.impl.writer.JsonXmlStreamWriter$DummyWriterAdapter
[WARNING] - com.sun.jersey.api.json.JSONJAXBContext
[WARNING] - com.sun.jersey.api.json.JSONUnmarshaller
[WARNING] - com.sun.jersey.json.impl.provider.entity.JSONListElementProvider$General
[WARNING] - com.sun.jersey.api.json.JSONConfiguration
[WARNING] - com.sun.jersey.api.json.JSONJAXBContext$2
[WARNING] - 69 more...
[WARNING] stax-api-1.0.1.jar, stax-api-1.0-2.jar define 37 overlapping classes:
[WARNING] - javax.xml.stream.XMLEventReader
[WARNING] - javax.xml.stream.StreamFilter
[WARNING] - javax.xml.stream.FactoryFinder$ClassLoaderFinderConcrete
[WARNING] - javax.xml.stream.util.StreamReaderDelegate
[WARNING] - javax.xml.stream.EventFilter
[WARNING] - javax.xml.stream.events.StartDocument
[WARNING] - javax.xml.stream.XMLEventWriter
[WARNING] - javax.xml.stream.XMLStreamConstants
[WARNING] - javax.xml.stream.events.EntityDeclaration
[WARNING] - javax.xml.stream.events.ProcessingInstruction
[WARNING] - 27 more...
[WARNING] hadoop-core-1.2.1.jar, hadoop-common-2.9.2.jar define 706 overlapping classes:
[WARNING] - org.apache.hadoop.io.retry.RetryPolicies$MultipleLinearRandomRetry
[WARNING] - org.apache.hadoop.io.ArrayFile
[WARNING] - org.apache.hadoop.metrics2.impl.MetricGaugeLong
[WARNING] - org.apache.hadoop.io.compress.zlib.ZlibCompressor$CompressionStrategy
[WARNING] - org.apache.hadoop.metrics.spi.AbstractMetricsContext
[WARNING] - org.apache.hadoop.util.GenericsUtil
[WARNING] - org.apache.hadoop.fs.FileSystem$Cache
[WARNING] - org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
[WARNING] - org.apache.hadoop.io.compress.DecompressorStream
[WARNING] - org.apache.hadoop.record.XmlRecordInput$Value
[WARNING] - 696 more...
[WARNING] jsp-2.1-6.1.14.jar, jasper-compiler-5.5.12.jar define 143 overlapping classes:
[WARNING] - org.apache.jasper.compiler.Node$Nodes
[WARNING] - org.apache.jasper.compiler.tagplugin.TagPlugin
[WARNING] - org.apache.jasper.compiler.JspUtil
[WARNING] - org.apache.jasper.xmlparser.MyEntityResolver
[WARNING] - org.apache.jasper.compiler.Validator$TagExtraInfoVisitor
[WARNING] - org.apache.jasper.compiler.SmapGenerator
[WARNING] - org.apache.jasper.compiler.SmapStratum
[WARNING] - org.apache.jasper.compiler.tagplugin.TagPluginContext
[WARNING] - org.apache.jasper.compiler.JasperTagInfo
[WARNING] - org.apache.jasper.EmbeddedServletOptions
[WARNING] - 133 more...
[WARNING] maven-shade-plugin has detected that some class files are
[WARNING] present in two or more JARs. When this happens, only one
[WARNING] single version of the class is copied to the uber jar.
[WARNING] Usually this is not harmful and you can skip these warnings,
[WARNING] otherwise try to manually exclude artifacts based on
[WARNING] mvn dependency:tree -Ddetail=true and the above output.
[WARNING] See http://maven.apache.org/plugins/maven-shade-plugin/
initialize() and into start()
Currently searching in Apache Atlas is entirely entity-based -- it is not possible to search for relationships. Therefore, it is not currently possible to implement the following Egeria methods without unacceptable performance overhead (pull all and scan in-memory):
findRelationshipsByProperty
findRelationshipsByPropertyValue
It seems that implementing the reference copy handling won't be possible per #4.
Options for proceeding:
addEntity, etc -- just not the reference copy equivalents). We will also need:
AtlasGlossaryCategory without an AtlasGlossary and defining the anchor on the AtlasGlossaryCategory)
Also need to decide scope of these:
And in each of those cases:
Implement automated CTS execution via k8s / Helm.
Exploring the Atlas API to create/update an entity, it seems that a self-defined GUID cannot be used (putting a GUID into the request body forces Atlas to attempt an update, which then immediately fails as the GUID provided does not yet exist).
Furthermore, even when leaving the GUID off but including a homeId, the homeId is not stored and entities are still fully mutable (i.e. via the UI).
These seem to be fundamental blockers to achieving anything other than a purely read-only connector for Apache Atlas...
Dependabot proposed
Bump derby from 10.8.3.1 to 10.15.1.3 odpi/egeria#1753
However, we need substantive work for gaian to work with 10.15, so we should document our usage & rationale for 10.8.
Per consolidation of SchemaAttribute and SchemaType properties that will remove the need to manage e.g. the RelationalColumn, RelationalColumnType split (see: odpi/egeria#1317)
The Atlas proxy supports only a limited set of types (and is read-only).
During startup many OMRS-AUDIT-0321 events are recorded stating that the type was ignored. This is AS EXPECTED.
However, for each one we then get an exception reported - probably due to the particular code path taken in response to the above.
This is probably a base Egeria issue.
Fri Mar 03 12:39:51 GMT 2023 atlasserver Types OMRS-AUDIT-0321 A patch to the DigitalServiceOperator (79ac27f6-be9c-489f-a7c2-b9add0bf705c) type definition from Egeria (3.15) was ignored because the local repository does not support this type
Fri Mar 03 12:39:51 GMT 2023 atlasserver Exception OMRS-AUDIT-9019 The type definition event processor for the Egeria (3.15) service caught an unexpected exception org.odpi.openmetadata.repositoryservices.ffdc.exception.InvalidParameterException with message OMRS-REPOSITORY-400-019 A null TypeDef has been passed as the originalTypeDef parameter on a processUpdatedTypeDefEvent request to the open metadata repository Egeria (3.15)
Fri Mar 03 12:39:51 GMT 2023 atlasserver Exception OMRS-AUDIT-9019 Supplementary information: log record id d6dc1903-6689-4a1c-9330-ccfc2c6279e1 org.odpi.openmetadata.repositoryservices.ffdc.exception.InvalidParameterException returned message of OMRS-REPOSITORY-400-019 A null TypeDef has been passed as the originalTypeDef parameter on a processUpdatedTypeDefEvent request to the open metadata repository Egeria (3.15) and stacktrace of
InvalidParameterException{parameterName='originalTypeDef', reportedHTTPCode=400, reportingClassName='org.odpi.openmetadata.repositoryservices.connectors.stores.metadatacollectionstore.utilities.OMRSRepositoryPropertiesUtilities', reportingActionDescription='applyPatch', errorMessage='OMRS-REPOSITORY-400-019 A null TypeDef has been passed as the originalTypeDef parameter on a processUpdatedTypeDefEvent request to the open metadata repository Egeria (3.15)', reportedSystemAction='The system is unable to perform the request because the TypeDef is needed to perform the operation.', reportedUserAction='Fix the invoking code and retry the request.', reportedCaughtException=null, relatedProperties=null}
at org.odpi.openmetadata.repositoryservices.connectors.stores.metadatacollectionstore.utilities.OMRSRepositoryPropertiesUtilities.applyPatch(OMRSRepositoryPropertiesUtilities.java:2496)
at org.odpi.openmetadata.repositoryservices.localrepository.repositorycontentmanager.OMRSRepositoryContentManager.cacheUnsupportedTypeDef(OMRSRepositoryContentManager.java:191)
at org.odpi.openmetadata.repositoryservices.localrepository.repositorycontentmanager.OMRSRepositoryContentManager.processUpdatedTypeDefEvent(OMRSRepositoryContentManager.java:3120)
at org.odpi.openmetadata.repositoryservices.archivemanager.OMRSArchiveManager.processTypeDefStore(OMRSArchiveManager.java:323)
at org.odpi.openmetadata.repositoryservices.archivemanager.OMRSArchiveManager.processOpenMetadataArchive(OMRSArchiveManager.java:217)
at org.odpi.openmetadata.repositoryservices.archivemanager.OMRSArchiveManager.processOpenMetadataTypes(OMRSArchiveManager.java:145)
at org.odpi.openmetadata.repositoryservices.archivemanager.OMRSArchiveManager.setLocalRepository(OMRSArchiveManager.java:108)
at org.odpi.openmetadata.repositoryservices.admin.OMRSOperationalServices.initializeCohortMember(OMRSOperationalServices.java:450)
at org.odpi.openmetadata.adminservices.server.OMAGServerOperationalServices.activateWithSuppliedConfig(OMAGServerOperationalServices.java:323)
at org.odpi.openmetadata.adminservices.server.OMAGServerOperationalServices.activateWithStoredConfig(OMAGServerOperationalServices.java:153)
at org.odpi.openmetadata.adminservices.spring.OperationalServicesResource.activateWithStoredConfig(OperationalServicesResource.java:60)
The startup of the connector takes a very long time when Atlas is connected to LDAP and the connector user is not an LDAP user. During startup the connector seems to log in for every call to Atlas; Atlas then first checks the user against LDAP, doesn't find it, and then tries its own user database.
This takes a lot of extra time. Is it possible to use a session with Atlas so login only happens once?
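One way to picture the "login only once" idea is a single shared HTTP client that keeps cookies between requests. The sketch below uses the JDK's java.net.http.HttpClient (Java 11+) with a CookieManager. It rests on assumptions this issue does not confirm: that Atlas sets a session cookie after the first authenticated request and honours it on later requests, and that the connector's REST layer could be swapped for (or configured like) such a client. The class name is hypothetical.

```java
import java.net.CookieManager;
import java.net.http.HttpClient;
import java.time.Duration;

// Hypothetical factory (not the connector's actual REST client).
final class AtlasSessionClient {
    // Builds one client intended to be reused for every call to Atlas.
    // The CookieManager stores cookies set by the server and sends them back
    // on later requests made through this same client instance.
    static HttpClient create() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .connectTimeout(Duration.ofSeconds(10))
                .build();
    }
}
```

The design point is only that the client (and so its cookie store) is created once and shared, rather than rebuilt per request; whether this avoids the repeated LDAP lookup depends on how Atlas handles sessions.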
While trying to set up the connector Provider given in the Postman collection, an error was found:
POST:
{{baseURL}}/open-metadata/admin-services/users/{{user}}/servers/{{server}}/local-repository/event-mapper-details?connectorProvider=org.odpi.egeria.connectors.apache.atlas.eventmapper.ApacheAtlasOMRSRepositoryEventMapperProvider&eventSource={{atlas_kafka}}
"exceptionClassName": "org.odpi.openmetadata.adminservices.ffdc.exception.OMAGConfigurationErrorException",
"exceptionErrorMessage": "OMAG-ADMIN-500-001 Method setLocalRepositoryEventMapper for OMAG server ATLAS-B4All_MDS returned an unexpected exception of java.lang.ClassNotFoundException with message org.odpi.egeria.connectors.apache.atlas.eventmapper.ApacheAtlasOMRSRepositoryEventMapperProvider",
Searching for certain strings (best guess is those that contain regex-meaningful characters?) does not appear to be possible in Atlas: neither using the query / full-text search option, nor the property-specific basic search option.
For example, a contains search against qualifiedName with each of the following values fails to return any results, despite a matching result existing in Atlas:
est_hive_table1.nam
est_hive_table1\.nam
"est_hive_table1.nam"
"est_hive_table1\.nam"
The implication of this is that the Egeria Conformance Test Suite (CTS) cannot be passed as certain search scenarios cannot be supported.
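To pin down what a contains search with a literal "." is expected to return, the semantics can be sketched on the client side with java.util.regex. This does not change Atlas's behaviour; it only states the expected result for comparison. The class name is hypothetical, and the sample qualifiedName used in the usage note is an assumption inferred from the search fragments in this issue.

```java
import java.util.regex.Pattern;

// Hypothetical reference implementation of "value contains fragment, taken literally".
final class LiteralContains {
    // Pattern.quote wraps the fragment so that '.', '\' and other regex
    // metacharacters are matched as ordinary characters.
    static Pattern containsLiteral(String fragment) {
        return Pattern.compile(".*" + Pattern.quote(fragment) + ".*", Pattern.DOTALL);
    }

    static boolean matches(String value, String fragment) {
        return containsLiteral(fragment).matcher(value).matches();
    }
}
```

Under these semantics, a value such as test_hive_table1.name (assumed) matches the fragment est_hive_table1.nam, while a value with a different character in place of the dot does not.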
There is some interest in restarting the work on the Atlas connector, as it allows metadata to be retrieved from a Hadoop environment.
Initial tasks
Atlas Docker image
Connector
Can we set up a project for this repo under the ODPi org in Sonarcloud, to get quality checks against this repo? (Let me know if we should assign to someone else -- but I don't have admin privileges to do it myself.)
Currently the search methods of the metadata collection for the connector only work when a directly-mapped type is provided; they need to be updated to support supertypes.
(For example, searching by ComplexSchemaType should automatically include results for TabularSchemaType, even though ComplexSchemaType itself is not directly mapped / implemented.)
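The supertype expansion described here can be sketched as a lookup from the requested type to every directly-mapped type that inherits from it. The hierarchy and the set of mapped types below are small illustrative assumptions, not the connector's real mapping tables, and the class name is hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: resolve a requested (possibly unmapped) type to mapped subtypes.
final class SupertypeSearch {
    // Illustrative fragment of the type hierarchy: type -> direct supertype.
    static final Map<String, String> SUPERTYPE_OF = Map.of(
            "TabularSchemaType", "ComplexSchemaType",
            "ComplexSchemaType", "SchemaType");

    // Types assumed to be directly mapped by the connector (for illustration only).
    static final Set<String> MAPPED = Set.of("TabularSchemaType");

    // True if 'type' is 'ancestor' or inherits from it, walking up the hierarchy.
    static boolean isTypeOf(String type, String ancestor) {
        for (String t = type; t != null; t = SUPERTYPE_OF.get(t)) {
            if (t.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    // All mapped types that should be searched when 'requested' is asked for.
    static Set<String> mappedTypesFor(String requested) {
        Set<String> result = new TreeSet<>();
        for (String mapped : MAPPED) {
            if (isTypeOf(mapped, requested)) {
                result.add(mapped);
            }
        }
        return result;
    }
}
```

A search by ComplexSchemaType would then be issued against each type returned by mappedTypesFor("ComplexSchemaType") and the results merged.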
See odpi/egeria#458
Planning to move
Technically the gaian impersonation module has nothing to do with Hadoop, but it's probably best also added here as it's used in conjunction with the above.
Will elaborate when I look in more detail & will propose a PR.
I do not propose to move other samples, nor the vdc helm chart.. yet (sep. discussion)
Any removals from master will be done after this has been merged. Fixups for vdc chart may follow soon after
Will also need to refactor directory structure in accordance.
cc: @danielaotelea
This repository primarily contains the Apache Atlas repository connector for Egeria.
We previously used this as part of our 'VDC' (virtual data connector) MVP project, but in the last year or two it has not been a priority.
The connector was last updated to use egeria 2.11-SNAPSHOT - prior to updates to Java 11 & enabling of TLS security, and little code has been modified, nor tested significantly since the 1.8 timeframe.
Atlas itself last released a few years ago at 2.0 in May 2019 - https://atlas.apache.org/2.0.0/Downloads.html though some later artifacts do show via maven central at https://mvnrepository.com/artifact/org.apache.atlas/apache-atlas
Atlas still seems to require Java 1.8 and retains some quite old dependencies.
Source is at https://github.com/apache/atlas
The connector is currently read-only, and supports only some types.
If/when there is community interest in reviving this work, the current code should be a good starting point for bringing it up to date with Egeria & other dependencies. Java 11 may cause some friction.
Until then I propose we
a) Update the top level README with an agreed position on how we view the state of this repository
b) Archive this repo in github (this makes it clearer.. it is reversible)
Create / update / delete operations against Apache Atlas-homed (non-reference-copy) instances.
Investigate why the only events that Atlas seems to be producing on the v2.0.0 environment are relationship events (nothing for entities, classifications, etc).
If the Atlas server is not active when the Egeria proxy server is started, the proxy server will fail to start up.
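One conceivable mitigation is to wait for Atlas to become reachable before activating the proxy server. The sketch below shows only the bounded retry loop; the reachability probe is passed in as a BooleanSupplier because this issue does not say how availability should be checked, so any concrete probe would be an assumption. The class and method names are hypothetical.

```java
import java.util.function.BooleanSupplier;

// Hypothetical startup helper: bounded polling until a probe reports success.
final class WaitForAtlas {
    // Calls 'probe' up to maxAttempts times, sleeping delayMillis between tries.
    // Returns true as soon as the probe succeeds, false if attempts run out
    // or the waiting thread is interrupted.
    static boolean await(BooleanSupplier probe, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (probe.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return false;
                }
            }
        }
        return false;
    }
}
```

A caller would supply a probe that performs whatever lightweight request is appropriate for its Atlas deployment, and only start the proxy server when await(...) returns true.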
Move the supported status validation out of the verifyTypeDef methods and into the addEntity, etc. methods -- so long as ACTIVE and DELETED are supported for the read-only scenario, we can detect at creation (write) time whether we support the others or not.
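The split proposed here can be sketched as two separate checks: a lenient one when a type is verified, and a strict one at write time. The enum below is an illustrative subset of instance statuses and the class and method names are hypothetical; only ACTIVE and DELETED are taken from this issue.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of verify-time versus write-time status validation.
final class StatusSupport {
    // Illustrative subset; the real status enumeration is assumed to be larger.
    enum Status { ACTIVE, DELETED, DRAFT, PROPOSED }

    // Verify-time check: for read-only use it is enough that ACTIVE and
    // DELETED are supported, whatever other statuses the type defines.
    static boolean usableReadOnly(Set<Status> supported) {
        return supported.containsAll(EnumSet.of(Status.ACTIVE, Status.DELETED));
    }

    // Write-time check: reject the request only if the specific status
    // being written is not supported.
    static void requireSupported(Set<Status> supported, Status requested) {
        if (!supported.contains(requested)) {
            throw new IllegalArgumentException("Status not supported: " + requested);
        }
    }
}
```

With this split, a type whose definition includes unsupported statuses can still be read, and an unsupported status surfaces only when something tries to create an instance with it.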
In https://github.com/odpi/egeria-connector-hadoop-ecosystem/blob/master/docs/mappings/README.md the two links to JSON files are broken.
Starting with a set of samples, like those from tutorial:
The default Apache Atlas inheritance structures differ from those defined in Egeria, causing problems when certain search combinations are used (i.e. during the CTS execution).
For example: in Atlas, by default, the hive_table type has as its direct super-type DataSet, which in turn inherits from Asset, etc. In Egeria, the open type for RelationalTable (the type hive_table maps to) instead inherits from SchemaAttribute, which in turn traverses up via the schema area (e.g. SchemaElement) but not via the Asset hierarchy. As a result, when searching at the level of e.g. Asset in Egeria, the search that is translated at this level to Atlas ends up returning Atlas types (Assets) that cannot be validly mapped to their equivalent Egeria types (SchemaElements).
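The mismatch can be made concrete with the two hierarchies named in this issue. The sketch below is purely diagnostic: it shows that hive_table is an Asset on the Atlas side while its mapped Egeria type is not, so a result returned for an Asset-level search cannot be validly mapped. The maps contain only the types mentioned here, the class name is hypothetical, and this is not presented as one of the options for addressing the problem.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic: compares Atlas and Egeria ancestry for mapped types.
final class HierarchyMismatch {
    // Atlas side: hive_table -> DataSet -> Asset (type -> direct supertype).
    static final Map<String, String> ATLAS_SUPER = Map.of(
            "hive_table", "DataSet",
            "DataSet", "Asset");

    // Egeria side: RelationalTable -> SchemaAttribute -> SchemaElement.
    static final Map<String, String> EGERIA_SUPER = Map.of(
            "RelationalTable", "SchemaAttribute",
            "SchemaAttribute", "SchemaElement");

    // Atlas type -> Egeria type it is mapped to.
    static final Map<String, String> MAPS_TO = Map.of("hive_table", "RelationalTable");

    // True if 'type' equals 'ancestor' or inherits from it in the given hierarchy.
    static boolean isA(Map<String, String> hierarchy, String type, String ancestor) {
        for (String t = type; t != null; t = hierarchy.get(t)) {
            if (t.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    // Keeps only the Atlas types whose mapped Egeria type really is a
    // subtype of the Egeria type that was requested.
    static List<String> validFor(String requestedEgeriaType, List<String> atlasTypes) {
        List<String> valid = new ArrayList<>();
        for (String atlasType : atlasTypes) {
            String egeriaType = MAPS_TO.get(atlasType);
            if (egeriaType != null && isA(EGERIA_SUPER, egeriaType, requestedEgeriaType)) {
                valid.add(atlasType);
            }
        }
        return valid;
    }
}
```

Here validFor("Asset", ...) drops hive_table even though Atlas itself considers it an Asset, which is exactly the inconsistency that surfaces during CTS searches.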
Without forcing Atlas adopters to change their inheritance hierarchy, the options to address this would be:
as changed in the core repo as well, to confirm that we have no unused dependencies sitting around and no required dependencies that are not explicitly included where needed
Per discussion under odpi/egeria#1732 there is likely re-work that can be done to the Atlas proxy connector in order to better align with the OCF.