actionml / harness
Harness is a Machine Learning/AI Server with plugins for many algorithms including the Universal Recommender
License: Apache License 2.0
I tried to use Azure Cosmos DB as an alternative to MongoDB for Harness. Cosmos DB uses SSL and authentication by default, which Harness does not support yet.
Error Message:
No SSL support in java.nio.channels.AsynchronousSocketChannel. For SSL support use com.mongodb.connection.netty.NettyStreamFactoryFactory
See attached log file
harness-mongodb-ssl-error.txt
Please change the mongodb driver to support SSL connections.
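The error message names the Netty stream factory. A minimal sketch of what enabling SSL on the async driver could look like, assuming mongodb-driver 3.8+ (the connection string and host are placeholders):

import com.mongodb.{ConnectionString, MongoClientSettings}
import com.mongodb.connection.netty.NettyStreamFactoryFactory
import org.mongodb.scala.MongoClient

// Sketch only: Harness does not support this yet, per this issue.
// NettyStreamFactoryFactory is the class the error message suggests.
val settings = MongoClientSettings.builder()
  .applyConnectionString(new ConnectionString("mongodb://<cosmos-host>:10255"))
  .applyToSslSettings(b => b.enabled(true)) // SSL on, as Cosmos DB requires
  .streamFactoryFactory(NettyStreamFactoryFactory.builder().build()) // Netty instead of NIO2
  .build()
val client = MongoClient(settings)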
Hello,
I have installed Harness via docker-compose and went through the https://actionml.com/docs/h_ur_quickstart tutorial step by step.
Right now I have the UR engine up and running and I can receive recommendations for a user.
Now I'm trying to set item properties and am receiving the error "ERROR URDataset - Can't parse UREvent from JObject".
Full error message:
14:11:19.419 INFO ActorSystemImpl - Harness Server: HttpRequest(HttpMethod(POST), http://0.0.0.0:9090/engines/movie_fit/events, List(Accept: application/json, User-Agent: rest-client/2.1.0 (darwin19.0.0 x86_64) ruby/2.6.5p114, Accept-Encoding: gzip, deflate;q=0.6, identity;q=0.3, Host: 0.0.0.0:9090, Timeout-Access: <function1>),HttpEntity.Strict(application/json,{"event":"$set","entityType":"item","entityId":"1","properties":{"title_type":["movie"],"genres":["4","5"],"persons":["37","1","2","3","4","5","6","7","8","9"]}}),HttpProtocol(HTTP/1.1))
14:11:19.424 ERROR URDataset - Can't parse UREvent from JObject(List((event,JString($set)), (entityType,JString(item)), (entityId,JString(1)), (properties,JObject(List((title_type,JArray(List(JString(movie)))), (genres,JArray(List(JString(4), JString(5)))), (persons,JArray(List(JString(37), JString(1), JString(2), JString(3), JString(4), JString(5), JString(6), JString(7), JString(8), JString(9)))))))))
org.json4s.package$MappingException: Can't convert JNothing to class java.lang.String
at org.json4s.CustomSerializer$$anonfun$deserialize$1.applyOrElse(Formats.scala:403)
at org.json4s.CustomSerializer$$anonfun$deserialize$1.applyOrElse(Formats.scala:400)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at scala.collection.AbstractMap.applyOrElse(Map.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.json4s.Extraction$.customOrElse(Extraction.scala:600)
at org.json4s.Extraction$.extract(Extraction.scala:387)
at org.json4s.Extraction$.extract(Extraction.scala:39)
at org.json4s.ExtractableJsonAstNode.extract(ExtractableJsonAstNode.scala:21)
at com.actionml.core.validate.JsonSupport$DateReader$.read(JsonSupport.scala:44)
at com.actionml.core.validate.JsonSupport$DateReader$.read(JsonSupport.scala:42)
at org.json4s.ExtractableJsonAstNode.as(ExtractableJsonAstNode.scala:74)
at com.actionml.engines.ur.URDataset.com$actionml$engines$ur$URDataset$$toUrEvent(URDataset.scala:238)
at com.actionml.engines.ur.URDataset$$anonfun$input$1.apply(URDataset.scala:98)
at com.actionml.engines.ur.URDataset$$anonfun$input$1.apply(URDataset.scala:98)
at cats.data.Validated.andThen(Validated.scala:197)
at com.actionml.engines.ur.URDataset.input(URDataset.scala:98)
at com.actionml.engines.ur.UREngine$$anonfun$2.apply(UREngine.scala:113)
at com.actionml.engines.ur.UREngine$$anonfun$2.apply(UREngine.scala:113)
at cats.data.Validated.andThen(Validated.scala:197)
at com.actionml.engines.ur.UREngine.input(UREngine.scala:113)
at com.actionml.router.service.EventServiceImpl$$anonfun$receive$1.applyOrElse(EventService.scala:47)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at com.actionml.router.service.EventServiceImpl.aroundReceive(EventService.scala:35)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14:11:19.426 INFO UREngine - Engine-id: movie_fit. Bad input Whoops, no response string
14:11:19.430 INFO ActorSystemImpl - Harness Server: Response for
Please assist.
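The stack trace fails inside JsonSupport$DateReader, i.e. while reading a date field that is absent (JNothing). A hedged guess is that adding an explicit eventTime to the $set event avoids the failure; the payload would then look like this (the timestamp value is invented):

{
  "event": "$set",
  "entityType": "item",
  "entityId": "1",
  "eventTime": "2020-01-15T14:11:19.000Z",
  "properties": {
    "title_type": ["movie"],
    "genres": ["4", "5"],
    "persons": ["37", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
  }
}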
harness stop, which uses cat ${PIDFILE} | xargs kill, does not always work. I have waited a long time and eventually must kill -9 ...
For now we can tell if the stop does not succeed by checking with jps and printing a warning. But the real fix is unknown. This seems to have something to do with the Contextual Bandit since it is that integration test that has trouble. This in turn may mean it is a VW integration problem, not sure.
harness add <some-file> actually sends the path to the server instead of the contents of the JSON file. Sending the JSON would make the CLI (except start/stop) run completely remotely.
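The intended remote behavior, sketched with curl (the exact REST route is an assumption, inferred from the /engines routes visible in the logs on this page):

# send the file contents, not the path, to the server
curl -H 'Content-Type: application/json' \
     --data-binary @some-file.json \
     http://localhost:9090/engines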
harness status, called by harness start, should do a health check of the required and toolbox services and report the result (see the sketch below).
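A minimal sketch of the kind of checks intended, assuming stock local ports (these probes are illustrative, not an existing CLI feature):

# hypothetical health probes for the required services
curl -sf http://localhost:9200/_cluster/health > /dev/null && echo "Elasticsearch: OK"
mongo --quiet --eval 'db.runCommand({ping: 1}).ok' && echo "MongoDB: OK"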
docker-compose exec harness-cli bash -c 'harness-cli add data/my-engine-config.json'
(see Config)
{
  "engineId": "retailrocket",
  "engineFactory": "com.actionml.engines.ur.UREngine",
  "sparkConf": {
    "master": "local",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "300m",
    "spark.executor.memory": "1g",
    "spark.driver.memory": "1g",
    "spark.es.index.auto.create": "true",
    "spark.es.nodes": "<my_ip_adress>:9200",
    "spark.es.nodes.wan.only": "true"
  },
  "algorithm": {
    "indicators": [
      { "name": "transaction" },
      { "name": "view" },
      { "name": "addtocart" }
    ]
  }
}
10:03:40.561 INFO ContextCleaner - Cleaned accumulator 5556
10:03:40.561 INFO ContextCleaner - Cleaned accumulator 5452
10:03:40.561 INFO ContextCleaner - Cleaned accumulator 5386
10:03:40.561 INFO ContextCleaner - Cleaned accumulator 5500
10:03:40.561 INFO ContextCleaner - Cleaned accumulator 5476
10:03:40.617 INFO Executor - Finished task 0.0 in stage 143.0 (TID 59). 1181 bytes result sent to driver
10:03:40.619 INFO TaskSetManager - Finished task 0.0 in stage 143.0 (TID 59) in 1243 ms on localhost (executor driver) (1/1)
10:03:40.619 INFO TaskSchedulerImpl - Removed TaskSet 143.0, whose tasks have all completed, from pool
10:03:40.621 INFO DAGScheduler - ResultStage 143 (runJob at EsSpark.scala:108) finished in 1.267 s
10:03:40.625 INFO DAGScheduler - Job 31 finished: runJob at EsSpark.scala:108, took 1.282262 s
10:03:40.803 INFO SparkContextSupport$ - Job f3f1f145-e4f9-44dd-a713-28587fa076d5 completed in 1573121020802 ms [engine retailrocket]
10:03:40.806 INFO AbstractConnector - Stopped Spark@6454a6a2{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
10:03:40.813 INFO SparkUI - Stopped Spark web UI at http://f4f9bbcb5baa:4040
10:03:40.816 ERROR MongoAsyncDao - Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(f3f1f145-e4f9-44dd-a713-28587fa076d5)))
10:03:40.818 ERROR MongoAsyncDao - Sync DAO error
java.lang.RuntimeException: Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(f3f1f145-e4f9-44dd-a713-28587fa076d5)))
at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:135)
at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:129)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
10:03:40.820 ERROR AsyncEventQueue - Listener JobManagerListener threw an exception
java.lang.RuntimeException: Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(f3f1f145-e4f9-44dd-a713-28587fa076d5)))
at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:135)
at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:129)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
10:03:40.824 INFO MapOutputTrackerMasterEndpoint - MapOutputTrackerMasterEndpoint stopped!
10:03:40.851 INFO MemoryStore - MemoryStore cleared
10:03:40.851 INFO BlockManager - BlockManager stopped
10:03:40.853 INFO BlockManagerMaster - BlockManagerMaster stopped
10:03:40.854 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint - OutputCommitCoordinator stopped!
10:03:40.865 INFO SparkContext - Successfully stopped SparkContext
10:03:41.397 INFO MongoClientCache - Closing MongoClient: [mongo:27017]
10:03:41.400 INFO connection - Closed connection [connectionId{localValue:34, serverValue:34}] to mongo:27017 because the pool has been closed.
If a nav-id is a URL there is a DB error "bad field name". This appears to be a Casbah bug that requires storing journeys some new way.
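For context, MongoDB field names may not contain '.' (and may not start with '$'), which URLs routinely do. A common workaround sketch (an assumption, not the project's chosen fix) is a reversible encoding of the key before storage:

// Mongo rejects '.' in field names; a nav-id such as "http://x.com/a" trips this.
// Encode the reserved characters reversibly before using the nav-id as a key.
val navId = "http://x.com/a"
val safeKey = navId.replace(".", "\uFF0E").replace("$", "\uFF04") // fullwidth stand-ins
val original = safeKey.replace("\uFF0E", ".").replace("\uFF04", "$")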
This appears to be a case where the query returns one of the eligible ids instead of a model id.
When I want to create an engine from my engine template, Harness doesn't create an index in Elasticsearch. Does anyone know why?
My sample engine.json looks like:
{
  "engineId": "ecommerce",
  "engineFactory": "com.actionml.engines.ur.UREngine",
  "sparkConf": {
    "master": "local",
    "spark.driver-memory": "8g",
    "spark.executor-memory": "16g",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "300m",
    "spark.es.index.auto.create": "true",
    "es.index.auto.create": "true"
  },
  "algorithm": {
    "indicators": [
      { "name": "buy" },
      { "name": "detail-view" },
      { "name": "search-terms" }
    ]
  }
}
Further down, the logs show that the ES index name is null.
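One difference from the working configs elsewhere on this page is that no spark.es.nodes address is set here. Whether that is the cause is an assumption, but the missing fragment would look like:

"sparkConf": {
  "spark.es.nodes": "<elasticsearch-host>:9200",
  "spark.es.nodes.wan.only": "true"
}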
I'm trying to import data exported from PIO + UR, but I get this error when it meets an event with non-empty properties. It looks like Harness wrongly tries to decode nested fields in properties?
ERROR URDataset - Can't parse UREvent from JObject(List((eventId,JString(AzNHWFcHuzpoYRf6IYtzXQAAAWtyVJTmqHaszTjE9tE)), (event,JString(FEED_LOADED)), (entityType,JString(account)), (entityId,JString(5b99b6fe520c627dac40cdae)), (properties,JObject(List((request,JObject(List((cursor,JNull), (limit,JInt(20)), (excludeIds,JArray(List())), (itemId,JString(5b60479c9b7e7425997fd083)), (orientation,JString(HORIZONTAL)), (clientInfo,JObject(List((application,JString(sandbox)), (appVersion,JObject(List((version,JString(0.3.2))))), (platform,JString(android)), (platformVersion,JObject(List((version,JString(8.1.0))))))))))), (feed,JObject(List((src,JObject(List((value,JString(ALSO_LIKE_HORIZONTAL)), (version,JString(v1))))), (ids,JArray(List(JString(5bbb229a24a24e05e22d757d), JString(5980640e981f94743ba6f2e8), JString(5c6164e8cfd54a435d7ee07e), JString(5a019543faf20e44181724f3), JString(5c9e10243b2b4409a29ee6e0), JString(5a4e3910520c624d8140809c), JString(5bd96e35dfa87c64648cadbf), JString(5b658546c4dc564e985ed908), JString(5a7d8b3024a24e6d074cc19f), JString(59848a6bf21c3a58a835a709), JString(59dde470f21c3a30ba5e49dd), JString(5b87acd4f1643f14005f15d7), JString(5b28ba703b2b442d2504d2cd), JString(5c936fcdc56a46246debfffd), JString(5c5452d5775ea36be2e1e9ef), JString(5a636ab6520c6256a319c420), JString(5c1a0ade581445366a2bed66), JString(5a6c3d5cc4dc565da74afed7), JString(5cb07221305a953825346406), JString(5b695c9bdfa87c43e438c457), JString(5bf7d47244106437f68f5af1), JString(5c98d4a6441064047e0e4beb), JString(5b923a4c24a24e1181add7c3), JString(594d11b7139771578a8949ed)))), (scores,JArray(List(JDouble(-1.0), JDouble(-1.0), JDouble(4.0), JDouble(-1.0), JDouble(4.0), JDouble(-1.0), JDouble(4.0), JDouble(4.0), JDouble(4.0), JDouble(4.0), JDouble(5.0), JDouble(4.0), JDouble(4.0), JDouble(-1.0), JDouble(4.0), JDouble(4.0), JDouble(4.0), JDouble(5.0), JDouble(-1.0), JDouble(4.0), JDouble(4.0), JDouble(-1.0), JDouble(4.0), JDouble(4.0)))))))))), (eventTime,JString(2019-06-20T00:41:14.214Z)), (creationTime,JString(2019-06-20T00:41:14.214Z))))
java.lang.RuntimeException: Can't parse date JString(5b60479c9b7e7425997fd083)
...
15:41:22.392 INFO UREngine - Bad input Whoops, no response string
15:41:22.392 ERROR URDataset - Can't parse date HORIZONTAL
java.time.format.DateTimeParseException: Text 'HORIZONTAL' could not be parsed at index 0
The events without properties are being imported with no problems.
harness add <engine-config-json> should give an error if the engine-id is already in use and suggest using harness update <engine-config-json>.
Installed using the doc link Install harness.
Steps:
harness-start
harness-cli add <path-to-engine-test_ur-json>
python3 import_mobile_device_ur_data_by_event_time.py
harness-cli train test_ur
After performing the 3rd step, I'm getting this error
12:20:01.177 INFO NettyBlockTransferService - Server created on 192.168.1.2:35773
12:20:01.178 INFO BlockManager - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
12:20:01.195 INFO BlockManagerMaster - Registering BlockManager BlockManagerId(driver, 192.168.1.2, 35773, None)
12:20:01.197 INFO BlockManagerMasterEndpoint - Registering block manager 192.168.1.2:35773 with 2.1 GB RAM, BlockManagerId(driver, 192.168.1.2, 35773, None)
12:20:01.200 INFO BlockManagerMaster - Registered BlockManager BlockManagerId(driver, 192.168.1.2, 35773, None)
12:20:01.200 INFO BlockManager - Initialized BlockManager: BlockManagerId(driver, 192.168.1.2, 35773, None)
12:20:01.208 INFO ContextHandler - Started o.s.j.s.ServletContextHandler@706874a{/metrics/json,null,AVAILABLE,@Spark}
12:20:01.251 INFO URAlgorithm - Engine-id: test_ur. Spark context spark.submit.deployMode: client
12:20:01.262 WARN SparkContext - Using an existing SparkContext; some configuration may not take effect.
12:20:01.468 INFO MemoryStore - Block broadcast_0 stored as values in memory (estimated size 8.6 KB, free 2.1 GB)
12:20:01.630 INFO MemoryStore - Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.0 KB, free 2.1 GB)
12:20:01.632 INFO BlockManagerInfo - Added broadcast_0_piece0 in memory on 192.168.1.2:35773 (size: 2.0 KB, free: 2.1 GB)
12:20:01.638 INFO SparkContext - Created broadcast 0 from broadcast at MongoSpark.scala:542
12:20:01.801 INFO SharedState - Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/home/dinesh/ml/harness/harnesscli/harness-cli/examples/spark-warehouse').
12:20:01.802 INFO SharedState - Warehouse path is 'file:/home/dinesh/ml/harness/harnesscli/harness-cli/examples/spark-warehouse'.
12:20:01.810 INFO ContextHandler - Started o.s.j.s.ServletContextHandler@1593bae9{/SQL,null,AVAILABLE,@Spark}
12:20:01.810 INFO ContextHandler - Started o.s.j.s.ServletContextHandler@4c840fc7{/SQL/json,null,AVAILABLE,@Spark}
12:20:01.811 INFO ContextHandler - Started o.s.j.s.ServletContextHandler@47ca1b15{/SQL/execution,null,AVAILABLE,@Spark}
12:20:01.812 INFO ContextHandler - Started o.s.j.s.ServletContextHandler@4a0406b{/SQL/execution/json,null,AVAILABLE,@Spark}
12:20:01.814 INFO ContextHandler - Started o.s.j.s.ServletContextHandler@5a20d5fd{/static/sql,null,AVAILABLE,@Spark}
12:20:02.091 INFO StateStoreCoordinatorRef - Registered StateStoreCoordinator endpoint
12:20:02.096 WARN SparkSession$Builder - Using an existing SparkSession; some configuration may not take effect.
12:20:02.100 INFO MemoryStore - Block broadcast_1 stored as values in memory (estimated size 8.7 KB, free 2.1 GB)
12:20:02.126 INFO MemoryStore - Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.0 KB, free 2.1 GB)
12:20:02.126 INFO BlockManagerInfo - Added broadcast_1_piece0 in memory on 192.168.1.2:35773 (size: 2.0 KB, free: 2.1 GB)
12:20:02.127 INFO SparkContext - Created broadcast 1 from broadcast at MongoSpark.scala:542
12:20:02.129 INFO URPreparator$ - Indicators: List(purchase, view, category-pref, addtocart, transaction)
12:20:02.192 INFO cluster - Cluster created with settings {hosts=[localhost:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
12:20:02.201 INFO cluster - Cluster description not yet available. Waiting for 30000 ms before timing out
12:20:02.204 INFO connection - Opened connection [connectionId{localValue:3, serverValue:7}] to localhost:27017
12:20:02.205 INFO cluster - Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 2, 1]}, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=602758}
12:20:02.206 INFO MongoClientCache - Creating MongoClient: [localhost:27017]
12:20:02.215 INFO connection - Opened connection [connectionId{localValue:4, serverValue:8}] to localhost:27017
12:20:02.250 ERROR URAlgorithm - Spark computation failed for engine test_ur with params {{"engineId":"test_ur","engineFactory":"com.actionml.engines.ur.UREngine","sparkConf":{"master":"local","spark.driver-memory":"4g","spark.executor-memory":"4g"},"algorithm":{"indicators":[{"name":"purchase"},{"name":"view","maxCorrelatorsPerItem":50},{"name":"category-pref","maxCorrelatorsPerItem":50,"minLLR":5.0},{"name":"addtocart"},{"name":"transaction"}],"num":4}}}
java.lang.IllegalArgumentException: null
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:46)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:449)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:432)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:134)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:432)
at org.apache.xbean.asm5.ClassReader.a(Unknown Source)
at org.apache.xbean.asm5.ClassReader.b(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:262)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:261)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:261)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2299)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2073)
at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1364)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.take(RDD.scala:1337)
at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply$mcZ$sp(RDD.scala:1472)
at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply(RDD.scala:1472)
at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply(RDD.scala:1472)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.isEmpty(RDD.scala:1471)
at com.actionml.engines.ur.URPreparator$$anonfun$8.apply(URPreparator.scala:58)
at com.actionml.engines.ur.URPreparator$$anonfun$8.apply(URPreparator.scala:58)
at scala.collection.TraversableLike$$anonfun$filterImpl$1.apply(TraversableLike.scala:248)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.TraversableLike$class.filterImpl(TraversableLike.scala:247)
at scala.collection.TraversableLike$class.filterNot(TraversableLike.scala:267)
at scala.collection.AbstractTraversable.filterNot(Traversable.scala:104)
at com.actionml.engines.ur.URPreparator$.mkTraining(URPreparator.scala:58)
at com.actionml.engines.ur.URAlgorithm$$anonfun$train$1.apply(URAlgorithm.scala:262)
at com.actionml.engines.ur.URAlgorithm$$anonfun$train$1.apply(URAlgorithm.scala:256)
at scala.util.Success$$anonfun$map$1.apply(Try.scala:237)
at scala.util.Try$.apply(Try.scala:192)
at scala.util.Success.map(Try.scala:237)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:20:02.256 INFO SparkContextSupport$ - Job 63becf80-4c6c-4cd4-9181-e0ffa278f37c completed in 1572591002256 ms [engine test_ur]
12:20:02.258 ERROR MongoAsyncDao - Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(63becf80-4c6c-4cd4-9181-e0ffa278f37c)))
12:20:02.259 ERROR MongoAsyncDao - Sync DAO error
java.lang.RuntimeException: Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(63becf80-4c6c-4cd4-9181-e0ffa278f37c)))
at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:135)
at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:129)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
12:20:02.263 INFO AbstractConnector - Stopped Spark@7cd51c26{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
12:20:02.265 INFO SparkUI - Stopped Spark web UI at http://192.168.1.2:4040
12:20:02.266 ERROR AsyncEventQueue - Listener JobManagerListener threw an exception
java.lang.RuntimeException: Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(63becf80-4c6c-4cd4-9181-e0ffa278f37c)))
at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:135)
at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:129)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Here is the data that I have in the 'events' collection in MongoDB:
{ "_id" : ObjectId("5dbabcf41b62a3803294ace9"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "iPhone XS", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-28T18:04:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acec"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "iPhone XR", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-27T22:52:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acee"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "iPhone 8", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-27T03:40:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acf0"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "iPad Pro", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-26T08:28:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acf2"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "U 2", "targetEntityId" : "Pixel Slate", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-25T13:16:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acf4"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "U 2", "targetEntityId" : "Galaxy 8", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-24T18:04:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acf6"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-3", "targetEntityId" : "Surface Pro", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-23T22:52:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acf8"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "iPhone XR", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-23T03:40:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acfa"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "iPhone XR", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-22T08:28:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acfc"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "iPhone XR", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-21T13:16:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acfe"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "iPhone XR", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-20T18:04:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad00"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "iPhone 8", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-19T22:52:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad02"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "Galaxy 8", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-19T03:40:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad04"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-18T08:28:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad06"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-17T13:16:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad08"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-16T18:04:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad0a"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-15T22:52:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad0c"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-15T03:40:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad0e"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-14T08:28:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad10"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Mobile-acc", "dateProps" : { }, "categoricalProps" : { }, "floatProps" : { }, "booleanProps" : { }, "eventTime" : ISODate("2019-10-13T13:16:36.518Z") }
Is this error occurring because the 'eventId' is null?
A user-id should be able to have admin for all resources or client for any number of them. Adding permission for several resources, even if they are enumerated one at a time, should be supported.
Not required for first release.
Hi!
I cannot figure out where I go wrong. I am trying to remove the default blacklist of the primary indicator. Here is my engine.json:
Fresh docker deployment from the newest dev.
{
  "engineId": "ecom_ur",
  "engineFactory": "com.actionml.engines.ur.UREngine",
  "sparkConf": {
    "master": "local",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "300m",
    "spark.executor.memory": "20g",
    "spark.driver.memory": "10g",
    "spark.es.index.auto.create": "true",
    "spark.es.nodes": "192.168.40.225",
    "spark.es.nodes.wan.only": "true"
  },
  "algorithm": {
    "indicators": [
      { "name": "buy" }
    ],
    "maxQueryEvents": 19,
    "blacklistEvents": [],
    "maxCorrelatorsPerEventType": 49
  }
}
But it does not affect the generated engine... There is also an issue with "maxQueryEvents", btw. Below is the log for the generated engine.
Engine-id: ecom_ur. Initialized with JSON: {"engineId":"ecom_ur","engineFactory":"com.actionml.engines.ur.UREngine","sparkConf":{"master":"local","spark.serializer":"org.apache.spark.serializer.KryoSerializer","spark.kryo.registrator":"org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator","spark.kryo.referenceTracking":"false","spark.kryoserializer.buffer":"300m","spark.executor.memory":"20g","spark.driver.memory":"10g","spark.es.index.auto.create":"true","spark.es.nodes":"192.168.40.225","spark.es.nodes.wan.only":"true"},"algorithm":{"indicators":[{"name":"buy"}],"maxQueryEvents":19,"blacklistEvents":[],"maxCorrelatorsPerEventType":49}}
10:39:51.697 INFO URAlgorithm - Engine-id: ecom_ur. Events to alias mapping: Map(buy -> buy)
10:39:51.697 INFO package$ -
  URAlgorithm initialization parameters including "defaults"
  ES index name: null
  ES type name: items
  RecsModel: all
  Indicators: List(IndicatorParams(buy,None,None,None,None))
  Random seed: 326194065
  MaxCorrelatorsPerEventType: 49
  MaxEventsPerEventType: 500
  BlacklistEvents: List(buy)
  User bias: 1.0
  Item bias: 1.0
  Max query events: 1000
  Limit: 20
  Rankings:
    popular Some(popRank)
I would really appreciate some insight. I tried all possible combinations for the JSON ( [[]], [{}], [""], etc...).
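For comparison, the URNavHinting config quoted later on this page uses a key named "blacklist" rather than "blacklistEvents". Whether the UR engine expects that name is only an assumption to verify, but the fragment would be:

"algorithm": {
  "blacklist": []
}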
We are using the Harness server container to attach to an authenticated MongoDB.
We have tried multiple variations with sparkConf like:
#1
"spark.mongodb.input.uri": "mongodb://username:password@<host>/",
"spark.mongodb.input.database": "database-name",
"spark.mongodb.input.collection": "items",
"spark.mongodb.input.readPreference.name": "primaryPreferred"
#2
"spark.mongodb.input.uri": "mongodb://username:password@<host>:27017/database-name/?authSource=database-name",
"spark.mongodb.output.uri": "mongodb://username:password@<host>:27017/database-name/?authSource=database-name",
#3
"spark.mongodb.input.uri": "mongodb://username:password@<host>:27017/database-name",
"spark.mongodb.output.uri": "mongodb://username:password@<host>:27017/database-name"
But whenever we start training we are getting the same error:
WARNING: Partitioning failed.
Partitioning using the 'DefaultMongoPartitioner$' failed.
com.mongodb.MongoException: host and port should be specified in host:port format
at com.mongodb.ServerAddress.<init>(ServerAddress.java:123)
at com.mongodb.internal.connection.ServerAddressHelper.createServerAddress(ServerAddressHelper.java:33)
at com.mongodb.internal.connection.ServerAddressHelper.createServerAddress(ServerAddressHelper.java:26)
We are not getting this error when we use a database without authentication.
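One thing that stands out in variant #2 is the trailing slash between the database name and the query string, which is not part of the standard MongoDB connection-string format. A hedged sketch of the usual shape (host is a placeholder):

"spark.mongodb.input.uri": "mongodb://username:password@<host>:27017/database-name?authSource=database-name"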
Google groups thread : https://groups.google.com/forum/#!topic/actionml-user/kqOF0tLO0o0
Thanks,
Nikola
All REST responses are escaped as if they were plain strings being returned as JSON. They are actually JSON and should not be re-escaped.
Make Mongo Administrator use case classes instead of Mongo Documents for engine persistence.
Remove all use of the Mongo specific Document type.
We should not assume the eventId of any event is known, so any route that uses it is invalid.
See this PR: #17
harness add <some-json-file>
harness add <some-json-file>
with the same file and engineId should fail. It currently seems to update; it should report that a harness update is required.
We need lots more integration tests.
Use Denny's make setup to do more integration tests; we may want to have other targets like a no-auth-server build, etc.
You get an exception in MongoDao. This blocks testing updates that change config, since it doesn't work even when the config is unchanged.
15:30:34.319 ERROR OneForOneStrategy - Can't update collection harness_meta_store.engines with filter WrappedArray((engineId,ur_nav_hinting)) and value Document((engineFactory,BsonString{value='com.actionml.engines.urnavhinting.URNavHintingEngine'}), (params,BsonString{value='{
"engineId" : "ur_nav_hinting",
"engineFactory" : "com.actionml.engines.urnavhinting.URNavHintingEngine",
"mirrorContainer" : "/tmp/mirrors",
"mirrorType" : "localfs",
"sparkConf" : {
"master" : "local",
"spark.driver-memory" : "4g",
"spark.executor-memory" : "4g",
"spark.serializer" : "org.apache.spark.serializer.KryoSerializer",
"spark.kryo.registrator" : "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
"spark.kryo.referenceTracking" : "false",
"spark.kryoserializer.buffer" : "300m",
"es.index.auto.create" : "true"
},
"algorithm" : {
"comment" : "may not need indexName and typeName, derive from engineId? but nowhere else to get the RESTClient address",
"esMaster" : "es-node-1",
"blacklist" : [
],
"indicators" : [
{
"name" : "nav-event"
},
{
"name" : "search-terms"
},
{
"name" : "content-pref"
}
],
"availableDateName" : "available",
"expireDateName" : "expires",
"dateName" : "date",
"num" : 6
}
}'}))
Updating should take an engineId in the CLI so it can be checked against the one in the config; otherwise you can easily update the wrong engine. Also, the new config should be reported to the CLI and in the REST response.
No ranks should be returned if the score is null, which means they have not been calculated or configured. We need to decide what to do when they are configured but there is no value.
The link to the Harness JSON file in harness/docs/engine_configuration.md is broken.
Running harness train <engineId> with a remote cluster may take a lot of time, many seconds, to copy jars to the context. This causes the harness command, which uses the REST API, to time out before returning the response.
So harness train times out the CLI and I believe times out the REST API too.
This should be in a Future, as it was when first implemented. The Future will eventually create and return an SC which itself executes code in the context, all in the background.
The CLI and REST API to train should return immediately with the Harness JobID, then wait until the sc is ready, then execute the block of code with the sc.
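A minimal sketch of that flow, with hypothetical createSparkContext/runTraining helpers standing in for the real Harness internals:

import java.util.UUID
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Illustrative stand-ins for whatever builds the SC and runs the training job.
def createSparkContext(engineId: String): String = { Thread.sleep(5000); s"sc-for-$engineId" }
def runTraining(sc: String, engineId: String): Unit = println(s"training $engineId on $sc")

def train(engineId: String): String = {
  val jobId = UUID.randomUUID().toString
  Future {
    val sc = createSparkContext(engineId) // may take many seconds against a remote cluster
    runTraining(sc, engineId)             // executes in the context, in the background
  }
  jobId // returned immediately so neither the CLI nor the REST call times out
}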
Item-based recs should return the item itself only when configured to do so. Probably a missed mustNot value.
Add to integration test.
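A sketch of the kind of clause presumably missing, in Elasticsearch bool-query form (the field and id are illustrative, not the UR's actual query builder output):

"bool": {
  "must_not": [
    { "ids": { "values": ["<query-item-id>"] } }
  ]
}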
First merge the CI mods to the CLI Bash scripts,
then migrate all CLI into Python so it can be packaged as a single script.
The CLI creates a pid file in the harness directory -- bad, because when starting via systemctl the user is root.
harness-start also checks whether a pid file exists and requires a flag to start -- bad; we should not use pid files at all but check whether harness is running using ps,
or just fail to start if something is bound to the port we need, and not check pids.
The harness-start script calls harness-status, which doesn't work unless harness is installed in the root user's path when using systemctl.
I tried with Harness 0.4.0 & 0.5.0 and MongoDB indexes are not being created when a new UR engine is added.
I debugged the code: the method createIndexes is never executed in URDataset.init or MongoStore, because ttl is always None.
I had to set a value for ttl in the ur-engine.json to have the indexes created. But if someone doesn't want a ttl, like me, indexes won't be created.
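For reference, the workaround described above would be a fragment like this in ur-engine.json (the placement and value format are assumptions, not verified against the UR docs):

"algorithm": {
  "ttl": "365 days"
}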
Getting exceptions for findMany with no filter on an empty collection. Shouldn't it return an empty iterable?
Hi!
Using 0.5.1.
I'm trying to use the include filter rule, but I keep getting an empty result. I used a $set event to set a category on two items that exist in my data, as such:
{
  "event": "$set",
  "entityType": "item",
  "entityId": "exampleItem",
  "properties": {
    "category": ["electronics", "mobile"],
    "expireDate": "2020-10-05T21:02:49.228Z"
  },
  "eventTime": "2019-12-17T21:02:49.228Z"
}
Then I ran the train command and then this query:
{
  "rules": [
    {
      "name": "category",
      "values": ["electronics"],
      "bias": -1
    }
  ]
}
but got an empty list as result.
The other rules (bias = 0 and > 0) seem to be working as specified.
Assuming a case class is used to store data in a collection, allow one field to be used as the unique id by annotating it with @id, basically following what Morphia does with annotations, which in turn follows some standard applied to Java use of DBs (can't remember which one).
This is very important because many case classes have been using _id to define the key in a case class, and I'm not sure that is still followed now that we are no longer using Salat & Casbah.
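A hypothetical sketch of the desired usage, following Morphia-style annotations (this @id annotation is the proposal, not an existing Harness API):

import scala.annotation.StaticAnnotation

// Proposed marker annotation, analogous to Morphia's @Id
class id extends StaticAnnotation

// The annotated field would serve as the collection's unique key instead of _id
case class EngineMetadata(@id engineId: String, engineFactory: String, params: String)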
The best way to handle JSON responses is to compress them over the wire and prettify them for display in the CLI or in server logs.
This is now prettified in the response.
The following appears in logs when creating a user using auth:
15:01:05.050 WARN ActorSystemImpl - HTTP header 'Timeout-Access: <function1>' is not allowed in requests
15:01:05.077 INFO ActorSystemImpl - Complete: HttpMethod(POST):http://localhost:9090/auth/users -> 200 OK [65 ms.]
If Elasticsearch is not running, harness delete <some-engine-id> will throw an exception because the delete of the model fails. This should test whether ES is running and report that it is not as an error, rather than throwing an exception that gives no clue that ES is not running.
Should an init (Create or Update init) ping required services?
The same bool/should query returns different results in ES5 vs ES6. This difference is quite bad for the UR.
Short-term solution: downgrade to 5.x.
Long-term solution: find the new way to do this query in 6.x+.
See this SO: https://stackoverflow.com/questions/54335922/elasticsearch-6-5-query-result-scoring-seems-wrong
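For illustration, a minimal bool/should query of the general shape in question (field and item names invented). In 6.x Lucene dropped the coordination factor that used to reward documents matching more should clauses, which is the usual explanation for scoring shifts like this:

{
  "query": {
    "bool": {
      "should": [
        { "terms": { "buy": ["item-1", "item-2"] } },
        { "terms": { "view": ["item-3"] } }
      ]
    }
  }
}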
I've followed the quick start guide: https://actionml.com/docs/h_ur_quickstart
My config.json is as follows:
{
  "engineId": "2",
  "engineFactory": "com.actionml.engines.ur.UREngine",
  "sparkConf": {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "300m",
    "spark.executor.memory": "3g",
    "spark.driver.memory": "3g",
    "spark.es.index.auto.create": "true",
    "spark.es.nodes": "harness-docker-compose_elasticsearch_1",
    "spark.es.nodes.wan.only": "true"
  },
  "algorithm": {
    "indicators": [
      { "name": "buy" },
      { "name": "view" }
    ]
  }
}
I run harness-cli train 2, and when I check harness-cli status engines 2 I see:
/harness-cli/harness-cli/harness-status: line 10: /harness-cli/harness-cli/RELEASE: No such file or directory
Harness CLI v settings
==================================================================
HARNESS_CLI_HOME ........................ /harness-cli/harness-cli
HARNESS_CLI_SSL_ENABLED .................................... false
HARNESS_CLI_AUTH_ENABLED ................................... false
HARNESS_SERVER_ADDRESS ................................... harness
HARNESS_SERVER_PORT ......................................... 9090
==================================================================
Harness Server status: OK
Status for engine-id: 2
{
  "engineParams": {
    "algorithm": {
      "indicators": [
        { "name": "buy" },
        { "name": "view" }
      ]
    },
    "engineFactory": "com.actionml.engines.ur.UREngine",
    "engineId": "2",
    "sparkConf": {
      "spark.driver.memory": "3g",
      "spark.es.index.auto.create": "true",
      "spark.es.nodes": "harness-docker-compose_elasticsearch_1",
      "spark.es.nodes.wan.only": "true",
      "spark.executor.memory": "3g",
      "spark.kryo.referenceTracking": "false",
      "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
      "spark.kryoserializer.buffer": "300m",
      "spark.serializer": "org.apache.spark.serializer.KryoSerializer"
    }
  },
  "jobStatuses": {
    "ed0becbf-ce7d-4e62-822d-2e3f2138e235": {
      "comment": "Spark job",
      "jobId": "ed0becbf-ce7d-4e62-822d-2e3f2138e235",
      "status": {
        "name": "executing"
      }
    }
  }
}
The job never moves from executing. Is there any way I can debug why this is happening?
I've set harness up by following https://actionml.com/docs/harness_container_guide
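A hedged first debugging step: tail the server logs from the compose stack (the service name harness matches the HARNESS_SERVER_ADDRESS shown above):

docker-compose logs -f harness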
The $set of properties in the UR should immediately modify the index after changing the property in the DB.
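A sketch of the kind of immediate index update implied, using the Elasticsearch 5/6 update API (index, type, and id are illustrative):

POST /ecom_ur/items/exampleItem/_update
{ "doc": { "category": ["electronics", "mobile"] } }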
harness start and harness status print various settings but do no real status checks for the services that are required.
Status for Mongo, Elasticsearch, and anything the engines want to provide should be listed.
These will be several different types:
Hello,
I've used docker-compose to set up Harness + UR, went through the workflow and got recommendations, but all items have 0.0 scores in the result. What could go wrong?
The steps I did:
Please assist
In the engine's JSON config:
"sparkConf": {
"master": "local",
...
The training succeeds and the ur-integration-test.sh passes
Changing to a remote master:
"sparkConf": {
"master": "spark://Maclaurin.local:7077",
...
The test fails and the index is not written to ES. This seems to be why there are no results: the index is never created, so no query could succeed.
The part of main.log that has to do with Spark training is:
19:20:30.976 INFO SparkContext - Added JAR /Users/pat/harness/rest-server/Harness-0.4.0-SNAPSHOT/lib/commons-codec.commons-codec-1.10.jar at spark://192.168.1.5:52290/jars/commons-codec.commons-codec-1.10.jar with timestamp 1546658430976
19:20:30.978 INFO StandaloneAppClient$ClientEndpoint - Connecting to master spark://Maclaurin.local:7077...
19:20:30.983 INFO TransportClientFactory - Successfully created connection to Maclaurin.local/127.0.0.1:7077 after 1 ms (0 ms spent in bootstraps)
19:20:31.001 WARN StandaloneAppClient$ClientEndpoint - Failed to connect to master Maclaurin.local:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.spark.rpc.netty.RequestMessage$.readRpcAddress(NettyRpcEnv.scala:593)
at org.apache.spark.rpc.netty.RequestMessage$.apply(NettyRpcEnv.scala:603)
at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:662)
at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:647)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:187)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:111)
at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:745)
at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:189)
at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:120)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
... 1 common frames omitted
19:20:39.302 INFO StandaloneAppClient$ClientEndpoint - Connecting to master spark://Maclaurin.local:7077...
19:20:39.304 WARN StandaloneAppClient$ClientEndpoint - Failed to connect to master Maclaurin.local:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.spark.rpc.netty.RequestMessage$.readRpcAddress(NettyRpcEnv.scala:593)
at org.apache.spark.rpc.netty.RequestMessage$.apply(NettyRpcEnv.scala:603)
at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:662)
at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:647)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:187)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:111)
at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:745)
at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:189)
at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:120)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
... 1 common frames omitted
19:20:50.982 INFO StandaloneAppClient$ClientEndpoint - Connecting to master spark://Maclaurin.local:7077...
19:20:50.985 WARN StandaloneAppClient$ClientEndpoint - Failed to connect to master Maclaurin.local:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
... (identical stack trace to the 19:20:39 attempt above, omitted)
19:20:59.304 ERROR StandaloneSchedulerBackend - Application has been killed. Reason: All masters are unresponsive! Giving up.
19:20:59.304 WARN StandaloneSchedulerBackend - Application ID is not initialized yet.
19:20:59.305 INFO ServerConnector - Stopped Spark@5acb809d{HTTP/1.1}{0.0.0.0:4040}
19:20:59.306 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@7cb5c916{/stages/stage/kill,null,UNAVAILABLE,@Spark}
19:20:59.306 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@1d5d8d08{/jobs/job/kill,null,UNAVAILABLE,@Spark}
19:20:59.307 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@58d35a9c{/api,null,UNAVAILABLE,@Spark}
19:20:59.307 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@6bd1e619{/,null,UNAVAILABLE,@Spark}
19:20:59.308 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@15be96fb{/static,null,UNAVAILABLE,@Spark}
19:20:59.308 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@5b33cf09{/executors/threadDump/json,null,UNAVAILABLE,@Spark}
19:20:59.308 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@2472e86f{/executors/threadDump,null,UNAVAILABLE,@Spark}
19:20:59.309 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@1c0c53b6{/executors/json,null,UNAVAILABLE,@Spark}
19:20:59.309 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@130bb905{/executors,null,UNAVAILABLE,@Spark}
19:20:59.310 INFO Utils - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52297.
19:20:59.310 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@5295a565{/environment/json,null,UNAVAILABLE,@Spark}
19:20:59.310 INFO NettyBlockTransferService - Server created on 192.168.1.5:52297
19:20:59.310 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@22b35eb6{/environment,null,UNAVAILABLE,@Spark}
19:20:59.310 INFO BlockManager - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
19:20:59.310 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@600df0b2{/storage/rdd/json,null,UNAVAILABLE,@Spark}
19:20:59.310 INFO BlockManagerMaster - Registering BlockManager BlockManagerId(driver, 192.168.1.5, 52297, None)
19:20:59.311 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@415011fb{/storage/rdd,null,UNAVAILABLE,@Spark}
19:20:59.311 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@2e1bd9bc{/storage/json,null,UNAVAILABLE,@Spark}
19:20:59.311 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@324e164b{/storage,null,UNAVAILABLE,@Spark}
19:20:59.312 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@3cd3b5e6{/stages/pool/json,null,UNAVAILABLE,@Spark}
19:20:59.312 INFO BlockManagerMasterEndpoint - Registering block manager 192.168.1.5:52297 with 1007.4 MB RAM, BlockManagerId(driver, 192.168.1.5, 52297, None)
19:20:59.312 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@37ab01ed{/stages/pool,null,UNAVAILABLE,@Spark}
19:20:59.312 INFO BlockManagerMaster - Registered BlockManager BlockManagerId(driver, 192.168.1.5, 52297, None)
19:20:59.312 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@5fc8703e{/stages/stage/json,null,UNAVAILABLE,@Spark}
19:20:59.312 INFO BlockManager - Initialized BlockManager: BlockManagerId(driver, 192.168.1.5, 52297, None)
19:20:59.312 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@772f2790{/stages/stage,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@5e3aa92f{/stages/json,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@41c529db{/stages,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@433d6e4d{/jobs/job/json,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@383221d1{/jobs/job,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@3125d8b4{/jobs/json,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO ContextHandler - Stopped o.s.j.s.ServletContextHandler@22d415c4{/jobs,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO ContextHandler - Started o.s.j.s.ServletContextHandler@27f75c54{/metrics/json,null,AVAILABLE,@Spark}
19:20:59.313 INFO SparkUI - Stopped Spark web UI at http://192.168.1.5:4040
19:20:59.314 ERROR SparkContext - Error initializing SparkContext.
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.
This stopped SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:76)
com.actionml.core.spark.SparkContextSupport$$anonfun$createSparkContext$1.apply(SparkContextSupport.scala:125)
com.actionml.core.spark.SparkContextSupport$$anonfun$createSparkContext$1.apply(SparkContextSupport.scala:112)
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
The currently active SparkContext was created at:
(No active SparkContext.)
at org.apache.spark.SparkContext.assertNotStopped(SparkContext.scala:100)
at org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1679)
at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2227)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:563)
at com.actionml.core.spark.SparkContextSupport$$anonfun$createSparkContext$1.apply(SparkContextSupport.scala:125)
at com.actionml.core.spark.SparkContextSupport$$anonfun$createSparkContext$1.apply(SparkContextSupport.scala:112)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
In this commit "polish docs, move to docs dir":
aa88533#diff-5778ab63eb5a74b5b25b27d166c80b32
The algorithm parameter blacklistEvents was renamed to blacklistIndicators without a corresponding documentation change. I understand it was renamed to match the indicators config, but why was the documentation not updated? I discovered that blacklistEvents was no longer showing up in the engine status and had to check the code to find out why.
See https://actionml.com/docs/h_ur_config#complete-ur-engine-configuration-specification
"blacklistEvents": ["", "", "", ""],
Author @pferrel
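Assuming the rename described above, the quoted spec line presumably should now read (hypothetical, simply mirroring the old entry with the new name):
"blacklistIndicators": ["", "", "", ""],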
Is there a way to get the complete item object as a response to the query and not just item ids?
There seem to be many cases where the creation of a SparkContext errors out so that the Spark job is never created. Since no Spark job exists to fail, harness train never reports an error.
This causes the queued job to stay queued forever, so fixing the underlying error is futile: a new job will never be run.
ALL conditions that fail to start a Spark job OR cause the job to return an error should dequeue the Harness JobManager job. This would allow fixing the problem and re-executing harness train ... (see the sketch below).
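A minimal sketch of the requested behavior (all names here are illustrative stand-ins, not Harness's actual internals): whether the failure happens while creating the SparkContext or inside the job itself, both paths end in one completion handler that dequeues the JobManager job.

import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

// Hypothetical stand-in for the Harness JobManager API.
trait JobManager {
  def markCompleted(jobId: String): Unit
  def markFailed(jobId: String, cause: Throwable): Unit // dequeues the job
}

object TrainJobRunner {
  def runTrainJob[C](jobId: String,
                     createContext: () => C,    // may throw before any Spark job exists
                     runJob: C => Future[Unit], // the Spark job itself may also fail
                     jobs: JobManager)(implicit ec: ExecutionContext): Unit =
    Future(createContext())   // failure here is captured, not lost
      .flatMap(runJob)
      .onComplete {
        case Success(_)  => jobs.markCompleted(jobId)
        case Failure(ex) => jobs.markFailed(jobId, ex) // ANY failure dequeues, so train can be retried
      }
}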
When running make-dist for the auth server and harness, there is a "dist" directory with all other root directories mirrored inside it. This is confusing, bloats the dist tarball, and should be removed.
The following message is printed continuously in the logs when the auth server is running. Is this needed?
15:01:13.622 DEBUG cluster - Checking status of localhost:27017
15:01:13.624 DEBUG cluster - Updating cluster description to {type=STANDALONE, servers=[{address=localhost:27017, type=STANDALONE, roundTripTime=0.8 ms, state=CONNECTED}]
15:01:23.628 DEBUG cluster - Checking status of localhost:27017
15:01:23.629 DEBUG cluster - Updating cluster description to {type=STANDALONE, servers=[{address=localhost:27017, type=STANDALONE, roundTripTime=0.8 ms, state=CONNECTED}]
15:01:33.633 DEBUG cluster - Checking status of localhost:27017
15:01:33.634 DEBUG cluster - Updating cluster description to {type=STANDALONE, servers=[{address=localhost:27017, type=STANDALONE, roundTripTime=0.8 ms, state=CONNECTED}]
15:01:43.637 DEBUG cluster - Checking status of localhost:27017
15:01:43.639 DEBUG cluster - Updating cluster description to {type=STANDALONE, servers=[{address=localhost:27017, type=STANDALONE, roundTripTime=0.9 ms, state=CONNECTED}]
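If the auth server logs through Logback (an assumption; these lines are the MongoDB Java driver's cluster heartbeat), the chatter can be silenced without code changes by raising that logger's level in logback.xml:

<!-- quiet the MongoDB driver's cluster-heartbeat DEBUG output -->
<logger name="org.mongodb.driver.cluster" level="INFO"/>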
as the title says
Synchronizing Harness servers means that all of them must know when one has updated the metastore. I think this means, minimally:
- a sync REST API, which tells Harness to re-sync with the metastore
- sync sent to all Harness servers on harness add, harness update, harness delete, or ???
Not sure what else should be done. Any cluster of Harness servers needs to keep track of all the others, so probably a master-voting structure like ES, where one keeps the DB of all other servers and, if the master is unavailable, another is voted to take over.
This seems pretty complicated, so is there an easier solution?
Are there signaling DBs that will notify listeners when something changes?
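One possible answer (an assumption, not something Harness does today): since the metastore is MongoDB, change streams (MongoDB 3.6+, requires a replica set) can act as exactly this kind of signaling DB, so no master election is needed just for signaling. A rough sketch with the Mongo Scala driver; the database and collection names are hypothetical:

import com.mongodb.client.model.changestream.ChangeStreamDocument
import org.mongodb.scala.{Document, MongoClient}

object MetastoreWatcher {
  def main(args: Array[String]): Unit = {
    val client = MongoClient("mongodb://localhost:27017")
    val meta = client.getDatabase("harness_meta").getCollection("engines")

    // Every Harness node subscribes; any insert/update/delete in the
    // metastore triggers a local re-sync on that node.
    meta.watch().subscribe { (change: ChangeStreamDocument[Document]) =>
      println(s"metastore changed: ${change.getOperationType}; re-syncing")
    }
    Thread.sleep(Long.MaxValue) // keep the demo process alive; a server would not need this
  }
}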
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/06/25 03:18:59 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 57@aae27ca361b0
19/06/25 03:18:59 INFO SignalUtils: Registered signal handler for TERM
19/06/25 03:18:59 INFO SignalUtils: Registered signal handler for HUP
19/06/25 03:18:59 INFO SignalUtils: Registered signal handler for INT
19/06/25 03:19:02 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/06/25 03:19:03 INFO SecurityManager: Changing view acls to: root
19/06/25 03:19:03 INFO SecurityManager: Changing modify acls to: root
19/06/25 03:19:03 INFO SecurityManager: Changing view acls groups to:
19/06/25 03:19:03 INFO SecurityManager: Changing modify acls groups to:
19/06/25 03:19:03 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/06/25 03:19:05 INFO TransportClientFactory: Successfully created connection to ff059f403060/172.29.0.2:44603 after 408 ms (0 ms spent in bootstraps)
19/06/25 03:19:06 INFO SecurityManager: Changing view acls to: root
19/06/25 03:19:06 INFO SecurityManager: Changing modify acls to: root
19/06/25 03:19:06 INFO SecurityManager: Changing view acls groups to:
19/06/25 03:19:06 INFO SecurityManager: Changing modify acls groups to:
19/06/25 03:19:06 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/06/25 03:19:06 INFO TransportClientFactory: Successfully created connection to ff059f403060/172.29.0.2:44603 after 5 ms (0 ms spent in bootstraps)
19/06/25 03:19:06 INFO DiskBlockManager: Created local directory at /tmp/spark-21895064-184b-4817-9988-c7e05776cbe8/executor-4423d906-96fc-4abc-a1ec-896b4b944506/blockmgr-58f7f70a-975f-465c-8dd5-a3473462f140
19/06/25 03:19:06 INFO MemoryStore: MemoryStore started with capacity 912.3 MB
19/06/25 03:19:07 INFO CoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@ff059f403060:44603
19/06/25 03:19:07 INFO WorkerWatcher: Connecting to worker spark://[email protected]:37435
19/06/25 03:19:07 INFO TransportClientFactory: Successfully created connection to /172.29.0.5:37435 after 11 ms (0 ms spent in bootstraps)
19/06/25 03:19:07 INFO WorkerWatcher: Successfully connected to spark://[email protected]:37435
19/06/25 03:19:07 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
19/06/25 03:19:07 INFO Executor: Starting executor ID 1 on host 172.29.0.5
19/06/25 03:19:08 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40379.
19/06/25 03:19:08 INFO NettyBlockTransferService: Server created on 172.29.0.5:40379
19/06/25 03:19:08 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
19/06/25 03:19:08 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(1, 172.29.0.5, 40379, None)
19/06/25 03:19:08 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(1, 172.29.0.5, 40379, None)
19/06/25 03:19:08 INFO BlockManager: Initialized BlockManager: BlockManagerId(1, 172.29.0.5, 40379, None)
19/06/25 03:19:08 ERROR Inbox: Ignoring error
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.spark.scheduler.TaskDescription$$anonfun$decode$1.apply(TaskDescription.scala:134)
at org.apache.spark.scheduler.TaskDescription$$anonfun$decode$1.apply(TaskDescription.scala:133)
at scala.collection.immutable.Range.foreach(Range.scala:160)
at org.apache.spark.scheduler.TaskDescription$.decode(TaskDescription.scala:133)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:96)
at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:117)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:205)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:101)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I think the original name of the project was pio-kappa?
I ask because the Travis-CI link points to that and is not working for harness right now.