orkes-io / orkes-conductor-community
Orkes Conductor is a microservices orchestration engine.
License: Other
Describe the bug
When running the standalone Docker image, http://localhost:8080/swagger-ui/index.html does not display any operations; instead it shows the error "No operations defined in spec!"
Steps To Reproduce
Run the self-contained, standalone Docker image as per instructions. Navigate to http://localhost:8080/, then click on "Swagger Documentation".
Expected behavior
The usual Swagger page to be displayed with operations.
Device/browser
Additional context
The same issue occurs when downloading and building the server, then running it locally outside Docker against local Redis and Postgres instances.
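One way to narrow this down is to fetch the OpenAPI document that Swagger UI reads. This is only a diagnostic sketch; /v3/api-docs is the springdoc default path and is an assumption here, not confirmed for this image:

```shell
# "No operations defined in spec!" usually means the UI loaded, but the
# OpenAPI JSON it fetched was empty or unreachable. /v3/api-docs is the
# springdoc default path (assumption; adjust if the server exposes it elsewhere).
curl -s http://localhost:8080/v3/api-docs | head -c 300
```

An empty `paths` object or a 404 here would point at spec generation on the server rather than at the Swagger UI itself.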
Describe the bug
Since version 1.0.7, the workflow list is no longer displayed.
Steps To Reproduce
Steps to reproduce the behavior:
docker volume create postgres
docker volume create redis
docker run --init -p 8080:8080 -p 1234:5000 --mount source=redis,target=/redis \
  --mount source=postgres,target=/pgdata orkesio/orkes-conductor-community-standalone:latest
Expected behavior
I expect that all workflows are shown.
Device/browser
Additional context
In version 1.0.6 it worked fine. Bug is here since 1.0.7.
Describe the bug
We run the Docker image orkesio/orkes-conductor-community-standalone:latest and the container is always marked unhealthy because curl seems to be missing from the image:
"Health": {
"Status": "unhealthy",
"FailingStreak": 386,
"Log": [
{
"Start": "2024-07-08T16:51:14.249784013Z",
"End": "2024-07-08T16:51:14.344971847Z",
"ExitCode": 1,
"Output": "/bin/sh: curl: not found\n"
},
{
"Start": "2024-07-08T16:52:14.346606472Z",
"End": "2024-07-08T16:52:14.433134513Z",
"ExitCode": 1,
"Output": "/bin/sh: curl: not found\n"
},
{
"Start": "2024-07-08T16:53:14.440914138Z",
"End": "2024-07-08T16:53:14.549711097Z",
"ExitCode": 1,
"Output": "/bin/sh: curl: not found\n"
},
{
"Start": "2024-07-08T16:54:14.552011055Z",
"End": "2024-07-08T16:54:14.644104638Z",
"ExitCode": 1,
"Output": "/bin/sh: curl: not found\n"
},
{
"Start": "2024-07-08T16:55:14.646305222Z",
"End": "2024-07-08T16:55:14.75006993Z",
"ExitCode": 1,
"Output": "/bin/sh: curl: not found\n"
}
    ]
}
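Until the image ships curl (or a healthcheck that does not depend on it), one possible workaround is to override the healthcheck at run time. This is a sketch: it assumes wget is present in the image and that /health is the endpoint the original check probes, neither of which is confirmed here:

```shell
# Workaround sketch: override the built-in healthcheck so it does not rely on
# curl. Assumes wget exists in the image and /health is the probe endpoint
# (both assumptions).
docker run --init -p 8080:8080 \
  --health-cmd 'wget -q -O /dev/null http://localhost:8080/health || exit 1' \
  --health-interval 60s --health-retries 3 \
  orkesio/orkes-conductor-community-standalone:latest
```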
Describe the bug
A race condition was found when indexing tasks to Elasticsearch: the number of index requests sent by the Conductor server does not match the number received on the ES side.
Steps To Reproduce
Steps to reproduce the behavior:
Expected behavior
All tasks should be in a terminal status in ES, such as COMPLETED/FAILED, rather than IN_PROGRESS.
Device/browser
Additional context
Time taken {} for indexing task:{} in workflow: {}
On the ES side, the received record count is randomly less than 3.
There appears to be a race condition between the indexObject and indexBulkRequest methods:
```java
private void indexObject(
        final String index, final String docType, final String docId, final Object doc) {
    byte[] docBytes;
    try {
        docBytes = objectMapper.writeValueAsBytes(doc);
    } catch (JsonProcessingException e) {
        logger.error("Failed to convert {} '{}' to byte string", docType, docId);
        return;
    }
    IndexRequest request = new IndexRequest(index);
    request.id(docId).source(docBytes, XContentType.JSON);
    if (bulkRequests.get(docType) == null) {
        bulkRequests.put(
                docType, new BulkRequests(System.currentTimeMillis(), new BulkRequest()));
    }
    bulkRequests.get(docType).getBulkRequest().add(request);
    if (bulkRequests.get(docType).getBulkRequest().numberOfActions() >= this.indexBatchSize) {
        indexBulkRequest(docType);
    }
}

private synchronized void indexBulkRequest(String docType) {
    if (bulkRequests.get(docType).getBulkRequest() != null
            && bulkRequests.get(docType).getBulkRequest().numberOfActions() > 0) {
        synchronized (bulkRequests.get(docType).getBulkRequest()) {
            indexWithRetry(
                    bulkRequests.get(docType).getBulkRequest().get(),
                    "Bulk Indexing " + docType,
                    docType);
            bulkRequests.put(
                    docType, new BulkRequests(System.currentTimeMillis(), new BulkRequest()));
        }
    }
}
```
Thanks
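In the quoted code, indexObject appends to the shared BulkRequest without holding the lock that indexBulkRequest uses, so a document added between the flush and the buffer swap can be silently dropped. A minimal sketch of one possible fix (illustrative names, not the actual Conductor classes): make both the add and the flush-and-swap synchronize on the same monitor.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch, NOT the actual Conductor fix: a per-type batch buffer
// where add and flush share one monitor, so an add can never race with the
// flush-and-swap step that replaces the buffer.
public class BatchedIndexer {
    private final Map<String, List<String>> buffers = new HashMap<>();
    private final int batchSize;
    // Stand-in counter for documents actually shipped via indexWithRetry(...).
    public final AtomicInteger flushedDocs = new AtomicInteger();

    public BatchedIndexer(int batchSize) {
        this.batchSize = batchSize;
    }

    // synchronized: a document cannot be appended to a buffer that another
    // thread is simultaneously flushing and replacing.
    public synchronized void indexObject(String docType, String doc) {
        buffers.computeIfAbsent(docType, k -> new ArrayList<>()).add(doc);
        if (buffers.get(docType).size() >= batchSize) {
            flush(docType);
        }
    }

    public synchronized void flush(String docType) {
        List<String> batch = buffers.remove(docType);
        if (batch != null && !batch.isEmpty()) {
            // In real code this would be indexWithRetry(batch, ...).
            flushedDocs.addAndGet(batch.size());
        }
    }
}
```

With both methods holding the same lock, every appended document ends up in exactly one flushed batch, which is the invariant the reported counts suggest is being violated.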
I'm trying to run the Orkes server, pointing it to a Redis Sentinel cluster. I am mounting the following in /app/config/config.properties:
spring.datasource.url=jdbc:postgresql://postgres:5432/postgres
spring.datasource.username=postgres
spring.datasource.password=postgres
conductor.db.type=redis_sentinel
conductor.redis-lock.serverAddress=redis://redis:26379
conductor.redis.hosts=redis:26379:this-one
Below is the output from Orkes, grepping for the word redis:
10:32:07.812 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_6379_TCP_PROTO, Value: tcp
10:32:07.812 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_6379_TCP_ADDR, Value: 10.43.51.71
10:32:07.812 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT, Value: tcp://10.43.51.71:6379
10:32:07.813 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_SERVICE_PORT_TCP_SENTINEL, Value: 26379
10:32:07.813 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_26379_TCP, Value: tcp://10.43.51.71:26379
10:32:07.814 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_26379_TCP_ADDR, Value: 10.43.51.71
10:32:07.814 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_26379_TCP_PORT, Value: 26379
10:32:07.814 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_SERVICE_HOST, Value: 10.43.51.71
10:32:07.815 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_SERVICE_PORT_TCP_REDIS, Value: 6379
10:32:07.815 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_26379_TCP_PROTO, Value: tcp
10:32:07.815 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_6379_TCP, Value: tcp://10.43.51.71:6379
10:32:07.815 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_6379_TCP_PORT, Value: 6379
10:32:07.816 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_SERVICE_PORT, Value: 6379
10:32:07.832 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting conductor.redis-lock.serverAddress - redis://redis:26379
10:32:07.832 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting conductor.db.type - redis_sentinel
10:32:07.832 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting conductor.redis.hosts - redis:26379:this-one
2023-02-21 10:32:18,624 INFO  [main] io.orkes.conductor.queue.config.RedisQueueConfiguration: Starting conductor server using redis_standalone - use SSL? false
2023-02-21 10:32:19,055 ERROR [main] com.netflix.conductor.redis.dao.RedisMetadataDAO: refresh TaskDefs failed
redis.clients.jedis.exceptions.JedisDataException: ERR unknown command `HSCAN`, with args beginning with: `conductor.test.TASK_DEFS`, `0`,
at redis.clients.jedis.Protocol.processError(Protocol.java:135)
at redis.clients.jedis.Protocol.process(Protocol.java:169)
at redis.clients.jedis.Protocol.read(Protocol.java:223)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
at redis.clients.jedis.Connection.getUnflushedObjectMultiBulkReply(Connection.java:314)
at redis.clients.jedis.Connection.getObjectMultiBulkReply(Connection.java:319)
at redis.clients.jedis.Jedis.hscan(Jedis.java:3727)
at redis.clients.jedis.Jedis.hscan(Jedis.java:3719)
at com.netflix.conductor.redis.jedis.JedisStandalone.lambda$hscan$127(JedisStandalone.java:706)
at com.netflix.conductor.redis.jedis.JedisStandalone.executeInJedis(JedisStandalone.java:59)
at com.netflix.conductor.redis.jedis.JedisStandalone.hscan(JedisStandalone.java:706)
at com.netflix.conductor.redis.jedis.OrkesJedisProxy.hgetAll(OrkesJedisProxy.java:148)
at com.netflix.conductor.redis.dao.RedisMetadataDAO.getAllTaskDefs(RedisMetadataDAO.java:125)
at com.netflix.conductor.redis.dao.RedisMetadataDAO.refreshTaskDefs(RedisMetadataDAO.java:92)
at com.netflix.conductor.redis.dao.RedisMetadataDAO.<init>(RedisMetadataDAO.java:57)
at com.netflix.conductor.redis.dao.OrkesMetadataDAO.<init>(OrkesMetadataDAO.java:55)
2023-02-21 10:32:19,059 INFO  [main] com.netflix.conductor.redis.dao.OrkesMetadataDAO: taskDefCacheTTL set to 1000
So I believe the config file is being loaded (the "Setting ..." lines above suggest so), but the server still starts with the standalone Redis configuration. Despite my attempt to override it, the default config still appears to take precedence. Note the "Starting conductor server using redis_standalone" line above.
Any ideas please?
Describe the bug
In the UI, even though there are many execution requests, the correct row and pagination counts are not shown.
Steps To Reproduce
Steps to reproduce the behavior:
Make sure you have many prior executions, preferably more than 15.
Go to the Executions tab and keep the default view count of 15 executions; the page navigation icons are disabled and the total execution count is displayed incorrectly.
Click on the dropdown to increase the row count and you will be able to see the extra records.
Expected behavior
The row count should be correct, and the navigation icons should allow paging through the results.
Additional context
These workflows were spawned from the backend using localhost:1234 endpoints, not via the UI workbench. Thanks.
Describe the bug
We are working on an ad-hoc task workflow. For that we have created a new Spring Boot service with the Conductor client. We are using the Conductor server in Docker [orkesio/orkes-conductor-community-standalone:latest].
My service works fine without Spring Cloud Config: it is able to poll for tasks and execute them. However, when I add the Spring Cloud Config dependencies, it is no longer able to poll for tasks and hence does not execute them.
Steps To Reproduce
Steps to reproduce the behavior:
I have created a demo project in my GitHub.
Run the main method in src/main/java/com/arpitrathore/test/Application.java (on the branch that has the Spring Cloud Config dependency), then:
curl -H 'Content-Type: application/json' http://localhost:8080/submit/ -d '{"someId": 123}'
Notice that the service is NOT able to poll the task and execute it.
Check out the without-cloud-config branch (it does not have the Spring Cloud dependency) and run the main method in src/main/java/com/arpitrathore/test/Application.java again, then:
curl -H 'Content-Type: application/json' http://localhost:8080/submit/ -d '{"someId": 123}'
Notice the service is able to poll the task and execute it.
Expected behavior
Service should poll and execute the task with or without spring cloud config dependency
Device/browser
In our production use-case, we often have long running workflows that wait on human tasks.
Because we want to be able to track human tasks in our own backoffice systems, we created a subworkflow that creates and tracks human tasks for us and ends with a HUMAN task in conductor.
We noticed an absurd load on Redis, even when every single currently non-completed workflow is idling on a subworkflow that is itself idling on a HUMAN task. Looking into it more, we noticed that our logs are getting spammed with
INFO [sweeper-thread-1] io.orkes.conductor.server.service.OrkesWorkflowSweeper: Running sweeper for workflow ***
This constantly fetches each workflow and its tasks, and it seems to be currently impossible to slow this process down.
Looking at the contradictory statements in this code and its comment, https://github.com/orkes-io/orkes-conductor-community/blob/60325ef7b196a96d1062ddfecf924c4be7866309/server/src/main/java/io/orkes/conductor/server/service/OrkesWorkflowSweeper.java#L152C4-L152C4 (the comment says 60 seconds, but the code uses 60 milliseconds), I'm worried a mistake might have been made in the implementation of the sweeper service, and workflows are being checked far more often than they should be.
I believe this to be a root cause of our production systems failing under relatively light load. Is there any way to slow down the sweeper without disabling it completely, or does a bug need to be fixed?
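The suspected unit mix-up can be stated concretely. The following is a hypothetical illustration with invented names, not the actual OrkesWorkflowSweeper code:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of the suspected bug (invented names): if the
// comment intends "re-check in 60 seconds" but a raw 60 is passed where
// milliseconds are expected, each workflow is swept ~1000x more often than
// intended, which would explain the Redis load.
public class SweeperDelay {
    static final long BUGGY_DELAY_MS = 60;                               // what "60 millis" does
    static final long INTENDED_DELAY_MS = TimeUnit.SECONDS.toMillis(60); // what the comment says

    // Sweeps of a single idle workflow per hour at a given re-check delay.
    public static long sweepsPerHour(long delayMs) {
        return TimeUnit.HOURS.toMillis(1) / delayMs;
    }
}
```

At a 60 ms delay an idle workflow is swept 60,000 times per hour instead of the 60 times per hour the comment implies.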
Describe the bug
The retryDelaySeconds parameter in the task definition does not add a delay when retrying the task. The issue is observed with all tasks.
It can be reproduced with the sample workflow in this repo:
path: orkes-conductor-community-build/persistence/src/test/resources/wf2.json
Expected behavior
Failed Tasks should retry after delay of N seconds
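For reference, a minimal task definition exercising the parameter might look like the following. Values and the task name are illustrative; field names follow the Conductor task-definition schema:

```json
{
  "name": "sample_task",
  "retryCount": 3,
  "retryLogic": "FIXED",
  "retryDelaySeconds": 10,
  "timeoutSeconds": 300,
  "responseTimeoutSeconds": 60,
  "ownerEmail": "owner@example.com"
}
```

With retryLogic FIXED and retryDelaySeconds 10, each retry should be scheduled roughly 10 seconds after the failure; in the buggy behavior described above, retries fire immediately.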
Describe the bug
I created an HTTP task and an inline task with the intention of starting them with a delay of 5 seconds. There is a predefined property suggested for this feature: startDelay (in seconds). After adding a delay of 5 (I also tried 5000) it did not work; workflows start instantaneously.
This is reproducible with all workflows; it can be tried on a sample workflow in this repo by modifying the value of startDelay:
Path: orkes-conductor-community-build/server/src/main/resources/workflows.json
Expected behavior
Workflow should start after N seconds
Device/browser
Across all browsers
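For reference, startDelay is set per task inside the workflow definition. A minimal illustrative fragment (task name, reference name, and URI are invented):

```json
{
  "name": "sample_http_task",
  "taskReferenceName": "sample_http_task_ref",
  "type": "HTTP",
  "startDelay": 5,
  "inputParameters": {
    "http_request": {
      "uri": "http://localhost:8080/health",
      "method": "GET"
    }
  }
}
```

With startDelay set to 5, the task should be scheduled 5 seconds after the workflow reaches it; in the behavior described above, it starts immediately regardless of the value.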