Comments (22)
Can you start Hubot with HUBOT_LOG_LEVEL=debug to see what line of code the execution is getting to?
from hubot.
Sure. The batch happens here:
const chunkSize = 10
for (let i = 0; i < serviceNames.length; i += chunkSize) {
  let chunk = serviceNames.slice(i, i + chunkSize)
  const ignoredFromChunk = chunk.filter((service) => ignoredServices.includes(service))
  ignored.push(...ignoredFromChunk)
  chunk = chunk.filter((service) => !ignoredServices.includes(service))
  if (chunk.length < 1) {
    continue
  }
  let input = {
    cluster,
    services: chunk,
    include: []
  }
  let command = new DescribeServicesCommand(input)
  let serviceData
  try {
    serviceData = await ecsClient.send(command)
    serviceData = serviceData.services
  } catch (err) {
    robot.logger.error(`Request to AWS failed: ${err}`)
  }
Let's expand that a bit:
Instead of doing
for (let i = 0; i < serviceNames.length; i++) {
  const service = serviceNames[i]
  if (ignoredServices.includes(service)) {
    ignored.push(service)
    continue
  }
  let input = {
    cluster,
    services: [service],
    include: []
  }
  let command = new DescribeServicesCommand(input)
  let serviceData
  try {
    serviceData = await ecsClient.send(command)
    serviceData = serviceData.services[0]
  } catch (err) {
    robot.logger.error(`Request to AWS failed: ${err}`)
  }
I now loop through the list of serviceArns in groups of 10 (and do some filtering). This means that I send a request like ['service1', 'service2', ..., 'service10'] instead of ['service1'], ['service2'], etc., which cuts the number of AWS requests, and the time spent collecting data, by roughly a factor of 10.
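The grouping step described above can be sketched on its own. This is a minimal illustration, not the code from the script: the helper name chunk and the sample service names are made up, and the group size of 10 matches the chunkSize used in the snippet.

```javascript
// Split a list into consecutive groups of at most `size` items.
// `chunk` is a hypothetical helper name used only for this sketch.
function chunk (items, size) {
  const groups = []
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size))
  }
  return groups
}

// Made-up service names, for illustration only.
const names = Array.from({ length: 23 }, (_, n) => `service${n + 1}`)
const batches = chunk(names, 10)
console.log(batches.map((b) => b.length)) // [ 10, 10, 3 ]
```

With 23 names this yields three requests instead of 23. Each group would then become the services array of one DescribeServicesCommand input.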
I think perhaps surfacing the request timeout (somehow) would be amazing. Just so we know it's there.
from hubot.
I see. The new code changes
let input = {
  cluster,
  services: [service],
  include: []
}
to
let input = {
  cluster,
  services: services,
  include: []
}
where services is an array of service names without the ignored ones.
from hubot.
Closing this as it seems to be more an issue with timeouts within adapters. Thanks for the help!
from hubot.
Nothing in the code immediately stands out to me as the culprit. I have a few probing questions:
- how is Hubot hosted? i.e. in kubernetes, on an EC2 instance, ????
- how many instances of Hubot are running?
- Does a single instance of Hubot have access to Prod, Dev and Stage?
- What version of Hubot is running?
from hubot.
Nothing in the code immediately stands out to me as the culprit. I have a few probing questions:
* how is Hubot hosted? i.e. in kubernetes, on an EC2 instance, ????
Docker container in ECS
* how many instances of Hubot are running?
1
* Does a single instance of Hubot have access to Prod, Dev and Stage?
Yes. Read-only access. And importantly, it's to the ECS Clusters. Not separate accounts/environments/etc.
* What version of Hubot is running?
11.1
from hubot.
Does it only respond 4 times when the value is "Production"?
from hubot.
Or when I leave it to "default". So when cluster === Production.
from hubot.
what chat adapter are you using?
Does it respond 4 times with the same answer?
from hubot.
what chat adapter are you using? Does it respond 4 times with the same answer?
https://github.com/hubot-friends/hubot-slack
Yep. Exact same response, 4 times. Also takes about 20 minutes to get all four replies.
(Updated all that in the initial question, too)
from hubot.
Ok. I've seen this behavior before during development. The issue was that the code failed to acknowledge the message. In that situation, the Slack system will "retry sending the message". Here's where the code is supposed to acknowledge the message.
I also see an issue in the Slack Adapter. It's not awaiting robot.receive. I'm unsure what that will cause, but I'll have to push a fix for that.
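The acknowledge-then-work ordering mentioned above could look roughly like the sketch below. It is not the adapter's actual code: handleSlackEvent and slowWork are hypothetical names, and the shape of the incoming object (an ack function next to the event) is an assumption based on how Slack's Socket Mode client delivers events.

```javascript
// Hypothetical handler: acknowledge the envelope before starting slow work,
// so the sender does not time out and redeliver the same event.
// Assumes the delivery object carries an async `ack` function.
async function handleSlackEvent ({ ack, event }, slowWork) {
  await ack() // confirm receipt right away
  return slowWork(event) // the slow part runs only after the acknowledgement
}

// Usage (sketch): handleSlackEvent(delivery, (event) => robot.receive(buildMessage(event)))
// where `buildMessage` is also a made-up name.
```

The point of the ordering is that the acknowledgement no longer depends on how long the listener takes, for example a slow series of AWS calls.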
from hubot.
I've also added the await call in the Slack Adapter.
from hubot.
Seems to just...receive the message multiple times? To be clear...I definitely only typed it once, but this pattern (and I'm hesitant to give you full log messages...) looks like it's just...getting the message again.
from hubot.
Updated to the new adapter and I still get the duplicate messages. :(
from hubot.
Another thought is to await res.send because it's async.
from hubot.
await res.send() also doesn't help.
from hubot.
Is it odd that the envelope_id is different for each of those messages?
from hubot.
Can you run a Hubot instance locally on your machine and reproduce the behavior?
from hubot.
It sounds like you might have a plausible cause, so add several grains of salt to anything in this comment :)
When I've observed Hubot get into a repeats-replies state, I had a suspicion it related to functionality such as remind-her or polling plugins (eg watch statuspage, report when status changes). It seemed like the use of setTimeout() or setInterval() could create concurrent threads. (The fact that you see it reply four times specifically suggests to me this doesn't quite fit ... but maybe there's a magic number in that system I don't know about.)
If the current best theory doesn't pan out, maybe consider which plugins could be disabled to isolate the behaviour?
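For the setInterval() suspicion above, one way overlapping runs can arise is an async callback that takes longer than the interval, so a second run starts before the first finishes. The sketch below guards against that. makeNonOverlapping and pollStatusPage are hypothetical names, and this is only an illustration of the idea, not something taken from any Hubot plugin.

```javascript
// Wrap an async task so a new run is skipped while the previous one is still in flight.
// `makeNonOverlapping` is a hypothetical helper name used only for this sketch.
function makeNonOverlapping (task) {
  let running = false
  return async function guarded (...args) {
    if (running) return false // previous run has not finished, skip this tick
    running = true
    try {
      await task(...args)
      return true
    } finally {
      running = false
    }
  }
}

// Usage (sketch): setInterval(makeNonOverlapping(pollStatusPage), 60000)
// where `pollStatusPage` stands in for whatever the plugin polls.
```

If replies still repeat with such a guard in place, timers are less likely to be the cause.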
from hubot.
There is a timeout in the Slack response! Because this query to AWS is relatively slow, that doesn't entirely surprise me:
{"level":20,"time":1709307623089,"pid":11932,"hostname":"John-Seekins-MacBook-Pro-16-inch-2023-","name":"Hubot","msg":"Text = @hubot ecs list stale tasks"}
{"level":20,"time":1709307623089,"pid":11932,"hostname":"John-Seekins-MacBook-Pro-16-inch-2023-","name":"Hubot","msg":"Event subtype = undefined"}
{"level":20,"time":1709307623089,"pid":11932,"hostname":"John-Seekins-MacBook-Pro-16-inch-2023-","name":"Hubot","msg":"Received generic message: message"}
{"level":20,"time":1709307623090,"pid":11932,"hostname":"John-Seekins-MacBook-Pro-16-inch-2023-","name":"Hubot","msg":"Message '@hubot ecs list stale tasks' matched regex //^\\s*[@]?Hubot[:,]?\\s*(?:ecs list stale tasks( in )?([A-Za-z0-9-]+)?)/i/; listener.options = { id: null }"}
{"level":20,"time":1709307626395,"pid":11932,"hostname":"John-Seekins-MacBook-Pro-16-inch-2023-","name":"Hubot","msg":"eventHandler {
\"envelope_id\": \"bd22596e-ee19-4201-8250-792f91fc96d7\",
\"body\": {
\"token\": \"<>\",
\"team_id\": \"<>\",
\"context_team_id\": \"<>\",
\"context_enterprise_id\": null,
\"api_app_id\": \"<>\",
\"event\": {
\"client_msg_id\": \"<>\",
\"type\": \"message\",
\"text\": \"<@hubot> ecs list stale tasks\",
\"user\": \"<>\",
\"ts\": \"1709307622.850469\",
\"blocks\": [
{
\"type\": \"rich_text\",
\"block_id\": \"5X8EE\",
\"elements\": [
{
\"type\": \"rich_text_section\",
\"elements\": [
{
\"type\": \"user\",
\"user_id\": \"<>\"
},
{
\"type\": \"text\",
\"text\": \" ecs list stale tasks\"
}
]
}
]
}
],
\"team\": \"<>\",
\"channel\": \"<>\",
\"event_ts\": \"1709307622.850469\",
\"channel_type\": \"channel\"
},
\"type\": \"event_callback\",
\"event_id\": \"<>\",
\"event_time\": 1709307622,
\"authorizations\": [
{
\"enterprise_id\": null,
\"team_id\": \"<>\",
\"user_id\": \"<>\",
\"is_bot\": true,
\"is_enterprise_install\": false
}
],
\"is_ext_shared_channel\": false,
\"event_context\": \"<>\"
},
\"event\": {
\"client_msg_id\": \"<>\",
\"type\": \"message\",
\"text\": \"<@hubot> ecs list stale tasks\",
\"user\": \"<>\",
\"ts\": \"1709307622.850469\",
\"blocks\": [
{
\"type\": \"rich_text\",
\"block_id\": \"5X8EE\",
\"elements\": [
{
\"type\": \"rich_text_section\",
\"elements\": [
{
\"type\": \"user\",
\"user_id\": \"<>\"
},
{
\"type\": \"text\",
\"text\": \" ecs list stale tasks\"
}
]
}
]
}
],
\"team\": \"<>\",
\"channel\": \"<>\",
\"event_ts\": \"1709307622.850469\",
\"channel_type\": \"channel\"
},
\"retry_num\": 1,
\"retry_reason\": \"timeout\",
\"accepts_response_payload\": false
}"
}
{"level":20,"time":1709307626395,"pid":11932,"hostname":"John-Seekins-MacBook-Pro-16-inch-2023-","name":"Hubot","msg":"event {
\"envelope_id\": \"<>\",
\"body\": {
\"token\": \"<>\",
\"team_id\": \"<>\",
\"context_team_id\": \"<>\",
\"context_enterprise_id\": null,
\"api_app_id\": \"<>\",
\"event\": {
\"client_msg_id\": \"<>\",
\"type\": \"message\",
\"text\": \"<@hubot> ecs list stale tasks\",
\"user\": \"<>\",
\"ts\": \"1709307622.850469\",
\"blocks\": [
{
\"type\": \"rich_text\",
\"block_id\": \"5X8EE\",
\"elements\": [
{
\"type\": \"rich_text_section\",
\"elements\": [
{
\"type\": \"user\",
\"user_id\": \"<>\"
},
{
\"type\": \"text\",
\"text\": \" ecs list stale tasks\"
}
]
}
]
}
],
\"team\": \"<>\",
\"channel\": \"<>\",
\"event_ts\": \"1709307622.850469\",
\"channel_type\": \"channel\"
},
\"type\": \"event_callback\",
\"event_id\": \"<>\",
\"event_time\": 1709307622,
\"authorizations\": [
{
\"enterprise_id\": null,
\"team_id\": \"<>\",
\"user_id\": \"<>\",
\"is_bot\": true,
\"is_enterprise_install\": false
}
],
\"is_ext_shared_channel\": false,
\"event_context\": \"<>\"
},
\"event\": {
\"client_msg_id\": \"<>\",
\"type\": \"message\",
\"text\": \"<@hubot> ecs list stale tasks\",
\"user\": \"<>\",
\"ts\": \"1709307622.850469\",
\"blocks\": [
{
\"type\": \"rich_text\",
\"block_id\": \"5X8EE\",
\"elements\": [
{
\"type\": \"rich_text_section\",
\"elements\": [
{
\"type\": \"user\",
\"user_id\": \"<>\"
},
{
\"type\": \"text\",
\"text\": \" ecs list stale tasks\"
}
]
}
]
}
],
\"team\": \"<>\",
\"channel\": \"<>\",
\"event_ts\": \"1709307622.850469\",
\"channel_type\": \"channel\"
},
\"retry_num\": 1,
\"retry_reason\": \"timeout\",
\"accepts_response_payload\": false}
user = <>"
}
from hubot.
It's definitely me racing a timeout! I changed the code to batch AWS requests more efficiently and I'm no longer getting duplicate messages!
Relevant code:
/*
 * Stale Deploys
 */
robot.respond(/ecs list stale tasks( in )?([A-Za-z0-9-]+)?/i, async res => {
  const cluster = res.match[2] || defaultCluster
  const services = await paginateServices(ecsClient, cluster)
  // no need to sort these results
  // keep only the last path segment of each entry (the service name)
  const serviceNames = services.map((x) => x.split('/').pop())
  const staleDateShort = new Date(Date.now() - shortExpireSecs)
  const staleDateLong = new Date(Date.now() - longExpireSecs)
  const expiredDate = new Date(Date.now() - expiredSecs)
  let ignored = []
  let shortExp = []
  let longExp = []
  let exp = []
  /*
   * Collect service data
   */
  const chunkSize = 10
  for (let i = 0; i < serviceNames.length; i += chunkSize) {
    let chunk = serviceNames.slice(i, i + chunkSize)
    const ignoredFromChunk = chunk.filter((service) => ignoredServices.includes(service))
    ignored.push(...ignoredFromChunk)
    chunk = chunk.filter((service) => !ignoredServices.includes(service))
    if (chunk.length < 1) {
      continue
    }
    let input = {
      cluster,
      services: chunk,
      include: []
    }
    let command = new DescribeServicesCommand(input)
    let serviceData
    try {
      serviceData = await ecsClient.send(command)
      serviceData = serviceData.services
    } catch (err) {
      robot.logger.error(`Request to AWS failed: ${err}`)
      // serviceData is undefined when the request fails, so skip this chunk
      continue
    }
    for (let idx = 0; idx < serviceData.length; idx++) {
      const deployDate = new Date(serviceData[idx].deployments[0].createdAt)
      // skip any service newer than our longest expiration window
      if (deployDate > staleDateLong) {
        continue
      }
      const servString = `\`${serviceData[idx].serviceName}\` (deployed ${deployDate.toISOString()})`
      if (deployDate < expiredDate) {
        exp.push(servString)
      } else if (deployDate < staleDateShort) {
        shortExp.push(servString)
      } else {
        longExp.push(servString)
      }
    }
  }
from hubot.
Well done tracking down this bug.
I don't see the code that "batches the AWS requests". Would you mind pointing it out for me? I'd love to see how you solved it.
I'm also curious if there's a move I can make to the Slack Adapter to either not let this situation happen or make it very visible that it's happening.
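One possible way to make the situation visible, sketched under assumptions: check each incoming delivery for the retry_num and retry_reason fields that appear in the debug log earlier in this thread, and log a warning when they indicate a redelivery. The function name describeRetry is made up, and where such a check would live in the adapter is left open.

```javascript
// Return a warning string when a delivery is flagged as a retry, otherwise null.
// `retry_num`, `retry_reason` and `envelope_id` are the field names seen in the
// debug log in this thread; `describeRetry` is a hypothetical helper name.
function describeRetry (envelope) {
  const attempt = envelope.retry_num ?? 0
  if (attempt < 1) return null
  const reason = envelope.retry_reason ?? 'unknown'
  return `Slack redelivered envelope ${envelope.envelope_id} (retry ${attempt}, reason: ${reason})`
}

// Example with made-up values.
const warning = describeRetry({ envelope_id: 'abc', retry_num: 1, retry_reason: 'timeout' })
if (warning) console.warn(warning)
```

A warning like this would have pointed at the timeout directly, without needing debug-level logs.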
from hubot.