samliew / se-electionbot
A Stack Exchange/Stack Overflow election chatbot written in Node.js to handle FAQs in an election chat room
License: MIT License
The getSiteUserIdFromChatStackExchangeId function relies on scraping the profile page of the user to get the network user id in case it did not get one from the "linked site". Unfortunately, it is now broken (as indicated by the failing CI). I highly suspect this is due to the recent change to profile pages that SE made (see MSE post).
I already know how to fix it and will make a PR a bit later, but I am opening the issue for documentation purposes.
(Non-debug/Production mode only)
As election chat rooms are manually created by a CM and then linked on the election page, the bot may have been in a default room and needs to switch to the official election chat room.
To do this we have these options:
Currently, guard unit tests are added and cross-referenced manually. We should auto-generate cross-referencing tests instead.
Solving this will also allow us to move forward with removing the rest of the if...else statements in favor of a reducer.
Debug info:
https://tex.stackexchange.com/election/2
user withdrawn https://tex.stackexchange.com/election/2#post-620144
Withdrawn status was announced in election room only #148, but had wrong link.
There are some cases that the candidate score does not handle properly right now that we need to address:
A 503 response from the API (it actually seems that, with the upgraded network id getter, the risk of hitting the API throttle got higher), which caused the bot to respond with this (lack of whitespace is on me, though :( - fixed by cf55705):
RESPONSE Wow! You have a maximum candidate score of 40!Alas, the nomination period is over Hope to see your candidature next election!
The response has been building up correctly - the real score should've been 34/40 with 6 badges missing and rep maxed out:
Missing Badges: Electorate,Marshal,Reviewer,Steward,Investor,Copy Editor
Not sure of the root cause yet, will investigate
Lacking an account on the site causes the bot to respond with a calculation error. That's kind of expected, but we could look into returning a user-friendly message in this case (tested on me by accident - forgot that I do not have an account on Academia). Not sure if it is easy to distinguish between an API failure and a missing account, though.
We should also guard against null from the utility fetching network accounts (312c1ff fixes this). It does not help us much if an error happens at this point, but it reduces log cluttering and allows the utility to exit gracefully.
Babel requires a significant amount of setup for little benefit nowadays, especially given that we already harness a lot of TypeScript's features either way. I propose considering dropping Babel in favor of allowJs set to true (no extra setup, it already works like this now) and using what TS emits directly.
This discussion is inspired by an upcoming PR I'll link soon.
Display free dyno hours on dashboard
Election dates could change halfway (i.e.: when not enough candidates and the nomination period is extended).
The bot will need to cancel existing cron jobs and reschedule them, otherwise we will need to manually restart the bot/instance.
This feature is necessary if the bot gains automation abilities in future #61 as it may be more difficult to restart the process/instance(?)
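A minimal sketch of the cancel-and-reschedule step, for illustration only. It uses plain timers instead of the bot's actual cron setup (the issue does not name the library), and Rescheduler along with its method names is hypothetical:

```javascript
// Hypothetical helper: keeps one timer per named job and replaces it when the date changes.
class Rescheduler {
    /**
     * @param {(task: () => void, delayMs: number) => any} setTimer injectable for testing
     * @param {(handle: any) => void} clearTimer injectable for testing
     */
    constructor(
        setTimer = (task, delayMs) => setTimeout(task, delayMs),
        clearTimer = (handle) => clearTimeout(handle)
    ) {
        this.setTimer = setTimer;
        this.clearTimer = clearTimer;
        this.jobs = new Map(); // job name -> { at: timestamp, handle: timer handle }
    }

    /**
     * Schedules `task` to run at `date`, cancelling an earlier job with the same name.
     * @returns {boolean} false if the job was already scheduled for exactly this date
     */
    schedule(name, date, task, now = Date.now()) {
        const existing = this.jobs.get(name);
        if (existing && existing.at === date.getTime()) return false; // date unchanged
        if (existing) this.clearTimer(existing.handle); // election date moved: drop old job
        const handle = this.setTimer(task, Math.max(0, date.getTime() - now));
        this.jobs.set(name, { at: date.getTime(), handle });
        return true;
    }
}
```

Calling schedule after every rescrape with the freshly scraped dates would then be enough: unchanged dates are a no-op, and changed dates replace the pending job without restarting the process.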
Currently, if the election data ends up invalid, then depending on when the check occurs, the server is either not started at all (if the failure happens early) or crashes with a 500 on subsequent connects (if the rescraper led to an invalid election state). The most recent downtime of the bot resulted from Stack Exchange breaking access to nomination pages. We want to avoid crashing if that happens, as we can still serve some useful info in the meantime and periodically rescrape to ensure we are up and running as soon as the issue is fixed.
Already working on it; documenting to keep track of the updates.
There are two sub-issues that need to be solved as part of this issue:
Election#validate method is called, and the latter exits early
ScheduledAnnouncement#rescrape method ends up breaking the bot if the election state is invalid
A test is failing on a certain condition:
1) Election
getters
chatDomain
should default to stackexchange.com for network sites with non-SE domains:
AssertionError: expected 'stackoverflow.com' to equal 'stackexchange.com'
+ expected - actual
-stackoverflow.com
+stackexchange.com
Config variables when test fails:
CHAT_DOMAIN=stackoverflow.com
CHAT_ROOM_ID=190503
DEBUG=true
ELECTION_URL=https://stackoverflow.com/election/13
Removing the above values for CHAT_DOMAIN and CHAT_ROOM_ID makes the test pass.
Should this bot be able to run in multiple election rooms, will this cause an issue if it's all on the same Stack Exchange chat account?
Will we need to create multiple accounts for the bot to utilise in chat?
If possible we should be able to run everything via a single account, which means keeping track of when the last message was sent and then back-off.
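One possible shape for the single-account back-off, as a sketch only. MessageThrottle and minIntervalMs are hypothetical names, and the actual interval chat tolerates is not stated in this issue:

```javascript
// Hypothetical helper: one shared throttle for every room the account posts in.
class MessageThrottle {
    /**
     * @param {number} minIntervalMs assumed minimum gap between two outgoing messages
     * @param {() => number} now clock, injectable for testing
     */
    constructor(minIntervalMs, now = Date.now) {
        this.minIntervalMs = minIntervalMs;
        this.now = now;
        this.nextFreeSlot = 0; // timestamp at which the next message may go out
    }

    /**
     * Reserves the next sending slot, regardless of which room the message is for.
     * @returns {number} milliseconds the caller should wait before sending (0 = send now)
     */
    reserve() {
        const current = this.now();
        const sendAt = Math.max(current, this.nextFreeSlot);
        this.nextFreeSlot = sendAt + this.minIntervalMs;
        return sendAt - current;
    }
}
```

Because the slot is reserved per account rather than per room, messages queued for different rooms are spaced out automatically.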
For some reason, the cancelled French Language election is showing up as ended, not cancelled. Also, there is a problem with the ballot link that surfaces because of this bug. On it.
Preliminary investigation: notice parsing issue
Given that https://stackexchange.com/filters/421979/all-elections is a relatively new addition, it only goes back as far as 2021, so the scraping we utilize can only go so far. It is possible, though, to fall back to the API and fetch all per-site Meta posts tagged [election] before processing instead. This will most likely result in only 1-2 additional API calls on startup and during election hopping.
Right now, the bot only checks elections for changes every scrapeIntervalMins minutes (default: 5).
This means when an election ends, the bot could take an additional 5 minutes to announce the winners.
This should be automated to rescrape 30 seconds (or a period allowing for tallying of votes) following the end of the election since SO has automated the elections.
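A sketch of how the extra rescrape could be timed. The function name is hypothetical, and the 30-second grace period is the figure suggested above, not a known tallying time:

```javascript
/**
 * Computes how long to wait before the one-off rescrape after the election ends.
 * @param {Date} electionEnd scheduled end of the election
 * @param {Date} now current time
 * @param {number} graceSeconds assumed allowance for vote tallying
 * @returns {number} delay in milliseconds, never negative
 */
function msUntilPostElectionRescrape(electionEnd, now, graceSeconds = 30) {
    const target = electionEnd.getTime() + graceSeconds * 1000;
    return Math.max(0, target - now.getTime());
}
```

The result can be passed to setTimeout alongside the regular interval-based scrape. Note that Node.js timers do not support delays above 2147483647 ms (about 24.8 days), so the timer should only be armed once the end date is close enough.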
This may complicate #60
Following an implementation of #60 (Create a separate program/process to scrape election status on all network sites), we could have another "main" process that spins up more instances/processes for each election when an election is detected, and terminates them automatically N days after it ends or is cancelled.
This main process then could also "own" the development chatroom test instance if started by a dev.
We then may need to create a dev-only UI/API to manually start/stop instances, as well as override variables for each election instance.
Currently the election permalink/number is manually found by trial and error, ever since the past election history page is no longer accessible during an active election.
i.e. https://stackoverflow.com/election/12
Currently, this is then inserted into the bot's environment variable to direct the bot to scrape the right page.
We need a method to automatically find the current active election's permalink/number if the bot scrapes /election for each site and finds an active election, for automation of the bot (#60 and #61).
Currently, fetchChatTranscript only fetches messages for the last day. It is a known limitation; however, we can (and should) expand on it to keep fetching transcripts until the TRANSCRIPT_SIZE threshold is met.
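A rough sketch of the loop this implies. fetchDay is a hypothetical per-day fetcher standing in for whatever fetchChatTranscript does internally, and maxDays is an assumed safety cap so that a quiet room does not cause unbounded requests:

```javascript
/**
 * Walks back day by day until enough messages are collected.
 * @param {(date: Date) => Promise<any[]>} fetchDay hypothetical: returns one day's messages, oldest first
 * @param {number} transcriptSize number of messages wanted (TRANSCRIPT_SIZE)
 * @param {Date} startDate most recent day to fetch
 * @param {number} maxDays assumed cap on how far back to go
 * @returns {Promise<any[]>} up to `transcriptSize` most recent messages, oldest first
 */
async function fetchUntilThreshold(fetchDay, transcriptSize, startDate, maxDays = 30) {
    const messages = [];
    const day = new Date(startDate.getTime());
    for (let i = 0; i < maxDays && messages.length < transcriptSize; i++) {
        const dayMessages = await fetchDay(new Date(day.getTime()));
        // older days go in front so the result stays in chronological order
        messages.unshift(...dayMessages);
        day.setUTCDate(day.getUTCDate() - 1);
    }
    // keep only the most recent `transcriptSize` messages
    return messages.slice(-transcriptSize);
}
```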
Scraping of election page & regex matching/testing of text may not work as intended for elections on non-English sites:
date formats will be different, some text may be RTL, unicode, etc.
Site examples:
If the election is to be extended for a week due to insufficient candidates, it should not say cancelled.
https://chat.stackexchange.com/transcript/message/61237556#61237556
Currently, when bot is in multiple rooms, a triggering message event in any of the rooms will cause the bot to respond in the room that it's listening to.
We must be careful about this because the bot can't leave rooms by itself, so if we are testing in another room using a local instance, the main instance will still get those events, causing it to also respond in the election chat room (or all instances).
We need to get the event's room Id, and ignore it if it's not from the same room that it's listening/connected to.
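A small sketch of such a guard. The event shape (a roomId property) is an assumption, as the chat client's actual event object is not shown in this issue, and onlyFromRoom is a hypothetical name:

```javascript
/**
 * Wraps a message handler so it only runs for events from the configured room.
 * @param {number|string} listeningRoomId room this instance is connected to (e.g. CHAT_ROOM_ID)
 * @param {(event: { roomId: number|string }) => void} handler the real message handler
 * @returns {(event: { roomId: number|string }) => boolean} true if the event was handled
 */
function onlyFromRoom(listeningRoomId, handler) {
    return (event) => {
        // env vars are strings, so compare numerically
        if (Number(event.roomId) !== Number(listeningRoomId)) return false; // ignored
        handler(event);
        return true;
    };
}
```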
A temporary fix would be to kick bot from all except one room. However this will raise automatic flags for privileged users (mods) in other rooms. Another option would be to login as that account and then leave the rooms.
For some elections, autoscaling to Hobby dynos simply wastes resources due to low or non-existent activity in the election chat room.
Currently, the only way to prevent autoscaling is to launch the bot in debug mode, but it has the disadvantage of posting a debug message every time the bot restarts. This would not be an issue if Heroku did not have dyno cycling, which happens at least once every 24 hours (expected behavior) and causes the bot to continuously post the debug message.
With the addition of some developer-only testing commands to the privileged triggers, mods on the SE network may accidentally trigger an unwanted action, e.g.: time-travelling during an active election.
Commands that may require moving to a separate dev-only menu:
We could continue using the current env variable ADMIN_IDS for dev IDs.
Currently, the setting up of this bot is done manually by changing environment vars in Heroku:
If we can create a program to scrape all network sites' election pages, and store the status in a persistent database, we could potentially use this to automatically spin up a new instance for that site if a scheduled election is detected (usually the announcement on per-site meta comes later, although they are automating the election process - so look out for changes).
Maybe we don't need to find/scrape default chatroom urls for each site, since that is unlikely to change. The list can be fetched from https://api.stackexchange.com/docs/sites#filter=default&run=true, and the room with the most users & messages should be the default (e.g.: https://chat.stackexchange.com/?tab=site&sort=people&host=serverfault.com).
But it would be nice for the bot to detect the election chat room if there is one, using the following methods (in order):
and display it on the bot dashboard
Heroku is slowly starting to nudge towards upgrading from stack 20 to 22. To avoid reaching EoL of stack 20, we need to consider upgrading. So far, the upgrade seems harmless, but I think we should defer experimenting with it until there are few to no elections running.
See https://devcenter.heroku.com/articles/heroku-22-stack for reference.
As per what transpired during the approval of PR #147, we need to tweak the GitHub workflow action to avoid failing the build with no options. The failure is related to secrets not being passed to normal pull_request workflows as per the docs:
With the exception of GITHUB_TOKEN, secrets are not passed to the runner when a workflow is triggered from a forked repository. The permissions for the GITHUB_TOKEN in forked repositories is read-only.
The easiest way to solve this is to switch to pull_request_target, but it comes with its own set of issues, see Keeping your GitHub Actions and workflows secure Part 1: Preventing pwn requests.
The most promising workflow is to use types: [labeled], which runs upon changing labels on the PR, and to create a "safe" (or similar) label, as external PRs have to be closely reviewed either way.
This is a tracking issue to keep tabs on the PR submitted to DefinitelyTyped to add definitions for the package:
On the 12th Stack Overflow election page, there is no link to the official chat room. This causes the scraper to try to find the room's URL in nomination posts leading to a semi-random incorrect room being picked up (Charcoal HQ, SOCVR, etc).
SERVER - failed to render home route: TypeError: Cannot read properties of undefined (reading 'stackexchange.com')
at BotConfig.get maintainerChatIds [as maintainerChatIds] (file:///app/dist/bot/config.js:115:22)
Also reduce dependency on this in case it is not set.
Following up on this comment, let's switch the command from relying on GitHub's contributor list to just using what's available in the contributors array from the package.json file.
As per discussion, we are moving away from the pseudo-random Math.random() to a dedicated package for getting random values. This is a tracking issue. Current candidates are:
Currently, the env vars ADMIN_IDS and DEV_IDS are for our accounts on Chat.SO.
Our ids are different on Chat.SE and Chat.Meta.SE, so it makes sense to check against a different set when the bot is in these domains.
However unlikely, we do not know if a fourth chat server may be spun up in the future, so consideration may need to be made for that as well.
Also need to consider support should the bot spawn child processes for different chat servers/rooms (#61)
E.g.: my Chat.SO id is 584192, while on Chat.SE is 86504
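One way this could look, as a sketch only. The "domain:id,id;domain:id" format and both function names are made up for illustration; nothing like this exists in the bot's config yet:

```javascript
/**
 * Parses a hypothetical per-domain id list,
 * e.g. "stackoverflow.com:584192;stackexchange.com:86504".
 * @param {string} raw value of the (hypothetical) env var
 * @returns {Map<string, Set<number>>} chat domain -> user ids
 */
function parseIdsByDomain(raw) {
    const byDomain = new Map();
    for (const entry of raw.split(";")) {
        const [domain, ids = ""] = entry.split(":").map((part) => part.trim());
        if (!domain) continue;
        const parsed = ids
            .split(",")
            .map(Number)
            .filter((id) => Number.isInteger(id) && id > 0); // drops empty and malformed ids
        byDomain.set(domain, new Set(parsed));
    }
    return byDomain;
}

/**
 * Checks an id against the set for the chat domain the bot is currently in.
 * Unknown domains (e.g. a future fourth chat server) simply have no admins.
 */
function isAdmin(byDomain, chatDomain, userId) {
    return byDomain.get(chatDomain)?.has(userId) ?? false;
}
```

Keying the lookup by chat domain means a new chat server only needs a new entry in the variable rather than a new variable.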
I've made you an RO of the network test room: https://chat.stackexchange.com/rooms/info/92073/?tab=access
this.start(); doesn't queue another scrape.
The bot scrapes election once on startup, one more scrape gets triggered 5 minutes after, and then no further ones occur.
Dev-only commands should not be shown to mods when using @ElectionBot commands, otherwise they will get confused when a command doesn't get a response.
I've already removed the keyword help from displaying the mod-only menu, as I intend this to display the user help menu.
Note to self:
se-electionbot/src/commands/commands.js
Lines 61 to 62 in 1e592d0
On startup, bot scrapes transcript and detects that the winners have already been announced.
Then we timetravel back to the end of the election and try to trigger the announce winners command.
This may be setting a local copy of BotConfig, so we are unable to trigger announce winners again if the election has already ended.
If it's a graduation election, the appointed mods have to run if they wish to continue being a mod.
We could add a new command to ask which ones are running, or which have not submitted their nomination (yet, if in the nomination phase, or are stepping down, if past the nomination phase).
https://chat.stackexchange.com/transcript/message/61292331#61292331
This is because we may not be using the election chat room (site default chat room instead), and will cause noise.
To resolve #70 in a way that does not cause us to manually go through each and every instance, we need a way to bulk-update environment variables of all heroku instances. Ideas:
/config server endpoint.
Originally reported: https://chat.stackexchange.com/transcript/message/60660156#60660156
Currently, the removal of the @-mention of the bot only happens at the start of the string.
However, it is not guaranteed that users always follow the @-mention first convention.
This sometimes leads to confusing matches as some of the regular expressions used are more permissive than others.
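A sketch of position-independent mention removal. stripBotMention is a hypothetical name, and it only handles the full display name (it does not try to cover shortened mentions):

```javascript
/**
 * Removes every @-mention of the bot from a message, wherever it appears.
 * @param {string} content raw message text
 * @param {string} botName chat display name of the bot, e.g. "ElectionBot"
 * @returns {string} message text without the mention, whitespace normalized
 */
function stripBotMention(content, botName) {
    // escape regex metacharacters in case the name contains any
    const escaped = botName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    // "@Name" at the start or after whitespace, with an optional trailing "," or ":"
    const mention = new RegExp(`(^|\\s)@${escaped}\\b[,:]?`, "gi");
    return content.replace(mention, "$1").replace(/\s+/g, " ").trim();
}
```

Stripping the mention before any of the matchers run means the more permissive regular expressions never see the bot's name as part of the question.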
E.g.: IDs of badges with the same name are different between the sites.
Stack Overflow named badges
https://api.stackexchange.com/docs/badges-by-name#pagesize=100&order=desc&sort=rank&filter=default&site=stackoverflow&run=true
Academia named badges
https://api.stackexchange.com/docs/badges-by-name#pagesize=100&order=desc&sort=rank&filter=default&site=academia&run=true
On bot start, we should call the API once to fetch named badges and update IDs of badges in electionBadges.
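A sketch of the update step, assuming the API call itself has already been made. The badge_id and name fields come from the Stack Exchange API badge object; the shape of electionBadges (objects with name and id) and the function name are assumptions:

```javascript
/**
 * Returns a copy of the badge list with ids replaced by the site-specific ones.
 * @param {{ name: string, id?: number }[]} electionBadges assumed shape of the bot's badge list
 * @param {{ badge_id: number, name: string }[]} apiItems "items" array of a /badges/name response
 * @returns {{ name: string, id?: number }[]} badges not found in the response are left unchanged
 */
function updateBadgeIds(electionBadges, apiItems) {
    const idsByName = new Map(apiItems.map((badge) => [badge.name, badge.badge_id]));
    return electionBadges.map((badge) =>
        idsByName.has(badge.name) ? { ...badge, id: idsByName.get(badge.name) } : badge
    );
}
```

Matching on the badge name keeps the hardcoded list usable as a fallback when the API call fails, since unmatched entries retain their existing ids.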
Something weird happened 2 days after the Tor election was cancelled: both dev and prod instances finally announced the cancellation (which is strange by itself), but not only that, the text contained HTML for some reason: https://chat.stackexchange.com/transcript/message/61312901#61312901
Apparently, this is because SE uses a different notice template when the election is cancelled in the nomination phase and is not a pro tempore election. Compare the notice to Joomla's election, which is scraped correctly:
And this is what we got on French, for example:
When the election ends, a ballot file becomes available - we should add responses to what it is and how to find it (even if in a form of an annotated link to OpaVote's website). Mostly a note to not forget to implement it until the end of the election phase (to have some time in advance - till October 22nd).