
dronefly's People

Contributors

dependabot[bot], ethankward, eward-sunbelt, jwcook, michaelpirrello, synrg

dronefly's Issues

iNat profile autoadd subcommand

The [p]inat profile autoadd subcommand (moderator or "Manage profiles" role) will watch the channel from which it is invoked for messages containing a profile link and:

  • if the user posting the link is not yet known to the bot, automatically add the user and
    • show a reaction button to allow them to remove it
    • also show a message advising them how to remove or update it with subcommands for those operations
[p]inat profile autoadd start [report_to]
[p]inat profile autoadd stop
  • Where [report_to], if specified, optionally reports each add to the specified channel.
    • If this needs to be changed, stop and then start again with that parameter specified, or else see additional ideas below.

e.g. on #introductions channel, a moderator types:

[p]inat profile autoadd start #inat-profiles

Furthermore, if autoadd with a report is in effect, then any adds, updates or removals (whether or not they were done in-channel) will be reported to the named channel.

Additional ideas for reporting (perhaps deserves a separate issue):

As moderator (or "Manage profiles" role):

[p]inat profile autoreport start [report_to] [subcommands]
[p]inat profile autoreport stop [report_to] [subcommands]

The idea is to allow reporting to be enabled/disabled independently of autoadd reporting. If [subcommands] are specified, reporting is started or stopped only for those subcommands (allowable values: add, update, remove).

iNat taxon: Include taxonomy in summary

From @mws on the unofficial iNaturalist Discord:

Hey @SyntheticBee I have another bot feature idea ,classification or something like that, which will say which kingdom, phylum, class, order, etc a taxon is in, probably keeping it to the normal groups, not the supers and subs. Should be fairly doable, if you look above the species name on this page you can see where it says every taxon that your selected taxon falls within:
https://www.inaturalist.org/taxa/54805-Fraxinus-americana

I like the idea. Rather than make it into a new command, and without omitting any levels, I think it could be compactly represented as a single line added to the existing taxon command, with certain key major levels of the tree highlighted in bold to help orient the user, the usual rule that genus and below are italicized, and the rank of the very last item in the tree named, e.g.

[p]inat taxon western conifer seed bug ->

Leptoglossus occidentalis (Western Conifer Seed Bug)
is a species with 6441 observations in:
Animalia > Arthropoda > Hexapoda > Insecta > Pterygota > Hemiptera > Heteroptera > Pentatomomorpha > Coreoidea > Coreidae > Coreinae > Anisoscelini > Genus Leptoglossus

And if the user needs a reminder as to what rank/common name are at any point in the tree, they can just do a new query with the named thing, e.g.

[p]inat taxon coreinae ->

Subfamily Coreinae
is a subfamily with 51885 observations in:
Animalia > Arthropoda > Hexapoda > Insecta > Pterygota > Hemiptera > Heteroptera > Pentatomomorpha > Coreoidea > Family Coreidae
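
A minimal sketch of how that single line could be assembled, assuming the ancestors arrive as (rank, name) pairs ordered from kingdom downwards. The function name, the choice of which ranks count as "major", and the use of Discord Markdown for bold and italics are all assumptions made for illustration, not the bot's actual code:

```python
from typing import List, Tuple

# Assumption: which ranks count as "key major levels" to show in bold.
MAJOR_RANKS = {"kingdom", "phylum", "class", "order", "family"}
# Ranks at genus level and below are italicized, per the usual convention.
ITALIC_RANKS = {"genus", "species", "subspecies", "variety", "form"}


def format_ancestry(ancestors: List[Tuple[str, str]]) -> str:
    """Join ancestor names with ' > ', naming the rank of the last one."""
    parts = []
    for index, (rank, name) in enumerate(ancestors):
        if rank in ITALIC_RANKS:
            text = f"*{name}*"
        elif rank in MAJOR_RANKS:
            text = f"**{name}**"
        else:
            text = name
        if index == len(ancestors) - 1:
            # Only the final item gets its rank spelled out.
            text = f"{rank.capitalize()} {text}"
        parts.append(text)
    return " > ".join(parts)


print(format_ancestry([
    ("kingdom", "Animalia"),
    ("subphylum", "Hexapoda"),
    ("genus", "Leptoglossus"),
]))
```

Minor ranks are left unformatted so that the bold levels stand out as landmarks in a long lineage.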

Less relevant match chosen when a more frequently observed candidate exists

Riviera (iNat Discord) typed this search:

[p]inat taxon na wi s gr

It returned "Melanoplus serrulatus (Nantahala Short-wing Grasshopper)"

But what this returns first on the web (and what Riviera expected) is "Narrow-winged saltbush grasshopper", i.e.

https://www.inaturalist.org/search?q=na%20wi%20s%20gr

Our understanding is that both common names should match, but the more frequently observed Narrow-winged saltbush grasshopper (27 observations to date, vs. 0 for M. serrulatus) should be returned first, all other things considered equal. That is, all 4 terms are found in the common name, so their base score should be the same, and # of observations should be the tie-breaker.
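
The expected ordering can be sketched as follows, assuming each candidate is a dict carrying the `preferred_common_name` and `observations_count` fields seen in the API results. The prefix-matching rule and both function names are hypothetical simplifications and do not describe how the iNaturalist search actually scores results:

```python
from typing import Dict, List


def count_matched_terms(name: str, terms: List[str]) -> int:
    """Count query terms that prefix-match some word of the common name."""
    words = name.lower().replace("-", " ").split()
    return sum(
        1
        for term in terms
        if any(word.startswith(term.lower()) for word in words)
    )


def best_match(candidates: List[Dict], query: str) -> Dict:
    """Pick the candidate with the most matched terms; break ties on
    the number of observations."""
    terms = query.split()
    return max(
        candidates,
        key=lambda c: (
            count_matched_terms(c["preferred_common_name"], terms),
            c["observations_count"],
        ),
    )


candidates = [
    {"preferred_common_name": "Nantahala Short-wing Grasshopper",
     "observations_count": 0},
    {"preferred_common_name": "Narrow-winged saltbush grasshopper",
     "observations_count": 27},
]
print(best_match(candidates, "na wi s gr")["preferred_common_name"])
```

Both names match all four terms, so the base score is equal and the observation count decides.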

taxon: support shorthand in query for multiple species in same genus

In the [p]map command, we now support a comma-delimited list of species. It would be good to formally support multiple species in the parser, and also a shorthand for multiple in the same genus, e.g. instead of:

[p]map Spilomyia sayi, Spilomyia fusca, Spilomyia longicornis

either of these should be equivalent:

[p]map Spilomyia sayi, S. fusca, S. longicornis
[p]map Spilomyia sayi, S fusca, S longicornis

and this should be subtly different, in that the Genus is plotted on the map separately, and then the 3 species:

[p]map Spilomyia, S sayi, S fusca, S longicornis

Because of autocapitalization and other usability difficulties on mobile devices, case should not be significant, i.e. this should be equivalent:

[p]map spilomyia, s sayi, s fusca, s longicornis

To avoid confusion, only expand an abbreviation when all of the following hold:

  • an earlier genus query appeared in the list
  • that query started with the same initial as the abbreviation
  • the text of that query matches the first letters of the genus it resolved to

So this would be valid following this rule:

[p]map spi in syrph, s sayi, s fusca, s longi

But this would fail on the 3rd point:

[p]map hawkweed, h venosum

The problem is that although "hawkweed" matches Hieracium, that they both start with "h" is accidental, and therefore not something that is reliable or dependable. Therefore, it's best not to make it seem like it is supported.

Currently there are only these two kinds of query:

  • SimpleQuery
    • is a query for a single taxon
  • CompoundQuery
    • is a query for two taxa: the main (child) query and ancestor query, which are performed ancestor first, and child 2nd

To this, we'd add a third kind:

  • ListQuery
    • one or more comma-delimited queries in a list
    • returning a list containing one taxon per successful query
    • if any are missing, normally a LookupError would be raised
      • though it would be up to the command using the ListQuery to decide how to implement this
      • i.e. it may in some contexts be helpful to just omit the non-matching query, though this is not the case with [p]inat map at this time; it fails the whole query

A ListQuery would be processed left-to-right, so that an earlier genus is available to later abbreviations. Any entries that are genus (see Note below) or lower have their genus put temporarily into a dict keyed by the 1st letter of the genus as the list is processed (provided the genus part of the query is a single term and all of its letters are the 1st letters of the matched genus). Then, if any subsequent query term starts with a single letter, the most recently matching genus would be substituted. If that fails, then the letter should simply be looked up in the usual way.
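
A minimal sketch of the expansion step, handling entries in list order so that an earlier genus can serve later abbreviations. The `resolve_genus` callable is a hypothetical stand-in for the real taxon lookup (it returns the matched genus name, or None), and treating the first term of each entry as the genus part is a simplifying assumption:

```python
from typing import Callable, Dict, List, Optional


def expand_genus_abbreviations(
    query: str, resolve_genus: Callable[[str], Optional[str]]
) -> List[str]:
    """Expand single-letter genus abbreviations in a comma-delimited query."""
    genus_by_initial: Dict[str, str] = {}
    expanded: List[str] = []
    for entry in query.split(","):
        terms = entry.split()
        if not terms:
            continue
        first = terms[0].rstrip(".").lower()
        if len(first) == 1 and len(terms) > 1 and first in genus_by_initial:
            # "S fusca" or "S. fusca": substitute the remembered genus.
            terms[0] = genus_by_initial[first]
        else:
            # Look the entry up in the usual way, and remember its genus only
            # if the typed text really is a prefix of the genus name.
            genus = resolve_genus(" ".join(terms))
            if genus and genus.lower().startswith(first):
                genus_by_initial[genus[0].lower()] = genus
        expanded.append(" ".join(terms))
    return expanded


# Hypothetical lookup table standing in for API results.
FAKE_LOOKUP = {"spilomyia": "Spilomyia", "hawkweed": "Hieracium"}
print(expand_genus_abbreviations(
    "spilomyia, s sayi, s fusca", lambda q: FAKE_LOOKUP.get(q.lower())
))
```

With this rule "hawkweed, h venosum" is left unexpanded, because "hawkweed" is not a prefix of Hieracium.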

Note: Hybrids are a tricky corner case. I guess it's possible to have a query like "Agenus epithet1 × A epithet2" or even, once the genus is named: "Agenus, A epithet1 x epithet2", or even "Agenus1 epithet1 × Bgenus2 epithet2", "A epithet3", "B epithet4". Ideally, any convention that is used in the literature should be supported, but we could start with the simple cases first and leave these until later.

That reminds me that this isn't the only case that should be supported. Trinomial names are another case to support in a list, e.g. "Agenus bepithet1, A b epithet2" should be supported.

taxon: Time out & remove embed for any error response

To avoid cluttering the channel with failed lookups, allow a timeout to be set on failed lookup responses or "I don't understand" (i.e. invalid syntax of query). When set (should be the default), after the specified # of seconds, the embed for the failure will be removed.

iNat: Auto-preview observation links.

The current inat link command has serious usability issues. I propose to solve this by supporting auto-previewing of observation links so that users can do what is most natural: simply paste a link to the channel, and the bot will respond with supplemental info about the auto-previewed image.

  • First, on the unofficial iNat Discord server, I estimate the majority of users don't know how to use the bot, so they won't even think to use the command. It is most natural to simply paste a link in a channel and let Discord do its auto-preview "thing". Unfortunately, those previews are missing a lot of information from iNat that the bot could've provided via the inat link command.
  • Second, and here's where the usability issues start, even those who do know about the link alias for the inat link command are either prone to forget to use it, or else put off by (or forget) the unnatural syntax that is required currently by the command to suppress Discord's own auto-preview of the image: [p]link <https://...>. This goes against the intuitive & natural syntax goal for the project.

One solution to suppressing the auto-preview by Discord is to simply disable auto-previews either across the whole server or per channel. However, our server doesn't want to do that, as Discord's auto-previews are useful for non-observation links. It would take a lot of extra work & put a lot of extra burden on the bot to have to get in the middle and handle all auto-previews itself, which would be the only way to make this approach acceptable for our server.

Therefore, I propose to have an on_message hook that, when enabled, auto-previews any observation links in the current channel via link. This would need to be coupled with a modification to the default behaviour of link, which should normally omit the image, since Discord will always provide it. The link command itself can be kept for these cases:

  • The on_message hook is not enabled for the current channel:
    • [p]link https://... (i.e. allow Discord to preview the image, and the command provides the rest)
  • The user wishes to display the preview but have neither Discord, nor the bot preview the image itself, i.e.
    • [p]link <https://...> (i.e. the current angle-bracket syntax should intuitively do what angle-brackets are supposed to do, which is to completely suppress any image preview!)
    • this would simply output the summary of the observation, but without the image

One refinement to this would be to support a variant of the link command that, despite having included the link in <...> to suppress the image preview, would nevertheless include the image preview, i.e.

  • The current behaviour of [p]link <https://...> is changed by the proposal above to normally omit the image preview.
  • Thus, a new subcommand would need to be introduced for previewing with this syntax, e.g. [p]inat preview <https://...> which makes it clear in the command name that even though Discord auto-previewing is suppressed by angle-brackets, the bot will handle previewing the image itself.

ebird: Checklist summary

Similar to iNat observation but for eBird checklists

  • Show observer name, date, time, location, checklist ID
  • Show checklist comment, protocol, and duration
  • Show number of species, number of photos and audio

user: User display & management commands

Provide a command group with subcommands to link a server member's Discord login & iNaturalist login. Initially, provide:

[p]inat useradd [discord-user] [inat-user]
[p]inat userdel [discord-user]
[p]inat usershow [discord-user]

The discord-user parameter uses the discord.User converter so that the username can be pasted without @, and the user is therefore not mentioned when the command is executed. That will be useful to us when adding users off-channel. Otherwise, specifying @ when we add them publicly on #inat-profiles is fine, as we want them to know they have been added, and the Discord client will help make sure the name is correctly specified. When done without @, the argument needs to be enclosed in double-quotes if it contains blanks.

The inat-user parameter should accept either a login, a user number, or a link to the user's profile. It should not accept a partial match on a name, as the chance for human error is too great.
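
A minimal sketch of such strict argument handling. The function name is hypothetical, and the profile-link pattern assumes links of the form https://www.inaturalist.org/people/<login-or-number> (with /users/ accepted as a variant), which should be checked against real profile URLs:

```python
import re
from typing import Tuple

# Assumed shape of a profile link; only the last path segment is kept.
PROFILE_LINK = re.compile(
    r"^https?://(?:www\.)?inaturalist\.(?:org|ca)/(?:people|users)/([^/\s?#]+)/?$",
    re.IGNORECASE,
)


def parse_inat_user(argument: str) -> Tuple[str, str]:
    """Return ("id", digits) or ("login", name); never a partial name match."""
    value = argument.strip().strip("<>")
    match = PROFILE_LINK.match(value)
    if match:
        value = match.group(1)
    elif not value or "/" in value or " " in value:
        raise ValueError(
            f"not a login, user number, or profile link: {argument!r}"
        )
    return ("id", value) if value.isdigit() else ("login", value)


print(parse_inat_user("https://www.inaturalist.org/people/synrg"))
print(parse_inat_user("12345"))
```

Anything containing blanks or an unrecognized URL is rejected outright rather than guessed at.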

ebird: Automatic rare bird alert

Automatically send a message when a species eBird considers rare is posted in a specified location

  • Include species comment from checklist
  • Link to checklist
  • Maybe if photos are included embed or link to them? Or at least indicate that they exist.
  • Restrict the time so it doesn't send a message if someone submits a historical checklist - maybe only do it if the checklist is from the past 2 weeks or something?

taxon: match taxon of last observation in channel

Extend the query language to include a last qualifier to match the taxon of any http{s}://{www.}inaturalist(.org|.ca)/observations/# URL recently mentioned in channel.

  • It could support by [user] to filter on the last observation mentioned by the specified Discord user (without @ so they're not pinged), or my last (equivalent to last by me) to reference the last observation mentioned by the requesting user.
  • Some qualifiers to skip over the most recent would be good, e.g. [ordinal] last to count backwards from the current one (2nd last, my 2nd last), or their / by them to skip the requesting user's own. Perhaps if an ordinal is given, the actual last keyword could be optional.
  • Support browsing higher in the taxonomy tree to link to an ancestor (e.g. [p]taxon last genus would show the genus matching the last observation, or [p]taxon last parent would show the immediate ancestor of the last observation).
  • Again, the keyword last might be omitted if there is nothing in the query but a rank that is an ancestor of the last observation (instead of the usual "I don't understand"), e.g. [p]taxon family where "of the last observation" is implied.
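
A minimal sketch of finding the Nth most recent observation link, assuming the channel history is available as a list of message strings ordered oldest to newest. The function name and the `ordinal` parameter are hypothetical, and per-user filtering is left out:

```python
import re
from typing import List, Optional

# Matches http or https, optional "www.", and either the .org or .ca domain.
OBSERVATION_URL = re.compile(
    r"https?://(?:www\.)?inaturalist\.(?:org|ca)/observations/(\d+)"
)


def last_observation_id(messages: List[str], ordinal: int = 1) -> Optional[int]:
    """Return the id of the Nth most recent observation link, or None."""
    found: List[int] = []
    for content in reversed(messages):  # newest message first
        matches = list(OBSERVATION_URL.finditer(content))
        for match in reversed(matches):  # last link in a message is newest
            found.append(int(match.group(1)))
    if 0 < ordinal <= len(found):
        return found[ordinal - 1]
    return None


history = [
    "see https://www.inaturalist.org/observations/111",
    "nice find!",
    "<https://inaturalist.ca/observations/222>",
]
print(last_observation_id(history), last_observation_id(history, ordinal=2))
```

Returning None when nothing matches lets the caller reply with a friendly message instead of raising.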

Consider whether last should include other taxon queries, e.g. someone types [p]taxon zono to match Zonotrichia and then [p]taxon last parent, or again with the optional last understood by context, [p]taxon parent, to show its immediate ancestor, family Passerellidae (New World Sparrows).

Add iNat taxa lookup command

A taxa lookup would be a nice 1st feature for #12, with initial support for a unique match (w. thumbnail). Later, add paged output of matching results when not uniquely matched.

See http://api.inaturalist.org/v1/docs/#!/Taxa/get_taxa and the by-id variant above it.

It was pointed out in #birds @ iNat Discord that the 4 letter shortcodes for birds are in iNat, so this is a twofer. Rather than coding something separately for that (e.g. see input_parsing in https://github.com/dfloer/discord-birdbot ), we could take advantage of that.

I envision:

[p]inat taxon bbpl

Since this uniquely matches https://inaturalist.ca/taxa/4892-Pluvialis-squatarola it would then respond with an embed that:

  • Includes the rank, name, & preferred common name.
  • Includes a link to the page on inaturalist.org.
  • Includes description, up to some reasonable maximum length (i.e. Source: Wikipedia, usually).
  • Includes a thumbnail of the cover image.
  • Includes some stats (e.g. total # of observations, etc.)

Similarly, when the binomial name or taxon_id is given, it should show just the one record, e.g.

[p]inat taxon pluvialis squatarola
[p]inat taxon 4892

Example API call & output:

$ curl -sL 'https://api.inaturalist.org/v1/taxa?q=bbpl' | python -m json.tool
{
    "total_results": 1,
    "page": 1,
    "per_page": 30,
    "results": [
        {
            "observations_count": 5304,
            "taxon_schemes_count": 10,
            "ancestry": "48460/1/2/355675/3/67561/4783/4888",
            "is_active": true,
            "flag_counts": {
                "unresolved": 0,
                "resolved": 1
            },
            "wikipedia_url": "http://en.wikipedia.org/wiki/Grey_plover",
            "current_synonymous_taxon_ids": null,
            "iconic_taxon_id": 3,
            "rank_level": 10,
            "taxon_changes_count": 0,
            "atlas_id": null,
            "complete_species_count": null,
            "parent_id": 4888,
            "complete_rank": "subspecies",
            "name": "Pluvialis squatarola",
            "rank": "species",
            "extinct": false,
            "id": 4892,
            "default_photo": {
                "square_url": "https://static.inaturalist.org/photos/10603462/square.jpg?1505927973",
                "attribution": "(c) TroyEcol, all rights reserved, uploaded by Declan Troy",
                "flags": [],
                "medium_url": "https://static.inaturalist.org/photos/10603462/medium.jpg?1505927973",
                "id": 10603462,
                "license_code": null,
                "original_dimensions": {
                    "width": 1024,
                    "height": 683
                },
                "url": "https://static.inaturalist.org/photos/10603462/square.jpg?1505927973"
            },
            "ancestor_ids": [
                48460,
                1,
                2,
                355675,
                3,
                67561,
                4783,
                4888,
                4892
            ],
            "matched_term": "BBPL",
            "iconic_taxon_name": "Aves",
            "preferred_common_name": "Black-bellied Plover"
        }
    ]
}
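
A minimal sketch of turning one record from the `results` array above into the text of a summary, using only fields that appear in that output. The function name and the id-only taxon URL are assumptions for illustration:

```python
def summarize_taxon(record: dict) -> str:
    """Build a short plain-text summary from one /v1/taxa result record."""
    common = record.get("preferred_common_name")
    title = f"{record['name']} ({common})" if common else record["name"]
    lines = [
        title,
        f"Rank: {record['rank']}",
        f"Observations: {record['observations_count']}",
        # Assumption: the taxon page can be addressed by id alone.
        f"https://www.inaturalist.org/taxa/{record['id']}",
    ]
    return "\n".join(lines)


# A subset of the record shown in the API output above.
record = {
    "id": 4892,
    "name": "Pluvialis squatarola",
    "rank": "species",
    "observations_count": 5304,
    "preferred_common_name": "Black-bellied Plover",
}
print(summarize_taxon(record))
```

The thumbnail would come from `default_photo.square_url` in the same record.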

Support for user-specific regional & personal stats would be a nice future enhancement, giving something that the web doesn't directly support (i.e. user needs to navigate to "Show yours" to see those). That would require that the user authorize the bot with an API token of their own.

taxon: include a list of child taxa sorted by # of observations

Related to #41, it would be nice to expand on that display and also include children of a taxon, up to some reasonable limit, with the same sort of "and # more" collapsing that the map command has for the overflow.

So, something like:

[p]taxon leptoglossus ->

Genus Leptoglossus
is a genus with 16260 observations in:

Animalia > Arthropoda > Hexapoda > Insecta > Pterygota > Hemiptera > Heteroptera > Pentatomomorpha > Coreoidea > Coreidae > Coreinae > Tribe Anisoscelini

with these most commonly observed species:

L. occidentalis (6441), L. phyllopus (2974), L. oppositus (1272), L. zonatus (1099), L. clypealis (722), L. gonagra (226), L. fulvicornis (149), L. corculus (48), L. lineosus (35), L. concolor (23), L. fasciolatus (17), L. brevirostris (12), and 50 more species with 9 or fewer observations.

The constraints on how many child taxa can be listed are:

  • 2048 = maximum length of the description
  • 1024 = maximum length of a field (could take "overflow" if we run out of room in the description)
  • 6000 = maximum length of the whole Discord embed

Note: the mockup above is already approaching 1000 characters without links to all of the listed taxa or their maps included, so this may indicate that it's a bad idea to add so many links. Consider leaving off the more "expensive" links to maps and/or the links to the child taxa themselves so we can include more child taxa within the constraints shown above.
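
A minimal sketch of fitting the child list into a character budget, assuming the children are available as (name, observation count) pairs. The function name and the exact wording of the overflow text are hypothetical. The budget defaults to the field limit listed above and can be set to whichever limit applies:

```python
from typing import List, Tuple


def list_children(children: List[Tuple[str, int]], budget: int = 1024) -> str:
    """List children, most observed first, until the budget would be exceeded."""
    ordered = sorted(children, key=lambda child: child[1], reverse=True)
    shown: List[str] = []
    for position, (name, count) in enumerate(ordered):
        remaining = len(ordered) - position - 1
        candidate = shown + [f"{name} ({count})"]
        text = ", ".join(candidate)
        if remaining:
            # Reserve room for the overflow note that would follow.
            text += f", and {remaining} more"
        if len(text) > budget:
            break
        shown = candidate
    hidden = len(ordered) - len(shown)
    result = ", ".join(shown)
    if hidden:
        result = f"{result}, and {hidden} more" if shown else f"{hidden} taxa"
    return result


children = [
    ("L. oppositus", 1272),
    ("L. occidentalis", 6441),
    ("L. zonatus", 1099),
]
print(list_children(children, budget=45))
```

Because the overflow note is counted before an item is accepted, the returned text stays within the budget whenever at least one item fits.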

It would be helpful to also provide another command (perhaps [p]inat next, aliased as [p]next or [p]n) and/or a reaction button menu to access the "inaccessible" children in the ellipsis "and 50 more". But that should be the topic of a new issue.

iNat map: Fix off-center & too far zoomed-in map.

Compare these two images. The first is a map for https://inaturalist.ca/taxa/460811-Eupeodes-americanus as produced by the inat map command, and the 2nd is a map at the same window size for the same species as found on the Map tab on the web:

image

image

Clearly our center point and zoom factor leave something to be desired, as none of the actual range of the species is in view. So, generally speaking, our algorithm produces a zoom factor that is too high, and picking the midpoint of the rectangle returned by get_observations() with return_bounds = true yields poor results when a species appears on multiple continents.

First, we should look at the code that generates the map for the web, and see if we can emulate what they do.

Second, if we're still not happy with the results a straight port of their code into ours gives us, we should discuss with the developers how to improve the algorithm, vs. coming up with something on our own.

iNat taxon parser: support unicode characters, e.g. times-sign for hybrids

In 1562ff6 I had to remove Unicode support from the parser because it bloated the bot's real memory footprint from 64M up to 610M!

See https://stackoverflow.com/questions/57517613/how-to-efficiently-parse-a-word-that-includes-the-majority-of-unicode-characters which doesn't focus on memory usage, but on poor execution time performance instead. Perhaps the method used there to avoid the inefficient construct can be used to re-add unicode support to the bot.

More limited support for Unicode might be added on a case by case basis for commonly used characters, but it would be a pain to add them one at a time.

inatcog/embeds.py: reuse common embed code in ebirdcog

Currently ebirdcog duplicates some common code in inatcog/embeds.py to make embeds. This should be shared code. Various options discussed on #coding on the Red - Discord Bot server:

  • Avoid sharedlibs as it really only works with downloader and is a pain to develop with, as hot-reload doesn't work, sharedlibs aren't used by many, and the future of sharedlibs is not at all certain as a result of these limitations.
  • Pip packages are an option, but again, are a bit of a pain to develop with, as it introduces additional releases to handle. Seems like a lot of extra hassle for something that likely won't get reuse outside this project.
    • These could be maintained outside of pypi. Jack> you can even install red without pypi: python -m pip install -U git+https://github.com/Cog-Creators/Red-DiscordBot@V3/develop#egg=Red-DiscordBot
  • Git submodules are another option.

The last option might be the least hassle and not too bad to work with. That's probably where to start.

Allow some cached use of API commands without special permissions

Queries that are unlikely to change much from one execution to the next & therefore can benefit from caching with a long retention period could be allowed for anyone, without prior authorization (i.e. accounted to the bot owner's API key for the service, or server/channel API key once that is implemented; see #10 for the latter).

For instance, looking up any region code (see #7), and performing the [p]ebird hybrids command for a given region (using the default # of days), should be allowed at least once a day provided each query is cached.

eBird: rewrite hybrids command as general purpose obs (observation) command

Simplify the hybrids command and make it more generally useful by rewriting it as an observation search command with arguments that directly correspond to those supported in the eBird API, e.g. [p]obs region=US-MA days=7 cat=hybrids, but perhaps supporting a few non-keyword arguments for ease of use, following a similar design as the [p]inat taxon command, which understands keyword and other uniquely identifiable arguments by context.

See also #20 which could be the basis of a command parser for easy to remember, expressive query statements that map behind the scenes to various eBird data/obs API calls.

Help user determine region by lookup against the eBird API

Rather than have the user carry around eBird region codes in their head, it would be good to provide a way that they can lookup regions using the API.

  • Given that an exhaustive list of regions is not very useful, provide commands to browse down to a reasonable subset of regions to choose from.
  • Also, assuming the regions don't change very often, cache the result for a fairly long time (a day, perhaps) to avoid unnecessary API calls.
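
The caching idea in the last bullet can be sketched as a small time-to-live cache. The class name and the injectable clock are assumptions for illustration, and the `fetch` callable stands in for the real eBird API call:

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Keep fetched values for a fixed number of seconds."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling fetch() only when stale."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value


# One day of retention, as suggested above; the lambda is a placeholder fetch.
regions = TTLCache(ttl_seconds=24 * 60 * 60)
print(regions.get_or_fetch("US", lambda: ["US-MA", "US-NY"]))
```

Passing the clock in makes expiry easy to test without waiting a day.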

e.g.

[p]ebird regions <region>

Where the help would describe some common region values, and [p]ebird regions would list regions that are direct descendants of that region (but not recurse any further), paging output if necessary.

Unchecked access to guild config breaks in DM to the bot

If you DM the bot a request that causes an unchecked access to a guild config, as in the new feature added to the inat obs command that places emojis for certain project IDs on the observation title line, then an exception is raised:

[2019-11-03 19:54:12] [ERROR] red: Exception in command 'inat obs'
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\discord\ext\commands\core.py", line 79, in wrapped
    ret = await coro(*args, **kwargs)
  File "C:\Users\Ben\work\quaggagriff\inatcog\inatcog.py", line 145, in obs
    embed=await self.make_obs_embed(ctx, obs, url, preview=False)
  File "C:\Users\Ben\work\quaggagriff\inatcog\inat_embeds.py", line 77, in make_obs_embed
    project_emojis = await self.config.guild(ctx.guild).project_emojis()
  File "C:\Python37\lib\site-packages\redbot\core\config.py", line 881, in guild
    return self._get_base_group(self.GUILD, str(guild.id))
AttributeError: 'NoneType' object has no attribute 'id'
  1. There needs to be a global config if this feature is also supposed to work in DM to the bot.
  2. The guild config should not be checked in a direct message.
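
The second point can be sketched as a guard placed in front of the guild config access. The helper name is hypothetical, and `FakeConfig` / `FakeGroup` are stand-ins that only mimic the `config.guild(...).project_emojis()` call seen in the traceback; this is not Red's actual Config class:

```python
import asyncio


async def get_project_emojis(config, guild):
    """Return the guild's project emojis, or an empty dict in a DM."""
    if guild is None:
        # In a direct message there is no guild, so skip the guild config.
        return {}
    return await config.guild(guild).project_emojis()


# Hypothetical stand-ins for the config objects, only to exercise the guard.
class FakeGroup:
    async def project_emojis(self):
        return {"1234": ":bee:"}


class FakeConfig:
    def guild(self, guild):
        return FakeGroup()


print(asyncio.run(get_project_emojis(FakeConfig(), None)))
```

If the feature should also work in DMs, the empty dict could be replaced by a read from a global config, per the first point.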

Ben

Add iNaturalist API cog

Patterned after ebirdcog, an iNaturalist cog would provide a useful selection of commands for iNat users. The initial focus should be on simple things that are limited in their use of iNat API resources, but are more cumbersome to do on the iNat platform. Feel free to leave suggestions in the comments on this issue.

iNat: Rewrite taxon command argument parsing as formal DSL

Rewrite the taxon command argument parsing that is currently done with custom code as a formal domain-specific language (DSL) implemented with pyparsing.

Since it will take some time to implement all my ideas, and it's moderately difficult just learning how pyparsing works, these will be worked on in the topic branch taxon-pyparsing.

Background reading:

Currently supported language elements & how we use them are:

  • rank keywords (species, genus, family, etc.)
  • a whole query consisting of digits indicates lookup by taxon id
  • the whole query can be double-quoted for an exact phrase match (i.e. ignore any results that don't literally contain all of the words in the query, in the order given)
  • if the query is 4 characters long and the results include a 4-character long matched term that is uppercase, it is automatically considered the "best match" (i.e. matched a code, as in WTSP for White-Throated Sparrow)
  • implicit "and" between terms
    • not actually implemented by the taxon command; the API treats it this way by default

The effect these things have on a query depends on what is supported directly by the API. Some of these things merely map from the natural syntax we want to support for the end-user to the arguments required by one or more API commands to retrieve a result set. Others influence how the results are used after they are retrieved.

  • in the first category:
    • rank keywords, if specified, are appended together into a comma-delimited list to be passed to the /taxa endpoint as a rank argument, alongside the q argument which contains everything else
    • digits cause a lookup by taxon id, which actually switches to a different API endpoint (and the response is different, too, but we select a subset of fields from either API call that includes everything we need)
  • in the second category is the phrase matching, where it is either:
    • implicit: i.e. we break from the usual "first hit is the best match" that the API provides & rank the first result that exactly matches the phrase as the "best match" instead
    • explicit: i.e. when they put the phrase in double-quotes, they have indicated they're not interested in any results that don't match, so we discard those results
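
The first-category mapping described above can be sketched without pyparsing as a plain function. The function name, the list of rank keywords, and the returned tuple shape are assumptions for illustration, and the by-id path assumes the /v1/taxa/{id} form of the by-id endpoint:

```python
from typing import Dict, Tuple

# Assumption: a representative subset of the rank keywords.
RANK_KEYWORDS = {
    "kingdom", "phylum", "class", "order",
    "family", "genus", "species", "subspecies",
}


def build_taxa_request(query: str) -> Tuple[str, Dict[str, str], bool]:
    """Map a raw query to (endpoint path, params, exact-phrase flag)."""
    text = query.strip()
    if text.isdigit():
        # All digits: look up by taxon id on the by-id endpoint.
        return (f"/v1/taxa/{text}", {}, False)
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        # Double-quoted: exact phrase; results are filtered after retrieval.
        return ("/v1/taxa", {"q": text[1:-1]}, True)
    ranks, terms = [], []
    for word in text.split():
        (ranks if word.lower() in RANK_KEYWORDS else terms).append(word)
    params = {"q": " ".join(terms)}
    if ranks:
        params["rank"] = ",".join(rank.lower() for rank in ranks)
    return ("/v1/taxa", params, False)


print(build_taxa_request("genus species pluvialis"))
print(build_taxa_request("4892"))
```

The exact-phrase flag belongs to the second category: it does not change the request, only how the results are used afterwards.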

Pyparsing redesign of the above:

  • tbd

iNat: Perform 'last' via reaction button on the message

Rather than trust to someone's memory (or eyeballs) that last would match the correct thing, provide reaction buttons on the observation to:

  • recall the observation in channel
  • alternatively, call another sub-command on the observation (like taxon to show the taxon embed for it)

Here are some notes from discussion with @Drapersniper on the Red DiscordBot server, channel #coding:

  • listen for on_reaction_add / on_reaction_remove
  • take care not to do anything expensive in any on_ hooks, i.e. config reads are fine, but config writes are expensive because "json writes the whole file content from memory to file on every write ... if the size is only like 1-2 mb you should be fine" - @Drapersniper
    • The way I usually get around that is to have an in memory cache.
    • See: PredaaA/predacogs@517dad5
    • This listens for most on_x and it is used on the largest Red bot without any performance impact.
    • *_save_counters_to_config writes to file.
    • _clean_up cleans the cache and does a final write to file on cog reload/shutdown.
    • Note that this commit was the very first. There are 3 minor ones that are followups to this one. I don't think they affect you but just an fyi.

This idea effectively kills the last command. So far, nobody seems to be using it, so this is fine. It's far more reliable and natural on Discord to cause things to happen by directly interacting with messages rather than typing new commands to act on prior messages.

eBird: report recent intergrade observations for region

Add intergrade observations to the [p]hybrids command.

  • The API makes it seem as though you could specify cat=hybrid,intergrade, but that only returns hybrid results.
  • It would be necessary, based on this test result, to do two searches, one for cat=hybrid, and another for cat=intergrade and combine the results.
  • If the results are going to be mixed, it's probably not a good idea to drop (hybrid) from the name, which was thought to be redundant, but now would serve to distinguish those names from the intergrade results, which can have various parenthetical designations in their names, too.
  • As well, it would be good to check with the eBird devs to see if cat=hybrid,intergrade ought to work and is broken, or if the docs can be improved to make it clear that only one category can be specified per query, as the way it is currently written (in the plural) makes it seem like it supports more than one.
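
The two-search approach can be sketched as below. The `fetch` callable is a hypothetical stand-in for one eBird query per category, and sorting on an `obsDt` field assumes each observation carries its date under that key:

```python
from typing import Callable, Dict, List


def recent_hybrids_and_intergrades(
    fetch: Callable[[str], List[Dict]]
) -> List[Dict]:
    """Run one search per category and merge the results, newest first."""
    combined: List[Dict] = []
    for category in ("hybrid", "intergrade"):
        for obs in fetch(category):
            # Tag each result so the two kinds stay distinguishable.
            combined.append({**obs, "category": category})
    combined.sort(key=lambda obs: obs.get("obsDt", ""), reverse=True)
    return combined


# Hypothetical canned results in place of real API calls.
FAKE_RESULTS = {
    "hybrid": [{"comName": "A x B (hybrid)", "obsDt": "2019-11-01 08:00"}],
    "intergrade": [{"comName": "C (intergrade)", "obsDt": "2019-11-02 09:00"}],
}
for obs in recent_hybrids_and_intergrades(lambda cat: FAKE_RESULTS[cat]):
    print(obs["category"], obs["comName"])
```

Keeping the category on each record is one way to distinguish the names if the parenthetical "(hybrid)" were ever dropped.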

iNat taxon: Fix unexpected matches when filters provided

Some searches fail miserably with the /v1/taxa endpoint, and we have observed the /v1/taxa/autocomplete endpoint does much better. For example, we expect a search for snow to match the AOU code SNOW, but it fails. However, the /v1/taxa/autocomplete endpoint is not without its limitations:

  • no more than 30 results can be requested
  • no rank & taxon_id filters

Therefore, to make [p]inat taxon snow match the expected result "Snowy owl" (AOU code SNOW), the proposal is to switch to /v1/taxa/autocomplete, but only when neither taxon_id nor rank are provided as filters. The idea is if this fails to match in the first 30 results, they have simply failed to provide enough characters to make a unique match.

Unfortunately, this means that if they were overly specific, e.g. species SNOW, or SNOW in birds, then even if we crank up the # of results on the /v1/taxa fallback API call, it will still match the wrong result because the AOU code doesn't turn up as the "best match" using that call. I'm not sure why this asymmetry exists, but I can just advise users that, for best results with AOU codes, the code should be the only thing they specify. Update: See in my comment below that the rank filter is now fixed by post-filtering the autocomplete results (though that leaves filtering on taxon_id unsolved).
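
A minimal sketch of the approach as described in the update: use autocomplete unless a taxon_id filter is present, then apply the rank filter to the returned results. Both function names are hypothetical, and the sketch assumes each result carries a lowercase `rank` field:

```python
from typing import Dict, List, Optional, Sequence


def choose_taxa_endpoint(taxon_id: Optional[int] = None) -> str:
    """Fall back to /v1/taxa only when a taxon_id filter is needed."""
    if taxon_id is not None:
        return "/v1/taxa"
    return "/v1/taxa/autocomplete"


def post_filter_by_rank(
    results: List[Dict], ranks: Sequence[str]
) -> List[Dict]:
    """Keep only results whose rank is one of the requested ranks."""
    if not ranks:
        return list(results)
    wanted = {rank.lower() for rank in ranks}
    return [result for result in results if result.get("rank") in wanted]


# Hypothetical autocomplete results for illustration.
results = [
    {"name": "Bubo scandiacus", "rank": "species"},
    {"name": "Bubo", "rank": "genus"},
]
print(choose_taxa_endpoint())
print(post_filter_by_rank(results, ["species"]))
```

Since autocomplete returns at most 30 results, the post-filter can only keep what already appeared in that first page.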

iNat profile integration (profile subcommand)

Design:

To display (anyone):

[p]inat profile [user]
[p]inat profile
  • If [user] is a Discord user with an iNat profile known to the bot, an embed with a link to their profile is shown. Any roles assigned on Discord are also shown.
  • If [user] is not known to the bot, look it up as an iNat user id in the API and show the profile if found.
  • If no [user] is specified, shows your own profile if your profile is known to the bot. Otherwise, show help.

Modify (self):

[p]inat profile add [inat_user]
[p]inat profile remove
[p]inat profile update [inat_user]
  • Where [inat_user] is either the iNat user id or a link to their iNat profile.
  • If an add is done and the user is already known, provide reaction button to accept overwriting the original value.
  • When user adds themself, give the user instructions to remove their profile with [p]inat profile remove.
  • The [p]inat profile update command is like add, except that it will not update a user that is not known (it sends a message with a tip to use [p]inat profile add instead), and it will not prompt to overwrite the original value.

Modify (moderator or "Manage profiles" role):

[p]inat profile add [discord_user] [inat_user]
[p]inat profile remove [user]
[p]inat profile update [user] [inat_user]
  • As for self, except a [discord_user] or [user] argument is required.
    • If [user] matches a Discord user known to the bot, it will be selected for the operation.
    • Otherwise, if [user] matches an iNat user known to the bot, it will be selected for the operation.

ebird: Species search/summary

Search for a species that brings up a summary of the species from its species page (e.g. https://ebird.org/canada/species/amerob)

  • Display common and scientific name
  • Link to species page
  • Thumbnail of first image for the species
  • Description of species if available
  • Number of observations/with photos/with audio
  • Thumbnail of range map? Not sure if that's possible. Link to range map or prompt range map command.

Even missing half of these it could be useful.

iNat last: handle no results gracefully

Spotted in the log from a new server using the bot: if you are on a channel where [p]inat last has no matching messages in its history, it fails with an error to the user and a traceback in the log like:

[2019-12-11 08:21:07] [ERROR] red: Exception in command 'inat last'
Traceback (most recent call last):
  File "/home/synrg/.local/share/Appledore/cogs/CogManager/cogs/inatcog/last.py", line 43, in get_last_obs_msg
    m for m in msgs if not m.author.bot and re.search(PAT_OBS_LINK, m.content)
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/synrg/.local/lib/python3.7/site-packages/discord/ext/commands/core.py", line 79, in wrapped
    ret = await coro(*args, **kwargs)
  File "/home/synrg/.local/share/Appledore/cogs/CogManager/cogs/inatcog/inatcog.py", line 156, in last
    last = await inat_link_msg.get_last_obs_msg(msgs)
RuntimeError: coroutine raised StopIteration

Clearly, [p]inat last just has to catch StopIteration & gracefully return Not found. Should be an easy fix.
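One way to avoid the StopIteration, sketched under assumptions: the Author and Message classes below are simplified stand-ins for the Discord message objects, and PAT_OBS_LINK is a placeholder pattern, not the cog's actual one. Passing a default to next() returns None instead of raising, so the caller can reply with "Not found".

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional

# Placeholder pattern; the real PAT_OBS_LINK in the cog may differ.
PAT_OBS_LINK = re.compile(r"https?://(?:www\.)?inaturalist\.org/observations/\d+")


@dataclass
class Author:
    bot: bool


@dataclass
class Message:
    author: Author
    content: str


def get_last_obs_msg(msgs: Iterable[Message]) -> Optional[Message]:
    """Return the first non-bot message with an observation link, or None."""
    return next(
        (m for m in msgs if not m.author.bot and PAT_OBS_LINK.search(m.content)),
        None,  # default: no StopIteration when nothing matches
    )
```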

taxon: support browse matching taxon descendants, siblings, & ancestors

This is a big one that likely needs to be split into much smaller chunks and done one feature at a time. Ideally:

SyntheticBee: So it would be cool if my taxon finder could become a taxonomy browser.
To accomplish this, I'd need a few menu buttons, and I think I'd also like to be able to have the user be able to type to change what the embed shows. Is there some cog that does this kind of interaction I could use as a model?
Take the sparrow above, by an up-arrow action button, you could navigate back up the taxonomy tree, but say you wanted to match the children of a node you're at, you'd need the user to be able to type something to accomplish that, and then the embed could be updated to (or replaced with?) a taxon that matches what was typed, within the taxon currently shown.
Left and right buttons could flip through taxa at the level you're at, and constraints could be placed so that you're not cycling through hundreds of those. For example [p]taxon zono browse nearby to browse species (descendants of 'Zonotrichia') starting at the zonotrichia genus, and nearby indicates a place filter based on the user's location previously made known to the bot.
Ferret: check out the menu utility https://red-discordbot.readthedocs.io/en/latest/framework_utils.html#module-redbot.core.utils.menus
there's also event predicates if you want a reply from the user

inat taxon: Show alternate names when matched

When an alternate name is matched, show the name and where it came from in the response, e.g.

[p]inat taxon subspecies common teal

Currently this returns the subspecies "Anas crecca crecca (Eurasian Green-winged Teal)" and the API result for that entry has matched_term: "Eurasian Green-winged Teal", i.e. it doesn't contain the word "common".

Why it matches is evident when you go to the web page for it: https://www.inaturalist.org/taxa/132873

In the taxon names section there is a line for "English" "Common Teal".

Since there is no API call for this, the name that was matched can only be obtained by either scraping the page or else using the same unpublished web endpoint that it does:

https://www.inaturalist.org/taxon_names.json?per_page=200&taxon_id=132873

Since by this point the command has already narrowed it down to a single record, and we'd only need this if the "best match" is against a name that isn't retrieved from the /v1/taxa API call (i.e. corner cases), it is justifiable to attempt this call, and if it succeeds, use the first matching name as the "Matched:" value in the response.
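A sketch of picking the "Matched:" value once the name list has been fetched. It assumes the unpublished endpoint returns a list of objects with a "name" key, which is unverified; find_matched_name is a hypothetical helper and does no network access itself.

```python
from typing import Iterable, Optional


def find_matched_name(query: str, taxon_names: Iterable[dict]) -> Optional[str]:
    """Return the first name containing every word of the query, ignoring case.

    taxon_names is assumed to be a list of dicts with a "name" key.
    """
    words = query.lower().split()
    for entry in taxon_names:
        name = entry.get("name", "")
        lowered = name.lower()
        if all(word in lowered for word in words):
            return name
    return None
```

For the example above, a query of common teal would skip "Eurasian Green-winged Teal" and return "Common Teal" if that name is present in the list.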

inat: User activity / project membership indicators

In order to reconcile the iNat Discord project membership lists with current guild membership, provide indicators for the [p]inat usershow and [p]inat userlist commands for:

  • user is a member of yearlist project(s)
  • user is/was active for the given year where "active" currently means "said something other than on #introductions"

Use a compact fixed-width layout for the listing so that it can be information-dense for rapid review. Don't worry about looking good anywhere but on the desktop.

map: Support shorthand for output of lists of names

Similar to #32 , when outputting a map for a long list of names of species, any repeatedly mentioned genera should be abbreviated in the output.

e.g. currently, a query like this results in a truncated title so that it fits within the 256 character limit for Discord embed titles:

/map south leopard frog, north leopard frog, rio leopard frog, plains leopard frog,
lowland leopard frog, chiri leopard frog, atlantic leopard frog, relict leopard frog

=>

Range map for Lithobates sphenocephalus (Southern Leopard Frog), Lithobates
pipiens (Northern Leopard Frog), Lithobates berlandieri (Rio Grande Leopard Frog),
Lithobates blairi (Plains Leopard Frog), and 4 more

ideally that should be:

Range map for Lithobates sphenocephalus (Southern Leopard Frog), L. pipiens
(Northern Leopard Frog), L. berlandieri (Rio Grande Leopard Frog), L. blairi (Plains
Leopard Frog), and 4 more

And because this trims out 8 characters per "Lithobates" => "L." times 3, there are 24 more characters available in the line, so it would appear at least one more "L. epithet" could then fit in the title, if not more.
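The abbreviation rule could be sketched as follows. abbreviate_genera is a hypothetical helper working on scientific names only; it does not guard against two different genera that share an initial, which a real implementation would need to consider.

```python
from typing import Iterable, List


def abbreviate_genera(names: Iterable[str]) -> List[str]:
    """Abbreviate a genus to its initial on every mention after the first."""
    seen = set()
    result = []
    for name in names:
        genus, _, rest = name.partition(" ")
        if rest and genus in seen:
            # Repeat mention: "Lithobates pipiens" -> "L. pipiens"
            result.append(f"{genus[0]}. {rest}")
        else:
            seen.add(genus)
            result.append(name)
    return result
```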

iNat: provide formatter to name one or more of something

Given an input like pluralize('genus', 2), output a string with plural ending applied, using the format string specified (or default as indicated), e.g. -> "2 genera" in this case.

  • 1st positional argument is the name of the thing to pluralize
  • the 2nd positional argument is a num or bool, defaulting to True
    • if num, it is used to determine which ending (1 is singular, anything else is plural)
    • if bool (or coerced to bool), True means pluralize and False means don't
  • the 3rd positional argument is a format string, defaulting to "%d %s" for num or "%s" for bool
  • any other positional or keyword args after the format string should be taken as arguments to the format string

Capitalization, if provided, should be kept intact, e.g. pluralize('Genus') -> "Genera".

See #42 which will need this formatter when describing the enumeration of child ranks, e.g.

...
rank = "species"
overflow = 50
return pluralize(
    rank,
    overflow,
    " and %d more %s",
)

-> "and 50 more species"

rank = "genus"
overflow = 1

-> "and 1 more genus"

rank = "genus"
overflow = 2

-> "and 2 more genera"
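A minimal sketch of such a formatter, under assumptions: the irregular plural table is illustrative and incomplete, only the initial capital is preserved, and keyword arguments to the format string are left out because %-style formatting cannot mix them with positional values.

```python
# Illustrative irregular plurals for taxonomic ranks; not an exhaustive list.
IRREGULAR = {
    "genus": "genera",
    "species": "species",
    "phylum": "phyla",
    "family": "families",
    "class": "classes",
}


def _plural_form(word: str) -> str:
    lower = word.lower()
    plural = IRREGULAR.get(lower, lower + "s")
    # Keep a leading capital if the caller provided one.
    if word[:1].isupper():
        plural = plural[:1].upper() + plural[1:]
    return plural


def pluralize(name, count=True, fmt=None, *args):
    """Format name as singular or plural according to count (a number or bool)."""
    # bool is a subclass of int, so it must be checked first.
    if isinstance(count, bool):
        word = _plural_form(name) if count else name
        return (fmt or "%s") % ((word,) + args)
    word = name if count == 1 else _plural_form(name)
    return (fmt or "%d %s") % ((count, word) + args)
```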

Allow configuration at multiple levels

The initial command, ebird hybrids, can only be configured with one region & days setting at a time. This makes it less useful for reporting on one or more channels+servers which together may have users interested in a number of different areas.

Defaults/overrides for such config items should be supported on the command itself, for a channel, for a guild, and finally, globally.
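The override order could be sketched as a simple first-match lookup. The plain dicts here are hypothetical stand-ins for however the settings end up being stored, and resolve_setting is an illustrative name.

```python
from typing import Any, Mapping, Optional


def resolve_setting(key: str, *levels: Optional[Mapping[str, Any]], default: Any = None) -> Any:
    """Return the first value found for key, checking the most specific level first.

    Pass levels in priority order: command, channel, guild, global.
    """
    for level in levels:
        if level and level.get(key) is not None:
            return level[key]
    return default
```

For example, a days value given on the command would win over the channel's, which would win over the guild's, with the global value used last.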

related: Determine relatedness of list of taxa

From #photosynthetics on Nov. 24, 2019, 14:44 UTC (10:44 AST):

[10:44 AM] michaelpirrello: @SyntheticBee does the bot have the ability, given a command and two inputs, to find the lowest common point of taxonomy? For instance, given the two taxa above, to ID Asteraceae as the last common point?
[10:44 AM] SyntheticBee: no, but that's a good idea.
[10:49 AM] SyntheticBee: e.g. ,related taxon1,taxon2,taxon3 => "Sci1 (common1), Sci2 (common2), and Sci3 (common3) belong to the same : Taxon"
[10:50 AM] michaelpirrello: yep, that'd be awesome, hadn't thought about the application to n>2, but sure, no reason to limit the number of inputs
[10:52 AM] SyntheticBee: or "... belong to the same : Taxon1, and Sci2 (common2) and Sci3 (common3) are more closely related, belonging to the same : Taxon2."
[10:52 AM] SyntheticBee: because I can anticipate queries where it is thought there is common ancestor in the set but one or more of them actually aren't related
[10:54 AM] SyntheticBee: also: ,related family taxon1,taxon2 could constrain the relatedness search to family level relatedness. this is true or false. if it's false, it should say the family of each one that does not share the same family.
[10:55 AM] SyntheticBee: or more generally speaking, it should group the taxa into families, stating first the related groups, and then the loners.
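The core of the idea, finding the last common point of taxonomy, can be sketched as a longest-common-prefix over lineages. This assumes each taxon's ancestry is available as a list ordered from the root down to the taxon itself; how that list is obtained is left open, and lowest_common_ancestor is a hypothetical name.

```python
from typing import Hashable, List, Optional, Sequence


def lowest_common_ancestor(lineages: Sequence[List[Hashable]]) -> Optional[Hashable]:
    """Return the deepest entry shared by all lineages, or None if they share none.

    Each lineage is ordered from the root down to the taxon itself.
    """
    common = None
    for level in zip(*lineages):
        if all(entry == level[0] for entry in level):
            common = level[0]
        else:
            break
    return common
```

The same function works for n > 2 inputs; grouping the inputs into related sets and loners would build on it by comparing subsets.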

inatcog: provide map image in command response

For some commands, it may make sense to provide a map image in the command response as well as a link to the web for closer inspection. A map directly output in channel could convey the information more efficiently to a group of people than having each person click the link.

While Discord has no direct support for Google maps, here's a plan to support it:

  • Provide a separate inatmaps cog that enhances existing inat commands with map images.
  • This cog can use the python selenium package and chromedriver to get the map images.
  • I tried a proof of principle exercise in the python interpreter and it's not much code at all to get the screenshot. Performance, even on my tiny Atom-based home server, was acceptable.
  • Once the screenshot is taken, it just needs to be uploaded so the Discord embed can use it, also not difficult code.

See also #18 which could benefit from this.

Support command permissions & credentials at multiple levels

Because API calls consume resources accounted to the account owning the key, restricting access to API-using commands is a must. One way is to allow only trustworthy users to use them. Another is to allow arbitrary users to use them, but only if they provide their own credentials. Decide on the appropriate granularity of command permissions (i.e. global, server, channel, user) and credentials storage, then implement the two together.

eBird: support exclusion of specific results

Support filtering some results from reports (e.g. the [p]ebird hybrids report). Some users have noted that it is boring to see MALL x ABDU hybrid results every day, and they would rather see those dropped. To support this, it would need to be made general enough to be useful in other contexts. For example, when #21 is done, it could be useful to extend the grammar with except to exclude specific results, e.g. [p]ebird obs US-MA 7 hybrid intergrade except x00004.
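A sketch of how the except keyword might be handled. Both helpers are hypothetical, and the "speciesCode" key on each observation record is an assumption about the result layout.

```python
from typing import Iterable, List, Set, Tuple


def split_exclusions(args: List[str]) -> Tuple[List[str], Set[str]]:
    """Split command arguments into the query part and the codes after 'except'."""
    if "except" in args:
        i = args.index("except")
        return args[:i], set(args[i + 1:])
    return list(args), set()


def drop_excluded(observations: Iterable[dict], excluded: Set[str]) -> List[dict]:
    """Remove observations whose species code was excluded.

    Assumes each record has a "speciesCode" key (hypothetical layout).
    """
    return [o for o in observations if o.get("speciesCode") not in excluded]
```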

ebird: Display most recent sighting for a species for a location

Requested by upupa-epops on eBird Discord #bot-commands:

Dec. 4, 2019

[12:49 AM AST] upupa-epops: Bring up the latest report of a specific species at a certain location
[12:49 AM AST] upupa-epops: that would be super useful

  1. needs to understand what "a species" is (e.g. like inatcog's "taxon" command)
  2. needs to understand what "a location" is (no support for this in inatcog yet, but it needs something too; the ebirdcog 'hybrids' command understands only codes, and not text queries; ideally both cogs would handle text queries as well as codes)
  3. needs to do the lookup on species & lookup on location, then do the lookup on most recent for species at the location
  4. needs to format an embed that contains the results

inatcog: Command to provide a map for 2 specified taxa

From mws on iNat discord:

mws: Hey SyntheticBee, it would be cool if Appledore could generate a range map of two
     or more species like this one:
     https://www.inaturalist.org/taxa/map?taxa=24255,24267#7/43.469/-82.442
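Building such a link from taxon ids is straightforward. The sketch below follows the URL shape in the example above and leaves out the optional view fragment after the #; taxa_map_url is a hypothetical helper name.

```python
from typing import Iterable


def taxa_map_url(taxon_ids: Iterable[int]) -> str:
    """Build a range map link for one or more taxon ids."""
    ids = ",".join(str(taxon_id) for taxon_id in taxon_ids)
    return f"https://www.inaturalist.org/taxa/map?taxa={ids}"
```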

last: Rewrite as interactions stored in config

Instead of plowing through channel histories to search for links & commands, record bot interactions in a config-backed history which can then be retrieved by the user in a less wordy fashion:

  • replacing grepping the history for bot commands:
    • see current ,last obs recently added corner case: PAT_OBS_LINK matches obs #
      • I'm unhappy with the sloppiness of the pattern match here
  • replacing grepping the history for links:
    • currently looks all over the place (user commands, the bot responses, digging into the embeds! ugh.)
  • introducing monitoring of messages for interesting things, e.g.
    • the long planned on_message hook to notice when links are pasted in channel including:
      • user profile links
      • observation links
      • taxon links
    • and then dispatching to the bot the job of previewing these as needed
  • and also introducing editing of auto-previewed messages to insert the iNat-specific summaries at the top
  • clearing up the mess that is ,link vs. ,obs (confusing to have two commands doing the same thing! only doing this because ,link nicely formats the message in a single embed)
  • and introducing reasonable default behaviours for certain command syntaxes that either are invalid or else just show the help, e.g.
    • [p]obs -> should show most recent observation
      • replacement for [p]last obs
    • [p]map -> should show the map for the most recent observation or taxon
      • replacement for [p]last obs taxon and [p]last taxon. we don't need both, as the most recent bot interaction of either kind should take precedence.
    • [p]taxon -> should show taxon for most recent observation
    • [p]family -> should show family for most recent observation or taxon
