
Comments (7)

mnalis commented on June 2, 2024

Interestingly, when typing categories on https://commons.wikimedia.org/wiki/Special:UploadWizard, steps 2a and 3a also find only one match, just like the Commons app. However, step 1a actually works there (i.e. it produces the same result as 4a!).

Could the Commons app use the same API that Special:UploadWizard uses for category search?

from apps-android-commons.

mnalis commented on June 2, 2024

Seems to be related to #3179, but it is not a complete duplicate. Here is another example, which shows a single-word search term in the correct case not finding the multi-word match:
small_Screen_Recording_20240505_024119_Commons.mp4

and log:
CommonsAppLogs-5712-franjo.log

Note specifically:
1a. 00:00:08: No categories found for search term "franjo tuđman"
2a. 00:00:18: 1 category (Tuđman: surname) found for search term "tuđman"
3a. 00:00:23: 1 category (Tuđman: surname) found for search term "Tuđman" (uppercase), although the app also says No categories found (?!)
4a. 00:00:31: many categories found for "Franjo Tuđman" (correct result).

So even though the search term "Tuđman" was in the correct case in step 3a, it did not match "Franjo Tuđman", because it contained only one word of the category name rather than all of them.


Additional info (not in the screencast): searching for "franjo Tuđman" (only the first word in the wrong case) does find matches, but "Franjo tuđman" (only the second word in the wrong case) doesn't find anything.


nicolas-raoul commented on June 2, 2024

I think we were using a different API in the past but switched for some reason. Unfortunately, I don't remember which API it was or what the issue was.

Would you be able to:

  1. Identify the API URL that Special:UploadWizard uses; you can find it in the Network tab of Chrome DevTools.
  2. List OK/fail for various search strings.

That would be super helpful. Thanks a lot!


mnalis commented on June 2, 2024

  1. Identify the API URL that Special:UploadWizard uses; you can find it in the Network tab of Chrome DevTools.

When typing into the Categories field of Special:UploadWizard, which seems to handle search in a case-insensitive manner (which we would like too), it calls this URL (here when searching for "franjo tuđman", all lowercase):

https://commons.wikimedia.org/w/api.php?action=opensearch&format=json&formatversion=2&namespace=14&limit=10&search=franjo%20tu%C4%91man

It successfully finds results in a different case, e.g.:

0	"Category:Franjo Tuđman"
1	"Category:Franjo Tuđman bridge (Dubrovnik)"
2	"Category:Franjo Tuđman square in Blato (Korčula)"
[...]
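For reference, the request above can be rebuilt programmatically. This is only a minimal sketch: the function name `build_opensearch_url` and the constant names are made up for illustration, and the parameters are simply the ones visible in the URL above, not a statement about how Special:UploadWizard builds the request internally.

```python
from urllib.parse import quote, urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"
CATEGORY_NAMESPACE = 14  # namespace number taken from the URL above


def build_opensearch_url(search_term: str, limit: int = 10) -> str:
    """Build an opensearch request URL for a category search term.

    Hypothetical helper; it only mirrors the query string observed above.
    """
    params = {
        "action": "opensearch",
        "format": "json",
        "formatversion": 2,
        "namespace": CATEGORY_NAMESPACE,
        "limit": limit,
        "search": search_term,
    }
    # quote_via=quote encodes spaces as %20 (the default would emit "+").
    return f"{API_ENDPOINT}?{urlencode(params, quote_via=quote)}"


if __name__ == "__main__":
    print(build_opensearch_url("franjo tuđman"))
```

Swapping in other search strings (different case, a single word) gives a link per test case that can be pasted next to each OK/fail result.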
  2. List OK/fail for various search strings.

Do you mean a list of OK/fail results as they currently are in the Commons app, or those in Special:UploadWizard? Or both?


sivaraam commented on June 2, 2024

When typing into the Categories field of Special:UploadWizard, which seems to handle search in a case-insensitive manner (which we would like too), it calls this URL (here when searching for "franjo tuđman", all lowercase):

https://commons.wikimedia.org/w/api.php?action=opensearch&format=json&formatversion=2&namespace=14&limit=10&search=franjo%20tu%C4%91man

I think we specifically avoided the search API (which I think is now opensearch, but it would be good to confirm) because it would not bring up categories for which a category page does not exist [ref]. Furthermore, it is possible to filter out hidden categories with the API we currently use, but not with the search API. Does Special:UploadWizard bring up such categories when you search for them?

I'm kind of unsure. The category search brings up only prefix matches, AFAIR. That's why "Franjo Tuđman" / "Franjo" / "franjo" would have brought up the result you expected, but not "Tuđman" / "tuđman". The API allows the first character of the title to be in either case only because MediaWiki itself treats the first character of a page title case-insensitively. Otherwise, the API does a pure case-sensitive prefix match.
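To make that behaviour concrete, here is a small offline simulation of the matching rule as described above: normalise the first character, then do a case-sensitive prefix match. It is only a sketch of the described behaviour, not the API's actual implementation; the function names are hypothetical, and the third sample title is a placeholder for the surname category mentioned earlier in the thread.

```python
def normalize_first_letter(query: str) -> str:
    """Upper-case only the first character, leaving the rest untouched."""
    return query[:1].upper() + query[1:]


def prefix_matches(query: str, titles: list[str]) -> list[str]:
    """Return titles that start with the query, case-sensitively,
    after normalising the query's first character."""
    prefix = normalize_first_letter(query)
    return [title for title in titles if title.startswith(prefix)]


# Illustrative titles; the last one is a hypothetical placeholder name.
SAMPLE_TITLES = [
    "Franjo Tuđman",
    "Franjo Tuđman bridge (Dubrovnik)",
    "Tuđman (surname)",
]

if __name__ == "__main__":
    queries = ["franjo tuđman", "franjo Tuđman", "Franjo tuđman",
               "tuđman", "Tuđman", "Franjo Tuđman"]
    for query in queries:
        print(repr(query), "->", prefix_matches(query, SAMPLE_TITLES))
```

Under this rule, "franjo Tuđman" matches but "Franjo tuđman" and "franjo tuđman" do not, and "Tuđman" can only ever reach titles that begin with that word, which lines up with the results reported in the screencast.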

Overall, I think this is still essentially a duplicate of #3179. Let me know in case I'm wrong.


nicolas-raoul commented on June 2, 2024

Do you mean a list of OK/fail results as they currently are in the Commons app, or those in Special:UploadWizard? Or both?

Both.
It would be even better if you can put a link to the search string as reference for each result.

More test searches can be found at #22 #237 #2664 #3641 #4192 #4901 (comment) #5588

@sivaraam

Tudman not finding Franjo Tudman (as seen in the screencast above) is a bug not related to case, I believe. Possibly that part is the same as #5588

Hopefully the table from Matija will give us a clear vision of all use cases, and what we should be using with what preprocessing.


sivaraam commented on June 2, 2024

Tudman not finding Franjo Tudman (as seen in the screencast above) is a bug not related to case, I believe. Possibly that part is the same as #5588

I don't think that's a bug on our end. It's basically a result of us using the allcategories API for search, which does a prefix match (i.e., results are shown only when the query matches the first part of the category title; see this comment for more details). This is also the root cause of the following issues: #237 and #5588 (unless I didn't understand those issues correctly).
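As a rough illustration of such a prefix query, the sketch below builds an allcategories request with the `acprefix` and `aclimit` parameters of the MediaWiki API. The exact parameters the app sends are not shown in this thread, so treat the parameter set and the helper name `build_allcategories_url` as assumptions.

```python
from urllib.parse import quote, urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def build_allcategories_url(prefix: str, limit: int = 10) -> str:
    """Build a list=allcategories request that filters by title prefix.

    Hypothetical helper; the app's real request may carry other parameters.
    """
    params = {
        "action": "query",
        "format": "json",
        "list": "allcategories",
        "acprefix": prefix,  # only categories whose names start with this
        "aclimit": limit,
    }
    # quote_via=quote encodes spaces as %20 (the default would emit "+").
    return f"{API_ENDPOINT}?{urlencode(params, quote_via=quote)}"


if __name__ == "__main__":
    print(build_allcategories_url("Tuđman"))
    print(build_allcategories_url("Franjo Tuđman"))
```

Since the filter is a prefix on the category name, a query of "Tuđman" has no way to reach "Franjo Tuđman", regardless of case handling.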

I'm closing this one in favour of continuing the discussion at #3179, which seems like a more appropriate place to discuss improvements to category search. It would help keep discussions and findings in one place. @mnalis, feel free to post your findings there. 🙂

