statistics's Issues
Create "evolution of records" graph
Preliminary notes
- Links to Issue 6742 on WCA Website repo
- Please refer to WCA Statistics 2.0 Transition Overview document for context around this change
Requirement
One of the things still needed in the transition to Statistics 2.0 is an updated version of the "Evolution of Records" graph.
Basically, we need to recreate that graph in more modern frameworks (Python/ReactJS) inside this repo.
Description of task
You will need to complete the following steps:
- Write a Python script to extract, generate and organise the source data for the graph
- Create a frontend page to host the graph in ReactJS
- Integrate (i.e., link to) the page from the main statistics.worldcubeassociation.org website.
- Use research and judgement to decide how best to render the graph on the frontend - I don't know enough to advise on how to do this
General Comments
One person doesn't need to do all of this - if you only feel confident to do the Python or React sides of this, feel free to just do those.
Even if it's one person doing everything, it is strongly recommended to submit a pull request for each piece of this, so that code review is easier. Please feel free to ask for help if you aren't sure if you're going in the right direction.
Average number of competitions per year by country since 2017
Hi,
I just noticed that cancelled comps are apparently counted for this stat. For example, Germany in 2020 had 4 comps that took place and 5 that were cancelled. The stat lists 9 in total.
Is this intended?
Best,
Wilhelm
Migrate existing statistics
- Best "medal collection"
- Sum of 3x3/4x4/5x5 ranks
- Sum of all single ranks
- Sum of all average ranks
- Appearances in 3x3x3 Cube top 100 results
- Most Sub-X solves in 3x3x3 Cube
- 3x3x3 Blindfolded longest success streak
- 3x3x3 Blindfolded recent success rate
- World records in most events
- Best Podiums in 3x3x3 Cube event
- Oldest standing world records
- Most Persons
- Most Competitions
- Most Countries
- Most solves in one competition, year, or lifetime
Export database query as CSV
I would like to be able to export a database query as a csv file.
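A minimal sketch of what such an export could look like, assuming the query result is already available as a list of column headers plus a list of row tuples. The function name `query_result_to_csv` is hypothetical and not part of this repository.

```python
import csv
import io


def query_result_to_csv(headers, rows):
    """Serialise a query result (headers + row tuples) to CSV text."""
    # newline="" leaves line endings to the csv module, as its docs recommend.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up result set; values containing commas are quoted automatically.
csv_text = query_result_to_csv(["name", "value"], [("a, b", 1)])
print(csv_text)
```

Using the standard `csv` module rather than joining strings by hand means commas and quotes inside values are escaped correctly.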
Success streaks should order by RoundType.rank
The query for "longest success streak" doesn't sort by RoundType.rank (https://github.com/thewca/statistics/blob/6f98049999b6edf1977ad6d9523053c662328cff/misc/python/statistics/longest_success_streak.py). This means that, for competitions with multiple rounds, it's nondeterministic what order they are returned in.
I noticed this for my clock success streak. Last week it was 116 (correct), early this week it dropped to 111, and when AmherstOpen2022 results were added it went back up to 116 (should be 121 now). My last DNF was in round 1 of a 2-round competition, so the finals of that competition appear to be switching back and forth between being included and not.
Thank you!
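A minimal sketch of the ordering problem, not the repository's actual query. The `Attempt` fields and the rank values are made up for illustration; the only assumption is that a final carries a higher RoundType.rank than a first round.

```python
from typing import Iterable, NamedTuple


class Attempt(NamedTuple):
    competition_date: str  # ISO date, so string order is chronological order
    round_rank: int        # hypothetical stand-in for RoundType.rank
    attempt_number: int    # position of the attempt within its round
    success: bool          # False represents a DNF


def current_streak(attempts: Iterable[Attempt]) -> int:
    """Count successes since the most recent DNF, in chronological order."""
    # Sorting on date alone leaves rounds of the same competition in
    # arbitrary order; adding the round rank makes the order deterministic.
    ordered = sorted(
        attempts,
        key=lambda a: (a.competition_date, a.round_rank, a.attempt_number),
    )
    streak = 0
    for attempt in ordered:
        streak = streak + 1 if attempt.success else 0
    return streak


# Round 1 (containing a DNF) and the final share the same date.
rows = [
    Attempt("2022-01-15", 99, 1, True),   # final
    Attempt("2022-01-15", 99, 2, True),   # final
    Attempt("2022-01-15", 1, 1, False),   # round 1, DNF
    Attempt("2022-01-15", 1, 2, True),    # round 1
]
print(current_streak(rows))
```

With the full sort key, the result no longer depends on the order in which the rows arrive, which is the behaviour the issue asks for.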
Update website copy/text
I risk sounding like a bit of an asshole on this one - sorry if I do!
Some of the text on the website appears to be placeholder text that could be expanded upon (eg, more info could be added to the "About" page), or has phrasing that sounds a bit weird to a native English speaker ("We try always to add new statistics." could be "We're always looking to add new statistics").
I expect the focus here has been on putting together the website and its features, not on making sure everything reads perfectly. But I would suggest that the text is reviewed before the site is made live.
This is something I would be happy to work on, but would only be available to do so next week.
Spelling Error, "Couting" Should be "Counting"
Better html configuration
Currently, the way to support HTML formatting is to send it directly to the statistics. We could add capabilities to support HTML configuration. Currently, queries include HTML snippets, which makes them quite hard to reuse or study.
Features that would be nice to support
- Links (with the href and text)
- Different colors (eg for sum of ranks)
Sum of Ranks not loading on Safari
I've had two reports of Sum of Ranks not loading on Safari - one on iOS, one on Mac.
Don't have much information besides that refreshing doesn't fix it. If you'd like me to ask more detailed questions, let me know what you'd like to find out.
Error message using custom function for displaying results
This came in via the WST email - thought it would be best to put it here so you can work on it when you get a chance :)
Been loving the new WCA Statistics page, but it seems permissions have been denied for the custom function for displaying results. Here is the Error message:
< StatementCallback; bad SQL grammar [select count(*) from ( SELECT wca_statistics_time_format(347, '333', 'single') ) alias]; nested exception is java.sql.SQLSyntaxErrorException: execute command denied to user 'read_only'@'%' for routine 'wca_development.wca_statistics_time_format' >
It's a great function that I'd love to utilise, so this would be great to have access to! Amazing work on the rest of this site.
Better error handling when not logged in
When you go to a database query link while not logged in, the error message says "NOT FOUND". It should either tell you "get access to this feature by logging in" or just redirect you to the login page instead.
Investigate statistics not updating (last updated on 30 May?)
Statistics oldest standing records may not be accurate
Submitted via website contact form:
Statistics oldest standing records may not be accurate. Noticed in the national records section the oldest standing records was a 3x3 average from 2007. That very same average he got a single time that is currently an NR. I suppose it was a continental record single at the time, but think that could still be worth noting if unintentional. Likewise, I assume there are current continental records that aren't listed because they were world records at first.
Add statistics from various sources
Here are some sources we can take statistics from (copy paste from the statistics group)
- https://jonatanklosko.github.io/wca_statistics
- https://campos20.github.io/wca-statistics
- https://sam596.github.io/WCA-Stats
- https://coder13.github.io/wca_statistics
- https://github.com/Linus.../python-WCA/blob/master/README.md
- http://ohrndorf.org/wca-stats
- https://cubingchina.com/results/statistics
- https://github.com/Jambrose777/JacobAmbroseWCAStatistics
- http://www.nemesizer.com
- https://handynotes.herokuapp.com/shares/N2ph3V
- https://www.kaggle.com/danieljamesdj/wca-stats-2019
"Best average": Exclude rounds where it was not possible to earn an average from the calculation
This references Sebastiano's comment on the forums, quoted below.
I think best first average 3 should exclude rounds where it was not possible to obtain an average (e.g. FMC rounds with 1 or 2 attempts). At the moment it doesn’t (I know it because I got a 28.33 average on my first round with 3 attempts).
getWcaIdPage() has an off by 1 error
In sum of ranks, if you search by id for someone whose SOR placement is a multiple of 20, the page number will be 1 higher than it should be.
To reproduce, go to https://statistics.worldcubeassociation.org/sum-of-ranks, search for someone whose rank is a multiple of 20 (e.g. 2017SWOR01, who is no. 20) and press enter.
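The following is a sketch of how this kind of off-by-one typically arises, not the actual getWcaIdPage() code. It assumes 1-based ranks, 1-based page numbers and a page size of 20 (inferred from the "multiple of 20" symptom). Both function names are hypothetical.

```python
PAGE_SIZE = 20  # assumed, based on the "multiple of 20" symptom


def page_for_rank_buggy(rank: int, page_size: int = PAGE_SIZE) -> int:
    """One formula that reproduces the reported symptom."""
    # Rank 20 gives 20 // 20 + 1 = 2, although it belongs on page 1.
    return rank // page_size + 1


def page_for_rank(rank: int, page_size: int = PAGE_SIZE) -> int:
    """Shift to a 0-based position before dividing."""
    return (rank - 1) // page_size + 1


for rank in (19, 20, 21, 40):
    print(rank, page_for_rank_buggy(rank), page_for_rank(rank))
```

The two formulas agree for every rank except exact multiples of the page size, which matches the behaviour described above.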
Implement "PB Streaks" Statistic
From the "New statistic suggestion" email thread:
I’d like to propose the addition of PB streaks to the WCA Statistics page. There’s wide interest in this statistic in the community and it’s easy to implement, as the code for generating the rankings is already on Jonatan’s GitHub page. (Furthermore, but unnecessarily, Antonie is only two comps away from tying Evan's WR)
People might also find it interesting to filter out the ones that finished, leaving only the ongoing streaks as a ranking.
It might be too much to also separate them in continental and national rankings, but that’s up to you.
Add redirect parameters on login in statistics service
Redirect parameters should be present on every login. The absence of this makes the entire process a bit more cumbersome and potentially confusing for people who are not used to the platform.
For example, take this process:
Click on a WCA statistics database query URL
Login required
User logs in
Back to home screen. How do I get to the first screen again?
It is a bit counterintuitive. The same happens for any instance of login.
Originally posted by @GuidoDipietro in thewca/worldcubeassociation.org#6777 (comment)
Fix FMC Podium rankings
Bugs in statistics page
The following are the list of bugs that WST received through email a few days ago:
- Under Oldest standing national records, the Belizean NRs that Anthony Brooks got in 2009 are missing.
- The best counting result query has a typo: "Couting single" is the label for a column
- For recent success rate, rounding up the percentage gives people with N-1/N the top spot.
- For most competitions by country, and by year, cancelled competitions are included.
- I noticed under best medal collections for feet, Rafael de Andrade Cinoto appears with a former name (likely the one used for the feet podiums so idk if it needs to be changed)
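For the recent success rate item in the list above, one possible approach is to rank on the exact ratio and round only when formatting for display. This is a sketch under that assumption, not the existing query. The function name, the input shape and the tie-break on number of attempts are illustrative choices.

```python
from fractions import Fraction


def rank_success_rates(results):
    """Rank (name, successes, attempts) tuples by exact success rate.

    Ties on the exact rate are broken by the larger number of attempts
    (an illustrative choice, not taken from the issue).
    """
    ranked = sorted(
        results,
        key=lambda r: (Fraction(r[1], r[2]), r[2]),
        reverse=True,
    )
    # Rounding happens only here, after the order is already fixed.
    return [(name, f"{100 * s / n:.2f}%") for name, s, n in ranked]


# Made-up data: 199/200 would display as 100% if rounded to a whole
# percent, but it must still rank below a genuine 30/30.
table = rank_success_rates([("A", 199, 200), ("B", 30, 30)])
print(table)
```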
Database export has not been replaced in >=19 days
The database export is supposed to be automatically replaced every 7 days:
statistics/scripts/get_db_export.sh
Lines 19 to 22 in 6e5c104
But it has not updated since 12/28/2022:
I don't see anything in the code that suggests an issue handling a change in year (I could be wrong). Is there a way to check if the cron is running?
Sum of Ranks presentation changes
Replace travis with GH Actions
Can we call this the new WCA standard? We are already migrating some projects.
Stat includes cancelled competitions
Description:
This statistic probably includes cancelled competitions, because the only competition in Czechia, that was in 2020, was Kostelec Open 2020, which was cancelled due to the pandemic.
Expected behaviour:
It shouldn't count cancelled competitions, since those didn't actually happen.
Fixing this should be very easy, since there are columns cancelled_at and cancelled_by in the Competitions table, that should probably be null for competitions that weren't cancelled.
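A minimal sketch of the suggested filter, run against an in-memory SQLite table so it is self-contained. Only the table name Competitions and the column cancelled_at come from the description above. The other columns (`id`, `countryId`, `year`) and all rows are assumptions made up for illustration, and the real statistic runs against the WCA database rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Competitions ("
    "id TEXT, countryId TEXT, year INTEGER, cancelled_at TEXT)"
)
conn.executemany(
    "INSERT INTO Competitions VALUES (?, ?, ?, ?)",
    [
        ("HeldOpen2020", "Germany", 2020, None),
        ("CancelledOpen2020", "Germany", 2020, "2020-03-15"),
        ("OtherCancelledOpen2020", "Czech Republic", 2020, "2020-03-20"),
    ],
)

# Count only competitions that were not cancelled.
counts = conn.execute(
    """
    SELECT countryId, COUNT(*)
    FROM Competitions
    WHERE cancelled_at IS NULL
    GROUP BY countryId
    ORDER BY countryId
    """
).fetchall()
print(counts)
```

A country whose only competition was cancelled drops out of the result entirely, which is the expected behaviour described above.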
TODO list before releasing
I'm having less time to dedicate to this project so we are releasing it.
- Assign a subdomain (blocker)
- Restrict database query to logged-in users and point login from staging to prod (blocker)
- Hide backend endpoints to create statistics in prod so people won't mess with us (blocker)
- Assign a load balancer and use that address instead of server's address so we can be more consistent (we are using elastic IP, which can help, but it's not good enough IMO)
- Start using a dedicated database instead of one that lives in the server
- Use stats from various sources
- Stop using servers and start using ECS
- Change home page and about page
- Add a cron process to calculate stats
- HTTPS
Include export date
As of today, we are including the date when the statistics were generated, which is not very useful. We could include the export date, which is more meaningful.
- Migration with a new table that includes export_date and execution_date. Statistics table should also include export_date
- Save export date in the control table
- Statistics should look at the last export date in the control table and the date should go as a parameter to frontend
- The front end needs to display this date in every page
Details
This file is the one that runs in the cron
https://github.com/thewca/statistics/blob/main/scripts/cron.sh
This line downloads the export dump
https://github.com/thewca/statistics/blob/main/scripts/generate_all_statistics.sh#L8
Around here, we should include a line to save the export date
https://github.com/thewca/statistics/blob/main/scripts/get_db_export.sh#L65
The export we use is the "not so well known developer database export" here
https://github.com/thewca/worldcubeassociation.org/wiki/Developer-database-export
As an export date, we can use the last date in the Results table
Best potential FMC mean is incorrect
From Elijah Brown on discord:
https://statistics.worldcubeassociation.org/statistics-list/best-potential-fmc-mean
this stat is not correct
2nd place combines first round and final