Comments (5)

joenano commented on June 22, 2024

You're 100% right, the times will be meaningless in some cases, with horses eased down etc. Final times alone don't really mean much in European racing without the sectionals, and even less so over jumps.

I'm a flat racing man, so the jumps probably didn't enter my mind when I added the time calculation! I think you just have to use your judgement as to when they are useful and trustworthy on an individual basis. Not sure that I can do anything other than just record them.

Appreciate the feedback and ideas.

from rpscrape.

puntermick commented on June 22, 2024

Ta for the character encoding fix in the other thread.
I will re-download and have another whirl.

And yup re "just record them"
All you can do is make a good spade.
Up to the spade user as to whether he uses said spade for his career in ditch digging
or if he uses it thoughtlessly to dig his own grave :)

Sectionals are a growing area in the UK.
Silly scenario where some courses have them and some do not.
Really needs racing authorities to say
"By Date X all courses must provide them in agreed standard format"
and ideally
"They must be fed into a central authority database from where they will be distributed freely to all punters with open source ethos."

We live in hope if not expectation :)

Other Ideas

Ponder some future means to download by date range.
A data scientist may be content with a one off trawl for data upon which to do some study.
A punter will be more prone to seek continual daily updates.

A download by date range feature could perhaps form the backbone
of their update routine.
Some form of scheduled task could perhaps make a download by date range call on a regular basis.
(If the last update date was stored somehow, that could be useful. It may facilitate an update script that needs no date parameters passed to it, and I suspect that would make automated scheduling easier.)
It would reduce the inefficiency of having to scrape a whole year when only one day or one week etc. was required. Faster for the user and more respectful to the source.
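
As a rough illustration of the stored last-update idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the JSON state file, the function names `pending_range` and `record_update`, and the choice to stop at yesterday. None of it is part of rpscrape.

```python
import json
from datetime import date, timedelta
from pathlib import Path


def pending_range(state_file: Path, today: date, default_start: date):
    """Return (start, end) of the dates still to fetch, or None if up to date.

    state_file is a hypothetical JSON file holding the last date fetched.
    """
    if state_file.exists():
        stored = json.loads(state_file.read_text())["last_update"]
        start = date.fromisoformat(stored) + timedelta(days=1)
    else:
        # First run: no state yet, so begin from a caller-chosen date.
        start = default_start
    # Stop at yesterday, on the assumption that today's results are incomplete.
    end = today - timedelta(days=1)
    return (start, end) if start <= end else None


def record_update(state_file: Path, last_day: date) -> None:
    """Remember the most recent date that was fetched successfully."""
    state_file.write_text(json.dumps({"last_update": last_day.isoformat()}))
```

A scheduled task could then call `pending_range(...)` with no date arguments from the user, fetch whatever range comes back, and call `record_update(...)` only once the fetch has succeeded.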

Perhaps even useful if future scrape debugging was needed for a weird page.
Date range would let you more closely target the troublesome spot etc.

Just a few brain storm ideas.
Sure you have more ideas than you have spare time :)

puntermick commented on June 22, 2024

PS line 146

int(year) < 2019

To permit 2019 data download, is it as simple as changing that to 2020, or does anything else need to be done?

joenano commented on June 22, 2024

Yeah, we are lagging behind the rest of the world when it comes to timing, probably by design. The bookies have more influence here than elsewhere.

Scraping by date range would require a different method, both in getting the individual race URLs and in storing the data. The current method gets every race URL from a given year at a given track with one request. To scrape by date range, every date would need to be scraped individually, which is easy enough, but I opted for the more efficient method when making the tool as it was more about historical data.
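
To make the per-date approach concrete, a small Python sketch follows. It only enumerates the dates; the function name `dates_in_range` is made up for the example, and how each date maps to a request for that day's race URLs is left out because the thread does not describe it.

```python
from datetime import date, timedelta
from typing import Iterator


def dates_in_range(start: date, end: date) -> Iterator[date]:
    """Yield every calendar date from start to end, inclusive."""
    if start > end:
        raise ValueError("start must not be after end")
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


# Each yielded date would need its own request for that day's races,
# so a one-week range costs seven such requests.
for day in dates_in_range(date(2019, 3, 1), date(2019, 3, 7)):
    print(day.isoformat())
```

This is the trade-off described above: one request per date suits small, recent ranges, while one request per course per year suits a bulk historical trawl.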

I will get the scrape by date working as a separate script and see where it goes, as I think it's a good idea. I'm currently working on a different project and I haven't thought about this in a while, but you have given me some motivation for it.

And yes, just increasing that will increase the valid year range. I have updated that to 2020.
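
One way to avoid bumping that number by hand each year would be to compare against the current year instead. This is only a sketch and not the check in rpscrape: the function `valid_year` and its `first_year` parameter are hypothetical, and it assumes the intent is to allow every year up to and including the current one.

```python
from datetime import date
from typing import Optional


def valid_year(year: str, first_year: int, today: Optional[date] = None) -> bool:
    """Return True if year is a number between first_year and the current year.

    first_year is a hypothetical lower bound supplied by the caller.
    """
    today = today or date.today()
    try:
        value = int(year)
    except ValueError:
        # Non-numeric input is simply rejected.
        return False
    return first_year <= value <= today.year
```

With this shape, 2020 data would become valid on 1 January 2020 without any code change.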

puntermick commented on June 22, 2024

Bookies with more influence... yup.
I thought it a sad day when the racing authorities opted for a % share of bookie profit instead of a % of turnover, as in Ireland.
It shifted the UK racing bodies very much onto the bookies' side and against punters.
A turnover % would have positioned them as more neutral, which would be better from a longer-term PR perspective. As it is, they are now condoning accomplices to poor bookie behavior towards racing punters.

They may face a degree of suffering as well from shifty bookie accountancy tricks, which are easier to do on net profit than on turnover. Racing as a loss leader etc.

Water under the bridge that one though.

As for your motivation levels..

I suspect they will rise as your favored flat season approaches :)

What you have got so far here is brilliant as is.
