jldbc / pybaseball
Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
License: MIT License
When I execute pybaseball.download_lahman(), the following happens.
This problem was already fixed by this commit, but I guess the fix may not be in the module released to PyPI.
(I checked 1.0.5 and 1.0.7; both still set lahman.url to baseballdatabank-2017.1.zip.)
Could you check the current state of the released module and fix this problem?
It seems like it's giving me an error on the requests dependency, but I have requests 2.18.4 and have also tried updating to 2.19.1.
Here's the full error message:
Collecting pybaseball
Using cached https://files.pythonhosted.org/packages/73/ed/032d64eddfbc0acad1cc509e5376ae63161f5ba2e079039ef04794fb51b7/pybaseball-1.0.7.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/private/var/folders/s6/5hjjzxbs17g8t1ch00w6whvm0000gn/T/pip-install-pvC6TX/pybaseball/setup.py", line 90, in <module>
    'requests>=2.18.1'],
  File "/Users/irarickman/anaconda2/lib/python2.7/distutils/core.py", line 111, in setup
    _setup_distribution = dist = klass(attrs)
  File "/Users/irarickman/anaconda2/lib/python2.7/site-packages/distribute-0.6.28-py2.7.egg/setuptools/dist.py", line 225, in __init__
    _Distribution.__init__(self, attrs)
  File "/Users/irarickman/anaconda2/lib/python2.7/distutils/dist.py", line 287, in __init__
    self.finalize_options()
  File "/Users/irarickman/anaconda2/lib/python2.7/site-packages/distribute-0.6.28-py2.7.egg/setuptools/dist.py", line 257, in finalize_options
    ep.require(installer=self.fetch_build_egg)
  File "/Users/irarickman/anaconda2/lib/python2.7/site-packages/distribute-0.6.28-py2.7.egg/pkg_resources.py", line 2029, in require
    working_set.resolve(self.dist.requires(self.extras), env, installer))
  File "/Users/irarickman/anaconda2/lib/python2.7/site-packages/distribute-0.6.28-py2.7.egg/pkg_resources.py", line 592, in resolve
    raise VersionConflict(dist, req)  # XXX put more info here
pkg_resources.VersionConflict: (certifi 2018.01.18 (/Users/irarickman/anaconda2/lib/python2.7/site-packages), Requirement.parse('certifi==2016.9.26'))
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/s6/5hjjzxbs17g8t1ch00w6whvm0000gn/T/pip-install-pvC6TX/pybaseball/
Hey @jldbc thanks for creating this module! I'm currently using the statcast_single_game module for a project, and was wondering if you had a list of game_ids for previous seasons, or if you could point me in the right direction I would really appreciate it!
Hi guys,
I'm on GitHub Pro, I'm happy to work on this, I have a fork already -- what would you fine folks say to moving to there and working over there?
This library relies on a set of data sources that are all online, which can be problematic for a few reasons.
So after discussing with @schorrm in #85 we'd like to add a caching layer that would allow for data to be stored locally for repeat calls. I have some ideas that I will list in further comments, but other suggestions are welcome as well.
Hi @jldbc - thanks for putting this project together. I am using the statcast function to retrieve pitching data, and am proposing a simple change.
Adding drop=True in the reset_index() call on line 170 of statcast.py prevents an unnecessary column named "index" from being created. Happy to add the change in a future PR.
Thanks again!
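For illustration, here is the difference in miniature (a minimal sketch, not the actual statcast.py code):

```python
import pandas as pd

# Resetting the index without drop=True materializes the old index
# as a new column named "index"; drop=True discards it instead.
df = pd.DataFrame({'a': [10, 20, 30]}, index=[5, 7, 9])

without_drop = df.reset_index()        # gains an 'index' column
with_drop = df.reset_index(drop=True)  # old index is discarded
```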
When generating a statcast query, there are now a number of pitches which are returned with 'pitch_type' set to what seems to be a date-string.
The easiest way to look at this is to use something like data['pitch_type'].unique().
After a quick look, the vast majority of these pitches also come back with blanks in several other fields.
However, this doesn't cover all of the cases; the remainder seem normal except for a blank 'pitch_name' field - though there are plenty of pitches with blank pitch types that also return a blank 'pitch_name' field but are otherwise normal.
This is an issue on the statcast side: I've replicated the behaviour using a simple 2-day query from baseballsavant. With that in mind, this isn't really an issue with pybaseball, as I don't think a cut-and-dried fix exists on the pybaseball end, but it's more something for people to be aware of.
For what it's worth, in case others want to remove these entries like I did, I used something like this:
import numpy as np

mask = np.in1d(data['pitch_type'].astype(str).str[0], '1')
data = data[~mask]
Which covers you for these entries occurring across multiple years.
Thanks again for the hard work James.
Cheers,
Rens
When the number of days in the date range is a multiple of 6 plus 1 (i.e. 7, 13, 19, ...), the last day cannot be imported successfully.
FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.
To accept the future behavior, pass 'sort=False'.
To retain the current behavior and silence the warning, pass 'sort=True'.
final_data = pd.concat(dataframe_list, axis=0)
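A minimal sketch of silencing the warning, assuming the future no-sort behavior is acceptable for these frames:

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1], 'b': [2]})
df2 = pd.DataFrame({'b': [3], 'a': [4]})  # same columns, different order

# Passing sort=False opts into the future default and silences the FutureWarning.
final_data = pd.concat([df1, df2], axis=0, sort=False)
```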
It would be nice to have a link or at least the hash value where I can find the videoclip for a specific pitch. Right now I have to redo the search on the site to obtain it.
Why don't you create a dataset from this data, instead of sending a request to a website hosted on the internet every time? This idea makes sense.
Getting the following error but it doesn't reproduce reliably which is odd.
Error message is "pandas.errors.ParserError: Error tokenizing data. C error: EOF inside string starting at line 13331."
I received this message when running statcast('2008-03-20','2011-11-11'), and it appears to have happened in the sub-query from 2010-07-02 to 2010-07-07.
Re-running the same full query statcast('2008-03-20','2011-11-11') doesn't reproduce the issue, and neither does statcast('2010-07-02','2010-07-07').
The error doesn't seem to impact many of the smaller queries, but probably needs a fix since it becomes increasingly likely to break a query as the date range gets larger and the function depends on a larger number of requests.
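One possible mitigation would be to retry a failed sub-query a few times before giving up. A sketch, not part of pybaseball; the helper name and parameters are made up for illustration:

```python
import time

def fetch_with_retry(fetch, attempts=3, delay=1.0):
    """Call a zero-argument fetch function, retrying on failure.

    Intended for intermittent failures such as a pandas ParserError
    caused by a truncated response.
    """
    last_err = None
    for i in range(attempts):
        try:
            return fetch()
        except Exception as err:
            last_err = err
            time.sleep(delay * i)  # no wait after the first failure, then linear backoff
    raise last_err

# Usage (hypothetical):
# data = fetch_with_retry(lambda: statcast('2010-07-02', '2010-07-07'))
```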
I have a handful of raggedy modules I use for fantasy purposes:
I also use the CBS / Yahoo APIs, which are deprecated but can still surface some insights.
Anyone see any value integrating?
The scraper broke a couple days ago. I fixed it and submitted a PR (#60).
I've been using this library for a little bit (and just submitted my first PR!) and have really been enjoying it.
I have a few suggestions that I'd like to make. I'm more than willing to do a lot of the work on these suggestions, but wanted to make sure I would be on the same page with other devs before I just start throwing code over the wall.
Add some unit/integration tests. This library would be ripe for adding some unit and integration tests. Both to help make sure code changes are good, but also as a way to determine if an external data source format has changed.
Refactor/reuse code. It seems that a lot of the code that is used for the same site (e.g., FanGraphs, BRef) is very repetitive. I think it would be great to distill some of this down to some shared code to help minimize issues when changes are made. Would do this after the above suggestion to ensure that the library is 100% backwards compatible.
Adding a caching layer. Sometimes when making calls to FanGraphs, the site will start rejecting your requests for exceeding their rate limit. Locally I wrote a caching wrapper that will save my DataFrames to CSV and return those if available. Integrating this into the library would allow others to take advantage.
Add an extra field to the FanGraphs results that returns a column that is joinable to the other data. E.g., team data or player data. Right now the best we'd be able to get is the FanGraphs integer id. We may have to set up our own internal mapping for teams. There are some other sources we could tap into, such as one of these for players:
http://crunchtimebaseball.com/baseball_map.html
https://www.smartfantasybaseball.com/tools/
Let me know what you folks think. Not trying to rock the boat, just trying to take this thing to the next level.
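The caching wrapper described in suggestion 3 could be sketched like this (the function and parameter names are made up for illustration, not part of pybaseball's API):

```python
import os
import pandas as pd

def cached_call(fetch, cache_path):
    """Return a DataFrame from a local CSV cache, fetching and saving on a miss."""
    if os.path.exists(cache_path):
        return pd.read_csv(cache_path)
    df = fetch()              # only hit the remote source when the cache is cold
    df.to_csv(cache_path, index=False)
    return df

# Usage (hypothetical):
# df = cached_call(lambda: batting_stats(2019), 'batting_stats_2019.csv')
```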
Hey, when I do pitching_stats I get tons of data. The only stat I can see that is missing is Quality Starts. Do you know how I can get that stat? I see it on the Baseball Reference page for pitchers, but am not sure how to get it via the tool.
Thank you for your time.
Apologies for another issue. I'm getting the following error fairly consistently when using the playerid_lookup(last, first) function:
sys:1: DtypeWarning: Columns (8,9,10) have mixed types. Specify dtype option on import or set low_memory=False.
This issue seems to occur when reading the people.csv file into a dataframe. The dtypes are inconsistent for some column(s) within that file and I'm not sure where.
There are two fixes and I wanted to see which is preferred. The first option is to specify the dtypes for each column in the file so that there is no guessing involved, and/or to read the entire file into memory before allocating and assigning a dataframe. The second option is to set the low_memory argument to pandas.read_csv() to False (the default is True). I'm finding conflicting statements about the argument, some saying it is deprecated and others saying it isn't. If it is deprecated, it will just suppress the warning; however, if it isn't, it will use more memory in order to read all values of a column into memory to validate a type.
Is the larger memory footprint at runtime satisfactory or should the dtypes be specified for all the columns in the Chadwick Register?
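The first option in miniature: give read_csv explicit dtypes so it never has to guess. The columns below are illustrative, not the actual register schema; 'Int64' (capital I) is pandas' nullable integer dtype, which tolerates missing values:

```python
import io
import pandas as pd

csv_text = "key_person,birth_year,key_mlbam\nperson1,1990,545361\nperson2,,\n"

# With explicit dtypes, mixed-type inference (and the DtypeWarning) never happens.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={'key_person': str, 'birth_year': 'Int64', 'key_mlbam': 'Int64'},
)
```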
When using statcast_pitcher, there is a column "spin_rate_deprecated". There is also "spin_dir", "break_angle_deprecated", and "break_length_deprecated", all of which appear to be "NaN" for 2019 stats at the very least. Is there any way to access spin rate/spin direction/etc. for 2019 since these columns are now deprecated?
Is it possible to filter by a minimum number of pitches/events? For example I want to get every fastball for a certain time period thrown by pitchers who threw at least "x" amount of fastballs in that time period.
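As far as I know this isn't built into pybaseball, but the filtering can be done on the returned frame. A sketch: 'pitch_type' and 'pitcher' follow the Statcast column names, and the threshold and helper name are arbitrary:

```python
import pandas as pd

def filter_min_pitches(data, pitch_type='FF', min_count=100):
    """Keep rows for pitchers who threw at least min_count pitches of pitch_type."""
    pitches = data[data['pitch_type'] == pitch_type]
    counts = pitches.groupby('pitcher').size()          # pitches of this type per pitcher
    keep = counts[counts >= min_count].index
    return pitches[pitches['pitcher'].isin(keep)]
```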
Hi there,
I first want to say thank you for your awesome work in putting together pybaseball. I was playing around with the package yesterday and ran the fivethirtyeight New Science of Hitting example with no issues. Today I was running some statcast and statcast_batter queries in much the same way as in the example; however, I run into errors every time.
I now receive the following error when attempting to print the shape of the initial dataset in the New Science of Hitting example: ValueError: could not convert string to float: 'Sinker'.
When running statcast_batter with an end date as in the example, I receive an Error: Query Timeout message. However, when statcast_batter is run without an end date, it returns without an error, but the player_id is ignored in the returned data set.
I have attached an image to better illustrate what I'm experiencing (run here with David Ortiz's player_id as in the statcast_batter example).
I have run this using both Python 2.7.10 and Python 3.6.1, with the same behaviour for both.
EDIT: I have tried manually scraping the data using the technique set out by @alanrkessler and baseball savant is returning a 'Error: Query Timeout. Please try to limit your query to less data.' as the singular field of the csv when you attempt to scrape the data.
Getting "Error: Query Timeout. Please try to limit your query to less data" on some of the statcast() queries, but having trouble reproducing it reliably.
This is probably a bit of an arduous task but I think a quite valuable one. Fangraphs has a collection of the most valuable and in-depth stats and having this sort of granularity would be invaluable. Would even help out with some of the annoying stuff if some of the other contributors are on board.
The pitching_stats command returns a KeyError on "K-BB%".
Are there any plans to add WAR or any stats from the Player Value tables on Baseball Reference to pitching_stats_bref(season)?
I was looking to find the largest difference between bWAR and fWAR for pitchers, but I am unable to without a WAR column in the dataframe returned by pitching_stats_bref(season). Were there issues in obtaining that data, or was it just never implemented?
Hi James. First, thank you! I'm playing with your package to learn more python and matplotlib.
However, I'm not getting current season information with standings, schedule_and_record and batting_stats_range.
Everything works fine with previous seasons though.
Thanks!
Really like this library, but one thing I don't get is why the Lahman DB needs to be re-downloaded every time you try and use a function interfacing with the Lahman DB.
I'm proposing something like this:
def get_lahman_zip():
    if os.path.exists(base_string):
        z = None
    else:
        s = requests.get(url, stream=True)
        z = zipfile.ZipFile(BytesIO(s.content))
    return z
And then all Lahman interfacing functions can be edited like so:
def parks():
    z = get_lahman_zip()
    f = os.path.join(base_string, "Parks.csv")
    data = pd.read_csv(f if z is None else z.open(f), header=0, sep=',', quotechar="'")
    return data
This way you only have to call download_lahman once, and every subsequent call to parks() will just use the downloaded DB.
This probably isn't the most elegant way to do it, but I think something like this would be a good idea.
Happy to discuss, do the changes myself and file a pull request!
Currently all numeric columns in the Statcast data are coerced to a float data type. This happens in the postprocessing function in statcast.py.
numeric_cols = ['release_speed', 'release_pos_x', 'release_pos_z', 'batter', 'pitcher', 'zone', 'hit_location', 'balls',
                'strikes', 'game_year', 'pfx_x', 'pfx_z', 'plate_x', 'plate_z', 'on_3b', 'on_2b', 'on_1b', 'outs_when_up', 'inning',
                'hc_x', 'hc_y', 'fielder_2', 'vx0', 'vy0', 'vz0', 'ax', 'ay', 'az', 'sz_top', 'sz_bot',
                'hit_distance_sc', 'launch_speed', 'launch_angle', 'effective_speed', 'release_spin_rate', 'release_extension',
                'game_pk', 'pitcher.1', 'fielder_2.1', 'fielder_3', 'fielder_4', 'fielder_5',
                'fielder_6', 'fielder_7', 'fielder_8', 'fielder_9', 'release_pos_y',
                'estimated_ba_using_speedangle', 'estimated_woba_using_speedangle', 'woba_value', 'woba_denom', 'babip_value',
                'iso_value', 'launch_speed_angle', 'at_bat_number', 'pitch_number', 'home_score', 'away_score', 'bat_score',
                'fld_score', 'post_away_score', 'post_home_score', 'post_bat_score', 'post_fld_score']
data[numeric_cols] = data[numeric_cols].astype(float)
Many of those numeric columns always contain integer values (balls, strikes, outs_when_up, etc.). These columns should be coerced to an int data type.
Perhaps more importantly, many of the other columns are ID values (batter, pitcher, game_pk, etc.). It would make more sense to coerce these columns to a string data type (or int would also be better than float).
I would recommend the following changes in the postprocessing
function:
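For instance, a sketch of how the coercion might be split up (the column groupings below are illustrative, not the full Statcast schema):

```python
import pandas as pd

def coerce_statcast_dtypes(data):
    """Coerce Statcast columns to finer-grained dtypes than plain float."""
    float_cols = ['release_speed', 'pfx_x', 'pfx_z']
    int_cols = ['balls', 'strikes', 'outs_when_up']
    id_cols = ['batter', 'pitcher', 'game_pk']

    data[float_cols] = data[float_cols].astype(float)
    # 'Int64' (capital I) is pandas' nullable integer dtype; unlike int it tolerates NaN.
    data[int_cols] = data[int_cols].astype('Int64')
    # IDs are labels, not quantities, so string is the safest representation.
    data[id_cols] = data[id_cols].astype(str)
    return data
```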
Receiving the error below when trying to get batting stats based on a date range, using the code below. Can anyone provide any help with this?
CODE
from pybaseball import batting_stats_range
from pybaseball import pitching_stats_range
data = batting_stats_range('2017-05-01', '2017-05-08')
data.head()
IndexError                                Traceback (most recent call last)
in <module>
----> 1 data = batting_stats_range('2017-05-01', '2017-05-08')
      2 data.head()

~/opt/anaconda3/lib/python3.7/site-packages/pybaseball/league_batting_stats.py in batting_stats_range(start_dt, end_dt)
     79     # retrieve html from baseball reference
     80     soup = get_soup(start_dt, end_dt)
---> 81     table = get_table(soup)
     82     table = table.dropna(how='all')  # drop if all columns are NA
     83     # scraped data is initially in string format.

~/opt/anaconda3/lib/python3.7/site-packages/pybaseball/league_batting_stats.py in get_table(soup)
     49
     50 def get_table(soup):
---> 51     table = soup.find_all('table')[0]
     52     data = []
     53     headings = [th.get_text() for th in table.find("tr").find_all("th")][1:]

IndexError: list index out of range
Baseball Savant changed some column names, which is currently breaking the statcast
function. Bill Petti gives some details here.
I'll look into the mapping later and try to get a quick deploy out.
https://baseballsavant.mlb.com/csv-docs
Some items don't match; e.g., hit distance is hit_distance_sc in the data but just hit_distance in the docs. Launch angle is launch_speed_angle; I don't see the point of this, and if there is one, full documentation needs to be made.
Also, quick help: can someone show me where I can find whether a strike was called or swung on?
from pybaseball import team_pitching
import pandas as pd
pd.set_option('display.max_columns', None)
data = team_pitching_bref('NYY', 2019)
print(data)
NameError: name 'team_pitching_bref' is not defined
So in a recent PR I tried to bring in some formatting changes (not necessarily on purpose - mostly because it was my first PR, and I always have some sort of auto PEP 8 formatter on).
This led to quite a few unrelated code changes and some understandable concern on @schorrm's part (especially for some of the choices that were made by the formatter).
However, I think (and I believe @schorrm agrees to some extent) that adding some code style standards could be fruitful, and if we can all coalesce around a shared tool and config to keep it painless, so much the better! The goal of the style guide would be to make the code more readable and internally consistent.
So I'd like to use this issue to discuss what some participants like in a style guide, don't like in a style guide, or are apathetic to.
I'll begin with a few of mine.
# Technically legal
cols = [col.replace('*', '').replace('#', '') for col in cols]

# More readable in my opinion
cols = [
    col.replace('*', '').replace('#', '') for col in cols
]

# For extra long lines I'd even break it this way as well
cols = [
    col.replace('*', '').replace('#', '').extraLongFunctionGoesHereToTakeUpRoom()
    for col in cols
]
my_string = ("Pretend this string goes on for something like 120 characters... "
             "The rest of the string goes here.")
data = fangraphs.get_fangraphs_tabular_data_from_url(
    _FG_TEAM_PITCHING_URL.format(
        start_season=start_season,
        end_season=end_season,
        league=league,
        ind=ind,
    )
)
def team_pitching(start_season: int = None):
    for season in range(start_season, start_season + 1):
        pass

def team_pitching(
    start_season: int,
    end_season: int = None,
    league: str = 'all',
    ind: int = 1,
):
I also really prefer when the code gets a pylint score of 10.0, but there are some linting failures I don't flip out about (like docstrings on modules).
When possible, I prefer type hinting in function params and returns so MyPy can help catch misuse before runtime.
I would like to eliminate all print statements if possible. Print statements are an uncontrolled side effect for anyone using the library downstream. Instead we should use the logging library and give the user some control over where the logs go:
https://docs.python.org/3/howto/logging.html#logging-basic-tutorial
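The print-to-logging move could look roughly like this (a sketch; the function name and signature are illustrative, not pybaseball's actual API):

```python
import logging

# Library code gets a named module-level logger instead of print();
# by default the messages are invisible unless the downstream user
# configures logging, so the side effect is under their control.
logger = logging.getLogger('pybaseball')

def fetch_chunk(start_dt, end_dt):
    logger.info('Fetching statcast data from %s to %s', start_dt, end_dt)
    # ... perform the request ...

# A downstream user opts in to seeing the messages with, e.g.:
# logging.basicConfig(level=logging.INFO)
```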
I'm thinking this package's next addition should be the Lahman database from http://seanlahman.com/. I'm going to add some starter code for this and use this for tracking any issues that arise.
I saw the post of the previous seasons' game id CSV file.
I was wondering if there is a function to scrape today's game_pks? It's sort of hard to use the statcast function without knowing the game_pks.
Tossing this out as a question before I attempt to implement it...
Is one of the numerous columns returned by batting_stats() the platoon split? Thanks!
Seamheads has the best Negro League data. By far.
I've been running batting_stats_range and pitching_stats_range once a day for the last 3 months, with the range parameters set to the current day.
I was getting data for both every day until July 27th, 2020; after that date I've been getting this error every day. Nothing changed on my side, so I assume it's an issue with pybaseball or one of its data providers.
Is anyone else experiencing the same issue?
I've noticed this a few times throughout this repo but is anyone else having this issue?
from pybaseball import team_pitching
team_pitching(1999)
AttributeError: 'NoneType' object has no attribute 'find_all'
Issue is coming from the get_table(soup, ind) function
I've been developing some projects, and one of my main pet peeves (let me be clear, it isn't major) is not having a good description of the tables and of the columns they contain.
I wonder if other people would appreciate having descriptions of the tables centralized in the documentation of this repo?
Basically more documentation for function outputs.
What do you guys think?
It's been a really long time since any pull requests have gotten dealt with. Would it be possible to get some more maintainers here? I'm extremely thankful for what you've done, and of course, I understand you have a real job and stuff, but I don't. I'm a 4th year CompSci student with some time to kill and I'd love to help out as a maintainer, and I'm sure there are other contributors here who would also be happy to help maintain this package.
Thank you
I'm adding @schorrm as a collaborator for both this repo and its associated PyPI project. Moshe has been active in using and improving the package both here and in his fork. Having a more active maintainer and keeping most users under a single PyPI installation will be good for the quality and stability of the project. I'm excited to have him on board!
Continuing a discussion that was started in issue #20, I think one of the ways we can solve the issues users are having when they try to run statcast_pitcher
and statcast_batter
queries over periods longer than about two months would be to break those queries up.
My general approach to this would be to first determine if a query is longer than some arbitrary maximum (probably around 60 days), then use some of the features in Python's datetime package to iterate over the user specified period in chunks calling the necessary function each time. This will result in a list of DataFrames returned by each function call and which can be bound together.
A secondary goal would be to run the queries in parallel, but I think that can wait until after this initial work is done.
I'm happy to work on adding this feature and would love to hear any feedback/ideas from others.
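The chunking step described above could be sketched like this (the helper name is made up; the 60-day default mirrors the arbitrary maximum mentioned):

```python
from datetime import date, timedelta

def date_chunks(start, end, max_days=60):
    """Split the inclusive range [start, end] into consecutive chunks of at most max_days days."""
    chunks = []
    chunk_start = start
    while chunk_start <= end:
        chunk_end = min(chunk_start + timedelta(days=max_days - 1), end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end + timedelta(days=1)
    return chunks

# Each (start, end) pair would then be passed to statcast_pitcher or
# statcast_batter, and the resulting list of DataFrames bound together
# with pd.concat.
```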
The current playerid_lookup() function finds player ids, taking name as input. Something doing the opposite would be useful, taking an id from a statcast query as input and returning the player's name. The option to do this in bulk would be even better.
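A sketch of what such a reverse lookup might look like. Here `register` stands in for the Chadwick register DataFrame that playerid_lookup() already downloads, and the column names (key_mlbam, name_first, name_last) follow that register; the function name is hypothetical:

```python
import pandas as pd

def playerid_reverse_lookup(mlbam_ids, register):
    """Return name rows for a list of MLBAM ids, supporting bulk lookups."""
    hits = register[register['key_mlbam'].isin(mlbam_ids)]
    return hits[['key_mlbam', 'name_first', 'name_last']]
```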
Did the statcast csv change its date format? Someone tagged me in this on Twitter https://twitter.com/ckurcon/status/1301913190465507328
I get: ValueError: time data "2020-09-03T00:00:00.000Z" doesn't match format specified
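If the savant CSV did switch to ISO-8601 timestamps, a tolerant parse on the consumer side might look like this (a sketch; pybaseball's internal parsing may differ):

```python
import pandas as pd

# pd.to_datetime understands the ISO-8601 'Z' (UTC) suffix without an
# explicit format string, unlike a hard-coded strptime format.
ts = pd.to_datetime("2020-09-03T00:00:00.000Z")
game_date = ts.date()
```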
from pybaseball import team_batting
team_batting(2016)
My error returns this:
team_batting(2016)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    team_batting(2016)
  File "/opt/anaconda3/lib/python3.7/site-packages/pybaseball/team_batting.py", line 76, in team_batting
    table = get_table(soup, ind)
  File "/opt/anaconda3/lib/python3.7/site-packages/pybaseball/team_batting.py", line 26, in get_table
    rows = table_body.find_all('tr')
AttributeError: 'NoneType' object has no attribute 'find_all'
quick question - is there an easy way to distinguish every at-bat (like an at bat id) when using the statcast batter package?
It may be a difficult feature to add, but I would like to see the addition of player position appearances added. I realize this data is recorded in the Lahman Database, but that does not include the current season appearances. This information could be valuable in determining differences between positions, position scarcity, and other important areas of analysis.