Giter Club home page Giter Club logo

google_classroom's People

Contributors

agonz09 avatar dchess avatar denglender avatar susanxiong avatar zkagin avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

google_classroom's Issues

Add individualStudentOptions for courseWork

There is a nested dataset within the courseWork endpoint that is used when an assignment is assigned to a subset of students. This is necessary to pull in so that it can be determined which assignments were not assigned to all students and so missing assignments in analysis can be more exact. However, adding this directly in the CourseWork table would blow up the data and it would no longer be at the same level of granularity.

I propose we add a new table that maps the courseWorkId to the studentId from this individualStudentOptions list.

Fix OrgUnit ValueError bug

https://github.com/kipp-bayarea/google_classroom/blob/e8e40c5ae5891adf1eb94ef1b717f9a6a0d91b9c/api.py#L169-L171

@zkagin I am encountering an issue running the latest branch (traceback below)

Traceback (most recent call last):
  File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/tenacity/__init__.py", line 394, in call
    result = fn(*args, **kwargs)
  File "/code/timer.py", line 22, in wrapper
    results = func(*args, **kwargs)
  File "/code/api.py", line 104, in get_and_write_to_db
    results = self.request_data().execute()
  File "/code/api.py", line 169, in request_data
    if self.org_unit_id:
  File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/pandas/core/generic.py", line 1479, in __nonzero__
    f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "main.py", line 159, in <module>
    main(Config)
  File "main.py", line 97, in main
    sql, overwrite=False, debug=config.DEBUG
  File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/tenacity/__init__.py", line 311, in wrapped_f
    return self.call(f, *args, **kw)
  File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/tenacity/__init__.py", line 391, in call
    do = self.iter(retry_state=retry_state)
  File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/tenacity/__init__.py", line 351, in iter
    six.raise_from(retry_exc, fut.exception())
  File "<string>", line 3, in raise_from
tenacity.RetryError: RetryError[<Future at 0x7f1a4e5b4910 state=finished raised ValueError>]

Converting Params To Environment Variables

Right now to configure which data gets pulled requires modifying the dockerfile to exec different commands. This should be moved to booleans in an env file so that code does not need to be modified to configure the call, and so configurations can be saved locally.

Add error handling for dates

If a date in Google Classroom is outside of the bounds panda can parse, add error handling that logs the record id (and course id if relevant) and inserts it with a null date. Example if a coursework timestamp looks like "9999-10-10 04:59:00" this would be handled as a NULL date instead but log the coursework id and the corresponding course id.

This issue was identified by KIPP Texas.

Support for Meet Data

Classroom has an integration w/Meet, which provides video chat. Would be great to pull in Meet usage data as well via reports API for the call_ended event.

From my perspective, the relevant columns are:

  • conference_id
  • device_type
  • display_name
  • duration_seconds
  • endpoint_id
  • identifier
  • identifier_type
  • ip_address
  • is_external
  • meeting_code
  • organizer_email

Add dueDate parsing to CourseWork

Currently the dueDate timestamp is parsed into the following fields:

  • dueDate.year
  • dueDate.month
  • dueDate.day
  • duetTime.hours
  • dueTime.minutes

It would be preferable for this to be a single datetime field.

Possible to get only new or updated data?

Anyone know if it's possible to get new/updated data from the API? I have a similar script running to pull data, but have a bottleneck on StudentSubmissions since it has a huge amount of data now after about ~2 months of remote learning. Curious to hear if it might be possible to get new and or recently updated data in incremental fashion.

Adding Notifications

The current job would benefit from a way to notify success/errors when running as a job. We do this standard by adding a mailer that emails a Slack channel, but am wondering if we need a broader approach for this project.

Invalid column name 'ImportDate'.

Our morning sync errored today with the following:

2020/09/04 11:00:36 - Google Classroom - Running on platform : Windows Server 2012 R2
2020/09/04 11:00:36 - Google Classroom - Executing command : cmd.exe /C "C:\Users\SZ\AppData\Local\Temp\kettle_637ecd0b-eebf-11ea-bf57-dfabbe81a29eshell.bat"
2020/09/04 11:00:36 - Google Classroom - (stdout) 
2020/09/04 11:00:36 - Google Classroom - (stdout) c:\users\sz\google_classroom\google_classroom>pipenv run python c:\users\sz\google_classroom\google_classroom\main.py 
2020/09/04 11:00:37 - Google Classroom - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : (stderr) Loading .env environment variables…
2020/09/04 11:00:46 - Google Classroom - (stdout) 2020-09-04 11:00:46AM Eastern Daylight Time | INFO: OrgUnits: Generating requests...
2020/09/04 11:00:46 - Google Classroom - (stdout) 2020-09-04 11:00:46AM Eastern Daylight Time | INFO: OrgUnits: 1 requests remaining. On page 0.
2020/09/04 11:00:48 - Google Classroom - (stdout) 2020-09-04 11:00:48AM Eastern Daylight Time | INFO: EndPoint.batch_pull_data completed in 5.26 seconds.
2020/09/04 11:01:05 - Google Classroom - (stdout) 2020-09-04 11:01:05AM Eastern Daylight Time | INFO: StudentUsage: Generating requests...
2020/09/04 11:01:05 - Google Classroom - (stdout) 2020-09-04 11:01:05AM Eastern Daylight Time | INFO: StudentUsage: 6 requests remaining.
2020/09/04 11:01:11 - Google Classroom - (stdout) 2020-09-04 11:01:11AM Eastern Daylight Time | ERROR: (pyodbc.ProgrammingError) ('42S22', "[42S22] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid column name 'ImportDate'. (207) (SQLExecute); [42S22] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Statement(s) could not be prepared. (8180)")
2020/09/04 11:01:11 - Google Classroom - (stdout) [SQL: INSERT INTO custom.[GoogleClassroom_StudentUsage] ([Email], [AsOfDate], [LastUsedTime], [ImportDate]) VALUES (?, ?, ?, ?)]
[...]
2020/09/04 11:01:11 - Google Classroom - (stdout) (Background on this error at: http://sqlalche.me/e/f405)
2020/09/04 11:01:11 - Google Classroom - (stdout) Traceback (most recent call last):
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\engine\base.py", line 1228, in _execute_context
2020/09/04 11:01:11 - Google Classroom - (stdout)     cursor, statement, parameters, context
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\dialects\mssql\pyodbc.py", line 413, in do_executemany
2020/09/04 11:01:11 - Google Classroom - (stdout)     cursor, statement, parameters, context=context
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\engine\default.py", line 587, in do_executemany
2020/09/04 11:01:11 - Google Classroom - (stdout)     cursor.executemany(statement, parameters)
2020/09/04 11:01:11 - Google Classroom - (stdout) pyodbc.ProgrammingError: ('42S22', "[42S22] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid column name 'ImportDate'. (207) (SQLExecute); [42S22] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Statement(s) could not be prepared. (8180)")
2020/09/04 11:01:11 - Google Classroom - (stdout) 
2020/09/04 11:01:11 - Google Classroom - (stdout) The above exception was the direct cause of the following exception:
2020/09/04 11:01:11 - Google Classroom - (stdout) 
2020/09/04 11:01:11 - Google Classroom - (stdout) Traceback (most recent call last):
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "c:\users\sz\google_classroom\google_classroom\main.py", line 174, in <module>
2020/09/04 11:01:11 - Google Classroom - (stdout)     main(Config)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "c:\users\sz\google_classroom\google_classroom\main.py", line 69, in main
2020/09/04 11:01:11 - Google Classroom - (stdout)     pull_data(config, creds, sql)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "c:\users\sz\google_classroom\google_classroom\main.py", line 96, in pull_data
2020/09/04 11:01:11 - Google Classroom - (stdout)     usage.batch_pull_data(dates=date_range_string, overwrite=False)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "c:\users\sz\google_classroom\google_classroom\timer.py", line 22, in wrapper
2020/09/04 11:01:11 - Google Classroom - (stdout)     results = func(*args, **kwargs)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "c:\users\sz\google_classroom\google_classroom\endpoints\base.py", line 265, in batch_pull_data
2020/09/04 11:01:11 - Google Classroom - (stdout)     self._write_to_db(df)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "c:\users\sz\google_classroom\google_classroom\endpoints\base.py", line 97, in _write_to_db
2020/09/04 11:01:11 - Google Classroom - (stdout)     self.sql.insert_into(self.table_name, df, chunksize=10000)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlsorcery\__init__.py", line 210, in insert_into
2020/09/04 11:01:11 - Google Classroom - (stdout)     dtype=dtype,
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\pandas\core\generic.py", line 2663, in to_sql
2020/09/04 11:01:11 - Google Classroom - (stdout)     method=method,
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\pandas\io\sql.py", line 521, in to_sql
2020/09/04 11:01:11 - Google Classroom - (stdout)     method=method,
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\pandas\io\sql.py", line 1317, in to_sql
2020/09/04 11:01:11 - Google Classroom - (stdout)     table.insert(chunksize, method=method)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\pandas\io\sql.py", line 755, in insert
2020/09/04 11:01:11 - Google Classroom - (stdout)     exec_insert(conn, keys, chunk_iter)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\pandas\io\sql.py", line 669, in _execute_insert
2020/09/04 11:01:11 - Google Classroom - (stdout)     conn.execute(self.table.insert(), data)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\engine\base.py", line 984, in execute
2020/09/04 11:01:11 - Google Classroom - (stdout)     return meth(self, multiparams, params)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\sql\elements.py", line 293, in _execute_on_connection
2020/09/04 11:01:11 - Google Classroom - (stdout)     return connection._execute_clauseelement(self, multiparams, params)
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\engine\base.py", line 1103, in _execute_clauseelement
2020/09/04 11:01:11 - Google Classroom - (stdout)     distilled_params,
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\engine\base.py", line 1288, in _execute_context
2020/09/04 11:01:11 - Google Classroom - (stdout)     e, statement, parameters, cursor, context
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\engine\base.py", line 1482, in _handle_dbapi_exception
2020/09/04 11:01:11 - Google Classroom - (stdout)     sqlalchemy_exception, with_traceback=exc_info[2], from_=e
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\util\compat.py", line 178, in raise_
2020/09/04 11:01:11 - Google Classroom - (stdout)     raise exception
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\engine\base.py", line 1228, in _execute_context
2020/09/04 11:01:11 - Google Classroom - (stdout)     cursor, statement, parameters, context
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\dialects\mssql\pyodbc.py", line 413, in do_executemany
2020/09/04 11:01:11 - Google Classroom - (stdout)     cursor, statement, parameters, context=context
2020/09/04 11:01:11 - Google Classroom - (stdout)   File "C:\Users\SZ\.virtualenvs\google_classroom-Q34DGSlj\lib\site-packages\sqlalchemy\engine\default.py", line 587, in do_executemany
2020/09/04 11:01:11 - Google Classroom - (stdout)     cursor.executemany(statement, parameters)
2020/09/04 11:01:11 - Google Classroom - (stdout) sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42S22', "[42S22] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid column name 'ImportDate'. (207) (SQLExecute); [42S22] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Statement(s) could not be prepared. (8180)")
2020/09/04 11:01:11 - Google Classroom - (stdout) [SQL: INSERT INTO custom.[GoogleClassroom_StudentUsage] ([Email], [AsOfDate], [LastUsedTime], [ImportDate]) VALUES (?, ?, ?, ?)]
[...]
2020/09/04 11:01:11 - Google Classroom - (stdout) (Background on this error at: http://sqlalche.me/e/f405)

A couple days ago I pulled down the last few months of commits (in an attempt to see if that would address the deadlocking crash that we started to see again), so I'm guessing that's the proximal cause. I can easily create what appears to be the missing column in our DB, just passing along as an FYI that there may have been a breaking change in the DB structure at some point.

Adding Safety Checks

This library currently assumes a lot about the structure of data coming from the Google Classroom API and what files/tables exist in the integrated database. The goal here would be to make sure to check and no-op rather than crash in these cases, and potentially to warn the user with a clear error message if something is wrong and needs to be fixed to continue.

Adding Idempotency

Right now, when the library is called multiple times, it can duplicate some data. The goal is eventually to allow the user to run the script as many times as needed without worry that they may accidentally delete old data or double-count new data.

One option to solve this may involve transitioning away from the files caches in /data which currently are causing some of these issues and instead using the database as the source of truth, potentially instead using file logs only for auditing purposes.

Batch request ID already exists

Traceback (most recent call last):
File "main.py", line 131, in
main(Config)
File "main.py", line 126, in main
StudentSubmissions(classroom_service, sql, config).batch_pull_data(course_ids)
File "/code/timer.py", line 22, in wrapper
results = func(*args, **kwargs)
File "/code/api.py", line 182, in batch_pull_data
batch.add(request, request_id=request_id)
File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
return wrapped(*args, **kwargs)
File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/googleapiclient/http.py", line 1398, in add
raise KeyError("A request with this ID already exists: %s" % request_id)
KeyError: 'A request with this ID already exists: 41010390897;None;2'

Coursework endpoint failing

We're getting the following error now when trying to download coursework:

2020-12-08 01:16:47PM Central Standard Time | ERROR: (pyodbc.DataError) ('22018', '[22018] [Microsoft][ODBC Driver 13 for SQL Server]Invalid character value for cast specification (0) (SQLParamData)')
[SQL: INSERT INTO dbo.[GoogleClassroom_CourseWork] ([courseId], id, title, description, state, [alternateLink], [creationTime], [updateTime], [dueDate], [maxPoints], [workType], [assigneeMode], [submissionModificationMode], [creatorUserId], [topicId]) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (('222693886482', '223016268500', 'to do', None, 'PUBLISHED', 'https://classroom.google.com/c/MjIyNjkzODg2NDgy/a/MjIzMDE2MjY4NTAw/details', datetime.datetime(2020, 11, 11, 13, 29, 5, 499000), datetime.datetime(2020, 11, 11, 13, 31, 44, 627000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '115624344439250234383', None), ('222693886482', '222718588596', 'class class', None, 'PUBLISHED', 'https://classroom.google.com/c/MjIyNjkzODg2NDgy/a/MjIyNzE4NTg4NTk2/details', datetime.datetime(2020, 11, 10, 20, 16, 31, 164000), datetime.datetime(2020, 11, 10, 20, 16, 44, 358000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '115624344439250234383', None), ('222691921702', '232238999150', 'How much wood would a woodchuck chuck if a woodchuck could chuck wood?', 'ANSWER THE QUESTION!', 'PUBLISHED', 'https://classroom.google.com/c/MjIyNjkxOTIxNzAy/sa/MjMyMjM4OTk5MTUw/details', datetime.datetime(2020, 11, 19, 13, 44, 26, 403000), datetime.datetime(2020, 11, 19, 13, 48, 37, 97000), None, 70, 'SHORT_ANSWER_QUESTION', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '104144867849088240281', None), ('222691921702', '228735845599', 'SPELL IT RIGHT!', 'You have to spell all the questions right!', 'PUBLISHED', 'https://classroom.google.com/c/MjIyNjkxOTIxNzAy/a/MjI4NzM1ODQ1NTk5/details', datetime.datetime(2020, 11, 11, 15, 24, 38, 974000), datetime.datetime(2020, 11, 11, 15, 39, 12, 928000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '104144867849088240281', None), ('222689364757', '229324082553', 'Commercial Script ', 'Directions inside', 'PUBLISHED', 'https://classroom.google.com/c/MjIyNjg5MzY0NzU3/a/MjI5MzI0MDgyNTUz/details', datetime.datetime(2020, 11, 12, 18, 53, 12, 849000), datetime.datetime(2020, 11, 12, 18, 55, 45, 685000), datetime.datetime(2020, 11, 14, 0, 0), 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '101375802740655167828', None), ('205884161781', '228719101204', 'nez perce', "\nHey it's Thursday already! Bam! You are an overcomer, an overachiever, and blessed with all that you need! Just like you put your clothes on put l ... (121 characters truncated) ... put your mind and heart into. Con ganas todo se puede! We believe in you! It is better to give than to receive. Being thankful makes your day better!", 'PUBLISHED', 'https://classroom.google.com/c/MjA1ODg0MTYxNzgx/a/MjI4NzE5MTAxMjA0/details', datetime.datetime(2020, 11, 11, 14, 38, 51, 239000), datetime.datetime(2020, 11, 11, 14, 39, 51, 698000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '105301322954643330860', None), ('205884161781', '228719101157', 'interviw your sibling', "\nHey it's Thursday already! Bam! You are an overcomer, an overachiever, and blessed with all that you need! Just like you put your clothes on put l ... (121 characters truncated) ... put your mind and heart into. Con ganas todo se puede! We believe in you! It is better to give than to receive. Being thankful makes your day better!", 'PUBLISHED', 'https://classroom.google.com/c/MjA1ODg0MTYxNzgx/a/MjI4NzE5MTAxMTU3/details', datetime.datetime(2020, 11, 11, 14, 37, 32, 584000), datetime.datetime(2020, 11, 11, 14, 37, 49, 542000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '105301322954643330860', None), ('205884161781', '223029072321', "what did you learn about the labraant family today are they still famouse i don't know you need to answer that question", 'today if you watch the video about the labrant family well then you would have no problem with this question you needed two wtch both of the video if you took the test if you have not go take the test then aswer this question', 'PUBLISHED', 'https://classroom.google.com/c/MjA1ODg0MTYxNzgx/sa/MjIzMDI5MDcyMzIx/details', datetime.datetime(2020, 11, 11, 13, 36, 30, 870000), datetime.datetime(2020, 11, 11, 13, 39, 35, 641000), None, 100, 'SHORT_ANSWER_QUESTION', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '105301322954643330860', None) ... displaying 10 of 2227 total bound parameter sets ... ('202185157887', '230526788854', 'basketball', 'look at it', 'PUBLISHED', 'https://classroom.google.com/c/MjAyMTg1MTU3ODg3/a/MjMwNTI2Nzg4ODU0/details', datetime.datetime(2020, 11, 16, 13, 31, 25, 455000), datetime.datetime(2020, 11, 16, 13, 31, 22, 341000), datetime.datetime(2020, 11, 19, 5, 59), 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '110665100727005067242', '230526788855'), ('202185157887', '229922436189', 'idk', None, 'PUBLISHED', 'https://classroom.google.com/c/MjAyMTg1MTU3ODg3/a/MjI5OTIyNDM2MTg5/details', datetime.datetime(2020, 11, 13, 20, 35, 49, 875000), datetime.datetime(2020, 11, 13, 20, 35, 51, 242000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '110665100727005067242', None))]
(Background on this error at: http://sqlalche.me/e/13/9h9h)
Traceback (most recent call last):
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1258, in _execute_context
cursor, statement, parameters, context
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\dialects\mssql\pyodbc.py", line 417, in do_executemany
cursor, statement, parameters, context=context
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\default.py", line 590, in do_executemany
cursor.executemany(statement, parameters)
pyodbc.DataError: ('22018', '[22018] [Microsoft][ODBC Driver 13 for SQL Server]Invalid character value for cast specification (0) (SQLParamData)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "main.py", line 171, in
main(Config)
File "main.py", line 150, in main
CourseWork(classroom_service, sql, config).batch_pull_data(course_ids)
File "S:\Projects\google_classroom\timer.py", line 22, in wrapper
results = func(*args, **kwargs)
File "S:\Projects\google_classroom\api.py", line 259, in batch_pull_data
self._write_to_db(df)
File "S:\Projects\google_classroom\api.py", line 96, in write_to_db
self.sql.insert_into(self.table_name, df, chunksize=10000)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlsorcery_init
.py", line 210, in insert_into
dtype=dtype,
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\pandas\core\generic.py", line 2663, in to_sql
method=method,
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\pandas\io\sql.py", line 521, in to_sql
method=method,
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\pandas\io\sql.py", line 1317, in to_sql
table.insert(chunksize, method=method)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\pandas\io\sql.py", line 755, in insert
exec_insert(conn, keys, chunk_iter)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\pandas\io\sql.py", line 669, in _execute_insert
conn.execute(self.table.insert(), data)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1014, in execute
return meth(self, multiparams, params)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\sql\elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1133, in _execute_clauseelement
distilled_params,
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1318, in execute_context
e, statement, parameters, cursor, context
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1512, in handle_dbapi_exception
sqlalchemy_exception, with_traceback=exc_info[2], from
=e
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\util\compat.py", line 178, in raise

raise exception
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1258, in _execute_context
cursor, statement, parameters, context
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\dialects\mssql\pyodbc.py", line 417, in do_executemany
cursor, statement, parameters, context=context
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\default.py", line 590, in do_executemany
cursor.executemany(statement, parameters)
sqlalchemy.exc.DataError: (pyodbc.DataError) ('22018', '[22018] [Microsoft][ODBC Driver 13 for SQL Server]Invalid character value for cast specification (0) (SQLParamData)')
[SQL: INSERT INTO dbo.[GoogleClassroom_CourseWork] ([courseId], id, title, description, state, [alternateLink], [creationTime], [updateTime], [dueDate], [maxPoints], [workType], [assigneeMode], [submissionModificationMode], [creatorUserId], [topicId]) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (('222693886482', '223016268500', 'to do', None, 'PUBLISHED', 'https://classroom.google.com/c/MjIyNjkzODg2NDgy/a/MjIzMDE2MjY4NTAw/details', datetime.datetime(2020, 11, 11, 13, 29, 5, 499000), datetime.datetime(2020, 11, 11, 13, 31, 44, 627000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '115624344439250234383', None), ('222693886482', '222718588596', 'class class', None, 'PUBLISHED', 'https://classroom.google.com/c/MjIyNjkzODg2NDgy/a/MjIyNzE4NTg4NTk2/details', datetime.datetime(2020, 11, 10, 20, 16, 31, 164000), datetime.datetime(2020, 11, 10, 20, 16, 44, 358000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '115624344439250234383', None), ('222691921702', '232238999150', 'How much wood would a woodchuck chuck if a woodchuck could chuck wood?', 'ANSWER THE QUESTION!', 'PUBLISHED', 'https://classroom.google.com/c/MjIyNjkxOTIxNzAy/sa/MjMyMjM4OTk5MTUw/details', datetime.datetime(2020, 11, 19, 13, 44, 26, 403000), datetime.datetime(2020, 11, 19, 13, 48, 37, 97000), None, 70, 'SHORT_ANSWER_QUESTION', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '104144867849088240281', None), ('222691921702', '228735845599', 'SPELL IT RIGHT!', 'You have to spell all the questions right!', 'PUBLISHED', 'https://classroom.google.com/c/MjIyNjkxOTIxNzAy/a/MjI4NzM1ODQ1NTk5/details', datetime.datetime(2020, 11, 11, 15, 24, 38, 974000), datetime.datetime(2020, 11, 11, 15, 39, 12, 928000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '104144867849088240281', None), ('222689364757', '229324082553', 'Commercial Script ', 'Directions inside', 'PUBLISHED', 'https://classroom.google.com/c/MjIyNjg5MzY0NzU3/a/MjI5MzI0MDgyNTUz/details', datetime.datetime(2020, 11, 12, 18, 53, 12, 849000), datetime.datetime(2020, 11, 12, 18, 55, 45, 685000), datetime.datetime(2020, 11, 14, 0, 0), 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '101375802740655167828', None), ('205884161781', '228719101204', 'nez perce', "\nHey it's Thursday already! Bam! You are an overcomer, an overachiever, and blessed with all that you need! Just like you put your clothes on put l ... (121 characters truncated) ... put your mind and heart into. Con ganas todo se puede! We believe in you! It is better to give than to receive. Being thankful makes your day better!", 'PUBLISHED', 'https://classroom.google.com/c/MjA1ODg0MTYxNzgx/a/MjI4NzE5MTAxMjA0/details', datetime.datetime(2020, 11, 11, 14, 38, 51, 239000), datetime.datetime(2020, 11, 11, 14, 39, 51, 698000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '105301322954643330860', None), ('205884161781', '228719101157', 'interviw your sibling', "\nHey it's Thursday already! Bam! You are an overcomer, an overachiever, and blessed with all that you need! Just like you put your clothes on put l ... (121 characters truncated) ... put your mind and heart into. Con ganas todo se puede! We believe in you! It is better to give than to receive. Being thankful makes your day better!", 'PUBLISHED', 'https://classroom.google.com/c/MjA1ODg0MTYxNzgx/a/MjI4NzE5MTAxMTU3/details', datetime.datetime(2020, 11, 11, 14, 37, 32, 584000), datetime.datetime(2020, 11, 11, 14, 37, 49, 542000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '105301322954643330860', None), ('205884161781', '223029072321', "what did you learn about the labraant family today are they still famouse i don't know you need to answer that question", 'today if you watch the video about the labrant family well then you would have no problem with this question you needed two wtch both of the video if you took the test if you have not go take the test then aswer this question', 'PUBLISHED', 'https://classroom.google.com/c/MjA1ODg0MTYxNzgx/sa/MjIzMDI5MDcyMzIx/details', datetime.datetime(2020, 11, 11, 13, 36, 30, 870000), datetime.datetime(2020, 11, 11, 13, 39, 35, 641000), None, 100, 'SHORT_ANSWER_QUESTION', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '105301322954643330860', None) ... displaying 10 of 2227 total bound parameter sets ... ('202185157887', '230526788854', 'basketball', 'look at it', 'PUBLISHED', 'https://classroom.google.com/c/MjAyMTg1MTU3ODg3/a/MjMwNTI2Nzg4ODU0/details', datetime.datetime(2020, 11, 16, 13, 31, 25, 455000), datetime.datetime(2020, 11, 16, 13, 31, 22, 341000), datetime.datetime(2020, 11, 19, 5, 59), 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '110665100727005067242', '230526788855'), ('202185157887', '229922436189', 'idk', None, 'PUBLISHED', 'https://classroom.google.com/c/MjAyMTg1MTU3ODg3/a/MjI5OTIyNDM2MTg5/details', datetime.datetime(2020, 11, 13, 20, 35, 49, 875000), datetime.datetime(2020, 11, 13, 20, 35, 51, 242000), None, 100, 'ASSIGNMENT', 'ALL_STUDENTS', 'MODIFIABLE_UNTIL_TURNED_IN', '110665100727005067242', None))]
(Background on this error at: http://sqlalche.me/e/13/9h9h)

Merging Strategy

@zkagin What strategy would you recommend for approving multiple PRs? Do you generally prefer to approve the dependent branches and merge them into the parent branch before merging all of them into master?

Otherwise, it seems its necessary to keep rebasing after each merge.

SQL Deadlock

When running the StudentSubmissions endpoint query, the following error occurs (and is only visible after retries are exhausted or disabled):

sqlalchemy.exc.DBAPIError: (pyodbc.Error) ('40001', '[40001] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Transaction (Process ID 85) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. (1205) (SQLExecDirectW)')
[SQL: SELECT [INFORMATION_SCHEMA].[TABLES].[TABLE_NAME]
FROM [INFORMATION_SCHEMA].[TABLES]
WHERE [INFORMATION_SCHEMA].[TABLES].[TABLE_SCHEMA] = CAST(? AS NVARCHAR(max)) AND [INFORMATION_SCHEMA].[TABLES].[TABLE_TYPE] = CAST(? AS NVARCHAR(max)) ORDER BY [INFORMATION_SCHEMA].[TABLES].[TABLE_NAME]]
[parameters: ('custom', 'BASE TABLE')]
(Background on this error at: http://sqlalche.me/e/dbapi)

StudentSubmissions request writing dupes to db

StudentSubmissions(classroom_service).get_and_write_to_db(
            sql, course_ids, debug=config.DEBUG
        )

When this call encounters certain courses, it is loading way more records than exist in the course. For instance there is a course with 22 assignments and 29 students, that returns 100Ks of records. It should only be pulling back ~600+ records.

Archive StudentSubmissions before refresh

If a student is removed from a course all of the submission data for that course is lost. We should create a snapshot of prior day data before truncating the table.

Possible logic:

  • Before truncating StudentSubmission table, append data to archive table
  • Refresh StudentSubmission table
  • Remove any matching assignment (by submission id) from archive table that exists in current table

This logic would also help maintain year over year data as well. This is potentially something we should consider for other data sources: teachers, students, courses, coursework, etc.

Fix data conversion error on CourseWork

When importing the CourseWork endpoint, the following error is occurring:
sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Error converting data type nvarchar to float. (8114) (SQLExecDirectW)')

This is either coming from the topicId or one of the dueDate/dueTime fields.

Add calendarId to course object/table

Courses in Google Classroom now have Google Calendars associated with them. It would be nice to have that ID available so that we can construct links to the calendar from the database. The Course resource in the Classroom API has a calendarId field, so it seems like a simple matter of adding that to the models in this code.

Query StudentSubmissions by coursework

Rather than querying the StudentSubmission endpoint by looping over Courses, it could be preferable to loop over individual assignments (CourseWork). That way the query could be date-bound by pre-filtering the assignments by creationTime. While this might result in more calls, it may reduce pagination because it would be less data volume overall. Using Courses to query pulls in every assignment ever in that class and many may no longer be relevant to monitor after a certain point. If history is necessary to maintain, one approach would be to handle similar to the StudentUsage data and truncate a subset and append.

Incremental Meet Updates

Pulling Google Meet data currently takes about 15 minutes for us, a relatively small network of schools. For a mid- to large-sized network, this could take a extremely long time.

It's probably a good idea eventually to change the Meet logic to query only for new data since the last update, and append that to the table, rather than dropping the table and re-adding everything.

Add runtime option for running the job for all endpoints

Now that batching is in place, it would be good to add a way to call the end-to-end job for all endpoints.

Currently this would need to run as:
docker run --rm -t google_classroom --usage --courses --topics --coursework --students --teachers --guardians --invites --submissions

Or by setting these all in the .env file.

It would be preferable to either add an optional --all flag to run them or to configure it such that if no flag is passed (or no env vars) then it defaults to running all.

Add partial data tracking on usage reports

When running the usage report for multiple dates if more than the most recent date has partial data, reruns only refresh the latest date and do not account for partial data on past dates.

One option would be to flag any dates that have partial data warnings and when rerunning the job truncate those dates.

Aliases endpoint does not record courseId

The course.aliases.list API responds with an array of CourseAlias objects. CourseAlias does not contain a courseId property though. Unfortunately, to be useful from a data reporting perspective, we need the courseId in the CourseAliases table.

I've been able to work around this issue for now by specifically handling aliases in the internal callback method of api.py -> EndPoint.batch_pull_data. If you insert the following lines starting at line 220 of api.py (right after the nextPageToken conditional block) it will add the courseId property to each CourseAlias object in the response["aliases"] array.

            """Fix up alias records to include course Id"""
            if "aliases" in response:
                for alias in response["aliases"]:
                    if "courseId" not in alias:
                        alias["courseId"] = course_id

I don't think this should be the permanent fix, as it breaks inheritance (the parent shouldn't know about things that don't apply to all children), but the only other way I could see for now was to reimplement the batch_pull_data callback as a class method on EndPoint that could be overridden in CourseAliases with the updated callback code. That looked like a lot more work though and would require refactoring external references like batch_data. I'm sure there's a better way to do this as well, given I'm not much of a python coder yet.

StudentSubmissions Memory Error

@zkagin Here's a traceback for that memory error I mentioned:

2020-09-26 12:09:40AM UTC | INFO: StudentSubmissions: Generating requests...
2020-09-26 12:09:46AM UTC | INFO: StudentSubmissions: 1923 requests remaining.
2020-09-26 12:09:57AM UTC | INFO: StudentSubmissions: 1630 requests remaining.
2020-09-26 12:10:25AM UTC | INFO: StudentSubmissions: 1382 requests remaining.
2020-09-26 12:10:36AM UTC | INFO: StudentSubmissions: Quota exceeded. Pausing for 20 seconds...
2020-09-26 12:10:56AM UTC | INFO: StudentSubmissions: 1287 requests remaining.
2020-09-26 12:11:30AM UTC | INFO: StudentSubmissions: 1101 requests remaining.
2020-09-26 12:12:01AM UTC | INFO: StudentSubmissions: 874 requests remaining.
2020-09-26 12:12:40AM UTC | INFO: StudentSubmissions: 671 requests remaining.
2020-09-26 12:14:42AM UTC | ERROR: RetryError[<Future at 0x7ff71ff4d250 state=finished raised MemoryError>]
Traceback (most recent call last):
  File "/root/.local/share/virtualenvs/-x-v5uFv0/lib/python3.7/site-packages/tenacity/__init__.py", line 412, in call
    result = fn(*args, **kwargs)
  File "/root/.local/share/virtualenvs/-x-v5uFv0/lib/python3.7/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/root/.local/share/virtualenvs/-x-v5uFv0/lib/python3.7/site-packages/googleapiclient/http.py", line 1528, in execute
    self._execute(http, self._order, self._requests)
  File "/root/.local/share/virtualenvs/-x-v5uFv0/lib/python3.7/site-packages/googleapiclient/http.py", line 1473, in _execute
    parser.feed(for_parser)
  File "/usr/local/lib/python3.7/email/feedparser.py", line 175, in feed
    self._input.push(data)
  File "/usr/local/lib/python3.7/email/feedparser.py", line 110, in push
    parts = self._partial.readlines()
MemoryError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "main.py", line 167, in <module>
    main(Config)
  File "main.py", line 158, in main
    StudentSubmissions(classroom_service, sql, config).batch_pull_data(course_ids)
  File "/google_classroom/timer.py", line 22, in wrapper
    results = func(*args, **kwargs)
  File "/google_classroom/api.py", line 259, in batch_pull_data
    self._execute_batch_with_retry(batch)
  File "/google_classroom/api.py", line 161, in _execute_batch_with_retry
    retryer(batch.execute)
  File "/root/.local/share/virtualenvs/-x-v5uFv0/lib/python3.7/site-packages/tenacity/__init__.py", line 409, in call
    do = self.iter(retry_state=retry_state)
  File "/root/.local/share/virtualenvs/-x-v5uFv0/lib/python3.7/site-packages/tenacity/__init__.py", line 369, in iter
    six.raise_from(retry_exc, fut.exception())
  File "<string>", line 3, in raise_from
tenacity.RetryError: RetryError[<Future at 0x7ff71ff4d250 state=finished raised MemoryError>]

Fix request retries to be per request

Currently with the new method get_and_write_to_db when a request fails, the entire loop starts again. This should happen per request so that if a single request fails only that request retries.

I.E. if the students query loops over 1200 courses and a single course request fails midway it starts back over at course 1. It should just retry that single course request and continue on.

Usage Data fail on first run

@dchess - Was thinking more about the error we looked at today, and (admittedly not having fully immersed myself in the code), I don't think it was due to not having quotes around the date in the .env file.

I think it will always fail if there is no data in the student usage table and that it only worked today because my workaround fed data into the student usage data prior uncommenting the original code.

        usage = StudentUsage(admin_reports_service, sql, config, org_unit_id)
        last_date = usage.get_last_date()
        start_date = last_date + timedelta(days=1) or datetime.strptime(config.SCHOOL_YEAR_START, "%Y-%m-%d")

My quick read of StudentUsage.get_last_date() is that it's going to return None if there's no data previously loaded in the table, at which point the next line will throw an error because you can't add to None.

I had enough energy to type up this issue, but not enough to clear my database and try from scratch again to double check, so hopefully this is useful, and apologies if it's a false alarm.

DB Agnosticism

The current library assumes in a number of places that MSSQL is used as the library. One of the broader goals is to make this library database-agnostic, so that it can be used with everything from MSSQL to SQLite to cloud-based DBs like BigQuery or Redshift.

Add late column to studentSubmissions

There is a late boolean field in the API that would be good to add for simpler flagging of late assignments in analysis. I'm not sure why we didn't add this initially, but it seems like a simple addition.

Optimizing Performance

Right now all calls to the Google Classroom API are done sequentially and individually. As we identify bottlenecks, the goal is to optimize these calls and minimize the time to run this from hours to minutes. This can potentially be done through both batch calling the endpoints and through parallelization of calls.

Further details are pending a performance test.

Add Invitations endpoint

One previously ignored endpoint that could be valuable is the Invitations resource. This could be useful for identifying students who have been sent an invite, but haven't shown up in the Students by course dataset.

Add autodoc documentation

As we begin to near a place of stability and other users want to leverage this connector, it would be useful to include a basic level of documentation. We should consider adding Sphinx autodoc to auto-generate documentation from docstrings.

Adding Tests

The google_classroom library doesn't currently have any tests. The goal of this issue is to add testing coverage for the API processing and saving into a DB. For each call to the API, this would include:

  • Patching the API return and confirming that data is received and processed correctly.
  • Spinning up a fake database and confirming that data is written correctly to the right tables.

Move database inserts to batch level

Right now the database writes are happening as part of the batch request callback. This may be creating performance bottlenecks. One option would be to move this to the batch level and cache the data for writing until the entire batch request is finished.

Student Submissions error

We're also now getting the following error downloading student submissions:

[SQL: INSERT INTO dbo.[GoogleClassroom_StudentSubmissions] ([courseId], [courseWorkId], id, [userId], [creationTime], [updateTime], state, [draftGrade], [assignedGrade], [courseWorkType], [createdTime], [turnedInTimestamp], [returnedTimestamp], [draftMaxPoints], [draftGradeTimestamp], [draftGraderId], [assignedMaxPoints], [assignedGradeTimestamp], [assignedGraderId], late) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (('231301848111', '238611370303', 'Cg4I3LPjrqgEEL-i4vL4Bg', '115842805580895735978', datetime.datetime(2020, 12, 1, 16, 41, 25, 957000), datetime.datetime(2020, 12, 1, 16, 57, 22, 45000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 41, 25, 924000), datetime.datetime(2020, 12, 1, 16, 57, 22, 45000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4I4rfjrqgEEL-i4vL4Bg', '107177606335613004747', datetime.datetime(2020, 12, 1, 16, 41, 18, 113000), datetime.datetime(2020, 12, 1, 16, 55, 44, 996000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 41, 18, 98000), datetime.datetime(2020, 12, 1, 16, 55, 44, 996000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4IneGKr6gEEL-i4vL4Bg', '102514046507660962400', datetime.datetime(2020, 12, 1, 16, 40, 47, 472000), datetime.datetime(2020, 12, 1, 16, 52, 35, 223000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 40, 47, 458000), datetime.datetime(2020, 12, 1, 16, 52, 35, 223000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4I7umKr6gEEL-i4vL4Bg', '105987267483095051848', datetime.datetime(2020, 12, 1, 16, 40, 47, 544000), datetime.datetime(2020, 12, 1, 17, 21, 46, 949000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 40, 47, 432000), datetime.datetime(2020, 12, 1, 17, 21, 46, 948000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4IzIvEr6gEEL-i4vL4Bg', '108912011662447611108', datetime.datetime(2020, 12, 1, 16, 40, 22, 776000), datetime.datetime(2020, 12, 1, 21, 24, 5, 623000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 40, 22, 713000), datetime.datetime(2020, 12, 1, 21, 24, 5, 623000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4IwYj2r6gEEL-i4vL4Bg', '110557822443959198991', datetime.datetime(2020, 12, 1, 16, 40, 30, 982000), datetime.datetime(2020, 12, 1, 16, 58, 13, 592000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 40, 30, 977000), datetime.datetime(2020, 12, 1, 16, 58, 13, 592000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4IzNyas6gEEL-i4vL4Bg', '104140230101238714287', datetime.datetime(2020, 12, 4, 20, 44, 52, 870000), datetime.datetime(2020, 12, 4, 20, 44, 56, 67000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 4, 20, 44, 52, 856000), datetime.datetime(2020, 12, 4, 20, 44, 56, 66000), None, None, None, None, None, None, None, 1), ('231301848111', '238611370303', 'Cg4Imdavs6gEEL-i4vL4Bg', '107722742003146079646', datetime.datetime(2020, 12, 1, 16, 42, 56, 994000), datetime.datetime(2020, 12, 1, 17, 1, 19, 356000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 42, 56, 939000), datetime.datetime(2020, 12, 1, 17, 1, 19, 356000), None, None, None, None, None, None, None, 0) ... displaying 10 of 10000 total bound parameter sets ... ('203613708078', '231661121210', 'Cg4I26-ThtgDELrd0IDfBg', '108281326632597912647', datetime.datetime(2020, 11, 18, 13, 3, 37, 457000), datetime.datetime(2020, 11, 18, 20, 36, 35, 543000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 11, 18, 13, 3, 37, 394000), datetime.datetime(2020, 11, 18, 20, 36, 35, 543000), None, None, None, None, None, None, None, 0), ('203613708078', '231661121210', 'Cg4I_siYh9gDELrd0IDfBg', '104154892059107388602', datetime.datetime(2020, 11, 18, 13, 3, 39, 694000), datetime.datetime(2020, 11, 18, 14, 53, 46, 714000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 11, 18, 13, 3, 39, 635000), datetime.datetime(2020, 11, 18, 14, 53, 46, 713000), None, None, None, None, None, None, None, 0))]
(Background on this error at: http://sqlalche.me/e/13/9h9h)
Traceback (most recent call last):
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1258, in _execute_context
cursor, statement, parameters, context
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\dialects\mssql\pyodbc.py", line 417, in do_executemany
cursor, statement, parameters, context=context
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\default.py", line 590, in do_executemany
cursor.executemany(statement, parameters)
pyodbc.DataError: ('22018', '[22018] [Microsoft][ODBC Driver 13 for SQL Server]Invalid character value for cast specification (0) (SQLParamData)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "main.py", line 171, in
main(Config)
File "main.py", line 162, in main
StudentSubmissions(classroom_service, sql, config).batch_pull_data(course_ids)
File "S:\Projects\google_classroom\timer.py", line 22, in wrapper
results = func(*args, **kwargs)
File "S:\Projects\google_classroom\api.py", line 259, in batch_pull_data
self._write_to_db(df)
File "S:\Projects\google_classroom\api.py", line 96, in write_to_db
self.sql.insert_into(self.table_name, df, chunksize=10000)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlsorcery_init
.py", line 210, in insert_into
dtype=dtype,
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\pandas\core\generic.py", line 2663, in to_sql
method=method,
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\pandas\io\sql.py", line 521, in to_sql
method=method,
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\pandas\io\sql.py", line 1317, in to_sql
table.insert(chunksize, method=method)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\pandas\io\sql.py", line 755, in insert
exec_insert(conn, keys, chunk_iter)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\pandas\io\sql.py", line 669, in _execute_insert
conn.execute(self.table.insert(), data)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1014, in execute
return meth(self, multiparams, params)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\sql\elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1133, in _execute_clauseelement
distilled_params,
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1318, in execute_context
e, statement, parameters, cursor, context
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1512, in handle_dbapi_exception
sqlalchemy_exception, with_traceback=exc_info[2], from
=e
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\util\compat.py", line 178, in raise

raise exception
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\base.py", line 1258, in _execute_context
cursor, statement, parameters, context
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\dialects\mssql\pyodbc.py", line 417, in do_executemany
cursor, statement, parameters, context=context
File "C:\Users\WBartley.virtualenvs\google_classroom-J9HVh9GF\lib\site-packages\sqlalchemy\engine\default.py", line 590, in do_executemany
cursor.executemany(statement, parameters)
sqlalchemy.exc.DataError: (pyodbc.DataError) ('22018', '[22018] [Microsoft][ODBC Driver 13 for SQL Server]Invalid character value for cast specification (0) (SQLParamData)')
[SQL: INSERT INTO dbo.[GoogleClassroom_StudentSubmissions] ([courseId], [courseWorkId], id, [userId], [creationTime], [updateTime], state, [draftGrade], [assignedGrade], [courseWorkType], [createdTime], [turnedInTimestamp], [returnedTimestamp], [draftMaxPoints], [draftGradeTimestamp], [draftGraderId], [assignedMaxPoints], [assignedGradeTimestamp], [assignedGraderId], late) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (('231301848111', '238611370303', 'Cg4I3LPjrqgEEL-i4vL4Bg', '115842805580895735978', datetime.datetime(2020, 12, 1, 16, 41, 25, 957000), datetime.datetime(2020, 12, 1, 16, 57, 22, 45000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 41, 25, 924000), datetime.datetime(2020, 12, 1, 16, 57, 22, 45000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4I4rfjrqgEEL-i4vL4Bg', '107177606335613004747', datetime.datetime(2020, 12, 1, 16, 41, 18, 113000), datetime.datetime(2020, 12, 1, 16, 55, 44, 996000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 41, 18, 98000), datetime.datetime(2020, 12, 1, 16, 55, 44, 996000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4IneGKr6gEEL-i4vL4Bg', '102514046507660962400', datetime.datetime(2020, 12, 1, 16, 40, 47, 472000), datetime.datetime(2020, 12, 1, 16, 52, 35, 223000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 40, 47, 458000), datetime.datetime(2020, 12, 1, 16, 52, 35, 223000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4I7umKr6gEEL-i4vL4Bg', '105987267483095051848', datetime.datetime(2020, 12, 1, 16, 40, 47, 544000), datetime.datetime(2020, 12, 1, 17, 21, 46, 949000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 40, 47, 432000), datetime.datetime(2020, 12, 1, 17, 21, 46, 948000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4IzIvEr6gEEL-i4vL4Bg', '108912011662447611108', datetime.datetime(2020, 12, 1, 16, 40, 22, 776000), datetime.datetime(2020, 12, 1, 21, 24, 5, 623000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 40, 22, 713000), datetime.datetime(2020, 12, 1, 21, 24, 5, 623000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4IwYj2r6gEEL-i4vL4Bg', '110557822443959198991', datetime.datetime(2020, 12, 1, 16, 40, 30, 982000), datetime.datetime(2020, 12, 1, 16, 58, 13, 592000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 40, 30, 977000), datetime.datetime(2020, 12, 1, 16, 58, 13, 592000), None, None, None, None, None, None, None, 0), ('231301848111', '238611370303', 'Cg4IzNyas6gEEL-i4vL4Bg', '104140230101238714287', datetime.datetime(2020, 12, 4, 20, 44, 52, 870000), datetime.datetime(2020, 12, 4, 20, 44, 56, 67000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 4, 20, 44, 52, 856000), datetime.datetime(2020, 12, 4, 20, 44, 56, 66000), None, None, None, None, None, None, None, 1), ('231301848111', '238611370303', 'Cg4Imdavs6gEEL-i4vL4Bg', '107722742003146079646', datetime.datetime(2020, 12, 1, 16, 42, 56, 994000), datetime.datetime(2020, 12, 1, 17, 1, 19, 356000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 12, 1, 16, 42, 56, 939000), datetime.datetime(2020, 12, 1, 17, 1, 19, 356000), None, None, None, None, None, None, None, 0) ... displaying 10 of 10000 total bound parameter sets ... ('203613708078', '231661121210', 'Cg4I26-ThtgDELrd0IDfBg', '108281326632597912647', datetime.datetime(2020, 11, 18, 13, 3, 37, 457000), datetime.datetime(2020, 11, 18, 20, 36, 35, 543000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 11, 18, 13, 3, 37, 394000), datetime.datetime(2020, 11, 18, 20, 36, 35, 543000), None, None, None, None, None, None, None, 0), ('203613708078', '231661121210', 'Cg4I_siYh9gDELrd0IDfBg', '104154892059107388602', datetime.datetime(2020, 11, 18, 13, 3, 39, 694000), datetime.datetime(2020, 11, 18, 14, 53, 46, 714000), 'TURNED_IN', None, None, 'ASSIGNMENT', datetime.datetime(2020, 11, 18, 13, 3, 39, 635000), datetime.datetime(2020, 11, 18, 14, 53, 46, 713000), None, None, None, None, None, None, None, 0))]
(Background on this error at: http://sqlalche.me/e/13/9h9h)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.