
otter-grader's People

Contributors

abhi1345, afspies, aidansan, alexy1234, andersonkimlam, bdsomer, choldgraf, chrispyles, colomax, debbieyuster, drjbarker, edwardlhwang, ericvd-ucb, fperez, fritterhoff, github-actions[bot], ianporada, jinwoopark00, joelostblom, labanyamukhopadhyay, nhudait, nibheis, raymondmengji, ritvik-iyer, scott-yj-yang, sean-morris, trevwilliams107, yanay1, yowsean, yuvipanda


otter-grader's Issues

Customizing grading with multiple hidden tests per question

Is your feature request related to a problem? Please describe.
With multiple hidden tests per question, if a single hidden test failed, the entire hidden score of this question becomes 0.

Describe the solution you'd like
Can you provide a feature to allow for assigning scores proportional to the number of hidden tests passed?
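For illustration, here is a tiny sketch of the kind of proportional scoring being requested (a hypothetical helper, not current Otter behavior):

def hidden_score(points_possible, hidden_results):
    # hidden_results is a list of booleans, one per hidden test case.
    if not hidden_results:
        return points_possible
    return points_possible * sum(hidden_results) / len(hidden_results)

# e.g. 3 of 4 hidden cases passed on a 2-point question -> 1.5 points
print(hidden_score(2, [True, True, True, False]))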

Describe alternatives you've considered
I considered breaking up the question with multiple hidden tests into multiple questions each with a single hidden test, but that's not natural semantically.

Additional context
We're using this for a class at Harvard, which is going on right now...

Compatibility with Gradescope Notebook Preview

Currently otter-grader's output is HTML-formatted, which does not display properly in the code preview window for Jupyter notebooks. Are there any plans to include a fix on the main branch in the future? What would be the best way to make Otter's output compatible? Please share your thoughts.

What it looks like right now:

[screenshot]

How it should look, which would be helpful for TAs grading submissions (or figuring out problems with their autograder setup):

[screenshot]

Also, in the Autograder Output tab under results, the HTML object cannot be read; only the final scores are displayed properly (the same goes for debugging via SSH, where autograder outputs show as "<IPython.core.display.HTML object>").


Thank you!

manually graded questions with custom prompts have superfluous prompts

I ran into issues where manually graded questions with custom prompts had superfluous prompts like this:

Type your answer here, replacing this text.

These prompts appeared even in places where I specified custom prompts.

I found it hard to understand the relevant part of the documentation at https://otter-grader.readthedocs.io/en/stable/otter_assign/python_notebook_format.html. After the text "An example of a manually-graded written question with a custom prompt", there is no code but only a screenshot:

../_images/assign_sample_written_manual_with_prompt.png

I didn't know which code I needed to write in order to achieve the effect illustrated in the screenshot.

I was finally able to figure it out after I carefully reread the part of the documentation that says: "If there is a prompt cell for manually-graded questions (i.e. a cell between the question cell and solution cell), then this prompt is included in the output. If none is present, Otter Assign automatically adds a Markdown cell with the contents Type your answer here, replacing this text.."

My mistake was that I had included the prompt in the cell that also contained the BEGIN QUESTION code. After I split this cell in two, the issue disappeared.

Perhaps it would be helpful if the documentation included a worked-out code example and not just the screenshot, or alternatively, if otter assign were modified so that it doesn't matter whether the prompt is in the same cell as BEGIN QUESTION or in a different one.

Python version: 3.8.5
Otter-Grader version: 1.1.3

Public tests appear as hidden in Gradescope

Describe the bug
When using the otter assign functionality, a code test written using a # TEST cell will appear as a hidden test when grading in Gradescope.

To Reproduce
Steps to reproduce the behavior:

  1. Create a test using a # TEST cell in a dev notebook
  2. Create a student version using otter assign
  3. Upload autograder.zip to Gradescope
  4. Grade an assignment
  5. Result of # TEST cell is classified as hidden in Gradescope

Expected behavior
I would expect such a test to be considered public, not hidden.

Versions
Python 3.8.5
Otter 1.0.0b10

Additional context
There were no hidden tests in the notebook.


list the contents of requirements.txt directly in the metadata of the master notebook

It would be very helpful if I could list the requirements (the contents of requirements.txt) directly in the metadata of the master notebook, so that the master notebook can be a self-contained file and can be moved around without the attached requirements.txt file. This would be useful because Jupyter notebooks don't have great support for uploading/downloading multiple files, and the specific platform on which we're deploying doesn't have the zip command installed.
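Purely for illustration, embedding them might look something like the following excerpt of notebook metadata (the "otter"/"requirements" keys here are assumptions for the sake of the example, not an existing Otter format):

# Hypothetical master-notebook metadata, shown as a Python dict:
metadata = {
    "otter": {
        "requirements": ["numpy", "pandas", "scikit-learn"],
    },
}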

grade_notebook() not recognizing already executed tests

grade_notebook() contains logic for only running tests in tests_glob if they haven't already been run.

At the moment this logic is broken on Gradescope, resulting in all tests being rerun even if they originally passed. This means that the following can now fail

x = 5
grader.check("q1a") # test x == 5
...
x += 2
grader.check("q2a") # test x == 7

The autograder will run all tests against the final global environment, so the value of x it sees is always 7.

I tossed some print statements in (inside the selection logic), and it seems the issue relates to logic that assumes executed tests will have short names; instead, both the already-executed tests and the ones in the glob have full paths.

we ran these: ['/autograder/submission/tests/q1b.py', '/autograder/submission/tests/q1c.py', '/autograder/submission/tests/q2a.py', '/autograder/submission/tests/q3d.py', '/autograder/submission/tests/q4a.py', '/autograder/submission/tests/q4b.py', '/autograder/submission/tests/q4c.py', '/autograder/submission/tests/q4d.py', '/autograder/submission/tests/q4f.py', '/autograder/submission/tests/q4g.py']
We are going to include: /autograder/source/tests/__init__.py
We are going to include: /autograder/source/tests/q1b.py
We are going to include: /autograder/source/tests/q1c.py
We are going to include: /autograder/source/tests/q2a.py
We are going to include: /autograder/source/tests/q3d.py
We are going to include: /autograder/source/tests/q4a.py
We are going to include: /autograder/source/tests/q4b.py
We are going to include: /autograder/source/tests/q4c.py
We are going to include: /autograder/source/tests/q4d.py
We are going to include: /autograder/source/tests/q4f.py
We are going to include: /autograder/source/tests/q4g.py

This seems related to the copying of files between /autograder/source/tests/ and /autograder/submission/tests/. The executed tests are in /autograder/submission/tests/, but the glob remains pointed to /autograder/source/tests/.

Potential solutions include using basename to compare only the test names, or updating the glob passed into grade_notebook to point to `/autograder/submission/tests/`.
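A minimal sketch of the basename idea, using made-up example paths (in the real autograder these would come from the executed-test log and tests_glob):

import os

executed_test_paths = ["/autograder/submission/tests/q1b.py", "/autograder/submission/tests/q1c.py"]
globbed_test_paths = ["/autograder/source/tests/q1b.py", "/autograder/source/tests/q1c.py", "/autograder/source/tests/q2a.py"]

# Compare by basename so tests already executed from the submission directory
# match their counterparts globbed from the source directory.
executed = {os.path.basename(p) for p in executed_test_paths}
still_to_run = [p for p in globbed_test_paths if os.path.basename(p) not in executed]
print(still_to_run)  # only q2a.py remains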

Cannot open via jupyter notebook on windows 10

Steps to replicate on windows 10 after pip installation and docker pull for image:

  1. run jupyter notebook in cmd
  2. open demo notebook

stack trace:
[I 00:19:26.496 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/?token=53992a9a47de2e7a1d1ee95fb62c2c82c8a43e8bc4710ed6
[I 00:19:26.498 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 00:19:26.501 NotebookApp]

Copy/paste this URL into your browser when you connect for the first time,
to login with a token:
    http://localhost:8888/?token=53992a9a47de2e7a1d1ee95fb62c2c82c8a43e8bc4710ed6

[I 00:19:26.951 NotebookApp] Accepting one-time-token-authenticated connection from ::1
[W 00:19:32.496 NotebookApp] Notebook demo.ipynb is not trusted
[E 00:19:34.709 NotebookApp] Unhandled error in API request
Traceback (most recent call last):
File "C:\Program Files\Anaconda3\lib\site-packages\traitlets\traitlets.py", line 526, in get
value = obj._trait_values[self.name]
KeyError: 'loop'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Program Files\Anaconda3\lib\site-packages\notebook\base\handlers.py", line 503, in wrapper
    result = yield gen.maybe_future(method(self, *args, **kwargs))
  File "C:\Program Files\Anaconda3\lib\site-packages\tornado\gen.py", line 1133, in run
    value = future.result()
  File "C:\Program Files\Anaconda3\lib\site-packages\tornado\gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "C:\Program Files\Anaconda3\lib\site-packages\notebook\services\sessions\handlers.py", line 75, in post
    type=mtype))
  File "C:\Program Files\Anaconda3\lib\site-packages\tornado\gen.py", line 1133, in run
    value = future.result()
  File "C:\Program Files\Anaconda3\lib\site-packages\tornado\gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "C:\Program Files\Anaconda3\lib\site-packages\notebook\services\sessions\sessionmanager.py", line 79, in create_session
    kernel_id = yield self.start_kernel_for_session(session_id, path, name, type, kernel_name)
  File "C:\Program Files\Anaconda3\lib\site-packages\tornado\gen.py", line 1133, in run
    value = future.result()
  File "C:\Program Files\Anaconda3\lib\site-packages\tornado\gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "C:\Program Files\Anaconda3\lib\site-packages\notebook\services\sessions\sessionmanager.py", line 92, in start_kernel_for_session
    self.kernel_manager.start_kernel(path=kernel_path, kernel_name=kernel_name)
  File "C:\Program Files\Anaconda3\lib\site-packages\tornado\gen.py", line 1133, in run
    value = future.result()
  File "C:\Program Files\Anaconda3\lib\site-packages\tornado\gen.py", line 326, in wrapper
    yielded = next(result)
  File "C:\Program Files\Anaconda3\lib\site-packages\notebook\services\kernels\kernelmanager.py", line 87, in start_kernel
    super(MappingKernelManager, self).start_kernel(**kwargs)
  File "C:\Program Files\Anaconda3\lib\site-packages\jupyter_client\multikernelmanager.py", line 110, in start_kernel
    km.start_kernel(**kwargs)
  File "C:\Program Files\Anaconda3\lib\site-packages\jupyter_client\manager.py", line 244, in start_kernel
    self.start_restarter()
  File "C:\Program Files\Anaconda3\lib\site-packages\jupyter_client\ioloop\manager.py", line 49, in start_restarter
    kernel_manager=self, loop=self.loop,
  File "C:\Program Files\Anaconda3\lib\site-packages\traitlets\traitlets.py", line 554, in __get__
    return self.get(obj, cls)
  File "C:\Program Files\Anaconda3\lib\site-packages\traitlets\traitlets.py", line 533, in get
    value = self._validate(obj, dynamic_default())
  File "C:\Program Files\Anaconda3\lib\site-packages\traitlets\traitlets.py", line 589, in _validate
    value = self.validate(obj, value)
  File "C:\Program Files\Anaconda3\lib\site-packages\traitlets\traitlets.py", line 1675, in validate
    self.error(obj, value)
  File "C:\Program Files\Anaconda3\lib\site-packages\traitlets\traitlets.py", line 1522, in error
    raise TraitError(e)
traitlets.traitlets.TraitError: The 'loop' trait of an IOLoopKernelManager instance must be a ZMQIOLoop, but a value of class 'tornado.platform.asyncio.AsyncIOMainLoop' (i.e. <tornado.platform.asyncio.AsyncIOMainLoop object at 0x000001B7AD3CFA20>) was specified.

[E 00:19:34.750 NotebookApp] {
"Host": "localhost:8888",
"Connection": "keep-alive",
"Content-Length": "87",
"Accept": "application/json, text/javascript, /; q=0.01",
"Sec-Fetch-Dest": "empty",
"X-Requested-With": "XMLHttpRequest",
"X-Xsrftoken": "2|0523390e|95df01d3ff9669c08e63fc9e8fa03f74|1581665725",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.106 Safari/537.36",
"Content-Type": "application/json",
"Origin": "http://localhost:8888",
"Sec-Fetch-Site": "same-origin",
"Sec-Fetch-Mode": "cors",
"Referer": "http://localhost:8888/notebooks/demo.ipynb",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7",
"Cookie": "_xsrf=2|0523390e|95df01d3ff9669c08e63fc9e8fa03f74|1581665725; username-localhost-8888="2|1:0|10:1581754766|23:username-localhost-8888|44:MjM5YTk3ODRmMDdhNGNhNzljNTc2Y2FkNTNhMDNjYTE=|39ea2353eb332be352794c09b805cb370361575c3b7f2d3c3f7f6e02877da4e9""
}
[E 00:19:34.763 NotebookApp] 500 POST /api/sessions (::1) 176.94ms referer=http://localhost:8888/notebooks/demo.ipynb
Traceback (most recent call last):
File "C:\Program Files\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\Program Files\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Program Files\Anaconda3\lib\site-packages\ipykernel\__main__.py", line 3, in <module>
app.launch_new_instance()
File "C:\Program Files\Anaconda3\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
app.start()
File "C:\Program Files\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 474, in start
ioloop.IOLoop.instance().start()
File "C:\Program Files\Anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 132, in start
self.asyncio_loop.run_forever()
File "C:\Program Files\Anaconda3\lib\asyncio\base_events.py", line 408, in run_forever
raise RuntimeError('This event loop is already running')
RuntimeError: This event loop is already running

Flag for alternative to ellipsis in solution replacement

Is your feature request related to a problem? Please describe.
I'm trying out Otter with R Markdown. By default, solutions are replaced with ..., which in R has a very different meaning than it does in Python. As a consequence, you can't knit the R Markdown until the ellipses are replaced. I want students to be able to knit immediately so that they can continuously knit as they work through the solutions.

Describe the solution you'd like
A flag in otter assign for specifying the replacement text, e.g. in R this could be NULL instead of ....

Describe alternatives you've considered
I can manually search and replace ... with NULL after running otter assign. Alternatively, I can set eval=FALSE globally so that no code chunks are evaluated, which should make the file knit without errors.


otter-assign should clear the output of the student-facing notebook after generating it

I wonder if it would be technically possible to modify otter-assign so that it clears the output of the student-facing notebook after generating it. This would remove the numbers in brackets (e.g. [135]) next to the cells where students are expected to add input.

Is your feature request related to a problem? Please describe.
The student-facing notebook that otter-assign generates contains numbers that indicate that some cells have been run (e.g. [135] next to these cells). It turns out that these numbers are confusing some students who are new to Jupyter.

Describe the solution you'd like
otter-assign should clear the output of the student-facing notebook after generating it.

Describe alternatives you've considered
The alternative consists in manually clearing the output of each student-facing notebook every time after otter-assign is run.

Additional context
Manually clearing the output is time-intensive because it can't be done from the command line and can't be automated, as far as I know.
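For what it's worth, a short script along these lines could automate the clearing after otter assign runs (a sketch assuming nbformat is installed and a hypothetical notebook path; recent nbconvert versions also offer a --clear-output flag that does roughly the same thing from the command line):

import nbformat

# Read the generated student notebook, drop outputs and execution counts, write it back.
nb = nbformat.read("dist/student/lab01.ipynb", as_version=4)
for cell in nb.cells:
    if cell.cell_type == "code":
        cell.outputs = []
        cell.execution_count = None
nbformat.write(nb, "dist/student/lab01.ipynb")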

Execute_notebook assumes test location, incompatible with Gradescope

execute_notebook contains a line that assumes all tests are in /home/tests/, regardless of how the notebook was originally configured. This is incompatible with the Gradescope autograder as Gradescope places all tests in /autograder/source/tests/ and they are then copied to /autograder/submission/tests by the generated run_autograder script.

While the autograder is still able to run because the tests_glob and ignore_errors=True arguments are passed (the latter keeps the missing tests from printing errors), this results in all tests being run at the end of the notebook's execution, in the global namespace.

This would make the following sequence fail as all tests executed will see the value of x as 7.

x = 5
grader.check("q1a") # check x = 5
x = x + 2
grader.check("q1b") # check x = 7

Trying to get this to work on Windows

I have a TA who is trying to get this to work on Windows.

When running from the command line, the otter command doesn't work.

(1) Any advice on getting this to work on windows?

(2) Alternatively, would it be easy to import the module and run the command directly, since I'm already executing from Python?

Thanks in advance
Jason

Gradescope output error

Hi! Gradescope is giving the following error when submitting a notebook:

http://gradescope-autograders.readthedocs.io/en/latest/specs/#output-format
for more details.

Use the "Debug via SSH" button below to debug this issue.


We did not receive any results for this submission.

The autograder produced the following output:

Traceback (most recent call last):
  File "./run_autograder", line 3, in <module>
    from otter.utils import remove_html_in_hint
ImportError: cannot import name 'remove_html_in_hint'

This happened after I upgraded otter-grader to the latest version.

Cell Specification Feature

It would be awesome if the beta release of Otter had an option to specify which code cells to include/exclude when executing a student's notebook submission. For example, maybe the first few cells of the submission are code examples that are irrelevant to the testing of the student's submission.

public tests check results in output "All tests passed" even though hidden tests may have failed

Describe the bug
When students run a check within their notebook, it outputs "All tests passed" whenever all public tests have been passed, even though hidden tests may not have been passed. This confuses students.

To Reproduce
Steps to reproduce the behavior:
As a student, work on any problem in the notebook in a way that satisfies the relevant public tests, and run "grader.check()".

Expected behavior
The output should be configurable by the instructor, and the default output should be "All public tests passed" rather than "All tests passed".

Versions
Python version: 3.8.5
Otter-Grader version: 1.1.3

Additional context

In general, I prefer not to tell students to ignore the output of otter-grader because I want them to be able to rely on it and to pay attention to it.

Sample email by a student:

"I was just submitting my assignment on my gradebook and I noticed I got a 51/70 on the first jupyter assignment. I'm just a little confused because when I submitted the assignment I had like the "all tests passed" thing but it says I was missing stuff in my assignment. Is there something I'm missing in the assignment? What should I do next time to avoid this? "

Otter Grader downloads chromium on gradescope

Though execute_notebook contains logic to mock out PDF generation, I'm still seeing logs indicating that Chromium is being downloaded. While this doesn't break anything, it clutters the logs and probably adds some time to autograder execution.

Perhaps the import chain of otter -> nb2pdf -> nbpdfexport -> ? somehow automatically starts the download

My notebooks look like the following

import otter
grader = otter.Notebook()
grader.export("proj3.ipynb", filtering=False)

Gradescope logs; note that the output relating to Chromium comes before the student output:

[W:pyppeteer.chromium_downloader] start chromium download.
Download may take a few minutes.
[W:pyppeteer.chromium_downloader] 
chromium download done.
[W:pyppeteer.chromium_downloader] chromium extracted to: /root/.local/share/pyppeteer/local-chromium/575458
We filtered out 92% of the entries

Duration: 240.3395219270353 
Speed: 226.90793945018308



  name  score  possible visibility
0  q1b    2.0         2    visible
1  q1c    3.0         3    visible
2  q2a    1.0         1    visible

Otter Assign should run a notebook prior to processing it

Is your feature request related to a problem? Please describe.
I frequently find myself forgetting to run a master notebook prior to running Otter Assign on it. Otter Assign does not emit a warning but produces notebooks that lead to strange behavior when autograding, e.g. tests that are written to expect a boolean are seen as expecting nothing.

Describe the solution you'd like
Otter Assign should run the notebook as a preprocessing step by default. If I don't want Otter Assign to run the notebook (e.g. because I prefer to do it myself), I'd like to be able to specify this in the metadata of the notebook.
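A rough sketch of the kind of preprocessing step being requested, using nbconvert's ExecutePreprocessor (an assumption about how it could be implemented, not existing Otter Assign behavior; "master.ipynb" is a hypothetical path):

import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

# Execute the master notebook in place so all outputs and test results are fresh
# before Otter Assign processes it.
nb = nbformat.read("master.ipynb", as_version=4)
ExecutePreprocessor(timeout=600).preprocess(nb, {"metadata": {"path": "."}})
nbformat.write(nb, "master.ipynb")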

Describe alternatives you've considered
An alternative would be to pass Otter Assign an option on the command line.
Another alternative would be for Otter Assign to emit a warning when it encounters a notebook that has not yet been run, both at the beginning and at the end of its run.

Additional context
I notice that Otter Assign runs the notebook for me multiple times anyway, but only after generating the autograder (perhaps as part of generating the PDF template). Running the notebook one more time should not make a big difference.

Streamline CI tests

Streamline tests, reduce the number of files necessary, etc., so that CI doesn't take so long to run and to reduce the size of the repo.

Typo on documentation for hiding questions on Gradescope

The documentation says:

On submission, students will only be able to see the results of those tests for which test["hidden"] evaluates to True (see Test Files for more info). If test["hidden"] is False or not specified, then test is hidden.

This wording suggests that when a

  • test['hidden'] is True, the question is shown on Gradescope
  • test['hidden'] is False, the question is hidden on Gradescope

However, this is the opposite of Gradescope's behavior (and inconsistent with the wording 'hidden').

Gradescope behavior is that when a

  • test['hidden'] is True, the question is hidden on Gradescope
  • test['hidden'] is False, the question is shown on Gradescope

The Test Files page should also be updated.

Otter grade doesn't recognize declared variables

Otter grade doesn't pick up predefined variables when running *.ipynb files.
Generic errors: NameError: name 'unemployment' is not defined, NameError: name 'boston_under_10' is not defined.
There are many more, but I'll just give two examples here.

First example:
When I run otter grade, I get the following error, even though the unemployment variable is already defined before the unemployment.select('Date', 'NEI', 'NEI-PTER').take(0) cell.

q1_1 > Suite 1 > Case 1

unemployment.select('Date', 'NEI', 'NEI-PTER').take(0)
NameError: name 'unemployment' is not defined

Error: expected

Date | NEI | NEI-PTER

1994-01-01 | 10.0974 | 11.172

but got

Traceback (most recent call last):

...

NameError: name 'unemployment' is not defined

Run only this test case with "python3 ok -q q1_1 --suite 1 --case 1"

Second example:

Running tests


q4_3 > Suite 1 > Case 1

boston_under_10 >= 0 and boston_under_10 <= 100
NameError: name 'boston_under_10' is not defined

Error: expected

True

but got

Traceback (most recent call last):

...

NameError: name 'boston_under_10' is not defined

Run only this test case with "python3 ok -q q4_3 --suite 1 --case 1"

Expected behavior
There should not be any error, since I already declared the variable before using it.
Versions
Otter grader version 1.1.3

need explicit instructions for running otter

The README in the documentation describes how to pull the docker image, but then doesn't say anything about how to run otter!

It might help to have an example set of commands to go with the demo notebook that you have available. I managed to run it using something like this on the test notebook in the repo:

docker rm -f otter ; docker run -it -v $(pwd):/data --name otter ucbdsinfra/otter-grader otter check /data/otter-grader/demo/demo.ipynb

However, that just outputs:

otter
0 of 0 tests passed

Thanks! -Aaron

Comparison with related software, especially nbgrader

It's interesting to see an active project with momentum in this space, but it would be really useful to have a comparison with existing tools. In comparison to nbgrader, it seems like question prep takes a bit more work with Otter, but perhaps can better support cases in which students can approach the problem in different ways. And it looks like Gradescope is a recommended way to write feedback, versus the built-in grading/feedback support in nbgrader. Perhaps you could write a brief comparison on the docs site to make it clear what we'd be getting/giving up in switching to Otter to help prospective users make more informed decisions.

calls to `display()` in notebook fail

I'm executing otter grader on gradescope, and I've had it fail on a notebook where the student made a call to display(). I explicitly set ignore_errors=False as I suspected that swallowing the error was causing the student's cell to fail silently, breaking all of the tests dependent on it. I'm not sure if this behavior is expected or not.

root@place:/autograder# ./run_autograder 
Traceback (most recent call last):
  File "./run_autograder", line 65, in <module>
    scores = grade_notebook(nb_path, tests_glob, name="submission", ignore_errors=False, gradescope=True)
  File "/usr/local/lib/python3.6/dist-packages/otter/grade.py", line 57, in grade_notebook
    global_env = execute_notebook(nb, secret, initial_env, ignore_errors=ignore_errors, gradescope=gradescope)
  File "/usr/local/lib/python3.6/dist-packages/otter/grade.py", line 202, in execute_notebook
    exec(cell_source, global_env)
  File "<string>", line 12, in <module>
NameError: name 'display' is not defined

Student code looks like the below. Note that there is no explicit import of display anywhere in the code.

display(df.head()['date'])
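One likely explanation (my assumption) is that display is injected into the namespace by IPython when cells run interactively, but it is not defined when Otter execs the cell source directly; importing it explicitly in the student code avoids the NameError either way:

from IPython.display import display
import pandas as pd

# Minimal reproduction of the student code with the explicit import added
# (the DataFrame contents are made up for illustration).
df = pd.DataFrame({"date": ["2020-01-01"], "value": [1]})
display(df.head()["date"])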

The requirements.txt that Otter generates for Gradescope:

datascience
jupyter_client
ipykernel
matplotlib
pandas
ipywidgets
scipy
seaborn
sklearn
nb2pdf
tornado==5.1.1
otter-grader==0.4.2
numpy
pandas
seaborn
sklearn
tensorflow

autograder.zip file hangs on build in Gradescope

Describe the bug
The autograder.zip file created by calling otter assign will occasionally hang while building in Gradescope after being uploaded.

To Reproduce
Steps to reproduce the behavior:

  1. Create autograder files using otter assign <filename>.ipynb
  2. Upload autograder.zip to a Gradescope assignment.

Expected behavior
The build should complete.

Versions
Python 3.8.5 and Otter Grader 1.1.2

Additional context
The final Build Output from the Configure Autograder section is as follows:

Setting up xfonts-75dpi (1:1.0.4+nmu1) ...
Setting up xfonts-base (1:1.0.4+nmu1) ...
Setting up wkhtmltox (1:0.12.6-1.bionic) ...
Processing triggers for libc-bin (2.27-3ubuntu1.2) ...
Hit:1 http://archive.ubuntu.com/ubuntu bionic InRelease
Hit:2 http://security.ubuntu.com/ubuntu bionic-security InRelease
Hit:3 http://archive.ubuntu.com/ubuntu bionic-updates InRelease
Hit:4 http://archive.ubuntu.com/ubuntu bionic-backports InRelease
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
build-essential is already the newest version (12.4ubuntu1).
build-essential set to manually installed.
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 libcurl4-gnutls-dev : Conflicts: libcurl4-openssl-dev but 7.58.0-2ubuntu3.10 is to be installed
 libcurl4-openssl-dev : Conflicts: libcurl4-gnutls-dev but 7.58.0-2ubuntu3.10 is to be installed
E: Unable to correct problems, you have held broken packages.
2020-10-21 13:36:54 URL:https://repo.anaconda.com/miniconda/Miniconda3-py37_4.8.3-Linux-x86_64.sh [88867207/88867207] -> "/autograder/source/miniconda_install.sh" [1]
PREFIX=/root/miniconda3
Unpacking payload ...

[Miniconda package extraction progress output trimmed; the log stops partway through extraction with no further output.]

Docker build issue

There is a current issue with one of the dependencies, nb2pdf, that causes the build to error out.

There is a temporary solution here.

If you change the relevant line of the Dockerfile to the one below, it builds OK.

RUN pip3 install git+https://github.com/eldridgejm/nb2pdf

Just putting the solution here in case anyone else runs into it.

Update Python Version for Gradescope

Currently otter's generated setup.sh only uses python 3.6 for grading submissions. Python 3.7 has now been out for 1.5 years, and I've run into some issues from students using functions that are supported in 3.7 but not 3.6, such as fromisoformat().

It would be nice if setup.sh could be updated to use a newer Python on Gradescope (maybe 3.7, since 3.8 support is still lacking in some packages), or if Otter gave an option for which Python to use when generating the autograder.zip.

Original setup.sh:

#!/usr/bin/env bash

apt-get install -y python3 python3-pip

pip3 install -r /autograder/source/requirements.txt

An updated one I used for python 3.7:

#!/usr/bin/env bash

apt-get install -y python3.7 python3-pip

update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 1

pip3 install -r /autograder/source/requirements.txt

Change Otter Grade output format to match Gradescope output

Essentially the idea is to scrap the Otter Grade output format and just copy the same results format used for Gradescope. Their format is a tidy JSON file and these can be easily stitched together into a larger one containing all submissions. This would also allow the grading internals for both to be the same, cutting down on the complexity of the code.

  • change otter.execute.grade_notebook to output in this format? (possibly moving the formatting of the JSON object to another function)
  • edit the run_autograder script to call otter grade directly
  • edit otter.grade.main to convert args to attrdict.AttrDict if it's a dict rather than argparse.Namespace
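For reference, the Gradescope results format looks roughly like the following (field names are from the Gradescope autograder output spec; the values here are illustrative, shown as a Python dict):

import json

results = {
    "score": 6.0,  # optional overall score
    "tests": [
        {"name": "q1b", "score": 2.0, "max_score": 2.0, "visibility": "visible"},
        {"name": "q1c", "score": 3.0, "max_score": 3.0, "visibility": "visible"},
        {"name": "q2a", "score": 1.0, "max_score": 1.0, "visibility": "hidden"},
    ],
}
print(json.dumps(results, indent=2))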

Indexing error for hidden test when running autograder file in student environment

Describe the bug
In R Markdown, I'd like to test the solution file (from the autograder folder) in the student environment. When there is only a hidden test for a question, the student version of the test has an empty cases vector, e.g. cases: []. However, the autograder solution file still includes an ottr::check cell for that question, leading to the following error:

Error in suite_results[[1]] : subscript out of bounds

This can easily be fixed by removing ottr::check from the autograder file, but if there are many instances of this issue, it might be a pain. It would be cleaner if a test with no cases automatically passed without error.

To Reproduce
Steps to reproduce the behavior:

  1. Create a question with one hidden test and no visible tests.
  2. Run otter assign
  3. Move the Rmd from the autograder folder into the student folder and run.
  4. See error

Expected behavior
The autograder Rmd file runs without error when evaluated in the student folder.

Versions
Please provide your Python and Otter versions. The Otter version can be obtained using from otter import __version__.


New test file format - using testbook?

Just wanted to put this on your radar: it's a new tool out of the nteract project called Testbook:

https://github.com/nteract/testbook

I think the basic idea is to have a first-class experience for testing notebooks. I know that Otter has its own testing spec / process right now, but wanted to let you know about this in case it could be useful or worth following.

otter-assign should warn when either too many or too few modules are specified in requirements

Is your feature request related to a problem? Please describe.
After using otter-assign for about six weeks, the need to explicitly list the names of certain python modules in a requirements.txt file has emerged as one of the most frustrating sources of hard-to-track bugs, and as a time-intensive step in the workflow.

To avoid bugs, every time one adds an "import" statement to a notebook, one must remember to check the relevant library against three lists:

  1. the list of standard python libraries at https://docs.python.org/3/py-modindex.html; this list may change from python version to python version
  2. the list of modules in the default requirements at https://otter-grader.readthedocs.io/en/latest/otter_generate/container_image.html#requirements-txt; this list may change from otter version to otter version;
  3. a file requirements.txt that must be passed to otter-assign as an argument; this list typically changes from notebook to notebook.

For a given combination of python, otter, and notebook, if the library occurs neither in 1) nor in 2), it must occur in 3). If the library occurs in 1) or in 2) or both, it must not occur in 3). Thus list 3) must contain neither too many nor too few modules.

Otter-assign will not complain about deviations from this rule, but will produce an autograder that silently fails while grading, with no straightforward indication of the problem.

Describe the solution you'd like
Otter-assign should prominently warn about deviations from this rule in a preprocessing step, should repeat the warning at the end of its run, and should not produce an autograder.

Describe alternatives you've considered
An alternative would be for otter-assign to figure out the contents of list 3) automatically.
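A rough sketch of how that automatic detection could work (an assumption, not an existing Otter feature): collect the top-level module names imported by the notebook's code cells, then compare them against the standard library, Otter's default requirements, and the user-supplied requirements.txt.

import ast
import json

def imported_modules(nb_path):
    # Return the top-level module names imported by a notebook's code cells.
    with open(nb_path) as f:
        nb = json.load(f)
    modules = set()
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue
        try:
            tree = ast.parse("".join(cell["source"]))
        except SyntaxError:
            continue  # e.g. cells containing IPython magics
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                modules.add(node.module.split(".")[0])
    return sorted(modules)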

Additional context
The TAs and I have lost hours to this bug and have had to bother Chris several times to help us track it down.
