dmoj / judge-server

Judging backend server for the DMOJ online judge.

License: GNU Affero General Public License v3.0

Python 51.65% C 1.47% C++ 42.63% Dockerfile 0.21% Makefile 0.06% Shell 0.13% Java 0.99% Pascal 0.03% Go 0.04% C# 0.08% Assembly 0.04% Roff 0.01% Cython 2.12% Rust 0.05% Brainfuck 0.51%

online-judge sandbox competitive-programming computer-science linux freebsd contest-platform python

judge-server's Introduction

DMOJ Judge

Contest judge backend for the DMOJ site interface, supporting IO-based, interactive, and signature-graded tasks, with runtime data generators and custom output validators.

See it in action at dmoj.ca!

Supported platforms and runtimes

The judge implements secure grading on Linux and FreeBSD machines.

Architecture	Linux	FreeBSD
x64	✔	✔
x86	✔	¯\_(ツ)_/¯
x32	✔	—
ARM	✔	❌

Versions up to and including v1.4.0 also supported grading on Windows machines.

Versions up to and including v3.0.2 also supported grading with pure ptrace without seccomp, which is useful on Linux kernel versions before 4.8.

The DMOJ judge does not need a root user to run on Linux machines: it will run just fine under a normal user.

Supported languages include:

C++ 11/14/17/20 (GCC and Clang)
C 99/11
Java 8-19
Python 2/3
PyPy 2/3
Pascal
Mono C#/F#/VB

The judge can also grade in the languages listed below:

Ada
AWK
Brain****
COBOL
D
Dart
Forth
Fortran
Go
Groovy
Haskell
INTERCAL
JavaScript (Node.js and V8)
Kotlin
Lean 4
LLVM IR
Lua
NASM
Objective-C
OCaml
Perl
PHP
Pike
Prolog
Racket
Ruby
Rust
Scala
Chicken Scheme
sed
Steel Bank Common Lisp
Swift
Tcl
Turing
V8 JavaScript
Zig

Installation

Installing the DMOJ judge creates two executables in your Python's script directory: dmoj and dmoj-cli. dmoj is used to connect a judge to a DMOJ site instance, while dmoj-cli provides a command-line interface to a local judge, useful for testing problems.

For more detailed steps, read the installation instructions.

Note that the only Linux distribution with first-class support is the latest Debian, with the default apt versions of all runtimes. This is what we run on dmoj.ca, and it should "just work". While the judge will likely still work with other distributions and runtime versions, some runtimes might fail to initialize. In these cases, please file an issue.

Stable build

We periodically publish builds on PyPI. This is the easiest way to get started, but may not contain all the latest features and improvements.

$ pip install dmoj

Bleeding-edge build

This is the version of the codebase we run live on dmoj.ca.

$ git clone --recursive https://github.com/DMOJ/judge-server.git
$ cd judge-server
$ pip install -e .

Several environment variables can be specified to control the compilation of the sandbox:

DMOJ_TARGET_ARCH: overrides the default architecture passed via -march when compiling the sandbox. The default is usually native. On ARM, -march is not specified unless DMOJ_TARGET_ARCH is set, and a generic, slower build is compiled instead.
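The selection rule described above could be sketched roughly as follows. This is not the judge's actual build code: the function name march_flags and the way ARM is detected (by the machine name prefix) are assumptions made for illustration.

```python
import os
import platform


def march_flags(environ=None, machine=None):
    """Pick the -march flag for the sandbox build (illustrative sketch only)."""
    environ = os.environ if environ is None else environ
    machine = platform.machine() if machine is None else machine

    target = environ.get('DMOJ_TARGET_ARCH')
    if target:
        # An explicit override always wins.
        return ['-march=' + target]
    if machine.lower().startswith(('arm', 'aarch64')):
        # Assumed ARM detection: no -march at all, giving a generic build.
        return []
    return ['-march=native']
```

For example, march_flags({'DMOJ_TARGET_ARCH': 'haswell'}, 'x86_64') yields ['-march=haswell'].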

With Docker

We maintain Docker images with all runtimes we support in the runtimes-docker project.

Runtimes are split into three tiers of decreasing support. Tier 1 includes Python 2/3, C/C++ (GCC only), Java 8, and Pascal. Tier 3 contains all the runtimes we run on dmoj.ca. Tier 2 contains some in-between mix; read the Dockerfile for each tier for details. These images are rebuilt and tested every week to contain the latest runtime versions.

The script below spawns a tier 1 judge image. It expects the relevant environment variables to be set, the network device to be enp1s0, problems to be placed under /mnt/problems, and judge-specific configuration to be in /mnt/problems/judge.yml. Note that runtime configuration is already done for you, and will be merged automatically into the judge.yml provided.

$ git clone --recursive https://github.com/DMOJ/judge-server.git
$ cd judge-server/.docker
$ make judge-tier1
$ exec docker run \
    --name judge \
    -p "$(ip addr show dev enp1s0 | perl -ne 'm@inet (.*)/.*@ and print$1 and exit')":9998:9998 \
    -v /mnt/problems:/problems \
    --cap-add=SYS_PTRACE \
    -d \
    --restart=always \
    dmoj/judge-tier1:latest \
    run -p15001 -s -c /problems/judge.yml \
    "$BRIDGE_ADDRESS" "$JUDGE_NAME" "$JUDGE_KEY"

Usage

Running a judge server

$ dmoj --help
usage: dmoj [-h] [-p SERVER_PORT] -c CONFIG [-l LOG_FILE] [--no-watchdog]
            [-a API_PORT] [-A API_HOST] [-s] [-k] [-T TRUSTED_CERTIFICATES]
            [-e ONLY_EXECUTORS | -x EXCLUDE_EXECUTORS] [--no-ansi]
            server_host [judge_name] [judge_key]

Spawns a judge for a submission server.

positional arguments:
  server_host           host to connect for the server
  judge_name            judge name (overrides configuration)
  judge_key             judge key (overrides configuration)

optional arguments:
  -h, --help            show this help message and exit
  -p SERVER_PORT, --server-port SERVER_PORT
                        port to connect for the server
  -c CONFIG, --config CONFIG
                        file to load judge configurations from
  -l LOG_FILE, --log-file LOG_FILE
                        log file to use
  --no-watchdog         disable use of watchdog on problem directories
  -a API_PORT, --api-port API_PORT
                        port to listen for the judge API (do not expose to
                        public, security is left as an exercise for the
                        reverse proxy)
  -A API_HOST, --api-host API_HOST
                        IPv4 address to listen for judge API
  -s, --secure          connect to server via TLS
  -k, --no-certificate-check
                        do not check TLS certificate
  -T TRUSTED_CERTIFICATES, --trusted-certificates TRUSTED_CERTIFICATES
                        use trusted certificate file instead of system
  -e ONLY_EXECUTORS, --only-executors ONLY_EXECUTORS
                        only listed executors will be loaded (comma-separated)
  -x EXCLUDE_EXECUTORS, --exclude-executors EXCLUDE_EXECUTORS
                        prevent listed executors from loading (comma-
                        separated)
  --no-ansi             disable ANSI output
  --skip-self-test      skip executor self-tests

Running a CLI judge

$ dmoj-cli --help
usage: dmoj-cli [-h] -c CONFIG
                [-e ONLY_EXECUTORS | -x EXCLUDE_EXECUTORS]
                [--no-ansi]

Spawns a judge for a submission server.

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        file to load judge configurations from
  -e ONLY_EXECUTORS, --only-executors ONLY_EXECUTORS
                        only listed executors will be loaded (comma-separated)
  -x EXCLUDE_EXECUTORS, --exclude-executors EXCLUDE_EXECUTORS
                        prevent listed executors from loading (comma-
                        separated)
  --no-ansi             disable ANSI output
  --skip-self-test      skip executor self-tests

Documentation

For info on the problem file format and more, read the documentation.

judge-server's Issues

Connection refused by host on Windows

error: [Errno 10054] An existing connection was forcibly closed by the remote host

`cptbox` improvements.

Add an option to track only process creation and exit (PTRACE_O_TRACE*).
A SecurePopen-compatible class that logs only these events. Useful, for example, to track languages with a built-in sandbox without a performance penalty.
Port cptbox to more platforms, most notably Linux x32, to use 64-bit powers with only 32-bit pointers.

Signature grading for Pascal

Should be possible to export a C grader to Pascal.

System.exit for Java

Since the user submission runs in the same process as the judge executor, calls to System.exit are blocked so that the submission cannot exit the grading JVM.

Should be trivial to handle the exit permission request and kill the submission thread instead of throwing an AccessControlException.

LANG=C for compilers

Would prevent awesome havoc when the system language is set to something like zh_CN. GCC would be more than happy to give translated messages, puzzling everyone.

Note: Herculaneum uses Unicode smart quotes in error messages exactly for this reason.
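A minimal sketch of the idea, assuming compilers are spawned through Python's subprocess module. The helper names compiler_env and run_compiler are hypothetical and not part of the judge. Pinning LC_ALL as well as LANG is an extra precaution, since LC_ALL takes precedence over LANG.

```python
import os
import subprocess


def compiler_env(base=None):
    """Return a copy of the environment with the locale forced to C."""
    env = dict(os.environ if base is None else base)
    env['LANG'] = 'C'
    # LC_ALL overrides LANG and every other LC_* variable, so pin it too.
    env['LC_ALL'] = 'C'
    return env


def run_compiler(args):
    """Run a compiler command with untranslated diagnostics (hypothetical helper)."""
    return subprocess.run(args, env=compiler_env(), capture_output=True, text=True)
```

The rest of the environment (PATH and so on) is passed through unchanged, so only the message language is affected.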

Checkers take longer than submission

For linked list, the checker takes longer than the submission itself to check the data.

Some profile data:

name                                  ncall  tsub      ttot      tavg
..judge/checkers/standard.py:1 check  25     20.60029  39.94269  1.597708

The submission itself takes 7.69s. This is clearly not acceptable. I suggest we start porting checkers to C.

Apparently dependency on C++11 being available

Traceback (most recent call last):
  File "/var/lib/openshift/532715544382ec40ff00017a/app-root/data/judge/judge.py", line 269, in _begin_grading
    short_circuit=short_circuit, source_code=original_source, interactive=('grader' in init_data)):
  File "/var/lib/openshift/532715544382ec40ff00017a/app-root/data/judge/judge.py", line 566, in run
    topen = self._resolve_open_call(init_data, problem_id, forward_test_cases)
  File "/var/lib/openshift/532715544382ec40ff00017a/app-root/data/judge/judge.py", line 356, in _resolve_open_call
    generator_launcher = clazz.Executor('%s-generator' % problem_id, generator_source).launch_unsafe
  File "/var/lib/openshift/532715544382ec40ff00017a/app-root/data/judge/executors/GCCExecutor.py", line 57, in __init__
    raise CompileError(compile_error)
CompileError: cc1plus: error: unrecognized command line option "-std=c++11"

Unable to use from future imports with python executor

Currently, we prepend a few lines to the submission for optimization. This has disastrous results: from __future__ import <something> no longer works, because such imports must appear at the top of the file.

We should use an alternative mechanism to make that work.
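One possible mechanism, sketched here under assumptions: parse the submission and insert the extra lines after the last from __future__ import instead of at the top. The function name insert_prelude is hypothetical. The sketch relies on Python 3.8+ (for end_lineno) and on the submission parsing under the interpreter that runs the judge, so a caller would need a fallback for sources that raise SyntaxError.

```python
import ast


def insert_prelude(source, prelude):
    """Insert `prelude` after the last `from __future__` import in `source`.

    If the source has no such import, the prelude goes at the very top.
    """
    tree = ast.parse(source)
    insert_at = 0  # number of source lines that must stay above the prelude
    for node in tree.body:
        if isinstance(node, ast.ImportFrom) and node.module == '__future__':
            insert_at = max(insert_at, node.end_lineno)

    lines = source.splitlines(keepends=True)
    head, tail = lines[:insert_at], lines[insert_at:]
    if head and not head[-1].endswith('\n'):
        head[-1] += '\n'
    return ''.join(head) + prelude.rstrip('\n') + '\n' + ''.join(tail)
```

This does not handle a future import that shares its line with other statements.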

Haskell Optimization

I don't think Haskell "A Plus B" program should take 2.5 seconds...

Also, according to this documentation, -O produced reasonable code at a reasonable speed, while "At the moment, -O2 is unlikely to produce better code than -O."

How to run this judge on Windows?

Hello, I want to try out this judge on Windows. How do I use it? Any documentation would be helpful. 👍

Interactive grader should take strings

Well you see, normal grader takes string to safe_communicate, checkers also take strings, but these interactive graders, take umm, cStringIO.StringIO???

Clojure support

Add all the JVM languages!

Objective-C support

Self-test should test output echoing

Errno 10013 after abnormal termination on Windows

socket.error: [Errno 10013] An attempt was made to access a socket in a way forbidden by its access permissions

After exiting from the issue documented in #5, Windows judges are not able to connect again.

Option to run system checks without sandbox

For DMOJ/hephaestus to work, we have to have the option of running all the executors without the sandbox. Ideally this would mean migrating most duplicate executor features into ResourceProxy and renamed BaseExecutor. This BaseExecutor would then be able to expose the basic features as seen fit.

Unicode toggling for Java executor

To speed up IO in Java, the executor uses an ASCII-encoded stream. This has the effect of making Unicode-dependent problems such as http://www.dmoj.ca/problem/denoun2 painful to solve in Java.

The stream encoding should be configurable per-problem.

Uncleaned wbox accounts

When wbox crashes, its generated accounts are never cleaned. They should be cleaned the next time the judge boots.

C:\Users\Tudor>net user

User accounts for \\BERLIN

-------------------------------------------------------------------------------
Administrator            Guest                    Tudor
-----------              wbox0000000000c6afaf     wbox0000000001871be0
wbox00000000018a93b8     wbox000000000191d021     wbox0000000001aa8ccc
wbox0000000001b196a0     wbox000000000ace72e2     wbox000000000b23e734
wbox000000000b296475     wbox000000000b324dc7     wbox000000000ba9c9f5
wbox000000000bb804e8     wbox000000000bbb3434
The command completed successfully.
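A boot-time cleanup could start from something like the sketch below. It assumes that leftover accounts always look like the ones in the listing above ("wbox" followed by 16 hexadecimal digits), which is inferred from that output rather than from the wbox source. The function names are hypothetical. The commands are only built here, not executed.

```python
import re

# Account names as they appear in the listing: "wbox" plus 16 hex digits.
WBOX_ACCOUNT = re.compile(r'\bwbox[0-9a-f]{16}\b')


def stale_wbox_accounts(net_user_output):
    """Return the wbox account names found in `net user` output."""
    return WBOX_ACCOUNT.findall(net_user_output)


def cleanup_commands(net_user_output):
    """Build one `net user <name> /delete` command per leftover account."""
    return [['net', 'user', name, '/delete']
            for name in stale_wbox_accounts(net_user_output)]
```

Ordinary accounts such as Administrator or Guest do not match the pattern and are left alone.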

Scala support

We already have a Java executor.

More detailed "No public class" message

Is most confusing for newcoming Java users as it is.

Fully class-based executor system

The current executor system does not take advantage of inheritance and modularization. This should be changed so that a small change doesn't require touching every executor, e.g. #47, #34.

JVM sandbox refactor

Following the .NET sandbox example, the JVM sandbox should have a similar interface to ease the addition of #37, #38, #39, #41.

Include racket as supported language

Waterloo's first-year CS courses teach Racket exclusively...

Obtuse docs lead to magical status codes

The Java submission:

public class Clazz {
    public static void main(String[] argv) {
    }

    static {
        String x = "x";
        for(int i = 0; i < 1000000; i++) x += x;
    }
}

Will not produce an MLE status code as expected, but rather an IR. The offending line? https://github.com/DMOJ/judge/blob/master/java_executor/src/ca/dmoj/java/SubmissionThread.java#L27

But why? Clearly the OutOfMemoryError is handled. The reason is quite illegit. Let's take a look at the docs for java.lang.reflect.Method#invoke (emphasis mine, cruft removed).

If the underlying method is static, the class that declared the method is initialized if it has not already been initialized.

Throws:

InvocationTargetException - if the underlying method throws an exception.

ExceptionInInitializerError - if the initialization provoked by this method fails.

Now, at this time, the class that declared the method is not initialized. So we're not given an InvocationTargetException when the static block runs out of heap space. Easy though, right? Just wrap the invocation to catch ExceptionInInitializerError, and check if the cause is an OutOfMemoryError. No.

In my experience with the Java language and Java documentation, the terms "Exception" and "Error" are used quite interchangeably. InvocationTargetException states that it will be thrown "if the underlying method throws an exception", but this is not true. It will be thrown if the underlying method throws any Throwable - indeed, the entire Java executor relies on this. It's therefore only logical to assume that ExceptionInInitializerError would act the same. It doesn't.

Unlike what you might expect, it only gets thrown if the initialization throws an Exception. Not a Throwable, as you might believe - an Exception. Hence, "correct" code would wrap the invocation call with a try/catch block for OutOfMemoryError, which, by virtue of being an Error, will not be wrapped in ExceptionInInitializerError.

Thanks, Java.

This rant issue took less time to write than this bug took to identify.

Configurable csbox.exe and java_executor.jar location

No reason why it should be fixed into the file, as it makes DMOJ/hephaestus rather painful.

"\w+\.exe has stopped working" on OLE on Windows

Using checker standard
judge-connect.dmoj.ca:15000 => {
    "name": "grading-begin",
    "submission-id": 52896
}
OLE: stdout

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

Judging statistics

Would be really nice to be able to measure submission count and average submission judging time for showing on http://status.dmoj.ca/
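The two numbers asked for here could be collected with a small accumulator like the following sketch. The class name JudgingStats and its methods are hypothetical. How the figures would reach the status page is left open.

```python
import time


class JudgingStats:
    """Running count and mean judging time of graded submissions."""

    def __init__(self):
        self.count = 0
        self.total_seconds = 0.0

    def record(self, seconds):
        """Account for one submission that took `seconds` to judge."""
        self.count += 1
        self.total_seconds += seconds

    @property
    def average_seconds(self):
        return self.total_seconds / self.count if self.count else 0.0

    def timed(self, grade, *args, **kwargs):
        """Call `grade` and record how long it took, even if it raises."""
        start = time.monotonic()
        try:
            return grade(*args, **kwargs)
        finally:
            self.record(time.monotonic() - start)
```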

YAML judge configuration

judge.json is painful because it's JSON according to @Xyene and I don't disagree.

Protocol configurability

Or something along those lines. TCP packets in async node.js are just too painful.

Multiple judge support

On powerful computers it may be beneficial to support running more than one judge without going through the pain of copying the judge directory. Simply pass a path to a judge JSON model as an optional command-line parameter, and default to /data/judge/judge.json for compatibility with judges that are already set up.
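A sketch of the proposed command line, using argparse from the standard library. The function name parse_args and the argument name config are illustrative only.

```python
import argparse

# Existing location, kept as the default for judges that are already set up.
DEFAULT_CONFIG = '/data/judge/judge.json'


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description='Start a judge (sketch).')
    # Optional positional argument: when omitted, fall back to the default.
    parser.add_argument('config', nargs='?', default=DEFAULT_CONFIG,
                        help='path to the judge JSON configuration')
    return parser.parse_args(argv)
```

Two judges on one machine would then be started with two different paths, while an invocation without arguments behaves as before.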

IE and sending emails on missing test case file

Don't just silently terminate grading and pretend like things will be fine.

Abysmal error handling

The judge servers were written to "chug along", but this has a terrible effect on error messages.

The following traceback is generated when a user misconfigures a problem:

Traceback (most recent call last):

  File "/code/site/judge/bridge/judgehandler.py", line 199, in on_bad_problem
    raise ValueError('\n\n' + packet['message'])

ValueError:

Traceback (most recent call last):
  File "/judge/judge/judge.py", line 185, in _begin_grading
    with open(os.path.join(get_problem_root(problem_id), 'init.json'), 'r') as init_file:
  File "/usr/lib/python2.7/posixpath.py", line 77, in join
    elif path == '' or path.endswith('/'):
AttributeError: 'NoneType' object has no attribute 'endswith'

Intuitive, right? Very helpful to those who are not familiar with the judge's source code.
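As a sketch of what a friendlier failure could look like, the lookup can be checked before the path is joined. The names ProblemConfigError and init_file_path are hypothetical. get_problem_root is passed in as a parameter and is assumed, as in the traceback above, to return None for an unknown problem.

```python
import os


class ProblemConfigError(Exception):
    """Raised when a problem cannot be loaded (hypothetical name)."""


def init_file_path(problem_id, get_problem_root):
    """Resolve a problem's init.json, failing with a readable message."""
    root = get_problem_root(problem_id)
    if root is None:
        raise ProblemConfigError(
            'problem %r not found in any problem directory' % (problem_id,))
    path = os.path.join(root, 'init.json')
    if not os.path.isfile(path):
        raise ProblemConfigError(
            'problem %r has no init.json (expected at %s)' % (problem_id, path))
    return path
```

The message names the problem and the missing file, so the person who misconfigured it does not need to read the judge's source.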

Executable size limit

We can set RLIMIT_FSIZE to limit the maximum file size a compiler can possibly create. However, setting this limit in the judge process is not a trivial matter. A clean solution requires a script which sets the limit and execves the compiler, which would require a utility function to spawn compilers as this only works on *nix. The helper might also need to catch SIGXFSZ exit and raise compile error as appropriate.

Needless to say, this has no priority.
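For reference, here is a sketch of a different route from the wrapper script described above: Python's subprocess module can run a callable in the child between fork() and exec(), which lowers the limit for the compiler only. This is *nix-only, and the Python documentation warns that preexec_fn is not safe in multi-threaded programs, so treat it as an illustration rather than a recommendation. The function name is hypothetical.

```python
import resource
import subprocess


def run_with_file_size_limit(args, max_bytes):
    """Run `args` with RLIMIT_FSIZE lowered in the child only (*nix only)."""

    def limit_file_size():
        # Runs in the child after fork(); the judge's own limits stay untouched.
        resource.setrlimit(resource.RLIMIT_FSIZE, (max_bytes, max_bytes))

    return subprocess.run(args, preexec_fn=limit_file_size, capture_output=True)
```

A child killed by SIGXFSZ shows up with a returncode of -signal.SIGXFSZ, which the caller could translate into a compile error.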

Class-based GCC executor

A class-based approach is fully modifiable by subclasses instead of making the make_executor function take more and more arguments.
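A rough sketch of what that could look like. All class and attribute names here are hypothetical and do not reflect the judge's actual executor API. The point is only that a variant overrides an attribute instead of adding another argument to a factory function.

```python
class CompiledExecutor:
    """Hypothetical base class: subclasses override attributes, not arguments."""
    command = 'gcc'
    ext = '.c'
    std = None
    flags = ['-O2']

    def get_flags(self):
        flags = list(self.flags)
        if self.std:
            flags.append('-std=' + self.std)
        return flags

    def get_compile_args(self, source_path, output_path):
        return [self.command] + self.get_flags() + [source_path, '-o', output_path]


class CPP11Executor(CompiledExecutor):
    command = 'g++'
    ext = '.cpp'
    std = 'c++11'


class ClangCPP11Executor(CPP11Executor):
    # Only the compiler binary changes; everything else is inherited.
    command = 'clang++'
```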

IE when submission ignores stdin data.

When writing to the child's stdin on Windows while the child ignores all of the input, IOError: [Errno 22] Invalid argument is raised when the buffer is flushed. This happens on write or close.

Sample tracebacks:

Traceback (most recent call last):
  File "C:\Users\Quantum\Desktop\judge\judge.py", line 281, in _begin_grading
    interactive=('grader' in init_data)):
  File "C:\Users\Quantum\Desktop\judge\judge.py", line 664, in run
    result.proc_output, error = communicate(input_data, outlimit=25165824, errlimit=1048576)
  File "C:\Users\Quantum\Desktop\judge\communicate.py", line 53, in safe_communicate
    proc.stdin.write(input)
IOError: [Errno 22] Invalid argument

Traceback (most recent call last):
  File "C:\Users\Quantum\Desktop\judge\judge.py", line 281, in _begin_grading
    interactive=('grader' in init_data)):
  File "C:\Users\Quantum\Desktop\judge\judge.py", line 664, in run
    result.proc_output, error = communicate(input_data, outlimit=25165824, errlimit=1048576)
  File "C:\Users\Quantum\Desktop\judge\communicate.py", line 63, in safe_communicate
    proc.stdin.close()
IOError: [Errno 22] Invalid argument
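One way to tolerate this, sketched for Python 3 (where IOError is an alias of OSError), is to treat a child that stopped reading as a normal outcome. The function name feed_stdin is hypothetical. The choice to swallow EINVAL alongside EPIPE follows the Windows behaviour reported above.

```python
import errno
import subprocess

# EPIPE on *nix, EINVAL on Windows: the child closed or ignored its stdin.
IGNORED_ERRNOS = (errno.EPIPE, errno.EINVAL)


def feed_stdin(proc, data):
    """Write `data` to the child's stdin; return False if the child stopped reading."""
    delivered = True
    try:
        proc.stdin.write(data)
    except OSError as e:
        if e.errno not in IGNORED_ERRNOS:
            raise
        delivered = False
    try:
        proc.stdin.close()  # close() flushes, so it can fail the same way
    except OSError as e:
        if e.errno not in IGNORED_ERRNOS:
            raise
        delivered = False
    return delivered
```

Any other error is still raised, so genuine I/O failures are not hidden.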

Erlang support

Groovy support

Abortion request not honoured when running generator

Generators should stop executing automatically when grading is aborted, or at least stop before spawning the next process to generate the next file. Similar behaviour for compilers would be nice to have.
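The weaker form of the request (stop before spawning the next process) can be sketched with a threading.Event that the abort handler sets. The names GenerationAborted and generate_all are hypothetical, and generate_one stands in for whatever actually runs the generator for one case.

```python
import threading


class GenerationAborted(Exception):
    """Raised when an abort request arrives mid-generation (hypothetical name)."""


def generate_all(case_ids, generate_one, abort_event):
    """Generate each case in order, checking for an abort before every spawn."""
    generated = []
    for case_id in case_ids:
        if abort_event.is_set():
            raise GenerationAborted('aborted before generating %r' % (case_id,))
        generated.append(generate_one(case_id))
    return generated
```

A generator process that is already running is not interrupted by this sketch; killing it promptly would need the process handle as well.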

Dart support

Because Dart is just so cool.

Unbuffered I/O for interactive problems

A certain online judge doesn't require flushing for interactive problems. A ~~wild guess~~ scientific analysis suggests that they present pty devices for stdin and stdout. This can be arranged on *nix through the use of the pty module's openpty() in place of os.pipe().
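A minimal sketch of the stdout half of this idea on *nix, assuming the submission is started with subprocess. The function name is hypothetical, and stdin is left as an ordinary pipe here for brevity.

```python
import os
import pty
import subprocess


def spawn_with_pty_stdout(args):
    """Start `args` with stdout attached to a pty; return (process, master_fd)."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=slave_fd)
    finally:
        os.close(slave_fd)  # the child keeps its own copy of the slave end
    return proc, master_fd
```

The child sees a terminal on stdout, so C stdio and similar runtimes switch to line buffering, and the grader reads the output from master_fd. Note that a pty by default translates "\n" to "\r\n" on output, which the reader has to account for.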

Judge API

Importing classes from judge.py for use in interactive graders/custom checkers is nasty and ultimately depends highly on the current judge version, forcing any future changes to the internals to be signature-compatible.

An API could easily resolve this.

Local grading server UI

Would be highly useful for problemsetters without installing the DMOJ Django site. Could be written as a lightweight Java applet that implements a minimal grading UI.

It would reduce the amount of effort that needs to go into organizing contests, since data would not have to be uploaded for every try. Would also be useful for allowing this judge to be used as a standalone program for private training (there are more stars on this repo than the site repo, which might be indicative that some potential users are interested in private graders).

Uncleaned wbox firewall rules

Same as #19, but for firewall rules.

Clang

According to some StackOverflow post, Clang is recommended for students because its "extremely clear diagnostics are definitely easier for beginners to interpret". The idea is popular enough that GCC tries to debunk it as a myth.

It might be worth adding Clang as an alternative C/C++ frontend.

JavaScript Nashorn support

Should this be added in place of Rhino? It ships with Java 8.

@quantum5

Generator cache

Some generators take prohibitively long times to generate testdata, and in a contest scenario it would be beneficial to have a cache of generated data so that subsequent submissions run faster. The cache would likely expire after a few minutes, to free resources.
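An in-memory sketch of such a cache follows. The class name ExpiringCache, the key layout and the injectable clock are illustrative choices, and a real implementation would probably store large test data on disk rather than in a dictionary.

```python
import time


class ExpiringCache:
    """Cache generated test data for `ttl` seconds (in-memory sketch)."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # key -> (expiry time, value)

    def get_or_generate(self, key, generate):
        """Return the cached value for `key`, running `generate` on a miss."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = generate()
        self._entries[key] = (now + self.ttl, value)
        return value

    def purge(self):
        """Drop expired entries to free resources."""
        now = self.clock()
        expired = [k for k, (expiry, _) in self._entries.items() if expiry <= now]
        for key in expired:
            del self._entries[key]
```

With ttl=300, a burst of submissions to the same problem during a contest runs the generator once, and the data is regenerated only after five idle minutes.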