Comments (9)
I guess you are right that uploading the exact same object should be impossible. Didn't think about that one. However, it should still be possible to upload reruns, as the machine may have changed between runs.
from openml.
From the server POV, identical runs can be blocked easily.
In the current specification, there is no way of telling the server from which PC a run was sent.
We should indeed block those, thanks for bringing it up.
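Server-side blocking of byte-identical uploads can be as simple as hashing the run payload and rejecting repeats. A minimal sketch, assuming an in-memory store (the function name and storage are hypothetical, not OpenML's actual implementation):

```python
import hashlib

# Hypothetical in-memory store of hashes of previously seen run payloads.
_seen_hashes = set()

def accept_run(payload: bytes) -> bool:
    """Reject a run whose payload is byte-identical to an earlier upload."""
    digest = hashlib.sha256(payload).hexdigest()
    if digest in _seen_hashes:
        return False  # exact duplicate: block it
    _seen_hashes.add(digest)
    return True
```

The first upload of a payload is accepted; an identical second upload is refused, while any change to the payload (different machine, different timestamp field, etc.) produces a new hash and passes through.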
I would like to have some idea of the machine/environment in which the experiment was run, even if not an actual machine ID. One use case is to upload runtimes as an indication of how fast the algorithm is (even if not entirely comparable), in which case we would need some CPU benchmark of the machine. Another use case is users who just want to know which OS/environment was used. If we could extend the run description with optional runtime, benchmark and environment info, that would be great.
Cheers,
Joaquin
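The optional environment fields described above could be gathered client-side with the standard library. This is a sketch of what such a record might contain; the field names are illustrative and not part of any OpenML schema:

```python
import platform
import sys

def environment_info() -> dict:
    """Collect basic OS/environment details to attach to a run description.

    Field names are illustrative; a real client would map these onto
    whatever optional fields the run-description schema defines.
    """
    return {
        "os": platform.system(),             # e.g. "Linux"
        "os_release": platform.release(),
        "machine": platform.machine(),       # e.g. "x86_64"
        "python_version": platform.python_version(),
        "interpreter": sys.executable,
    }
```

A CPU benchmark score (as suggested for comparing runtimes) would have to be measured separately and added as one more key.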
Dr. Ir. Joaquin Vanschoren
Leiden Institute of Advanced Computer Science (LIACS)
Universiteit Leiden
Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
mobile: (+32) (0)497 90 30 69
W.r.t. your last comment:
I would like to have some idea of the machine/environment in which the
experiment was run
I can see your point. But this is not easy. Shall we put it into our feature requests, then push it back for a while?
Otherwise:
Throw an error for now if the user uploads the same run, and let him delete a selection of his runs on the server if he really needs to repeat something?
Weka's plugin has been set up in such a way that it now sends some basic OS + JVM benchmark information along with each run.
If I understand this issue correctly, we want to limit the number of runs that each user can do on each task / setup (implementation + parameter setting) combination.
Implementation-wise, this can easily be arranged. Joaquin, does this match your wishes? Or do you still want to be able to upload a run from various machines?
Oh, this is an old thread :). Things have changed in the meantime...
I currently believe it should be possible for me to run the exact same flow and setup several times, for instance on different hardware. It would be quite frustrating if I ran a series of experiments on one machine but could not compare the runtimes afterwards because one of the runs happened to have been run before.
It is indeed possible that the user makes a mistake and sends the same run multiple times. Perhaps this can be caught with a checksum and the time between submits? This doesn't sound very urgent, though. More important is that a user can remove runs that were uploaded by accident.
Cheers,
Joaquin
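The checksum-plus-submission-time idea sketched above could look like the following: an identical checksum resubmitted within a short window is flagged as a likely accident, while intentional reruns submitted later pass through. The function and per-user log are hypothetical, not an OpenML API:

```python
import hashlib
import time

# Hypothetical per-user log mapping payload checksum -> last upload time.
_recent_uploads = {}

def looks_accidental(payload, window_seconds=3600.0, now=None):
    """Flag a run as a likely accidental resubmit if the same checksum
    was uploaded within the last `window_seconds`.

    Intentional reruns (e.g. on different hardware) submitted after the
    window are not flagged.
    """
    now = time.time() if now is None else now
    digest = hashlib.sha256(payload).hexdigest()
    last = _recent_uploads.get(digest)
    _recent_uploads[digest] = now
    return last is not None and (now - last) < window_seconds
```

This only flags the upload; per the comment above, the user would still need a way to delete runs uploaded by accident.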
How about this: I will write an interface that shows users the runs that they have executed multiple times, together with delete functionality. Later on we can change the policy regarding multiple uploads.
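Such a duplicate-run view could be driven by a simple group-and-count over a user's runs. A sketch, where the (task, setup) key is hypothetical shorthand for whatever identifies "the same run" on the server:

```python
from collections import Counter

def duplicated_runs(runs):
    """Given a user's runs as (task_id, setup_id) pairs, return the
    combinations that were executed more than once, with their counts.
    """
    counts = Counter(runs)
    return {key: n for key, n in counts.items() if n > 1}
```

The interface would list each returned combination with its runs and a delete button, leaving the upload policy itself unchanged for now.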
Sounds good. This will be part of 'My runs'?
This is already implemented and running.