
Comments (20)

rossant avatar rossant commented on September 27, 2024

what would be the output of one.list(eids=['someeid'], search_term='dataset_type')?

from ibllib.

kdharris101 avatar kdharris101 commented on September 27, 2024

It would be all the dataset types available for the eid 'someeid' - provided that was a valid eid. If it wasn't, you would get None, or an error, or something.

Again, we would have the issue of passing a string vs a list of strings, and I would suggest we interpret a single string the same way as a list of one string.
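The "single string equals list of one string" rule could be handled by one small normalizing helper. This is only a sketch; the name as_list is hypothetical and not part of ibllib.

```python
from typing import Iterable, List, Optional, Union


def as_list(value: Optional[Union[str, Iterable[str]]]) -> List[str]:
    """Normalize None, a single string, or an iterable of strings to a list."""
    if value is None:
        return []
    if isinstance(value, str):
        # A bare string becomes a one-element list instead of being
        # iterated character by character.
        return [value]
    return list(value)
```

With this, one.list(eids='someeid') and one.list(eids=['someeid']) would see the same input after normalization.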


rossant avatar rossant commented on September 27, 2024

so one.list(eids=['someeid'], search_term='dataset_type') and one.list(eids=['someeid']) would be equivalent?

I'm just wondering whether it would be confusing for users to have the same function list() return different kinds of items depending on the arguments. Would one.list_dataset_types(...), one.list_users(...), one.list_eids(...), one.list_datasets(...) be sensible? Doing one.<TAB> in IPython would immediately give you the list of relevant functions.


kdharris101 avatar kdharris101 commented on September 27, 2024

Yes, isn't that just how it works with python default arguments?

I see your point about having multiple functions - but the problem then is that introducing a new search term requires adding a new function.

How about one.list('search_terms') to return a list of search terms used by this implementation?


nbonacchi avatar nbonacchi commented on September 27, 2024

What about both?
I see good arguments for both ways of doing things.
We could implement one.list and also provide a set of user-facing wrapper functions, named as Cyrille suggested, that in reality just call one.list() with the appropriate arguments.

To Kenneth's point: yes, that would be nice. My initial post in the openneurodata repo proposed exactly that, under the name metasearch.

So what if it looked like this:

  • one.list() returns all the listable dimensions - all args can be used as now
  • one.list_eids() returns eIDs - all args except eID can be used
  • one.list_users() returns a list of users - all args except users
  • one.list_dataset_types() returns dataset_types - ...
  • etc.

Of course one.list would always be there and available for users. Its output would be the intersection of all the args that were passed in, while each one.list_* function would specify what the user wants returned.


rossant avatar rossant commented on September 27, 2024

Depending on how we implement it, we could also have list() transparently call the various list_stuff() functions, depending on the arguments. I can already see a big nested sequence of if statements in the main list() implementation, which would be error-prone.
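One way to avoid the nested ifs is a dispatch table mapping each search term to its list_* method. The sketch below uses a toy in-memory class; ToyOne, the session dictionary layout, the dataset type names and the 'search_terms' keyword are all assumptions for illustration, not the real ONE implementation.

```python
class ToyOne:
    """Toy stand-in for a ONE client; data lives in a plain dict."""

    def __init__(self, sessions):
        # sessions maps eid -> {'user': str, 'dataset_types': [str, ...]}
        self._sessions = sessions
        # Dispatch table: search term -> method listing that kind of item.
        self._handlers = {
            'dataset_type': self.list_dataset_types,
            'user': self.list_users,
        }

    def _select(self, eids):
        # None means all sessions; a single string means a list of one.
        if eids is None:
            eids = self._sessions
        elif isinstance(eids, str):
            eids = [eids]
        return [self._sessions[eid] for eid in eids]

    def list_dataset_types(self, eids=None):
        return sorted({dt for s in self._select(eids) for dt in s['dataset_types']})

    def list_users(self, eids=None):
        return sorted({s['user'] for s in self._select(eids)})

    def list(self, eids=None, search_term='dataset_type'):
        if search_term == 'search_terms':
            # Report which search terms this implementation supports.
            return sorted(self._handlers)
        if search_term not in self._handlers:
            raise ValueError(f'unknown search term: {search_term!r}')
        return self._handlers[search_term](eids)
```

Adding a search term then means adding one method and one dictionary entry, and both one.list(...) and one.list_*() stay available.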


kdharris101 avatar kdharris101 commented on September 27, 2024

The most common thing people are going to want to do is find what files are associated with an eid. The reasoning behind this suggestion was to have a function that does that with the default parameters - but also does other things (required less often) with other parameters. The fewer functions we have the better!

One other thing we should have is a way to get a docstring on what a particular dataset type means. (We have this in the spreadsheet now)


nbonacchi avatar nbonacchi commented on September 27, 2024

:) I see what you mean... I was thinking of using sets and intersections of sets after cache querying the whole database, but that can be problematic also...
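The set-intersection idea could look roughly like the sketch below, assuming the whole table has already been fetched into a local dictionary. The function names, the record layout and the example values are hypothetical.

```python
from collections import defaultdict


def build_index(sessions):
    """Map field -> value -> set of eids that carry that value."""
    index = defaultdict(lambda: defaultdict(set))
    for eid, record in sessions.items():
        for field, values in record.items():
            for value in values:
                index[field][value].add(eid)
    return index


def search(index, all_eids, **criteria):
    """Return the sorted eids satisfying every criterion (set intersection)."""
    matches = set(all_eids)
    for field, value in criteria.items():
        # .get() avoids creating empty entries for unknown fields or values.
        matches &= index.get(field, {}).get(value, set())
    return sorted(matches)
```

The cost is that the index has to be rebuilt or invalidated whenever the database changes, which is one way this approach can become problematic.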


oliche avatar oliche commented on September 27, 2024

My initial thinking for the list function was to provide the user with the possible values for session fields that can be filtered on: dataset_types, users and subjects. This is functionality A.

myone.list('dataset-types') returns the API query on the dataset-types Django table.

Kenneth has a very valid point: as of yesterday there was no way to export the session info (functionality B) without also downloading the full datasets. To put it mildly, that was somewhat stupid:

d = myone.load(eid, dataset_types=dataset_types, dclass_output=True)

I fixed it yesterday with the following proposed implementation:

d = myone.info(eid)

d is a data structure (it auto-completes in the editor and matches Matlab structure syntax) with the following fields available: d.dataset_id, d.dataset_type, among others.
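A structure with attribute access and editor auto-completion could be sketched with a dataclass. Only the field names dataset_id and dataset_type come from the comment above; the class name SessionInfo, the eid field and the shape of the input records are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SessionInfo:
    """Per-session dataset metadata, one list entry per dataset."""
    eid: str
    dataset_id: List[str] = field(default_factory=list)
    dataset_type: List[str] = field(default_factory=list)


def session_info(eid: str, records: List[Dict[str, str]]) -> SessionInfo:
    """Build a SessionInfo from metadata records; no data files are read."""
    return SessionInfo(
        eid=eid,
        dataset_id=[r['id'] for r in records],
        dataset_type=[r['dataset_type'] for r in records],
    )
```

Because the fields are declared on the class, d.dataset_type completes in an editor in the same way a Matlab struct field would.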

I called the functionality A method list, and functionality B method info.
I suggest calling them list and session_info.


kdharris101 avatar kdharris101 commented on September 27, 2024

This has the functionality we need, but maybe not in the most user-friendly way.

It would be great to just type one.list(eid) into my jupyter notebook, and get a simple list of dataset types available for this experiment - because that is what I will want to know 90% of the time. The other stuff I will want rarely, and could type a more complex command to get it.


oliche avatar oliche commented on September 27, 2024

OK, that makes sense: a short, easy-to-remember command for widely used functionality.
Will do:

  • myone.list(eid) returns a list of dataset-types
  • myone.session_info(eid) returns a data structure with more fields about the datasets
  • I'll refactor myone.list('dataset_types') etc... as
    • myone.ls_dataset_types
    • myone.ls_users
    • myone.ls_subjects

The Matlab implementation will use the same names.
Everybody on board?


oliche avatar oliche commented on September 27, 2024

Pushed on dev and master:

  • syntax refactoring
  • updated tests
  • updated tutorial


kdharris101 avatar kdharris101 commented on September 27, 2024

Re-opening this since I can still see a problem with having the function names encode the search terms.

We don't know exactly how the search functionality is going to work. But it will probably be something like the Django syntax, i.e. one.search(field, value) or one.search(relation, value); for example, one.search('user', 'kenneth') or one.search('date_>', '1/1/2017').

The search fields could be anything in the database - not just the things we are coding here. If we need a separate function to list every one of these, we will need to code a new ls_ function for every field in the database. If we just pass the search term as an argument to one.list, this won't happen. So for example if the experiments table had a field percent_correct then

one.list(None, 'percent_correct')

would give a list of all the percent-correct scores that occur across all experiments.
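The argument-based design can be sketched as a single function that takes the field name as data, so a new database field needs no new function. The function name list_field, the in-memory session dictionary and the example values are assumptions; the real version would query the database instead.

```python
def list_field(sessions, eids=None, field='dataset_type'):
    """Return the sorted distinct values of `field` over the chosen sessions."""
    if isinstance(eids, str):
        eids = [eids]
    # None selects every session, mirroring one.list(None, 'percent_correct').
    selected = sessions.values() if eids is None else [sessions[e] for e in eids]
    values = set()
    for record in selected:
        value = record.get(field)
        if value is None:
            continue
        if isinstance(value, (list, tuple, set)):
            values.update(value)  # multi-valued field, e.g. dataset types
        else:
            values.add(value)     # scalar field, e.g. percent_correct
    return sorted(values)
```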

Django also allows for relational queries spanning multiple tables. All this comes for free. Why would we want to reimplement it with lots of new functions?


oliche avatar oliche commented on September 27, 2024

Ok, I've re-opened the thread.

The ls_* functions are wrappers around the generic ls function. They are just a convenience that gives auto-completion, since I expect them to be used often.
Note that those functions do not perform an aggregation; each is a simple REST query to the endpoint (a simple table dump).
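A minimal sketch of such wrappers, using functools.partialmethod so each ls_* name is one line on top of the generic ls. The class ToyOne, the injected fetch callable and the endpoint names are assumptions; the real client would issue a REST request here.

```python
from functools import partialmethod


class ToyOne:
    """Toy client: `fetch` stands in for the REST call to a table endpoint."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: endpoint name -> list of rows

    def ls(self, table):
        # Plain table dump, no aggregation.
        return self._fetch(table)

    # Thin wrappers that exist mainly for tab completion.
    ls_dataset_types = partialmethod(ls, 'dataset-types')
    ls_users = partialmethod(ls, 'users')
    ls_subjects = partialmethod(ls, 'subjects')
```

A new table then costs one extra line for its wrapper, and one.ls('some-table') still works without any wrapper at all.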

Aggregating unique values for fields of existing experiments:

  • If the field is a foreign key, this can be done through REST by going through the table endpoint. Listing users for existing experiments will require a custom filter in users/views.py, and the endpoint queried would be users?custom_query.
  • If the field is not a foreign key, like percent_correct, this requires creating a view and mapping it to a URL that returns a JsonResponse.

Cyrille, I may be wrong, but it seems to me that the REST API is not meant to forward aggregations or complex queries directly, unless they correspond to existing endpoints (tables). Plus, neither case allows for dynamic definitions.

So yes, it is easy and efficient to perform queries on the Django side (AWS instance), but I'm not sure it is so easy through REST. I'm looking into it now, ideas welcome!


kdharris101 avatar kdharris101 commented on September 27, 2024


oliche avatar oliche commented on September 27, 2024

OK, so I've opened the postgres ports and can connect from a client computer via the Django command line. This implies adding Alyx as a dependency on the client, but it requires neither the full installation nor running an Alyx instance on the client: the client just uses the models to connect to the database and perform queries.

However it seems impossible to run queries with a read-only postgres user.


rossant avatar rossant commented on September 27, 2024

The whole point of alyx was to provide two interfaces, a complex, low-level one with SQL or django, and a simpler, high-level one with REST for the most common operations.

The REST API should provide endpoints/filters for 90% of the use-cases.

Are you suggesting that these 90% are not sufficient for ONE?

If there is a highly complex query that would be very common, we should just implement a new endpoint or a new filter. We shouldn't try to reimplement django or SQL on top of REST.


kdharris101 avatar kdharris101 commented on September 27, 2024


rossant avatar rossant commented on September 27, 2024

It would be better not to rely on the alyx codebase client-side. If all you ever want to do is call filter(...) then it would be nearly trivial to implement something where all the arguments to filter() are passed as a serialized string to a REST API, and passed again to django's filter() server-side. This is a bit of a hack, but it would work...

@oliche you could implement this as follows:

one.filter(**kwargs) => kwargs dictionary as JSON => base64 encoding => REST /custom/ endpoint or something => base64 decoding => JSON decoding => django.filter(**kwargs)
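The encode and decode halves of that pipeline can be sketched with the standard library alone. The function names are hypothetical, and the URL-safe base64 variant is an assumption made so the token can travel in a query string; the REST call and the final Django filter() are left out.

```python
import base64
import json


def encode_filter(**kwargs) -> str:
    """Client side: kwargs -> JSON -> URL-safe base64 text."""
    payload = json.dumps(kwargs, sort_keys=True).encode('utf-8')
    return base64.urlsafe_b64encode(payload).decode('ascii')


def decode_filter(token: str) -> dict:
    """Server side: base64 text -> JSON -> kwargs dictionary."""
    payload = base64.urlsafe_b64decode(token.encode('ascii'))
    return json.loads(payload.decode('utf-8'))
```

On the server the decoded dictionary would be passed on as filter(**kwargs). Since the keys come straight from the client, checking them against a list of allowed fields before that call seems prudent. Only JSON-serializable values survive the round trip, so dates would have to be sent as strings.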


oliche avatar oliche commented on September 27, 2024

Yes.

For the filter, I would implement this with Q objects, as this is equivalent and more flexible (we can't express an OR with a plain filter).

For client-server communication, existing filters, or simple ones that we can add, will use the current endpoints. However, if we need to return aggregations that do not correspond to an endpoint, I'll create a new app, one, with a view that returns a JSON object.
