citrine-python's Issues

Accept raw Taurus data objects and templates for `register`

Currently, each data object and template ("data concept") in taurus is extended as a Resource in citrine-python. That's no biggie for reads, since they expose the same read interface, but it can cause confusion when creating data. The citrine-python version of data concepts must be used, but the class definitions for the rest of the data model still live in taurus. This creates code blocks like:

from citrine.resources.process_spec import ProcessSpec
from citrine.resources.project import Project
from taurus.entity.bounds.integer_bounds import IntegerBounds
from taurus.entity.value.nominal_integer import NominalInteger

It would be nice if the register method would accept bare taurus data concepts in addition to their resource versions. In that case, the register method would also serve the purpose of converting from taurus to citrine-python: it would return the Resource sub-class. This way, users would only ever have to create data in the taurus model but could still interact with an independent REST API.
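One way to picture the proposed behavior is a conversion step that runs before the object is sent to the API. The sketch below is only an illustration: RawProcessSpec, ProcessSpecResource, RESOURCE_FOR, and to_resource are hypothetical stand-ins, not names from taurus or citrine-python.

```python
class RawProcessSpec:
    """Hypothetical stand-in for a bare taurus data concept."""

    def __init__(self, name):
        self.name = name


class ProcessSpecResource(RawProcessSpec):
    """Hypothetical stand-in for the citrine-python Resource sub-class."""

    @classmethod
    def from_raw(cls, raw):
        # Copy the fields of the bare object into the Resource version.
        return cls(raw.name)


# Hypothetical lookup from each bare class to its Resource sub-class.
RESOURCE_FOR = {RawProcessSpec: ProcessSpecResource}


def to_resource(obj):
    """Return obj unchanged if it is already a Resource, else convert it."""
    if isinstance(obj, tuple(RESOURCE_FOR.values())):
        return obj
    resource_cls = RESOURCE_FOR.get(type(obj))
    if resource_cls is None:
        raise TypeError(
            "Cannot register object of type {}".format(type(obj).__name__)
        )
    return resource_cls.from_raw(obj)
```

Under this assumption, register would call to_resource on its argument first and return the Resource sub-class, so callers could pass either form.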

Cannot import citrine in python 3.5

In a fresh python 3.5 conda environment, I get:

In [1]: import citrine                                                                                  
Traceback (most recent call last):

  File "/home/maxhutch/anaconda3/envs/py3.5/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-1-2b383dd42cdf>", line 1, in <module>
    import citrine

  File "/home/maxhutch/anaconda3/envs/py3.5/lib/python3.5/site-packages/citrine/__init__.py", line 2, in <module>
    from citrine.citrine import Citrine  # noqa: F401

  File "/home/maxhutch/anaconda3/envs/py3.5/lib/python3.5/site-packages/citrine/citrine.py", line 7
    DEFAULT_HOST: str = 'citrine.io'
                ^
SyntaxError: invalid syntax

Reproduction steps:

$ conda create -n py3.5 python=3.5
$ conda activate py3.5
$ pip install citrine
$ pip install ipython
$ ipython
In [1]: import citrine     
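The failing line uses variable annotation syntax (PEP 526), which was added in Python 3.6, so Python 3.5 rejects it at parse time. If 3.5 support is wanted, one possible fix is a type comment, sketched below; the alternative is to declare a 3.6 minimum with python_requires in setup.py so that pip refuses to install on older interpreters.

```python
# Variable annotations (PEP 526) need Python 3.6+:
#     DEFAULT_HOST: str = 'citrine.io'
# A type comment carries the same hint and still parses on Python 3.5.
DEFAULT_HOST = 'citrine.io'  # type: str
```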

Guarantee unused ingredients appear in gemtable

If an ingredient column is empty, it doesn't show up in the gemtable. There should be an option to have unused ingredients show up in the gemtable.

example data source:

ID             ing 1   ing 2   ing 3 (this ingredient doesn't show up in gemtable)
type           Amount  Amount  Amount
formulation 1  70      30
formulation 2  50      50
formulation 3  25      75

`Update` does not work for data concepts collections

update is defined for all collections (https://github.com/CitrineInformatics/citrine-python/blob/master/src/citrine/_rest/collection.py#L105), but it does not work for data concepts collections (https://github.com/CitrineInformatics/citrine-python/blob/master/src/citrine/resources/data_concepts.py#L215) because data objects do not have a .uid field. Instead, to update a data object, one can re-call register or use async_update.

update should be overridden for data concepts collections. The following are possible behaviors:

  1. Call register
  2. Call async_update, wait for it to finish, then get and return the updated data object
  3. Throw an exception and direct users to call either register or async_update, depending on their intent
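As an illustration of option 3, the override could look roughly like the sketch below. Collection and DataConceptsCollection here are simplified, hypothetical stand-ins for the real classes, and the error message wording is only a suggestion.

```python
class Collection:
    """Hypothetical stand-in for the generic collection base class."""

    def update(self, model):
        # The real implementation issues a request keyed on model.uid.
        return model


class DataConceptsCollection(Collection):
    """Hypothetical stand-in for a data concepts collection."""

    def update(self, model):
        # Data objects have no .uid field, so fail loudly and point to
        # the two calls that do work.
        raise NotImplementedError(
            "update is not supported for data concepts collections. "
            "Call register or async_update instead, depending on your intent."
        )
```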

Include field and object name in field type validation error

Currently, if you try to assign a str to an int field or None to a non-optional field, you get an error message like:

ValueError: None is not one of valid types: <class 'str'>!

which forces you to navigate the stacktrace in order to figure out where it's coming from. It would be really helpful to include the field name and object type in that value error, e.g.:

ValueError: None is not one of valid types: <class 'str'> for MaterialRun.name!
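A minimal sketch of how a property could know its own name and owner, assuming Python 3.6+ and a descriptor-based design. TypedField is a hypothetical class, and MaterialRun here is a toy stand-in, not the real one; the actual properties in citrine._serialization may be structured differently.

```python
class TypedField:
    """Hypothetical descriptor that validates the type of assigned values."""

    def __init__(self, *valid_types):
        self.valid_types = valid_types

    def __set_name__(self, owner, name):
        # Called once at class creation; records where the field lives.
        self.owner_name = owner.__name__
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        if not isinstance(value, self.valid_types):
            raise ValueError(
                "{!r} is not one of valid types: {} for {}.{}!".format(
                    value,
                    ", ".join(str(t) for t in self.valid_types),
                    self.owner_name,
                    self.name,
                )
            )
        obj.__dict__[self.name] = value


class MaterialRun:
    """Toy stand-in used only to show the error message."""

    name = TypedField(str)

    def __init__(self, name):
        self.name = name
```

With this sketch, MaterialRun(None) raises a ValueError whose message names MaterialRun.name, so the offending field is visible without reading the stacktrace.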

Predictor Report documentation is out of date

Current predictor report documentation: https://github.com/CitrineInformatics/citrine-python/blob/master/docs/source/workflows/predictor_reports.rst

The example report JSON only contains feature importances, which is out of date. It should contain a set of descriptors and a sequence of models. Here's an example pulled from development:

{
  "models": [
    {
      "name": "GeneralLosslessModel_1559617749",
      "type": "GeneralLosslessModel",
      "inputs": [
        "x",
        "y",
        "z"
      ],
      "outputs": [
        "x"
      ],
      "display_name": "ExpressionRelation_-9162134",
      "model_settings": [
        {
          "name": "Expression",
          "value": "(x) <- (x * y * z)",
          "children": []
        }
      ],
      "feature_importances": []
    }
  ],
  "descriptors": [
    {
      "units": "",
      "category": "Real",
      "lower_bound": -1.7976931348623157e+308,
      "upper_bound": 1.7976931348623157e+308,
      "descriptor_key": "x"
    },
    {
      "units": "",
      "category": "Real",
      "lower_bound": -1.7976931348623157e+308,
      "upper_bound": 1.7976931348623157e+308,
      "descriptor_key": "y"
    },
    {
      "units": "",
      "category": "Real",
      "lower_bound": -1.7976931348623157e+308,
      "upper_bound": 1.7976931348623157e+308,
      "descriptor_key": "z"
    },
    {
      "units": "",
      "category": "Real",
      "lower_bound": 0,
      "upper_bound": 100,
      "descriptor_key": "x"
    }
  ]
}

Outdated documentation which references `citrine.attributes`

In the documentation which shows example code here:
https://github.com/CitrineInformatics/citrine-python/blob/master/docs/source/getting_started/code_examples.rst#create-a-linked-process-material-and-measurement

Condition, Parameter, and Property are imported this way:

from citrine.attributes.condition import Condition
from citrine.attributes.parameter import Parameter
from citrine.attributes.property import Property

but citrine.attributes has been deprecated and no longer exists, so running the example code results in:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-68-a74b6227c8ac> in <module>
----> 1 from citrine.attributes.condition import Condition
      2 from citrine.attributes.parameter import Parameter
      3 from citrine.attributes.property import Property

ModuleNotFoundError: No module named 'citrine.attributes'

projects.list() per_page limit does not return empty list

If there is a project with, say, 12 items, projects.list(page=2, per_page=20) should return an empty list or None. Instead, it returns the last page that has items in it (in this case, page 1).

This issue does not affect datasets or data objects.
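The expected behavior can be stated with plain list slicing. The helper below is hypothetical (get_page is not part of citrine-python) and assumes pages are 1-indexed, as the report implies.

```python
def get_page(items, page, per_page):
    """Return the 1-indexed page of items, or [] when the page is past the end."""
    start = (page - 1) * per_page
    return items[start:start + per_page]


projects = list(range(12))             # 12 items, as in the report
second_page = get_page(projects, 2, 20)  # past the end, so an empty list
```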

Flesh out Contributing.md

File currently has minimal content. At a minimum it should include the required format for writing docstrings, a discussion of the linter requirements, and a note about the test coverage requirement.

Include api_error in BadRequest exception message

Right now, bad API requests only include the route in the stacktrace. If an error is encountered, a user has to go back and do something like:

try:
    dataset.process_runs.register(process_run)
except BadRequest as e:
    print(e.api_error)

to see the error. We should include the api_error in the exception message so that it shows up in a stacktrace.
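A rough sketch of the idea, using a simplified stand-in for the exception class. The constructor signature and the example route are assumptions made for illustration; only the api_error attribute comes from the snippet above.

```python
class BadRequest(Exception):
    """Hypothetical stand-in for the client's BadRequest exception."""

    def __init__(self, path, api_error=None):
        self.path = path
        self.api_error = api_error
        # Fold the API error into the message so it appears in stacktraces.
        if api_error is None:
            message = path
        else:
            message = "{}: {}".format(path, api_error)
        super().__init__(message)


# Hypothetical route and error text, for illustration only.
error = BadRequest("/example/route", "name is required")
```

Here str(error) contains both the route and the API error, so an uncaught exception would show the cause directly.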

cc: @lkubie @asantas93

Code duplication and lack of error checking in get_type()

All PolymorphicSerializable implement a get_type(data) class method that returns the underlying type given serialized data. In general, data has some type field with the name of the class, which is matched against a known list of types. Here are two different example implementations:

@classmethod
def get_type(cls, data) -> Type['Processor']:
    return {
        'Grid': GridProcessor,
        'Enumerated': EnumeratedProcessor
    }[data['config']['type']]

@classmethod
def get_type(cls, data) -> Type['Predictor']:
    type_dict = {
        "Simple": SimpleMLPredictor
    }
    typ = type_dict.get(data['config']['type'])

    if typ is not None:
        return typ
    else:
        raise ValueError(
            '{} is not a valid predictor type. '
            'Must be in {}.'.format(data['config']['type'], type_dict.keys())
        )

Both implementations have a "type dictionary" and return the value in the type dictionary corresponding to a key that is pulled from data. But the first implementation does not gracefully throw an exception if the type is not in the type dictionary, and neither catches the possibility that data could be malformed (e.g., if data['config']['type'] does not exist).

Moving some of this logic and error checking to the abstract PolymorphicSerializable class would lead to code deduplication and standardize exceptions across all implementations of PolymorphicSerializable.
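One possible shape for the shared logic is sketched below. The _type_dict attribute is a hypothetical name, the processor classes are empty stand-ins, and the sketch assumes every implementation keeps its type name under data['config']['type'], which may not hold for all of them.

```python
class PolymorphicSerializable:
    """Simplified, hypothetical version of the abstract base class."""

    # Sub-classes override this with a mapping from type name to class.
    _type_dict = {}

    @classmethod
    def get_type(cls, data):
        # Guard against malformed data before looking up the type.
        try:
            type_name = data['config']['type']
        except (KeyError, TypeError):
            raise ValueError(
                "Serialized {} is missing data['config']['type']".format(cls.__name__)
            )
        try:
            return cls._type_dict[type_name]
        except (KeyError, TypeError):
            raise ValueError(
                '{} is not a valid {} type. Must be in {}.'.format(
                    type_name, cls.__name__, sorted(cls._type_dict)
                )
            )


class GridProcessor:
    """Empty stand-in."""


class EnumeratedProcessor:
    """Empty stand-in."""


class Processor(PolymorphicSerializable):
    _type_dict = {'Grid': GridProcessor, 'Enumerated': EnumeratedProcessor}
```

Each implementation then only declares its type dictionary, and both failure modes raise the same kind of ValueError.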

Include soft-links in data object `__repr__`

When using jupyter, the representation of the object shown in the output cell is its __repr__. By default, that only shows object members, which excludes soft links. This leads to the non-obvious behavior that repr(process_run) doesn't include ingredients and repr(material_run) doesn't include measurements.

We should override __repr__ in objects that include soft-links to show linked objects.
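A small sketch of what such an override might look like, using toy stand-ins for MaterialRun and MeasurementRun rather than the real classes. It prints only the names of linked objects, which is one way to avoid endless recursion if links point in both directions; that choice is an assumption, not something settled in this issue.

```python
class MeasurementRun:
    """Toy stand-in for a linked object."""

    def __init__(self, name):
        self.name = name


class MaterialRun:
    """Toy stand-in whose measurements are soft links."""

    def __init__(self, name):
        self.name = name
        self.measurements = []  # soft links, filled in by linking

    def __repr__(self):
        # Show linked objects by name only, not by their full repr.
        linked = [m.name for m in self.measurements]
        return "<MaterialRun {!r}, measurements={!r}>".format(self.name, linked)


bar = MaterialRun("bar")
bar.measurements.append(MeasurementRun("hardness"))
```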

h/t @sesevgen

Break predictors.py into several files

The file src/citrine/informatics/predictors.py is a monolith that contains every predictor. Create a "predictors/" directory and put individual files inside of it. Import the classes into "predictors/__init__.py" so that code that imports predictors does not break.

Add tests_require to setup.py

Right now, the requirements for testing are included in test_requirements.txt but not in setup.py. This isn't a bug, per se, but it could make it difficult for a developer that is building on top of citrine-python to include test-scoped dependencies (because pypi won't walk requirements files).

These test dependencies should be added with reasonably permissive version bounds, again so that a developer who is using citrine-python can flexibly test their own code.
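As a sketch of what this could look like in setup.py: the package names and version bounds below are hypothetical placeholders, since the real list lives in test_requirements.txt. Exposing the same list through extras_require is an extra suggestion, not part of the request above; it would let downstream developers pull in the test dependencies with a "tests" extra.

```python
# Hypothetical test dependencies with permissive lower bounds only.
TESTS_REQUIRE = [
    "pytest>=5.0",
    "pytest-cov>=2.0",
]

# Keyword arguments that setup.py would pass on to setuptools.setup().
setup_kwargs = dict(
    tests_require=TESTS_REQUIRE,
    extras_require={"tests": TESTS_REQUIRE},
)
```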

Serialization properties cannot be defined in parent classes

Many of our PolymorphicSerializable class hierarchies have shared members. It would be nice to bring these members (e.g. key descriptors) into the parent class to communicate that they are part of the interface and to deduplicate their definition. However, the magic in those properties causes an error to be thrown when this is attempted, e.g.

E       assert <citrine.informatics.descriptors.InorganicDescriptor object at 0x7fe6b29af630> == <citrine.informatics.descriptors.InorganicDescriptor object at 0x7fe6945453c8>

Interesting, I get a completely different error when I try to do this in #163:

tests/ara/test_variables.py:9: in <module>
    RootInfo("root name", ["Root", "Name"], "name"),
src/citrine/ara/variables.py:71: in __init__
    self.short_name = short_name
src/citrine/_serialization/properties.py:114: in __set__
    getattr(base_class, self.serialization_path).fset(obj, value_to_set)
E   AttributeError: 'NoneType' object has no attribute 'fset'
