citrineinformatics / citrine-python

License: Apache License 2.0

Python 99.88% Shell 0.12%

citrine-python's People

lbianchini84 crleblanc donnotron666 jspeerless sailfish009 maxhutch shenganzhang

citrine-python's Issues

Code duplication and lack of error checking in get_type()

All PolymorphicSerializable subclasses implement a get_type(data) class method that returns the underlying type given serialized data. In general, data has a type field holding the name of the class, which is matched against a known list of types. Here are two example implementations:

@classmethod
def get_type(cls, data) -> Type['Processor']:
    return {
        'Grid': GridProcessor,
        'Enumerated': EnumeratedProcessor
    }[data['config']['type']]

@classmethod
def get_type(cls, data) -> Type['Predictor']:
    type_dict = {
        "Simple": SimpleMLPredictor
    }
    typ = type_dict.get(data['config']['type'])

    if typ is not None:
        return typ
    else:
        raise ValueError(
            '{} is not a valid predictor type. '
            'Must be in {}.'.format(data['config']['type'], type_dict.keys())
        )

Both implementations have a "type dictionary" and return the value in the type dictionary corresponding to a key that is pulled from data. But the first implementation raises only a bare KeyError if the type is not in the type dictionary, and neither handles the possibility that data could be malformed (e.g., if data['config']['type'] does not exist).

Moving some of this logic and error checking to the abstract PolymorphicSerializable class would lead to code deduplication and standardize exceptions across all implementations of PolymorphicSerializable.
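
A minimal sketch of how the shared lookup might be hoisted into the base class. The `_type_map` hook and the empty placeholder subclasses are assumptions made for illustration; they are not part of citrine-python, and the real base class has other responsibilities that are omitted here.

```python
from typing import Any, Dict, Mapping


class PolymorphicSerializable:
    """Stand-in for the abstract base class; only the lookup logic is sketched."""

    @classmethod
    def _type_map(cls) -> Dict[str, type]:
        # Hypothetical hook: each hierarchy returns its own name -> class mapping.
        raise NotImplementedError

    @classmethod
    def get_type(cls, data: Mapping[str, Any]) -> type:
        # Guard against malformed payloads before indexing into them.
        try:
            type_name = data['config']['type']
        except (KeyError, TypeError) as err:
            raise ValueError(
                "Serialized {} is missing data['config']['type']".format(cls.__name__)
            ) from err
        type_map = cls._type_map()
        if type_name not in type_map:
            raise ValueError(
                '{} is not a valid {} type. Must be in {}.'.format(
                    type_name, cls.__name__, sorted(type_map))
            )
        return type_map[type_name]


class Processor(PolymorphicSerializable):
    @classmethod
    def _type_map(cls):
        # Evaluated at call time, so the subclasses below can be defined later.
        return {'Grid': GridProcessor, 'Enumerated': EnumeratedProcessor}


class GridProcessor(Processor):
    pass


class EnumeratedProcessor(Processor):
    pass
```

With this shape, each hierarchy only declares its mapping, and every implementation raises the same ValueError for unknown or missing types.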

Break predictors.py into several files

The file src/citrine/informatics/predictors.py is a monolith that contains every predictor. Create a "predictors/" directory and put individual files inside of it. Import the classes into "predictors/__init__.py" so that code that imports predictors does not break.

Include field and object name in field type validation error

Currently, if you try to assign a str to an int field or None to a non-optional field, you get an error message like:

ValueError: None is not one of valid types: <class 'str'>!

which forces you to navigate the stacktrace in order to figure out where it's coming from. It would be really helpful to include the field name and object type in that value error, e.g.:

ValueError: None is not one of valid types: <class 'str'> for MaterialRun.name!
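
One way the field could learn its own name is a descriptor that records it at class-creation time. The sketch below is an assumption about how that might look: `TypedField` is a hypothetical class, not citrine-python's property implementation, and `__set_name__` requires Python 3.6 or later.

```python
class TypedField:
    """Hypothetical descriptor that validates the type of assigned values."""

    def __init__(self, *valid_types, optional=False):
        self.valid_types = valid_types
        self.optional = optional

    def __set_name__(self, owner, name):
        # Called when the owning class is created; records where the field lives.
        self.owner_name = owner.__name__
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        is_allowed_none = value is None and self.optional
        if not is_allowed_none and not isinstance(value, self.valid_types):
            raise ValueError(
                '{!r} is not one of valid types: {} for {}.{}!'.format(
                    value,
                    ', '.join(str(t) for t in self.valid_types),
                    self.owner_name, self.name))
        obj.__dict__[self.name] = value


class MaterialRun:
    # Toy stand-in for the real data object.
    name = TypedField(str)

    def __init__(self, name):
        self.name = name
```

Calling `MaterialRun(None)` then fails with a message that names both the class and the field.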

What publication(s) should I cite when referring to the Open Citrine Platform?

Specifically the adaptive design capabilities

https://citrine.io/category/publications/

test_measurement_material_connection_rehydration broken

See https://travis-ci.org/CitrineInformatics/citrine-python/jobs/588811967

Probably after CitrineInformatics/gemd-python#27

cc: @bfolie

Include soft-links in data object `repr`

When using Jupyter, the representation of an object shown in the output cell is its __repr__. By default, that only shows object members, which excludes soft links. This leads to the non-obvious behavior that repr(process_run) doesn't include ingredients and repr(material_run) doesn't include measurements.

We should override __repr__ in objects that include soft-links to show linked objects.
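
A rough sketch of what such an override could look like. The classes below are toy stand-ins, not the real data objects, and the private `_ingredients` list is an assumed way of storing the soft link.

```python
class ProcessRun:
    """Toy stand-in for a data object with a soft link to its ingredients."""

    def __init__(self, name):
        self.name = name
        # Soft link: populated when an ingredient points back at this process.
        self._ingredients = []

    @property
    def ingredients(self):
        return self._ingredients

    def __repr__(self):
        # Show only the names of linked objects to keep the output short
        # and to avoid recursing through back-references.
        linked = [ing.name for ing in self.ingredients]
        return '<ProcessRun name={!r} ingredients={!r}>'.format(self.name, linked)


class IngredientRun:
    def __init__(self, name, process):
        self.name = name
        self.process = process
        # Register the back-reference on the linked process.
        process._ingredients.append(self)
```

Printing only the linked names sidesteps infinite recursion, since an ingredient's own repr could otherwise print its process, which prints its ingredients again.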

h/t @sesevgen

Cannot import citrine in python 3.5

In a fresh python 3.5 conda environment, I get:

In [1]: import citrine                                                                                  
Traceback (most recent call last):

  File "/home/maxhutch/anaconda3/envs/py3.5/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-1-2b383dd42cdf>", line 1, in <module>
    import citrine

  File "/home/maxhutch/anaconda3/envs/py3.5/lib/python3.5/site-packages/citrine/__init__.py", line 2, in <module>
    from citrine.citrine import Citrine  # noqa: F401

  File "/home/maxhutch/anaconda3/envs/py3.5/lib/python3.5/site-packages/citrine/citrine.py", line 7
    DEFAULT_HOST: str = 'citrine.io'
                ^
SyntaxError: invalid syntax

Reproduction steps:

$ conda create -n py3.5 python=3.5
$ conda activate py3.5
$ pip install citrine
$ pip install ipython
$ ipython
In [1]: import citrine
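
The traceback points at a variable annotation, which is syntax introduced in Python 3.6 (PEP 526). If 3.5 support is wanted, one option is to carry the type in a comment instead, as sketched below; dropping 3.5 from the supported versions would be the alternative.

```python
# Parses on Python 3.5: the type is carried in a comment instead of an annotation.
DEFAULT_HOST = 'citrine.io'  # type: str
```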

Guarantee unused ingredients appear in gemtable

If an ingredient column is empty, it doesn't show up in the gemtable. There should be an option to have unused ingredients show up in the gemtable.

example data source:

ID	ing 1	ing 2	ing 3 (this ingredient doesn't show up in gemtable)
type	Amount	Amount	Amount
formulation 1	70	30
formulation 2	50	50
formulation 3	25	75

Accept raw Taurus data objects and templates for `register`

Currently, each data object and template ("data concept") in taurus is extended as a Resource in citrine-python. That's no biggie for reads, since they expose the same read interface, but it can cause confusion when creating data. The citrine-python version of data concepts must be used, but the class definitions for the rest of the data model still live in taurus. This creates code blocks like:

from citrine.resources.process_spec import ProcessSpec
from citrine.resources.project import Project
from taurus.entity.bounds.integer_bounds import IntegerBounds
from taurus.entity.value.nominal_integer import NominalInteger

It would be nice if the register method would accept bare taurus data concepts in addition to their resource versions. In that case, the register method would also serve the purpose of converting from taurus to citrine-python: it would return the Resource sub-class. This way, users would only ever have to create data in the taurus model but could still interact with an independent REST API.

projects.list() per_page limit does not return empty list

If there is a project with, say, 12 items:
projects.list(page=2, per_page=20) should be an empty list or None. Instead, it returns the last page that has items in it (in this case, page 1).

This issue does not affect datasets or data objects
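
The expected behavior can be stated as plain list slicing. `get_page` is a hypothetical helper used only to pin down the semantics; it is not the citrine-python API, where paging is handled by the backend.

```python
def get_page(items, page=1, per_page=20):
    """Return one 1-indexed page of items; an out-of-range page is empty."""
    if page < 1 or per_page < 1:
        raise ValueError('page and per_page must be positive')
    start = (page - 1) * per_page
    # Slicing past the end of a list yields an empty list rather than an error.
    return items[start:start + per_page]
```

With 12 items, `get_page(items, page=2, per_page=20)` returns an empty list instead of repeating page 1.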

Predictor Report documentation is out of date

Current predictor report documentation: https://github.com/CitrineInformatics/citrine-python/blob/master/docs/source/workflows/predictor_reports.rst

The example report JSON only contains feature importances, which is out of date. It should contain a set of descriptors and a sequence of models. Here's an example pulled from development:

{
  "models": [
    {
      "name": "GeneralLosslessModel_1559617749",
      "type": "GeneralLosslessModel",
      "inputs": [
        "x",
        "y",
        "z"
      ],
      "outputs": [
        "x"
      ],
      "display_name": "ExpressionRelation_-9162134",
      "model_settings": [
        {
          "name": "Expression",
          "value": "(x) <- (x * y * z)",
          "children": []
        }
      ],
      "feature_importances": []
    }
  ],
  "descriptors": [
    {
      "units": "",
      "category": "Real",
      "lower_bound": -1.7976931348623157e+308,
      "upper_bound": 1.7976931348623157e+308,
      "descriptor_key": "x"
    },
    {
      "units": "",
      "category": "Real",
      "lower_bound": -1.7976931348623157e+308,
      "upper_bound": 1.7976931348623157e+308,
      "descriptor_key": "y"
    },
    {
      "units": "",
      "category": "Real",
      "lower_bound": -1.7976931348623157e+308,
      "upper_bound": 1.7976931348623157e+308,
      "descriptor_key": "z"
    },
    {
      "units": "",
      "category": "Real",
      "lower_bound": 0,
      "upper_bound": 100,
      "descriptor_key": "x"
    }
  ]
}

Include api_error in BadRequest exception message

Right now, bad API requests only include the route in the stacktrace. If an error is encountered, a user has to go back and do something like:

try:
    dataset.process_runs.register(process_run)
except BadRequest as e:
    print(e.api_error)

to see the error. We should include the api_error in the exception message so that it shows up in a stacktrace.
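
A simplified sketch of the idea. The constructor signature and message format here are assumptions; the real BadRequest class may be built differently.

```python
class BadRequest(Exception):
    """Simplified stand-in for citrine-python's BadRequest exception."""

    def __init__(self, path, api_error=None):
        self.path = path
        self.api_error = api_error
        message = 'Bad request: {}'.format(path)
        if api_error is not None:
            # Append the server-provided detail so it appears in the traceback.
            message += ' ({})'.format(api_error)
        super().__init__(message)
```

Because the detail is part of the message passed to Exception, it shows up in an uncaught traceback without any try/except on the caller's side.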

cc: @lkubie @asantas93

`Update` does not work for data concepts collections

update is defined for all collections (https://github.com/CitrineInformatics/citrine-python/blob/master/src/citrine/_rest/collection.py#L105), but it does not work for data concepts collections (https://github.com/CitrineInformatics/citrine-python/blob/master/src/citrine/resources/data_concepts.py#L215) because data objects do not have a .uid field. Instead, to update a data object, one can re-call register or use async_update.

update should be overridden for data concepts collections. The following are possible behaviors:

Call register
Call async_update, wait for it to finish, then get and return the updated data object
Throw an exception and direct users to call either register or async_update, depending on their intent

Change "unique_label" to "name"

IngredientSpec and IngredientRun have a "unique_label" field, and ProcessTemplate has an "allowed_unique_label" field. Per the most recent docs, those should be changed to "name" and "allowed_names".

This change must be done after CitrineInformatics/gemd-python#3

Add `@alpha` decorator to provide runtime warnings of alpha usages

Is there any way to add some kind of 'alpha' warning when using the new Ara API?

from @andyczerwonka in #189
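
A decorator along these lines could provide the runtime warning. The warning category, the message text, and the decorated `build_table` function are illustrative choices, not an existing citrine-python API.

```python
import functools
import warnings


def alpha(func):
    """Mark a callable as alpha and warn each time it is called."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            '{} is alpha and may change without notice'.format(func.__qualname__),
            category=FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return func(*args, **kwargs)

    return wrapper


@alpha
def build_table(rows):
    # Hypothetical alpha-stage function used only to exercise the decorator.
    return list(rows)
```

FutureWarning is shown to end users by default, whereas DeprecationWarning is hidden in most contexts, which is why it is used in this sketch.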

Limit discussion of access control to its own section.

The first technical page describing citrine-python discusses access control at a point where it distracts from the main goal of describing how the pieces fit together. The discussion there can be kept to a minimum by linking to the access control section.

Serialization properties cannot be defined in parent classes

Many of our PolymorphicSerializable class hierarchies have shared members. It would be nice to bring these members (e.g. key descriptors) into the parent class to communicate that they are part of the interface and to deduplicate their definition. However, the magic in those properties causes an error to be thrown when this is attempted, e.g.

E       assert <citrine.informatics.descriptors.InorganicDescriptor object at 0x7fe6b29af630> == <citrine.informatics.descriptors.InorganicDescriptor object at 0x7fe6945453c8>

Interesting, I get a completely different error when I try to do this in #163 :

tests/ara/test_variables.py:9: in <module>
    RootInfo("root name", ["Root", "Name"], "name"),
src/citrine/ara/variables.py:71: in __init__
    self.short_name = short_name
src/citrine/_serialization/properties.py:114: in __set__
    getattr(base_class, self.serialization_path).fset(obj, value_to_set)
E   AttributeError: 'NoneType' object has no attribute 'fset'

Add tests_require to setup.py

Right now, the requirements for testing are included in test_requirements.txt but not in setup.py. This isn't a bug, per se, but it could make it difficult for a developer who is building on top of citrine-python to include test-scoped dependencies (because PyPI won't walk requirements files).

These test dependencies should be added with reasonably permissive version bounds, again so that a developer who is using citrine-python can flexibly test their own code.
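
A sketch of the relevant setup.py fragment. The package name and version bounds below are placeholders, not the contents of test_requirements.txt, and exposing the same list as an extra is an optional addition.

```python
# Fragment of a setup.py; the package names and bounds are placeholders.
test_requirements = [
    'pytest>=4.0,<6',
]

setup_kwargs = dict(
    tests_require=test_requirements,
    # Also exposing them as an extra lets downstream developers run
    # `pip install citrine[tests]`.
    extras_require={'tests': test_requirements},
)
# In setup.py these keyword arguments would be passed to setuptools.setup().
```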

Outdated documentation which references `citrine.attributes`

In the documentation which shows example code here:
https://github.com/CitrineInformatics/citrine-python/blob/master/docs/source/getting_started/code_examples.rst#create-a-linked-process-material-and-measurement

Condition, Parameter, and Property are imported this way:

from citrine.attributes.condition import Condition
from citrine.attributes.parameter import Parameter
from citrine.attributes.property import Property

but citrine.attributes has been deprecated and no longer exists. So running the example code results in

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-68-a74b6227c8ac> in <module>
----> 1 from citrine.attributes.condition import Condition
      2 from citrine.attributes.parameter import Parameter
      3 from citrine.attributes.property import Property

ModuleNotFoundError: No module named 'citrine.attributes'

Flesh out Contributing.md

File currently has minimal content. At a minimum it should include the required format for writing docstrings, a discussion of the linter requirements, and a note about the test coverage requirement.