coverage-model's People

Contributors

caseybryant, daf, emilyhahn, lukecampbell, mauricemanning, oceanzus

coverage-model's Issues

Can't observe data after 2036-02-07T06:28:15

For observations made after that date the following error is raised:

   ----- exception: 'I' format requires 0 <= number <= 4294967295 -----
tion/science_granule_ingestion_worker.py:459    self.add_granule(stream_id, rdt)
tion/science_granule_ingestion_worker.py:617    self.insert_values(coverage, rdt, stream_id)
tion/science_granule_ingestion_worker.py:535    coverage.set_parameter_values(np_dict)
overage-model/coverage_model/coverage.py:554    self._persistence_layer.write_parameters(self.get_write_id(), values)
l/storage/parameter_persisted_storage.py:263    span_table.write_span(span)
e_model/storage/postgres_span_storage.py:52     stats_sql, bin_sql = self.get_span_stats_and_bin_insert_sql(span)
e_model/storage/postgres_span_storage.py:74     time_min = PostgresDB._get_time_string(span_stats[time_db_key][0])
ge-model/coverage_model/db_connectors.py:239    ntp_time = struct.pack(IonTime.ntpv4_timestamp, i, d)
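
The date in the title matches the rollover of a 32-bit NTP seconds field: NTP counts seconds from 1900-01-01, so an unsigned 32-bit value overflows one second after 2036-02-07T06:28:15 UTC. A minimal sketch that reproduces the failure, assuming the seconds are packed into a '!I' field (the actual IonTime.ntpv4_timestamp format string is not shown in this issue, and pack_ntp_seconds is a hypothetical helper):

```python
import struct
from datetime import datetime, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_DELTA = 2208988800

def pack_ntp_seconds(unix_seconds):
    """Hypothetical helper: pack Unix seconds as an unsigned 32-bit NTP field."""
    return struct.pack('!I', unix_seconds + NTP_DELTA)

# Last second that still fits in 32 bits.
last_ok = int(datetime(2036, 2, 7, 6, 28, 15, tzinfo=timezone.utc).timestamp())
packed = pack_ntp_seconds(last_ok)

try:
    pack_ntp_seconds(last_ok + 1)
    overflowed = False
except struct.error:
    # One second later the value exceeds 4294967295 and packing fails.
    overflowed = True
```

Any fix would need a wider field or an explicit NTP era; which one suits the storage schema is not something this sketch can decide.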

ConstantOverTime overlapping

If I set a parameter value using ConstantOverTime and then set it again later, the second value completely overwrites the first, including over the part of the first time range that the second does not cover.

data_dict = {'sparseness' : ConstantOverTime('sparseness', 4096, time_start=0, time_end=3000)}
cov.set_parameter_values(data_dict)
data_dict = {'sparseness' : ConstantOverTime('sparseness', 2048, time_start=30, time_end=3000)}
cov.set_parameter_values(data_dict)

Results in

><> cov.get_parameter_values(['sparseness']).get_data()
-->
rec.array([(-9999999.0, 20.0), (-9999999.0, 21.0), (-9999999.0, 22.0),
       (-9999999.0, 23.0), (-9999999.0, 24.0), (-9999999.0, 25.0),
       (-9999999.0, 26.0), (-9999999.0, 27.0), (-9999999.0, 28.0),
       (-9999999.0, 29.0), (2048.0, 30.0), (2048.0, 31.0), (2048.0, 32.0),
       (2048.0, 33.0), (2048.0, 34.0), (2048.0, 35.0), (2048.0, 36.0),
       (2048.0, 37.0), (2048.0, 38.0), (2048.0, 39.0)],
      dtype=[('sparseness', '<f8'), ('time', '<f8')])
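
For comparison, here is a sketch of what the report presumably expects: each later span overrides earlier ones only inside its own time range. This is not the coverage model's implementation; apply_constant_spans is a hypothetical helper and the spans are plain (value, time_start, time_end) tuples.

```python
import numpy as np

def apply_constant_spans(times, spans, fill_value=-9999999.0):
    """Layer (value, start, end) spans in order; later spans win where they overlap."""
    out = np.full(times.shape, fill_value, dtype=float)
    for value, start, end in spans:
        mask = (times >= start) & (times <= end)
        out[mask] = value
    return out

times = np.arange(20.0, 40.0)
vals = apply_constant_spans(times, [(4096, 0, 3000), (2048, 30, 3000)])
# Times 20..29 keep the first value, times 30..39 take the second.
```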

Complex Coverage referencing empty coverages

Could we add support so that the referenced coverages can be empty and later be filled in?

I think that when the planning preload is done, we'll have complex coverages that reference empty simplex coverages. The extents will be defined (for example, March 2014 to November 2014), but the simplexes will be empty until we start streaming data.

Numexpr Parameter Functions

Numexpr Parameter Functions either return all fill values or raise an error, depending on whether fill_empty_params is set to True.

Array Types cause ValueError when reading

When running

ion.services.dm.test.test_dm_extended:TestDMExtended.test_array_flow_paths

the test fails with a ValueError caused by reading from the coverage.

Traceback (most recent call last):
  File "/Users/luke/Documents/Dev/code/coi2/extern/pyon/pyon/util/breakpoint.py", line 86, in wrapper
    return func(*args, **kwargs)
  File "/Users/luke/Documents/Dev/code/coi2/ion/processes/data/replay/replay_process.py", line 153, in _cov2granule
    data_dict = coverage.get_parameter_values(param_names=parameters, time_segment=(start_time, end_time), stride_length=stride_time, fill_empty_params=True).get_data()
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/coverage.py", line 388, in get_parameter_values
    function_params=function_params, as_record_array=as_record_array)
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/storage/parameter_persisted_storage.py", line 278, in read_parameters
    np_dict, functions, rec_arr = self.get_data_products(params, time_range, time, sort_parameter, stride_length=stride_length, create_record_array=as_record_array, fill_empty_params=fill_empty_params)
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/storage/parameter_persisted_storage.py", line 291, in get_data_products
    np_dict = self._create_parameter_dictionary_of_numpy_arrays(numpy_params, function_params, params=dict_params)
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/storage/parameter_persisted_storage.py", line 446, in _create_parameter_dictionary_of_numpy_arrays
    npa[insert_index:end_idx] = np_data
ValueError: setting an array element with a sequence.

The branch is here:
https://github.com/lukecampbell/coi-services/tree/cals-submodules

You'll need to add the lukecampbell repository for both the coverage model submodule and the ion-definitions submodule.

To run the test:

bin/nosetests -vs ion.services.dm.test.test_dm_extended:TestDMExtended.test_array_flow_paths

num_timesteps for complex coverage

Could we implement num_timesteps for complex coverages so that it effectively returns the value of

cov.get_parameter_values(['time']).get_data()['time'].shape[0]

If there's a more efficient or better way to get that number, that would work too.

Downsampling or Striding Data

Is there an interface for downsampling or striding data? I'm concerned that if I put the stride logic in the application layer, we'll hit major performance issues. It's very possible that it exists and I just missed it.
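
Tracebacks elsewhere in this list show get_parameter_values being called with a stride_length argument, which suggests some storage-level support. For reference, the application-layer fallback the question worries about would look roughly like the sketch below: it has to read every record before discarding most of them. stride_records is a hypothetical helper.

```python
import numpy as np

def stride_records(rec, stride_length):
    """Keep every stride_length-th record of an already-loaded array."""
    if stride_length < 1:
        raise ValueError('stride_length must be >= 1')
    return rec[::stride_length]

# Illustrative record array with the same layout as get_data() output.
rec = np.zeros(10, dtype=[('temp', '<f8'), ('time', '<f8')])
rec['time'] = np.arange(10.0)
rec['temp'] = np.arange(10.0) * 2

strided = stride_records(rec, 3)  # records at times 0, 3, 6, 9
```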

Ragged Array Support

  • Parameter Type for Ragged Arrays
  • list of lists for return type (I'm open to suggestions too)

ComplexCoverage get_parameter_values raises if using time segment of (None,None)

======================================================================
ERROR: test_attributes ( coverage_model.test.test_complex_coverage:TestNewComplexCoverageInt.test_attributes )
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/test/test_complex_coverage.py", line 901, in test_attributes
    data = ccov.get_parameter_values(time_segment=(None,None), fill_empty_params=True, as_record_array=False).get_data()
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/coverages/complex_coverage.py", line 71, in get_parameter_values
    current_time_segment = get_overlap(extents, time_segment)
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/util/extent_utils.py", line 24, in get_overlap
    raise RuntimeError('No overlap')
RuntimeError: No overlap

----------------------------------------------------------------------
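
One way the overlap check could tolerate (None, None) is to treat a None bound as unbounded on that side. The sketch below is an assumption about the intended behaviour, not the code in extent_utils.py; it takes extents as a single (start, end) tuple, which may not match the real signature, and get_overlap_open is a hypothetical name.

```python
def get_overlap_open(extents, time_segment):
    """Intersect (start, end) extents with a segment whose bounds may be None."""
    ext_start, ext_end = extents
    seg_start, seg_end = time_segment
    # A None bound means "unbounded", so the extent's own bound applies.
    start = ext_start if seg_start is None else max(ext_start, seg_start)
    end = ext_end if seg_end is None else min(ext_end, seg_end)
    if start > end:
        raise RuntimeError('No overlap')
    return (start, end)

whole = get_overlap_open((10, 20), (None, None))
tail = get_overlap_open((10, 20), (15, None))
```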

Complex Coverage

An interface (which we've typically referred to as the complex coverage) that can aggregate one or more coverages.

  • Complex Coverages are read only
  • Complex Coverages have their own parameter dictionaries
  • Queries return fill values for missing parameters
  • A first pass solution would be array concatenation
  • (second pass) ERDDAP will need proper slicing support, i.e. if all the coverage's parameter values were treated as a single array and sliced, that is the same value that is returned from the complex coverage.
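
The first-pass idea in the list above (concatenate, and return fill values for parameters a coverage lacks) can be sketched as follows. It assumes each referenced coverage has already been read into a dict of arrays that includes 'time', that at least one coverage is given, and that 'time' is among the requested names; aggregate is a hypothetical helper, not part of the coverage model.

```python
import numpy as np

def aggregate(coverage_data, param_names, fill_value=-9999999.0):
    """Concatenate per-coverage dicts, filling parameters a coverage lacks."""
    columns = {name: [] for name in param_names}
    for data in coverage_data:
        n = len(data['time'])
        for name in param_names:
            if name in data:
                columns[name].append(np.asarray(data[name], dtype=float))
            else:
                # Missing parameter: a fill-value block as long as this coverage's time axis.
                columns[name].append(np.full(n, fill_value))
    merged = {name: np.concatenate(parts) for name, parts in columns.items()}
    order = np.argsort(merged['time'], kind='stable')  # present results in time order
    return {name: values[order] for name, values in merged.items()}

a = {'time': [0, 1, 2], 'temp': [10, 11, 12]}
b = {'time': [3, 4], 'salinity': [35, 36]}
result = aggregate([a, b], ['time', 'temp', 'salinity'])
```

Because every column ends up the same length as the time column, slicing the result behaves like slicing one combined array, which is what the second-pass bullet asks for.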

Fill Value Dense Arrays

For parameters that are missing observations, can we fill them completely with fill values?

So, if the parameters are explicitly or implicitly requested they always return an array of the same size as the time values.

Advanced Selectors

This is a low priority.

For the first integration iteration of the new coverage model, ERDDAP is doing selectors at the application layer and it's a bit inefficient but still ok.

PyDAP gets a list of selectors in the OPeNDAP query which look like:

('time', '>=', '0.0')
('time', '<=', '3.1556736E9')
('temp', '>', '5')
('temp', '<', '7')
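
A rough sketch of what application-layer selector handling amounts to: each tuple becomes a boolean mask and the masks are AND-ed together. The data layout (a dict of equal-length numpy arrays) and the apply_selectors helper are assumptions for illustration, not the pydap handler's actual code.

```python
import operator
import numpy as np

# Comparison operators as they appear in the selector tuples.
OPS = {
    '<': operator.lt, '<=': operator.le,
    '>': operator.gt, '>=': operator.ge,
    '=': operator.eq, '!=': operator.ne,
}

def apply_selectors(data, selectors):
    """Return only the records that satisfy every (name, op, value) selector."""
    n = len(next(iter(data.values())))
    mask = np.ones(n, dtype=bool)
    for name, op, value in selectors:
        mask &= OPS[op](data[name], float(value))
    return {name: values[mask] for name, values in data.items()}

data = {'time': np.arange(5.0), 'temp': np.array([4.0, 5.5, 6.5, 7.0, 6.0])}
selectors = [('time', '>=', '0.0'), ('time', '<=', '3.1556736E9'),
             ('temp', '>', '5'), ('temp', '<', '7')]
selected = apply_selectors(data, selectors)
```

The inefficiency is that all of the data is read before any of it is filtered; pushing the time selectors into time_segment would avoid part of that.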

References:
http://www.opendap.org/pdf/ESE-RFC-004v1.1.pdf

Sparse types named lon raise error

I'm not entirely sure if it's name related but here's a test that shows it in action.

If the sparse parameter is named lon, it raises an error; if it is named lat, it doesn't.

Support Temporally Neighboring Observation Lookup

Extrapolation must currently be supported using only interpolated data because data points outside a window aren't provided. In order to provide better support for extrapolation, the coverage model should add the ability to request the data points that are temporally closest to, but not included in, the lookup time segment.
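
As an illustration of the lookup being requested, the sketch below finds the closest observations just outside a time window on a sorted time array. It works on an in-memory array only; how the storage layer would locate these points is a separate question. neighbors_outside is a hypothetical helper.

```python
import numpy as np

def neighbors_outside(times, start, end):
    """Indices of the nearest points before start and after end (None if absent).

    Assumes times is sorted in ascending order.
    """
    left = int(np.searchsorted(times, start, side='left')) - 1
    right = int(np.searchsorted(times, end, side='right'))
    before = left if left >= 0 else None
    after = right if right < len(times) else None
    return before, after

times = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
inner = neighbors_outside(times, 10.0, 30.0)   # neighbours at indices 0 and 4
edges = neighbors_outside(times, 0.0, 40.0)    # nothing outside the window
```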

R2 to R3 Coverage Migration

Implement a coverage converter that automatically loads and converts R2 coverages to R3 coverages.

The conversion should keep track of migration, use R3 version if it exists, and provide a mechanism to reclaim disk space of migrated R2 coverages.

Coverage Model Version

Can we bump the coverage model version?

Check with Matt, but I'd like to call this either version 2 or version 3.

is_empty

Add an is_empty() method to the Coverage interface and remove insert_timesteps and num_timesteps.

We came to the same conclusion, that the number of timesteps for a coverage is irrelevant and misleading.

Root Module Imports

Could we add key class imports to the root module coverage_model/__init__.py to make client importing easier for the general case?

  • SimplexCoverage
  • ParameterDictionary
  • ParameterTypes
  • NumpyParameterData

just for example

slice object not iterable error

  File "/Users/luke/Documents/Dev/code/coi2/extern/pyon/pyon/util/breakpoint.py", line 86, in wrapper
    return func(*args, **kwargs)
  File "/Users/luke/Documents/Dev/code/coi2/ion/services/dm/inventory/data_retriever_service.py", line 179, in retrieve
    retrieve_data = self.retrieve_oob(dataset_id=dataset_id,query=query,delivery_format=delivery_format)
  File "/Users/luke/Documents/Dev/code/coi2/ion/services/dm/inventory/data_retriever_service.py", line 164, in retrieve_oob
    return rdt.to_granule()
  File "/Users/luke/Documents/Dev/code/coi2/ion/services/dm/utility/granule/record_dictionary.py", line 215, in to_granule
    granule.record_dictionary[self.to_ordinal(key)] = self[key]
  File "/Users/luke/Documents/Dev/code/coi2/ion/services/dm/utility/granule/record_dictionary.py", line 350, in __getitem__
    return pfv[:]
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/parameter_values.py", line 306, in __getitem__
    return _cleanse_value(r, time_segment)
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/parameter_values.py", line 33, in _cleanse_value
    if ret.size == 1 and not np.atleast_1d([isinstance(s, slice) for s in slice_]).all():

I'm going to look into it further but it's only happening for parameter functions.
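
Going by the issue title, one plausible cause is that slice_ arrives as a bare slice (for example from pfv[:]) rather than a tuple of indices, so iterating over it fails. The sketch below is a guess at a guard, not a confirmed fix; as_index_tuple and all_slices are hypothetical names.

```python
def as_index_tuple(slice_):
    """Wrap a single index (slice, int, ...) in a tuple so it can be iterated."""
    return slice_ if isinstance(slice_, tuple) else (slice_,)

def all_slices(slice_):
    """True if every index component is a slice object."""
    return all(isinstance(s, slice) for s in as_index_tuple(slice_))

# A bare slice is not iterable, which matches the error in the title.
try:
    iter(slice(None))
    bare_slice_iterable = True
except TypeError:
    bare_slice_iterable = False
```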

Cannot create an OBJECT array from memory buffer

I was testing for out-of-order data and came across this exception. I'm going to try to narrow down specifically what is causing it.

   ----- exception: cannot create an OBJECT array from memory buffer -----
ap/handlers/coverage/coverage_handler.py:244    data = self.get_data(cov, name, bitmask)
ap/handlers/coverage/coverage_handler.py:144    data = self.get_values(cov, name)
ap/handlers/coverage/coverage_handler.py:226    data_dict = cov.get_parameter_values(param_names=[field], fill_empty_params=True).get_data()
overage-model/coverage_model/coverage.py:395    return self._persistence_layer.read_parameters(param_names, time_segment, time, sort_parameter, fill_empty_params=fill_empty_params)
l/storage/parameter_persisted_storage.py:269    np_dict, function_params, rec_arr = self.get_data_products(params, time_range, time, sort_parameter, create_record_array=True, fill_empty_params=fill_empty_params)
l/storage/parameter_persisted_storage.py:276    associated_spans = self._get_span_dict(params, time_range, time)
l/storage/parameter_persisted_storage.py:266    return SpanTablesFactory.get_span_table_obj().get_spans(coverage_ids=self.master_manager.guid, decompressors=self.value_list)
e_model/storage/postgres_span_storage.py:102    spans.append(Span.from_json(data, decompressors))
verage-model/coverage_model/data_span.py:58     uncompressed_params[str(param)] = decompressors[param].decompress(data)
l/storage/parameter_persisted_storage.py:629    vals = base64decode(obj)
l/storage/parameter_persisted_storage.py:588    arr = np.frombuffer(base64.decodestring(loaded[1]),data_type)
2014-05-08 13:24:43,571 ERROR Dummy-167 ion.util.pydap.handlers.coverage.coverage_handler:295 Problem reading cov Simplex Coverage for 45b11959d81144f89cb86c19d6fc6b0d cannot create an OBJECT array from memory buffer
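
The message itself comes from numpy: np.frombuffer refuses to build an array of dtype object from raw bytes, because those bytes would have to be interpreted as Python object pointers. A minimal reproduction, independent of the coverage model (the byte string here is illustrative):

```python
import numpy as np

raw = np.arange(4, dtype='<f8').tobytes()

# A numeric dtype can be rebuilt from the raw bytes.
ok = np.frombuffer(raw, dtype='<f8')

try:
    np.frombuffer(raw, dtype=object)
    failed = False
except ValueError:
    # numpy cannot reinterpret raw bytes as Python objects.
    failed = True
```

This suggests that the span being decoded was stored with an object dtype, which would need a different serialization path than raw bytes; whether out-of-order data triggers that is still to be confirmed.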

Open Interval Support

Can we add support for open intervals in the time_segment argument? An open bound would be indicated by None.
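
The requested semantics, sketched on a plain numpy time array: a None bound simply skips the comparison on that side. select_time_segment is a hypothetical helper used only to illustrate the behaviour.

```python
import numpy as np

def select_time_segment(times, time_segment):
    """Filter times by (start, end); None leaves that side of the interval open."""
    start, end = time_segment
    mask = np.ones(times.shape, dtype=bool)
    if start is not None:
        mask &= times >= start
    if end is not None:
        mask &= times <= end
    return times[mask]

times = np.arange(10.0)
from_five = select_time_segment(times, (5, None))
up_to_two = select_time_segment(times, (None, 2))
everything = select_time_segment(times, (None, None))
```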

Array types return object arrays

For now it's ok but in the future, I think it would make the interface better if it returned a proper numpy array with a definite shape. Mostly this is to assist the pydap handler in properly formatting the array for ERDDAP.
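
Until the interface does this itself, a client can convert on its side when every element has the same length. A sketch, assuming the returned value is a 1-D object array whose elements are equal-length 1-D arrays (the sample data is made up):

```python
import numpy as np

# Hypothetical object array as a reader might return it: one element per
# timestep, each element a 1-D array of the same length.
obj = np.empty(3, dtype=object)
for i in range(3):
    obj[i] = np.arange(4, dtype='<f4') + i

# Stack the elements into a regular 2-D array with a definite shape.
dense = np.stack(list(obj))
```

np.stack raises a ValueError if the elements differ in shape, so ragged data still needs separate handling.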

Well Done

After just a few tweaks, I got ingest, retrieve and erddap working.

There are a lot of edge cases that we'll have to work through with the various data types, but at least we have data flowing in the right direction.

[Screenshot: ERDDAP]

Sparse Arrays

I can successfully set data as an array, but a ValueError is raised when attempting to call get_parameter_values.

Traceback (most recent call last):
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/test/test_postgres_storage.py", line 561, in test_sparse_arrays
    cov.get_parameter_values().get_data()
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/coverage.py", line 401, in get_parameter_values
    vals = self._persistence_layer.read_parameters(param_names, time_segment, time, sort_parameter, stride_length=stride_length, fill_empty_params=fill_empty_params)
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/storage/parameter_persisted_storage.py", line 268, in read_parameters
    np_dict, function_params, rec_arr = self.get_data_products(params, time_range, time, sort_parameter, stride_length=stride_length, create_record_array=True, fill_empty_params=fill_empty_params)
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/storage/parameter_persisted_storage.py", line 280, in get_data_products
    np_dict = self._create_parameter_dictionary_of_numpy_arrays(numpy_params, function_params, stride_length=stride_length, params=dict_params)
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/storage/parameter_persisted_storage.py", line 419, in _create_parameter_dictionary_of_numpy_arrays
    fill_value=self.value_list[param_name].fill_value)
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/parameter_data.py", line 155, in merge_data_as_numpy_array
    arr = obj_dict[key].get_data_as_numpy_array(alignment_array, fill_value=fill_value, arr=arr)
  File "/Users/luke/Documents/Dev/code/coi2/extern/coverage-model/coverage_model/parameter_data.py", line 182, in get_data_as_numpy_array
    arr[alignment_array >= self.start] = self._data
ValueError: NumPy boolean array indexing assignment cannot assign 2 input values to the 20 output values where the mask is true
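
The final frame is a shape mismatch in boolean-mask assignment: the mask selects 20 positions but the right-hand side holds 2 values. A stand-alone reproduction with illustrative values (not the test's actual data):

```python
import numpy as np

alignment = np.arange(20.0, 40.0)             # 20 time values
arr = np.full(alignment.shape, -9999999.0)
mask = alignment >= 0                         # True for all 20 positions

try:
    # Two values cannot be spread over 20 masked positions.
    arr[mask] = np.array([1.0, 2.0])
    mismatch = False
except ValueError:
    mismatch = True

# A scalar broadcasts to every masked position.
arr[mask] = 4096.0
```

For an array-valued sparse parameter, the target array would presumably need one row per masked position (or an object dtype) so that the whole value can be assigned to each timestep.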
