icss's People

Contributors

hustonhoburg, jessecrouch

Forkers

circleback

icss's Issues

Icss::Message load booms loading a geo ICSS

On the 'flip' branch, after I require 'icss/core_types' by hand in icss.rb and change the global Settings[:catalog_root] flag to point to the correct place, the lib crashes on loading the geonames ICSS (geo/location/geonames/places.icss.yaml):

/Users/dsnyder/.rubylib/icss/message.rb:88:in `receive_is_geo': uninitialized constant Icss::Meta::Req::PointWithRadiusGeolocator (NameError)
from /Users/dsnyder/.rubylib/icss/type/record_model.rb:19:in `block in receive!'
from /Users/dsnyder/.rubylib/icss/type/record_model.rb:15:in `each'
from /Users/dsnyder/.rubylib/icss/type/record_model.rb:15:in `receive!'
from /Users/dsnyder/.rubylib/icss/type/record_type.rb:33:in `receive'
from /Users/dsnyder/.rubylib/icss/type/structured_schema.rb:80:in `block in receive'
from /Users/dsnyder/.rubylib/icss/type/structured_schema.rb:80:in `each'
from /Users/dsnyder/.rubylib/icss/type/structured_schema.rb:80:in `receive'
from /Users/dsnyder/.rubylib/icss/type/record_type.rb:138:in `block in rcvr'
from /Users/dsnyder/.rubylib/icss/protocol.rb:120:in `block in receive_messages'
from /Users/dsnyder/.rubylib/icss/type/type_factory.rb:106:in `with_namespace'
from /Users/dsnyder/.rubylib/icss/protocol.rb:119:in `receive_messages'
from /Users/dsnyder/.rubylib/icss/type/record_model.rb:19:in `block in receive!'
from /Users/dsnyder/.rubylib/icss/type/record_model.rb:15:in `each'
from /Users/dsnyder/.rubylib/icss/type/record_model.rb:15:in `receive!'
from /Users/dsnyder/.rubylib/icss/type/record_type.rb:33:in `receive'
from /Users/dsnyder/.rubylib/troop.rb:63:in `initialize'
from /usr/local/bin/troop:51:in `new'
from /usr/local/bin/troop:51:in `block in show'
from /usr/local/bin/troop:50:in `each'
from /usr/local/bin/troop:50:in `show'
from /usr/local/bin/troop:77:in `<main>'

Icss serialization fails for to_yaml with 'anonymous class'

Flip branch

p = Icss::Meta::Protocol.receive(YAML.load(File.read("../infochimps_explorer/catalog/datasets/geo/location/geonames/places.icss.yaml")))
=> #<Icss::Meta::Protocol:0x00000100d099f8 @protocol="places", @namespace="geo.location.geonames", @types=[Icss::Geo::Location::Geonames::ExtendedIdentifiers, Icss::Geo::Location::Geonames::ExtendedProperties, Icss::Geo::Location::Geonames::FeatureCodes, Icss::Geo::GeonamesPlace], @_doc_hints={}, @messages={:search=>#<Icss::Meta::Message:0x00000100cc59b0 @request=[#<Icss::Meta::RecordField:0x00000100cc5690 @name=:params, @type=Icss::Geo::Place, @_extra_params={}, @validation_context=nil, @errors={}, @is_reference=true>], @response=Icss::Geo::GeonamesPlace, @samples=[#<Icss::Meta::MessageSample:0x00000100cc3070 @request=[{"radius"=>1000, "latitude"=>30.3, "longitude"=>-97.75, "zoom_level"=>9, "tile_x"=>23, "tile_y"=>58, "address_text"=>"1214 W 6th St. Suite 202 Austin, TX 78703", "bbox"=>"30.2,-97.85,30.4,-97.65", "q"=>"park"}], @message=#<Icss::Meta::Message:0x00000100cc59b0 ...>>], @request_decorators={:anchors=>[]}, @response_referenceness=true, @protocol=#<Icss::Meta::Protocol:0x00000100d099f8 ...>, @name=:search>}, @data_assets=[], @code_assets=[], @targets={}, @extra_params={}, @validation_context=nil, @errors={}>
p.to_hash.to_yaml
TypeError: can't dump anonymous class Class
from /Users/dsnyder/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/syck/rubytypes.rb:6:in `to_yaml'

This fails due to the presence of raw klasses like Icss::Geo::Place which cannot be serialized. I guess the lib should not serialize out the is_a associations once the fields have already been substituted in, or klasses should be converted back to their downcase string form, i.e. geo.place.
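
A rough sketch of the second suggestion, not the lib's actual behavior: walk the hash before dumping and replace every Class value with its dotted lowercase name. The helper names dotted_name and dump_safe are hypothetical, and dropping the leading Icss module is an assumption made to reproduce the geo.place form mentioned above.

```ruby
# Hypothetical helpers; nothing here is part of the icss lib.

# Turns a named class into its dotted lowercase form,
# e.g. Icss::Geo::Place -> "geo.place" (assumes the root Icss module is dropped).
def dotted_name(klass)
  segments = klass.name.split('::')
  segments.shift if segments.first == 'Icss'
  segments.map { |seg| seg.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase }.join('.')
end

# Recursively replaces Class values so the result holds only plain data.
def dump_safe(obj)
  case obj
  when Class then dotted_name(obj)
  when Hash  then obj.each_with_object({}) { |(key, val), out| out[key] = dump_safe(val) }
  when Array then obj.map { |val| dump_safe(val) }
  else obj
  end
end

# Stand-in for a vivified type, only so the sketch runs on its own.
module Icss; module Geo; class Place; end; end; end

field = { :name => :params, :type => Icss::Geo::Place }
p dump_safe(field)
```

A truly anonymous class has no name, so this sketch would still fail on one; it only covers klasses that were assigned to a constant.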

Icss::Meta::Protocol.find wildcard search does not work

acts_as_catalog.rb says 'Name can include wildcards (*) for simple searching', but

Icss::Meta::Protocol.find("social.network.tw.metrics") and Icss::Meta::Protocol.find("social.network.tw.influence") find results, while Icss::Meta::Protocol.find("social.network.tw.*") raises Icss::NotFoundError: Cannot find social.*.tw.influence
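
For reference, a minimal sketch of the wildcard behavior the comment describes, assuming a flat registry keyed by full dotted name. REGISTRY_SKETCH and find_all_sketch are hypothetical and say nothing about how acts_as_catalog.rb actually stores or looks up entries.

```ruby
# Hypothetical registry keyed by full protocol name; not the real catalog.
REGISTRY_SKETCH = {
  'social.network.tw.metrics'    => :metrics_protocol,
  'social.network.tw.influence'  => :influence_protocol,
  'geo.location.geonames.places' => :places_protocol
}

# Returns every registered value whose name matches the pattern.
# A '*' matches any run of characters; everything else is literal.
def find_all_sketch(pattern)
  escaped = Regexp.escape(pattern).gsub('\*', '.*')
  regexp  = /\A#{escaped}\z/
  REGISTRY_SKETCH.select { |name, _| name =~ regexp }.values
end

p find_all_sketch('social.network.tw.*')
p find_all_sketch('social.network.tw.metrics')
```

Escaping the pattern first keeps the dots in the namespace from acting as regexp wildcards.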

GeoIndex target is missing

The GeoIndex target is missing from lib/icss/target.rb.

Without it I get an error on this page (staging only at the minute):
datasets/vector-maps-from-automotive-navigation-data

IdenticalFactory doesn't serialize correctly

Flip branch

":packages: !seq:Icss::ArrayOfHashOfMetaDotIdenticalFactory \n      - !map:Icss::HashOfMetaDotIdenticalFactory"

Icss::Meta::IdenticalFactory doesn't convert to a plain Array or Hash correctly, so I get problems trying to dump an ICSS to YAML.
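
As a possible workaround sketch (an assumption, not a fix in the lib): convert Array and Hash subclasses to plain collections before dumping, so the YAML carries no class tags. ArrayOfThingsSketch, HashOfThingsSketch and deep_plain are hypothetical stand-ins for the factory-generated classes.

```ruby
require 'yaml'

# Hypothetical stand-ins for the factory-generated collection classes.
class ArrayOfThingsSketch < Array; end
class HashOfThingsSketch  < Hash;  end

# Recursively rebuilds the structure out of plain Hash and Array objects.
def deep_plain(obj)
  case obj
  when Hash  then obj.each_with_object({}) { |(key, val), out| out[key] = deep_plain(val) }
  when Array then obj.map { |val| deep_plain(val) }
  else obj
  end
end

entry = HashOfThingsSketch.new
entry['name'] = 'foo'
packages = ArrayOfThingsSketch.new
packages << entry

puts deep_plain('packages' => packages).to_yaml
```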

Icss::RecordField now redefines length in terms of Hash

The spec describes the field attribute length as "a length constraint, for downstream consumers that choose to take advantage of it (optional)."

Using ActsasHash, the record field now defines length in terms of the number of keys in the hash, and it is always present. If field#length is not a length constraint, it should not be defined, or else the specification should be adjusted.

irb dump: https://gist.github.com/1029549.js?file=irb_dump.rb

Infochimps Core Types collection

  • Import Schema.org as a pile of ICSS
  • Modify ICSS to read into a global context (i.e. there is a directory of protocols and types; loading an icss file simply imports it into that namespace).
  • Properties should be named with the natural #underscore transliteration of the Schema.org property: interactionCount becomes interaction_count
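
The transliteration in the last point can be sketched as below. underscore_sketch is a hypothetical helper written out for illustration; it mirrors the usual #underscore approach rather than quoting any particular library's implementation.

```ruby
# Converts a camelCase property name to snake_case.
def underscore_sketch(name)
  name.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')  # split acronym runs before a capitalized word
      .gsub(/([a-z\d])([A-Z])/, '\1_\2')      # split lower-to-upper boundaries
      .downcase
end

puts underscore_sketch('interactionCount')
```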

Datatype System

  • Schema.org is a well-thought-out approach to ontology.
  • It has several types we'd regard as superfluous ('HealthAndBeautyBusiness').
  • It lacks many we'd regard as primary: PhysicalParticle, but especially Region.
  • It's syntactically incompatible with e.g. GeoJSON, but not irreconcilable.
  • We love the ICSS auto-vivified klass mechanism.

Questions to answer

Org::GeoJSON::Place vs Org::Schema::Point

There's a proposal over at the geo_adventure page. It's fundamentally sound, but

Additions to the list of Core types

@sya is assembling preliminary candidates for those core types.

What is the name for the Thing > CreativeWork > Article object in Schema.org?

There are a couple of questions:

  • the namespace for all Schema.org (or other top-level types)
  • the way inheritance is described
  • the namespace for derived types in a target domain

Namespace

In this example, Article is a core type, exactly equivalent to the Schema.org type.

  • com.infochimps.article
  • org.schema.article
  • icss_type.article
  • icss.type.article
  • icss.article
  • core_type.article
  • core.article
  • article

When the icss lib vivifies a type, it needs to be in a module namespace to avoid collision -- it looks like it's Icss::Type right now. So:

  • Icss::Type::Article
  • Icss::Type::CoreType::Article
  • Icss::Type::Core::Article
  • Icss::Type::Com::Infochimps::Article
  • Icss::Type::Org::Schema::Article
  • Icss::Article
  • Com::Infochimps::Article
  • Org::Schema::Article
  • CoreType::Article
  • Core::Article
  • Article

The last one is clearly dangerous.

I lean towards Icss::Type::Article.

Inheritance

Prefixed by the namespace above,

  • CreativeWork::Article, or
  • Article

I lean pretty heavily toward the last, Article. The inheritance/mixin provenance is well captured by, well, ruby's inheritance/mixin mechanism.

Derived Types

Suppose the dataset content.corpora.nytimes.historical_archive (i.e. ICSS namespace content.corpora.nytimes) has a type NytimesArticle < Article. (In this case, assume that there's nothing like a NewspaperArticle core type -- we're only talking about non-core properties.)

  • Content::Corpora::Nytimes::NytimesArticle
  • Content::Corpora::Nytimes::Article
  • Icss::Type::Article::NytimesArticle
  • Icss::Type::NytimesArticle

The top-level NytimesArticle is dangerous. Content::Corpora::Nytimes::NytimesArticle is what ICSS would imply by, in the protocol, declaring type NytimesArticle. I think it's clearly Content::Corpora::Nytimes::NytimesArticle or Icss::Content::Corpora::Nytimes::NytimesArticle
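
To make the second option concrete, here is a sketch of vivifying a derived type under a root module from its dotted ICSS namespace. IcssSketch, vivify and camelize are hypothetical names, and the real type factory may work quite differently.

```ruby
module IcssSketch
  class Article; end   # stand-in for a core type

  # "historical_archive" -> "HistoricalArchive"
  def self.camelize(str)
    str.split('_').map(&:capitalize).join
  end

  # Creates (or reuses) nested modules for each namespace segment, then
  # defines the type as a class inside the innermost module.
  def self.vivify(namespace, type_name, superklass = Object)
    mod = namespace.split('.').inject(self) do |parent, segment|
      const = camelize(segment)
      if parent.const_defined?(const, false)
        parent.const_get(const, false)
      else
        parent.const_set(const, Module.new)
      end
    end
    return mod.const_get(type_name, false) if mod.const_defined?(type_name, false)
    mod.const_set(type_name, Class.new(superklass))
  end
end

klass = IcssSketch.vivify('content.corpora.nytimes', 'NytimesArticle', IcssSketch::Article)
puts klass.name
puts klass.superclass
```

Keeping everything under one root module means a vivified type can never collide with a top-level constant, which is the danger noted for the bare NytimesArticle.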

ICSS Changes/fixes

  • Possibly rails-like associations to create parent/child relationships
  • Bounded transparency between things being passed around as arrays or hashes
  • Receive/after_receive works great for delegating down, but we still need more information before receiving (possibly a before receive) to fix things like namespaces

ICSS + Gorillib migration

Pushed the new gorillib and icss + gorillib to master.

Integration points are george, apeyeye, hackboxen, buzzkill and troop.

active_support vs gorillib

For george, check that the icss/core_ext isn't included -- see infochimps/george#4

If your library is non-rails yet you feel it can't be migrated off active_support or extlib, please let me know.

core extensions

In hackboxen, troop and apeyeye, you'll see some cases where perhaps it used to know about symbolize_keys and now it doesn't, etc. Gorillib requires you to explicitly enumerate how you're extending base classes, so either in icss/core_ext.rb (if it's in icss) or in your project, add the appropriate require explicitly.

Foo.receive(*constructor_args, hsh)

The signature of the class-level Foo.receive has changed. The class-level method creates a new instance, obj = self.new, and then invokes obj.receive!(hsh) on it. A very few places want to pass in constructor args. Foo.receive's signature used to be receive(hsh, *constructor_args), but that a) is non-vernacular and b) might leave you thinking the hsh gets applied first. So it's now receive(*constructor_args, hsh), using the extract_options! pattern.
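
A minimal sketch of that pattern, written out by hand so it needs no active_support or gorillib. ReceiverSketch and its attributes are hypothetical; only the receive(*constructor_args, hsh) shape comes from the description above.

```ruby
class ReceiverSketch
  attr_reader :source, :name

  def initialize(source = nil)
    @source = source
  end

  # Applies each key of the hash as an instance variable.
  def receive!(hsh)
    hsh.each { |key, val| instance_variable_set("@#{key}", val) }
    self
  end

  # receive(*constructor_args, hsh): a trailing Hash is popped off and
  # received; whatever is left goes to the constructor.
  def self.receive(*args)
    hsh = args.last.is_a?(Hash) ? args.pop : {}
    new(*args).receive!(hsh)
  end
end

obj = ReceiverSketch.receive('catalog', { :name => 'places' })
puts obj.source
puts obj.name
```

One consequence of this pattern is that a constructor whose own last argument is a Hash cannot be told apart from the attribute hash.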

I scanned through to find out where the fancy syntax was being used, and though I think I got them all, I may have missed some.

Hashlike

  • A receiver has_key?(:foo) when the instance variable has been set (@foo exists) whether nil or not.
  • When a receiver object acts as hash, it is now almost indistinguishable in behavior from a hash. I doubt it will come up, but its behavior may have been subtly un-hashlike before in a way you're depending on.
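
The has_key? rule in the first point can be illustrated with a small sketch. HashlikeSketch is a hypothetical class and is not how the receiver mixin is actually implemented.

```ruby
class HashlikeSketch
  attr_accessor :foo, :bar

  # A key counts as present once its instance variable has been set,
  # even if the value it holds is nil.
  def has_key?(key)
    instance_variable_defined?("@#{key}")
  end
end

obj = HashlikeSketch.new
obj.foo = nil
p obj.has_key?(:foo)
p obj.has_key?(:bar)
```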

:catalog_root load is brittle

Settings.define :catalog_root, :type => :filename, :default => (defined?(Rails) ? (Rails.root+'catalog') : File.expand_path('../../infochimps_catalog', __FILE__))

This is ridiculous. If we're going to commit to core_types, their location and version need to be consistent and reliable. Requiring them isn't just a 'bonus'; the whole thing crashes and burns without them. You've hotfixed this so it will work on your laptop and for infochimps_explorer; that's the extent of its portability.

Quick, tell me what version of the core_types George is using? Troop is using? Infochimps_explorer is using? Cornelius is using (soon)? Ingestion is using (soon)?

Core types need to be, at the least, a pegged submodule of ICSS lib. When I need to read an ICSS, there should be NO external dependencies. This lib is peppered throughout our infrastructure in various states in george, apeyeye, infochimps_explorer, troop, among others, and therefore every bit of it needs to be rock solid.

to_xml

Right now the to_xml method in icss/type/factory.rb doesn't function correctly. For clarification, make a trstrank.xml? call and tail the apeyeye logs.

Proposed Changes to Icss Layout

Targets

Targets should be changed from a hash of hashes...

targets:
  catalog:
  - foo: bar
    baz: qux
  apeyeye:
  - code_assets:
    - my_foobar_asset

...to an array of hashes because there can (and most often will be) more than one entry under targets:

targets:
- catalog:
  - foo: bar
    baz: qux
- apeyeye:
  - code_assets:
    - my_foobar_asset

Applying this idea to the individual entries under targets (catalog, apeyeye, geo_index...) makes each entry a hash, as there should only ever be one per dataset:

targets:
- catalog:
    foo: bar
    baz: qux
- apeyeye:
    code_assets:
    - my_foobar_asset
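
A sketch of migrating an existing file from the first layout to the last one, assuming each target's list of hashes can simply be merged into one hash. convert_targets_sketch is a hypothetical helper.

```ruby
require 'yaml'

OLD_TARGETS_SKETCH = <<~EOS
  targets:
    catalog:
    - foo: bar
      baz: qux
    apeyeye:
    - code_assets:
      - my_foobar_asset
EOS

# Turns { name => [hash, ...] } into [{ name => merged_hash }, ...].
def convert_targets_sketch(old_targets)
  old_targets.map do |target_name, entries|
    { target_name => entries.inject({}) { |merged, entry| merged.merge(entry) } }
  end
end

old = YAML.safe_load(OLD_TARGETS_SKETCH)
puts({ 'targets' => convert_targets_sketch(old['targets']) }.to_yaml)
```

If a target ever did list the same key in two entries, the later entry would win in this sketch.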

Catalog

Example:

- catalog:
    id: 98d83n3kjak22nda33
    title: Foo Bar Dataset
    desc: The best dataset in the world.
    tags:
    - foo
    - bar
    apeyeye:
      message: lookup
      code_asset: my_foobar_asset

  • It has been suggested that title:, desc: and tags: be moved to top-level entries. Doing so makes the Icss overlay easier (all information that might be adjusted would live in a very accessible place), but for publishing purposes it makes more sense conceptually to have all information being passed to the site contained in one location (i.e. the catalog: entry). desc: could become doc: and still adhere to the Avro specification, but the other headings would not...

Pig and Mysql field mappings have been ripped out

In a fit of surely inspired madness, the pig and mysql field mappings were removed. They are used all over troop, and regardless of whether they actually belonged there, I would like to adopt a better deprecation cycle than CNTL-K.

  # Registry for primitive types
  unless defined?(::Icss::Type::PRIMITIVE_TYPES)
    ::Icss::Type::PRIMITIVE_TYPES = {}
    ::Icss::Type::PRIMITIVE_TYPES[:null]    = PrimitiveType.receive(:ruby_klass => NilClass, :pig_name => 'FIXME WHAT GOES HERE')
    ::Icss::Type::PRIMITIVE_TYPES[:boolean] = PrimitiveType.receive(:ruby_klass => Boolean,  :pig_name => 'FIXME WHAT GOES HERE')
    ::Icss::Type::PRIMITIVE_TYPES[:string]  = PrimitiveType.receive(:ruby_klass => String,   :pig_name => 'chararray', :mysql_name => 'VARCHAR')
    ::Icss::Type::PRIMITIVE_TYPES[:bytes]   = PrimitiveType.receive(:ruby_klass => String,   :pig_name => 'bytearray', :mysql_name => 'VARCHAR')
    ::Icss::Type::PRIMITIVE_TYPES[:int]     = PrimitiveType.receive(:ruby_klass => Integer,  :pig_name => 'int',       :mysql_name => 'INT')
    ::Icss::Type::PRIMITIVE_TYPES[:long]    = PrimitiveType.receive(:ruby_klass => Integer,  :pig_name => 'long',      :mysql_name => 'BIGINT')
    ::Icss::Type::PRIMITIVE_TYPES[:float]   = PrimitiveType.receive(:ruby_klass => Float,    :pig_name => 'float',     :mysql_name => 'FLOAT')
    ::Icss::Type::PRIMITIVE_TYPES[:double]  = PrimitiveType.receive(:ruby_klass => Float,    :pig_name => 'double',    :mysql_name => 'DOUBLE')
    #
    ::Icss::Type::PRIMITIVE_TYPES[:symbol]  = PrimitiveType.receive(:ruby_klass => Symbol,   :pig_name => 'chararray')
    ::Icss::Type::PRIMITIVE_TYPES[:time]    = PrimitiveType.receive(:ruby_klass => Time,     :pig_name => 'chararray')
    ::Icss::Type::PRIMITIVE_TYPES[:date]    = PrimitiveType.receive(:ruby_klass => Date,     :pig_name => 'chararray')
    ::Icss::Type::PRIMITIVE_TYPES.freeze
  end
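
One shape a gentler deprecation cycle could take, as a sketch only: keep the readers for a release but have them warn on use. PrimitiveTypeSketch and deprecated_reader are hypothetical and do not reflect the real PrimitiveType class.

```ruby
class PrimitiveTypeSketch
  attr_reader :ruby_klass

  def initialize(ruby_klass, pig_name = nil, mysql_name = nil)
    @ruby_klass = ruby_klass
    @pig_name   = pig_name
    @mysql_name = mysql_name
  end

  # Defines readers that still return the value but warn on stderr,
  # giving callers a release or two to migrate.
  def self.deprecated_reader(*names)
    names.each do |name|
      define_method(name) do
        warn "[DEPRECATION] #{self.class.name}.#{name} is deprecated and will be removed"
        instance_variable_get("@#{name}")
      end
    end
  end

  deprecated_reader :pig_name, :mysql_name
end

string_type = PrimitiveTypeSketch.new(String, 'chararray', 'VARCHAR')
puts string_type.pig_name
```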

Icss and yajl

The data team would like to see Icss as a fully functioning gem, and one of the things preventing this is the inclusion of the yajl gem for JSON parsing. Can we switch it to something installable by jruby? This is keeping us from being able to jruby gem install icss because of its dependencies.
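
One possible approach, sketched under the assumption that the lib only needs basic parse and encode: use yajl when it is installed and fall back to the stdlib json otherwise, so the gem no longer hard-depends on a C extension. JsonBackendSketch is a hypothetical module name.

```ruby
module JsonBackendSketch
  # Prefer yajl when present; otherwise use the json library that ships with Ruby.
  begin
    require 'yajl'
    BACKEND = :yajl
  rescue LoadError
    require 'json'
    BACKEND = :json
  end

  def self.parse(str)
    BACKEND == :yajl ? Yajl::Parser.parse(str) : JSON.parse(str)
  end

  def self.encode(obj)
    BACKEND == :yajl ? Yajl::Encoder.encode(obj) : JSON.generate(obj)
  end
end

p JsonBackendSketch.parse('{"q": "park", "radius": 1000}')
```

With this arrangement yajl would become an optional dependency rather than a required one in the gemspec.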
