go-whosonfirst-index

Go package for indexing Who's On First documents

Important

This package is no longer maintained.

Version "2"

Version "2" of this package was short-lived and quickly replaced by whosonfirst/go-whosonfirst-iterate. You should use that instead.

Version "1"

If you need to use the original "v1" package, specify it as follows in your go.mod file:

require (
	github.com/whosonfirst/go-whosonfirst-index v0.3.4
)

Documentation for this package is incomplete and will be updated shortly.

Example

package main

import (
	"context"
	"flag"
	"io"
	"log"

	"github.com/whosonfirst/go-whosonfirst-index/v2/emitter"
	"github.com/whosonfirst/go-whosonfirst-index/v2/indexer"
)

func main() {

	emitter_uri := flag.String("emitter-uri", "repo://", "A valid whosonfirst/go-whosonfirst-index/v2/emitter URI")

	flag.Parse()

	ctx := context.Background()

	// The callback is invoked once for each document the emitter finds.
	cb := func(ctx context.Context, fh io.ReadSeekCloser, args ...interface{}) error {
		path, _ := emitter.PathForContext(ctx)
		log.Printf("Indexing %s\n", path)
		return nil
	}

	idx, _ := indexer.NewIndexer(ctx, *emitter_uri, cb)

	uris := flag.Args()
	idx.Index(ctx, uris...)
}

Error handling removed for the sake of brevity.

Concepts

Indexer

Emitters

To be written

Interfaces

type EmitterInitializeFunc func(context.Context, string) (Emitter, error)

type EmitterCallbackFunc func(context.Context, io.ReadSeekCloser, ...interface{}) error

type Emitter interface {
	IndexURI(context.Context, EmitterCallbackFunc, string) error
}

To be written

URIs and Schemes

To be written

directory://

featurecollection://

file://

filelist://

geojsonl://

repo://

Filters

To be written

Differences from "v1"

There was never a "v1" release. The last published release before "v2" was v0.3.4.

  • Go 1.16 or higher is required.
  • The introduction of the emitter.Emitter interface separate from a general-purpose indexer.Indexer instance.
  • Migrating the index.NewIndexer function into the indexer package as indexer.NewIndexer.
  • Migrating the index.PathForContext function into the emitter package as emitter.PathForContext.
  • Replacing the index.Drivers function with emitter.Schemes in the emitter package.
  • The use of the aaronland/go-roster package to manage registered emitters.
  • Changing the requirement in emitter (previously indexer) callbacks from io.Reader to io.ReadSeekCloser.
  • The introduction of the filters.Filters interface, and corresponding emitter URI query parameters, for limiting results that are sent to emitter (previously indexer) callback functions.

Tools

$> make cli
go build -mod vendor -o bin/count cmd/count/main.go
go build -mod vendor -o bin/emit cmd/emit/main.go

count

Count files in one or more whosonfirst/go-whosonfirst-index/v2/emitter sources.

$> ./bin/count -h
Count files in one or more whosonfirst/go-whosonfirst-index/v2/emitter sources.
Usage:
	 ./bin/count [options] uri(N) uri(N)
Valid options are:

  -emitter-uri string
    	A valid whosonfirst/go-whosonfirst-index/v2/emitter URI. Supported emitter URI schemes are: directory://,featurecollection://,file://,filelist://,geojsonl://,repo:// (default "repo://")

For example:

$> ./bin/count \
	/usr/local/data/sfomuseum-data-architecture/

2021/02/17 14:07:01 time to index paths (1) 87.908997ms
2021/02/17 14:07:01 Counted 1072 records (1072) in 88.045771ms

Or:

$> ./bin/count \
	-emitter-uri 'repo://?include=properties.sfomuseum:placetype=terminal&include=properties.mz:is_current=1' \
	/usr/local/data/sfomuseum-data-architecture/
	
2021/02/17 14:09:18 time to index paths (1) 71.06355ms
2021/02/17 14:09:18 Counted 4 records (4) in 71.184227ms
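
The emitter URI in the example above carries two `include` query parameters, each of the form `{PATH}={VALUE}`. As an illustration of how such a URI breaks down, the sketch below parses it with the standard library's `net/url` package. The `includeFilter` type and `parseIncludes` function are hypothetical; this is not the package's own filter implementation.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// includeFilter is a hypothetical type pairing a property path with the
// value it is expected to have.
type includeFilter struct {
	Path  string
	Value string
}

// parseIncludes extracts every "include" query parameter from an emitter
// URI and splits it into a path and a value at the first "=".
func parseIncludes(emitterURI string) ([]includeFilter, error) {
	u, err := url.Parse(emitterURI)
	if err != nil {
		return nil, err
	}

	var filters []includeFilter
	for _, raw := range u.Query()["include"] {
		parts := strings.SplitN(raw, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("invalid include parameter %q", raw)
		}
		filters = append(filters, includeFilter{Path: parts[0], Value: parts[1]})
	}
	return filters, nil
}

func main() {
	uri := "repo://?include=properties.sfomuseum:placetype=terminal&include=properties.mz:is_current=1"

	filters, err := parseIncludes(uri)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range filters {
		fmt.Printf("%s must equal %s\n", f.Path, f.Value)
	}
}
```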

emit

Publish features from one or more whosonfirst/go-whosonfirst-index/v2/emitter sources.

$> ./bin/emit -h
Publish features from one or more whosonfirst/go-whosonfirst-index/v2/emitter sources.
Usage:
	 ./bin/emit [options] uri(N) uri(N)
Valid options are:

  -emitter-uri string
    	A valid whosonfirst/go-whosonfirst-index/v2/emitter URI. Supported emitter URI schemes are: directory://,featurecollection://,file://,filelist://,geojsonl://,repo:// (default "repo://")
  -geojson
    	Emit features as a well-formed GeoJSON FeatureCollection record.
  -json
    	Emit features as a well-formed JSON array.
  -null
    	Publish features to /dev/null
  -stdout
    	Publish features to STDOUT. (default true)

For example:

$> ./bin/emit \
	-emitter-uri 'repo://?include=properties.sfomuseum:placetype=museum' \
	-geojson \
	/usr/local/data/sfomuseum-data-architecture/ \
	| jq '.features[]["properties"]["wof:id"]'

1729813675
1477855937
1360521563
1360521569
1360521565
1360521571
1159157863

See also

go-whosonfirst-index's Issues

Move SQLite indexer in to a separate package

This will cause some amount of pain for certain tools, but it will mean that non-SQLite things that use this package will be able to cross-compile again. Currently this is not possible because the SQLite code depends on go-sqlite3.

Add utility callback for filtering records by existential flags

For example, take this which is in the go-whosonfirst-pip-v2 package and make it a high-level shared thingy for use in other packages...

type ApplicationIndexerOptions struct {
        IndexMode         string
        IsWOF             bool
        IncludeDeprecated bool
        IncludeSuperseded bool
        IncludeCeased     bool
        IncludeNotCurrent bool
}

func NewApplicationIndexer(appindex pip.Index, opts ApplicationIndexerOptions) (*index.Indexer, error) {

        cb := func(fh io.Reader, ctx context.Context, args ...interface{}) error {

                var f geojson.Feature

                if opts.IsWOF {

                        ok, err := pip_utils.IsValidRecord(fh, ctx)

                        if err != nil {
                                return err
                        }

                        if !ok {
                                return err
                        }

                        tmp, err := feature.LoadWOFFeatureFromReader(fh)

                        if err != nil {
                                return err
                        }

                        if !opts.IncludeNotCurrent {

                                fl, err := whosonfirst.IsCurrent(f)

                                if err != nil {
                                        return err
                                }

                                if fl.IsTrue() && fl.IsKnown() {
                                        return nil
                                }
                        }
 
                         // and so  on...

Ignore Emacs-style temporary files

Things like:

  • Files that start with #
  • Files that end with ~

For example this uses go-whosonfirst-index to build an in-memory PIP spatial database:

> go run -mod vendor cmd/point-in-polygon/main.go -query 'properties.sfomuseum:placetype=gallery' -query 'properties.mz:is_current=1' -spatial-database-uri 'sqlite://?dsn=:memory:' \
-spatial-source /usr/local/data/sfomuseum-data-architecture /usr/local/data/sfomuseum-data-architecture/
2021/02/16 11:26:49 PATHS [/usr/local/data/sfomuseum-data-architecture]
2021/02/16 11:26:50 INDEX 1729802937
2021/02/16 11:26:50 INDEX 1729802937
2021/02/16 11:26:51 -122.38771606591274 37.6143874695422 1
2021/02/16 11:26:51 Update /usr/local/data/sfomuseum-data-architecture/data/172/980/301/9/1729803019.geojson
2021/02/16 11:26:51 -122.38769587933967 37.61435000323429 1
2021/02/16 11:26:51 -122.38796986270393 37.61431349822708 2
2021/02/16 11:26:51 San Francisco Airport Commission Aviation Library and Louis A. Turpen Aviation Museum 1729802937
2021/02/16 11:26:51 San Francisco Airport Commission Aviation Library and Louis A. Turpen Aviation Museum 1729802937
2021/02/16 11:26:51 Failed crawl callback for /usr/local/data/sfomuseum-data-architecture/data/147/785/595/5/1477855955.geojson: Multiple results (2), after filtering
exit status 1

This fails because there is a #{WOFID}.geojson# file alongside the real record, so the same wof:id is indexed twice:

> git grep 1729802937 | grep wof:id
data/172/980/293/7/#1729802937.geojson#:    "wof:id": 1729802937,
data/172/980/293/7/1729802937.geojson:    "wof:id": 1729802937,
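
One possible check for the two patterns listed above is sketched below. The function name `isEmacsTempFile` is hypothetical, and the sketch covers only the two cases named in this issue (file names that start with `#` or end with `~`). It is not the package's actual fix.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isEmacsTempFile reports whether the final element of path looks like
// an Emacs auto-save file (starts with "#") or backup file (ends with "~").
func isEmacsTempFile(path string) bool {
	name := filepath.Base(path)
	return strings.HasPrefix(name, "#") || strings.HasSuffix(name, "~")
}

func main() {
	paths := []string{
		"data/172/980/293/7/#1729802937.geojson#",
		"data/172/980/293/7/1729802937.geojson",
	}

	for _, p := range paths {
		if isEmacsTempFile(p) {
			fmt.Println("skip ", p)
			continue
		}
		fmt.Println("index", p)
	}
}
```

An indexer callback could call a check like this first and return nil for matching paths, so that temporary files are never counted as records.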
