googlearchive / flashlight

A pluggable integration with ElasticSearch to provide advanced content searches in Firebase.

Home Page: http://firebase.github.io/flashlight/

Status: Archived

This repository has been archived and is no longer maintained.

Flashlight

This script can:

monitor multiple Firebase paths and index data in real time
communicate with clients entirely via Firebase (the client pushes search terms to search/request and reads results from search/response)
clean up old, outdated requests

Getting Started

Install and run ElasticSearch or add Bonsai service via Heroku
git clone https://github.com/firebase/flashlight
npm install
edit config.js (see comments at the top, you must set FB_URL and FB_SERVICEACCOUNT at a minimum)
node app.js (run the app)

Check out the recommended security rules in example/seed/security_rules.json. See example/README.md to seed and run an example client app.

If you experience errors like {"error":"IndexMissingException[[firebase] missing]","status":404}, you may need to manually create the index referenced in each path:

curl -X POST http://localhost:9200/firebase

To read more about setting up a Firebase service account and configuring FB_SERVICEACCOUNT, click here.

Client Implementations

Read example/index.html and example/example.js for a client implementation. It works like this:

Push an object to /search/request which has the following keys: index, type, and q (or body for advanced queries)
Listen on /search/response for the reply from the server

The body object can be any valid ElasticSearch DSL structure (see Building ElasticSearch Queries).
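As a rough sketch of the request side, the payload could be assembled like this. Note that buildSearchRequest is a hypothetical helper, not part of Flashlight; it only builds the object that would be pushed to /search/request.

```javascript
// Hypothetical helper (not part of Flashlight): builds the object a client
// would push to /search/request.
function buildSearchRequest(index, type, query) {
  var request = { index: index, type: type };
  if (typeof query === 'string') {
    request.q = query;     // simple text search
  } else {
    request.body = query;  // advanced query: any valid ElasticSearch DSL structure
  }
  return request;
}

console.log(buildSearchRequest('firebase', 'user', 'foo*'));
```

The returned object would then be pushed to the request path, and the reply read from the child of the response path that has the same key. See example/example.js for the complete flow.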

Deploy to Heroku

cd flashlight
heroku login
heroku create (add heroku to project)
heroku addons:add bonsai (install bonsai)
heroku config (check bonsai instance info and copy your new BONSAI_URL - you will need it later)
heroku config:set FB_NAME=<instance> FB_TOKEN="<token>" (declare environment variables)
git add config.js (update)
git commit -m "configure bonsai"
git push heroku master (deploy to heroku)
heroku ps:scale worker=1 (start dyno worker)

Setup Initial Index with Bonsai

After you've deployed to Heroku, you need to create your initial index name to prevent IndexMissingException error from Bonsai. Create an index called "firebase" via curl using the BONSAI_URL that you copied during Heroku deployment.

curl -X POST <BONSAI_URL>/firebase (ex: https://user:[email protected]/firebase)

Migration

0.2.0 -> 0.3.0

Flashlight now returns the direct output of ElasticSearch, instead of just returning the hits part. This change is required to support aggregations and include richer information. You must change how you read the response accordingly. You can see example responses of Flashlight below:

Before, in 0.2.0

"total" : 1000,
"max_score" : null,
"hits" : [
  ..
]

After, in 0.3.0

{
  "took" : 63,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1000,
    "max_score" : null,
    "hits" : [
      ..
    ]
  },
  "aggregations" : {
    ..
  }
}
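If a client has to cope with both response shapes during the migration, a small normalizing function is one option. The sketch below is only an illustration: extractHits is a hypothetical name, and it assumes the hits array may be missing entirely when there are no matches.

```javascript
// Hypothetical helper: returns { total, hits } for either response version.
function extractHits(response) {
  if (!response) {
    return { total: 0, hits: [] };
  }
  // 0.2.0 put the hits array directly on the response;
  // 0.3.0 nests it one level down, under response.hits.
  var container = Array.isArray(response.hits) ? response : (response.hits || {});
  return {
    total: container.total || 0,
    // Assumption: the array may be absent when nothing matched.
    hits: container.hits || []
  };
}

var result = extractHits({ took: 63, hits: { total: 1000, max_score: null, hits: [{ _id: 'a' }] } });
console.log(result.total, result.hits.length);
```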

Advanced Topics

Parsing and filtering indexed data

The paths specified in config.js can include the special filter and parse functions to manipulate the contents of the index. For example, if I had a messaging app, but I didn't want to index any system-generated messages, I could add the following filter to my messages path:

filter: function(data) { return data.name !== 'system'; }

Here, data represents the JSON snapshot obtained from the database. If this method does not return true, that record will not be indexed. Note that the filter method is applied before parse.

If I want to remove or alter data getting indexed, that is done using the parse function. For example, assume I wanted to index user records, but remove any private information from the index. I could add a parse function to do this:

parse: function(data) {
   return {
      first_name: data.first_name,
      last_name: data.last_name,
      birthday: new Date(data.birthday_as_number).toISOString()
   };
}
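To make the ordering concrete, here is a minimal sketch of how a record could pass through filter and then parse. This is not Flashlight's actual internals: prepareForIndex is a hypothetical function, and it assumes that any falsy filter result means the record is skipped.

```javascript
// Hypothetical sketch, not Flashlight's actual code.
function prepareForIndex(pathConfig, data) {
  // filter runs first; assumption: a falsy result skips the record
  if (typeof pathConfig.filter === 'function' && !pathConfig.filter(data)) {
    return null;
  }
  // parse then reshapes whatever passed the filter
  if (typeof pathConfig.parse === 'function') {
    return pathConfig.parse(data);
  }
  return data;
}

var messagesPath = {
  filter: function(data) { return data.name !== 'system'; },
  parse: function(data) { return { name: data.name, text: data.text }; }
};

// System messages are dropped before parse ever sees them.
console.log(prepareForIndex(messagesPath, { name: 'system', text: 'rebooting' }));
// Other messages are indexed without the private field.
console.log(prepareForIndex(messagesPath, { name: 'alice', text: 'hi', email: 'private' }));
```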

Building ElasticSearch Queries

The full ElasticSearch API is supported. Check out this great tutorial on querying ElasticSearch. And be sure to read the ElasticSearch API Reference.

Example: Simple text search

 {
   "q": "foo*"
 }

Example: Paginate

You can control the number of matches (defaults to 10) and initial offset for paginating search results:

 {
   "from" : 0, 
   "size" : 50, 
   "body": {
     "query": {
        "match": {
           "_all": "foo"
        }
     }
   }
 }
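For page-by-page navigation, the offset is just the page number multiplied by the page size. A small sketch, with pageRequest being a hypothetical helper name:

```javascript
// Hypothetical helper: builds a paginated request for a zero-based page number.
function pageRequest(body, page, pageSize) {
  return {
    from: page * pageSize, // offset of the first match to return
    size: pageSize,        // number of matches per page
    body: body
  };
}

var query = { query: { match: { _all: 'foo' } } };
console.log(pageRequest(query, 2, 50).from);
```

The index and type keys would still need to be added before pushing the request, as with any other search.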

Example: Search for multiple tags or categories

 {
   "body": {
     "query": {
       "terms": { "tag": [ "foo", "bar" ] }
     }
   }
 }

 {
   "body": {
     "query": {
       "match": {
         "field":  "foo"
       }
     }
   }
 }

Example: Give more weight to specific fields

 {
   "body": {
     "query": {
       "multi_match": {
         "query":  "foo",
         "type":   "most_fields", 
         "fields": [ 
            "important_field^10", // adding ^10 makes this field relatively more important 
            "trivial_field" 
         ]
       }
     }
   }
 }

Helpful section of ES docs

Search lite (simple text searches with q)
Finding exact values
Sorting and relevance
Partial matching
Wildcards and regexp
Proximity matching
Dealing with human language

Operating at massive scale

Is Flashlight designed to work at millions of requests per second? No. It's designed to be a template for implementing your production services. Some assembly required.

Here are a couple quick optimizations you can make to improve scale:

Separate the indexing worker and the query worker (this could be as simple as creating two Flashlight workers, opening app.js in each, and commenting out SearchQueue.init() or PathMonitor.process() respectively).
When your service restarts, all data is re-indexed. To prevent this, you can use refBuilder as described in the next section.
With a bit of work, both PathMonitor and SearchQueue could be adapted to function as a Service Worker for firebase-queue, allowing multiple workers and potentially hundreds of thousands of writes per second (with minor degradation and no losses at even higher throughput).

Use refBuilder to improve indexing efficiency

In config.js, each entry in paths can be assigned a refBuilder function. This can construct a query for determining which records get indexed.

This can be utilized to improve efficiency by preventing all data from being re-indexed any time the Flashlight service is restarted, and generally by preventing a large backlog from being read into memory at once.

For example, if I were indexing chat messages, and they had a timestamp field, I could use the following to never look back more than a day during a server restart:

exports.paths = [
   {
      path  : "chat/messages",
      index : "firebase",
      type  : "message",
      fields: ['message_body', 'tags'],
      refBuilder: function(ref, path) {
         // only read records whose timestamp falls within the last 24 hours
         return ref.orderByChild('timestamp').startAt(Date.now() - 24 * 60 * 60 * 1000);
      }
   }
];

Loading paths to index from the database instead of config file

Paths to be indexed can be loaded dynamically from the database by providing a path string instead of the paths array. For example, the paths given in config.example.js could be replaced with dynamic_paths and then those paths could be stored in the database, similar to this.

Any updates to the database paths are handled by Flashlight (new paths are indexed when they are added, old paths stop being indexed when they are removed).

Unfortunately, since JSON data stored in Firebase can't contain functions, the filter, parse, and refBuilder options can't be used with this approach.

Support

Submit questions or bugs using the issue tracker.

For Firebase-related questions, try the mailing list.

License

flashlight's Issues

Related to request Timeout

Though it is good that we keep requestTimeout while we create an elastic client instance. When I do not keep this parameter, then my indexer stops and throws :
failed to index indexname/resource/resourcename: Error: Request Timeout after 30000ms
by which my most of the firebase data are not indexed.

Ideally, the requestTimeout is a parameter for when the request is taking too long then timeout that request, but in my case the request is always fulfilled though the indexer timeouts after 30 seconds.

my indexed data flows like
indexed indexname/resource/0000df7b-b9ea-4496-b2e6-bb9c86aa8b5c
indexed indexname/resource/00008527-5f7f-43c2-97c8-bd71c6d409ca
indexed indexname/resource/000cd4b8-5dba-4614-b699-dfb592490ca4
indexed indexname/resource/00014bd1-88dc-4ff4-8248-ef4d071ba28d
indexed indexname/resource/0016f0a2-edda-4d2a-9ea6-d8c6e9575b4e
indexed indexname/resource/0018fe53-1e45-4925-9580-e3851ee9728e
indexed indexname/resource/0012b89d-10a5-4d54-9134-453e87fdc67c
indexed indexname/resource/00059da0-5174-4c48-ab3c-fd342c4edbf4

after 30 seconds:
failed to index indexname/resource/ffd5517e-cab0-4bec-8989-4f6d760f18eb: Error: Request Timeout after 30000ms
failed to index indexname/resource/ffdb0ad0-c9bb-4ebd-9f30-9aecf4ffedd3: Error: Request Timeout after 30000ms
failed to index indexname/resource/ffdfce2e-e11e-415c-b9f6-8e1f6e6777d7: Error: Request Timeout after 30000ms
failed to index indexname/resource/ffe39ae2-ad3e-4944-a65c-744490407247: Error: Request Timeout after 30000ms
failed to index indexname/resource/fff3c43f-6082-4cad-97ab-bfe04c61039a: Error: Request Timeout after 30000ms

in config js i have:
exports.paths = [
{
path: "resources",
index: "indexname",
type: "resource"
}
];

my firebase data looks like:

is there any problem with my data or the structure of the firebase data storage, or is there any other thing that I need to take care while indexing ?

Since the data I am currently indexing is in dev mode, it's not an issue for me to set requestTimeout to any number that is sufficient to index all the data, but my production database is quite huge and I cannot decide what my requestTimeout should be in order to index all the data.

Please let me know any leads on the above issue.

Secondly, is this only designed to run as a service, or can we alter it to index the data then exit from the program and then run the update or delete part to just index only the new changes in firebase (maybe new record created, old record updated, or deleted)? but not touching all the already indexed data.

Mapping options (multifields, etc)

Any support for custom mapping? Need to work with some multifields and can't see where to pass these options to the elasticsearchclient.

Server Timeouts

I have a flashlight instance up and running on Windows Azure. Every so often (about an hour or so) the server goes into an idle state and I end up having to kill and restart it. It doesn't seem to have anything to do with the ElasticSearch Service that I have up and running.

I was wondering if this issue has come up anywhere else? The only things I can think of is that maybe there's some type of timeout for the firebase subscription...

I get the error Service account must contain a "private_key" field

Followed the tutorial. Uploaded the json file.. Now I get:

throw new Error('Service account must contain a "private_key" field');
^

Error: Service account must contain a "private_key" field

It return only no more than 10 results in the Hits array.

even if the total of results is more. How can we have more results?

Unhandled error

I got this while doing a mass delete:

events.js:72
throw er; // Unhandled 'error' event
^
Error: socket hang up
at createHangUpError (http.js:1473:15)
at Socket.socketOnEnd as onend
at Socket.g (events.js:175:14)
at Socket.EventEmitter.emit (events.js:117:20)
at _stream_readable.js:920:16
at process._tickCallback (node.js:415:13)

App crashes on parse failure (add try/catch block)

I think that recent changes in firebase are causing problems in flashlight. Below is the heroku output from running a fairly vanilla flashlight instance. I get a similar crash when running locally.

» 12:25:00.990 2014-09-05 11:25:00.564857+00:00 heroku api - - Release v15 created by [email protected]
» 12:25:01.707 2014-09-05 11:25:01.089019+00:00 heroku worker.1 - - State changed from crashed to starting
» 12:25:05.401 2014-09-05 11:25:05.120210+00:00 heroku worker.1 - - Starting process with command `node ./app.js`
» 12:25:06.187 2014-09-05 11:25:05.809169+00:00 heroku worker.1 - - State changed from starting to up
» 12:25:07.294 2014-09-05 11:25:06.569529+00:00 app worker.1 - - Configured using BONSAI_URL environment variable https://xxx:[email protected] { FB_URL: 'https://snappynumber.firebaseio.com/',
» 12:25:07.382 2014-09-05 11:25:06.569543+00:00 app worker.1 - - FB_TOKEN: 'xxx',
» 12:25:07.382 2014-09-05 11:25:06.569544+00:00 app worker.1 - - FB_REQ: 'search/request',
» 12:25:07.382 2014-09-05 11:25:06.569546+00:00 app worker.1 - - FB_RES: 'search/response',
» 12:25:07.382 2014-09-05 11:25:06.569547+00:00 app worker.1 - - ES_HOST: 'beech-8286975.eu-west-1.bonsai.io',
» 12:25:07.382 2014-09-05 11:25:06.569548+00:00 app worker.1 - - ES_PORT: 80,
» 12:25:07.382 2014-09-05 11:25:06.569550+00:00 app worker.1 - - ES_USER: 'xxx',
» 12:25:07.382 2014-09-05 11:25:06.569551+00:00 app worker.1 - - ES_PASS: 'xxx' }
» 12:25:07.382 2014-09-05 11:25:06.800821+00:00 app worker.1 - - Connected to ElasticSearch host beech-8286975.eu-west-1.bonsai.io:80
» 12:25:07.403 2014-09-05 11:25:06.802644+00:00 app worker.1 - - Connecting to Firebase https://snappynumber.firebaseio.com/
» 12:25:07.488 2014-09-05 11:25:06.802738+00:00 app worker.1 - - Authenticating with token KA...72
» 12:25:08.755 2014-09-05 11:25:08.151432+00:00 app worker.1 - - Authenticated
» 12:25:08.841 2014-09-05 11:25:08.152535+00:00 app worker.1 - - Indexing sn-persons/person using path "persons"
» 12:25:08.843 2014-09-05 11:25:08.155779+00:00 app worker.1 - - Queue started, IN: "search/request", OUT: "search/response"
» 12:25:08.929 2014-09-05 11:25:08.155926+00:00 app worker.1 - - Next cleanup in 60 seconds
» 12:25:10.836 2014-09-05 11:25:10.667271+00:00 app worker.1 - - /app/node_modules/firebase/lib/firebase-node.js:44
» 12:25:10.837 2014-09-05 11:25:10.666877+00:00 app worker.1 - -
» 12:25:10.840 2014-09-05 11:25:10.667610+00:00 app worker.1 - - function fc(a){try{a()}catch(b){setTimeout(function(){throw b;},Math.floor(0))
» 12:25:10.840 2014-09-05 11:25:10.667709+00:00 app worker.1 - - ^
» 12:25:10.840 2014-09-05 11:25:10.669917+00:00 app worker.1 - - SyntaxError: Unexpected token r at Object.parse (native) at Object.SearchQueue._process (/app/lib/SearchQueue.js:21:60) at /app/node_modules/firebase/lib/firebase-node.js:93:980 at fc (/app/node_modules/firebase/lib/firebase-node.js:44:20) at Yd (/app/node_modules/firebase/lib/firebase-node.js:93:966) at Wd.Hb (/app/node_modules/firebase/lib/firebase-node.js:93:908) at Zd.Hb (/app/node_modules/firebase/lib/firebase-node.js:94:419) at /app/node_modules/firebase/lib/firebase-node.js:109:406 at /app/node_modules/firebase/lib/firebase-node.js:62:1194 at ac (/app/node_modules/firebase/lib/firebase-node.js:58:222) Exception
» 12:25:11.701 2014-09-05 11:25:11.360664+00:00 heroku worker.1 - - Process exited with status 8
» 12:25:11.997 2014-09-05 11:25:11.367072+00:00 heroku worker.1 - - State changed from up to crashed

Does flashlight scale horizontally?

I am working on a firebase app. In this app, I use elasticsearch and flashlight for searching through my content. Currently our customer base is rather small. Because of this we are hosting flashlight on a single server on AWS.

We are planning to get a quite significant growth in our customer base. With this new growth, we'll need to scale flashlight. We hope to be able to scale it horizontally by running multiple instances of flashlight and a elasticsearch cluster.

Is this possible?

From what I understand, the searching should not be a problem but the indexing might be. My understanding is that flashlight indexes content every time there's a change in firebase. If we have multiple instances of flashlight, when a change is made that requires indexing all of the servers will be notified and they'll all try to index.

If this is the case, we're never really scaling horizontally for indexing and we might end up with some inconsistencies in our index.

invalid string length

I have tried with both flashlight and flue (a fork I found on github with a port to firebase 2.2) and I am getting the same error.

Any help appreciated

/home/search/flue/node_modules/firebase/lib/firebase-node.js:170
cket_failure")};function fh(a,b){a.frames.push(b);if(a.frames.length==a.Ze){va
^
RangeError: Invalid string length
at Array.join (native)
at fh (/home/search/flue/node_modules/firebase/lib/firebase-node.js:170:431)
at va.onmessage (/home/search/flue/node_modules/firebase/lib/firebase-node.js:169:143)
at EventTarget.dispatchEvent (/home/search/flue/node_modules/firebase/node_modules/faye-websocket/lib/faye/websocket/api/event_target.js:22:30)
at instance._receiveMessage (/home/search/flue/node_modules/firebase/node_modules/faye-websocket/lib/faye/websocket/api.js:134:10)
at null. (/home/search/flue/node_modules/firebase/node_modules/faye-websocket/lib/faye/websocket/api.js:34:49)
at emit (events.js:129:20)
at null. (/home/search/flue/node_modules/firebase/node_modules/faye-websocket/node_modules/websocket-driver/lib/websocket/driver/hybi.js:451:14)
at pipe (/home/search/flue/node_modules/firebase/node_modules/faye-websocket/node_modules/websocket-driver/node_modules/websocket-extensions/lib/pipeline/index.js:37:40)
at Pipeline._loop (/home/search/flue/node_modules/firebase/node_modules/faye-websocket/node_modules/websocket-driver/node_modules/websocket-extensions/lib/pipeline/index.js:44:3)

Batch ES requests

Requests are sent asynchronously via node. For larger data sets (> 100k records) this requires 100k HTTP requests. We can batch the requests to ES to greatly reduce bandwidth and connections here.
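One way this batching could look, as a sketch only: the helper names chunk and buildBulkBody are hypothetical, the records are assumed to be { id, data } pairs, and the action-line-then-document layout follows the ElasticSearch bulk format.

```javascript
// Hypothetical helpers, not part of Flashlight.

// Splits a list into batches of at most `size` items.
function chunk(items, size) {
  var batches = [];
  for (var i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Builds a bulk body: one action line followed by one document per record.
// Assumption: each record looks like { id: '...', data: {...} }.
function buildBulkBody(index, type, records) {
  var body = [];
  records.forEach(function(record) {
    body.push({ index: { _index: index, _type: type, _id: record.id } });
    body.push(record.data);
  });
  return body;
}

var records = [
  { id: 'a', data: { name: 'one' } },
  { id: 'b', data: { name: 'two' } },
  { id: 'c', data: { name: 'three' } }
];
var batches = chunk(records, 2).map(function(batch) {
  return buildBulkBody('firebase', 'message', batch);
});
console.log(batches.length + ' bulk requests instead of ' + records.length);
```

Each batch would then go out as a single bulk call (for example through the bulk method of the ElasticSearch JavaScript client) rather than one HTTP request per record.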

High volume traffic spikes Heroku worker memory results in slow performance

Hello,

We currently use the flashlight app (out of the box) in production. I haven't made any modifications to your code in our Heroku environment.

I've tried several tiers of Heroku to attempt to solve this. Currently, we are at the largest Heroku worker tier, Performance-L I believe is the name (the 500 a month one). The instance runs fine except when we see a high volume of updates on firebase and when we have many thousands of clients updating firebase at once.

I've attached an image to show the usage statistics and to show the out of memory errors we are getting. My theory is that the .on function queues up changes to be made to elastic (the updating of indexes). This queue fills up so quickly that the worker is unable to process them and begins to have a backlog of changes to be made thus filling up available memory. As the worker struggles to complete tasks it falls more and more behind and eventually slows to a crawl while catching up.

My question: is there a way to resolve this issue by distributing the work evenly among many workers? I've tried spinning up several worker instances but they all seem to work on indexing the same path together. It is only when there is high volume traffic where I see one worker doing another process different from other workers. I am using Heroku logs --tail to get this data and make this assumption.

I've also tried replacing the unsupported elasticsearchclient with the official client but the Heroku instances just crash or spit out errors I haven't seen before. I have also upgraded elastic.js to version 1.2.0 and I still see these issues. Luckily the server doesn't exactly crash when under stress it just crawls very slowly until it's able to catch up regardless of how many resources I throw at it.

Thank you,
TB

Memory usage & timeouts

Hey guys,

I've got a database that I get indexed in Elasticsearch using flashlight. All worked fine for a while, but the database grew quite a lot lately and when I re-indexed the whole database today, I got a lot of errors:

Error: Request Timeout after 30000ms

Our database has about 50.000 records that should be indexed, but many are now missing. How to prevent this?

Also, I noticed that this script uses at least as much memory as the size of the database. Is there a way to reduce this?

geolocation queries

I have my fields setup like

{
      "location" : {
            "lat" : 40.12,
            "lon" : -71.34
        }
}

I added the location field as one to watch but does this work to do geo distance queries like on here https://www.elastic.co/guide/en/elasticsearch/reference/2.3/query-dsl-geo-distance-query.html

Add option to add additional params to the ElasticSearch.Client

Hi,

The current implementation is :

new ElasticSearch.Client({
  hosts: [
    {
      host: conf.ES_HOST,
      port: conf.ES_PORT,
      auth: (conf.ES_USER && conf.ES_PASS) ? conf.ES_USER + ':' + conf.ES_PASS : null
    }  
  ]
})

one should be able to use optional params like so:

new ElasticSearch.Client({
  hosts: [
    {
      host: conf.ES_HOST,
      port: conf.ES_PORT,
      auth: (conf.ES_USER && conf.ES_PASS) ? conf.ES_USER + ':' + conf.ES_PASS : null
    }  
  ],
  requestTimeout: 60000,
  maxSockets    : 100,
  log           : conf.LOG_LEVEL
})

maybe use values defined in the config.js file?

Update readme for new config params.

The readme says that at a minimum you need FB_URL, but that's no longer the case as of #62. FB_SERVICEACCOUNT is now required and points to a json file containing service account credentials. Should probably link to the server docs for Firebase as well.

Security Rules Questions

In the security rules under seed, newData.id is referenced:

".write": "!data.exists() && (newData.id === auth.id || newData.id === auth.uid)",

Is this correct? Should it be

".write": "!data.exists() && (newData.child('id').val() === auth.id || newData.child('id').val() === auth.uid)",

Also, for an authenticated user I get no results and:

FIREBASE WARNING: set at /search/request/-JHx7mH0XfRW-0O2TpgM failed: permission_denied  firebase.js:3200
FIREBASE WARNING: on() or once() for /search/response/-JHx7mH0XfRW-0O2TpgM failed: Error: permission_denied: Client doesn't have permission to access the desired data.  firebase.js:3200
FIREBASE WARNING: on() or once() for /search/response failed: Error: permission_denied: Client doesn't have permission to access the desired data.

but for an unauthenticated user I get results.

Is this correct?

I'm getting the following error

Can anybody help in resolving the following error:

.../flashlight/node_modules/firebase/lib/firebase-node.js:123
Aug 12 07:30:38 flashlight-service[755]: function fg(a,b,c){c instanceof O&&(c=new lf(c,a));if(!p(b))throw Error(a+"contains undefined "+nf(c));if(t(b))throw Error(a+"contains
 a function "+nf(c)+" with contents: "+b.toString());if(rd(b))throw Error(a+"contains "+b.toString()+" "+nf(c));if(q(b)&&b.length>10485760/3&&10485760<Nb(b))throw Error(a+"contains a string greater than 10485760
 utf8 bytes "+nf(c)+" ('"+b.substring(0,50)+"...')");if(ga(b)){var d=!1,e=!1;Db(b,function(b,g){if(".value"===b)d=!0;else if(".priority"!==b&&".sv"!==b&&(e=
Aug 12 07:30:38 flashlight-service[755]: ^

Thanks.

example won't work

The results in the example.html won't update. I created my own instance on Firebase and added the data given, with security rules (read: true, write: true).

I think it would be really helpful to give more info on the set-up, like where elasticsearch & flashlight should be placed in the app home directory. Right now, I have them in separate folders.

TypeError: Cannot read property 'hasOwnProperty' of undefined.

function ac(a){try{a()}catch(b){setTimeout(function(){throw b;},Math.floor(0))
                                                            ^
TypeError: Cannot read property 'hasOwnProperty' of undefined
    at Object.SearchQueue._process (c:\Users\Idanb11\Work\maptiv8-firebase\flashlight\lib\SearchQueue.js:22:29)
    at c:\Users\Idanb11\Work\maptiv8-firebase\flashlight\node_modules\firebase\lib\firebase-node.js:94:979
    at ac (c:\Users\Idanb11\Work\maptiv8-firebase\flashlight\node_modules\firebase\lib\firebase-node.js:44:20)
    at Td (c:\Users\Idanb11\Work\maptiv8-firebase\flashlight\node_modules\firebase\lib\firebase-node.js:94:965)
    at Rd.Eb (c:\Users\Idanb11\Work\maptiv8-firebase\flashlight\node_modules\firebase\lib\firebase-node.js:94:908)
    at Ud.Eb (c:\Users\Idanb11\Work\maptiv8-firebase\flashlight\node_modules\firebase\lib\firebase-node.js:95:417)
    at c:\Users\Idanb11\Work\maptiv8-firebase\flashlight\node_modules\firebase\lib\firebase-node.js:110:469
    at c:\Users\Idanb11\Work\maptiv8-firebase\flashlight\node_modules\firebase\lib\firebase-node.js:62:1194
    at Wb (c:\Users\Idanb11\Work\maptiv8-firebase\flashlight\node_modules\firebase\lib\firebase-node.js:58:222)
    at R (c:\Users\Idanb11\Work\maptiv8-firebase\flashlight\node_modules\firebase\lib\firebase-node.js:62:1171)

Error on demo page

Hi,

I'm seeing

{
  "error": {
    "displayName": "ServiceUnavailable",
    "message": "SearchPhaseExecutionException[Failed to execute phase [query], all shards failed]",
    "status": 503
  },
  "total": 0
}

on the demo page.

Size of results

This is what i have:

rootRef.child("search/request").push().set {index: 'firebase', type: 'names', query:    {query_string: {query: '*fa*'} }}

Having read through the API docs for specifying the number of results returned and also here:
http://okfnlabs.org/blog/2013/07/01/elasticsearch-query-tutorial.html

Specifically under the 'Query Language' section.

I would assume that size can be defined like this:

    rootRef.child("search/request").push().set {index: 'firebase', type: 'names', size: 3, query:    {query_string: {query: '*fa*'} }}

or like this:

  rootRef.child("search/request").push().set {index: 'firebase', type: 'names', query:    {query_string: {query: '*fa*'}, size: 3 }}

Either of them don't work, what am i missing?

Thanks!

Website doesn't seem to work

I enter a search term such as "bruce" and click "search", but nothing happens. Isn't the expected behavior a result or a message saying that there are no results?

Issue on deploying with Bonsai

I've tested this on my local server and it works fine, installing with Heroku/Bonsai gives the following errors in my log file and displays an "Application Error":

16:08:18.326 2015-01-11 10:38:17.850780+00:00 app worker.1 - indexed firebase/restaurant/-JfKiTMVyCByfMQQx9nH
» 16:08:18.326 2015-01-11 10:38:17.850782+00:00 app worker.1 - indexed firebase/restaurant/-JfKjGodhQ0YDa5q9w2x
» 16:08:18.326 2015-01-11 10:38:17.850784+00:00 app worker.1 - indexed firebase/restaurant/-JfKiQKqJUcN_jXItxKz
» 16:08:18.326 2015-01-11 10:38:17.856144+00:00 app worker.1 - indexed firebase/restaurant/-JfKm02oV24QnCXZ0WBR
» 16:08:19.479 2015-01-11 10:38:19.092945+00:00 heroku worker.1 - Process exited with status 143
» 16:09:17.877 2015-01-11 10:39:17.482779+00:00 app worker.1 - Next cleanup in 60 seconds
» 16:10:17.837 2015-01-11 10:40:17.531292+00:00 app worker.1 - - Next cleanup in 60 seconds
» 16:11:17.993 2015-01-11 10:41:17.584495+00:00 app worker.1 - - Next cleanup in 60 seconds
» 16:12:18.125 2015-01-11 10:42:17.634753+00:00 app worker.1 - - Next cleanup in 60 seconds
» 16:12:44.113 2015-01-11 10:42:43.810730+00:00 heroku router - - at=error code=H14 desc="No web processes running" method=GET path="/" host=damp-depths-9841.herokuapp.com request_id=7c0ae031-dec1-4357-8a3f-7bb9b920d686 fwd="122.179.35.248" dyno= connect= service= status=503 bytes= Fatal

However, there is a dyno connected to the "worker: node ./app.js" as per the procfile, the general solve for error H14.

Any idea what's going on? Thanks.

Update example/ to work with 3.x SDK

When listening for new response, the .on('value' callback is called twice

searchRef.child('response/' + key).on('value', function(resSnapshot){
        console.log(resSnapshot.val());
      });

the callback executes twice,
first time with this error - {error: "IndexMissingException[[firebase] missing]", total: 0}
second time with the actual data - {hits: Array[10], max_score: 1, total: 319}

Just ignore it when i get the error?

when I run this: curl -X POST http://localhost:9200/firebase/
I get: {"error":"IndexAlreadyExistsException[[firebase] already exists]","status":400}

Thanks,
Idanb11

Flashlight not returning data on heroku

Hi, When I try running the flashlight along with elasticsearch on my local connecting to remote firebase it works fine but somehow when I configured it following the exact steps it is not returning any data. Can you please help?
{
took: 1,
timed_out: false,
_shards: {
total: 1,
successful: 1,
failed: 0
},
hits: {
total: 0,
max_score: null,
hits: [ ]
}
}

ElasticSearch Scroll API

I have fully integrated Flashlight with my Firebase database and have several queries working perfectly already. I really love how nice everything works together. I have a particular use case in my iOS app where I need to request a huge number of items (sorting not necessary) to display on a map. Since this list could grow to greater than 100k results in the future I wanted to use ElasticSearch's scroll query string. I'm new to Node.js and am not sure exactly how to get Flashlight to do what I want. I am passing a moderately complex JSON query string as the query parameter and have set the index and type parameter to the correct index. This works perfect as one request, but is there any way I can get Flashlight to do a scroll pagination (i.e. _search?scroll=1m)? If not, do you have any recommendations on how to accomplish this?

Thanks so much for any help you can provide.
-Adam

Error: ref.child(...).push(...).key is not a function

In example.js line 26 is the following:
var key = ref.child('request').push({ index: index, type: type, query: query }).key();
which gives me the error in the subject line.

Just if I change .key() to .name() I get the correct push ID in return. Is that a bug?

I am using Firebase version 2.2.9.

Just did a fresh install of flashlight

I just did a fresh install of flashlight but I am getting the following error, not sure why

failed to index firebase/userssearch/-JbNpxICq4RijCKHSKhX: Error: getaddrinfo ENOTFOUND

Question regarding setup with new Firebase

Hi Firebase team, what is the FB_NAME and FB_TOKEN in the following setup command for Heroku/Bonsai?

heroku config:set FB_NAME=<instance> FB_TOKEN="<token>"

Is FB_NAME Firebase app name?
If so, is it with the letters/numbers following it due to the new Firebase setup where it's no longer app_name.firebaseio.com but app_name-abc123.firebaseio.com?

and what is FB_TOKEN? (is it a key or something in my google plist I download?)

Thanks in advance!

Question: Any docs on filter function for the paths?

I'm trying to figure out how to get the filter working and am looking for some insight, as it doesn't seem to be working currently.

config.js

exports.paths = [
   {
      path:  "products",
      index: "firebase",
      type:  "product",
      fields: ['brands', 'name', 'categories'],
      filter: function(data) {
         return data.live != false || data.searchable != false;
      }
   }
];

Still getting all products that match the query string
Any guidance? Articles? etc? having some trouble finding anything...
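
One thing that can be checked without Flashlight at all is how the posted predicate evaluates. With ||, a record is rejected only when both flags are false. The sketch below compares it with a stricter variant. The name filterStrict is hypothetical, and whether the stricter behavior is what the poster intended is an assumption.

```javascript
// The predicate exactly as posted in config.js above.
function filterAsPosted(data) {
  return data.live != false || data.searchable != false;
}

// Hypothetical stricter variant: reject the record if EITHER flag is false.
// Records that lack the flags entirely are still accepted.
function filterStrict(data) {
  return data.live !== false && data.searchable !== false;
}

var hidden = { name: 'Widget', live: false, searchable: true };
console.log(filterAsPosted(hidden)); // true  (still indexed)
console.log(filterStrict(hidden));   // false (skipped)
```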

Need to update Firebase client in package.json

The current package.json in this repo is still linking to Firebase 1, so it throws errors when cloned and run out-of-the-box. Specifically, the .key() function didn't exist yet in that version of Firebase.

Indexes not being deleted

After I delete an item from Firebase, the index entry is not removed from my ElasticSearch server. The logs, however, say the item is being deleted. But then, when I do another search, the item is still there.

Here is the "deleted" message:

Oct 02 16:33:25 bookreadings-search app/worker.1:  deleted firebase/reading/-JYCgDAf6Fgdbp3Focga

But then, a few minutes after, the index appears in a search result. (Note the same _ids)

Oct 02 16:37:23 bookreadings-search app/worker.1:  search result {"took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":1.1399126,"hits":[{"_index":"firebase","_type":"reading","_id":"-JYCgDAf6Fgdbp3Focga","_score":0.06655927,"_source":{"title":"This will be the most played.","tags":["Most Played"],"created":1412200194853}}]}}

Batching Request | Error Handling

Hi !

Everything works well when the service runs locally, but in production, using bonsai.io, I can't do concurrent writes.
So when I do a batch update on my Firebase database, I get the following:

failed to delete firebase/user/<horrible-hash-here>: Error: Concurrent request limit exceeded. Please consider batching your requests, or contact [email protected] for help.

Of course, more concurrent requests means a higher bill.
How could we implement a throttling system or something similar?
Maybe adding a retry queue for errors is an idea?
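
As a rough illustration of the throttling and retry-queue idea, here is a minimal sketch of a limiter that caps how many Promise-returning tasks run at once and re-queues failed tasks a limited number of times. The function createLimiter and its parameters are hypothetical, it is not part of Flashlight, and it treats every failure as retryable, which may not suit every error.

```javascript
// Hypothetical limiter: at most maxConcurrent tasks in flight, each task
// retried up to maxRetries times with a fixed delay between attempts.
function createLimiter(maxConcurrent, maxRetries, retryDelayMs) {
  var active = 0;
  var queue = [];

  function next() {
    if (active >= maxConcurrent || queue.length === 0) {
      return;
    }
    var job = queue.shift();
    active++;
    // Promise.resolve().then(...) also captures synchronous throws.
    Promise.resolve().then(function () {
      return job.task();
    }).then(function (value) {
      active--;
      job.resolve(value);
      next();
    }, function (err) {
      active--;
      if (job.attempts < maxRetries) {
        job.attempts++;
        // Put the job back on the queue after a short pause.
        setTimeout(function () {
          queue.push(job);
          next();
        }, retryDelayMs);
      } else {
        job.reject(err);
      }
      next();
    });
  }

  // run(task) returns a Promise for the task's eventual result.
  return function run(task) {
    return new Promise(function (resolve, reject) {
      queue.push({ task: task, attempts: 0, resolve: resolve, reject: reject });
      next();
    });
  };
}
```

Each task is any function that returns a Promise, for example a wrapper around a single index or delete call. Setting maxConcurrent to 1 serializes all writes.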

ReferenceError: jQuery is not defined

I ran this on Windows:
$ node example.js
C:\wamp\www\flashlight\example\example.js:85
})(jQuery);
^

ReferenceError: jQuery is not defined
at Object.<anonymous> (C:\wamp\www\flashlight\example\example.js:85:4)
at Module._compile (module.js:409:26)
at Object.Module._extensions..js (module.js:416:10)
at Module.load (module.js:343:32)
at Function.Module._load (module.js:300:12)
at Function.Module.runMain (module.js:441:10)
at startup (node.js:139:18)
at node.js:968:3

how to use this plugin behind proxy

Hi, how can I use this plugin behind a proxy?

Thanks in advance.

Example for providing dynamic paths in the config.

Could you provide an example of indexing a dynamic path using Flashlight?

Firebase Flashlight (ElasticSearch) filtering, sorting, pagination

I am using the Flashlight Firebase plugin.

I am using this example and it's working fine.

In the example, the example.js file has the following query method:

// display search results
function doSearch(index, type, query) {
   var ref = database.ref().child(PATH);
   var key = ref.child('request').push({ index: index, type: type, query: query }).key;
   ref.child('response/' + key).on('value', showResults);
}

The function above returns results when I pass values like the following JSON:

{ index: index, type: type, query: query }

It returns nothing when I try to pass values like the following JSON:

{ index: index, type: type, query: { "from" : 1, "size" : 5 , "query": query } }

but the following ElasticSearch API call does return a result:

http://localhost:9200/firebase/user/_search?q=*mani*&pretty&size=5&from=1

How do I filter the query using Flashlight, like the following?

{
   "query": {
      "filtered": {
         "query": {
            "query_string": {
               "query": "drama"
            }
         },
         "filter": {
            //Filter to apply to the query
         }
      }
   }
}

I am using example security rules

Please let me know how to perform complex queries and filtering with Flashlight
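
For reference, the README describes the request object as having the keys index, type, and q (or body for advanced queries), where body can be any valid ElasticSearch DSL structure. A minimal sketch of a paginated request built that way follows. The helper buildSearchRequest is hypothetical, and whether your installed Flashlight version reads body or the older query key is an assumption to check against its README.

```javascript
// Hypothetical helper: builds the object to push to search/request.
// from/size sit next to "query" inside the DSL body, mirroring the
// size=5&from=1 parameters of the direct ElasticSearch URL.
function buildSearchRequest(index, type, term, from, size) {
  return {
    index: index,
    type: type,
    body: {
      from: from,
      size: size,
      query: {
        query_string: { query: term }
      }
    }
  };
}

var request = buildSearchRequest('firebase', 'user', '*mani*', 1, 5);
console.log(JSON.stringify(request));
```

The resulting object would be pushed with ref.child('request').push(request) in place of the { index, type, query } object used in doSearch.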

Switch to official elasticsearch module to fix high volume errors.

See #22. Indexes are not deleted appropriately when a lot of deletes happen simultaneously.

To fix this, this change:
-> updates elastic.js to 1.2.0
-> switches from elasticsearchclient to the official elasticsearch module (version 3.0.0)
-> switches from single-object processing to a bulk operation every 5 seconds

elasticsearchclient library is no longer maintained

elasticsearchclient is no longer maintained.
Need to switch to - https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/index.html

ES does not return any value

Hi guys,
I'm trying to hook up my database to ElasticSearch. I followed all the setup steps and it's partially working.
Why partially? When I call 'http://localhost:9200/firebase/user/_search' in Postman with the query
'{"query":{"bool":{"must":[{"match":{"visibleName":"skye"}},{"match":{"userRole":10}}]}}}' it's returning something like this:
{ "took": 3, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 1, "max_score": 5.436413, "hits": [ { "_index": "firebase", "_type": "user", "_id": "2nzz9Dd0kcdgxYOsou2lqNPD1uJ2", "_score": 5.436413, "_source": { ..... "userRole": 10, "visibleName": "skye" } } ] }
This is the user I was looking for. The problem comes when I try to execute a query from my Android application. The query ends up looking like this:
"-KSCCogBkV2BpjGeJ-jO" : { "index" : "firebase", "query" : { "query" : { "bool" : { "must" : [ { "match" : { "visibleName" : "skye" } }, { "match" : { "userRole" : 10 } } ] } } }, "type" : "User" },
After that, my datasnapshot is DataSnapshot { key = -KSCoGGRuEYgk4Y7lhCr, value = {.priority=1.474479330934E12, total=0} }.

Am I doing something wrong?

Question

I didn't see a mailing list so I am asking here:

I have a Firebase database whose top-level structure is divided by organization. Each org has, for example, its own set of customers: org1/customers/{1,2,3,4,5} ... org2/customers/{1,2,3,4,5} ...
Each customer is a map like { id: 1, name: { first: 'John', middle: 'Q', last: 'Consumer' } }

How can I use flashlight to index the database separately?
How can I setup flashlight for such model?
How would I make queries specific to only an organization?

Thanks in advance

Pascal
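
One possible shape for this, sketched under assumptions: generate one exports.paths entry per organization, using the same path, index, and type keys seen in the config.js snippets elsewhere in these issues. The helper buildOrgPaths and the org + '_customer' type naming are hypothetical, and whether one type per organization suits your mapping is something to verify.

```javascript
// Hypothetical helper: one paths entry per organization, so each org's
// customers are indexed under their own type and can be queried separately.
function buildOrgPaths(orgs) {
  return orgs.map(function (org) {
    return {
      path: org + '/customers',
      index: 'firebase',
      type: org + '_customer'
    };
  });
}

var paths = buildOrgPaths(['org1', 'org2']);
console.log(paths[0].path, paths[0].type); // org1/customers org1_customer
```

In config.js the result would be assigned to exports.paths, and a client would restrict a search to one organization by sending that organization's type in the request.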

Timeout after 30s

I am running the code for indexing the firebase data. But the indexer stops after 30 seconds, giving:

failed to index indexname/resource/resourcename: Error: Request Timeout after 30000ms

I tried setting a timeout in app.js:
var esc = new ElasticSearch.Client({
   hosts: [
      {
         host: conf.ES_HOST,
         port: conf.ES_PORT,
         auth: (conf.ES_USER && conf.ES_PASS) ? conf.ES_USER + ':' + conf.ES_PASS : null,
         requestTimeout: 120000,
         timeout: 120000
      }
   ]
});

but that does not work either. Can you please help me understand the problem and any workaround?
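
A possible thing to try, stated as an assumption rather than a confirmed fix: as far as I recall, the legacy elasticsearch npm client documents requestTimeout as a client-level option rather than a per-host one, so placing it beside hosts instead of inside a host entry may be what is needed. The helper buildClientOptions below is hypothetical and only builds the options object.

```javascript
// Hypothetical helper: builds options for new ElasticSearch.Client(...)
// with the timeout at the top level instead of inside the host entry.
function buildClientOptions(conf, timeoutMs) {
  return {
    hosts: [
      {
        host: conf.ES_HOST,
        port: conf.ES_PORT,
        auth: (conf.ES_USER && conf.ES_PASS) ? conf.ES_USER + ':' + conf.ES_PASS : null
      }
    ],
    // Assumption: the client reads requestTimeout from the top-level config.
    requestTimeout: timeoutMs
  };
}

var options = buildClientOptions({ ES_HOST: 'localhost', ES_PORT: 9200 }, 120000);
console.log(options.requestTimeout); // 120000
```

The object would then be passed as new ElasticSearch.Client(buildClientOptions(conf, 120000)). A longer timeout only hides slow indexing, so it is worth checking why individual requests exceed 30 seconds as well.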

Demo is not differentiating between users and messages

Lost the type parameter. Add this back.

Nested Queries

Hi, I was just wondering if flashlight has the ability to do nested queries. I tried it on a large dataset, but am getting the error:

QueryParsingException[[firebase] [nested] nested object under path [0] is not of nested type]

If my mapping is:
firebase --> record --> 0 --> etc

perhaps I am not formatting my push correctly?

'index': 'firebase',
'type': 'record',
'query': {
   'filtered': {
      'query': {
         'match_all': {}
      },
      'filter': {
         'nested': {
            'path': '0',
            'query': {
               'query_string': {
                  'query': "closing shift"
               }
            }
         }
      }
   }
}

I also attempted to write a parser that would forcibly add "type": "nested" to every relevant piece of JSON, but it did not affect the mapping at all.

Thanks in advance.

Getting error on demo page

Getting different results from elastic search for - on.('value'

null
Object {hits: Array[250], max_score: 1, total: 2053}
Object {hits: Array[250], max_score: 1, total: 1610}

How can I know which results is the correct one?

How to make flashlight work on the Google Cloud Platform?

So, I uploaded my website to the web as a static site via Cloud Storage, then I used the click-to-deploy GCP feature, but I don't know how to connect the two things.

Even an informal, quick set of instructions written here below, would help a great deal!
Thanks!

Field monitoring

Configuring the app to monitor only specific fields doesn't seem to work. Even though I have specified a 'fields' array, search still returns results as if all fields are being monitored. Perhaps I have misunderstood how this is supposed to work?
