
gnippy's People

Contributors

abh1nav, dan-blanchard, everilae, j-bennet, jimmoffitt, ryankicks

gnippy's Issues

Empty config settings from environment override correct settings from file

After the #7 merge, even if I have no GNIPPY_ environment variables set, gnippy still tries to use them and ignores the valid config file I provide.

Here is some relevant traceback:

  File "configurator.py", line 74, in list_rules
    all_rules = rules.get_rules(config_file_path=config_file)
  File "/Users/irina/.pyenv/versions/casterisk/lib/python2.7/site-packages/gnippy/rules.py", line 150, in get_rules
    rules_url = _generate_rules_url(conf['url'])
  File "/Users/irina/.pyenv/versions/casterisk/lib/python2.7/site-packages/gnippy/rules.py", line 13, in _generate_rules_url
    if ".json" not in url:
TypeError: argument of type 'NoneType' is not iterable

I have URL defined in my config file, but none of the GNIPPY_ environment variables.
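One possible way to avoid this, sketched below, is to let an environment variable override the file value only when it is actually set and non-empty. The helper name `merge_config`, the `SETTINGS` tuple, and the exact variable names (`GNIPPY_URL` and so on) are assumptions made for illustration, not gnippy's actual code.

```python
import os

# Hypothetical setting names, used only for this illustration.
SETTINGS = ("url", "auth_username", "auth_password")


def merge_config(file_conf, environ=None):
    """Overlay GNIPPY_* environment values onto settings read from a file.

    A variable that is unset or empty leaves the file value untouched.
    """
    environ = os.environ if environ is None else environ
    merged = dict(file_conf)
    for key in SETTINGS:
        env_value = environ.get("GNIPPY_" + key.upper())
        if env_value:  # skips both None (unset) and "" (empty)
            merged[key] = env_value
    return merged
```

With this approach, a config file that defines the URL keeps it when no GNIPPY_ variables are present.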

Rule management will change with PowerTrack 2.0

With the PowerTrack 2.0 migration coming up (the deadline is December 1, 2016), the following will no longer be valid:

gnippy/gnippy/rules.py

Lines 17 to 24 in 9765bbe

def _generate_rules_url(url):
    """ Generate a rules URL from a PowerTrack URL """
    if ".json" not in url:
        raise BadPowerTrackUrlException("Doesn't end with .json")
    if "stream.gnip.com" not in url:
        raise BadPowerTrackUrlException("Doesn't contain stream.gnip.com")
    return url.replace(".json", "/rules.json").replace("stream.gnip.com", "api.gnip.com")

Instead of:

https://api.gnip.com:443/accounts/<account-name>/publishers/twitter/streams/track/<stream-label>/rules.json

rules will have to be posted to:

https://gnip-api.twitter.com/rules/powertrack/accounts/<account-name>/publishers/twitter/<stream-label>.json.

It would be better if gnippy accepted rules_url as part of the config.

In addition, rule deletion will now be done with a POST request and a _method=delete query parameter.
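A rough sketch of how an explicit rules_url setting could coexist with the legacy derivation. The function names `resolve_rules_url` and `delete_request_url` are hypothetical, and `ValueError` stands in for whatever exception gnippy would raise; only the URL shapes and the `_method=delete` parameter come from this issue.

```python
def resolve_rules_url(conf):
    """Return the rules endpoint for a config dict.

    Prefers an explicit 'rules_url' (a hypothetical new setting) and falls
    back to the legacy derivation from the stream URL.
    """
    rules_url = conf.get("rules_url")
    if rules_url:
        return rules_url
    url = conf.get("url") or ""
    if ".json" not in url or "stream.gnip.com" not in url:
        raise ValueError("Set rules_url or provide a legacy PowerTrack stream URL")
    return url.replace(".json", "/rules.json").replace("stream.gnip.com", "api.gnip.com")


def delete_request_url(rules_url):
    """Append the _method=delete parameter used for POST-based deletion."""
    separator = "&" if "?" in rules_url else "?"
    return rules_url + separator + "_method=delete"
```

Keeping the fallback means existing configs that only define the stream URL would continue to work until the migration deadline.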

If an error is raised when consuming PowerTrack stream, there's no way to communicate it to the caller

@abh1nav

I encountered a rare "Connection reset by peer" when consuming PowerTrack stream:

https://github.com/abh1nav/gnippy/blob/master/gnippy/powertrackclient.py#L75

The exception was thrown in iter_lines; unfortunately, I don't have the traceback available.

Because this error is happening within a worker thread, and there's no way to catch that, the calling thread also dies. I need a way to capture the error event and restart the connection to PowerTrack.

Suggestion:

  • add an exception_callback parameter when creating PowerTrackClient
  • all exceptions would be caught in worker thread's run
  • if we have an exception_callback provided, that would be called, letting the caller know that a reconnect may be needed
  • if no exception_callback is provided, re-raise.
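The suggestion above could look roughly like this. It is a minimal sketch: `CallbackWorker` is a hypothetical stand-in for the worker thread inside PowerTrackClient, and `target` stands in for the streaming loop.

```python
import threading


class CallbackWorker(threading.Thread):
    """Hypothetical worker: reports errors via exception_callback, else re-raises."""

    def __init__(self, target, exception_callback=None):
        super(CallbackWorker, self).__init__()
        self.daemon = True
        self._work = target
        self._exception_callback = exception_callback

    def run(self):
        try:
            self._work()
        except Exception as exc:
            if self._exception_callback is None:
                raise  # no callback provided: keep the current behaviour
            # Runs in the worker thread; the caller decides whether to reconnect.
            self._exception_callback(exc)
```

Note that the callback runs in the worker thread, so a caller that wants to reconnect from the main thread would need to hand the exception over, for example through a queue.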

If that sounds ok, I'd implement that.

Thoughts?

Bad credentials don't throw an error

It would be great if the client could throw an error when the credentials are wrong. Right now it looks like the client just waits for data to come back. Happy to work on this if you like.

Thanks for your great work on this!
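As a rough idea of what this could look like: check the HTTP status before iterating over the stream. This sketch assumes the endpoint answers bad credentials with HTTP 401; `BadCredentialsException` and `check_stream_response` are hypothetical names, not part of gnippy.

```python
class BadCredentialsException(Exception):
    """Hypothetical error type; not part of gnippy."""


def check_stream_response(response):
    """Raise early if the stream request was rejected.

    `response` is anything with a `status_code` attribute, such as a
    requests.Response obtained with stream=True.
    """
    if response.status_code == 401:
        raise BadCredentialsException("PowerTrack rejected the credentials (HTTP 401)")
    if response.status_code != 200:
        raise IOError("Unexpected HTTP status %d from PowerTrack" % response.status_code)
    return response
```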

New release is needed

@abh1nav We're using gnippy in parse.ly. We need a release after this PR was merged, otherwise our dependencies end up in a bad state:

#29

Any chance you could do a release, or make me or @dan-blanchard a gnippy maintainer on pypi?

Data loss when stop requested

When Worker is asked to stop, it does so as soon as possible, which is when r.iter_lines wakes up and returns data. Depending on the stream and rules, that data is most likely just an ack, but it could also be precious JSON.

    for line in r.iter_lines():
        if self.stopped():
            break

        elif line:
            self.on_data(line)

It is a bit of an opinionated subject, but to me it seems like it should handle the possibly received data and then stop.
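A sketch of the reordered loop, pulled out into a standalone function so it can be tried without a live stream. The name `consume` and its parameters are made up for this illustration; in the Worker they would correspond to r.iter_lines(), self.on_data and self.stopped.

```python
def consume(lines, on_data, stopped):
    """Deliver each received line, then check whether a stop was requested."""
    for line in lines:
        if line:  # skip empty lines, as the original loop does
            on_data(line)
        if stopped():
            break
```

The only change from the current loop is the order of the two checks, so a line that arrives after the stop request is still handed to on_data before the loop exits.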

Deleting rules retrieved with get_rules() errors out

If I do this:

all_rules = rules.get_rules()
rules.delete_rules(all_rules)

I get an exception:

Traceback (most recent call last):
  File "configurator.py", line 93, in <module>
    delete_rules(popts.file_name)
  File "configurator.py", line 43, in delete_rules
    rules.delete_rules(all_rules)
  File "/Users/irina/.pyenv/versions/casterisk/lib/python2.7/site-packages/gnippy/rules.py", line 180, in delete_rules
    _delete(conf, rules_list)
  File "/Users/irina/.pyenv/versions/casterisk/lib/python2.7/site-packages/gnippy/rules.py", line 96, in _delete
    _check_rules_list(built_rules)
  File "/Users/irina/.pyenv/versions/casterisk/lib/python2.7/site-packages/gnippy/rules.py", line 50, in _check_rules_list
    fail()
  File "/Users/irina/.pyenv/versions/casterisk/lib/python2.7/site-packages/gnippy/rules.py", line 33, in fail
    raise RulesListFormatException(msg)
gnippy.errors.RulesListFormatException: rules_list is not in the correct format. Please use build_rule to build your rules list.

In my case, all_rules only contains one item and looks like this:

[{u'tag': None, u'value': u'url_contains:"mashable.com"'}]

It seems that the 'tag': None entry is the problem when submitting to the delete API. If my list looks like this:

[{u'value': u'url_contains:"mashable.com"'}]

then there's no exception and rules are deleted.
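Until this is handled inside gnippy, a small workaround along these lines should do. `to_deletable` is a hypothetical helper, not a gnippy function; it keeps only the value and a tag that is not None.

```python
def to_deletable(rules_list):
    """Keep only 'value' and a non-None 'tag' from each rule dict."""
    cleaned = []
    for rule in rules_list:
        built = {"value": rule["value"]}
        if rule.get("tag") is not None:
            built["tag"] = rule["tag"]
        cleaned.append(built)
    return cleaned
```

The cleaned list could then be passed on as rules.delete_rules(to_deletable(all_rules)).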

Why not "requests >="

Right now, Gnippy uses requests strictly pinned to 2.8.1:

requests==2.8.1

This conflicts with one of the packages we use, which requires the latest requests (not pinned).

Can this be changed to requests>=2.8.1?

New rule format breaks delete example

The examples in the README for deleting rules are broken due to the addition of the id field in the rules returned by GNIP.

from gnippy import rules
from gnippy.errors import RuleDeleteFailedException, RulesGetFailedException

try:
    rules_list = rules.get_rules()
    # Suppose I want to delete the first rule in the list
    rules.delete_rule(rules_list[0])
    # OR ... I want to delete ALL rules
    rules.delete_rules(rules_list)

except (RuleDeleteFailedException, RulesGetFailedException):
    pass

At the moment a user needs to either manually remove the id field

del rule['id']
rules.delete_rule(rule)

or rebuild the rule using rules.build

 rules.delete_rule(rules.build(rule['value'], rule.get('tag')))

IncompleteRead - We need some strategy to handle this.

When this happens it would be good to be able to recover gracefully.
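One strategy, sketched here under assumptions, is to reconnect with exponential backoff whenever the stream breaks. `run_with_reconnect` and all of its parameters are hypothetical; `connect` stands in for whatever opens the stream and returns an iterable of lines. The default of catching IOError is an assumption chosen to keep the sketch self-contained; with requests, passing retriable=(requests.exceptions.ChunkedEncodingError,) explicitly would be the more targeted choice.

```python
import time


def run_with_reconnect(connect, on_data, retriable=(IOError,), max_retries=5,
                       base_delay=1.0, sleep=time.sleep):
    """Consume lines from connect(), reconnecting with backoff on retriable errors.

    `connect` returns an iterable of lines and is called again after each failure.
    """
    failures = 0
    while True:
        try:
            for line in connect():
                failures = 0  # a line arrived, so the connection is healthy
                if line:
                    on_data(line)
            return  # stream ended cleanly
        except retriable:
            failures += 1
            if failures > max_retries:
                raise  # give up and let the caller see the error
            sleep(base_delay * 2 ** (failures - 1))
```

Resetting the failure counter once data arrives means only consecutive failures count toward max_retries, so an occasional IncompleteRead on a long-lived stream would not exhaust the retry budget.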

I'm using:
$ python -V
Python 2.7.12

BTW Many thanks for a great library.

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7.12/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/cabox/env/g2/lib/python2.7/site-packages/gnippy/powertrackclient.py", line 133, in run
    for line in r.iter_lines():
  File "/home/cabox/env/g2/lib/python2.7/site-packages/requests/models.py", line 720, in iter_lines
    for chunk in self.iter_content(chunk_size=chunk_size, decode_unicode=decode_unicode):
  File "/home/cabox/env/g2/lib/python2.7/site-packages/requests/models.py", line 679, in generate
    raise ChunkedEncodingError(e)
ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read, 512 more expected)', IncompleteRead(0 bytes read, 512 more expected))
