gnip4j's Issues

NullPointerException thrown when receiving a "delete favorite" event on the compliance stream

When the compliance stream receives a delete favorite event, the following exception is thrown:

WARN  [] c.z.g.api.impl.DefaultGnipStream {} - Unexpected exception while consuming activity stream {your-stream-name}: null
[10.100.6.81] out: java.lang.NullPointerException: null
[10.100.6.81] out: 	at com.zaubersoftware.gnip4j.api.model.compliance.DeleteStatusActivity$Status.access$000(DeleteStatusActivity.java:38) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: 	at com.zaubersoftware.gnip4j.api.model.compliance.DeleteStatusActivity.toActivity(DeleteStatusActivity.java:49) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: 	at com.zaubersoftware.gnip4j.api.impl.formats.ComplianceActivityUnmarshaller.unmarshall(ComplianceActivityUnmarshaller.java:42) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: 	at com.zaubersoftware.gnip4j.api.impl.formats.ComplianceActivityUnmarshaller.unmarshall(ComplianceActivityUnmarshaller.java:35) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: 	at com.zaubersoftware.gnip4j.api.impl.formats.ByLineFeedProcessor.process(ByLineFeedProcessor.java:53) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: 	at com.zaubersoftware.gnip4j.api.impl.DefaultGnipStream$GnipHttpConsumer.run(DefaultGnipStream.java:229) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: 	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]

Because the exception is unexpected, gnip4j assumes the stream has disconnected and immediately tries to reconnect, which causes some connection thrashing.

I have a fix in the works.

Upgrade to Gnip 2.0 API

This is not really an issue.
I am opening one because I do not know how else to contact you or the team.

I would like to know whether you have any plans to upgrade this library to support the Gnip 2.0 API.

Handling of Gnip Compliance 2.0 Stream

The JSON format for compliance events has changed from Gnip 1.0 to Gnip 2.0. This prevents gnip4j from deserializing the stream properly.

The old format:

{
    "objectType": "activity",
    "verb": "delete",
    "object": {
        "id": "tag:search.twitter.com,2005:780997295722627072"
    },
    "actor": {
        "id": "id:twitter.com:4500096554"
    },
    "timestampMs": "2016-09-28T05:08:14.483+00:00"
}

The new format:

{
    "delete": {
        "status": {
            "id": 666590812910694401,
            "id_str": "666590812910694401",
            "user_id": 205634580,
            "user_id_str": "205634580"
        },
        "timestamp_ms": "1475038917641"
    }
}

From what I can see in the code, fixing this might involve changing which type of FeedProcessor is constructed in DefaultGnipStream (or perhaps having a different GnipStream implementation) for the compliance stream. The FeedProcessor would need to coerce the new compliance format into the old format. For reference, the new possible objects on the stream are described here: http://support.gnip.com/sources/twitter/data_format.html#SamplePayloads
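As a rough illustration of the coercion described above, here is a minimal sketch that maps the fields of a new-style delete event onto the shapes used in the old payload. The class and method names are hypothetical and not part of gnip4j, and the id prefixes are simply copied from the old-format sample. It reads the `_str` fields because 64-bit ids can lose precision when a parser treats JSON numbers as doubles. Note that `Instant.toString()` ends in `Z`, whereas the old sample used `+00:00`.

```java
import java.time.Instant;

/** Hypothetical helper, not part of gnip4j: maps Gnip 2.0 delete fields onto Gnip 1.0 shapes. */
final class ComplianceDeleteMapper {
    private ComplianceDeleteMapper() { }

    /** Builds an old-style object id from the new "id_str" value. */
    static String legacyObjectId(String statusIdStr) {
        return "tag:search.twitter.com,2005:" + statusIdStr;
    }

    /** Builds an old-style actor id from the new "user_id_str" value. */
    static String legacyActorId(String userIdStr) {
        return "id:twitter.com:" + userIdStr;
    }

    /** Converts the new epoch-millisecond "timestamp_ms" string to an ISO-8601 UTC timestamp. */
    static String legacyTimestamp(String timestampMs) {
        return Instant.ofEpochMilli(Long.parseLong(timestampMs)).toString();
    }

    public static void main(String[] args) {
        // Values taken from the new-format sample payload.
        System.out.println(legacyObjectId("666590812910694401"));
        System.out.println(legacyActorId("205634580"));
        System.out.println(legacyTimestamp("1475038917641"));
    }
}
```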

Out Of Memory Issue

Hi,

I am getting an out-of-memory error while using your library.
I generated a heap dump and ran the Eclipse Memory Analyzer Tool on it.
It showed the message below:
One instance of "java.util.concurrent.ThreadPoolExecutor" loaded by "" occupies 3,532,605,080 (86.41%) bytes. The instance is referenced by com.zaubersoftware.gnip4j.api.impl.formats.ByLineFeedProcessor @ 0x6c8eb0d00 , loaded by "org.apache.catalina.loader.WebappClassLoader @ 0x6c2e36908".

Keywords
java.util.concurrent.ThreadPoolExecutor
org.apache.catalina.loader.WebappClassLoader @ 0x6c2e36908

I don't know whether this is the right place to post this.
I hope you can help me out!
Thanks.
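One possible reading of the heap dump (an assumption, since the executor's configuration is not shown in the report) is that tasks queue up in the executor faster than they are consumed. A minimal sketch of a bounded alternative using only the JDK follows; the class name and sizes are hypothetical. With `CallerRunsPolicy`, a full queue makes the submitting thread run the task itself, which slows the producer down instead of letting the queue grow without limit.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch, not gnip4j code: an executor whose work queue cannot grow unbounded. */
final class BoundedExecutors {
    private BoundedExecutors() { }

    static ThreadPoolExecutor newBounded(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),       // fixed-size queue
                new ThreadPoolExecutor.CallerRunsPolicy());    // back-pressure when the queue is full
    }

    public static void main(String[] args) {
        ThreadPoolExecutor executor = newBounded(4, 1000);
        System.out.println("free queue slots: " + executor.getQueue().remainingCapacity());
        executor.shutdown();
    }
}
```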

Matching rules doesn't have the ID field

Hello,

In the new Gnip 2.0, each entry in matching_rules now has the fields "tag" and "id". However, this change is not implemented in the class "MatchingRules", which only has "tag" and "value" (the old version).

We should add a field called "id" to this class (and keep "value" for compatibility).

Thanks.
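For illustration only, a model carrying all three fields could look like the sketch below. The class name is hypothetical and the real MatchingRules class may be structured differently (for example, with mapping annotations that are omitted here).

```java
/** Hypothetical sketch of a matching rule that carries both the 1.0 and the 2.0 fields. */
final class MatchingRuleSketch {
    private final String tag;
    private final String value; // Gnip 1.0 field, kept for compatibility
    private final String id;    // Gnip 2.0 field

    MatchingRuleSketch(String tag, String value, String id) {
        this.tag = tag;
        this.value = value;
        this.id = id;
    }

    String getTag() { return tag; }

    /** May be null when the payload comes from a 2.0 stream. */
    String getValue() { return value; }

    /** May be null when the payload comes from a 1.0 stream. */
    String getId() { return id; }

    public static void main(String[] args) {
        MatchingRuleSketch rule = new MatchingRuleSketch("my-tag", null, "12345");
        System.out.println(rule.getTag() + " -> " + rule.getId());
    }
}
```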

Jackson mapping Issue - Powertrack V2

I'm getting the following error when I try to parse a v2 payload:

org.codehaus.jackson.map.JsonMappingException: Can not construct instance of com.zaubersoftware.gnip4j.api.model.Geometry, problem: abstract types can only be instantiated with additional type information at [Source: java.io.StringReader@26e35d06; line: 1, column: 1694] (through reference chain: com.zaubersoftware.gnip4j.api.model.Activity["location"]->com.zaubersoftware.gnip4j.api.model.Location["geo"]->com.zaubersoftware.gnip4j.api.model.Geo["coordinates"])
	at org.codehaus.jackson.map.JsonMappingException.from(JsonMappingException.java:163)
	at org.codehaus.jackson.map.deser.StdDeserializationContext.instantiationException(StdDeserializationContext.java:233)
	at org.codehaus.jackson.map.deser.AbstractDeserializer.deserialize(AbstractDeserializer.java:60)
	at org.codehaus.jackson.map.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:299)
	at org.codehaus.jackson.map.deser.SettableBeanProperty$MethodProperty.deserializeAndSet(SettableBeanProperty.java:414)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:697)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserialize(BeanDeserializer.java:580)
	at org.codehaus.jackson.map.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:299)
	at org.codehaus.jackson.map.deser.SettableBeanProperty$MethodProperty.deserializeAndSet(SettableBeanProperty.java:414)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:697)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserialize(BeanDeserializer.java:580)
	at org.codehaus.jackson.map.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:299)
	at org.codehaus.jackson.map.deser.SettableBeanProperty$MethodProperty.deserializeAndSet(SettableBeanProperty.java:414)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:697)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserialize(BeanDeserializer.java:580)
	at org.codehaus.jackson.map.ObjectMapper._readMapAndClose(ObjectMapper.java:2732)
	at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1863)
	at com.dev.search.gnip.TestVTwo.main(TestVTwo.java:29)
Exception in thread "main" java.lang.NullPointerException
	at com.dev.search.gnip.TestVTwo.main(TestVTwo.java:34)

JacksonJsonMapper will throw EOFException if Gnip sends \r every 30 seconds

Gnip sends a carriage return every 30 seconds to keep the stream connection open. The problem is that JacksonJsonMapper throws an EOFException when it receives this carriage return (see _initForReading() in Jackson's ObjectMapper.java).

The EOFException causes the Gnip connection to be dropped (although gnip4j reconnects automatically).

DefaultGnipStream.java can use an extended ObjectMapper that overrides _initForReading() so that the EOFException is ignored for the stream, e.g.:

private static class GnipStreamJsonObjectMapper extends ObjectMapper {
    @Override
    protected JsonToken _initForReading(JsonParser jp)
            throws IOException, JsonParseException, JsonMappingException {
        /*
         * First: must point to a token; if not pointing to one, advance. This
         * occurs before first read from JsonParser, as well as after clearing
         * of current token.
         */
        JsonToken t = jp.getCurrentToken();
        if (t == null) {
            // and then we must get something...
            t = jp.nextToken();
            if (t == null) {
                /*
                 * [JACKSON-99] Should throw EOFException, closest thing
                 * semantically
                 */
//                 throw new EOFException("No content to map to Object due to end of input");
                return t;
            }
        }
        return t;
    }
}
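An alternative that avoids overriding Jackson internals would be to drop the keep-alive lines before they ever reach the JSON mapper. The following is a minimal sketch using only the JDK; the class and method names are hypothetical and this is not how gnip4j currently reads the stream. `BufferedReader.readLine()` treats a bare `\r` as a line terminator, so each keep-alive shows up as an empty line that can be skipped.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: filter keep-alive lines out of a line-delimited JSON stream. */
final class KeepAliveFilter {
    private KeepAliveFilter() { }

    /** Returns only the non-blank lines; each one is a candidate JSON payload. */
    static List<String> readPayloadLines(Reader source) {
        List<String> payloads = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // keep-alive carriage return, nothing to parse
                }
                payloads.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return payloads;
    }

    public static void main(String[] args) {
        String stream = "{\"a\":1}\r\n\r\n{\"b\":2}\r\n";
        System.out.println(readPayloadLines(new StringReader(stream)));
    }
}
```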

No JSON parsing

Hi all,
I wonder if this use case is covered by the current API.
We need to connect to Gnip to get data. The data will be stored in a document-oriented NoSQL database, which means that we do not want to spend time parsing each JSON document into an Activity instance. But we would like to reuse the connection and monitoring facilities of this library.
Are you interested in such a use case?

Cheers,
Andrey

Make api models serializable

I am using gnip4j as a spout in a Twitter Storm installation. For this to work I had to make all of the API models implement Serializable. It would be great if this were merged into the code base.
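To show what the change amounts to, here is a minimal sketch of a model class that implements Serializable and survives a round trip through Java serialization, which is the kind of copy a framework may make when passing objects between workers. `TweetSummary` is a hypothetical stand-in, not a gnip4j class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical sketch: a serializable model and a serialization round trip. */
final class SerializableRoundTrip {
    private SerializableRoundTrip() { }

    /** Stand-in for an API model; every field type must itself be serializable. */
    static final class TweetSummary implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String id;
        private final String body;

        TweetSummary(String id, String body) {
            this.id = id;
            this.body = body;
        }

        String getId() { return id; }
        String getBody() { return body; }
    }

    static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TweetSummary copy = (TweetSummary) deserialize(serialize(new TweetSummary("1", "hello")));
        System.out.println(copy.getId() + ": " + copy.getBody());
    }
}
```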

New release

Hello, can a new release be published to maven repositories to pick up recent changes?

Missing property order "type" for Geo.java

For com.zaubersoftware.gnip4j.api.model.Geo, the "type" property is missing in JAXB property order.

@XmlType(name = "", propOrder = { "coordinates" })

should be replaced by

@XmlType(name = "", propOrder = { "coordinates", "type" })

Add UTF8 support

All the Japanese tweets that I receive are corrupted: every multibyte character is garbled.

I am logging activity.getObject().getSummary() to see the problem.

Thanks in advance.
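A common cause of this symptom (an assumption here, since the relevant gnip4j code is not shown) is decoding the response bytes with the platform default charset instead of UTF-8. The sketch below, with hypothetical names, shows the difference between the two decodings on the same bytes. The Japanese text is written with Unicode escapes so the example does not depend on the source file encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: decode a byte stream with an explicit charset. */
final class Utf8Decoding {
    private Utf8Decoding() { }

    static String readAll(InputStream in, Charset charset) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[1024];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String original = "\u3053\u3093\u306b\u3061\u306f"; // "konnichiwa" in hiragana
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        String good = readAll(new ByteArrayInputStream(wire), StandardCharsets.UTF_8);
        String bad = readAll(new ByteArrayInputStream(wire), StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 decoding intact:      " + original.equals(good));
        System.out.println("ISO-8859-1 decoding intact: " + original.equals(bad));
    }
}
```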

Error Deleting Arabic Rule

Hi

I am having a problem deleting an Arabic rule I added to Gnip.

The rule is created fine on Gnip and can be viewed in the edit console:
(االخلافة على منهج النبوة ) ((الخلافة على منهاج النبوة ) OR (الخلافة على منهج النبوة ))

However, the read-only Gnip view displays it differently:
{"value":"(االخلا�ة على منهج النبوة ) ((الخلا�ة على منهاج النبوة ) OR (الخلا�ة على منهج النبوة ))","tag":"26974_266"}

Then when I call gnip.delete, it fails to delete the rule and doesn't throw an error.
Calling getRules again confirms the rule is still present on Gnip.

Regards
Tom

cannot deserialize properly for the geo coordinates

I found that gnip4j could not properly deserialize the geo coordinates in the Twitter sample payload from http://gnip.com/twitter/power-track.

The issue is that the coordinates in the JSON are an array of arrays of arrays, whereas the coordinates property of Geo.java is simply double[].

To fix the issue, double[] coordinates should be replaced by double[][][] coordinates (and the accessor methods updated correspondingly).

PowerTrack V2 Stream/Rules Base URL

The API Reference: http://support.gnip.com/apis/powertrack2.0/api_reference.html#Stream shows the following:

Stream URL: https://gnip-stream.twitter.com/...
Rules URL: https://gnip-api.twitter.com...

But PowerTrackV2UriStrategy.java uses:

public static final String DEFAULT_STREAM_URL_BASE  = "https://stream-data-api.twitter.com";
public static final String DEFAULT_RULE_URL_BASE = "https://data-api.twitter.com";

Are they both valid? Is the latter one deprecated?

Default timer in JRERemoteResourceProvider

Just a small issue: the default readTimeout in com.zaubersoftware.gnip4j.api.support.http.JRERemoteResourceProvider (line 48) is set to 10 seconds. Gnip streams send keep-alives every 30 seconds, so low-activity streams die fairly quickly. I'd suggest increasing this to more than 30 seconds (35s seems stable for me).

As a workaround you can override the JRERemoteResourceProvider.doConfiguration() method and set a longer read timeout.
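For reference, this is roughly what a longer read timeout looks like on a plain JRE connection. It is a minimal sketch with hypothetical names and a placeholder URL; it does not show the actual signature of doConfiguration(), which is not reproduced in this issue. Calling `openConnection()` only creates the connection object, so nothing is sent over the network here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

/** Hypothetical sketch: a read timeout longer than the 30-second keep-alive interval. */
final class ReadTimeoutSketch {
    static final int CONNECT_TIMEOUT_MS = 10_000;
    static final int READ_TIMEOUT_MS = 35_000; // must exceed the 30 s keep-alive interval

    private ReadTimeoutSketch() { }

    static URLConnection open(URI uri) {
        try {
            URLConnection connection = uri.toURL().openConnection(); // not connected yet
            connection.setConnectTimeout(CONNECT_TIMEOUT_MS);
            connection.setReadTimeout(READ_TIMEOUT_MS);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection connection = open(URI.create("https://example.invalid/stream"));
        System.out.println("read timeout (ms): " + connection.getReadTimeout());
    }
}
```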

Many thanks for sharing this project!

New Release

Are there any plans to publish a new Maven release? It would be really helpful to get the latest from the repository instead of building the project and having just this one library that we have to track differently.

Mistakes in the Maven snippets on the gnip4j site

On the gnip4j website, in the Usage section, there are at least two mistakes in the Maven snippets:

  • The outer tag dependencyManagement is spelled wrong
  • The inner dependency tags all need to be children of dependencies.

This causes annoyance to anyone who tries to just copy-and-paste the snippet in. Here's a corrected version to add to the site:

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.zaubersoftware.gnip4j</groupId>
      <artifactId>gnip4j-core</artifactId>
      <version>${gnip4j.version}</version>
    </dependency>
    <dependency>
      <groupId>com.zaubersoftware.gnip4j</groupId>
      <artifactId>gnip4j-http</artifactId>
      <version>${gnip4j.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>

Hope this helps :)

Still maintained?

Just wondering if this project is still being actively maintained?

Bug in JSONDeserializationTest#testNPE()

The test testNPE() in com.zaubersoftware.gnip4j.http.JSONDeserializationTest fails on trunk because the private function removeTimeZoneFields() is using an incorrect regex for parsing postedTime attributes.

The bug is triggered in this line:

assertEquals(removeTimeZoneFields(IOUtils.toString(expectedIs)), removeTimeZoneFields(ww.toString()));

The test data is in payload-twitter-entities.js (for expectedIs) and in payload-twitter-entities.xml (for ww).

However, the postedTime values in expectedIs and ww are different:


[...] postedTime="2008-05-07T16:27:40.000+02:00">

[...] postedTime="2008-05-07T11:27:40.000-03:00">

As you can see, the first string has +02:00 (note the + character). I'm not sure how the "+02:00" got there in the first place; I could find it in neither the .js nor in the .xml file. I suppose it is being (incorrectly) deduced from my own local time zone (which is CET, i.e. UTC+1), which would also explain why a colleague of mine in the US (UTC-5) is not seeing this problem in his build.

What happens is that the regex in removeTimeZoneFields() will remove the -03:00 postedTime attribute, but it will fail to match the + character in +02:00 and therefore not remove this postedTime attribute. This causes the unit test to fail.

I can reproduce this bug by running the build, e.g. via mvn test or mvn install in the top-level directory. I tested this on Ubuntu 11.10 and Maven 3.

The following change fixes the problem:

In com.zaubersoftware.gnip4j.http.JSONDeserializationTest, replace

return input.replaceAll("postedTime=\"[\\d-T:\\.]*\"", "");

With

return input.replaceAll("postedTime=\"[\\d-\\+T:\\.]*\"", "");

I'll also send a pull request if you're happy with this patch.
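The difference between the two expressions can be checked in isolation. The sketch below copies both regexes from above and applies them to the two postedTime attributes; the wrapper class and method names are hypothetical.

```java
/** Hypothetical sketch: compare the original and the patched postedTime regex. */
final class PostedTimeRegexDemo {
    static final String ORIGINAL_REGEX = "postedTime=\"[\\d-T:\\.]*\"";
    static final String PATCHED_REGEX = "postedTime=\"[\\d-\\+T:\\.]*\"";

    private PostedTimeRegexDemo() { }

    static String strip(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String minus = "postedTime=\"2008-05-07T11:27:40.000-03:00\">";
        String plus = "postedTime=\"2008-05-07T16:27:40.000+02:00\">";

        // The original regex strips the negative offset but leaves the positive one untouched.
        System.out.println(strip(minus, ORIGINAL_REGEX));
        System.out.println(strip(plus, ORIGINAL_REGEX));

        // The patched regex strips both.
        System.out.println(strip(minus, PATCHED_REGEX));
        System.out.println(strip(plus, PATCHED_REGEX));
    }
}
```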

Inflater has been closed

I receive the following error when I call GnipStream.close():

14:57:46 INFO [Thread-4] [DefaultGnipStream(info:84)] - Shutting Down prod
14:57:47 WARN [prod-consumer-http] [DefaultGnipStream(warn:114)] - Unexpected exception while consuming activity stream prod: Inflater has been closed
java.lang.NullPointerException: Inflater has been closed
	at java.util.zip.Inflater.ensureOpen(Inflater.java:389) ~[?:1.8.0_111]
	at java.util.zip.Inflater.inflate(Inflater.java:257) ~[?:1.8.0_111]
	at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152) ~[?:1.8.0_111]
	at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117) ~[?:1.8.0_111]
	at com.zaubersoftware.gnip4j.api.support.http.AbstractReleaseInputStream.read(AbstractReleaseInputStream.java:60) ~[gnip4j-core-2.2.0.jar:?]
	at com.zaubersoftware.gnip4j.api.stats.commonsio.ProxyInputStream.read(ProxyInputStream.java:79) ~[gnip4j-core-2.2.0.jar:?]
	at com.zaubersoftware.gnip4j.api.stats.commonsio.TeeInputStream.read(TeeInputStream.java:128) ~[gnip4j-core-2.2.0.jar:?]
	at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) ~[?:1.8.0_111]
	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) ~[?:1.8.0_111]
	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) ~[?:1.8.0_111]
	at java.io.InputStreamReader.read(InputStreamReader.java:184) ~[?:1.8.0_111]
	at java.io.BufferedReader.fill(BufferedReader.java:161) ~[?:1.8.0_111]
	at java.io.BufferedReader.readLine(BufferedReader.java:324) ~[?:1.8.0_111]
	at java.io.BufferedReader.readLine(BufferedReader.java:389) ~[?:1.8.0_111]
	at com.zaubersoftware.gnip4j.api.impl.formats.ByLineFeedProcessor.process(ByLineFeedProcessor.java:51) ~[gnip4j-core-2.2.0.jar:?]
	at com.zaubersoftware.gnip4j.api.impl.DefaultGnipStream$GnipHttpConsumer.run(DefaultGnipStream.java:229) [gnip4j-core-2.2.0.jar:?]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]

What is this?

I use:

        <!-- Gnip4j -->
        <dependency>
            <groupId>com.zaubersoftware.gnip4j</groupId>
            <artifactId>gnip4j-core</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>com.zaubersoftware.gnip4j</groupId>
            <artifactId>gnip4j-http</artifactId>
            <version>2.1.0</version>
        </dependency>

Replay Stream Support

Would it make sense to have a ReplayUriStrategy?

Looking at PowerTrack 2.0 Replay, it would mostly be a copy-paste of PowerTrackV2UriStrategy, but with a few changes:

Instead of

public final class PowerTrackV2UriStrategy implements UriStrategy {
    public static final String PATH_GNIP_STREAM_URI =  "/stream/powertrack/accounts/%s/publishers/%s/%s.json";
    public static final String PATH_GNIP_RULES_URI =  "/rules/powertrack/accounts/%s/publishers/%s/%s.json";

    @Override
    public URI createStreamUri(final String account, final String streamName) {
        ...

        return URI.create(String.format(streamUrlBase + PATH_GNIP_STREAM_URI, account.trim(), publisher.trim(), streamName.trim()));
    }

    private String createRulesBaseUrl(final String account, final String streamName) {
       ...

         return String.format(ruleUrlBase + PATH_GNIP_RULES_URI, account.trim(), publisher.trim(), streamName.trim());
    }
}

it would have something like

public class PowerTrackV2ReplayUriStrategy implements UriStrategy {
    public static final String PATH_GNIP_REPLAY_STREAM_URI
      = "/replay/powertrack/accounts/%s/publishers/%s/%s.json?fromDate=%s&toDate=%s";
    public static final String PATH_GNIP_REPLAY_RULES_URI
      = "/rules/powertrack-replay/accounts/%s/publishers/%s/%s.json";

    DateFormat formatter = new SimpleDateFormat("yyyyMMddHHmm");
    Date from;
    Date to;

    public PowerTrackV2ReplayUriStrategy(final Date from, final Date to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public URI createStreamUri(String account, String streamName) {
        ...

        final String fromString = formatter.format(from);
        final String toString = formatter.format(to);

        return URI.create(String.format(
            this.streamUrlBase + PATH_GNIP_REPLAY_STREAM_URI, account.trim(),
            this.publisher.trim(), streamName.trim(), fromString, toString));
    }

    private String createRulesBaseUrl(final String account, final String streamName) {
        ...

        return String.format(
            this.ruleUrlBase + PATH_GNIP_REPLAY_RULES_URI, account.trim(), this.publisher.trim(),
            streamName.trim());
    }
}
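One detail worth considering in the proposal above: SimpleDateFormat is not thread-safe and formats in the JVM's default time zone unless told otherwise. A minimal sketch of the date handling with java.time follows. It assumes the Replay API expects fromDate and toDate in UTC, which should be checked against the Gnip documentation. The path template is copied from the proposal; the class name, base URL, account, publisher and stream name are placeholders.

```java
import java.net.URI;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch: build a Replay stream URI with a thread-safe, UTC-pinned formatter. */
final class ReplayUriSketch {
    static final String PATH_GNIP_REPLAY_STREAM_URI =
            "/replay/powertrack/accounts/%s/publishers/%s/%s.json?fromDate=%s&toDate=%s";

    // DateTimeFormatter is immutable, so one shared instance is safe across threads.
    private static final DateTimeFormatter MINUTE_FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmm").withZone(ZoneOffset.UTC);

    private ReplayUriSketch() { }

    static String formatMinute(Instant instant) {
        return MINUTE_FORMAT.format(instant);
    }

    static URI streamUri(String urlBase, String account, String publisher,
                         String streamName, Instant from, Instant to) {
        return URI.create(String.format(
                urlBase + PATH_GNIP_REPLAY_STREAM_URI,
                account.trim(), publisher.trim(), streamName.trim(),
                formatMinute(from), formatMinute(to)));
    }

    public static void main(String[] args) {
        Instant from = Instant.parse("2016-09-28T05:01:57Z");
        Instant to = from.plusSeconds(3600);
        System.out.println(streamUri("https://example.invalid", "acme", "twitter", "prod", from, to));
    }
}
```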
