zauberlabs / gnip4j
Gnip Client Library for Java
License: Apache License 2.0
When the compliance stream receives a delete favorite event, the following exception is thrown:
WARN [] c.z.g.api.impl.DefaultGnipStream {} - Unexpected exception while consuming activity stream {your-stream-name}: null
[10.100.6.81] out: java.lang.NullPointerException: null
[10.100.6.81] out: at com.zaubersoftware.gnip4j.api.model.compliance.DeleteStatusActivity$Status.access$000(DeleteStatusActivity.java:38) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: at com.zaubersoftware.gnip4j.api.model.compliance.DeleteStatusActivity.toActivity(DeleteStatusActivity.java:49) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: at com.zaubersoftware.gnip4j.api.impl.formats.ComplianceActivityUnmarshaller.unmarshall(ComplianceActivityUnmarshaller.java:42) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: at com.zaubersoftware.gnip4j.api.impl.formats.ComplianceActivityUnmarshaller.unmarshall(ComplianceActivityUnmarshaller.java:35) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: at com.zaubersoftware.gnip4j.api.impl.formats.ByLineFeedProcessor.process(ByLineFeedProcessor.java:53) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: at com.zaubersoftware.gnip4j.api.impl.DefaultGnipStream$GnipHttpConsumer.run(DefaultGnipStream.java:229) ~[gnip4j-core-2.1.0.jar:na]
[10.100.6.81] out: at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
Because the exception is unexpected, the stream assumes it has been disconnected and immediately tries to reconnect, resulting in some connection thrashing.
I have a fix in the works.
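The fix would presumably guard against the missing "status" body before building the Activity. The following is a minimal sketch of that idea; the class and field names are illustrative stand-ins, not the actual gnip4j internals:

```java
// Hypothetical sketch of a defensive fix: guard against a null "status"
// payload before building the Activity. Names here are illustrative,
// not the real DeleteStatusActivity internals.
public final class DeleteStatusGuardSketch {
    /** Minimal stand-ins for the compliance payload and result. */
    static final class Status { Long id; Long userId; }
    static final class Activity { String verb; String objectId; }

    /** Returns null instead of throwing when the event carries no status. */
    static Activity toActivity(final Status status) {
        if (status == null || status.id == null) {
            return null; // e.g. a "delete favorite" event with no status body
        }
        final Activity a = new Activity();
        a.verb = "delete";
        a.objectId = "tag:search.twitter.com,2005:" + status.id;
        return a;
    }

    public static void main(String[] args) {
        // A delete-favorite event without a status must not throw an NPE.
        assert toActivity(null) == null;
        final Status s = new Status();
        s.id = 780997295722627072L;
        assert toActivity(s).objectId.endsWith("780997295722627072");
        System.out.println("ok");
    }
}
```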
This is not exactly a bug report; I am creating an issue because I do not know how else to contact you or the team.
I would like to know whether you have any plans to upgrade this library to support the Gnip 2.0 API.
The JSON format for compliance events has changed from Gnip 1.0 to Gnip 2.0. This prevents gnip4j from deserializing the stream properly.
The old format:
{
  "objectType": "activity",
  "verb": "delete",
  "object": {
    "id": "tag:search.twitter.com,2005:780997295722627072"
  },
  "actor": {
    "id": "id:twitter.com:4500096554"
  },
  "timestampMs": "2016-09-28T05:08:14.483+00:00"
}
The new format:
{
  "delete": {
    "status": {
      "id": 666590812910694401,
      "id_str": "666590812910694401",
      "user_id": 205634580,
      "user_id_str": "205634580"
    },
    "timestamp_ms": "1475038917641"
  }
}
From what I can see in the code, fixing this might involve changing which type of FeedProcessor is constructed in DefaultGnipStream (or perhaps having a different GnipStream implementation) for the compliance stream. The FeedProcessor would need to coerce the new compliance format into the existing Activity model. For reference, the new possible objects on the stream are described here: http://support.gnip.com/sources/twitter/data_format.html#SamplePayloads
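To make the coercion concrete, here is a plain-JDK sketch of mapping the new 2.0 "delete" payload back onto the old-format ids. The class and method names are hypothetical; a real implementation would bind these fields with Jackson annotations inside the compliance unmarshaller:

```java
// Sketch of coercing the Gnip 2.0 compliance "delete" payload into the
// id formats the old Activity model used. Names are illustrative; real
// code would deserialize these structures with Jackson.
public final class ComplianceV2Sketch {
    /** Mirrors {"delete": {"status": {...}, "timestamp_ms": "..."}} */
    static final class Delete { Status status; String timestampMs; }
    static final class Status { long id; String idStr; long userId; String userIdStr; }

    /** Old-format object id, as emitted by the Gnip 1.0 compliance stream. */
    static String toOldObjectId(final Status s) {
        return "tag:search.twitter.com,2005:" + s.idStr;
    }

    /** Old-format actor id. */
    static String toOldActorId(final Status s) {
        return "id:twitter.com:" + s.userIdStr;
    }

    public static void main(String[] args) {
        final Status s = new Status();
        s.idStr = "666590812910694401";
        s.userIdStr = "205634580";
        assert toOldObjectId(s).equals("tag:search.twitter.com,2005:666590812910694401");
        assert toOldActorId(s).equals("id:twitter.com:205634580");
        System.out.println("ok");
    }
}
```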
Hi,
I am getting an out-of-memory issue while using your library.
I generated a heap dump and ran the Eclipse Memory Analyzer Tool.
It showed the following message:
One instance of "java.util.concurrent.ThreadPoolExecutor" loaded by "" occupies 3,532,605,080 (86.41%) bytes. The instance is referenced by com.zaubersoftware.gnip4j.api.impl.formats.ByLineFeedProcessor @ 0x6c8eb0d00 , loaded by "org.apache.catalina.loader.WebappClassLoader @ 0x6c2e36908".
Keywords
java.util.concurrent.ThreadPoolExecutor
org.apache.catalina.loader.WebappClassLoader @ 0x6c2e36908
I don't know whether this is the right place to post this.
I hope you can help me out!
Thanks.
Hello,
In the new Gnip 2.0, the matching_rules field now carries the fields "tag" and "id". However, this change is not implemented in the MatchingRules class, which only has "tag" and "value" (the old version).
We should add an "id" field to this class (and keep "value" for compatibility).
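A minimal sketch of what the extended bean could look like (the class name is a stand-in for the real com.zaubersoftware.gnip4j.api.model.MatchingRules, and the real class would carry the matching JAXB/Jackson annotations):

```java
// Sketch of a MatchingRules bean carrying the new Gnip 2.0 "id" field
// while keeping "value" for Gnip 1.0 compatibility. Illustrative only.
public final class MatchingRulesSketch {
    private String tag;
    private String value; // populated by Gnip 1.0 payloads
    private Long id;      // new in Gnip 2.0 payloads

    public String getTag() { return tag; }
    public void setTag(final String tag) { this.tag = tag; }
    public String getValue() { return value; }
    public void setValue(final String value) { this.value = value; }
    public Long getId() { return id; }
    public void setId(final Long id) { this.id = id; }

    public static void main(String[] args) {
        final MatchingRulesSketch r = new MatchingRulesSketch();
        r.setTag("26974_266");
        r.setId(42L);
        assert r.getId() == 42L;
        assert r.getValue() == null; // "value" is simply absent in 2.0 payloads
        System.out.println("ok");
    }
}
```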
Thanks.
I'm getting the following error when trying to parse a v2 payload:
org.codehaus.jackson.map.JsonMappingException: Can not construct instance of com.zaubersoftware.gnip4j.api.model.Geometry, problem: abstract types can only be instantiated with additional type information
 at [Source: java.io.StringReader@26e35d06; line: 1, column: 1694] (through reference chain: com.zaubersoftware.gnip4j.api.model.Activity["location"]->com.zaubersoftware.gnip4j.api.model.Location["geo"]->com.zaubersoftware.gnip4j.api.model.Geo["coordinates"])
	at org.codehaus.jackson.map.JsonMappingException.from(JsonMappingException.java:163)
	at org.codehaus.jackson.map.deser.StdDeserializationContext.instantiationException(StdDeserializationContext.java:233)
	at org.codehaus.jackson.map.deser.AbstractDeserializer.deserialize(AbstractDeserializer.java:60)
	at org.codehaus.jackson.map.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:299)
	at org.codehaus.jackson.map.deser.SettableBeanProperty$MethodProperty.deserializeAndSet(SettableBeanProperty.java:414)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:697)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserialize(BeanDeserializer.java:580)
	at org.codehaus.jackson.map.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:299)
	at org.codehaus.jackson.map.deser.SettableBeanProperty$MethodProperty.deserializeAndSet(SettableBeanProperty.java:414)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:697)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserialize(BeanDeserializer.java:580)
	at org.codehaus.jackson.map.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:299)
	at org.codehaus.jackson.map.deser.SettableBeanProperty$MethodProperty.deserializeAndSet(SettableBeanProperty.java:414)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:697)
	at org.codehaus.jackson.map.deser.BeanDeserializer.deserialize(BeanDeserializer.java:580)
	at org.codehaus.jackson.map.ObjectMapper._readMapAndClose(ObjectMapper.java:2732)
	at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1863)
	at com.dev.search.gnip.TestVTwo.main(TestVTwo.java:29)
Exception in thread "main" java.lang.NullPointerException
	at com.dev.search.gnip.TestVTwo.main(TestVTwo.java:34)
Gnip sends a carriage return every 30 seconds to make sure the stream connection is not closed. The problem is that the Jackson JSON mapper throws an EOFException when it receives this carriage return (see _initForReading() in Jackson's ObjectMapper.java).
The EOFException causes the Gnip connection to be disconnected (although gnip4j reconnects automatically).
DefaultGnipStream.java could make use of an extended ObjectMapper that overrides _initForReading() and ignores the EOFException for the stream case, e.g.:
private static class GnipStreamJsonObjectMapper extends ObjectMapper {
    @Override
    protected JsonToken _initForReading(final JsonParser jp)
            throws IOException, JsonParseException, JsonMappingException {
        /*
         * First: must point to a token; if not pointing to one, advance.
         * This occurs before the first read from JsonParser, as well as
         * after clearing of the current token.
         */
        JsonToken t = jp.getCurrentToken();
        if (t == null) {
            // and then we must get something...
            t = jp.nextToken();
            if (t == null) {
                /*
                 * [JACKSON-99] Should throw EOFException, closed thing
                 * semantically:
                 * throw new EOFException("No content to map to Object due to end of input");
                 * Returning null instead silently ignores the keep-alive input.
                 */
                return t;
            }
        }
        return t;
    }
}
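An alternative that avoids subclassing the mapper is to filter the keep-alive blank lines out before anything reaches the JSON parser. The sketch below is plain JDK code with a hypothetical helper name, not part of the gnip4j API:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Alternative sketch: skip Gnip's 30-second keep-alive blank lines before
// handing anything to the JSON parser, so the mapper never sees empty input.
public final class KeepAliveFilterSketch {
    /** Returns the next non-blank line, or null at end of stream. */
    static String nextPayloadLine(final BufferedReader reader) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            if (!line.trim().isEmpty()) {
                return line; // an actual JSON activity
            }
            // blank line: keep-alive heartbeat, ignore it
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        final BufferedReader r = new BufferedReader(
                new StringReader("\r\n\r\n{\"id\":1}\r\n\r\n"));
        assert "{\"id\":1}".equals(nextPayloadLine(r));
        assert nextPayloadLine(r) == null;
        System.out.println("ok");
    }
}
```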
Hi all,
I wonder if this use case is covered by the current API.
We need to connect to Gnip to get data. The data will be stored in a NoSQL document database, which means we do not want to spend time parsing the JSON document into an Activity instance, but we would like to reuse the connection and monitoring facilities of this library.
Are you interested in such a use case?
Cheers,
Andrey
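A raw pass-through callback would be one way to support this. The interface and helper below are hypothetical (gnip4j's actual callback is StreamNotification, which receives parsed Activity objects); this just sketches the shape of delivering each JSON line as a String for direct storage:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of a raw pass-through notification: deliver each JSON line as a
// String so the caller can store it directly in a document database,
// reusing only the connection/reconnection machinery. The interface name
// is hypothetical, not part of the gnip4j API.
public final class RawStreamSketch {
    interface RawNotification {
        void onRawActivity(String json);
    }

    /** Feeds already-split stream lines to the handler, skipping keep-alive blanks. */
    static void dispatch(final List<String> lines, final RawNotification handler) {
        for (final String line : lines) {
            if (!line.trim().isEmpty()) {
                handler.onRawActivity(line);
            }
        }
    }

    public static void main(String[] args) {
        final List<String> stored = new ArrayList<>();
        dispatch(Arrays.asList("{\"a\":1}", "", "{\"b\":2}"), stored::add);
        assert stored.size() == 2;
        assert stored.get(0).equals("{\"a\":1}");
        System.out.println("ok");
    }
}
```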
I am using gnip4j as a spout in a Twitter Storm installation. For this to work I had to make all of the API models implement Serializable. It would be great if this were merged into the code base.
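The reason Storm needs this is that tuples are shipped between JVMs via Java serialization. A quick round-trip sketch (the model class is a stand-in, not the real gnip4j Activity) shows what implementing Serializable buys:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch showing why Serializable models matter for Storm spouts:
// tuples are shipped between JVMs via Java serialization.
public final class SerializableModelSketch {
    /** Stand-in for a gnip4j API model class. */
    static final class ActivityModel implements Serializable {
        private static final long serialVersionUID = 1L;
        String id;
        String body;
    }

    /** Serializes and deserializes an object, as Storm would between workers. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(final T obj) throws Exception {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        final ActivityModel a = new ActivityModel();
        a.id = "tag:search.twitter.com,2005:1";
        a.body = "hello";
        final ActivityModel copy = roundTrip(a);
        assert copy.id.equals(a.id) && copy.body.equals(a.body);
        System.out.println("ok");
    }
}
```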
Hello, can a new release be published to maven repositories to pick up recent changes?
For com.zaubersoftware.gnip4j.api.model.Geo, the "type" property is missing from the JAXB propOrder:
@XmlType(name = "", propOrder = { "coordinates" })
should be replaced by
@XmlType(name = "", propOrder = { "coordinates", "type" })
The stream URL configuration has changed, from https://stream.gnip.com:443/accounts/xxxxxxx/publishers/twitter/streams/track/xxxxxx.json to https://stream.gnip.com:443/accounts/xxxxxxx/publishers/twitter/streams/usertrack/xxxxxx.json. The URL must be set accordingly in order to avoid this bug.
All tweets that I receive in Japanese are corrupted; all multibyte characters are garbled.
I am logging activity.getObject().getSummary() to see the problem.
Thanks in advance.
Hi
I am getting an issue deleting an Arabic rule I added to Gnip.
The rule is created fine on Gnip and can be viewed in the edit console:
(االخلافة على منهج النبوة ) ((الخلافة على منهاج النبوة ) OR (الخلافة على منهج النبوة ))
However, the read-only Gnip view displays it differently:
{"value":"(االخلا�ة على منهج النبوة ) ((الخلا�ة على منهاج النبوة ) OR (الخلا�ة على منهج النبوة ))","tag":"26974_266"}
Then when I call gnip.delete it fails to delete the rule but does not throw an error.
Calling getRules again confirms the rule is still on Gnip.
Regards
Tom
I found that gnip4j cannot properly deserialize geo coordinates using the Twitter sample payload from http://gnip.com/twitter/power-track.
The issue is that coordinates in the JSON is an array of arrays of arrays, while the coordinates property of Geo.java is simply double[].
To fix the issue, double[] coordinates should be replaced by double[][][] coordinates (and, of course, the accessor methods updated correspondingly).
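To see why three levels of nesting are needed: a GeoJSON-style Polygon is a list of rings, each ring a list of points, each point a longitude/latitude pair. A small sketch of that shape (the sample values are made up for illustration):

```java
// Sketch of the nested coordinate shape for a Polygon bounding box:
// rings -> points -> [longitude, latitude], i.e. double[][][].
public final class GeoCoordinatesSketch {
    public static void main(String[] args) {
        // One ring with four corner points, like a PowerTrack bounding box.
        final double[][][] coordinates = {
            {
                {-77.119759, 38.791645},
                {-76.909393, 38.791645},
                {-76.909393, 38.995548},
                {-77.119759, 38.995548},
            }
        };
        assert coordinates.length == 1;        // one ring
        assert coordinates[0].length == 4;     // four points in the ring
        assert coordinates[0][0].length == 2;  // each point is a lon/lat pair
        System.out.println("ok");
    }
}
```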
The API Reference: http://support.gnip.com/apis/powertrack2.0/api_reference.html#Stream shows the following:
Stream URL: https://gnip-stream.twitter.com/...
Rules URL: https://gnip-api.twitter.com...
But PowerTrackV2UriStrategy.java
uses:
public static final String DEFAULT_STREAM_URL_BASE = "https://stream-data-api.twitter.com";
public static final String DEFAULT_RULE_URL_BASE = "https://data-api.twitter.com";
Are they both valid? Is the latter one deprecated?
Just a small issue: the default readTimeout in com.zaubersoftware.gnip4j.api.support.http.JRERemoteResourceProvider (line 48) is set to 10 seconds, but Gnip streams send out keep-alives every 30 seconds, so the stream dies fairly quickly on low-activity streams. I'd suggest increasing this to more than 30 seconds (35 s seems stable for me).
As a workaround you can override the JRERemoteResourceProvider.doConfiguration() method and put in a longer pause period.
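For illustration, here is what configuring the longer timeout looks like with plain HttpURLConnection; this is only a sketch of the setting itself, not the actual JRERemoteResourceProvider.doConfiguration() override, whose exact signature I have not verified:

```java
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch of the workaround idea: a read timeout longer than Gnip's
// 30-second keep-alive interval, shown with plain HttpURLConnection.
public final class ReadTimeoutSketch {
    static final int READ_TIMEOUT_MS = 35_000; // > 30 s keep-alive interval

    static void configure(final HttpURLConnection conn) {
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(READ_TIMEOUT_MS);
    }

    public static void main(String[] args) throws Exception {
        // openConnection() does not touch the network until connect() is called.
        final HttpURLConnection conn =
                (HttpURLConnection) new URL("https://stream.gnip.com/").openConnection();
        configure(conn);
        assert conn.getReadTimeout() == 35_000;
        System.out.println("ok");
    }
}
```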
Many thanks for sharing this project!
Are there any plans to publish a new Maven release? It would be really helpful to get the latest from the repository instead of building the project and having just this one library that we have to track differently.
On the gnip4j website, in the Usage section, there are at least two mistakes in the Maven snippets:
dependencyManagement is spelled wrong, and
the dependency tags all need to be children of dependencies.
This causes annoyance for anyone who tries to just copy-and-paste the snippet in. Here's a corrected version to add to the site:
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.zaubersoftware.gnip4j</groupId>
      <artifactId>gnip4j-core</artifactId>
      <version>${gnip4j.version}</version>
    </dependency>
    <dependency>
      <groupId>com.zaubersoftware.gnip4j</groupId>
      <artifactId>gnip4j-http</artifactId>
      <version>${gnip4j.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
Hope this helps :)
Just wondering if this project is still being actively maintained?
The test testNPE() in com.zaubersoftware.gnip4j.http.JSONDeserializationTest fails on trunk because the private function removeTimeZoneFields() uses an incorrect regex for parsing postedTime attributes.
The bug is triggered in this line:
assertEquals(removeTimeZoneFields(IOUtils.toString(expectedIs)), removeTimeZoneFields(ww.toString()));
The test data is in payload-twitter-entities.js (for expectedIs) and in payload-twitter-entities.xml (for ww).
However, the postedTime values in expectedIs and ww are different:
[...] postedTime="2008-05-07T16:27:40.000+02:00">
[...] postedTime="2008-05-07T11:27:40.000-03:00">
As you can see, the first string has +02:00 (note the + character). I'm not sure how the "+02:00" got there in the first place; I could find it in neither the .js nor the .xml file. I suppose it is being (incorrectly) derived from my own local time zone (which is CET, i.e. UTC+1), which would also explain why a colleague of mine in the US (UTC-5) is not seeing this problem in his build.
What happens is that the regex in removeTimeZoneFields() removes the -03:00 postedTime attribute, but fails to match on the + in +02:00 and therefore does not remove that postedTime attribute. This causes the unit test to fail.
I can reproduce this bug by running the build, e.g. via mvn test or mvn install in the top-level directory. I tested this on Ubuntu 11.10 with Maven 3.
The following change fixes the problem: in com.zaubersoftware.gnip4j.http.JSONDeserializationTest, replace
return input.replaceAll("postedTime=\"[\\d-T:\\.]*\"", "");
with
return input.replaceAll("postedTime=\"[\\d-\\+T:\\.]*\"", "");
I'll also send a pull request if you're happy with this patch.
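The difference between the two regexes can be demonstrated directly: the original character class lacks '+', so a positive-offset timestamp survives the replacement while a negative-offset one is stripped.

```java
// Demonstration of the regex fix: the original character class lacks '+',
// so positive-offset timestamps survive the replacement unchanged.
public final class PostedTimeRegexSketch {
    static final String OLD = "postedTime=\"[\\d-T:\\.]*\"";
    static final String FIXED = "postedTime=\"[\\d-\\+T:\\.]*\"";

    public static void main(String[] args) {
        final String plus = "postedTime=\"2008-05-07T16:27:40.000+02:00\"";
        final String minus = "postedTime=\"2008-05-07T11:27:40.000-03:00\"";

        // Old regex: '+' stops the character class, so the closing quote
        // is never reached and the attribute is left in place.
        assert plus.replaceAll(OLD, "").equals(plus);
        assert minus.replaceAll(OLD, "").isEmpty();

        // Fixed regex strips both forms.
        assert plus.replaceAll(FIXED, "").isEmpty();
        assert minus.replaceAll(FIXED, "").isEmpty();
        System.out.println("ok");
    }
}
```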
I get the following error when I call GnipStream.close():
14:57:46 INFO [Thread-4] [DefaultGnipStream(info:84)] - Shutting Down prod
14:57:47 WARN [prod-consumer-http] [DefaultGnipStream(warn:114)] - Unexpected exception while consuming activity stream prod: Inflater has been closed
java.lang.NullPointerException: Inflater has been closed
at java.util.zip.Inflater.ensureOpen(Inflater.java:389) ~[?:1.8.0_111]
at java.util.zip.Inflater.inflate(Inflater.java:257) ~[?:1.8.0_111]
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152) ~[?:1.8.0_111]
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117) ~[?:1.8.0_111]
at com.zaubersoftware.gnip4j.api.support.http.AbstractReleaseInputStream.read(AbstractReleaseInputStream.java:60) ~[gnip4j-core-2.2.0.jar:?]
at com.zaubersoftware.gnip4j.api.stats.commonsio.ProxyInputStream.read(ProxyInputStream.java:79) ~[gnip4j-core-2.2.0.jar:?]
at com.zaubersoftware.gnip4j.api.stats.commonsio.TeeInputStream.read(TeeInputStream.java:128) ~[gnip4j-core-2.2.0.jar:?]
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) ~[?:1.8.0_111]
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) ~[?:1.8.0_111]
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) ~[?:1.8.0_111]
at java.io.InputStreamReader.read(InputStreamReader.java:184) ~[?:1.8.0_111]
at java.io.BufferedReader.fill(BufferedReader.java:161) ~[?:1.8.0_111]
at java.io.BufferedReader.readLine(BufferedReader.java:324) ~[?:1.8.0_111]
at java.io.BufferedReader.readLine(BufferedReader.java:389) ~[?:1.8.0_111]
at com.zaubersoftware.gnip4j.api.impl.formats.ByLineFeedProcessor.process(ByLineFeedProcessor.java:51) ~[gnip4j-core-2.2.0.jar:?]
at com.zaubersoftware.gnip4j.api.impl.DefaultGnipStream$GnipHttpConsumer.run(DefaultGnipStream.java:229) [gnip4j-core-2.2.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
What is causing this?
I am using:
<!-- Gnip4j -->
<dependency>
  <groupId>com.zaubersoftware.gnip4j</groupId>
  <artifactId>gnip4j-core</artifactId>
  <version>2.2.0</version>
</dependency>
<dependency>
  <groupId>com.zaubersoftware.gnip4j</groupId>
  <artifactId>gnip4j-http</artifactId>
  <version>2.1.0</version>
</dependency>
Would it make sense to have a ReplayUriStrategy?
Looking at PowerTrack 2.0 Replay, it would mostly be a copy-paste of PowerTrackV2UriStrategy, but with a few changes:
Instead of
public final class PowerTrackV2UriStrategy implements UriStrategy {
    public static final String PATH_GNIP_STREAM_URI = "/stream/powertrack/accounts/%s/publishers/%s/%s.json";
    public static final String PATH_GNIP_RULES_URI = "/rules/powertrack/accounts/%s/publishers/%s/%s.json";

    @Override
    public URI createStreamUri(final String account, final String streamName) {
        ...
        return URI.create(String.format(streamUrlBase + PATH_GNIP_STREAM_URI, account.trim(), publisher.trim(), streamName.trim()));
    }

    private String createRulesBaseUrl(final String account, final String streamName) {
        ...
        return String.format(ruleUrlBase + PATH_GNIP_RULES_URI, account.trim(), publisher.trim(), streamName.trim());
    }
}
it would have something like
public class PowerTrackV2ReplayUriStrategy implements UriStrategy {
    public static final String PATH_GNIP_REPLAY_STREAM_URI
        = "/replay/powertrack/accounts/%s/publishers/%s/%s.json?fromDate=%s&toDate=%s";
    public static final String PATH_GNIP_REPLAY_RULES_URI
        = "/rules/powertrack-replay/accounts/%s/publishers/%s/%s.json";

    DateFormat formatter = new SimpleDateFormat("yyyyMMddHHmm");
    Date from;
    Date to;

    public PowerTrackV2ReplayUriStrategy(final Date from, final Date to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public URI createStreamUri(String account, String streamName) {
        ...
        final String fromString = formatter.format(from);
        final String toString = formatter.format(to);
        return URI.create(String.format(
            this.streamUrlBase + PATH_GNIP_REPLAY_STREAM_URI, account.trim(),
            this.publisher.trim(), streamName.trim(), fromString, toString));
    }

    private String createRulesBaseUrl(final String account, final String streamName) {
        ...
        return String.format(
            this.ruleUrlBase + PATH_GNIP_REPLAY_RULES_URI, account.trim(), this.publisher.trim(),
            streamName.trim());
    }
}
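One detail worth pinning down in such a strategy is the fromDate/toDate formatting: Gnip's replay endpoints expect minute-granularity UTC timestamps, so the SimpleDateFormat should be forced to UTC rather than left on the JVM default time zone. A self-contained sketch:

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Sketch of the fromDate/toDate formatting a replay URI strategy needs:
// minute-granularity timestamps (yyyyMMddHHmm), pinned to UTC so results
// do not depend on the JVM's default time zone.
public final class ReplayDateFormatSketch {
    static String format(final Date d) {
        final DateFormat f = new SimpleDateFormat("yyyyMMddHHmm");
        f.setTimeZone(TimeZone.getTimeZone("UTC")); // replay dates are UTC
        return f.format(d);
    }

    public static void main(String[] args) {
        final Date epoch = new Date(0L); // 1970-01-01T00:00:00Z
        assert format(epoch).equals("197001010000");

        // Hypothetical URI assembled the way the replay strategy would.
        final String uri = String.format(
            "/replay/powertrack/accounts/%s/publishers/%s/%s.json?fromDate=%s&toDate=%s",
            "acct", "twitter", "prod", format(epoch), format(new Date(60_000L)));
        assert uri.endsWith("fromDate=197001010000&toDate=197001010001");
        System.out.println("ok");
    }
}
```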