Comments (21)
Hey Ken, let me try to reproduce that. Can you please give me the environment details? OS type and version, which Python client release, and which server version are you working with (on which OS, version), etc.
from aerospike-client-python.
Thanks! Let's see:
- Python 2.6.9
- Both server and client are on: Linux version 3.10.42-52.145.amzn1.x86_64 (mockbuild@gobi-build-64003) (gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) SMP Tue Jun 10 23:46:43 UTC 2014
- Python client version: 1.0.41
- Server version: aerospike-amc-enterprise-3.5.4-el5.x86_64-3.rpm
from aerospike-client-python.
A couple of questions:
- Do you also see this behavior with release 1.0.44?
- Are you saying that you're missing records or that for certain sets the entire query fails (nothing is printed at all from the foreach() callback)?
- If you modify your code to use results(), is data missing, or is it only happening in foreach()?
- Can you confirm that there is a secondary index on the bin 'index' in all the sets? Can you go into AQL and check with show indexes?
from aerospike-client-python.
- Yes, it is reproduced in 1.0.44.
- Nothing is printed from the foreach. What should a missing record look like in contrast?
- Unfortunately so. We are receiving [] from query.results() even though we know there is data in the set.
- Yes. I can confirm that we are receiving no results while there are secondary indices on the bin.
The following is roughly what our tables look like:
+-------------------------+--------------------+------+------+----------+------------+-----+-------------+--------+
| index | value | bids | wins | spend | cost | fee | impressions | clicks |
+-------------------------+--------------------+------+------+----------+------------+-----+-------------+--------+
| "index" | "value" | 105 | 53 | 2661136 | 2866136 | 0 | 40 | 2 |
+-------------------------+--------------------+------+------+----------+------------+-----+-------------+--------+
We have multiple of these. We query them separately and aggregate the results. Many of these are not returning results via the Python client while we receive them from NodeJS client and AQL.
Thanks again.
from aerospike-client-python.
I think this behavior goes back at least a couple months, but I'm not sure which version I first saw it in.
from aerospike-client-python.
Hi Ken. I haven't yet been able to reproduce the issue, but I noticed a mistake in the callback function. It should take a single parameter, a tuple (key, meta, bins), rather than three separate arguments. What happens if you change your script to:
def print_result((k,m,r)):
    print r
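The tuple-parameter form above is Python 2-only syntax (Python 3 removed it). As a minimal sketch, assuming each record reaches the callback as one (key, meta, bins) tuple as described, the same callback can unpack inside the body so it runs on either version. The `seen` list and the `sample` record are illustrative stand-ins, not part of the client.

```python
from __future__ import print_function

# Collected bins, kept only so the sketch can be inspected afterwards.
seen = []


def print_result(record):
    # Each record is a single (key, meta, bins) tuple; unpack it here
    # instead of in the signature, which Python 3 no longer allows.
    key, meta, bins = record
    seen.append(bins)
    print(bins)


# Stand-in record shaped like the query output quoted in this thread.
sample = (('test', 'demo', None, None),
          {'gen': 1, 'ttl': 4294967295},
          {'index': 'foo:bar'})
print_result(sample)
```

With a real query the function would be passed as query.foreach(print_result).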
Also, can you show me what is in the setnames list? If you can post the output of show indexes in AQL, even a partial one, that would help.
One more thing, is it the set name None that is the problem? A query over a namespace and the None set will give you records that are not part of any formal set. This is different from how a scan over a namespace with a None set behaves (that would return all the records in the namespace).
from aerospike-client-python.
We're experiencing similar problems with the latest (to date) AS driver and Python 2.7.
The code below returns no results (it is a slight abstraction on top of predicate and query):
uspib.search_by_field_value('iid_pn', concat([input_split['import_id'], page_num]))
but when using a local variable it magically works
x = concat([input_split['import_id'], page_num])
uspib.search_by_field_value('iid_pn', x)
Which makes me think it's somehow related to the reference count of the variables.
from aerospike-client-python.
My coworker @pauloborges just showed me this
In [44]: query = client.query('test', 'demo')
In [45]: query.where(p.equals('index', 'foo:bar'))
Out[45]: <aerospike.Query at 0x1f309f0>
In [46]: query.results()
Out[46]: []
In [47]: query = client.query('test', 'demo'); query.where(p.equals('index', 'foo:bar')); query.results()
Out[47]:
[(('test',
'demo',
None,
bytearray(b'g-\xdemG\x10\xacIQ\xa1\x95bU\xc0RK3\xffl\xcd')),
{'gen': 1, 'ttl': 4294967295},
{'index': 'foo:bar'})]
from aerospike-client-python.
@rbotzer Thanks for following up.
The argument error doesn't exist in our production code, it was my mistake when transcribing. Sorry about that.
The following is a snippet of show indexes. Hopefully it'll give you some insights.
+-----------+---------+-------------------------+----------+-------+----------------------------------------------+------------+--------+
| ns | bins | set | num_bins | state | indexname | sync_state | type |
+-----------+---------+-------------------------+----------+-------+----------------------------------------------+------------+--------+
| "week_2x" | "index" | "live_stats_utc_sun_08" | 1 | "RW" | "idx__index__week_2x__live_stats_utc_sun_08" | "synced" | "TEXT" |
| "week_2x" | "index" | "live_stats_utc_mon_08" | 1 | "RW" | "idx__index__week_2x__live_stats_utc_mon_08" | "synced" | "TEXT" |
| "week_2x" | "index" | "live_stats_utc_mon_23" | 1 | "RW" | "idx__index__week_2x__live_stats_utc_mon_23" | "synced" | "TEXT" |
| "week_2x" | "index" | "live_stats_utc_wed_14" | 1 | "RW" | "idx__index__week_2x__live_stats_utc_wed_14" | "synced" | "TEXT" |
| "week_2x" | "index" | "live_stats_utc_thu_06" | 1 | "RW" | "idx__index__week_2x__live_stats_utc_thu_06" | "synced" | "TEXT" |
I don't believe we have any None set.
While I'm not sure it's the same cause as the one @pauloborges described, we are observing similar results.
from aerospike-client-python.
This is definitely something I want to prioritize because it's affecting core functionality, and because to be honest it's a weird bug. I will spin up similar EC2 instances and see if I can replicate it there.
Can I ask you to please run the unit tests, especially the following ones?
py.test -v test_query.py
py.test -v test_scan.py
from aerospike-client-python.
Thanks @rbotzer
Ok, I had to do some tweaking because our cluster doesn't have a test.demo namespace/set, but the modified test is throwing a bunch of errors that look like this:
E InvalidRequest: (4L, 'Invalid type must be [functional, userland, default]', 'src/main/client/sec_index.c', 502)
Does the test not match the driver? Our driver is now at 1.0.44 as of this test.
from aerospike-client-python.
If you removed the test namespace (which is there by default) you'll have to edit the tests. demo is a set inside test, and will be created automatically.
Running py.test -v should return detailed error information (the lines around that error). Can you paste it here? This seems pertinent, because on our various QA environments we don't see that. I'm hoping it will help us focus in the right direction.
from aerospike-client-python.
This should be fixed by release >= 1.0.45. Please verify.
from aerospike-client-python.
Unfortunately, we are still seeing the issue.
Apologies for not being able to get to the unit test. We don't have a test namespace in our cluster, making it a little messier to run. Are there internals from the query instance that we can quickly print out to get you some extra info?
from aerospike-client-python.
Hey, Ken. It's fairly easy to add the test namespace back in, even as an in-memory one. You change aerospike.conf and restart a node, then do the same for the next nodes in the cluster. I'll get back to you about the method for getting extra information.
namespace test {
    storage-engine memory
    memory-size 2G
    replication-factor 2
    high-water-memory-pct 60
    stop-writes-pct 90
    default-ttl 0
}
from aerospike-client-python.
Release 1.0.46 is now available. I'd appreciate it if you tried it and, if the error is still present, ran the tests and quoted their output.
from aerospike-client-python.
Again, running the query tests and quoting the results would be very helpful if you are having this issue.
The following are two scripts that I'm using to try and reproduce the problem.
issue56_prep.py
from __future__ import print_function
import aerospike
from aerospike.exception import *
import sys

config = {'hosts': [('192.168.119.3', 3000)]}
try:
    client = aerospike.client(config).connect()
except ClientError as e:
    print("Error: {0} [{1}]".format(e.msg, e.code))
    sys.exit(1)

# Write records 1..299; ival advances at a different rate in each id range,
# so later ival values are shared by more records.
distribution = {}
ival = 0
rid = 1
while rid < 300:
    if rid <= 100 and (rid % 3 != 0):
        ival = ival + 1
    elif (rid > 100 and rid <= 200) and (rid % 5 == 0):
        ival = ival + 1
    elif (rid > 200 and rid <= 300) and (rid % 10 == 0):
        ival = ival + 1
    try:
        distribution[ival] = distribution[ival] + 1
    except KeyError:
        distribution[ival] = 1
    try:
        client.put(('test','i56',rid), {'id':rid, 'ival':ival, 'sval':str(ival)})
    except RecordError as e:
        print("Error: {0} [{1}]".format(e.msg, e.code))
        sys.exit(2)
    rid = rid + 1

# Secondary indexes on the integer and string bins used by the test queries.
client.index_integer_create('test', 'i56', 'ival', 'i56_int_idx')
client.index_string_create('test', 'i56', 'sval', 'i56_str_idx')
print(distribution)
client.close()
The distribution of ival values is printed when this script is done.
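For anyone who wants to know the expected counts before touching a cluster, here is a rough sketch that replays the same ival assignment rules with no Aerospike connection at all. The function name expected_distribution is made up for illustration; the thresholds are copied from the prep script.

```python
from __future__ import print_function


def expected_distribution(first=1, last=299):
    """Count how many record ids share each ival, mirroring the prep script."""
    distribution = {}
    ival = 0
    for rid in range(first, last + 1):
        # Same three id ranges as the prep script's if/elif chain.
        if rid <= 100:
            bump = rid % 3 != 0
        elif rid <= 200:
            bump = rid % 5 == 0
        else:
            bump = rid % 10 == 0
        if bump:
            ival += 1
        distribution[ival] = distribution.get(ival, 0) + 1
    return distribution


dist = expected_distribution()
for ival in (96, 74, 6, 7):
    print(ival, dist[ival])
```

The four values looked up here are the ones the test script queries, and the counts should come out as 10, 5, 2 and 1 respectively.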
issue56_test.py
from __future__ import print_function
import aerospike
from aerospike.exception import *
from aerospike import predicates as p
import sys

config = {'hosts': [('192.168.119.3', 3000)]}
try:
    client = aerospike.client(config).connect()
except ClientError as e:
    print("Error: {0} [{1}]".format(e.msg, e.code))
    sys.exit(1)
# query for an expected 10 record result:
query = client.query("test", "i56")
query.where(p.equals("ival", 96))
res = query.results()
print(res)
print("There are ", len(res), " results for this query")
print('-----------------------------------------')
# query for an expected 5 record result:
query = client.query("test", "i56")
query.where(p.equals("ival", 74))
res = query.results()
print(res)
print("There are ", len(res), " results for this query")
print('-----------------------------------------')
# query for an expected 2 record result:
query = client.query("test", "i56")
query.where(p.equals("ival", 6))
res = query.results()
print(res)
print("There are ", len(res), " results for this query")
print('-----------------------------------------')
# query for an expected 1 record result:
query = client.query("test", "i56")
query.where(p.equals("ival", 7))
res = query.results()
print(res)
print("There are ", len(res), " results for this query")
client.close()
So far I have seen the expected results with OS X 10.10, Debian 7, Ubuntu 14.04. If you are getting unexpected results please copy and paste them in a comment, and also check in your AQL for whether the results there are mismatched.
AQL output
aql> select * from test.i56 where ival=96
+------+------+-----+
| ival | sval | id |
+------+------+-----+
| 96 | "96" | 293 |
| 96 | "96" | 298 |
| 96 | "96" | 290 |
| 96 | "96" | 292 |
| 96 | "96" | 294 |
| 96 | "96" | 299 |
| 96 | "96" | 296 |
| 96 | "96" | 297 |
| 96 | "96" | 291 |
| 96 | "96" | 295 |
+------+------+-----+
10 rows in set (0.032 secs)
aql> select * from test.i56 where ival=74
+------+------+-----+
| ival | sval | id |
+------+------+-----+
| 74 | "74" | 139 |
| 74 | "74" | 137 |
| 74 | "74" | 136 |
| 74 | "74" | 138 |
| 74 | "74" | 135 |
+------+------+-----+
5 rows in set (0.008 secs)
aql> select * from test.i56 where ival=6
+------+------+----+
| ival | sval | id |
+------+------+----+
| 6 | "6" | 9 |
| 6 | "6" | 8 |
+------+------+----+
2 rows in set (0.006 secs)
aql> select * from test.i56 where ival=7
+------+------+----+
| ival | sval | id |
+------+------+----+
| 7 | "7" | 10 |
+------+------+----+
1 row in set (0.007 secs)
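To make a mismatch with the AQL rows above easy to report, a small helper along these lines could list which ids the Python client did not return. This is only a sketch: missing_ids is a hypothetical name, and the `partial` list is hand-written sample data shaped like query.results() output (a list of (key, meta, bins) tuples), not a real result.

```python
from __future__ import print_function


def missing_ids(results, expected_ids, id_bin='id'):
    """Return the expected ids that are absent from a results() list."""
    returned = set(bins[id_bin] for (_key, _meta, bins) in results)
    return sorted(set(expected_ids) - returned)


# ids AQL returned for ival=74 in the output above.
aql_ids = [139, 137, 136, 138, 135]

# Hand-written stand-in for a partial Python result (two of five records).
partial = [
    (('test', 'i56', None, None), {'gen': 1, 'ttl': 0},
     {'id': 135, 'ival': 74, 'sval': '74'}),
    (('test', 'i56', None, None), {'gen': 1, 'ttl': 0},
     {'id': 138, 'ival': 74, 'sval': '74'}),
]

print(missing_ids(partial, aql_ids))
```

An empty list means the Python client and AQL agree for that query.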
from aerospike-client-python.
With the help of @arthurprs and @pauloborges we have debugged this issue, and it is particular to EC2.
Overview
EC2 nodes have a private IP, that is visible in their subnet within their availability zone, and a public IP. Application nodes (where the client lives) may or may not be able to access the private IP, for example if they connect from a different zone.
Initial Connection
When the client connects to the public IP of a cluster node it will inquire about the other nodes in the cluster. This is equivalent to running asinfo -v 'services' on that node. Depending on its configuration, the cluster may respond with unreachable private IPs.
By default, the access-address config parameter will be set to any, which means it will expose its own IP. In EC2 the node knows only of its private IP.
Consequences
Key-Value Operations
These will continue to occur by proxy. The client will send all reads and writes to the single node it has access to, and that node will proxy those operations. You will see a high number of proxy events, which normally only show up during migrations.
Queries
Queries do not proxy. As a result, the client will send the query request only to the single node it has a connection to. The records matched against the secondary index will stream back from that node alone, giving fewer records than expected. No data will come back from any node that is unreachable by the client. This is not ideal behavior, as it would be better for the client to give a clear error rather than a partial result. I have opened an internal ticket for this (AER-3903).
Workaround
This problem is specific to a cloud environment, such as EC2. There are two workarounds:
- Locate the application (client) nodes in the same availability zone as the server nodes. This will also result in lower latency between them.
- If the application nodes may be in a different zone, configure the access-address to be the public IP of the node. Even if the clients are all in the same zone, ensure that each client node can access the private IP of all the cluster nodes. Again, use asinfo or telnet to port 3000 on the private IPs to determine this.
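As an alternative to telnet, the reachability check in the second workaround can be scripted from an application node. The following is a minimal sketch using only the standard library; is_reachable and unreachable_nodes are illustrative names, and a successful TCP connect only shows that the port is open, not that the cluster is healthy.

```python
import socket


def is_reachable(host, port=3000, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, timed out, or otherwise not connectable from this machine.
        return False
    conn.close()
    return True


def unreachable_nodes(addresses, timeout=2.0):
    """Filter a list of (host, port) pairs down to the ones that fail."""
    return [(host, port) for (host, port) in addresses
            if not is_reachable(host, port, timeout)]
```

Calling unreachable_nodes with the private (host, 3000) pairs of every cluster node, from each application node, should return an empty list if the cluster is fully visible to the client.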
@whosken and @ryanwitt please check if this is your problem, and let me know if the workarounds solve it. Thank you everyone for your help in identifying and debugging this problem.
from aerospike-client-python.
@rbotzer Thanks for your patience and efforts.
I have verified that our cluster nodes are all in the same availability zone. While the cause may be the same, the workaround may not resolve our situation. I have also verified with release 1.0.46, and still found the faulty behavior.
A similar bug is not present in the Node driver, however. I am uncertain why. Would that suggest there's a workaround or solution we can apply on the client side?
from aerospike-client-python.
@whosken thanks for looking into it. Did you also try asinfo -v services on all the app nodes? That should show you the IPs of the other nodes in the cluster, while asinfo -v service shows you the IP of the node you are connected to. Those should give you a good idea if the cluster is defined correctly and accessible from all the app nodes. I'd appreciate if you do that.
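If it helps when reading that output, here is a rough Python 3 sketch that splits a services value into (host, port) pairs and flags the ones in RFC 1918 private ranges, which are the addresses a client outside the subnet may not be able to reach. It assumes the value is a semicolon-separated list of ip:port entries with literal IPv4 addresses; that format, the sample string, and the function names are assumptions for illustration, so adjust to what your nodes actually print.

```python
import ipaddress

# RFC 1918 private IPv4 ranges.
RFC1918 = [ipaddress.ip_network(net)
           for net in ('10.0.0.0/8', '172.16.0.0/12', '192.168.0.0/16')]


def parse_services(info_value):
    """Split 'ip:port;ip:port' (assumed format) into (host, port) tuples."""
    nodes = []
    for entry in info_value.strip().split(';'):
        if not entry:
            continue
        host, _, port = entry.rpartition(':')
        nodes.append((host, int(port)))
    return nodes


def private_nodes(nodes):
    """Keep only the nodes whose address falls in an RFC 1918 range."""
    flagged = []
    for host, port in nodes:
        addr = ipaddress.ip_address(host)  # raises ValueError for hostnames
        if any(addr in net for net in RFC1918):
            flagged.append((host, port))
    return flagged


# Made-up sample value; not taken from any real cluster.
sample = "10.0.1.15:3000;10.0.1.16:3000;203.0.113.7:3000"
print(private_nodes(parse_services(sample)))
```

Any address flagged here is worth testing for connectivity from the app node that runs the Python script.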
Beyond that, I am unsure. We seemed to see the same problem with AQL, which also wraps around the C client, and that suggested it wasn't specifically in Python. Today was a long day 😫 and I think I've maxed out on problem-solving. I will be talking to the main C client developer tomorrow about it, though.
PS: did you run AQL and node.js on the same app nodes as where your Python scripts run?
from aerospike-client-python.
One last try to collect information. Release 1.0.49 and server release 3.5.15 are out. Please try again. In case the problem remains please answer the questions I had up the thread, and I'll reopen.
from aerospike-client-python.