honeycombio / beeline-python
Legacy instrumentation for your Python apps with Honeycomb.
License: Apache License 2.0
Issue created in the sentry repository @ getsentry/sentry-python#442
Details:
I'm activating Sentry at the top of settings.py (docs):
sentry_sdk.init(
    dsn=dsn,
    environment=env,
    integrations=[DjangoIntegration()],
)
beeline is initialized in the gunicorn config file (docs):
def post_worker_init(worker):
    beeline.init(
        writekey=honeycomb_key,
        dataset=dataset,
        service_name=service,
    )
I can manually send errors to Sentry, but automatic error reporting doesn't work unless Honeycomb is removed.
Partially redacted requirements.txt
Django~=2.1.9
djangorestframework~=3.9.1
gunicorn~=19.9.0
sentry-sdk~=0.10.2
honeycomb-beeline~=2.6.1
libhoney~=1.8.0
statsd~=3.3.0
So far, I have been able to turn off Honeycomb in my Django app during CircleCI runs like this:
import beeline
from django.apps import AppConfig
from django.conf import settings

class CoreConfig(AppConfig):
    name = "core"

    def ready(self):
        if settings.HONEYCOMB_ON:
            beeline.init(
                writekey=settings.HONEYCOMB_API_KEY,
                dataset=settings.HONEYCOMB_DATASET,
                service_name=settings.HONEYCOMB_SERVICE_NAME,
                debug=settings.DEBUG,
            )
So in CircleCI, settings.HONEYCOMB_ON is false, and this worked up to 2.12, but with the new 2.13 version it no longer works for me. In my CircleCI run, I get the following:
Traceback (most recent call last):
File "/home/circleci/repo/config/tests/test_redirect.py", line 8, in test_redirect
response = self.client.get("/")
File "/home/circleci/repo/venv/lib/python3.7/site-packages/django/test/client.py", line 535, in get
response = super().get(path, data=data, secure=secure, **extra)
File "/home/circleci/repo/venv/lib/python3.7/site-packages/django/test/client.py", line 347, in get
**extra,
File "/home/circleci/repo/venv/lib/python3.7/site-packages/django/test/client.py", line 422, in generic
return self.request(**r)
File "/home/circleci/repo/venv/lib/python3.7/site-packages/django/test/client.py", line 503, in request
raise exc_value
File "/home/circleci/repo/venv/lib/python3.7/site-packages/django/core/handlers/exception.py", line 34, in inner
response = get_response(request)
File "/home/circleci/repo/venv/lib/python3.7/site-packages/beeline/middleware/django/__init__.py", line 144, in __call__
response = self.create_http_event(request)
File "/home/circleci/repo/venv/lib/python3.7/site-packages/beeline/middleware/django/__init__.py", line 104, in create_http_event
dr = DjangoRequest(request)
File "/home/circleci/repo/venv/lib/python3.7/site-packages/beeline/middleware/django/__init__.py", line 12, in __init__
beeline.get_beeline().log(request.META)
AttributeError: 'NoneType' object has no attribute 'log'
----------------------------------------------------------------------
How do I turn off Honeycomb during the CircleCI run of my Django app?
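Since the AttributeError is raised by the beeline Django middleware running against an uninitialised beeline, one workaround is to make the middleware itself conditional in settings.py. This is a sketch under the assumption that the middleware is currently listed unconditionally; the HONEYCOMB_ON environment toggle is illustrative:

```python
# Hypothetical settings.py fragment: only install the beeline middleware when
# Honeycomb is enabled, so test requests never reach an uninitialised beeline.
import os

HONEYCOMB_ON = os.environ.get("HONEYCOMB_ON", "false").lower() == "true"

MIDDLEWARE = [
    "django.middleware.common.CommonMiddleware",
]

if HONEYCOMB_ON:
    # Middleware is referenced by dotted path, so this line is harmless when
    # Honeycomb is disabled: the beeline import simply never happens.
    MIDDLEWARE.insert(0, "beeline.middleware.django.HoneyMiddleware")
```

With the middleware absent, no request ever reaches `beeline.get_beeline().log(...)`, so the CircleCI run should no longer crash.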
Currently, if an exception is thrown during Lambda execution, it does not end up in Honeycomb unless it is explicitly handled/recorded.
It would be nice if the Lambda wrapper had a top-level error handler that catches, logs, and rethrows.
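A minimal sketch of what that top-level handler could look like; the `report` callable stands in for whatever the beeline would actually do (e.g. add error fields to the current event and flush), so none of these names are beeline's real API:

```python
import functools

def catch_and_report(handler, report):
    """Wrap a Lambda-style handler: on exception, report error fields,
    then rethrow so the Lambda invocation still fails as before."""
    @functools.wraps(handler)
    def wrapper(event, context=None):
        try:
            return handler(event, context)
        except Exception as exc:
            # Record the error before rethrowing so it reaches Honeycomb.
            report({"error": type(exc).__name__, "error_detail": str(exc)})
            raise
    return wrapper
```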
Specifically, on this page: https://docs.honeycomb.io/beeline/python/#using-traces the name of the argument passed to the context manager is parent_id, not parent_span_id.
I just tried to set up honeycomb in my relatively simple (admin site only) Django 2.1.5 app with a Postgres 11.1 DB running locally, configured with the middleware added, and:
from django.apps import AppConfig  # type: ignore
from django.conf import settings  # type: ignore
import beeline  # type: ignore

class LettersConfig(AppConfig):
    name = "letters"

    def ready(self):
        beeline.init(
            writekey=settings.HONEYCOMB_WRITE_KEY,
            dataset="artandtybie",
            service_name="my-app-name",
            debug=True,
        )
I have so far not received any events in the Honeycomb UI, and see a lot of these messages in the logs:
2019-02-09 04:06:31,892 - honeycomb-sdk-xmit - DEBUG - enqueuing response = {'status_code': 0, 'body': '', 'error': TypeError('Object of type datetime is not JSON serializable'), 'duration': 5.233049392700195, 'metadata': None}
Maybe they're coming from trying to serialize query args for a query against a model with a date field?
2019-02-09 04:06:31,633 - honeycomb-sdk - DEBUG - send enqueuing event ev = {'service_name': 'my-app-name', 'meta.beeline_version': '2.4.6', 'meta.local_hostname': 'localhost.localdomain', 'name': 'django_postgresql_query', 'trace.trace_id': '58a65be3-31a9-4c70-a9cd-8068459d30ef', 'trace.parent_id': '474479f5-e10b-474c-94f3-4830afc4e116', 'trace.span_id': '2abbba28-578c-4b0e-b27c-f9844c93df85', 'type': 'db', 'db.query': 'SELECT "django_session"."session_key", "django_session"."session_data", "django_session"."expire_date" FROM "django_session" WHERE ("django_session"."expire_date" > %s AND "django_session"."session_key" = %s)', 'db.query_args': (datetime.datetime(2019, 2, 9, 4, 6, 31, 626689, tzinfo=<UTC>), 'xxxxxxxx'), 'db.duration': 2.083, 'db.last_insert_id': 0, 'db.rows_affected': 1, 'duration_ms': 2.4789999999999996}
Am I configuring this Beeline correctly?
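The error message suggests libhoney cannot JSON-serialise the datetime in db.query_args. One possible workaround (a sketch, not official guidance; the hook signature matches the presend_hook parameter that beeline.init accepts) is to stringify datetimes before the event is sent:

```python
import datetime

def stringify_datetimes(fields):
    """Presend-hook sketch: convert datetime values, including those nested
    in query-arg tuples/lists, to ISO strings so the event is serialisable."""
    for key, value in list(fields.items()):
        if isinstance(value, datetime.datetime):
            fields[key] = value.isoformat()
        elif isinstance(value, (tuple, list)):
            fields[key] = [
                v.isoformat() if isinstance(v, datetime.datetime) else v
                for v in value
            ]

# beeline.init(..., presend_hook=stringify_datetimes)  # illustrative usage
```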
The Go beeline supports adding a dataset field to the trace propagation header (x-honeycomb-trace). The effect of including this field is that spans from the downstream service should be sent to the indicated dataset. The rest of the beelines should implement this addition.
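For the other beelines, the change amounts to one optional field in the serialized header value. A hedged sketch of what that could look like (the "1;key=value,..." payload style is assumed here, and field ordering is illustrative, not verified against the Go implementation):

```python
def marshal_trace_header(trace_id, parent_id, dataset=None):
    """Build an x-honeycomb-trace style value, optionally carrying a
    dataset field as the Go beeline does (format assumed, not verified)."""
    fields = ["trace_id=%s" % trace_id, "parent_id=%s" % parent_id]
    if dataset is not None:
        fields.append("dataset=%s" % dataset)
    return "1;" + ",".join(fields)
```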
I'm not entirely sure of the reason for this bit of code (https://github.com/honeycombio/beeline-python/blob/main/beeline/middleware/flask/__init__.py#L48-L52), but if I catch an exception in my code (logging it with Sentry) and then return a 500 status code, the root trace is never closed, and it shows up as missing the root trace in the Honeycomb trace view.
I temporarily worked around this by patching the honeycomb middleware:
class PatchedHoneyWSGIMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        req = Request(environ, shallow=True)
        wr = WSGIRequest("flask", environ)
        root_span = beeline.propagate_and_start_trace(wr.request_context(), wr)

        def _start_response(status, headers, *args):
            status_code = int(status[0:4])
            beeline.add_context_field("response.status_code", status_code)
            beeline.finish_trace(root_span)
            return start_response(status, headers, *args)

        return self.app(environ, _start_response)

beeline.middleware.flask.HoneyWSGIMiddleware = PatchedHoneyWSGIMiddleware
This makes sure the trace root is always closed, but it is probably missing something.
Waitress is an increasingly popular application server in cloud contexts (Google Cloud Run, for example), as it buffers incoming requests and can be used without being fronted by Nginx or some other buffering server.
https://docs.honeycomb.io/getting-data-in/python/beeline/#using-the-python-beeline-with-python-pre-fork-models mentions a few different ways to initialise the library in a pre-fork environment.
I don't believe waitress uses a forking model. My understanding (basic) is that it uses asyncio to buffer individual requests, and a threadpool to execute the application.
Does the beeline need to be initialised in a special way with Waitress?
In HoneyMiddlewareBase.create_http_event, I can see that request.POST.dict() is being called. However, it would be much more useful to me to have request.data. As outlined in the Django API guide, request.POST does not have all the information we need. See here.
Is this something we can consider adding to the trace context?
Suggestion for supporting CherryPy here:
beeline.init(writekey=honeycombApiToken, dataset='xyz', service_name='something.or.other')

def reportBeeline(func):
    def wrapper(*args, **kwargs):
        trace = beeline.start_trace()
        beeline.add_context({'method': cherrypy.request.method, 'endpoint': cherrypy.request.path_info})
        try:
            result = func(*args, **kwargs)
        except Exception:
            beeline.finish_trace(trace)
            raise
        beeline.finish_trace(trace)
        return result
    return wrapper
Then, for each function that deals with an HTTP request, simply decorate it with the reportBeeline decorator:
@cherrypy.tools.json_out(handler=dumper)
@reportBeeline
def servePage():
    return 'test'
I'm updating the beeline in one of our applications and found this in the changelog: request.route would clash with some instrumentation we already have. We've used request.url_rule from Flask as the field value for request.route. For anyone not familiar, this returns something like /user/<username>, rather than endpoint, which returns the function name.
This is the line I'm referring to:
I'm happy to PR a fix which would implement the following:
beeline.add_field("request.endpoint", flask.request.endpoint)
beeline.add_field("request.route", flask.request.url_rule)
Other beelines instrument the "pattern" of a request rather than the function name; some examples:
Go: https://github.com/honeycombio/beeline-go/blob/ca594899bf23a4e2496df8fd214bd7e0455ffb0f/wrappers/hnyecho/echo_test.go#L60
NodeJS/Express (returns /users/:user_id): https://github.com/honeycombio/beeline-nodejs/blob/7e9ee85b5db91dce469770599019e78730ff12d0/lib/instrumentation/express.js#L143
Ruby: https://www.honeycomb.io/blog/honeybyte-beeline-dev-molly-struve/
Tested with beeline versions (error happens in both):
When multiple database sessions are created (e.g. via Python threading) and a new app is created with a SQLAlchemy session, the following is observed:
Traceback (most recent call last):
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1281, in _execute_context
self.dispatch.after_cursor_execute(
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
fn(*args, **kw)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/beeline/middleware/flask/__init__.py", line 113, in after_cursor_execute
query_duration = datetime.datetime.now() - self.query_start_time
AttributeError: 'HoneyDBMiddleware' object has no attribute 'query_start_time'
A Flask app which creates a SQLAlchemy thread, run with:
flask run --with-threads
The calling code, which is using Thread and creating a new SqlAlchemy Session (due to create_app factory function): https://github.com/Subscribie/subscribie/blob/ccda2c81f432b3104f58c1226fd31e50d3319fee/subscribie/email.py#L23,L74
A full traceback is below:
Traceback (most recent call last):
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1281, in _execute_context
self.dispatch.after_cursor_execute(
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
fn(*args, **kw)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/beeline/middleware/flask/__init__.py", line 113, in after_cursor_execute
query_duration = datetime.datetime.now() - self.query_start_time
AttributeError: 'HoneyDBMiddleware' object has no attribute 'query_start_time'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/app.py", line 2464, in __call__
return self.wsgi_app(environ, start_response)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/beeline/middleware/flask/__init__.py", line 56, in __call__
return self.app(environ, _start_response)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/app.py", line 2450, in wsgi_app
response = self.handle_exception(e)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask_cors/extension.py", line 165, in wrapped_function
return cors_after_request(app.make_response(f(*args, **kwargs)))
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask_cors/extension.py", line 165, in wrapped_function
return cors_after_request(app.make_response(f(*args, **kwargs)))
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/app.py", line 1867, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask_cors/extension.py", line 165, in wrapped_function
return cors_after_request(app.make_response(f(*args, **kwargs)))
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask_cors/extension.py", line 165, in wrapped_function
return cors_after_request(app.make_response(f(*args, **kwargs)))
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/chris/Documents/programming/python/app/app/blueprints/checkout/__init__.py", line 192, in thankyou
send_welcome_email()
File "/home/chris/Documents/programming/python/app/app/email.py", line 79, in send_welcome_email
return render_template("thankyou.html")
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/templating.py", line 136, in render_template
ctx.app.update_template_context(context)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/flask/app.py", line 838, in update_template_context
context.update(func())
File "/home/chris/Documents/programming/python/app/app/views.py", line 60, in inject_template_globals
company = Company.query.first()
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3429, in first
ret = list(self[0:1])
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3203, in __getitem__
return list(res)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
return self._execute_and_instances(context)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
ret = self._execute_context(
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
self._handle_dbapi_exception(
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1508, in _handle_dbapi_exception
util.raise_(newraise, with_traceback=exc_info[2], from_=e)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1281, in _execute_context
self.dispatch.after_cursor_execute(
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
fn(*args, **kw)
File "/home/chris/Documents/programming/python/app/venv/lib/python3.8/site-packages/beeline/middleware/flask/__init__.py", line 113, in after_cursor_execute
query_duration = datetime.datetime.now() - self.query_start_time
AttributeError: '_thread._local' object has no attribute 'span'
I'd like to use Honeycomb with Pyramid (https://trypyramid.com/)
I gave it a try like this, but I'm not seeing data show up at Honeycomb:
if __name__ == "__main__":
    config = Configurator()
    config.add_route(ENDPOINT, "/{}".format(ENDPOINT))
    config.scan()
    app = config.make_wsgi_app()
    wrapped_app = HoneyWSGIMiddleware(app)
    server = make_server("0.0.0.0", 5000, wrapped_app)
    server.serve_forever()
Has anyone else done this already?
In the Django beeline, we see that db.query_args is always added to the context when making a query.
https://github.com/honeycombio/beeline-python/blob/master/beeline/middleware/django/__init__.py#L28
It would be nice if this were somewhat overrideable, similar to how it's now possible for the middleware since https://github.com/honeycombio/beeline-python/pull/73/files.
The way I think this would be best achievable would be to:
- add an attribute to HoneyMiddleware which specifies which class the db_wrapper has;
- beeline.add_context the result of another callable on the HoneyDBWrapper class.
Alternatively, this could of course also be handled by a presend hook.
Currently, I'm using beeline-python with Django, and so far it's working great. However, despite disabling Django logging and setting beeline.init(debug=False), beeline is spitting out way too many log statements.
I'm using beeline like this,
import os
import beeline
from django.apps import AppConfig

class OrdersConfig(AppConfig):
    name = "orders"

    if os.environ.get("HONEYCOMB_LOG") == "True":
        def ready(self):
            beeline.init(writekey=os.environ["HONEYCOMB_API_KEY"], dataset="lis-production",
                         service_name="lis", debug=False)
My Django settings have DEBUG=False. However, I'm seeing an excessive amount of request log statements on the console.
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Thu, 21 Jan 2021 20:05:31 GMT
header: Content-Type: application/json
header: Content-Length: 64
header: Connection: keep-alive
header: Access-Control-Allow-Origin: *
header: Content-Encoding: gzip
header: Vary: Accept-Encoding
2021-01-21 20:05:31,910 DEBUG https://api.honeycomb.io:443 "POST /1/batch/lis-production HTTP/1.1" 200 64
2021-01-21 20:05:31,910 DEBUG https://api.honeycomb.io:443 "POST /1/batch/lis-production HTTP/1.1" 200 64
send: b'POST /1/batch/lis-production HTTP/1.1\r\nHost: api.honeycomb.io\r\nUser-Agent: libhoney-py/1.10.0 beeline-python/2.16.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nContent-Encoding: gzip\r\nX-Honeycomb-Team: 632a7b8862ccc4cd53092d6e10c2fd58\r\nContent-Type: application/json\r\nContent-Length: 11606\r\n\r\n'
send: b'\x1f\x8b\x08\x00\xc8\xde\t`\x04\xff\xed}\x89\x8e\xe56v\xe8\xaf\x14\x1a\x18 \x06\xec\x82\xf6e\x80\x04\x98\xd8m\xc4\x03Ow\x9e\x17\x0c\x92qp\xa1\x85\xaa\x92\xfbn\xbeK/\x0e\xe6\xdf\xdf!\xc5\xe5P:\x92x\xed\xfb*\x83\x17\x96\x8dF\x15\xcf!E\x89\xd2\xd9\x97\xbf\xfd\xf7\xabK\xbfc\xaf\xfe\xf8\xf0*\n\xa2\xf0\x8b \xfc"\n\x7f\x88\x82?\x06\xd9\x1f\xe3\xe81\x0c\xd2(\xcc\xfe\xf3\xd5\xe7\x0f\xaf\xce\xd5\xee\xb8e\xa7\xea\xc2\x91C\x18h\xabK\x05\xbf\xfe\xf7\xab3;\xbd\xef\x1b\xb6\xd9W\xc3B\xdb\xfe\xcc\'
How can I get rid of these request logs in production?
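Those request lines come from the HTTP client layer's debug logging rather than from beeline's debug flag. One way to quiet them (an assumption about the loggers involved, based on the logger names visible in the output above, not official beeline guidance) is to raise the level on those loggers explicitly:

```python
import logging

# Logger names are assumptions taken from the output above: "urllib3" for the
# POST lines, and the libhoney transmission loggers for the enqueue messages.
for name in ("urllib3", "libhoney", "honeycomb-sdk", "honeycomb-sdk-xmit"):
    logging.getLogger(name).setLevel(logging.WARNING)
```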
Currently, Flask routes come in with a boring old root function of flask_http_get, and that's all you get. There's a request.path, but that doesn't really specify a route (especially if I pattern-match things internally, as I would in https://github.com/honeycombio/examples/blob/39c8732285c5f9cffaa728b7d55840c601039b8e/python-gatekeeper/app.py#L49). It would be much better for the span name to be the function name for the route, or at least for the function name for the route to be part of it. This should probably be its own field.
The query start time is being stashed on the shared object, rather than the thread-local storage:
As a consequence, db duration is incorrectly calculated when multiple concurrent threads are running.
Here's a simple repro:
import os
import time

import beeline
from beeline.middleware.flask import HoneyMiddleware
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

beeline.init(
    writekey=os.environ.get("HONEYCOMB_API_KEY"),
    dataset="concurrency-test",
    service_name="busted",
    debug=True,
)

# Pass your Flask app to HoneyMiddleware
app = Flask(__name__)
app.config[
    "SQLALCHEMY_DATABASE_URI"
] = "postgresql://postgres:password@localhost/honeycomb_test"
db = SQLAlchemy(app)
HoneyMiddleware(
    app, db_events=True
)  # db_events defaults to True, set to False if not using our db middleware with Flask-SQLAlchemy

@app.route("/sleep/<seconds>")
def sleepy(seconds):
    time.sleep(1)
    db.session.execute(f"SELECT pg_sleep({seconds})")
    time.sleep(1)
    return f"yawn! (slept {int(seconds) + 2} seconds)"
Call it with this script:
curl http://localhost:5000/sleep/5 &
sleep 4
curl http://localhost:5000/sleep/5
The second call generates a database span that has a five-second duration, but the db.duration is reported as ~1s.
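The fix this report implies can be sketched as keeping per-query state in threading.local instead of on the shared middleware object. The names below are illustrative, not beeline's actual implementation:

```python
import threading
import time

class QueryTimer:
    """Per-thread query timing: each thread sees its own start time."""

    def __init__(self):
        self._local = threading.local()

    def before_cursor_execute(self):
        self._local.query_start_time = time.perf_counter()

    def after_cursor_execute(self):
        start = getattr(self._local, "query_start_time", None)
        if start is None:
            return None  # no matching "before" on this thread
        return (time.perf_counter() - start) * 1000.0  # duration in ms
```

With the start time thread-local, a query on one thread can no longer clobber the start time of a concurrent query on another, which is exactly the failure mode in the repro above.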
I'm instrumenting one of our applications, which uses SQLAlchemy, but I found that when I had db_events=True I would get the error AttributeError: 'pyodbc.Cursor' object has no attribute 'lastrowid'. From my research I found that this property isn't part of the DB API spec, nor implemented in pyodbc (the driver we're using).
The offending line I found was:
Commenting it out, I don't get the AttributeError.
My proposed fix would be something along these lines:
- "db.last_insert_id": cursor.lastrowid,
+ "db.last_insert_id": getattr(cursor, 'lastrowid', None),
I can submit a PR for this if the fix works for you.
HoneyDBMiddleware will crash if you pass a complex type like a list as a query parameter.
Example:
body = "SELECT * FROM table WHERE id IN :item_ids"
query = text(body).params(
    item_ids=[1, 2, 3],
)
Results in:
TypeError: not all arguments converted during string formatting
Because the format string is expecting a single argument:
I guess the intention is just to stringify the parameter, in which case the following might do:
param += str(v)
It would be nice to just decorate a function and wrap it with a span. We have something similar in the lambda middleware that we could just generalize:
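A minimal sketch of such a decorator, assuming only the tracer context manager the beeline already exposes; the _tracer stand-in below makes the sketch self-contained (in a real app it would be beeline.tracer):

```python
import contextlib
import functools

@contextlib.contextmanager
def _tracer(name):
    # Stand-in for beeline.tracer so this sketch runs on its own.
    yield

def traced(name, tracer=_tracer):
    """Decorate a function so every call runs inside a named span."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with tracer(name):
                return func(*args, **kwargs)
        return wrapper
    return decorator

@traced("expensive-step")
def work(x):
    return x * 2
```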
This may not be desirable behaviour for all integrations, but I’d love for the query parameters to be added to the event emitted when using the requests library.
I wonder what people’s thoughts on this functionality are?
As seen in , the data in the event context is hardcoded. I think it would be very beneficial to have some way to override the data in the context without having to re-define create_http_event completely. If the contents of the context= parameter could be moved to a separate function, it would be easier to override without also having to re-implement the start_trace...finish_trace logic.
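A sketch of the proposed refactor; the method and class names here are hypothetical, not beeline's current API. Context construction moves into a hook that subclasses can override without touching the trace lifecycle:

```python
class HttpEventMiddlewareSketch:
    """Hypothetical base: create_http_event delegates field construction."""

    def get_context_from_request(self, request):
        # Override point: subclasses adjust fields without re-implementing
        # the start_trace ... finish_trace plumbing.
        return {
            "request.method": request.get("method"),
            "request.path": request.get("path"),
        }

    def create_http_event(self, request):
        context = self.get_context_from_request(request)
        # ... start_trace(context=context), dispatch, finish_trace() ...
        return context

class CustomMiddleware(HttpEventMiddlewareSketch):
    def get_context_from_request(self, request):
        context = super().get_context_from_request(request)
        context["request.team"] = request.get("team")
        return context
```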
We often run large cursors and wonder whether it would make more sense to fire this on execute() rather than on the low-level cursor. Tracking the timing of each sub-execute might be nice, but shouldn't the span track the entire execution? I wonder if there's metadata that we could accumulate as well.
beeline-python/beeline/patch/requests.py
Line 18 in cf6272b
I'm not sure how to get traces to show up across processes running beeline-python that RPC to each other using libraries like requests or urllib. Is it just supposed to work? Am I supposed to provide the same trace name across services? I'm not sure how to trace micro Flask services calling other micro Flask services.
I am looking here for either guidance or a fix.
I have a Flask app that runs in different environments, one of them being jupyter notebooks. As you could imagine, Jupyter notebook loads part of the app, crashing when tracer gets called, if uninitialised. I'd like to provide an experience where a developer does not have to load honeycomb setup in Jupyter to experiment with the code.
Here is the little helper package I wrote for myself to introduce a nice decorator for functions I want to trace:
import beeline
import functools
import config
import os
from beeline.middleware.flask import HoneyMiddleware

tracer = beeline.tracer
add_field = beeline.add_field
add_trace_field = beeline.add_trace_field

def init(app):
    beeline.init(writekey=config.HONEYCOMB_API_KEY, dataset='<redacted>', service_name='<redacted>', presend_hook=presend)
    HoneyMiddleware(app, db_events=False)

def presend(fields):
    fields['pid'] = os.getpid()

def traced(name):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with tracer(name):
                return func(*args, **kwargs)
        return wrapper
    return decorator
Exception
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-6-edff33869504> in <module>()
1 import model
----> 2 model.load_bot_model(6)
~/SageMaker/team-ml/sth/src/honeycomb.py in wrapper(*args, **kwargs)
24 @functools.wraps(func)
25 def wrapper(*args, **kwargs):
---> 26 with tracer(name):
27 return func(*args, **kwargs)
28 return wrapper
~/anaconda3/lib/python3.6/site-packages/beeline/__init__.py in tracer(name, trace_id, parent_id)
361 - `name`: a descriptive name for the this trace span, i.e. "database query for user"
362 '''
--> 363 return _GBL.tracer(name=name, trace_id=trace_id, parent_id=parent_id)
364
365 def start_trace(context=None, trace_id=None, parent_span_id=None):
AttributeError: 'NoneType' object has no attribute 'tracer'
I can see two different solutions here; one would be to have init assign the real tracer to beeline.tracer, and otherwise have just a fake one. What are your thoughts?
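The "fake tracer" option can be sketched as a no-op context manager that is returned whenever the beeline global has not been initialised. The _initialised flag stands in for beeline's internal state; none of this is beeline's real implementation:

```python
import contextlib

_initialised = False  # stand-in for beeline's global client state

@contextlib.contextmanager
def _noop_tracer(name=None, trace_id=None, parent_id=None):
    yield  # does nothing, but supports the `with` protocol

def tracer(name, trace_id=None, parent_id=None):
    if not _initialised:
        return _noop_tracer(name, trace_id, parent_id)
    raise NotImplementedError("would delegate to the real beeline tracer")
```

This way, code decorated with @traced works unchanged in Jupyter, where init is never called.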
The problem seems to be that the parameters argument to the before_cursor_execute listener is almost always (maybe always) a dict in my project. Beeline always treats it as if it's a list: https://github.com/honeycombio/beeline-python/blob/634a567/beeline/middleware/flask/__init__.py#L115.
Quote from the docs about the parameters argument:
"Dictionary, tuple, or list of parameters being passed to the execute() or executemany() method of the DBAPI cursor. In some cases may be None."
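Handling all three documented shapes could be sketched like this (a hedged suggestion, not beeline's actual code):

```python
def format_query_args(parameters):
    """Stringify DBAPI parameters whether they arrive as a dict,
    a tuple/list, or None (per the SQLAlchemy event docs)."""
    if parameters is None:
        return None
    if isinstance(parameters, dict):
        return {key: str(value) for key, value in parameters.items()}
    return [str(value) for value in parameters]
```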
The Python Beeline adds a field for the User-Agent to the http_server spans it creates. The field name request.user_agent is inconsistent with the other Beelines, which use request.header.user_agent. It would be great if these could be consistent to ease querying in deployments with polyglot services.
Example use:
with beeline.tracer("name") as trace:
    start_thread_and_pass_trace(trace)
Hi there 👋
I was wondering if there's a way to add a field to the current span, and its child spans, but not to its parent span, as currently happens with add_trace_field?
I know that I could manually add the fields to each span, but I'd love to not have to do that, if possible.
My use case is iterating over multiple "accounts", where the full iteration has a span, and each individual iteration has another span (and child spans), which attach the "account_id". Using add_trace_field for the account_id mostly works as I'd like, except that the parent span also has the final account_id attached to it, which is undesirable.
G'day! We're propagating trace context around our environment using W3C Trace Context, aka traceparent, following the OpenCensus convention and OpenTelemetry specification:
Traceparent: 00-D2E1605E07E0EEA1EDBCB72CA3DDEC23-50FEF9796F94C8B8-01
That doesn't fit so well with the X-Honeycomb-Trace header key hard-coded throughout this code base, preventing us from using the beeline to help us send events to Honeycomb from ALL THE THINGS.
Strikes me the headers could be configurable, perhaps in such a way that the beelines could accept and issue both if necessary. You up for a PR for that?
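For reference, a traceparent value splits into four fixed-width, hyphen-separated fields, so accepting it alongside X-Honeycomb-Trace is mostly a parsing question. A sketch of the W3C side (parsing only; how the result would feed into beeline's trace state is left out):

```python
def parse_traceparent(header):
    """Parse a W3C traceparent header: version-traceid-parentid-flags."""
    parts = header.split("-")
    if len(parts) != 4:
        return None
    version, trace_id, parent_id, flags = parts
    if len(trace_id) != 32 or len(parent_id) != 16:
        return None
    return {
        "trace_id": trace_id.lower(),
        "parent_id": parent_id.lower(),
        "sampled": int(flags, 16) & 0x01 == 0x01,
    }
```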
Hi,
I'm using Starlette and uvicorn for my Python web service. I've put together a middleware for Starlette based on the WSGI one I saw in this codebase. Worth saying that I have no idea what I'm doing, but I think the code below is what is required.
import beeline
from starlette.datastructures import Headers
from starlette.types import ASGIApp, Receive, Scope, Send

class HoneycombMiddleware:
    def __init__(self, app: ASGIApp) -> None:
        self.app = app

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        trace = beeline.start_trace(
            context=self.get_context_from_environ(scope))

        def send_wrapper(response):
            # Only the "http.response.start" message carries the status code;
            # finishing the trace there avoids closing it once per message.
            if response.get("type") == "http.response.start":
                beeline.add_context_field(
                    "response.status_code", response.get("status"))
                beeline.finish_trace(trace)
            return send(response)

        await self.app(scope, receive, send_wrapper)

    def get_context_from_environ(self, scope):
        request_method = scope.get('method')
        if request_method:
            trace_name = "starlette_http_%s" % request_method.lower()
        else:
            trace_name = "starlette_http"
        headers = Headers(scope=scope)
        return {
            "name": trace_name,
            "type": "http_server",
            "request.host": headers.get('host'),
            "request.method": request_method,
            "request.path": scope.get('path'),
            "request.content_length": int(headers.get('content-length', 0)),
            "request.user_agent": headers.get('user-agent'),
            "request.scheme": scope.get('scheme'),
            "request.query": scope.get('query_string').decode("ascii")
        }
AWS boto3 uses botocore, and botocore seems to bring its own requests @ botocore.vendored.requests, which does not get patched.
I've managed to get something working on my end by copying the patch so that it patches botocore.vendored.requests as well.
Can this MAYBE happen within beeline-python? Or is that too much to ask? Or is there another recommendation?
from pollinators:
spawned uWSGI http 1 (pid: 78)
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1997, in __call__
return self.wsgi_app(environ, start_response)
File "/usr/local/lib/python2.7/site-packages/beeline/middleware/flask/__init__.py", line 37, in __call__
"request.user_agent": environ['HTTP_USER_AGENT'],
KeyError: 'HTTP_USER_AGENT'
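Since HTTP_USER_AGENT is optional in a WSGI environ, the one-line fix is presumably to use .get() instead of indexing, sketched here outside the middleware:

```python
def user_agent_field(environ):
    """Return the user-agent context field, tolerating its absence."""
    return {"request.user_agent": environ.get("HTTP_USER_AGENT")}
```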
Hi there,
Running the tests on one of my projects with the beeline enabled resulted in this little gem:
[...]/venv/lib/python3.8/site-packages/beeline/middleware/django/__init__.py:100: RemovedInDjango40Warning: request.is_ajax() is deprecated. See Django 3.1 release notes for more details about this deprecation. "request.xhr": request.is_ajax(),
Citing from the release notes:
"The HttpRequest.is_ajax() method is deprecated as it relied on a jQuery-specific way of signifying AJAX calls, while current usage tends to use the JavaScript Fetch API. Depending on your use case, you can either write your own AJAX detection method, or use the new HttpRequest.accepts() method if your code depends on the client Accept HTTP header."
It will be removed in Django 4.0, so there is plenty of time to make the necessary changes. I just wanted to put it on your radar.
I am not actively using this field myself and am only starting out with Django, so I didn't prepare a pull request, because I've no clue what the expected behaviour is.
The beeline docs show this line for instrumenting outgoing requests:
from beeline.patch.requests import *
This isn't necessary and caused problems for us in an app that had a variable which got shadowed by this import.
It's sufficient, and safer, to simply run:
from beeline.patch.requests import requests
The Python beeline seems to go to great lengths to prefix trace fields with app., but not context fields. This is proving problematic for my distributed tracing, where trace-level fields from a Ruby service named like foo.bar become app.foo.bar on the Python side. Granted, I had to hack around the Ruby beeline even to get a trace field without the app. prefix, but the same doesn't seem particularly doable in the Python beeline.
This auto-prefixing already bit once with the app.app. doubling in #96. I personally think the whole idea is in need of revisiting across beelines. Why can't the beeline just "do what I say"? The Python beeline already does this for context fields. The app. prefix is also not particularly useful to me for distributed tracing; it would be more sensible to use my different service names as the primary "namespace". I could still do that with app.service_name.*, but the app. seems redundant to me.
Those philosophical considerations aside, removing the auto-prefixing of trace fields in the Python beeline would require a major version bump. Perhaps it'd be safer to treat unmarshalled traces specially, avoiding the auto-prefixing? That would at least keep trace-level field names consistent across the distributed trace, which would make for more predictable querying.
Tests for the Starlette middleware were added in #109.
For Python 2.7 we switched
beeline-python/beeline/trace.py
Line 294 in 23ebbd1
to
self.event.start_time = time.clock()
and
beeline-python/beeline/trace.py
Lines 137 to 138 in 23ebbd1
to
duration_ms = (time.clock() - span.event.start_time) * 1000
in order to get sub millisecond timings.
I'd PR this simple change, except:
Python 3.7.1 (v3.7.1:260ec2c36a, Oct 20 2018, 14:57:15) [MSC v.1915 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> time.clock()
__main__:1: DeprecationWarning: time.clock has been deprecated in Python 3.3 and will be removed from Python 3.8: use time.perf_counter or time.process_time instead
3.8311687
I checked https://github.com/benjaminp/six but there are no entries around the time module.
The current version of the Python Beeline expects the Honeycomb tracing headers to contain at least three keys.
beeline-python/beeline/trace.py
Line 339 in 92b972c
Since dataset and context are optional, there are times when there are only two keys in the header: 1;trace_id=xxx,parent_id=xxx. This makes the Python Beeline start a new trace rather than associate itself with the current one, because the code to extract the current trace_id is never executed.
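For illustration, a more tolerant parser could treat dataset and context as optional and only require trace_id and parent_id. This is a sketch of the idea, not the beeline's actual implementation; parse_v1_header is a made-up name.

```python
def parse_v1_header(header):
    # Expect "1;key=value,key=value,...". Reject other versions.
    version, _, payload = header.partition(";")
    if version != "1":
        return None
    # Build a dict from whatever key=value pairs are present.
    fields = dict(kv.split("=", 1) for kv in payload.split(",") if "=" in kv)
    # Only trace_id and parent_id are mandatory; dataset and context
    # may be absent without aborting the parse.
    if "trace_id" not in fields or "parent_id" not in fields:
        return None
    return fields

print(parse_v1_header("1;trace_id=xxx,parent_id=yyy"))
```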
One of the endpoints we implemented in our Django REST framework application has a POST method with multipart/form-data content type. We also use a parser which converts camelCase fields in incoming requests to snake_case before they hit the views.
That one endpoint started failing right after integrating Beeline. It still receives the request, but when you inspect the request you see that it wasn't modified by the parser. Basically, that specific request skips the parsers, and potentially other middleware after the Beeline middleware.
I spent some time investigating and realized that the stream on the request object was being exhausted by the line below:
request.POST.dict()
I was able to find some more references when I dug deeper:
The comment in the source code: https://github.com/encode/django-rest-framework/blob/0cc09f0c0dbe4a6552b1a5bbaa4f7f921270698a/rest_framework/request.py#L326
A warning (the green section under process_view on the page) in the documentation (it is written for Django's process_view, but it still applies): https://docs.djangoproject.com/en/2.2/topics/http/middleware/#process-view
I am working on a Flask application. Intermittently, after the tests succeed (run with pytest), I get the following error:
Exception in thread Thread-1:
Traceback (most recent call last):
File "/Users/mario/src/project/appenv/lib/python3.7/site-packages/libhoney/transmission.py", line 113, in _sender
ev = self.pending.get(timeout=self.send_frequency)
File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/queue.py", line 178, in get
raise Empty
_queue.Empty
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 917, in _bootstrap_inner
self.run()
File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 865, in run
self._target(*self._args, **self._kwargs)
File "/Users/mario/src/project/appenv/lib/python3.7/site-packages/libhoney/transmission.py", line 126, in _sender
pool.submit(self._flush, events)
File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/concurrent/futures/thread.py", line 151, in submit
raise RuntimeError('cannot schedule new futures after shutdown')
RuntimeError: cannot schedule new futures after shutdown
The beeline is initialised with an empty string as the write key in the test environment.
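The error itself is generic concurrent.futures behaviour and reproduces without beeline or libhoney at all, which suggests the sender's thread pool is being shut down while the background thread is still trying to flush events:

```python
from concurrent.futures import ThreadPoolExecutor

# Shut the pool down, then try to submit more work - exactly the
# sequence the libhoney sender thread hits at interpreter teardown.
pool = ThreadPoolExecutor(max_workers=1)
pool.shutdown()

try:
    pool.submit(print, "event")
    error = None
except RuntimeError as exc:
    error = str(exc)

print(error)
```

One workaround, assuming the tests do initialise the beeline, may be to call beeline.close() once in a session-level teardown so pending events are flushed before the pool goes away.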
I've just tried honeycomb-beeline to instrument my Flask app, following the instructions at https://docs.honeycomb.io/getting-data-in/beelines/beeline-python/, but I am getting this error:
127.0.0.1 - - [07/Sep/2018 09:30:38] "GET / HTTP/1.1" 500 -
Traceback (most recent call last):
File "/Users/afausti/Projects/squash-demo/squash-deployment/squash-restful-api/env/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/Users/afausti/Projects/squash-demo/squash-deployment/squash-restful-api/env/lib/python3.6/site-packages/beeline/middleware/flask/__init__.py", line 40, in __call__
}, trace_name=trace_name, top_level=True)
File "/Users/afausti/Projects/squash-demo/squash-deployment/squash-restful-api/env/lib/python3.6/site-packages/beeline/__init__.py", line 183, in _new_event
ev = g_tracer.new_traced_event(trace_name)
TypeError: new_traced_event() missing 2 required positional arguments: 'trace_id' and 'parent_id'
Here is how I am invoking the beeline in my app:
import os
import beeline
from beeline.middleware.flask import HoneyMiddleware
from app import create_app, db
profile = os.environ.get('SQUASH_API_PROFILE', 'app.config.Development')
honey_api_key = os.environ.get('HONEY_API_KEY')
app = create_app(profile)
beeline.init(writekey=honey_api_key, dataset="squash-rest-api", service_name="squash")
HoneyMiddleware(app, db_events=True)
I see the same with HoneyMiddleware(app, db_events=False).
The versions I am running:
Flask 1.0.2
Flask-RESTful 0.3.6
Flask-SQLAlchemy 2.3.2
honeycomb-beeline 1.2.0
libhoney 1.5.0
beeline-python/beeline/trace.py
Line 41 in 35acade
As you can see, the function __call__ does not make use of these parameters. A linter should have caught this.
As a result, it's impossible to use the context manager to continue a trace across threads, for example.
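To illustrate what the context manager would need to do, here is a stdlib-only sketch of carrying thread-local trace state into a worker thread. The with_trace_state decorator is a made-up stand-in for the idea, not beeline API.

```python
import threading

# Trace state lives in a threading.local, so a worker thread cannot
# see what the parent thread set - it must be copied across explicitly.
_state = threading.local()

def with_trace_state(fn):
    # Capture the parent thread's trace id when the decorator is applied,
    # then restore it inside whatever thread runs the function.
    trace_id = getattr(_state, "trace_id", None)
    def wrapped(*args, **kwargs):
        _state.trace_id = trace_id
        return fn(*args, **kwargs)
    return wrapped

_state.trace_id = "abc123"

def worker(results):
    # Without the wrapper this would read None: thread-locals don't inherit.
    results.append(getattr(_state, "trace_id", None))

results = []
t = threading.Thread(target=with_trace_state(worker), args=(results,))
t.start()
t.join()
print(results[0])
```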
We can see these imports
from beeline.propagation import Request <---- Request
from flask import current_app, signals
# needed to build a request object from environ in the middleware
from werkzeug.wrappers import Request <---- Request again, overriding it
Then a request is instantiated here
req = Request(environ, shallow=True)
It looks like the import from the propagation module is wrong, since that Request doesn't take these parameters; the werkzeug one must be the intended target. But the built instance isn't actually used, so can it be removed along with the imports to avoid confusion?
The type checking done by
beeline-python/beeline/trace.py
Lines 211 to 214 in 43a3f9b
doesn't account for the unicode strings that json.loads returns (at least in Python 2.7.12):
>>> json.loads('{"app.field":"value"}')
{u'app.field': u'value'}
Thus, when unmarshal_trace_context parses a distributed trace header's fields with
beeline-python/beeline/trace.py
Line 352 in 43a3f9b
and passes them to add_trace_field, as the middleware does in
beeline-python/beeline/middleware/flask/__init__.py
Lines 60 to 63 in 43a3f9b
type(u'app.field') == unicode, not str, so the field name still gets coerced to 'app.app.field'.
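A compatibility check that accepts both of Python 2's string types would avoid the coercion. A small sketch; string_types here is an illustrative name, in the spirit of six.string_types:

```python
import sys

# On Python 2, json.loads returns unicode objects, so a bare
# `type(x) == str` test misses them. Check against both text types.
if sys.version_info[0] >= 3:
    string_types = (str, bytes)
else:  # Python 2: also accept unicode
    string_types = (str, unicode)  # noqa: F821

value = u"app.field"
print(isinstance(value, string_types))
```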
We seem to be experiencing lag in the after-request phase. Honeycomb currently uses Flask's after_request hook, which is not a reasonable solution: https://stackoverflow.com/questions/48994440/execute-a-function-after-flask-returns-response. We use gunicorn; perhaps we can move the work to after the request returns, in your backend there. Also, I noticed there is no logging at all if we can't connect to the Honeycomb backend.