miner_exporter's Introduction

miner_exporter

Prometheus exporter for the Helium miner (validator). Using prometheus_client, this code exposes metrics from the Helium miner to a Prometheus-compatible server.

This is only the exporter; you still need a Prometheus server to collect the data and Grafana to display the dashboard. The Prometheus and Grafana servers can run on an external machine, on the same machine as the miner, or possibly on a cloud service. The helium_miner_grafana_dashboard can be imported into Grafana.

Note port 9825 is the 'reserved' port for this specific exporter. Feel free to use whatever you like, of course, but you won't be able to dial 9VAL on your phone.

Running via Docker

Using the Dockerfile, you can run this with Docker or docker-compose. Both examples below expose the exporter's metrics on port 9825; feel free to choose your own port. The images are hosted on both GHCR and Docker Hub.

Docker client

docker run -p 9825:9825 --name miner_exporter -v /var/run/docker.sock:/var/run/docker.sock ghcr.io/tedder/miner_exporter:latest

Docker-Compose

Using your existing docker-compose file, add the section for the exporter (below). When you're done, run docker-compose up -d as usual. That's it!

version: "3"
services:
  validator:
    image: quay.io/team-helium/validator:latest-val-amd64
    container_name: validator
...
  miner_exporter:
    image: ghcr.io/tedder/miner_exporter:latest
    container_name: miner_exporter
    volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    ports:
    - "9825:9825"

Running locally

On the miner machine:

install python3

pip install prometheus_client psutil docker

Details on the libraries:

Configuration

The following have valid defaults, but you can change them:

UPDATE_PERIOD  # seconds between scrapes, int
VALIDATOR_CONTAINER_NAME # eg 'validator', string
API_BASE_URL # URL for api access, string. For testnet, set to "https://testnet-api.helium.wtf/v1"
ENABLE_RPC # opt in to using the RPC API with a truthy value (defaults to falsey value until `exec` calls are fully replaced).
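
A minimal sketch of how these four settings might be read from the environment. The default values below (30 seconds, `validator`, `https://api.helium.io/v1`) and the set of strings treated as truthy are assumptions for illustration; the exporter's real defaults may differ. `load_config` is a hypothetical helper name.

```python
import os

# Hypothetical defaults for illustration only.
DEFAULTS = {
    "UPDATE_PERIOD": "30",
    "VALIDATOR_CONTAINER_NAME": "validator",
    "API_BASE_URL": "https://api.helium.io/v1",
    "ENABLE_RPC": "",
}

# Assumed spellings of a "truthy" value.
TRUTHY = {"1", "true", "yes", "on"}


def load_config(env=None):
    """Return the settings as typed values, falling back to DEFAULTS."""
    env = os.environ if env is None else env
    raw = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return {
        "UPDATE_PERIOD": int(raw["UPDATE_PERIOD"]),
        "VALIDATOR_CONTAINER_NAME": raw["VALIDATOR_CONTAINER_NAME"],
        # Drop a trailing slash so paths can be appended consistently.
        "API_BASE_URL": raw["API_BASE_URL"].rstrip("/"),
        "ENABLE_RPC": raw["ENABLE_RPC"].strip().lower() in TRUTHY,
    }
```

Calling `load_config()` with no argument reads `os.environ`; passing a plain dict makes the function easy to try out without touching the real environment.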

miner_exporter's People

Contributors

artifactstaking, gradoj, kellybyrd, tedder

miner_exporter's Issues

CloudWatch agent support

I'd like to see miner_exporter support exporting metrics to the CloudWatch agent. This may appeal to users who run their Helium validator on AWS EC2 instances and want to keep their monitoring inside the AWS console.

To support this in the existing miner_exporter implementation, I think we'd need to do something like https://github.com/tedder/miner_exporter/blob/main/miner_exporter.py#L462 for the CloudWatch agent. @tedder mentioned an upcoming refactor that may make it even easier to add CloudWatch agent support.

Fetching Error

Traceback (most recent call last):
  File "/opt/app/miner_exporter.py", line 369, in <module>
    stats()
  File "<decorator-gen-1>", line 2, in stats
  File "/usr/local/lib/python3.9/site-packages/prometheus_client/context_managers.py", line 66, in wrapped
    return func(*args, **kwargs)
  File "/opt/app/miner_exporter.py", line 132, in stats
    collect_balance(docker_container,miner_facts['address'],hotspot_name_str)
  File "/opt/app/miner_exporter.py", line 196, in collect_balance
    if not api_accounts.get('data') or not api_accounts['data'].get('balance'):
AttributeError: 'NoneType' object has no attribute 'get'
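
The traceback suggests that `api_accounts` itself was `None`, so the `.get('data')` call fails before the check can do its job. A minimal sketch of a guard that tolerates a missing response follows. The helper name `extract_balance` is hypothetical, and the response shape (`{'data': {'balance': ...}}`) is inferred from the failing line rather than confirmed.

```python
def extract_balance(api_accounts):
    """Return the balance from an accounts response, or None if unavailable."""
    # A failed fetch may hand back None (or anything that is not a dict).
    if not isinstance(api_accounts, dict):
        return None
    data = api_accounts.get("data")
    if not isinstance(data, dict):
        return None
    return data.get("balance")
```

A caller could then skip the balance metric for that cycle when the result is `None` instead of letting the exception end the process.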

Exporter Shut Down

Hey Tedder,

Still seeing shutdowns on the miner_exporter, though far less frequently now. Restart works just fine. Typically will run for a couple days and then I'll see a shutdown.

Traceback (most recent call last):
  File "/opt/app/miner_exporter.py", line 327, in <module>
    stats()
  File "<decorator-gen-1>", line 2, in stats
  File "/usr/local/lib/python3.9/site-packages/prometheus_client/context_managers.py", line 66, in wrapped
    return func(*args, **kwargs)
  File "/opt/app/miner_exporter.py", line 126, in stats
    collect_balance(docker_container,miner_facts['address'],hotspot_name_str)
  File "/opt/app/miner_exporter.py", line 149, in collect_balance
    if not api_validators.get('data') and not api_validators['data'].get('owner'):
AttributeError: 'NoneType' object has no attribute 'get'

Another one for ya...

Traceback (most recent call last):
  File "/opt/app/miner_exporter.py", line 369, in <module>
    stats()
  File "<decorator-gen-1>", line 2, in stats
  File "/usr/local/lib/python3.9/site-packages/prometheus_client/context_managers.py", line 66, in wrapped
    return func(*args, **kwargs)
  File "/opt/app/miner_exporter.py", line 132, in stats
    collect_balance(docker_container,miner_facts['address'],hotspot_name_str)
  File "/opt/app/miner_exporter.py", line 196, in collect_balance
    if not api_accounts.get('data') or not api_accounts['data'].get('balance'):
AttributeError: 'NoneType' object has no attribute 'get'

crash getting balance from api

miner_exporter    | miner_exporter.py:collect_miner_version:298:INFO    found miner version: 0.1.51
miner_exporter    | miner_exporter.py:collect_in_consensus:163:INFO    in consensus? 0 / false
miner_exporter    | Traceback (most recent call last):
miner_exporter    |   File "/opt/app/miner_exporter.py", line 307, in <module>
miner_exporter    |     stats()
miner_exporter    |   File "<decorator-gen-1>", line 2, in stats
miner_exporter    |   File "/usr/local/lib/python3.9/site-packages/prometheus_client/context_managers.py", line 66, in wrapped
miner_exporter    |     return func(*args, **kwargs)
miner_exporter    |   File "/opt/app/miner_exporter.py", line 122, in stats
miner_exporter    |     collect_balance(docker_container,miner_facts['address'],hotspot_name_str)
miner_exporter    |   File "/opt/app/miner_exporter.py", line 132, in collect_balance
miner_exporter    |     owner = api_validators['data']['owner']
miner_exporter    | KeyError: 'data'

I'm assuming this is a new validator not yet showing up in api
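
If the API has no record yet for a new validator, the response may simply lack a `data` key. As a sketch of one way to handle that, a small nested-lookup helper can return a default instead of raising `KeyError`. The name `dig` is hypothetical and not part of the exporter.

```python
def dig(obj, *keys, default=None):
    """Walk nested dicts key by key; return default as soon as a step is missing."""
    current = obj
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current
```

With this, `dig(api_validators, 'data', 'owner')` yields `None` for an unknown validator, and the balance lookup can be skipped until the owner is available.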

Grafana Cloud integration

I've had some trouble getting this exporter to work with the Grafana agent. Based on this article, it seems Grafana Cloud supports default integrations that make it easier to set up the configuration for a Grafana agent. I'm not sure what the process is for onboarding a new integration, but I figured I'd create an issue in case the maintainers of this project are aware of it or interested in it.

Exporter Keeps Shutting Down

You guys still seeing these issues? My node exporter will run for about 5 minutes before it shuts down. These are the errors showing up in the logs:

Traceback (most recent call last):
  File "/opt/app/miner_exporter.py", line 327, in <module>
    stats()
  File "<decorator-gen-1>", line 2, in stats
  File "/usr/local/lib/python3.9/site-packages/prometheus_client/context_managers.py", line 66, in wrapped
    return func(*args, **kwargs)
  File "/opt/app/miner_exporter.py", line 126, in stats
    collect_balance(docker_container,miner_facts['address'],hotspot_name_str)
  File "/opt/app/miner_exporter.py", line 148, in collect_balance
    api_validators = safe_get_json(f'https://testnet-api.helium.wtf/v1/validators/{addr}')
  File "/opt/app/miner_exporter.py", line 130, in safe_get_json
    ret = requests.get(url)
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='testnet-api.helium.wtf', port=443): Max retries exceeded with url: /v1/validators/1ZaCvhW663DFrchg3i5xvgXnH2en4QKc2BTTzPpMQLD7r7SQFci (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fc421132370>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
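
Here a temporary DNS failure propagates out of `safe_get_json` and ends the process. A sketch of a fetch helper that logs the failure and returns `None` instead is shown below. It uses `urllib` from the standard library rather than `requests` so that it runs without extra packages, and the injectable `fetch` parameter is an assumption added purely to make the sketch easy to exercise offline.

```python
import json
import logging
import urllib.request

log = logging.getLogger("miner_exporter_sketch")


def _fetch(url, timeout):
    """Fetch the response body as text (network access happens only here)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def safe_get_json(url, timeout=10.0, fetch=_fetch):
    """Return parsed JSON, or None when the request or the parsing fails."""
    try:
        return json.loads(fetch(url, timeout))
    except (OSError, ValueError) as exc:
        # URLError, DNS failures and socket timeouts are OSError subclasses;
        # json.JSONDecodeError is a ValueError subclass.
        log.warning("request to %s failed: %s", url, exc)
        return None
```

With `requests`, the analogous approach would be to catch `requests.exceptions.RequestException` around the `requests.get` call; either way the caller has to cope with a `None` result.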

Another one for ya...

Not sure if this will help or not, but the miner_exporter node on my AWS validator runs without a hiccup but the one on my local server here at home never seems to last a day before shutting down. Here's the most recent log:

Traceback (most recent call last):
  File "/opt/app/miner_exporter.py", line 369, in <module>
    stats()
  File "<decorator-gen-1>", line 2, in stats
  File "/usr/local/lib/python3.9/site-packages/prometheus_client/context_managers.py", line 66, in wrapped
    return func(*args, **kwargs)
  File "/opt/app/miner_exporter.py", line 132, in stats
    collect_balance(docker_container,miner_facts['address'],hotspot_name_str)
  File "/opt/app/miner_exporter.py", line 196, in collect_balance
    if not api_accounts.get('data') or not api_accounts['data'].get('balance'):
AttributeError: 'NoneType' object has no attribute 'get'

Here's another one for you...

Traceback (most recent call last):
  File "/opt/app/miner_exporter.py", line 369, in <module>
    stats()
  File "<decorator-gen-1>", line 2, in stats
  File "/usr/local/lib/python3.9/site-packages/prometheus_client/context_managers.py", line 66, in wrapped
    return func(*args, **kwargs)
  File "/opt/app/miner_exporter.py", line 132, in stats
    collect_balance(docker_container,miner_facts['address'],hotspot_name_str)
  File "/opt/app/miner_exporter.py", line 196, in collect_balance
    if not api_accounts.get('data') and not api_accounts['data'].get('balance'):
AttributeError: 'NoneType' object has no attribute 'get'

Can't obtain the metrics

Hi! I can't obtain the metrics when running miner_exporter on Docker with this file. It just doesn't show them.
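
One way to narrow a report like this down is to request the exporter's endpoint directly and look at what comes back. The sketch below assumes the exporter listens on `localhost:9825` and serves the standard Prometheus text format at `/metrics`; `metric_names` and `fetch_metrics` are hypothetical helper names.

```python
import urllib.request


def metric_names(exposition_text):
    """Return the sorted metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first '{' (labels) or the first space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return sorted(names)


def fetch_metrics(url="http://localhost:9825/metrics", timeout=5.0):
    """Download the raw metrics page; raises OSError if nothing is listening."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

If `fetch_metrics()` raises, the port mapping or the container itself is the likely problem; if it returns text but `metric_names` finds no validator metrics, the exporter is running but not collecting.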

Grafana cloud/agent dashboard 14319 server variable incorrect

Dashboard 14319 doesn't display the node_exporter data when imported into Grafana Cloud.
With Grafana Agent and Grafana Cloud, no servers are found unless I remove the regex; then it finds 'grafana.com'.

The server_name variable in the data is 'Thinkstation2-P620', not 'grafana.com', so I'm not sure how to fix this.

exporter crashes with no docker container

The exporter crashes if you stop and remove the validator Docker container:

Traceback (most recent call last):
  File "/opt/app/miner_exporter.py", line 307, in <module>
    stats()
  File "<decorator-gen-1>", line 2, in stats
  File "/usr/local/lib/python3.9/site-packages/prometheus_client/context_managers.py", line 66, in wrapped
    return func(*args, **kwargs)
  File "/opt/app/miner_exporter.py", line 106, in stats
    docker_container = dc.containers.get(VALIDATOR_CONTAINER_NAME)
  File "/usr/local/lib/python3.9/site-packages/docker/models/containers.py", line 889, in get
    resp = self.client.api.inspect_container(container_id)
  File "/usr/local/lib/python3.9/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/docker/api/container.py", line 773, in inspect_container
    return self._result(
  File "/usr/local/lib/python3.9/site-packages/docker/api/client.py", line 274, in _result
    self._raise_for_status(response)
  File "/usr/local/lib/python3.9/site-packages/docker/api/client.py", line 270, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/local/lib/python3.9/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.NotFound: 404 Client Error for http+docker://localhost/v1.41/containers/validator/json: Not Found ("No such container: validator")
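
A sketch of one way to keep the exporter alive when the container is missing: treat the failed lookup as a skipped cycle rather than a fatal error. The helper `collect_once` and its parameters are hypothetical; the lookup function and the exception type are passed in so the sketch runs without the Docker SDK installed.

```python
import logging

log = logging.getLogger("miner_exporter_sketch")


def collect_once(get_container, name, collect, not_found=(LookupError,)):
    """Run one scrape cycle; return False instead of raising if the container is gone."""
    try:
        container = get_container(name)
    except not_found as exc:
        log.warning("container %r not available, skipping this cycle: %s", name, exc)
        return False
    collect(container)
    return True
```

With the Docker SDK, `get_container` would be the `dc.containers.get` call seen in the traceback and `not_found` would be `(docker.errors.NotFound,)`, the exception raised there.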

Helium API calls require user-agent string

As of this writing, Helium requires a user-agent on all API calls; requests without one get a 429 status code. This is causing validator balance fetches to fail.

Already have a fork working and tested with user-agent added. Will generate PR shortly.
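
For illustration, a minimal sketch of attaching an explicit User-Agent to an API request using only the standard library. The header value and the helper name `build_request` are hypothetical, and this is not necessarily how the fork mentioned above does it.

```python
import urllib.request

USER_AGENT = "miner_exporter-sketch/0.0"  # hypothetical value


def build_request(url, user_agent=USER_AGENT):
    """Create a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

The resulting object can be passed to `urllib.request.urlopen`. With `requests`, the equivalent is passing `headers={"User-Agent": ...}` to `requests.get`.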

validator_name="Error: Usage information not found"

Certain metrics output an error for the "validator_name" parameter. Examples:

system_usage{job="Helium_Validator", resource_type="CPU", validator_name="Error: Usage information not found for the given command RPC to '[email protected]' failed: {'EXIT', {{case_clause, {error, {no_matching_spec,["info","name"]}}}, [{blockchain_console,command,1, [{file,"blockchain_console.erl"}, {line,16}]}, {rpc,'-handle_call_call/6-fun-0-',5, [{file,"rpc.erl"},{line,197}]}]}}"}

system_usage{job="Helium_Validator", resource_type="Memory", validator_name="Exact ERTS version (10.7.1) match not found, instead using 10.7.1. The release may fail to run."}

Typically, the metric is duplicated and the last "system_usage" in the group correctly displays the validator_name, but the metrics above it display the error.

Other metrics where the same error pops up:
validator_block_age, validator_container_uptime, validator_version_info.

See attached image:

image
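
The label values above look like error or warning text from the miner command ending up where the name should be. A sketch of one possible guard: accept a line as the validator name only if it has the expected shape. The three-word hyphenated pattern and the helper `pick_validator_name` are assumptions for illustration, not the exporter's actual logic.

```python
import re

# Assumed shape of a validator name: three lowercase words joined by hyphens.
NAME_RE = re.compile(r"^[a-z]+-[a-z]+-[a-z]+$")


def pick_validator_name(command_output):
    """Return the first line that looks like a validator name, else None."""
    for line in command_output.splitlines():
        candidate = line.strip()
        if NAME_RE.match(candidate):
            return candidate
    return None
```

When the result is `None`, the metric could be skipped for that cycle rather than exported with an error message as its label.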
