supermodularxyz / grants-etl
License: MIT License
Ideally, the same public data about projects that was displayed on Grants Explorer (and is now captured with Allo).
Voters would only show the address, same as with Allo.
Add Metabase support for easy querying of the dataset and query reuse.
Here is some additional data that I think could help enhance the ETL tool
Passport score
Passport stamp data (indexer or API)
Staking data
Historical data by address (only Ceramic for now)
from omni:
I can tell you straight up that one of the best ways to make this tool useful is to have prepackaged queries that represent common views that data scientists / sybil hunters will use to get familiar with the platform. Stuff like:
The first type of query lets new users jump right into understanding and analyzing the data without having to think much about its structure, and the second type shows users what can be done and what has already been done, so they don't waste time reinventing the wheel.
Both of these are issues when onboarding with Gitcoin data. People spend a lot of time understanding the schema and how it relates to the platform, then they recreate basic statistics that, unfortunately, are already known.
I'm getting this issue on Windows when I run docker-compose:
pgadmin | postfix/postlog: starting the Postfix mail system
pgadmin | [2023-07-30 15:59:54 +0000] [1] [INFO] Starting gunicorn 20.1.0
pgadmin | [2023-07-30 15:59:54 +0000] [1] [INFO] Listening at: http://[::]:80 (1)
pgadmin | [2023-07-30 15:59:54 +0000] [1] [INFO] Using worker: gthread
pgadmin | [2023-07-30 15:59:54 +0000] [81] [INFO] Booting worker with pid: 81
pgadmin | [2023-07-30 15:59:58 +0000] [81] [ERROR] Exception in worker process
pgadmin | Traceback (most recent call last):
pgadmin | File "/venv/lib/python3.11/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
pgadmin | worker.init_process()
pgadmin | File "/venv/lib/python3.11/site-packages/gunicorn/workers/gthread.py", line 92, in init_process
pgadmin | super().init_process()
pgadmin | File "/venv/lib/python3.11/site-packages/gunicorn/workers/base.py", line 134, in init_process
pgadmin | self.load_wsgi()
pgadmin | File "/venv/lib/python3.11/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
pgadmin | self.wsgi = self.app.wsgi()
pgadmin | ^^^^^^^^^^^^^^^
pgadmin | File "/venv/lib/python3.11/site-packages/gunicorn/app/base.py", line 67, in wsgi
pgadmin | self.callable = self.load()
pgadmin | ^^^^^^^^^^^
pgadmin | File "/venv/lib/python3.11/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
pgadmin | return self.load_wsgiapp()
pgadmin | ^^^^^^^^^^^^^^^^^^^
pgadmin | File "/venv/lib/python3.11/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
pgadmin | return util.import_app(self.app_uri)
pgadmin | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pgadmin | File "/venv/lib/python3.11/site-packages/gunicorn/util.py", line 359, in import_app
pgadmin | mod = importlib.import_module(module)
pgadmin | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pgadmin | File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
pgadmin | return _bootstrap._gcd_import(name[level:], package, level)
pgadmin | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pgadmin | File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
pgadmin | File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
pgadmin | File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
pgadmin | File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
pgadmin | File "<frozen importlib._bootstrap_external>", line 940, in exec_module
pgadmin | File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
pgadmin | File "/pgadmin4/run_pgadmin.py", line 4, in <module>
pgadmin | from pgAdmin4 import app
pgadmin | File "/pgadmin4/pgAdmin4.py", line 104, in <module>
pgadmin | app = create_app()
pgadmin | ^^^^^^^^^^^^
pgadmin | File "/pgadmin4/pgadmin/__init__.py", line 477, in create_app
pgadmin | run_migration_for_sqlite()
pgadmin | File "/pgadmin4/pgadmin/__init__.py", line 452, in run_migration_for_sqlite
pgadmin | os.chmod(config.SQLITE_PATH, 0o600)
pgadmin | PermissionError: [Errno 1] Operation not permitted: '/var/lib/pgadmin/pgadmin4.db'
pgadmin | [2023-07-30 15:59:58 +0000] [81] [INFO] Worker exiting (pid: 81)
pgadmin | [2023-07-30 15:59:58 +0000] [1] [INFO] Shutting down: Master
pgadmin | [2023-07-30 15:59:58 +0000] [1] [INFO] Reason: Worker failed to boot.
pgadmin exited with code 0
From the perspective of someone who took part in a sybil-seeking hackathon, being provided a schema + db like this would have been pretty nice. On the other hand, if you are looking at a single round, a flat .csv file with votes is probably a great low-effort starting point to jump into doing analysis.
Possible additions to the existing schema:
vote: it would be nice to know gas, i.e. gas price + gas spent.
apply to round: on-chain, I would like to see the hash of the transaction used to do that (+ gas fee).
blockNumbers for vote is nice; it would be even nicer if there was also approximate_timestamp for plotting time series.
The key friction for me, usability-wise, is that I want to get clean, processed data from an authoritative source without having to re-run the pipelines myself.
Interesting external information about each voter/grant address: POAPs, ENS name history, Snapshot votes
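The approximate_timestamp idea mentioned above could be sketched roughly as follows. This is only an illustration, assuming two reference blocks with known timestamps are available for the chain; the names BlockAnchor and approximateTimestamp are hypothetical and not part of grants-etl.

```typescript
// Hypothetical type: a block whose timestamp is already known (unix seconds).
interface BlockAnchor {
  blockNumber: number;
  timestamp: number;
}

// Estimate a block's timestamp by linear interpolation (or extrapolation)
// between two anchors. This assumes a roughly constant block time, so the
// result is an approximation suitable for plotting, not an exact value.
function approximateTimestamp(
  blockNumber: number,
  a: BlockAnchor,
  b: BlockAnchor
): number {
  if (a.blockNumber === b.blockNumber) {
    throw new Error("anchors must be two distinct blocks");
  }
  const secondsPerBlock =
    (b.timestamp - a.timestamp) / (b.blockNumber - a.blockNumber);
  return Math.round(
    a.timestamp + (blockNumber - a.blockNumber) * secondsPerBlock
  );
}
```

The closer the anchors are to the blocks being estimated, the smaller the drift; exact timestamps would still need to come from the block headers themselves.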
I'm trying to index GR18 data and encountering an error when I run the command below. It is expecting chainId to come as an int, not a string.
yarn run etl --chain [chainId]
I also tried running the jobs directly by modifying index.ts (e.g., const chainId = argv.chainId ?? 10) and running yarn run etl, and this generated an error:
Invalid value for argument `applicationsEndTime`: number too large to fit in target type. Expected big integer String.
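Both symptoms above look like type-coercion problems, so a minimal sketch of the kind of normalization that might help is shown here. It is not the project's code: parseChainId and toInt64String are hypothetical helpers, and the idea that the target column is a signed 64-bit integer is an assumption based only on the wording of the error message.

```typescript
// Hypothetical helper: coerce a CLI value such as "10" into an integer chain id.
function parseChainId(raw: unknown, fallback: number): number {
  if (raw === undefined || raw === null || raw === "") {
    return fallback;
  }
  const id = typeof raw === "number" ? raw : Number(String(raw).trim());
  if (!Number.isSafeInteger(id) || id <= 0) {
    throw new Error(`Invalid chain id: ${String(raw)}`);
  }
  return id;
}

// Assumption: the database column is a signed 64-bit integer.
const INT64_MAX = BigInt("9223372036854775807");

// Hypothetical helper: turn a (possibly huge) on-chain integer into a decimal
// string that fits a signed 64-bit column, clamping values that are too large.
function toInt64String(value: string | bigint): string {
  const n = typeof value === "bigint" ? value : BigInt(value);
  return (n > INT64_MAX ? INT64_MAX : n).toString();
}
```

Clamping silently changes the stored value, so storing the raw value as text may be the better choice if the original number matters for analysis.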
If I switched to chainId 1 or 424, then I got a fetch error, e.g.:
details: 'fetch is not defined',
docsPath: undefined,
metaMessages: [
'URL: https://rpc.publicgoods.network',
'Request body: {"method":"eth_getLogs","params":[{"address":"0x222EA76664ED77D18d4416d2B2E77937b76f0a35","topics":["0xca792622046325e9cd4e24b490cb000ef72acea3a15284efc14ee709307a5e00","0x3532b7116e113d629a3d0a0364840f52c9d93f6b81b2ecc61b2cb228c39ee9fb"],"fromBlock":"0xf9700","toBlock":"0xf9701"}]}'
],
shortMessage: 'HTTP request failed.',
version: '[email protected]',
body: { method: 'eth_getLogs', params: [ [Object] ] },
headers: undefined,
status: undefined,
url: 'https://rpc.publicgoods.network'
}
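A 'fetch is not defined' message usually means the JavaScript runtime has no global fetch, which is the case by default on Node.js versions older than 18. Whether that is the cause here is an assumption, since the report does not state the Node.js version. A small startup check along these lines (hasGlobalFetch and assertFetchAvailable are hypothetical names) would at least make the failure explicit:

```typescript
// Minimal shape of a scope that may or may not expose fetch.
type FetchScope = { fetch?: unknown };

// Returns true when the given scope (the global scope by default) has fetch.
function hasGlobalFetch(
  scope: FetchScope = globalThis as unknown as FetchScope
): boolean {
  return typeof scope.fetch === "function";
}

// Fail early with a readable message instead of a deep RPC error.
function assertFetchAvailable(scope?: FetchScope): void {
  if (!hasGlobalFetch(scope)) {
    throw new Error(
      "Global fetch is missing: run on Node.js 18 or newer, or install a fetch polyfill."
    );
  }
}
```

Calling assertFetchAvailable() once before the indexing jobs start would surface the problem before any RPC request is attempted.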