-
Run ./install_for_clickhouse.sh in your terminal.
-
Go make yourself a coffee while the import runs.
(The intermediate CSV files are removed as they are inserted into the database, so there is no need to tidy up afterwards.)
-
You're all set and can now query your database. Try: SELECT count(*) FROM gdelt; DESCRIBE TABLE gdelt;
-
Feel free to check out the GDELT data structure here: http://www.gdeltproject.org/data.html#documentation The GDELT 1.0 Data Format Documentation and the CAMEO Code Reference are both particularly pertinent.
-
The same goes for the ClickHouse documentation: https://clickhouse.yandex/docs/en/
-
If you quit and want to access your ClickHouse database again, here is the command: docker run -it --rm --link gdelt_clickhouse:clickhouse-server yandex/clickhouse-client --host clickhouse-server
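If you only need the result of a single statement rather than an interactive session, the same command can be wrapped in a small shell function. This is a minimal sketch, not part of the install script: the function name gdelt_query and the DOCKER override variable are made up for illustration. It assumes the gdelt_clickhouse container is running and relies on the --query option of clickhouse-client.

```shell
# Hypothetical helper: run one SQL statement against the gdelt_clickhouse
# container and exit. Not part of install_for_clickhouse.sh.
# DOCKER is an illustrative override (e.g. DOCKER=echo prints the command
# instead of running it); by default the real docker binary is used.
gdelt_query() {
    "${DOCKER:-docker}" run --rm \
        --link gdelt_clickhouse:clickhouse-server \
        yandex/clickhouse-client \
        --host clickhouse-server \
        --query "$1"
}

# Usage (requires the container from the install step to be running):
#   gdelt_query "SELECT count(*) FROM gdelt"
```

The -it flags are dropped here because no terminal is needed when the statement is passed on the command line.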
Some useful links: https://hub.docker.com/r/yandex/clickhouse-server/
Some example queries:
SELECT EventCode, Actor1Geo_CountryCode, SOURCEURL FROM gdelt WHERE EventCode = '1821' AND Actor1Geo_CountryCode = 'FR';
SELECT EventCode, SOURCEURL, DATEADDED FROM gdelt WHERE SOURCEURL LIKE '%Macron%';
SELECT EventCode, ActionGeo_Lat, ActionGeo_Long, SOURCEURL FROM gdelt WHERE Actor1Code = 'NGOHLHIRC' ORDER BY EventCode ASC;
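To reuse the first of the queries above with other event codes or countries, one option is a small shell function that prints the statement for you. This is only a sketch: the name events_by_country_sql is hypothetical, and the arguments are substituted as-is without any escaping, so it should only be fed values you trust (a CAMEO event code and a two-letter country code).

```shell
# Hypothetical helper: print the "events by country" query for a given
# CAMEO event code ($1) and Actor1 country code ($2).
# The values are inserted verbatim, with no SQL escaping.
events_by_country_sql() {
    printf "SELECT EventCode, Actor1Geo_CountryCode, SOURCEURL FROM gdelt WHERE EventCode = '%s' AND Actor1Geo_CountryCode = '%s';\n" "$1" "$2"
}

# Usage: print the statement, then paste it into the clickhouse-client session.
#   events_by_country_sql 1821 FR
```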