as w205, under /setup, create Postgres database and table:
$ python create_db.py
the terminal should show:
[w205@ip-172-31-6-39 setup]$ python create_db.py
creating tcount database in postgres ...
database tcount is successfully created!
creating tweetwordcount table in tcount ...
table tweetwordcount is successfully created!
postgres setup completed, you are good to go, happy streaming :)
as w205, under /EX2Tweetwordcount, start streaming:
$ sparse run
note: to avoid flooding the log, each word's count is logged only once every 100 occurrences, so it may be several seconds before the first count shows up
e.g. 37262 [Thread-41] INFO backtype.storm.task.ShellBolt - ShellLog pid:4265, name:count-bolt weather: 100
to stop streaming, press Ctrl+C at any time.
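The logging rule in the note above (one log line per 100 occurrences of a word) can be sketched as follows. This is a minimal illustration, not the project's actual count bolt: the class name ThrottledWordCounter and the constant LOG_EVERY are hypothetical, and the log line only mimics the "weather: 100" suffix of the example shown above.

```python
from collections import Counter

LOG_EVERY = 100  # assumption: mirrors the "every 100 counts" rule in the note


class ThrottledWordCounter:
    """Counts words; reports only when a count hits a multiple of log_every."""

    def __init__(self, log_every=LOG_EVERY):
        self.counts = Counter()
        self.log_every = log_every

    def add(self, word):
        """Increment the count for word; return a log line, or None if throttled."""
        self.counts[word] += 1
        count = self.counts[word]
        if count % self.log_every == 0:
            return "%s: %d" % (word, count)
        return None


if __name__ == "__main__":
    counter = ThrottledWordCounter()
    lines = [counter.add("weather") for _ in range(250)]
    # Only the 100th and 200th occurrences produce a log line.
    print([line for line in lines if line])  # ['weather: 100', 'weather: 200']
```

With this rule a word seen 99 times produces no output at all, which is why the first count can take a while to appear.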
Checking results (under /serving_scripts):
check all words count:
$ python finalresults.py
check a specific word:
$ python finalresults.py weather
check histogram with specified range:
$ python histogram.py 600 1000
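As a rough sketch of what a range check like the one above does, the function below keeps only the words whose count falls between two bounds. It is not the contents of histogram.py: the function name words_in_range is hypothetical, and treating both bounds as inclusive is an assumption. The sample counts are taken from the query output further down.

```python
def words_in_range(counts, low, high):
    """Return (word, count) pairs with low <= count <= high, highest count first."""
    selected = [(word, n) for word, n in counts.items() if low <= n <= high]
    # Sort by descending count, then alphabetically to keep ties stable.
    return sorted(selected, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    sample = {"like": 700, "one": 610, "love": 524, "will": 443}
    print(words_in_range(sample, 600, 1000))  # [('like', 700), ('one', 610)]
```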
perform customized query on Postgres:
[w205@ip-172-31-6-39 ~]$ psql -U postgres
psql (8.4.20)
Type "help" for help.
postgres=# \c tcount
psql (8.4.20)
You are now connected to database "tcount".
tcount=# select * from tweetwordcount order by count desc limit 20
tcount-# ;
  word   | count
---------+-------
 like    |   700
 one     |   610
 love    |   524
 will    |   443
 amp     |   430
 know    |   421
 want    |   390
 weather |   370
 see     |   360
 time    |   340
 people  |   331
 good    |   330
 make    |   310
 new     |   309
 thank   |   308
 day     |   282
 much    |   282
 need    |   277
 back    |   274
 really  |   270
(20 rows)
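The same top-20 query can also be run from Python. The sketch below is one possible way to do it, not part of the provided scripts: top_words and connect_tcount are hypothetical helper names, and connect_tcount assumes that psycopg2 is installed and that user postgres can connect to tcount locally without a password, as in the psql session above.

```python
def top_words(conn, limit=20):
    """Return the `limit` most frequent (word, count) rows from tweetwordcount."""
    cur = conn.cursor()
    try:
        # int() keeps the LIMIT value numeric, so no free text reaches the SQL.
        cur.execute(
            "SELECT word, count FROM tweetwordcount "
            "ORDER BY count DESC LIMIT %d" % int(limit)
        )
        return cur.fetchall()
    finally:
        cur.close()


def connect_tcount():
    """Open a connection to tcount as user postgres (assumed local setup)."""
    import psycopg2  # imported lazily; top_words accepts any DB-API connection
    return psycopg2.connect(dbname="tcount", user="postgres")


# Example usage (requires the running Postgres instance from the setup step):
#     conn = connect_tcount()
#     for word, count in top_words(conn):
#         print("%-8s %5d" % (word, count))
#     conn.close()
```

Passing the connection in as an argument keeps the query logic separate from the connection details, which may differ on your instance.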