dstk's Issues

Postgres server sometimes crashes

Steps:
Run the DSTK server for several days under load

Result:
The street2coordinates API may stop responding with results, returning an error instead.

Notes:
This seems to be caused by memory allocation problems within the Postgres server when dealing with large requests.

Here's the error log from /var/log/postgresql/postgresql-8.4-main.log:

UTC FATAL: could not create shared memory segment: Cannot allocate memory
UTC DETAIL: Failed system call was shmget(key=5432001, size=29278208, 03600).
UTC HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory or swap space. To reduce the request size (currently 29278208 bytes), reduce PostgreSQL's shared_buffers parameter (currently 3072) and/or its max_connections parameter (currently 103).
The PostgreSQL documentation contains more information about shared memory configuration.
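As a sanity check on the numbers in that HINT: shared_buffers is counted in 8 kB pages, so 3072 pages comes to about 24 MB before PostgreSQL adds per-connection and bookkeeping overhead, which is how the request reaches 29278208 bytes. A quick sketch of the arithmetic:

```shell
# shared_buffers from the HINT line is in 8 kB pages, not bytes.
page_size=8192
shared_buffers=3072
base_bytes=$((shared_buffers * page_size))
# 25165824 bytes before overhead, versus the 29278208 bytes requested.
echo "$base_bytes bytes before overhead"
```

If that request exceeds the kernel's shmmax, the shmget call fails exactly as logged above.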

html2text utf-8 decode bug

From email:

I get this error on some websites when I do an html2text call:
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position
78: unexpected code byte
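Byte 0x92 is not valid UTF-8; it is the right single quotation mark in Windows-1252, which many sites serve while declaring UTF-8. One workaround (a sketch, not dstk's actual fix) is to transcode suspect pages before parsing, for example with iconv:

```shell
# 0x92 (octal \222) is CP1252's right single quote; transcoding it to
# UTF-8 yields the three-byte sequence e2 80 99 (U+2019).
fixed=$(printf '\222' | iconv -f CP1252 -t UTF-8)
printf '%s\n' "$fixed"
```

A decode with a CP1252 fallback (or Python's `errors='replace'`) would avoid the UnicodeDecodeError at the cost of occasionally mangling genuinely non-CP1252 input.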

error on manual setup

Trying to put my own server together, I ran into an issue on the first command in the setup instructions:

https://github.com/petewarden/dstk/blob/master/docs/ec2setup.txt

~$ sudo apt-add-repository -y ppa:olivier-berten/geo
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/dist-packages/softwareproperties/ppa.py", line 99, in run
    self.add_ppa_signing_key(self.ppa_path)
  File "/usr/lib/python2.7/dist-packages/softwareproperties/ppa.py", line 117, in add_ppa_signing_key
    ppa_info = get_ppa_info_from_lp(owner_name, ppa_name)
  File "/usr/lib/python2.7/dist-packages/softwareproperties/ppa.py", line 87, in get_ppa_info_from_lp
    return json.loads(lp_page)
  File "/usr/lib/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

Not sure if this is my new clean install (Ubuntu 12.04) or the repo, but I figured I would flag it for your attention in case the expected behavior is different. I'll keep playing around on my end. I have a dedicated server, so it seems silly to run an additional VM inside it, but I'm more than happy to do so.
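If apt-add-repository keeps failing to parse Launchpad's reply, one hypothetical workaround is to write the PPA's sources line by hand. The deb line below is an assumption based on Launchpad's usual PPA layout and Ubuntu 12.04's "precise" codename:

```shell
# Build the apt sources line for the PPA manually; substitute your release
# codename if you are not on 12.04 (precise).
codename=precise
ppa_line="deb http://ppa.launchpad.net/olivier-berten/geo/ubuntu $codename main"
echo "$ppa_line"
# Then (not run here):
#   echo "$ppa_line" | sudo tee /etc/apt/sources.list.d/olivier-berten-geo.list
# and import the PPA's signing key with apt-key before running apt-get update.
```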

Oh, and dstk is amazing - thank you for your work!

Internal Server Error on html2story, html2text

Trying to pull in the HTML from a news article returns Internal Server Error on both datasciencetoolkit.org and my Amazon AMI. It seems to happen on any legitimate news article (CNN or Reuters) I've tested; I have to cut the request down to roughly 3 kB before it succeeds.

Test article
http://www.reuters.com/article/2014/05/19/us-usa-security-imam-idUSBREA4I0NL20140519?feedType=RSS&feedName=domesticNews

HTTP/1.1 500 Internal Server Error
Date: Mon, 19 May 2014 21:45:16 GMT
Server: Apache/2.2.22 (Ubuntu)
Status: 500 Internal Server Error
Vary: Accept-Encoding
Content-Length: 630
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator,
 [no address given] and inform them of the time the error occurred,
and anything you might have done that may have
caused the error.</p>
<p>More information about this error may be available
in the server error log.</p>
<hr>
<address>Apache/2.2.22 (Ubuntu) Server at www.datasciencetoolkit.org Port 80</address>
</body></html>

New ec2setup.txt suggestion and hints...

This is my version of ec2setup.txt, modified to work on my own home-grown Ubuntu 12.04 LTS instance.

Start with AMI # ami-3fec7956 (Ubuntu 12.04), 32GB
(ec2-run-instances ami-3fec7956 -t m1.large --region us-east-1 -z us-east-1d --block-device-mapping /dev/sda1=:32:false -k )

sudo apt-add-repository -y ppa:olivier-berten/geo
sudo add-apt-repository -y ppa:webupd8team/java
sudo aptitude update
sudo aptitude safe-upgrade -y
sudo aptitude full-upgrade -y
sudo aptitude install -y build-essential apache2 apache2.2-common apache2-mpm-prefork apache2-utils libexpat1 ssl-cert postgresql libpq-dev ruby1.8-dev ruby1.8 ri1.8 rdoc1.8 irb1.8 libreadline-ruby1.8 libruby1.8 libopenssl-ruby sqlite3 libsqlite3-ruby1.8 git-core libcurl4-openssl-dev apache2-prefork-dev libapr1-dev libaprutil1-dev subversion postgresql-9.1-postgis autoconf libtool libxml2-dev libbz2-1.0 libbz2-dev libgeos-dev proj-bin libproj-dev ocropus pdftohtml catdoc unzip ant openjdk-6-jdk lftp php5-cli rubygems flex postgresql-server-dev-9.1 proj libjson0-dev xsltproc docbook-xsl docbook-mathml gettext postgresql-contrib-9.1 pgadmin3 python-software-properties bison dos2unix
sudo aptitude install -y oracle-java7-installer
sudo aptitude install -y libgdal-dev
sudo aptitude install -y libgeos++-dev
sudo bash -c 'echo "/usr/lib/jvm/java-7-oracle/jre/lib/amd64/server" > /etc/ld.so.conf.d/jvm.conf'
sudo ldconfig

Note that at this point you should create a new user called ubuntu. (I used my own user instead and had to modify various scripts and config files, as described below.)

mkdir ~/sources
cd ~/sources
wget http://download.osgeo.org/postgis/source/postgis-2.0.3.tar.gz
tar xfvz postgis-2.0.3.tar.gz
cd postgis-2.0.3
./configure --with-gui

If configure fails with the GEOS topology version error (see the "GEO Version question" issue below), rerun it without topology support:

./configure --with-gui --without-topology
If the GEOS version is incorrect then perform the following steps:

wget http://download.osgeo.org/geos/geos-3.3.8.tar.bz2
tar xjf geos-3.3.8.tar.bz2
cd geos-3.3.8
./configure
make
sudo make install
cd ~/sources/postgis-2.0.3
./configure --with-gui

Note that the above steps didn't work for me. It appears there should be a way to set up the load libraries correctly, but I gave up.

Otherwise, continue here:

make
sudo make install
sudo ldconfig
sudo make comments-install

sudo sed -i "s/ident/trust/" /etc/postgresql/9.1/main/pg_hba.conf
sudo sed -i "s/md5/trust/" /etc/postgresql/9.1/main/pg_hba.conf
sudo sed -i "s/peer/trust/" /etc/postgresql/9.1/main/pg_hba.conf
sudo /etc/init.d/postgresql restart
createdb -U postgres geodict

sudo -u postgres createdb template_postgis
sudo -u postgres psql -d template_postgis -f /usr/share/postgresql/9.1/contrib/postgis-2.0/postgis.sql
sudo -u postgres psql -d template_postgis -f /usr/share/postgresql/9.1/contrib/postgis-2.0/spatial_ref_sys.sql
sudo -u postgres psql -d template_postgis -f /usr/share/postgresql/9.1/contrib/postgis-2.0/postgis_comments.sql
sudo -u postgres psql -d template_postgis -f /usr/share/postgresql/9.1/contrib/postgis-2.0/rtpostgis.sql
sudo -u postgres psql -d template_postgis -f /usr/share/postgresql/9.1/contrib/postgis-2.0/raster_comments.sql
sudo -u postgres psql -d template_postgis -f /usr/share/postgresql/9.1/contrib/postgis-2.0/topology.sql
sudo -u postgres psql -d template_postgis -f /usr/share/postgresql/9.1/contrib/postgis-2.0/topology_comments.sql
sudo -u postgres psql -d template_postgis -f /usr/share/postgresql/9.1/contrib/postgis-2.0/legacy.sql
sudo -u postgres psql -d template_postgis -f /usr/share/postgresql/9.1/contrib/postgis-2.0/legacy_gist.sql
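The nine psql invocations above differ only in the script name, so they can be collapsed into a loop over the same contrib directory. This sketch is a dry run: it only prints the commands it would execute.

```shell
# Print (rather than run) one psql command per PostGIS setup script.
contrib=/usr/share/postgresql/9.1/contrib/postgis-2.0
scripts="postgis spatial_ref_sys postgis_comments rtpostgis raster_comments topology topology_comments legacy legacy_gist"
for s in $scripts; do
  echo "sudo -u postgres psql -d template_postgis -f $contrib/$s.sql"
done
```

Drop the `echo` to run the commands for real; the order matters (postgis.sql and spatial_ref_sys.sql must come before the topology and legacy scripts).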

cd ~/sources
git clone git://github.com/petewarden/dstk.git
git clone git://github.com/petewarden/dstkdata.git
cd dstk
sudo gem install bundler
sudo bundle install

cd ~/sources/dstkdata

If you want to save disk space and don't need geo-statistics, you can skip everything up until the comment indicating the end of the geostats loading.

I SKIPPED TO %%%%%%%%% BELOW

createdb -U postgres -T template_postgis statistics

tar xzf statistics/gl_gpwfe_pdens_15_bil_25.tar.gz
export PATH=$PATH:/usr/lib/postgresql/9.1/bin/
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I gl_gpwfe_pdens_15_bil_25/glds15ag.bil public.population_density | psql -U postgres -d statistics
rm -rf gl_gpwfe_pdens_15_bil_25
unzip statistics/glc2000_v1_1_Tiff.zip
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I Tiff/glc2000_v1_1.tif public.land_cover | psql -U postgres -d statistics
rm -rf Tiff

sudo mkdir /mnt/data
sudo chown pjm /mnt/data
cd /mnt/data

The zip files are here: http://gis-lab.info/data/srtm-tif/, or here http://srtm.csi.cgiar.org/ or here https://hc.app.box.com/shared/1yidaheouv password = ThanksCSI!

sudo curl -O "http://static.datasciencetoolkit.org.s3-website-us-east-1.amazonaws.com/SRTM_NE_250m.tif.zip"

unzip SRTM_NE_250m.tif.zip

I got the TIF files from here instead!

sudo curl -O "https://hc.box.net/shared/1yidaheouv/SRTM_SE_250m_TIF.rar"
unrar x SRTM_NE_250m_TIF.rar
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 SRTM_NE_250m.tif public.elevation | psql -U postgres -d statistics
rm -rf SRTM_NE_250m*
curl -O "http://static.datasciencetoolkit.org.s3-website-us-east-1.amazonaws.com/SRTM_W_250m.tif.zip"
unzip SRTM_W_250m.tif.zip
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -a SRTM_W_250m.tif public.elevation | psql -U postgres -d statistics
rm -rf SRTM_W_250m*
curl -O "http://static.datasciencetoolkit.org.s3-website-us-east-1.amazonaws.com/SRTM_SE_250m.tif.zip"
unzip SRTM_SE_250m.tif.zip
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -a -I SRTM_SE_250m.tif public.elevation | psql -U postgres -d statistics
rm -rf SRTM_SE_250m*

curl -O "http://static.datasciencetoolkit.org.s3-website-us-east-1.amazonaws.com/tmean_30s_bil.zip"
unzip tmean_30s_bil.zip
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_1.bil public.mean_temperature_01 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_2.bil public.mean_temperature_02 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_3.bil public.mean_temperature_03 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_4.bil public.mean_temperature_04 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_5.bil public.mean_temperature_05 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_6.bil public.mean_temperature_06 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_7.bil public.mean_temperature_07 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_8.bil public.mean_temperature_08 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_9.bil public.mean_temperature_09 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_10.bil public.mean_temperature_10 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_11.bil public.mean_temperature_11 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_12.bil public.mean_temperature_12 | psql -U postgres -d statistics
rm -rf tmean_*
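The twelve tmean imports above follow one pattern (and the prec_* block below is the same shape), so they can be generated with a loop that zero-pads the table suffix. Dry run again: this only prints the commands.

```shell
# Print one raster2pgsql|psql pipeline per month, zero-padding the table name.
for m in $(seq 1 12); do
  table=$(printf 'public.mean_temperature_%02d' "$m")
  echo "/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I tmean_${m}.bil $table | psql -U postgres -d statistics"
done
```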

curl -O "http://static.datasciencetoolkit.org.s3-website-us-east-1.amazonaws.com/prec_30s_bil.zip"
unzip prec_30s_bil.zip
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_1.bil public.precipitation_01 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_2.bil public.precipitation_02 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_3.bil public.precipitation_03 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_4.bil public.precipitation_04 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_5.bil public.precipitation_05 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_6.bil public.precipitation_06 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_7.bil public.precipitation_07 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_8.bil public.precipitation_08 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_9.bil public.precipitation_09 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_10.bil public.precipitation_10 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_11.bil public.precipitation_11 | psql -U postgres -d statistics
/usr/lib/postgresql/9.1/bin/raster2pgsql -s 4236 -t 32x32 -I prec_12.bil public.precipitation_12 | psql -U postgres -d statistics
rm -rf prec_*

unzip /home/pjm/sources/dstkdata/statistics/us_statistics_rasters.zip -d .
for f in *.tif; do raster2pgsql -s 4236 -t 32x32 -I $f $(basename $f .tif) | psql -U postgres -d statistics; done
rm -rf us

rm -rf metadata

This is the end of the geostats loading, continue from here if you decide to skip that part.

%%%%%%%% START HERE AGAIN

sudo gem install passenger
sudo passenger-install-apache2-module

You'll need to update the version number below to match whichever passenger version was actually installed.

This is what the build said:

LoadModule passenger_module /var/lib/gems/1.8/gems/passenger-5.0.18/buildout/apache2/mod_passenger.so

PassengerRoot /var/lib/gems/1.8/gems/passenger-5.0.18

PassengerDefaultRuby /usr/bin/ruby1.8

I changed the passenger version in the lines below to match what was found from the lines above:

sudo bash -c 'echo "LoadModule passenger_module /var/lib/gems/1.8/gems/passenger-5.0.18/buildout/apache2/mod_passenger.so" > /etc/apache2/mods-enabled/passenger.load'
sudo bash -c 'echo "PassengerRoot /var/lib/gems/1.8/gems/passenger-5.0.18" > /etc/apache2/mods-enabled/passenger.conf'
sudo bash -c 'echo "PassengerRuby /usr/bin/ruby1.8" >> /etc/apache2/mods-enabled/passenger.conf'
sudo bash -c 'echo "PassengerMaxPoolSize 3" >> /etc/apache2/mods-enabled/passenger.conf'
sudo sed -i "s/MaxRequestsPerChild[ \t][ \t]*[0-9][0-9]*/MaxRequestsPerChild 20/" /etc/apache2/apache2.conf
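This substitution is easy to get wrong, so it is worth rehearsing it on a sample line before touching the live apache2.conf. The sketch below uses `[[:space:]]*[0-9]*` so that any spacing and any existing value matches:

```shell
# Rehearse the MaxRequestsPerChild rewrite on a throwaway string.
sample='MaxRequestsPerChild       150'
rewritten=$(printf '%s\n' "$sample" | sed 's/MaxRequestsPerChild[[:space:]]*[0-9]*/MaxRequestsPerChild 20/')
echo "$rewritten"
```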

I needed to change the DocumentRoot to match the actual location where the data was installed. In my case the sources directory was /home/pjm/sources instead of /home/ubuntu/sources.

Ideally there should have been a new user called ubuntu, but I didn't know about this until I was too far into the process.

sudo bash -c 'echo "
<VirtualHost *:8000>
ServerName 127.0.1.1
DocumentRoot /home/pjm/sources/dstk/public
RewriteEngine On
RewriteCond %{HTTP_HOST} ^datasciencetoolkit.org$ [NC]
RewriteRule ^(.*)$ http://www.datasciencetoolkit.org$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^datasciencetoolkit.com$ [NC]
RewriteRule ^(.*)$ http://www.datasciencetoolkit.com$1 [R=301,L]
<Directory /home/pjm/sources/dstk/public>
AllowOverride all
Options -MultiViews
</Directory>
</VirtualHost>
" > /etc/apache2/sites-enabled/000-default'
sudo ln -s /etc/apache2/mods-available/rewrite.load /etc/apache2/mods-enabled/rewrite.load

sudo /etc/init.d/apache2 restart

sudo gem install postgres -v '0.7.9.2008.01.28'

cd ~/sources/dstk
./populate_database.rb

cd ~/sources
mkdir maxmind
cd maxmind
wget "http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz"
gunzip GeoLiteCity.dat.gz
wget "http://geolite.maxmind.com/download/geoip/api/c/GeoIP.tar.gz"
tar xzvf GeoIP.tar.gz
cd GeoIP-1.4.8/
libtoolize -f
./configure
make
sudo make install
cd ..
svn checkout svn://rubyforge.org/var/svn/net-geoip/trunk net-geoip
cd net-geoip/
ruby ext/extconf.rb
make
sudo make install

cd ~/sources
wget http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.11.tar.gz
tar -xvzf libiconv-1.11.tar.gz
cd libiconv-1.11
./configure --prefix=/usr/local/libiconv
make
sudo make install
sudo ln -s /usr/local/libiconv/lib/libiconv.so.2 /usr/lib/libiconv.so.2

createdb -U postgres -T template_postgis reversegeo

cd ~/sources
git clone git://github.com/petewarden/osm2pgsql
cd osm2pgsql/
./autogen.sh
sed -i 's/version = BZ2_bzlibVersion();//' configure
sed -i 's/version = zlibVersion();//' configure
./configure
make
sudo make install
cd ..

osm2pgsql -U postgres -d reversegeo -p world_countries -S osm2pgsql/styles/world_countries.style dstkdata/world_countries.osm -l
osm2pgsql -U postgres -d reversegeo -p admin_areas -S osm2pgsql/styles/admin_areas.style dstkdata/admin_areas.osm -l
osm2pgsql -U postgres -d reversegeo -p neighborhoods -S osm2pgsql/styles/neighborhoods.style dstkdata/neighborhoods.osm -l

The above commands take several hours to complete, so I started the next set of commands in a new window...

cd ~/sources
git clone git://github.com/petewarden/boilerpipe
cd boilerpipe/boilerpipe-core/
ant
cd src
javac -cp ../dist/boilerpipe-1.1-dev.jar boilerpipe.java

cd ~/sources/dstk/
psql -U postgres -d reversegeo -f sql/loadukpostcodes.sql

osm2pgsql -U postgres -d reversegeo -p uk_osm -S ../osm2pgsql/default.style ../dstkdata/uk_osm.osm.bz2 -l

psql -U postgres -d reversegeo -f sql/buildukindexes.sql

cd ~/sources
git clone git://github.com/geocommons/geocoder.git
cd geocoder
make
sudo make install

Build the latest Tiger/Line data for US address lookups

cd /mnt/data
mkdir tigerdata
cd tigerdata
lftp ftp2.census.gov:/geo/tiger/TIGER2012/EDGES
mirror --parallel=5 .
cd ../FEATNAMES
mirror --parallel=5 .
cd ../ADDR
mirror --parallel=5 .
exit
cd ~/sources/geocoder/build/
mkdir ../../geocoderdata/
./tiger_import ../../geocoderdata/geocoder2012.db /mnt/data/tigerdata/

Completed to here

cd ~/sources
git clone git://github.com/luislavena/sqlite3-ruby.git
cd sqlite3-ruby
ruby setup.rb config
ruby setup.rb setup
sudo ruby setup.rb install

cd ~/sources/geocoder
bin/rebuild_metaphones ../geocoderdata/geocoder2012.db
chmod +x build/build_indexes
build/build_indexes ../geocoderdata/geocoder2012.db
rm -rf /mnt/data/tigerdata

createdb -U postgres names
cd /mnt/data
curl -O "http://www.ssa.gov/oact/babynames/names.zip"
unzip names.zip
dos2unix yob*.txt
~/sources/dstk/dataconversion/analyzebabynames.rb . > babynames.csv
psql -U postgres -d names -f ~/sources/dstk/sql/loadnames.sql

Fix for postgres crashes:

sudo sed -i "s/shared_buffers = [0-9A-Za-z]*/shared_buffers = 512MB/" /etc/postgresql/9.1/main/postgresql.conf
sudo sysctl -w kernel.shmmax=576798720
sudo bash -c 'echo "kernel.shmmax=576798720" >> /etc/sysctl.conf'
sudo bash -c 'echo "vm.overcommit_memory=2" >> /etc/sysctl.conf'
sudo sed -i "s/max_connections = 100/max_connections = 200/" /etc/postgresql/9.1/main/postgresql.conf
sudo /etc/init.d/postgresql restart
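As a quick check on the numbers in this fix: 512 MB of shared_buffers is 536870912 bytes, which fits under the 576798720-byte shmmax being set above with headroom left for PostgreSQL's other shared allocations.

```shell
# 512 MB shared_buffers in bytes, compared against the new shmmax ceiling.
shared_buffers_bytes=$((512 * 1024 * 1024))
echo "$shared_buffers_bytes"
```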

Remove files not needed at runtime

rm -rf /mnt/data/*
rm -rf ~/sources/libiconv-1.11.tar.gz
rm -rf ~/sources/postgis-2.0.3.tar.gz
cd ~/sources/
mkdir dstkdata_runtime
mv dstkdata/ethnicityofsurnames.csv dstkdata_runtime/
mv dstkdata/GeoLiteCity.dat dstkdata_runtime/
rm -rf dstkdata
mv dstkdata_runtime dstkdata

Up to this point, you'll have a 0.50 version of the toolkit.

The following will upgrade you to version 0.51:

cd ~/sources/dstk
git pull origin master

I found that the toolkit was already up to date.

TwoFishes geocoder

cd ~/sources
mkdir twofishes
cd twofishes
mkdir bin
curl "http://www.twofishes.net/binaries/latest.jar" > bin/twofishes.jar
mkdir data

The source link above is obsolete

curl "http://www.twofishes.net/indexes/revgeo/latest.zip" > data/twofishesdata.zip

This one might work... it's unknown what was in latest.zip versus 2015-03-05.zip.

curl "http://www.twofishes.net/indexes/revgeo/2015-03-05.zip" > data/twofishesdata.zip

The ~/sources/dstk/twofishesd.sh script must be edited to point to the new directory.

change

java -Xmx1500M -jar /home/ubuntu/sources/twofishes/bin/twofishes.jar --hfile_basepath /home/ubuntu/sources/twofishes/data/latest/

to this

java -Xmx1500M -jar /home/pjm/sources/twofishes/bin/twofishes.jar --hfile_basepath /home/pjm/sources/twofishes/data/2015-03-05-20-05-30.753698/

The entire ~/sources/dstk/ directory should be checked for any reference to /home/ubuntu, which should be updated to point to /home/pjm instead.

I looked through dstk and found several instances like this:

cd ~/sources/dstk

grep '/home/ubuntu' *

geodict_daemon.rb:Daemons.run('/home/ubuntu/sources/dstk/dstk_server.rb', {

twofishes.conf:exec start-stop-daemon --start -c root --exec /home/ubuntu/sources/dstk/twofishesd.sh

twofishesd.sh:java -Xmx1500M -jar /home/ubuntu/sources/twofishes/bin/twofishes.jar --hfile_basepath /home/ubuntu/sources/twofishes/data/latest/
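The stragglers found by grep can be rewritten in bulk with sed. This sketch rehearses the substitution on a scratch file; in the real run you would feed `grep -rl '/home/ubuntu' .` into `xargs sed -i 's|/home/ubuntu|/home/pjm|g'` (/home/pjm being this author's home directory, so substitute your own).

```shell
# Rehearse the path rewrite on a temporary file before touching the repo.
f=$(mktemp)
echo "exec /home/ubuntu/sources/dstk/twofishesd.sh" > "$f"
sed -i 's|/home/ubuntu|/home/pjm|g' "$f"
result=$(cat "$f")
rm -f "$f"
echo "$result"
```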

cd data
unzip twofishesdata.zip

sudo cp ~/sources/dstk/twofishes.conf /etc/init/twofishes.conf
sudo service twofishes start

Here is what the VirtualHost block looked like before:

sudo bash -c 'echo "
<VirtualHost *:8000>
ServerName 127.0.1.1
DocumentRoot /home/pjm/sources/dstk/public
RewriteEngine On
RewriteCond %{HTTP_HOST} ^datasciencetoolkit.org$ [NC]
RewriteRule ^(.*)$ http://www.datasciencetoolkit.org$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^datasciencetoolkit.com$ [NC]
RewriteRule ^(.*)$ http://www.datasciencetoolkit.com$1 [R=301,L]
<Directory /home/pjm/sources/dstk/public>
AllowOverride all
Options -MultiViews
</Directory>
</VirtualHost>
" > /etc/apache2/sites-enabled/000-default'

It will now be changed to this:

sudo bash -c 'echo "
<VirtualHost *:8000>
ServerName 127.0.1.1
DocumentRoot /home/pjm/sources/dstk/public
RewriteEngine On
RewriteCond %{HTTP_HOST} ^datasciencetoolkit.org$ [NC]
RewriteRule ^(.*)$ http://www.datasciencetoolkit.org$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^datasciencetoolkit.com$ [NC]
RewriteRule ^(.*)$ http://www.datasciencetoolkit.com$1 [R=301,L]
# We have an internal TwoFishes server running on port 8081, so redirect
# requests that look like they belong to its API
ProxyPass /twofishes http://localhost:8081
<Directory /home/pjm/sources/dstk/public>
AllowOverride all
Options -MultiViews
Header set Access-Control-Allow-Origin "*"
Header set Cache-Control "max-age=86400"
</Directory>
</VirtualHost>
" > /etc/apache2/sites-enabled/000-default'
sudo ln -s /etc/apache2/mods-available/rewrite.load /etc/apache2/mods-enabled/rewrite.load
sudo ln -s /etc/apache2/mods-available/proxy.load /etc/apache2/mods-enabled/proxy.load
sudo ln -s /etc/apache2/mods-available/proxy_http.load /etc/apache2/mods-enabled/proxy_http.load
sudo ln -s /etc/apache2/mods-available/headers.load /etc/apache2/mods-enabled/headers.load

sudo /etc/init.d/apache2 restart

I now go to http://192.168.0.5:8000 and I get the datasciencetoolkit webpage along with all the tools!! Nice!!

Berlin, Germany is not recognized

Steps:
Feed "Berlin, Germany" into text2places

Expected result:
The capital of Germany is recognized!

Actual result:
No match is returned

street2coordinates fails after about 1000 queries

From email:

I'm running DSTK on an EC2 instance, and after roughly 1000 queries to
street2coordinates/ I get the following error. The error appears no
matter what street address is given. Rebooting the EC2 image fixes
it, but it reappears predictably after another 1000 or so queries. Is
it possible there query limits on the EC2 image? If so, how do I
remove them?

{"error":"street2coordinates error: #["/usr/lib/ruby/1.8/sqlite3/errors.rb:62:in `check'",
"/usr/lib/ruby/1.8/sqlite3/database.rb:79:in `initialize'",
"../geocoder/lib/geocoder/us/database.rb:38:in `new'",
"../geocoder/lib/geocoder/us/database.rb:38:in `initialize'",
"./dstk_server.rb:457:in `new'",
"./dstk_server.rb:457:in `street2coordinates'",
"./dstk_server.rb:953:in `GET /street2coordinates/*'",
...

GEO Version question from ec2setup.txt

Pete,

I am trying to build my own dstk server using the ec2setup.txt instructions.
I'm running Ubuntu 12.04 and have run into problems with configure (results below). It seems like there is something wrong with the TOPOLOGY support (or a version-check issue?).

Is there some way to upgrade the GEOS version that I should use to continue? What will happen if I use the --without-topology switch?
Do I need to go through the configure script and patch it? Any insight you can provide would be useful.

… cut here …
checking libxml/xpathInternals.h usability... yes
checking libxml/xpathInternals.h presence... yes
checking for libxml/xpathInternals.h... yes
checking for xmlInitParser in -lxml2... yes
checking for geos-config... /usr/bin/geos-config
checking GEOS version... 3.2.2
checking geos_c.h usability... yes
checking geos_c.h presence... yes
checking for geos_c.h... yes
checking for initGEOS in -lgeos_c... yes
checking whether make sets $(MAKE)... yes
checking for a BSD-compatible install... /usr/bin/install -c
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking whether NLS is requested... yes
checking for msgfmt... /usr/bin/msgfmt
checking for gmsgfmt... /usr/bin/msgfmt
checking for xgettext... /usr/bin/xgettext
checking for msgmerge... /usr/bin/msgmerge
checking for ld used by GCC... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for shared library run path origin... done
checking for CFPreferencesCopyAppValue... no
checking for CFLocaleCopyCurrent... no
checking for GNU gettext in libc... yes
checking whether to use NLS... yes
checking where the gettext function comes from... libc
checking proj_api.h usability... yes
checking proj_api.h presence... yes
checking for proj_api.h... yes
checking for pj_get_release in -lproj... yes
checking json/json.h usability... yes
checking json/json.h presence... yes
checking for json/json.h... yes
checking for json_object_get in -ljson... yes
GUI: Build requested, checking for dependencies (GTK+2.0)
checking for pkg-config... /usr/bin/pkg-config
checking for GTK+ - version >= 2.8.0... yes (version 2.24.10)
TOPOLOGY: Topology support requested
configure: error: Topology requires GEOS version >= 3.3.2. Use --without-topology or install a newer GEOS.
pjm@pjm-desktop:~/sources/postgis-2.0.3$

Locating U.S. cities

Geodict has trouble with some U.S. cities (and others, too, I imagine). Specifically, it doesn't seem to consider region names, so it doesn't flag things like "Brooklyn, NY" or "Chicago, IL" as named locations. It does recognize "Brooklyn, United States," but then there's the problem that it doesn't know which state is the one in question (here it defaults to Brooklyn, AL). And of course no one ever writes "Brooklyn, United States."

Looking at the code in geodict_lib.rb, it seems this shouldn't be the case, that regions/states should be matched. But it doesn't work that way when I invoke it from the web-based interface and I'm not a good enough programmer to see what the issue might be.

Note that this isn't related to the database population issue in bug #7 (#7), which I've fixed on my machine (with your patch to the populate_database script).

Also (forgive me if this should be a separate bug), the speed-optimized matching from the end of the string forward seems to produce problems with some multi-word city names. For example, "San Francisco, United States" is matched as "Francisco, United States." I imagine taking regions into account would help, but it wouldn't solve the problem. To wit, "New York, NY" and "York, NY."

Cities getting overwritten in geodict/text2places database

I'm seeing something strange in the cities table; it looks as though a lot of cities that are in the source data are missing from the populated geodict database, possibly getting clobbered on import.

Take Brooklyn, for example. In worldcitiespop.csv, grep finds 49 entries for 'brooklyn' (42 of which are in the US); in the geodict database, there are five entries for 'brooklyn', only one of which is in the US (and the US entry is in Alabama). The same seems to be true of other US cities like Rochester and Boston, each of which is found only once in the US (and in an alphabetically early state like AL or CA). Are the others getting clobbered on import? Or am I maybe making a mistake in looking through the database (not much experience with MySQL here).

The SQL query I'm using is:

SELECT city, country, region_code, population, lat, lon FROM cities WHERE city = 'Brooklyn';
Other things that might be relevant:

The populate_database.py script produces two errors when I run it:
./populate_database.py:49: Warning: Data truncated for column 'last_word' at row 1 (city, country, region_code, population, lat, lon, last_word))

./populate_database.py:49: Warning: Data truncated for column 'city' at row 1 (city, country, region_code, population, lat, lon, last_word))

populate_database.py won't work at all unless I first create the geodict database by hand, even though it looks as though the script is meant to handle that.

System info:

uname -a

Darwin wilkens-imac.wustl.edu 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386

mysql --version

mysql Ver 14.14 Distrib 5.1.56, for apple-darwin10.3.0 (i386) using readline 5.1

Any other info I can provide? Happy to do any kind of debugging that might help. Thanks!

URL encode method in Ruby gem might be buggy

Hey,

I noticed that the method DSTK::DSTK.u sometimes produces unexpected results on Ruby 1.9.3-p392, for example:

irb(main):005:0> dstk.u "1600 Amphitheatre Parkway Mountain View CA"
ArgumentError: invalid value for Integer(): " "
    from /home/user/.rvm/gems/ruby-1.9.3-p392/gems/dstk-0.50.2/lib/dstk.rb:38:in `sprintf'
    from /home/user/.rvm/gems/ruby-1.9.3-p392/gems/dstk-0.50.2/lib/dstk.rb:38:in `block in u'
    from /home/user/.rvm/gems/ruby-1.9.3-p392/gems/dstk-0.50.2/lib/dstk.rb:38:in `gsub'
    from /home/user/.rvm/gems/ruby-1.9.3-p392/gems/dstk-0.50.2/lib/dstk.rb:38:in `u'
    from (irb):5
    from /home/user/.rvm/rubies/ruby-1.9.3-p392/bin/irb:13:in `<main>'

I haven't been able to track down where the code of the gem is hosted, so I'll just report it here. In my case I've simply resolved it by using the encoding provided by open-uri:

require 'open-uri'

  def u(value)
    URI::encode(value)
  end

Of course this could be solved even more neatly by simply removing the u method and using this encoding method directly. I've tried to look up the URL mentioned in the comments (http://web.elctech.com/?p=58) but it appears to be down, so I'm not sure whether there's some edge case at play here or if the method is actually faulty.

No postal code returned (even if it's provided!)

Here is a call that returns no postal code:
http://www.datasciencetoolkit.org/maps/api/geocode/json?address=4699+Auburn+Blvd%2C+Sacramento%2C+CA%2C+USA

even if I include postal code:
http://www.datasciencetoolkit.org/maps/api/geocode/json?address=4699+Auburn+Blvd%2C+Sacramento%2C+CA+95841%2C+USA

No postal code is returned while other structured data is. Worth noting that Google returns a postal code in this case:

Street: 4699 Auburn Boulevard
City: Sacramento
State: CA
Zip: 95841
Latitude: 38.653747
Longitude: -121.3545975
Country: US

(and the lat/lng are similar)

Is this to be expected? It's just a bit strange to me.
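As a stopgap until the API returns a postal_code component, a client can fall back to extracting the ZIP from its own query string when one was supplied. A minimal sketch (the helper name and regex are mine, not part of DSTK):

```python
import re

def zip_from_address(address):
    """Best-effort extraction of a US ZIP (or ZIP+4 prefix) from an address string."""
    match = re.search(r"\b(\d{5})(?:-\d{4})?\b", address)
    return match.group(1) if match else None

print(zip_from_address("4699 Auburn Blvd, Sacramento, CA 95841, USA"))  # 95841
```

This obviously only helps when the caller already included a postal code, as in the second URL above.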

Google style geocoder returning inconsistent results to same query

When querying cities in Canada, the Google-style geocoder occasionally returns results in Europe. This seems to happen randomly. For example, querying through the web interface using

"100 Duncan St Toronto ON Canada", I could press refresh and toggle back and forth between the following results, seemingly at random:

[{"address_components":[{"short_name":"20","types":["administrative_area_level_1","political"],"long_name":"20"},{"short_name":"tr","types":["country","political"],"long_name":"Turkey"}],"types":["administrative_area_level_1","political"],"geometry":{"location_type":"APPROXIMATE","location":{"lat":38.9167,"lng":40.3},"viewport":{"southwest":{"lat":37.9167,"lng":39.3},"northeast":{"lat":39.9167,"lng":41.3}}}}]

[{"geometry":{"location_type":"APPROXIMATE","location":{"lng":-79.4163,"lat":43.70011},"viewport":{"southwest":{"lng":-79.6427230835,"lat":43.5466194153},"northeast":{"lng":-79.2320251465,"lat":43.8083610535}}},"types":["locality","political"],"address_components":[{"short_name":"Toronto","long_name":"Toronto, ON, CA","types":["locality","political"]},{"short_name":"CA","long_name":"Canada","types":["country","political"]}]}]

html2text crashes on BBC News front page

Steps to reproduce:
html2text http://bbc.co.uk/news

Result:
Ruby code fails with
ERROR TypeError: expected Hash (got String) for param `1'
/opt/local/lib/ruby/gems/1.8/gems/rack-1.2.2/lib/rack/utils.rb:93:in `normalize_params'
/opt/local/lib/ruby/gems/1.8/gems/rack-1.2.2/lib/rack/utils.rb:94:in `normalize_params'
...

Python street2coordinates UnicodeDecodeError

For the address "7332 CIRCULO PAPAYO, CARLSBAD,CA 92009" I get an exception "UnicodeDecodeError: 'utf8' codec can't decode byte 0xed ...".

The stack trace points to dstk.py, line 106, in street2coordinates
response = json.loads(response_string)

Adding
response_string = unicode(response_string, 'latin-1')
response = json.loads(response_string)

Does the trick.

I'll make a pull request.
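In Python 3 terms, a slightly more defensive version of that patch tries UTF-8 first and only falls back to Latin-1 when decoding fails (the helper name is mine):

```python
import json

def parse_response(response_string):
    """Parse a JSON response body that may be UTF-8 or Latin-1 encoded."""
    if isinstance(response_string, bytes):
        try:
            response_string = response_string.decode("utf-8")
        except UnicodeDecodeError:
            # Latin-1 maps every possible byte value, so this fallback cannot fail.
            response_string = response_string.decode("latin-1")
    return json.loads(response_string)

print(parse_response(b'{"street": "CIRCULO PAPAYO"}'))
```

Decoding unconditionally with latin-1, as in the patch above, also works, but mangles responses that really are UTF-8.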

Google-style geocoder has difficulty parsing less formal addresses

I've set up an AWS instance running version 0.51 and am having issues getting it to recognize common versions of some addresses.

For example, the Village Voice building in New York City is located at

36 Cooper Square, New York City, New York 10003

However, the following versions of the address all result in a Lat/Lng located in Turkey.

36 Cooper Square, New York City
36 Cooper Square, nyc
36 Cooper Square, nyc, ny
36 Cooper Square, 10003

Example URL:

/maps/api/geocode/json?sensor=false&address=36%20cooper%20sq,%20nyc,%20ny

The resulting JSON looks like:

{
  "status": "OK",
  "results": [
    {
      "address_components": [
        {
          "long_name": "36",
          "types": [
            "administrative_area_level_1",
            "political"
          ],
          "short_name": "36"
        },
        {
          "long_name": "Turkey",
          "types": [
            "country",
            "political"
          ],
          "short_name": "tr"
        }
      ],
      "geometry": {
        "location_type": "APPROXIMATE",
        "viewport": {
          "southwest": {
            "lat": 38.9389,
            "lng": 34.3244
          },
          "northeast": {
            "lat": 40.9389,
            "lng": 36.3244
          }
        },
        "location": {
          "lat": 39.9389,
          "lng": 35.3244
        }
      },
      "types": [
        "administrative_area_level_1",
        "political"
      ]
    }
  ]
}

html2story UTF-8 issue

From email:

I had just tried to mess with the html2story API, and sent a UTF-8 encoded HTML string in. The results were great, except all the accented characters (e.g. [áéíóöőúüű], all the Hungarian vowels) were sent back as "??".

New York, NY is not recognized

From Gabe Gaster on dstk-users:

I'm a relatively new user with a noob kind of question. I understand
that geodict is designed to be intolerant of false positives, but no
matter how I try I can't seem to get geodict to recognize the place
"New York NY" or similar text. Is there a way around this or to make
geodict more tolerant of false positives or is YahooPlaceFinder more
suited for that?

file2text broken?

Trying to convert pdf/jpeg to text, I'm getting:

{ "error": "Error when converting file to text" }

and for docx, just:

Internal Server Error

Same issue on datasciencetoolkit.org and my own local instance. I've tried several files to (hopefully) rule out an isolated issue with my data.

Vagrant VM download failure

Hi, I tried to download the Vagrant VM but received this error:

$ vagrant box add dstk http://where.jetpac.com.s3.amazonaws.com/dstk_0.51.box                                                   
Downloading or copying the box...
Extracting box...te: 2512k/s, Estimated time remaining: 0:00:01)
The box failed to unpackage properly. Please verify that the box
file you're trying to add is not corrupted and try again. The
output from attempting to unpackage (if any):

x box-disk1.vmdk: Write failed
x box.ovf: Write to restore size failed
x include: Write to restore size failed
x include/_Vagrantfile: Write to restore size failed
x Vagrantfile: Write to restore size failed
bsdtar: Error exit delayed from previous errors.

Docker Image

Hi there,

I'd love to stick DSTK into our Mesos cluster, and a Docker image would be perfect for that. Has there been any work done on this? Would it be a welcome pull request?

Scripts fail with Python 3.2

From email:

I'm using terminal on my Mac running 10.6.7 with Python 3.2 as the
default.

On the Command Line I put, html2text http://nytimes.com | text2people
I get the following error:

don-larsons-imac:~ dwlarson$ html2text http://nytimes.com |
text2people
File "/usr/bin/dstk.py", line 119
print api_body
^
SyntaxError: invalid syntax
File "/usr/bin/dstk.py", line 119
print api_body
^
SyntaxError: invalid syntax
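For context, the traceback is just Python 2's print statement failing under Python 3; dstk.py would need the function form to run on both:

```python
# Python 2's statement form, as used in dstk.py line 119:
#     print api_body
# is a SyntaxError under Python 3. The function-call form works on 2.6+ and 3.x:
api_body = "example payload"  # placeholder value for illustration
print(api_body)
```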

coordinates2politics doesn't handle bare inputs

From email:

I'm having trouble calling the coordinates2politics API. I tried the curl example in /developerdocs, and also coordinates for other addresses, and each time (all from my terminal) I get an "internal server error". Am I missing something in my setup, or is the server down?
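One thing worth checking before blaming the server: my reading of the developer docs is that coordinates2politics expects its POST body to be a JSON-encoded array of [latitude, longitude] pairs, not a bare "lat,lon" string. Building such a body in Python:

```python
import json

# Even a single coordinate pair needs to be wrapped in an outer array
# (assumption based on the /developerdocs curl example).
coordinates = [[37.769456, -122.429128]]
body = json.dumps(coordinates)
print(body)  # [[37.769456, -122.429128]]
```

The resulting string can then be POSTed to /coordinates2politics, e.g. with curl -d.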

Ruby gem does not support the geocode API

I noticed that the Ruby gem does not currently support the Google-style Geocoder method. As with #24 I'm not sure where the code can be found so I'm not able to submit a pull request. In any case, adding support for this method should be as simple as incorporating the following:

  def geocode(text)
    response = json_api_call('/maps/api/geocode/json', {"address" => text})
    response
  end

Encoding issue on gem test 'test_coordinates2statistics'

The 'test_coordinates2statistics' test is failing right now. It appears to be an encoding issue between the 'expected' and the 'response'.

/rubygem/test/test_dstk.rb:81

(rdb:1) response[0]['statistics']['population_density']['description'].encoding
#<Encoding:UTF-8>
(rdb:1) expected[0]['statistics']['population_density']['description'].encoding
#<Encoding:US-ASCII>

The problem lies with the '-' in the description of SEDAC. If I compare the first part of the string, then assert_equal passes:

(rdb:1) ex =  expected[0]['statistics']['population_density']['description'][0..20]
"The number of inhabit"
(rdb:1) re =  response[0]['statistics']['population_density']['description'][0..20]
"The number of inhabit"
(rdb:1) assert_equal ex, re
true

The shortest route is to change the test, though there may be more we want to do to clean it up.

Google style geocoder yields incorrect results

example:

27 Saint Lukes RD Allston MA 02134
dstk places this at: 27 Sandy Way, Weymouth, MA

71 Rockdale ST Mattapan MA 02136
dstk: 71 Rockdale St, Braintree, MA

698 Eighth ST South Boston MA 03108
dstk: 198 Boston St, Manchester, NH

Size issue

I'm following your EC2 setup instructions as a guide to install on a non-EC2 server. I noticed this is taking up quite a bit of space. Is there any way to specify the geographical data that's used, to save space and bandwidth?

Error running Boilerpipe

I've set up my own server from the Vagrant box and am trying to use /html2story, but it's throwing an error:

petey$ curl -d "<html><head><title>MyTitle</title></head><body><scrit type="text/javascript">something();</script><div>Some actual text</div></body></html>" "http://myserver.org:8080/html2story"
<?xml version="1.0" encoding="utf-8"?><error>Error running Boilerpipe</error>

/html2text (and everything else I've tried) works fine (Pete, when we emailed about this, had used the wrong endpoint, i.e. this one):

petey$ curl -d "<html><head><title>MyTitle</title></head><body><script type="text/javascript">something();</script><div>Some actual text</div></body></html>" "http://myserver.org:8080/html2story"
{
  "text": "Some ctual text\n"}

I don't have a ton of RAM or CPU in this machine; could it be running into some kind of error a la TwoFishes? I searched for 'Boilerpipe' in the issues and didn't find anything. I'm assuming it's the inadequacy of my hardware, but figured I would report it so it could at least be a known issue.

ec2setup.txt instructions incomplete.

Hello,
I'm trying to create an installation of the Data Science Toolkit on a fresh Ubuntu 12.04 installation running on VMWare in my datacenter. I've been following the steps listed in ec2setup.txt, but they seem to be missing a step.

Line 49 references the directory ~/sources/dstkdata, which has not yet been created. In this directory I am supposed to untar gl_gpwfe_pdens_15_bil_25.tar.gz, but I have no idea where to get that file. glc2000_v1_1_Tiff.zip, SRTM_NE_250m.tif.zip, SRTM_W_250m.tif.zip, SRTM_SE_250m.tif.zip, and tmean_30s_bil.zip also lack a source.

By chance, is there a version of the Data Science Toolkit that can run on VMWare? I understand you used to support this.

Virtual machine server error on missing bounds

Reproducing the error
Using the vagrant image from http://where.jetpac.com.s3.amazonaws.com/dstk_0.51.box, request the location for 'punta' http://localhost:8080/maps/api/geocode/json?address=punta&language=en&sensor=false, this will result in an internal server error.

Analysis
The issue seems to lie in 'emulategoogle.rb', line 312. Checking that bounds is not nil provides at least a temporary fix for the issue.


Edit: The location 'punta del hidalgo' (http://localhost:8080/maps/api/geocode/json?address=punta+del+hidalgo&language=en&sensor=false) seems to result in a couple of null pointer exceptions.

Internal Server Error when I use Google Style Geocoder

Hi,

I used the Amazon AMI ID to launch the server, and when I paste my public DNS name into the browser I can see the home page. The problem arises when I use the Google-style geocoder API: it returns an internal error.

Request - http://www.datasciencetoolkit.org/maps/api/geocode/json?sensor=false
&address=1600+Amphitheatre+Parkway,+Mountain+View,+CA

It returns a result, but when I try this:

Request - ec2-45-567-3-75.compute-1.amazonaws.com/maps/api/geocode/json?sensor=false
&address=1600+Amphitheatre+Parkway,+Mountain+View,+CA (Changed DNS Name)

It shows an internal server error. My server is running, since I can see the homepage, but the APIs are not. Could you please help with this?

Regards

Unexpected Google-style geocoding results

During some testing I noticed that for some strings the results produced by the Google-style geocoder of the dstk API are at least somewhat unexpected.

For example, the following string:

28 2nd St, San Francisco, CA 94105, USA

Results in a response that shows a location somewhere on the border between Turkey and Syria, while Google is able to geocode this properly.

As a workaround I noticed that removing the ', USA' part restores the output to the expected value, but it would be great if the geocoder also worked with the more international format.
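Until the server handles the international format, a client can normalize queries by dropping a trailing country token before geocoding. A sketch (the helper name and suffix list are mine):

```python
COUNTRY_SUFFIXES = {"usa", "us", "united states"}

def strip_country(address):
    """Drop a trailing ', USA'-style country component from an address string."""
    parts = [part.strip() for part in address.split(",")]
    if parts and parts[-1].lower() in COUNTRY_SUFFIXES:
        parts = parts[:-1]
    return ", ".join(parts)

print(strip_country("28 2nd St, San Francisco, CA 94105, USA"))
# 28 2nd St, San Francisco, CA 94105
```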

Pypi Version Mismatch

The dstk package on Pypi has required_version set to 130, but all the servers seem to be on version 40 or 41. The required_version in dstk.py in this repository still has it set to 40. Not sure which is correct.

Thanks for the great project!

Frequent internal errors when using the DSTK AMI

Lately we've been seeing a lot of internal errors returned by the DSTK API for a wide range of queries, which makes me wonder if we've perhaps misconfigured it somehow. Each of these exceptions boils down to the following:

23.0.0.109 - - [11/Aug/2013 16:41:53] "GET /maps/api/geocode/json?address=Mansfield,%20TX,%20US " 200 1104 0.0025
54.0.0.128 - - [11/Aug/2013 16:41:54] "GET /info " 200 19 0.0006
54.0.0.128 - - [11/Aug/2013 16:42:06] "GET /info " 200 19 0.0006
ERROR:  relation "postal_codes" does not exist
LINE 1: DECLARE myportal CURSOR FOR SELECT * FROM postal_codes WHERE...
                                                  ^
SystemExit - exit:
 /home/ubuntu/sources/dstk/geodict_lib.rb:776:in `exit'
 /home/ubuntu/sources/dstk/geodict_lib.rb:776:in `select_as_hashes'
 /home/ubuntu/sources/dstk/geodict_lib.rb:563:in `is_postal_code'
 /home/ubuntu/sources/dstk/geodict_lib.rb:83:in `send'
 /home/ubuntu/sources/dstk/geodict_lib.rb:83:in `find_locations_in_text'
 /home/ubuntu/sources/dstk/geodict_lib.rb:776:in `each_with_index'
 /home/ubuntu/sources/dstk/geodict_lib.rb:70:in `each'
 /home/ubuntu/sources/dstk/geodict_lib.rb:70:in `each_with_index'
 /home/ubuntu/sources/dstk/geodict_lib.rb:70:in `find_locations_in_text'
 /home/ubuntu/sources/dstk/geodict_lib.rb:60:in `each'
 /home/ubuntu/sources/dstk/geodict_lib.rb:60:in `find_locations_in_text'
 /home/ubuntu/sources/dstk/emulategoogle.rb:41:in `google_geocoder_api_call'
 ./dstk_server.rb:1322:in `GET /maps/api/geocode/:format'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:1125:in `call'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:1125:in `compile!'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:709:in `instance_eval'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:709:in `route_eval'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:693:in `route!'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:741:in `process_route'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:738:in `catch'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:738:in `process_route'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:692:in `route!'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:691:in `each'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:691:in `route!'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:826:in `dispatch!'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:619:in `call!'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:791:in `instance_eval'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:791:in `invoke'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:791:in `catch'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:791:in `invoke'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:619:in `call!'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:604:in `call'
 /var/lib/gems/1.8/gems/rack-1.2.1/lib/rack/methodoverride.rb:24:in `call'
 /var/lib/gems/1.8/gems/rack-1.2.1/lib/rack/commonlogger.rb:18:in `call'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:1237:in `call'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:1263:in `synchronize'
 /var/lib/gems/1.8/gems/sinatra-1.2.0/lib/sinatra/base.rb:1237:in `call'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/rack/request_handler.rb:96:in `process_request'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_request_handler.rb:516:in `accept_and_process_next_request'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_request_handler.rb:274:in `main_loop'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/rack/application_spawner.rb:206:in `start_request_handler'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/rack/application_spawner.rb:171:in `send'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/rack/application_spawner.rb:171:in `handle_spawn_application'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/utils.rb:470:in `safe_fork'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/rack/application_spawner.rb:166:in `handle_spawn_application'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_server.rb:357:in `__send__'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_server.rb:357:in `server_main_loop'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_server.rb:206:in `start_synchronously'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_server.rb:180:in `start'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/rack/application_spawner.rb:129:in `start'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/spawn_manager.rb:253:in `spawn_rack_application'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_server_collection.rb:132:in `lookup_or_add'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/spawn_manager.rb:246:in `spawn_rack_application'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_server_collection.rb:82:in `synchronize'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_server_collection.rb:79:in `synchronize'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/spawn_manager.rb:244:in `spawn_rack_application'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/spawn_manager.rb:137:in `spawn_application'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/spawn_manager.rb:275:in `handle_spawn_application'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_server.rb:357:in `__send__'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_server.rb:357:in `server_main_loop'
 /var/lib/gems/1.8/gems/passenger-3.0.19/lib/phusion_passenger/abstract_server.rb:206:in `start_synchronously'
 /var/lib/gems/1.8/gems/passenger-3.0.19/helper-scripts/passenger-spawn-server:99

Our installation has the latest revisions from Git on it. Is there perhaps some way to resolve these postal_codes errors?
