HiveUDFs

My Personal Collection of Hive UDFs

Compiling

This is a Maven project. To compile it, run:

mvn install

Function Lists

Below is the list of functions currently in this project. More to come!

LongToIP

LongToIP translates an IP address from long (integer) format to dotted string format.

Usage:

ADD JAR HiveUDFs.jar;
CREATE TEMPORARY FUNCTION longtoip as 'net.petrabarus.hiveudfs.LongToIP';
SELECT longtoip(2130706433) FROM table;
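The conversion itself can be illustrated in plain Java. This is a minimal sketch, not the project's actual source: the class name `LongToIpSketch`, the method name `longToIp`, and the choice to reject out-of-range values are all assumptions made for illustration.

```java
// Hypothetical standalone sketch; not the code shipped in HiveUDFs.jar.
final class LongToIpSketch {

    // Converts an unsigned 32-bit IPv4 value (held in a long) to dotted-quad text.
    static String longToIp(long ip) {
        if (ip < 0 || ip > 0xFFFFFFFFL) {
            // Assumption: out-of-range input is rejected rather than wrapped.
            throw new IllegalArgumentException("not a 32-bit IPv4 value: " + ip);
        }
        // Each octet is one byte of the value, most significant byte first.
        return ((ip >> 24) & 0xFF) + "."
             + ((ip >> 16) & 0xFF) + "."
             + ((ip >> 8) & 0xFF) + "."
             + (ip & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(longToIp(2130706433L)); // prints 127.0.0.1
    }
}
```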

IPToLong

IPToLong translates an IP address from dotted string format to long (integer) format.

Usage:

ADD JAR HiveUDFs.jar;
CREATE TEMPORARY FUNCTION iptolong as 'net.petrabarus.hiveudfs.IPToLong';
SELECT iptolong("127.0.0.1") FROM table;
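The reverse conversion can be sketched the same way in plain Java. The class name `IpToLongSketch`, the method name `ipToLong`, and the strict validation of octets are assumptions for illustration; the project's real UDF may handle malformed input differently.

```java
// Hypothetical standalone sketch; not the code shipped in HiveUDFs.jar.
final class IpToLongSketch {

    // Parses dotted-quad IPv4 text into its unsigned 32-bit value, held in a long.
    static long ipToLong(String ip) {
        // Limit -1 keeps trailing empty strings, so "1.2.3.4." is rejected below.
        String[] parts = ip.split("\\.", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four octets: " + ip);
        }
        long result = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part); // throws NumberFormatException on bad text
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            // Shift the accumulated value one byte left and append the new octet.
            result = (result << 8) | octet;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ipToLong("127.0.0.1")); // prints 2130706433
    }
}
```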

GeoIP

GeoIP wraps the MaxMind GeoIP lookup for Hive. It is derived from @edwardcapriolo's hive-geoip. A separate GeoIP database file is needed to run the function. The function takes three arguments:

  1. IP address in long
  2. IP attribute (e.g. COUNTRY, CITY, REGION, etc. See full list in the javadoc.)
  3. Database file name

A lite version of the MaxMind GeoIP database can be obtained from [here](http://dev.maxmind.com/geoip/geolite).

Usage:

ADD JAR HiveUDFs.jar;
ADD FILE /usr/share/GeoIP/GeoIPCity.dat;
CREATE TEMPORARY FUNCTION geoip as 'net.petrabarus.hiveudfs.GeoIP';
SELECT GeoIP(cast(ip AS bigint), 'CITY', './GeoIPCity.dat') FROM table;

SearchEngineKeyword

SearchEngineKeyword is a simple function that extracts the search keyword from a referrer URL coming from Google, Bing, or Yahoo. It still needs to be expanded to cover more search engines.

Usage:

ADD JAR HiveUDFs.jar;
CREATE TEMPORARY FUNCTION searchenginekeyword as 'net.petrabarus.hiveudfs.SearchEngineKeyword';
SELECT searchenginekeyword(url) FROM table;
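The kind of extraction described above can be sketched in plain Java. This is an illustration under stated assumptions, not the project's implementation: the class name `SearchKeywordSketch`, the method name `extractKeyword`, the host matching by substring, and returning `null` for unrecognized referrers are all hypothetical. It relies on Google and Bing carrying the search terms in the `q` query parameter and Yahoo in `p`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;

// Hypothetical standalone sketch; not the code shipped in HiveUDFs.jar.
final class SearchKeywordSketch {

    // Returns the decoded search terms, or null if the referrer is not a
    // recognized search engine URL or carries no keyword parameter.
    static String extractKeyword(String referrer) {
        try {
            URI uri = new URI(referrer);
            String host = uri.getHost();
            String query = uri.getRawQuery();
            if (host == null || query == null) {
                return null;
            }
            host = host.toLowerCase();

            // Pick the query parameter that holds the search terms.
            String param;
            if (host.contains("google.") || host.contains("bing.")) {
                param = "q";
            } else if (host.contains("yahoo.")) {
                param = "p";
            } else {
                return null;
            }

            // Scan the raw query string for the first matching key=value pair.
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                if (eq > 0 && pair.substring(0, eq).equals(param)) {
                    // URLDecoder turns '+' into a space and resolves %XX escapes.
                    return URLDecoder.decode(pair.substring(eq + 1), "UTF-8");
                }
            }
            return null;
        } catch (URISyntaxException | UnsupportedEncodingException | IllegalArgumentException e) {
            // Assumption: malformed referrers yield null instead of failing the query.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(extractKeyword("http://www.google.com/search?q=hive+udf&ie=UTF-8"));
    }
}
```

Returning `null` for anything unrecognized keeps a single bad referrer from failing a whole query, which is a common choice for UDFs applied to raw log data.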

UCWords

UCWords is a UDF equivalent to PHP's ucwords(): it uppercases the first character of each word in a string.

Usage:

ADD JAR HiveUDFs.jar;
CREATE TEMPORARY FUNCTION ucwords as 'net.petrabarus.hiveudfs.UCWords';
SELECT ucwords(text) FROM table;
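The behavior of PHP's ucwords() can be sketched in plain Java as follows. The class name `UcWordsSketch` and the use of `Character.isWhitespace` to detect word boundaries are assumptions for illustration; the project's UDF may define word boundaries differently. As in PHP, only the first character of each word is changed and the rest is left as-is.

```java
// Hypothetical standalone sketch; not the code shipped in HiveUDFs.jar.
final class UcWordsSketch {

    // Uppercases the first character of every whitespace-delimited word.
    static String ucwords(String text) {
        StringBuilder out = new StringBuilder(text.length());
        boolean startOfWord = true;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            out.append(startOfWord ? Character.toUpperCase(c) : c);
            // The next character starts a word only if this one is whitespace.
            startOfWord = Character.isWhitespace(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(ucwords("hello world")); // prints Hello World
    }
}
```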

Copyright and License

MaxMind GeoIP is a trademark of MaxMind LLC.
