
node-impala's Introduction

node-impala

Node client for Impala

Install

$ npm install --save node-impala

Usage

See the issue before using this module.

import { createClient } from 'node-impala';

const client = createClient();

client.connect({
  host: '127.0.0.1',
  port: 21000,
  resultType: 'json-array'
});

client.query('SELECT column_name FROM table_name')
  .then(result => console.log(result))
  .catch(err => console.error(err))
  .done(() => client.close().catch(err => console.error(err)));

Example

Bloomery: a web UI for Impala that uses this client to execute queries.

Options

host

Type: string
Default: '127.0.0.1'

If you are connecting to the Thrift server remotely, for example to a Cloudera virtual machine from your host machine, set this to the inet address of the virtual machine, which you can find with the ifconfig command in a terminal. If you are connecting from inside the virtual machine, leave it at its default.

port

Type: number
Default: 21000

The default value corresponds to the Impala Daemon Frontend Port, which Beeswax uses to transmit commands and receive results.

resultType

Type: string
Default: null

Returns the result of a query in the given format.

Available values:

  • json-array returns a JSON array
  • map returns a Map of column names to their row values
  • boolean returns true if the query is successful
  • null returns the query results and the table schema as an array

timeout

Type: number
Default: 1000

Timeout for closing the transport after the process has finished.
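
For reference, the documented defaults can be collected into one object and merged with your own overrides before calling connect. This is only an illustrative sketch: DEFAULT_PROPS and withDefaults are hypothetical names, not part of node-impala, and the library's own merging logic is not shown in this README.

```javascript
// Hypothetical helper, not part of node-impala: the defaults listed in the
// Options section, merged with caller-supplied overrides.
const DEFAULT_PROPS = Object.freeze({
  host: '127.0.0.1',
  port: 21000,
  resultType: null,
  timeout: 1000,
});

function withDefaults(props = {}) {
  // Later spreads win, so any key in `props` overrides the default.
  return { ...DEFAULT_PROPS, ...props };
}
```

The merged object could then be passed to client.connect(withDefaults({ resultType: 'json-array' })).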

API

createClient()

Creates a client that uses BeeswaxService.

const client = createClient();

client.connect(props, callback)

Creates a connection using the given props.

client.connect({ resultType: 'boolean' })
  .then(message => console.log(message))
  .catch(error => console.error(error));

client.close(callback)

Closes the current connection.

client.close().catch((err) => console.error(err));

client.explain(sql, callback)

Gets the query plan for a query.

client.explain(sql)
  .then(explanation => console.log(explanation))
  .catch(err => console.error(err));

client.getResultsMetadata(sql, callback)

Gets the result metadata.

client.getResultsMetadata(sql)
  .then(metaData => console.log(metaData))
  .catch(err => console.error(err));

client.query(sql, callback)

Transmits a SQL command and receives the result via the Beeswax service asynchronously. The fetch size is fixed at 1024, so a query returns at most 1024 rows.

client.query(sql)
  .then(results => console.log(results))
  .catch(err => console.error(err));

Versions

Versions 1.x.x and 2.x.x use Impala chd5-2.1_5.3.0

License

MIT © Ömer Ufuk Efendioğlu

node-impala's People

Contributors

coderade, harunurhan, ofrebourg, snyk-bot, ufukomer


node-impala's Issues

Port documentation to ESDoc

Since ESDoc is one of the best options for generating documentation for an ES6 codebase, I think it would be good to have a proper documentation format that follows the ESDoc rules (the syntax is almost the same as JSDoc).

Also, while porting, types for parameters should be added as described in the tutorial.

ESDoc Tutorial

Invalid method name 'query'

I installed node-impala via npm and copied it to a server,
but I cannot run SQL queries.
The query error is: TApplicationException: Invalid method name 'query'

query 1024 row limit

Hi,

When I try to page a query with LIMIT and OFFSET clauses,
I only get 24 rows when LIMIT = 1000 and OFFSET = 1000.

A count over the same query returns 2152.

I think this is an issue.
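
One possible workaround to experiment with is paging by key instead of by OFFSET, keeping each page below the 1024-row fetch size. The sketch below is untested against Impala and makes several assumptions: fetchAllByKey, buildPageSql, runPage and getKey are hypothetical names, the table has a numeric, unique id column, and table_name, id and value are placeholders.

```javascript
// Hypothetical helper: collect all rows by repeatedly asking for the page
// that follows the last key seen, instead of using OFFSET.
// `runPage(lastKey, pageSize)` must resolve to rows ordered by key.
// `getKey(row)` extracts the key from a row, whatever shape the rows have.
async function fetchAllByKey(runPage, getKey, pageSize = 1000) {
  const all = [];
  let lastKey = null;
  for (;;) {
    const rows = await runPage(lastKey, pageSize);
    all.push(...rows);
    // A short (or empty) page means there is nothing left to fetch.
    if (rows.length < pageSize) return all;
    lastKey = getKey(rows[rows.length - 1]);
  }
}

// Example SQL builder for a numeric key column. Number() keeps the
// interpolated values numeric; table and column names are placeholders.
function buildPageSql(lastKey, pageSize) {
  const where = lastKey === null ? '' : ` WHERE id > ${Number(lastKey)}`;
  return `SELECT id, value FROM table_name${where} ORDER BY id LIMIT ${Number(pageSize)}`;
}
```

With a connected client, runPage could be (lastKey, n) => client.query(buildPageSql(lastKey, n)), and getKey would depend on the resultType in use.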

All my queries return only 1024 lines and then stop?

I am connecting to an Impala db via node-impala from my Node server, but every time I run a query (unless the limit is less than that) the result contains only 1024 lines from the db. Even a SELECT * returns 1024 lines from a db with over 6 million lines. I understand that Beeswax reads 1024 lines at a time when querying, but why does it stop after the first run?

Please please help.
And thank you in advance to whomever gets back to me :)

Unexpected behaviour using promises on closing

When I use then on connection close, the handler does not seem to be executed. However, the connection seems to be closed and no error is raised. Is that intentional behaviour?

client.close()
    .then( () => {
       console.log('closed.'); // never logged
       this.closed = true; 
    })
    .catch(err => console.error(err));

My use case: I want to keep the connection open while firing requests in parallel. Therefore I need to know whether the connection is already open. Is there a built-in method for this?
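
The README does not document a built-in state check, so one option is to track the state in a thin wrapper. The following is a minimal sketch under that assumption: TrackedClient and isOpen are hypothetical names, and the wrapper only assumes that connect and close return promises, as in the README examples. The flag is flipped before close is called so that it does not depend on the then handler that reportedly never runs.

```javascript
// Hypothetical wrapper, not part of node-impala: remembers whether
// connect() has succeeded and close() has not been called since.
class TrackedClient {
  constructor(client) {
    this.client = client;
    this.open = false;
  }

  connect(props) {
    return this.client.connect(props).then((message) => {
      this.open = true; // only marked open once connect() resolves
      return message;
    });
  }

  close() {
    // Mark closed up front instead of inside `.then`, since the handler
    // on close() is reported above as never being called.
    this.open = false;
    return this.client.close();
  }

  isOpen() {
    return this.open;
  }
}
```

Usage would look like const tracked = new TrackedClient(createClient()), followed by a check of tracked.isOpen() before deciding whether to connect again.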

TApplicationException: Invalid method name

Using the example from https://www.npmjs.com/package/node-impala, I got the following error:

{ TApplicationException: Invalid method name: 'query'
    at BeeswaxServiceClient.recv_query ([...]/node_modules/node-impala/lib/thrift/BeeswaxService.js:1527:13)
    at [...]/node_modules/thrift/lib/nodejs/lib/thrift/connection.js:157:41
    at Socket.<anonymous> ([...]/node_modules/thrift/lib/nodejs/lib/thrift/buffered_transport.js:48:5)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at readableAddChunk (_stream_readable.js:172:18)
    at Socket.Readable.push (_stream_readable.js:130:10)
    at TCP.onread (net.js:542:20)
  name: 'TApplicationException',
  message: 'Invalid method name: \'query\'',
  type: 1 }

The connection to the server seems to be established, but I cannot execute the query.

Does node-impala work on Impala with Kerberos enabled?

Hi, there,

We have a cluster of Impala daemon nodes which requires Kerberos auth. I tried to use the lib to connect to those nodes and it was not successful. Here is the stack trace,

{ Error: read ECONNRESET at exports._errnoException (util.js:1026:11) at TCP.onread (net.js:572:26) code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }

The stack trace may not help much. I am not sure whether it is related to the Kerberos auth requirement. At least, I don't know how to configure this lib to use GSSAPI to authenticate against the cluster.

So my question is: does this lib support GSSAPI or Kerberos as an auth mechanism? If so, how do I configure it?

Thanks,
Shuo

Settings for Connection Pool

The Impala driver I need to connect to has a connection pool set up. I need to connect to this pool when using the node-impala client (I need to specify the pool to which I want to connect). How can I do this?

I will appreciate the help and feedback!

Cannot get data of the sql query.

Hi, ufukomer,

I use Node.js to run the code, here is my code,
var impalaDB = require('node-impala');
const client = impalaDB.createClient();

client.connect({
  host: '10.2.18.185',
  port: 21050,
  resultType: 'json-array'
})
  .then(function (message) {
    console.log(message); // this outputs 'Connection established'
  })
  .catch(function (err) {
    console.log(err);
  });

client.query('show databases')
  .then(result => console.log(result))
  .catch(err => console.error(err))
  .done(() => client.close().catch(err => console.error(err))); // this outputs nothing

Could you please explain why the query doesn't output anything, result or err? Thank you so much!
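
One thing that may be worth trying is to issue the query only after connect has resolved, rather than calling both back to back. Whether this explains the missing output is not confirmed. The sketch below uses a hypothetical helper named connectThenQuery and relies only on the connect, query and close methods shown in the README.

```javascript
// Hypothetical helper: wait for connect() before querying, and always
// request a close afterwards, whether the query succeeded or failed.
async function connectThenQuery(client, props, sql) {
  await client.connect(props);
  try {
    return await client.query(sql);
  } finally {
    // Not awaited on purpose: another issue reports that handlers
    // attached to close() may never be called.
    client.close().catch((err) => console.error(err));
  }
}
```

For the code above, the call would be connectThenQuery(client, { host: '10.2.18.185', port: 21050, resultType: 'json-array' }, 'show databases').then(result => console.log(result)).catch(err => console.error(err)).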

Node.js Impala client TApplicationException error

Hi,

I am trying to connect to Impala using the package and I'm getting the TApplicationException error.

In server.js

var impala = require('node-impala');

app.post('/submit_dsp_block', function (req, res) {
  var day = req.body.day;
  var limit = req.body.limit;
  console.log(day);
  console.log(limit);
  console.log(impala);
  var client = impala.createClient();
  console.log(client);
  console.log("trying to connect");
  client.connect({
    host: '10.201.50.11',
    port: 21050,
    resultType: 'json-array'
  });
  console.log("client connected");
  client.query('select * from rpt.rptdaily limit 100')
    .then(result => console.log(result))
    .catch(err => console.error(err))
    .done(() => client.close().catch(err => console.error(err)));
});

This is the console output in Git Bash:

node server.js
Example app listening on port 3000!
2016-08-15
100
{ createClient: [Function: createClient] }
ImpalaClient {}
trying to connect
client connected
{ [TApplicationException: Invalid method name: 'query']
name: 'TApplicationException',
message: 'Invalid method name: 'query'',
type: 1 }

I guess my client is getting connected, but there is a problem with the query. Also, how do I execute a prepared query using node-impala, for example something like this:

var day=req.body.day;
var limit=req.body.limit;
client.execute('select* from portal.pubdomainstats where day=? limit ?',[day,limit], function (err, result) {
if (err) {
return console.error('There was while trying to retrieve data from system.local', err);
}

I'm pretty new to Node.js and it would be helpful if I could get a layman's answer to my problem.

Thanks

query returns one row only, my code style is es5

Hi ufukomer,

Here is my ES5 code; it always returns only the first row of the results.
I think it may be caused by the 'pending' state, but how can I get all 50 rows after the pending state has finished?


> var sql = "select system_name from itm.system group by system_name";
> var client = require('node-impala').createClient({"host": "hadoop3"});
> client.query(sql, function(err, data){ console.log('err', err, 'data', data) })
{ state: 'pending' }
> err null data [ [ 'CASCECUP01:KUX' ],
[ { name: 'system_name', type: 'string', comment: '' } ] ]


> client.resultType = 'map'
'map'
> client.query(sql, function(err, data){ console.log('err', err, 'data', data) })
{ state: 'pending' }
> err null data Map { 'system_name' => [ 'CASCECUP01:KUX' ] }


> var client = require('node-impala').createClient({"host": "hadoop3", "resultType": 'map'})
undefined
> client.query(sql, function(err, data){ console.log('err', err, 'data', data) })
{ state: 'pending' }
> err null data Map { 'system_name' => [ 'CASCECUP01:KUX' ] }

Please compare with the same query under impala-shell:

[hadoop3:21000] > select system_name from itm.system group by system_name;
Query: select system_name from itm.system group by system_name
+----------------+
| system_name    |
+----------------+
| CASCECUP01:KUX |
| ASCECUP14:KUX  |
| ASCECSP02:KUX  |
| ASCECUP15:KUX  |
| ASCECMP01:KUX  |
| ASCECUP10:KUX  |
| ASCECUP09:KUX  |
| ESCECUP03:KUX  |
| DBCECUP02:KUX  |
| ASCECUP11:KUX  |
| ASCECUP07:KUX  |
| DBCECMP02:KUX  |
| ASCECUP08:KUX  |
| ASCECSP01:KUX  |
| FSCECUP01:KUX  |
| CASCECMP01:KUX |
| ASCECUP13:KUX  |
| ASCECMP02:KUX  |
| ASCECUP05:KUX  |
| ASCECUP20:KUX  |
| ASCECUP02:KUX  |
| ASCECUP03:KUX  |
| ASCECUP04:KUX  |
| ESCECUP01:KUX  |
| ESCECUP06:KUX  |
| MQCECUP01:KUX  |
| MQCECUP06:KUX  |
| CESCECUP02:KUX |
| ASCECUP18:KUX  |
| ASCECUP06:KUX  |
| ASCECUP01:KUX  |
| CASCECUP02:KUX |
| MQCECUP03:KUX  |
| MQCECUP04:KUX  |
| MQCECUP05:KUX  |
| CESCECUP01:KUX |
| CASCECUP04:KUX |
| ESCECUP02:KUX  |
| ESCECUP05:KUX  |
| ASCECUP16:KUX  |
| CASCECUP03:KUX |
| DBCECUP01:KUX  |
| ASCECUP12:KUX  |
| DBCECMP01:KUX  |
| ASCECUP17:KUX  |
| CFSCECUP01:KUX |
| FSCECUP02:KUX  |
| ESCECUP04:KUX  |
| MQCECUP02:KUX  |
| ASCECUP19:KUX  |
+----------------+
Fetched 50 row(s) in 0.78s

Weird data formatting being returned

Hi,

Thanks again for this piece of software.

When executing the following query:

SELECT 1 as foo, 2 as foo

Data returned is:

[
    [ '1\t2' ],
    [ { name: 'foo', type: 'tinyint', comment: '' }, { name: 'foo', type: 'tinyint', comment: '' } ]
]

I'm not sure I quite understand why it is 1\t2; I was expecting ['1', '2'].

Is this a bug?

Regards
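
Until this is clarified, the tab-joined rows can be split on the caller's side. The sketch below assumes the result has exactly the shape shown above (an array of row strings followed by an array of column descriptors); splitRows and columnNames are hypothetical helper names, and a cell value that itself contains a tab character would be split incorrectly.

```javascript
// Hypothetical helpers for the [rows, schema] shape shown in this issue.
// Each row arrives as one string with cells joined by tab characters.
function splitRows(result) {
  const [rows] = result;
  return rows.map((row) => row.split('\t'));
}

// Column names in schema order; duplicates such as two `foo` columns are kept.
function columnNames(result) {
  const [, schema] = result;
  return schema.map((column) => column.name);
}
```

Cells are returned as arrays rather than objects because duplicate column names, as in the query above, would overwrite each other as object keys.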
