Comments (7)
I'm not sure I understand exactly what you are trying to do, but if you use LOAD DATA
with a path from another container, it is expected to fail: the hive container and the other container do not share a filesystem.
from docker-hive.
Hi @gmouchakis, that is right. What I am trying to do is to push data into HDFS through the REST API of the HDFS system and then use it from Hive, much as I do in production with my company's cluster.
I also discovered that the /user/hive/warehouse path created by Hive does not appear on the HDFS filesystem defined by this docker-compose environment, which means the hive container is somehow pointing at a different HDFS, but I still can't find the root of the problem.
If I am right, Hive asks the namenode which datanode holds the data in the HDFS filesystem, but that does not seem to happen in this containerized environment. Everything should happen over HTTP interactions between the Docker containers, shouldn't it?
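For reference, a WebHDFS upload is a two-step exchange: a PUT to the namenode, which answers with a 307 redirect, then a PUT of the file body to the datanode URL from the Location header. A minimal sketch (the namenode host, port, and target path below are assumptions; adjust them to your compose setup):

```shell
# Assumed namenode WebHDFS endpoint and target path -- adjust to your setup.
NAMENODE=namenode:50070
HDFS_PATH=/user/hive/warehouse/kv1.txt
CREATE_URL="http://${NAMENODE}/webhdfs/v1${HDFS_PATH}?op=CREATE&overwrite=true"
echo "$CREATE_URL"
# Step 1: the namenode replies 307 with a datanode URL in the Location header.
# curl -i -X PUT "$CREATE_URL"
# Step 2: PUT the file body to that datanode URL.
# curl -i -X PUT -T kv1.txt "<Location header from step 1>"
```

A common stumbling block with WebHDFS in compose setups: the datanode hostname in the redirect must be resolvable from wherever curl runs, which is not the case from the host unless you add it to /etc/hosts or run curl inside the Docker network.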
Hi @enanablancaynumeros! It was a silly mistake on my side: Hadoop inside the Hive container was not set up to work with the remote Hadoop. Fixed now.
I've also updated Hadoop to 2.8. If you still have this problem, feel free to reopen the issue.
I broke the JDBC connector just now; give me a moment.
First, start up the system as described in README.md.
Connect to Hive:
➜ docker-hive git:(master) ✗ docker exec -it hive-server bash
root@d16fce776021:/opt# /opt/hive/bin/beeline -u jdbc:hive2://localhost:10000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.8.0/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://localhost:10000
Connected to: Apache Hive (version 2.1.1)
Driver: Hive JDBC (version 2.1.1)
17/04/24 14:52:53 [main]: WARN jdbc.HiveConnection: Request to set autoCommit to false; Hive does not support autoCommit=false.
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 2.1.1 by Apache Hive
0: jdbc:hive2://localhost:10000> LOAD DATA LOCAL INPATH './../examples/files/kv1.txt' OVERWRITE INTO TABLE pks;
Error: Error while compiling statement: FAILED: SemanticException [Error 10001]: Line 1:74 Table not found 'pks' (state=42S02,code=10001)
0: jdbc:hive2://localhost:10000> CREATE TABLE pks (foo INT, bar STRING);
No rows affected (0.171 seconds)
0: jdbc:hive2://localhost:10000> LOAD DATA LOCAL INPATH './../examples/files/kv1.txt' OVERWRITE INTO TABLE pks;
No rows affected (0.291 seconds)
0: jdbc:hive2://localhost:10000> SELECT * FROM pks;
+----------+----------+--+
| pks.foo | pks.bar |
+----------+----------+--+
| 238 | val_238 |
| 86 | val_86 |
| 311 | val_311 |
...
Connect to the namenode and check that Hive is writing to the right HDFS:
➜ docker-hive git:(master) ✗ docker exec -it namenode bash
root@d15fdc2345a6:/# hadoop fs -ls /tmp/hive/root
Found 2 items
drwx------ - root supergroup 0 2017-04-24 14:46 /tmp/hive/root/50d7fa4d-e731-4a2b-8ed2-bd9b2c7dbfd7
drwx------ - root supergroup 0 2017-04-24 14:35 /tmp/hive/root/69eb874b-f76d-4446-acb9-babb80211e17
Now if you do the same thing from the hive-server container, you will see the same output:
➜ docker-hive git:(master) ✗ docker exec -it hive-server bash
root@d16fce776021:/opt# hadoop fs -ls /tmp/hive/root
Found 2 items
drwx------ - root supergroup 0 2017-04-24 14:46 /tmp/hive/root/50d7fa4d-e731-4a2b-8ed2-bd9b2c7dbfd7
drwx------ - root supergroup 0 2017-04-24 14:35 /tmp/hive/root/69eb874b-f76d-4446-acb9-babb80211e17
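The reason both containers show the same listing is that the Hive container's Hadoop configuration points fs.defaultFS at the namenode service, so Hive's embedded HDFS client resolves every path against the shared namenode. A sketch of the relevant core-site.xml fragment (the hostname and port are assumptions based on the compose service name; check your own hadoop.env / generated config):

```xml
<configuration>
  <!-- Hive's embedded Hadoop client resolves all hdfs:// paths against this. -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:8020</value>
  </property>
</configuration>
```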
Hi @earthquakesan, thanks for the answer! I got it working if I only use the new entrypoint script, but it doesn't work in other use cases you haven't described in your previous steps, when I try to use the new Hadoop and JDBC versions.
For a reason I haven't had time to identify, beeline works inside the Docker container and now points at the right HDFS, but if you hit port 10000 from outside, or from other Docker containers, the connection is refused, which affects plain curl calls and external JDBC drivers.
Let me know if you cannot reproduce the problem. I ran docker-compose build --no-cache after your changes a couple of times and still got that error.
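To tell "port not published" apart from "HiveServer2 bound only inside the container", a raw TCP probe from the host is useful before blaming the JDBC driver. A minimal sketch using bash's /dev/tcp (the host and port in the commented example are assumptions about your compose port mapping):

```shell
# Returns 0 if host:port accepts a TCP connection, non-zero otherwise.
probe() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Example (result depends on your compose port mapping):
# probe localhost 10000 && echo "10000 reachable" || echo "10000 refused"
```

If the probe succeeds but beeline still fails, the problem is above TCP (transport mode or authentication); if it fails, check the compose `ports:` mapping and the `hive.server2.thrift.bind.host` setting.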
@enanablancaynumeros I need a description of how you run the application, because I have no problems running the example app from here: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC
What I did to run it:
- Copy Hadoop and Hive from the Docker container to my dev environment:
docker cp hive-server:/opt ./
➜ hive-example-app ls opt
hadoop-2.8.0 hive
- Copy/paste code from the example to HiveJdbcClient.java and compile it:
javac HiveJdbcClient.java
- Download hadoop-core from mvn central:
# Copy it to opt/hadoop-2.8.0/
➜ hive-example-app ls opt/hadoop-2.8.0/hadoop-core-1.2.1.jar
opt/hadoop-2.8.0/hadoop-core-1.2.1.jar
- Create run.sh script (my current working dir is /home/ivan/Workspace/Apps/hive-example-app/):
#!/bin/bash
HADOOP_HOME=/home/ivan/Workspace/Apps/hive-example-app/opt/hadoop-2.8.0
HIVE_HOME=/home/ivan/Workspace/Apps/hive-example-app/opt/hive
echo -e '1\x01foo' > /tmp/a.txt
echo -e '2\x01bar' >> /tmp/a.txt
HADOOP_CORE=$(ls $HADOOP_HOME/hadoop-core*.jar)
CLASSPATH=.:$HIVE_HOME/conf:$(./opt/hadoop-2.8.0/bin/hadoop classpath)
for i in ${HIVE_HOME}/lib/*.jar ; do
CLASSPATH=$CLASSPATH:$i
done
java -cp $CLASSPATH HiveJdbcClient
- Run run.sh:
chmod +x run.sh
./run.sh
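Before running it, the classpath assembly in run.sh can be sanity-checked without Hive installed; this sketch reproduces the jar-collecting loop against a throwaway directory (paths and jar names are illustrative):

```shell
# Reproduce run.sh's jar loop against a scratch directory.
LIBDIR=$(mktemp -d)
touch "$LIBDIR/a.jar" "$LIBDIR/b.jar"
CP=.
for i in "$LIBDIR"/*.jar ; do
  CP=$CP:$i
done
echo "$CP"
```

The glob expands in sorted order, so the resulting classpath is deterministic for a given lib directory.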
It will fail to read a.txt, because the file is created on your host and not on the remote host. Therefore, first copy the file there:
docker cp /tmp/a.txt hive-server:/tmp/
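The \x01 bytes in a.txt are not arbitrary: Hive's default field delimiter is Ctrl-A (octal \001), so a two-column text table splits on it. The same file can be produced portably with printf (the \001 octal escape is POSIX, unlike echo -e's \x01) and inspected with od:

```shell
# Build the sample file with Ctrl-A (0x01) field separators.
printf '1\001foo\n2\001bar\n' > /tmp/a.txt
# od -c renders the separator as \001 between key and value.
od -c /tmp/a.txt
```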
Otherwise, I cannot see any problems with this sample application. Here is the output:
➜ hive-example-app ./run.sh
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/ivan/Workspace/Apps/hive-example-app/opt/hadoop-2.8.0/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/ivan/Workspace/Apps/hive-example-app/opt/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/04/26 10:55:53 INFO jdbc.Utils: Supplied authorities: localhost:10000
17/04/26 10:55:53 INFO jdbc.Utils: Resolved authority: localhost:10000
Running: show tables 'testHiveDriverTable'
testhivedrivertable
Running: describe testHiveDriverTable
key int
value string
Running: load data local inpath '/tmp/a.txt' into table testHiveDriverTable
Running: select * from testHiveDriverTable
1 foo
2 bar
Running: select count(1) from testHiveDriverTable
2