
sqoop2's Introduction

= Welcome to Sqoop

Apache Sqoop is a tool designed for efficiently transferring bulk data between
Apache Hadoop and structured datastores such as relational databases. You can use
Sqoop to import data from external structured datastores into Hadoop Distributed
File System or related systems like Hive and HBase. Conversely, Sqoop can be used
to extract data from Hadoop and export it to external structured datastores such
as relational databases and enterprise data warehouses.

== Documentation

Sqoop ships with documentation; see the "docs" module for additional materials.

More documentation is available online on the Sqoop home page:

http://sqoop.apache.org/

== Compiling Sqoop

Sqoop uses the Maven build system, and it can be compiled and built by running
the following commands:

  mvn compile # Compile project
  mvn package # Build source artifact
  mvn package -Pbinary # Build binary artifact

Sqoop uses the Sphinx plugin to generate documentation, which has higher memory
requirements that might not fit into the default Maven configuration. You might
need to increase the maximum memory allowance to successfully execute the package
goal. This can be done using the following command:

  export MAVEN_OPTS="-Xmx512m -XX:MaxPermSize=512m"

Sqoop currently supports multiple Hadoop distributions. In order to compile Sqoop
against a specific Hadoop version, please specify the hadoop.profile property in
Maven commands. For example:

  mvn package -Pbinary -Dhadoop.profile=100

Please refer to the Sqoop documentation for a full list of supported Hadoop
distributions and values of the hadoop.profile property.
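
The two steps above (raising Maven's memory limits and selecting a Hadoop
profile) can be combined in a small wrapper. The following is only a sketch:
the function name build_sqoop_cmd is hypothetical and not part of Sqoop, and it
prints the Maven command line rather than running it, so the result can be
inspected before a real build is started.

```shell
#!/bin/sh
# Sketch only: build_sqoop_cmd is a hypothetical helper, not shipped with Sqoop.
# It prints the Maven command for a binary build instead of executing it.
build_sqoop_cmd() {
  profile="${1:-}"   # value for hadoop.profile (for example 100); may be empty
  if [ -n "$profile" ]; then
    printf 'mvn package -Pbinary -Dhadoop.profile=%s\n' "$profile"
  else
    printf 'mvn package -Pbinary\n'
  fi
}

# Use the memory settings from this README unless MAVEN_OPTS is already set.
: "${MAVEN_OPTS:=-Xmx512m -XX:MaxPermSize=512m}"
export MAVEN_OPTS

# Print the command for the Hadoop profile used in the example above.
build_sqoop_cmd 100
```

To run the build instead of printing the command, the output can be passed to
the shell, for example with eval "$(build_sqoop_cmd 100)".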

sqoop2's People

Contributors

abayer, danielshahaf, olamy

sqoop2's Issues

Unknown option Verbose

sqoop:000> show option
Verbose = false
Poll-timeout = 10000
sqoop:000> set option --name Verbose --value true
Unknown option Verbose. Ignoring...
sqoop:000> show version
client version:
  Sqoop 1.99.4-cdh5.3.0 source revision 75a4ffb64ddc4001a26a04366271b51b262224a1
  Compiled by jenkins on Tue Dec 16 20:14:23 PST 2014

Getting the following error: Spill failed

There are null values in the table. This is the error I'm getting:

2015-09-25 21:08:37,685 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2015-09-25 21:08:38,294 INFO org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2015-09-25 21:08:38,295 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2015-09-25 21:08:38,914 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2015-09-25 21:08:38,918 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@ca25138
2015-09-25 21:08:39,272 INFO org.apache.hadoop.mapred.MapTask: Processing split: org.apache.sqoop.job.mr.SqoopSplit@5f7e2943
2015-09-25 21:08:39,279 INFO org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2015-09-25 21:08:39,283 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 256
2015-09-25 21:08:39,447 INFO org.apache.hadoop.mapred.MapTask: data buffer = 204010960/255013696
2015-09-25 21:08:39,447 INFO org.apache.hadoop.mapred.MapTask: record buffer = 671088/838860
2015-09-25 21:08:39,568 INFO org.apache.sqoop.job.mr.SqoopMapper: Starting progress service
2015-09-25 21:08:39,570 INFO org.apache.sqoop.job.mr.SqoopMapper: Running extractor class org.apache.sqoop.connector.jdbc.GenericJdbcExtractor
2015-09-25 21:08:39,837 INFO org.apache.sqoop.connector.jdbc.GenericJdbcExtractor: Using query: SELECT advt_rollup_date,a_platform_id,p_platform_id,publisher_account_nk,site_nk,site_section_nk,page_nk,ad_unit_nk,rev_split,deal_type_uid,deal_cpm,rev_share_enabled,package_nk,deal_nk,advertiser_account_nk,order_nk,line_item_nk,ad_nk,line_item_pricing_model_code,line_item_pricing_rate,brand_nk,delivery_medium_code,screen_location_code,content_topic_group_sid,sales_channel_code,p_currency_code,a_currency_code,country_code,state_name,city_name,dma_code,ad_width,ad_height,ad_duration,ssp_elig,tot_requests,tot_discards,tot_impressions,tot_raw_billable_impressions,tot_billable_impressions,tot_clicks,tot_refresh_impressions,tot_billable_refresh_impressions,tot_view_conversions,tot_click_conversions,tot_raw_spend,tot_spend,tot_raw_conversion_spend,tot_conversion_spend,tot_mkt_elig_req,tot_mkt_requests,p_req_deliv_medium FROM mstr_datamart.temp_advt_supply_demand_geo_monthly_fact WHERE 14 <= rowId AND rowId <= 15
2015-09-25 21:09:31,496 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
2015-09-25 21:09:31,496 INFO org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 204011138; bufvoid = 255013696
2015-09-25 21:09:31,496 INFO org.apache.hadoop.mapred.MapTask: kvstart = 0; kvend = 444368; length = 838860
2015-09-25 22:08:57,102 INFO org.apache.sqoop.job.mr.SqoopMapper: Stopping progress service
2015-09-25 22:08:57,115 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2015-09-25 22:08:57,118 WARN org.apache.hadoop.mapred.Child: Error running child
org.apache.sqoop.common.SqoopException: MAPRED_EXEC_0017:Error occurs during extractor run
at org.apache.sqoop.job.mr.SqoopMapper.run(SqoopMapper.java:99)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.sqoop.common.SqoopException: MAPRED_EXEC_0013:Cannot write to the data writer
at org.apache.sqoop.job.mr.SqoopMapper$SqoopMapDataWriter.writeContent(SqoopMapper.java:153)
at org.apache.sqoop.job.mr.SqoopMapper$SqoopMapDataWriter.writeArrayRecord(SqoopMapper.java:126)
at org.apache.sqoop.connector.jdbc.GenericJdbcExtractor.extract(GenericJdbcExtractor.java:96)
at org.apache.sqoop.connector.jdbc.GenericJdbcExtractor.extract(GenericJdbcExtractor.java:38)
at org.apache.sqoop.job.mr.SqoopMapper.run(SqoopMapper.java:95)
... 7 more
Caused by: java.io.IOException: Spill failed
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:905)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:601)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
at org.apache.sqoop.job.mr.SqoopMapper$SqoopMapDataWriter.writeContent(SqoopMapper.java:151)
... 11 more
Caused by: java.lang.NullPointerException
at org.apache.sqoop.job.io.SqoopWritable.readFields(SqoopWritable.java:66)
at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:158)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:987)
at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:99)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:63)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1277)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1900(MapTask.java:724)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1222)
2015-09-25 22:08:57,123 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
