
big-replicate's People

Contributors

pingles

big-replicate's Issues

Parameterise destination

Currently only the project is overridden. It might be useful to also be able to set a different dataset (for example, to copy data within a project from a US-located dataset to an EU-located one).
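As a rough sketch of what that could look like, the destination can be derived from the source table with an optional dataset override. The `TableRef` class and its `destination` method are hypothetical names used for illustration only; they are not part of big-replicate or gclouj.

```java
/** Hypothetical value type: identifies a table by project, dataset and table name. */
final class TableRef {
    final String project;
    final String dataset;
    final String table;

    TableRef(String project, String dataset, String table) {
        this.project = project;
        this.dataset = dataset;
        this.table = table;
    }

    /**
     * Builds the destination for this table. The table name is kept as-is.
     * The dataset falls back to the source dataset when no override is given,
     * which matches the current behaviour of only changing the project.
     */
    TableRef destination(String destinationProject, String datasetOverride) {
        boolean noOverride = datasetOverride == null || datasetOverride.isEmpty();
        String targetDataset = noOverride ? dataset : datasetOverride;
        return new TableRef(destinationProject, targetDataset, table);
    }

    @Override
    public String toString() {
        return project + ":" + dataset + "." + table;
    }
}
```

Passing the source project as the destination project together with a dataset override would cover the within-project US to EU case.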

Failure when replicating tables with DATE field type

When attempting to replicate a table with a DATE field, the type is not found in the Field.Type.Value enum. This is presumably because the version of the BigQuery library in use lacks support for the newer Standard SQL data types:

exception #error {
 :cause No enum constant com.google.cloud.bigquery.Field.Type.Value.DATE
 :via
 [{:type java.lang.IllegalArgumentException
   :message No enum constant com.google.cloud.bigquery.Field.Type.Value.DATE
   :at [java.lang.Enum valueOf Enum.java 238]}]
 :trace
 [[java.lang.Enum valueOf Enum.java 238]
  [com.google.cloud.bigquery.Field$Type$Value valueOf Field.java 75]
  [com.google.cloud.bigquery.Field fromPb Field.java 368]
  [com.google.cloud.bigquery.Field$1 apply Field.java 46]
  [com.google.cloud.bigquery.Field$1 apply Field.java 43]
  [com.google.common.collect.Lists$TransformingRandomAccessList$1 transform Lists.java 640]
  [com.google.common.collect.TransformedIterator next TransformedIterator.java 48]
  [java.util.AbstractCollection toArray AbstractCollection.java 141]
  [java.util.ArrayList <init> ArrayList.java 177]
  [com.google.common.collect.Lists newArrayList Lists.java 146]
  [com.google.cloud.bigquery.Schema$Builder fields Schema.java 78]
  [com.google.cloud.bigquery.Schema of Schema.java 151]
  [com.google.cloud.bigquery.Schema fromPb Schema.java 159]
  [com.google.cloud.bigquery.TableDefinition$Builder <init> TableDefinition.java 88]
  [com.google.cloud.bigquery.StandardTableDefinition$Builder <init> StandardTableDefinition.java 140]
  [com.google.cloud.bigquery.StandardTableDefinition$Builder <init> StandardTableDefinition.java 119]
  [com.google.cloud.bigquery.StandardTableDefinition fromPb StandardTableDefinition.java 283]
  [com.google.cloud.bigquery.TableDefinition fromPb TableDefinition.java 172]
  [com.google.cloud.bigquery.TableInfo$BuilderImpl <init> TableInfo.java 157]
  [com.google.cloud.bigquery.Table fromPb Table.java 348]
  [com.google.cloud.bigquery.BigQueryImpl getTable BigQueryImpl.java 353]
  [sun.reflect.GeneratedMethodAccessor18 invoke nil -1]
  [sun.reflect.DelegatingMethodAccessorImpl invoke DelegatingMethodAccessorImpl.java 43]
  [java.lang.reflect.Method invoke Method.java 498]
  [clojure.lang.Reflector invokeMatchingMethod Reflector.java 93]
  [clojure.lang.Reflector invokeInstanceMethod Reflector.java 28]
  [gclouj.bigquery$table invokeStatic bigquery.clj 142]
  [gclouj.bigquery$table invoke bigquery.clj 141]
  [uswitch.big_replicate.sync$load_table invokeStatic sync.clj 92]
  [uswitch.big_replicate.sync$load_table invoke sync.clj 91]
  [uswitch.big_replicate.sync$progress$fn__7851 invoke sync.clj 129]
  [uswitch.big_replicate.sync$progress invokeStatic sync.clj 129]
  [uswitch.big_replicate.sync$progress invoke sync.clj 121]
  [uswitch.big_replicate.sync$replicator_agent$fn__7855 invoke sync.clj 138]
  [clojure.core.async$thread_call$fn__6122 invoke async.clj 439]
  [clojure.lang.AFn run AFn.java 22]
  [java.util.concurrent.ThreadPoolExecutor runWorker ThreadPoolExecutor.java 1142]
  [java.util.concurrent.ThreadPoolExecutor$Worker run ThreadPoolExecutor.java 617]
  [java.lang.Thread run Thread.java 745]]}
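The failure mode in the trace is ordinary `Enum.valueOf` behaviour: looking up a name the enum does not declare throws `IllegalArgumentException`. The sketch below reproduces that with a stand-in enum. `LegacyFieldType` and its constants are illustrative assumptions, not the library's real list of types.

```java
final class EnumLookupDemo {
    private EnumLookupDemo() {}

    /** Hypothetical stand-in for a type enum that was written before DATE existed. */
    enum LegacyFieldType { STRING, INTEGER, FLOAT, BOOLEAN, TIMESTAMP, RECORD }

    /** Returns the matching constant, or null when the enum has no such name. */
    static LegacyFieldType lookup(String name) {
        try {
            return LegacyFieldType.valueOf(name);
        } catch (IllegalArgumentException e) {
            // Same exception type as at the top of the trace: the name is not a declared constant.
            return null;
        }
    }
}
```

Here `EnumLookupDemo.lookup("DATE")` returns null, while a direct `LegacyFieldType.valueOf("DATE")` throws, which is what happens inside `Field.fromPb` in the trace.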

Sync hangs with no tables

09:13:56.977 [async-thread-macro-1] INFO uswitch.big-replicate.sync - syncing 0 tables:
09:13:56.977 [main] INFO uswitch.big-replicate.sync - creating 8 replicator agents

Clean staging data

Staging data is not automatically deleted once it's been loaded into the destination table. It'll gradually accumulate over time, so it'd be nice to automatically tidy up once the load has finished successfully.

Use Copy job rather than Extract/Load job

The BigQuery API has a Copy job that copies data between tables. It should be possible to replace the separate extract and load jobs with a single copy job. That would also remove the need to manage the Cloud Storage currently used for intermediate data.

gclouj will need to change to add support for building the job configuration: CopyJobConfiguration.

Generalise table filters

The tool currently assumes it's replicating only Google Analytics exported data. It would be nice to change this so that a table regexp can be specified on the CLI, making the tool less specific to GA data.
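A minimal sketch of such a filter, assuming the table names have already been listed and the regexp arrives as a plain string from the command line. `TableFilter` and `matching` are hypothetical names, not existing big-replicate functions.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class TableFilter {
    private TableFilter() {}

    /** Keeps only the table names that match the whole regexp, preserving order. */
    static List<String> matching(String tableRegex, List<String> tableNames) {
        Pattern pattern = Pattern.compile(tableRegex);
        return tableNames.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toList());
    }
}
```

With this shape, the current GA-only behaviour could remain the default by supplying a pattern such as `ga_sessions_\d{8}`, while other users pass their own.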
