My training is stuck at training step 0. The output of my STS-B fine-tuning command is as follows:
1,2,4,5
gqxx-01-071
Sat Jun 22 18:19:32 CST 2019
2019-06-22 18:19:35.814846: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-06-22 18:19:35.821830: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2400050000 Hz
2019-06-22 18:19:35.822174: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x4752650 executing computations on platform Host. Devices:
2019-06-22 18:19:35.822229: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
INFO:tensorflow:Device is available but not used by distribute strategy: /device:CPU:0
INFO:tensorflow:Device is available but not used by distribute strategy: /device:XLA_CPU:0
WARNING:tensorflow:Not all devices in `tf.distribute.Strategy` are visible to TensorFlow.
INFO:tensorflow:Use MirroredStrategy with 4 devices.
INFO:tensorflow:Initializing RunConfig with distribution strategies.
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Using config: {'_save_checkpoints_secs': None, '_keep_checkpoint_max': 0, '_task_type': 'worker', '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f3143b1ded0>, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_num_ps_replicas': 0, '_tpu_config': TPUConfig(iterations_per_loop=600, num_shards=4, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None), '_tf_random_seed': None, '_device_fn': None, '_cluster': None, '_num_worker_replicas': 1, '_task_id': 0, '_log_step_count_steps': 100, '_evaluation_master': '', '_eval_distribute': None, '_train_distribute': <tensorflow.contrib.distribute.python.mirrored_strategy.MirroredStrategy object at 0x7f314584e8d0>, '_distribute_coordinator_mode': None, '_session_config': allow_soft_placement: true
, '_global_id_in_cluster': 0, '_is_chief': True, '_protocol': None, '_save_checkpoints_steps': 600, '_experimental_distribute': None, '_save_summary_steps': 100, '_model_dir': 'exp/sts-b', '_master': ''}
WARNING:tensorflow:Estimator's model_fn (<function model_fn at 0x7f3143a579b0>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Use tfrecord file proc_data/sts-b/spiece.model.len-128.train.tf_record
INFO:tensorflow:Num of train samples: 5749
INFO:tensorflow:Do not overwrite tfrecord proc_data/sts-b/spiece.model.len-128.train.tf_record exists.
INFO:tensorflow:Input tfrecord file proc_data/sts-b/spiece.model.len-128.train.tf_record
WARNING:tensorflow:From run_classifier.py:535: map_and_batch (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.map_and_batch(...)`.
WARNING:tensorflow:From /mnt/lustre/sjtu/home/myl01/anaconda3/envs/xlnet/lib/python2.7/site-packages/tensorflow/python/data/ops/dataset_ops.py:1419: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
OMP: Info #204: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #202: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {5,6,7,33,34}
OMP: Info #156: KMP_AFFINITY: 5 available OS procs
OMP: Info #158: KMP_AFFINITY: Nonuniform topology
OMP: Info #179: KMP_AFFINITY: 1 packages x 3 cores/pkg x 2 threads/core (3 total cores)
OMP: Info #206: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 5 maps to package 0 core 5 thread 0
OMP: Info #171: KMP_AFFINITY: OS proc 33 maps to package 0 core 5 thread 1
OMP: Info #171: KMP_AFFINITY: OS proc 6 maps to package 0 core 6 thread 0
OMP: Info #171: KMP_AFFINITY: OS proc 34 maps to package 0 core 6 thread 1
OMP: Info #171: KMP_AFFINITY: OS proc 7 maps to package 0 core 8 thread 0
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 0 bound to OS proc set {5}
2019-06-22 18:19:37.729498: I tensorflow/core/common_runtime/process_util.cc:71] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:memory input None
INFO:tensorflow:Use float type <dtype: 'float32'>
WARNING:tensorflow:From /mnt/lustre/sjtu/home/myl01/NLP/xlnet/modeling.py:532: dropout (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dropout instead.
WARNING:tensorflow:From /mnt/lustre/sjtu/home/myl01/anaconda3/envs/xlnet/lib/python2.7/site-packages/tensorflow/python/keras/layers/core.py:143: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /mnt/lustre/sjtu/home/myl01/NLP/xlnet/modeling.py:67: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.
INFO:tensorflow:#params: 361318401
INFO:tensorflow:Initialize from the ckpt xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:memory input None
INFO:tensorflow:Use float type <dtype: 'float32'>
INFO:tensorflow:#params: 361318401
INFO:tensorflow:Initialize from the ckpt xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:memory input None
INFO:tensorflow:Use float type <dtype: 'float32'>
INFO:tensorflow:#params: 361318401
INFO:tensorflow:Initialize from the ckpt xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:memory input None
INFO:tensorflow:Use float type <dtype: 'float32'>
INFO:tensorflow:#params: 361318401
INFO:tensorflow:Initialize from the ckpt xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_0/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_0/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_1/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_1/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_10/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_10/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_11/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_11/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_12/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_12/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_13/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_13/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_14/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_14/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_15/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_15/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_16/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_16/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_17/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_17/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_18/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_18/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_19/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_19/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_2/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_2/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_20/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_20/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_21/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_21/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_22/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_22/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_23/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_23/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_3/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_3/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_4/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_4/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_5/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_5/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_6/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_6/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_7/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_7/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_8/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_8/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/ff/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/ff/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/ff/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/ff/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/ff/layer_1/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/ff/layer_1/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/ff/layer_1/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/ff/layer_1/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/ff/layer_2/bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/ff/layer_2/bias
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/ff/layer_2/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/ff/layer_2/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/rel_attn/LayerNorm/beta:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/rel_attn/LayerNorm/beta
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/rel_attn/LayerNorm/gamma:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/rel_attn/LayerNorm/gamma
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/rel_attn/k/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/rel_attn/k/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/rel_attn/o/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/rel_attn/o/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/rel_attn/q/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/rel_attn/q/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/rel_attn/r/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/rel_attn/r/kernel
DEBUG:tensorflow:Initialize variable model/transformer/layer_9/rel_attn/v/kernel:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/layer_9/rel_attn/v/kernel
DEBUG:tensorflow:Initialize variable model/transformer/r_r_bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/r_r_bias
DEBUG:tensorflow:Initialize variable model/transformer/r_s_bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/r_s_bias
DEBUG:tensorflow:Initialize variable model/transformer/r_w_bias:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/r_w_bias
DEBUG:tensorflow:Initialize variable model/transformer/seg_embed:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/seg_embed
DEBUG:tensorflow:Initialize variable model/transformer/word_embedding/lookup_table:0 from checkpoint xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt with model/transformer/word_embedding/lookup_table
INFO:tensorflow:**** Global Variables ****
INFO:tensorflow: name = model/transformer/r_w_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/r_r_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/word_embedding/lookup_table:0, shape = (32000, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/r_s_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/seg_embed:0, shape = (24, 2, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/sequnece_summary/summary/kernel:0, shape = (1024, 1024)
INFO:tensorflow: name = model/sequnece_summary/summary/bias:0, shape = (1024,)
INFO:tensorflow: name = model/regression_sts-b/logit/kernel:0, shape = (1024, 1)
INFO:tensorflow: name = model/regression_sts-b/logit/bias:0, shape = (1,)
WARNING:tensorflow:From /mnt/lustre/sjtu/home/myl01/anaconda3/envs/xlnet/lib/python2.7/site-packages/tensorflow/python/training/learning_rate_decay_v2.py:321: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
WARNING:tensorflow:From /mnt/lustre/sjtu/home/myl01/anaconda3/envs/xlnet/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:**** Global Variables ****
INFO:tensorflow: name = model/transformer/r_w_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/r_r_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/word_embedding/lookup_table:0, shape = (32000, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/r_s_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/seg_embed:0, shape = (24, 2, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/sequnece_summary/summary/kernel:0, shape = (1024, 1024)
INFO:tensorflow: name = model/sequnece_summary/summary/bias:0, shape = (1024,)
INFO:tensorflow: name = model/regression_sts-b/logit/kernel:0, shape = (1024, 1)
INFO:tensorflow: name = model/regression_sts-b/logit/bias:0, shape = (1,)
INFO:tensorflow:**** Global Variables ****
INFO:tensorflow: name = model/transformer/r_w_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/r_r_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/word_embedding/lookup_table:0, shape = (32000, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/r_s_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/seg_embed:0, shape = (24, 2, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/sequnece_summary/summary/kernel:0, shape = (1024, 1024)
INFO:tensorflow: name = model/sequnece_summary/summary/bias:0, shape = (1024,)
INFO:tensorflow: name = model/regression_sts-b/logit/kernel:0, shape = (1024, 1)
INFO:tensorflow: name = model/regression_sts-b/logit/bias:0, shape = (1,)
INFO:tensorflow:**** Global Variables ****
INFO:tensorflow: name = model/transformer/r_w_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/r_r_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/word_embedding/lookup_table:0, shape = (32000, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/r_s_bias:0, shape = (24, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/seg_embed:0, shape = (24, 2, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_0/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_1/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_2/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_3/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_4/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_5/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_6/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_7/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_8/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_9/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_10/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_11/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_12/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_13/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_14/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_15/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_16/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_17/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_18/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_19/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_20/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_21/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_22/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/q/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/k/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/v/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/r/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/o/kernel:0, shape = (1024, 16, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/rel_attn/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_1/kernel:0, shape = (1024, 4096), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_1/bias:0, shape = (4096,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_2/kernel:0, shape = (4096, 1024), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/layer_2/bias:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/layer_23/ff/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/sequnece_summary/summary/kernel:0, shape = (1024, 1024)
INFO:tensorflow: name = model/sequnece_summary/summary/bias:0, shape = (1024,)
INFO:tensorflow: name = model/regression_sts-b/logit/kernel:0, shape = (1024, 1)
INFO:tensorflow: name = model/regression_sts-b/logit/bias:0, shape = (1,)
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
2019-06-22 18:21:58.667821: I tensorflow/core/common_runtime/process_util.cc:71] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
WARNING:tensorflow:From /mnt/lustre/sjtu/home/myl01/anaconda3/envs/xlnet/lib/python2.7/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from exp/sts-b/model.ckpt-0
WARNING:tensorflow:From /mnt/lustre/sjtu/home/myl01/anaconda3/envs/xlnet/lib/python2.7/site-packages/tensorflow/python/training/saver.py:1070: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into exp/sts-b/model.ckpt.
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 1 bound to OS proc set {6}
INFO:tensorflow:Initialize strategy
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 2 bound to OS proc set {7}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 3 bound to OS proc set {33}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 4 bound to OS proc set {34}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 6 bound to OS proc set {6}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 5 bound to OS proc set {5}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 7 bound to OS proc set {7}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 8 bound to OS proc set {33}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 9 bound to OS proc set {34}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 10 bound to OS proc set {5}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 11 bound to OS proc set {6}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 12 bound to OS proc set {7}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 13 bound to OS proc set {33}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 14 bound to OS proc set {34}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 15 bound to OS proc set {5}
OMP: Info #242: KMP_AFFINITY: pid 11876 thread 16 bound to OS proc set {6}
INFO:tensorflow:loss = 12.0331, step = 0