minimaxir / ctrl-gce

Set up the CTRL text-generating model on Google Compute Engine with just a few console commands.

License: MIT License

Shell 100.00%

ctrl google-compute-engine text-generation

ctrl-gce's Issues

'ascii' codec can't encode character

I'm regularly getting this error. Is there any way to circumvent it?

Traceback (most recent call last):
  File "generation.py", line 275, in <module>
    print(tokens_generated_so_far)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 289: ordinal not in range(128)
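
One possible workaround, sketched in Python 3 and not taken from generation.py itself: the traceback suggests stdout is using the ASCII codec, so the right single quote (U+2019) cannot be written. The hypothetical helper `safe_write` below replaces any character the output stream cannot encode instead of raising. Setting the environment variable `PYTHONIOENCODING=utf-8` before launching the script is another commonly used option.

```python
import sys


def safe_write(text, stream=None):
    """Write text to a stream, replacing characters its encoding cannot represent.

    Hypothetical helper for illustration; not part of generation.py.
    """
    if stream is None:
        stream = sys.stdout
    # Fall back to ASCII when the stream does not report an encoding.
    encoding = getattr(stream, "encoding", None) or "ascii"
    # Round-trip through the stream's encoding: unencodable characters become "?".
    printable = text.encode(encoding, errors="replace").decode(encoding)
    stream.write(printable + "\n")
    return printable
```

Calling `safe_write(tokens_generated_so_far)` in place of the failing `print` would trade the crash for a `?` wherever a character cannot be shown; whether that loss is acceptable depends on how the output is used.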

Python 3 Support

Now that Python 3 support has been merged into the original repo, this repo should move to Python 3. It should only require changing the Python versions accordingly.

Support non-interactive mode

This worked like a charm, thanks so much.

I'm wondering about supporting a non-interactive mode, e.g. an argument that names a file of inputs. It seems like it would involve rewriting some of generation.py. My Python is novice-level, but I'm happy to take a stab at this if you'd be interested in collaborating on it and it seems within the scope of the project.
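
A minimal sketch of what such a mode could look like, assuming one prompt per line in a text file. The `--prompts-file` flag, the function names, and the `generate` callable are all hypothetical; none of them exist in generation.py, and the placeholder generator only stands in for the real model call.

```python
import argparse


def read_prompts(path):
    """Return the non-empty lines of a UTF-8 text file, one prompt per line."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def run_batch(prompts, generate):
    """Call generate(prompt) for each prompt and pair it with its output."""
    return [(prompt, generate(prompt)) for prompt in prompts]


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Batch text generation sketch")
    # --prompts-file is a hypothetical flag, not an existing generation.py option.
    parser.add_argument("--prompts-file", required=True,
                        help="text file with one prompt per line")
    return parser.parse_args(argv)


def main(argv=None, generate=None):
    args = parse_args(argv)
    # Placeholder generator: a real version would call the model here.
    generate = generate or (lambda prompt: prompt + " ...")
    for prompt, output in run_batch(read_prompts(args.prompts_file), generate):
        print("PROMPT: " + prompt)
        print("OUTPUT: " + output)
```

Keeping file reading and the generation call in separate functions means the interactive loop could stay untouched, with the batch path reusing the same model call.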

error: fastBPE/fastBPE.cpp: no such file or directory

Hi,
I'm not sure why I'm having problems installing fastBPE when following your exact steps on Google Compute Engine.
I already tried it on a Paperspace notebook (V100), and fastBPE installed fine as far as I recall (however, the interactive console didn't load; I'm unsure why).

Anyway, I will search for "error: fastBPE/fastBPE.cpp: no such file or directory".

Keras_patch GPU instead of TPU

Hi,
I've been using the generation GCE setup you recommended (it works wonderfully with Python 2.7), but I haven't been able to get fine-tuning to work. I appreciate that's a totally different setup.

On Google Compute Engine with 2x 16 GB P100s, 8 CPUs, and 30 GB of RAM, plus Nick Walton's multi-GPU pull request to salesforce/ctrl, it appears like it should work.

I'm still getting OOM issues.

The lack of Python 3 support has been fixed in recent changes to ctrl, so I've tried reinstalling the whole setup with Python 3, 3.5, and 2.7.

2019-11-06 11:33:42.236866: I tensorflow/core/common_runtime/bfc_allocator.cc:818] total_region_allocated_bytes_: 15928269056 memory_limit_: 15928269210 available bytes: 154 curr_region_allocation_bytes_: 31856538624
2019-11-06 11:33:42.236876: I tensorflow/core/common_runtime/bfc_allocator.cc:824] Stats:
Limit: 15928269210
InUse: 15928269056
MaxInUse: 15928269056
NumAllocs: 4013
MaxAllocSize: 1262254080

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Anyway, I'm asking for your help more than anything: how come generation works on the same setup that won't allow fine-tuning?

Many thanks
Vince.
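
A rough back-of-the-envelope estimate may help frame the question. It is a sketch under stated assumptions, not a diagnosis: it takes CTRL's reported size of about 1.63 billion parameters, assumes 32-bit floats, assumes the optimizer keeps one extra accumulator per parameter, and ignores activations entirely. The only figure taken from the log above is the allocator's `Limit`.

```python
# All figures below are assumptions for a rough estimate, not measurements.
PARAMETERS = 1_630_000_000        # CTRL's reported size: about 1.63 billion parameters
BYTES_PER_VALUE = 4               # assumes 32-bit floats
GPU_LIMIT_BYTES = 15_928_269_210  # "Limit" from the allocator stats in the log


def weights_only_bytes():
    """Memory for the weights alone, roughly what generation has to hold."""
    return PARAMETERS * BYTES_PER_VALUE


def training_bytes(optimizer_slots=1):
    """Weights + gradients + optimizer accumulators; activations are ignored."""
    return PARAMETERS * BYTES_PER_VALUE * (2 + optimizer_slots)


def to_gib(num_bytes):
    return num_bytes / 2**30


if __name__ == "__main__":
    print("weights only: %.1f GiB" % to_gib(weights_only_bytes()))
    print("training:     %.1f GiB" % to_gib(training_bytes()))
    print("GPU limit:    %.1f GiB" % to_gib(GPU_LIMIT_BYTES))
```

Under these assumptions the weights alone fit under the reported limit, while weights plus gradients plus one optimizer slot do not. If a multi-GPU setup replicates the full model on each device, a second GPU would not lower the per-device footprint; whether that applies to the pull request mentioned is not something this sketch verifies.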

How to use CTRL as a conditional generation model?

Thanks for this repo. Do you know how I could fine-tune CTRL on a dataset that pairs prompts (such as a list of headlines for an event) with articles? Think of it as a reverse summarization task: you give it a summary and it generates the entire text. I could create a new control code and fine-tune on these new articles, but how do I introduce the headlines into the training data? The training data under a control code is just one big text file of articles. Should I place each headline before its corresponding article and fine-tune on that data?
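
The layout proposed in the question could be sketched as follows. This is only one plausible arrangement, not a format confirmed by this repo or by the CTRL authors: the control code name `Headline`, the function `build_training_text`, and the choice of newline separators are all assumptions made for illustration.

```python
def build_training_text(control_code, examples, separator="\n\n"):
    """Join (headline, article) pairs into one training text.

    Each example becomes "<control_code> <headline>" on one line,
    followed by the article on the next. Hypothetical layout.
    """
    chunks = []
    for headline, article in examples:
        chunks.append(
            "{} {}\n{}".format(control_code, headline.strip(), article.strip())
        )
    # Blank line between examples, single trailing newline at the end.
    return separator.join(chunks) + "\n"


if __name__ == "__main__":
    pairs = [("Storm hits coast", "The storm arrived late on Monday.")]
    print(build_training_text("Headline", pairs))
```

At generation time the same pattern would be used as the prompt, that is, the control code followed by a new headline, so that the model continues with an article.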
