
ctrl-gce's Issues

'ascii' codec can't encode character

I'm regularly getting this error. Is there any way to work around it?

Traceback (most recent call last):
  File "generation.py", line 275, in <module>
    print(tokens_generated_so_far)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 289: ordinal not in range(128)
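
One possible workaround, sketched under assumptions: the `u'\u2019'` in the message suggests the script is running under Python 2 and that the output stream's encoding resolves to ASCII. Encoding the text explicitly before writing it avoids that implicit ASCII step. `encode_for_console` and `safe_write` are hypothetical helpers, not part of `generation.py`. Setting the environment variable `PYTHONIOENCODING=utf-8` before launching the script is another common option.

```python
# -*- coding: utf-8 -*-
import sys


def encode_for_console(text, encoding="utf-8"):
    """Encode unicode text explicitly; unencodable characters become '?'."""
    return text.encode(encoding, "replace")


def safe_write(text, stream=None, encoding="utf-8"):
    """Hypothetical stand-in for print(text) that never uses the ASCII default.

    Assumes `text` is a unicode string, as in the traceback above.
    """
    if stream is None:
        stream = sys.stdout
    # Python 3 text streams expose a binary .buffer; Python 2 streams accept bytes directly.
    target = getattr(stream, "buffer", stream)
    target.write(encode_for_console(text + u"\n", encoding))
    target.flush()
```

With this in place, the failing `print(tokens_generated_so_far)` call could be swapped for `safe_write(tokens_generated_so_far)`.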

Python 3 Support

Now that Python 3 support has been merged into the original repo, this repo should move to Python 3. (It should just require changing the Python versions accordingly.)

Support non-interactive mode

This worked like a charm, thanks so much.

I'm wondering about supporting a non-interactive mode, for example an argument that names a file of inputs. It seems like it would involve rewriting some of generation.py. My Python is novice-level, but I'm happy to take a stab at this if you'd be interested in collaborating on it and it seems within the scope of the project.
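
A minimal sketch of what such a mode might look like, assuming one prompt per line in a UTF-8 text file. The `--prompts_file` flag, `read_prompts`, and `run_batch` are hypothetical names, and `generate_fn` stands in for whatever generation.py does for a single prompt; none of these exist in the repo.

```python
import argparse
import io


def read_prompts(path):
    """Return the non-empty lines of a UTF-8 text file, one prompt per line."""
    with io.open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def run_batch(prompts, generate_fn):
    """Call generate_fn once per prompt and collect (prompt, output) pairs."""
    return [(prompt, generate_fn(prompt)) for prompt in prompts]


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Batch prompt runner (sketch)")
    # --prompts_file is a hypothetical flag, not an existing generation.py option.
    parser.add_argument("--prompts_file", required=True)
    return parser.parse_args(argv)
```

The interactive loop would then only be entered when no prompts file is given, which keeps the existing behaviour as the default.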

error: fastBPE/fastBPE.cpp: no such file or directory

Hi,
I'm not sure why I'm having problems installing fastBPE when following your exact steps on Google Compute Engine.
I already tried it on a Paperspace notebook (V100), and fastBPE installed fine as far as I recall (however, the interactive console didn't load; I'm unsure why).

Anyway, I will search for "error: fastBPE/fastBPE.cpp: no such file or directory".

Keras_patch GPU instead of TPU

Hi,
I've been using the generation GCE setup you recommended (it works wonderfully with Python 2.7), but I haven't been able to get finetuning to work. I appreciate that's a totally different setup.

However, on Google Compute Engine with 2x 16GB P100s, 8 CPUs, and 30GB RAM, plus Nick Walton's multi-GPU pull request to salesforce/ctrl, it looks like it should work.

I'm still getting OOM issues.

They've fixed the missing Python 3 support in recent changes to ctrl, so I've tried reinstalling the whole setup with python3, 3.5 and 2.7.

2019-11-06 11:33:42.236866: I tensorflow/core/common_runtime/bfc_allocator.cc:818] total_region_allocated_bytes_: 15928269056 memory_limit_: 15928269210 available bytes: 154 curr_region_allocation_bytes_: 31856538624
2019-11-06 11:33:42.236876: I tensorflow/core/common_runtime/bfc_allocator.cc:824] Stats:
Limit: 15928269210
InUse: 15928269056
MaxInUse: 15928269056
NumAllocs: 4013
MaxAllocSize: 1262254080

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   45C    P0    28W / 250W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 00000000:00:05.0 Off |                    0 |
| N/A   48C    P0    30W / 250W |      0MiB / 16280MiB |      7%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Anyway, I'm asking for your help more than anything: how come generation works on the same setup that won't allow finetuning?

Many thanks
Vince.
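
For reading allocator dumps like the one in this post, a small sketch that parses the `Stats:` lines and reports how much of the limit is in use. It only restates the numbers already in the log (the result agrees with the `available bytes: 154` field) and does not diagnose why finetuning needs more memory than generation. `parse_allocator_stats` and `summarize` are hypothetical helper names.

```python
import re

# Copied from the Stats block in the log above.
STATS_TEXT = """\
Limit: 15928269210
InUse: 15928269056
MaxInUse: 15928269056
NumAllocs: 4013
MaxAllocSize: 1262254080
"""


def parse_allocator_stats(text):
    """Turn 'Name: integer' lines into a dict of ints."""
    return {name: int(value) for name, value in re.findall(r"(\w+):\s*(\d+)", text)}


def summarize(stats):
    """Report free bytes and the fraction of the limit currently in use."""
    free = stats["Limit"] - stats["InUse"]
    return {"free_bytes": free, "used_fraction": stats["InUse"] / float(stats["Limit"])}


if __name__ == "__main__":
    print(summarize(parse_allocator_stats(STATS_TEXT)))
```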

How to use CTRL as a conditional generation model?

Thanks for this repo. I would like to ask whether you know how I could fine-tune CTRL on a dataset that has prompts (such as a list of headlines for an event) and the corresponding article. Think of it as a reverse summarization task: you give it a summary and it generates the entire text for you. I could create a new control code and fine-tune on these new articles, but how do I introduce the headlines into the training data? The training data under a control code is just one big text file of articles. Should I place the headlines of each article before the corresponding article and fine-tune on this data?
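
To make the layout proposed in the question concrete, here is a sketch that writes each article's headlines immediately before the article and joins everything into one text blob. It illustrates the data format only; whether fine-tuning on it gives the desired conditioning is untested here. The function names, the `" | "` joiner, and the blank-line separator are all assumptions, and the control code is left out of the text on the assumption that it is supplied separately, as the question describes.

```python
def format_example(headlines, article, joiner=u" | "):
    """Lay out one training example: headlines on one line, then the article."""
    head = joiner.join(h.strip() for h in headlines)
    return head + u"\n" + article.strip()


def build_training_text(examples, separator=u"\n\n"):
    """Join (headlines, article) pairs into a single text blob for fine-tuning."""
    return separator.join(
        format_example(headlines, article) for headlines, article in examples
    )
```

At generation time the prompt would then follow the same layout: the control code, then the headlines, with the model left to continue with the article.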
