Ludwig New Version Issues of Repeating output (ludwig, CLOSED)

savi8sant8s commented on June 2, 2024

Ludwig New Version Issues of Repeating output

from ludwig.

Comments (7)

arnavgarg1 commented on June 2, 2024 1

Hi @savi8sant8s! I was able to verify that Ludwig 0.9.3 fixes things. I also made a few changes to your notebook that I believe are important in ensuring good learning/output. Here's the notebook: https://colab.research.google.com/drive/1QwojspiXKVULZ1xsuoUSWDonVS1Ig8JM?usp=sharing

The main thing you'll notice is that I added a code block to profile your data and figure out the distribution of the number of tokens in each of your columns. From this, I learned that the maximum sequence length of your instruction, input and output was 202 tokens. If we also add in the number of tokens for the prompt, it's probably closer to 256 tokens. However, you had set global_max_sequence_length to 128 instead of 256, meaning that the model would only learn from examples in your dataset where the number of tokens in your prompt + instruction + input was < 128 tokens, which wasn't always the case.
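For anyone who wants to run a similar check on their own data, here is a rough sketch of that kind of profiling. It is not the code from the notebook: the whitespace tokenizer is a stand-in (an assumption, the real check should use the tokenizer of the model being fine-tuned), and the rows are made-up toy examples.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer (assumption): splits on whitespace only.

    Swap in the tokenizer of the model you are fine-tuning for real counts.
    """
    return text.split()


def tokens_per_row(
    rows: List[Row],
    columns: List[str],
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
) -> List[int]:
    """Return the total token count across `columns`, one value per row."""
    return [
        sum(len(tokenize(row.get(col, ""))) for col in columns)
        for row in rows
    ]


def summarize(lengths: List[int], limit: int) -> Dict[str, float]:
    """Report the max, the mean, and the share of rows longer than `limit`."""
    over = sum(1 for n in lengths if n > limit)
    return {
        "max": max(lengths),
        "mean": sum(lengths) / len(lengths),
        "share_over_limit": over / len(lengths),
    }


# Toy rows, invented purely for illustration.
rows = [
    {
        "instruction": "Fix the capitalization",
        "input": "hello world",
        "output": "Hello world",
    },
    {
        "instruction": "Fix the capitalization of every word in this sentence",
        "input": "the quick brown fox",
        "output": "The Quick Brown Fox",
    },
]

lengths = tokens_per_row(rows, ["instruction", "input", "output"])
report = summarize(lengths, limit=10)
print(lengths, report)
```

If the share of rows over the limit is not zero, the configured maximum sequence length is likely too small for the dataset.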

The other thing I added was a new trainer parameter called enable_gradient_checkpointing: true which helps reduce memory usage for longer sequences.
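As a rough sketch, the two settings could be expressed as config overrides like this. The parameter names come from this thread; placing them under preprocessing and trainer sections is my assumption about the config layout, so check it against the notebook and the Ludwig docs. The prompt overhead of 30 tokens and the rounding up to a power of two are illustrative choices, not measured values.

```python
from typing import Any, Dict


def round_up_to_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (a convention, not a requirement)."""
    power = 1
    while power < n:
        power *= 2
    return power


def build_overrides(
    observed_max_tokens: int, prompt_overhead_tokens: int
) -> Dict[str, Any]:
    """Build config overrides sized to the longest example plus the prompt."""
    budget = round_up_to_power_of_two(
        observed_max_tokens + prompt_overhead_tokens
    )
    return {
        # Assumed section name; the parameter name is from this thread.
        "preprocessing": {"global_max_sequence_length": budget},
        # Trades extra compute for lower memory use on longer sequences.
        "trainer": {"enable_gradient_checkpointing": True},
    }


# 202 is the maximum reported above; 30 is a hypothetical prompt overhead.
overrides = build_overrides(observed_max_tokens=202, prompt_overhead_tokens=30)
print(overrides)
```

These overrides would then be merged into the rest of the fine-tuning config before training.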

Let me know if the output prediction results in this notebook match your expectation - it seems like it correctly fixed the capitalization and didn't perform the repetition that you were seeing before.

arnavgarg1 commented on June 2, 2024 1

@savi8sant8s Ah I see, will let them know I responded here!

If this issue is resolved, is it okay if I mark it as closed?

arnavgarg1 commented on June 2, 2024

Hi @savi8sant8s, thanks for reporting the issue and sorry you're running into it.

Are you able to share which version of Ludwig you were using before downgrading to Ludwig 0.8.6? We actually introduced some regressions in Ludwig 0.9.1 and 0.9.2 that were fixed in Ludwig 0.9.3 released in the last week, specifically related to finetuning outputs not looking as good as expected for a variety of models including Llama, Mistral, Mixtral and Phi.
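A quick way to check whether an environment is on one of the affected releases is sketched below. The affected set (0.9.1 and 0.9.2) is taken from this comment; the helper names are hypothetical, and the version parsing is deliberately simple.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Releases named in this thread as having the regression.
AFFECTED_VERSIONS = {(0, 9, 1), (0, 9, 2)}


def parse_version(version: str) -> Tuple[int, int, int]:
    """Turn a string like '0.9.1' into (0, 9, 1), padding with zeros."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    numbers += [0] * (3 - len(numbers))
    return (numbers[0], numbers[1], numbers[2])


def is_affected(version: str) -> bool:
    """True if the version is one of the releases listed above."""
    return parse_version(version) in AFFECTED_VERSIONS


def installed_ludwig_version() -> Optional[str]:
    """Return the installed ludwig version, or None if it is not installed."""
    try:
        return metadata.version("ludwig")
    except metadata.PackageNotFoundError:
        return None


installed = installed_ludwig_version()
if installed is None:
    print("ludwig is not installed in this environment")
elif is_affected(installed):
    print(f"ludwig {installed} is affected; consider upgrading to 0.9.3 or later")
else:
    print(f"ludwig {installed} is not one of the affected releases")
```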

If you can share your dataset, I'm happy to test it for you with the latest Ludwig version and see if I can reproduce the error and then look into a fix.

savi8sant8s commented on June 2, 2024

I was using version 0.9.1 @arnavgarg1.
Below are my notebook and prompts. I was fine-tuning Llama2-7b to create a text corrector for Portuguese.
project.zip
Thank you for the contact.

savi8sant8s commented on June 2, 2024

Worked perfectly, @arnavgarg1. Thank you so much. The issue is in ludwig-docs too. If you can resolve it there, its author will also know that the new update solved the problem. Thanks again for the help.

arnavgarg1 commented on June 2, 2024

@savi8sant8s I'm glad to hear that it worked perfectly!

Could you explain the issue in Ludwig-docs that you are referring to? Based on what you said, my understanding is that there was no notice on Ludwig docs explaining that this issue exists in Ludwig 0.9/0.9.1/0.9.2 and that we were working on a fix and now it is fixed. Is this understanding right?

savi8sant8s commented on June 2, 2024

@arnavgarg1 In fact, an issue about this was mistakenly created in ludwig-docs: ludwig-ai/ludwig-docs#337.
