Comments (19)

Caballero-Arepa commented on June 7, 2024

❌️Nope. I say no to it.

Translating Unciv is not like translating a plain text, even if you were to format it as one.

You can't just use a translator to get a good result because the computer does not have the level of comprehension a human does.

Why? Because there are game terms, unit formatting, bracket variables, and all of that.
You need to stay consistent with yourself, and in some cases you need to change the phrase entirely to convey the message, not just translate literally.

A Machine Learning program won't be able to understand the usage of conditionals on uniques and their context.

from unciv.

yairm210 commented on June 7, 2024

IF someone can get ChatGPT to do the work of reading the file and translating line-by-line while retaining placeholders, this sounds like a possible time-saver.
If this requires manually copy-pasting lines to and from ChatGPT, it's a waste of time.
@awhillas if you can make it work for one language and guide us through the steps, we can consider this.
But this sounds to me more like "I bet GPT can solve this" than an actual solution.

Regarding licensing for GPT output in the game - that is a GOOD POINT that I had not even considered!

I would definitely evaluate this under "high risk low reward" rather than the opposite.
I'm keeping this open for, say, another week so others can gather data and comment; if it's not actionable by then, I'll close it. Because currently, this is NOT actionable.

(What are the limits of input you can send to GPT? Will it be able to eat 5000 line files? What are the limits of output? More questions than answers here)
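
To make "retaining placeholders" concrete, here is a minimal sketch of a check that could run over machine output. It assumes a placeholder is any square-bracketed name such as [unitName] and that it must appear unchanged in the translation; the function names are hypothetical and not part of Unciv, and nested brackets are not handled.

```python
import re
from collections import Counter

# Assumption: a placeholder is any text wrapped in square brackets, e.g. [unitName].
PLACEHOLDER = re.compile(r"\[[^\[\]]*\]")


def placeholders(text: str) -> Counter:
    """Count each bracketed placeholder found in text."""
    return Counter(PLACEHOLDER.findall(text))


def placeholders_preserved(source: str, translated: str) -> bool:
    """True when both strings contain the same placeholders the same number of times."""
    return placeholders(source) == placeholders(translated)


def find_broken(pairs):
    """Return the (source, translated) pairs whose placeholders differ."""
    return [(s, t) for s, t in pairs if not placeholders_preserved(s, t)]
```

A check like this only flags lines where a placeholder was dropped, renamed, or duplicated; it says nothing about whether the surrounding translation is any good.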

SomeTroglodyte commented on June 7, 2024

Actually, implementing language-adding mods might be a nice step, even streamlining the new-language process. Since languages are not part of the Ruleset, this would need some special-casing, but it's doable. A ModOptions unique triggering a language table extension, or some such.

SomeTroglodyte commented on June 7, 2024

What's a GPT? A Ghastly Precocious Teen? Or is it something to eat?

Cwpute commented on June 7, 2024

[…] A Machine Learning program won't be able to understand the usage of conditionals on uniques and their context.

…yet.
But until then I very much agree with all that you said.

Did you have a particular language in mind, @awhillas? Because if you happen to know that language, you might want to join in and translate the game yourself (here for more details).

SomeTroglodyte commented on June 7, 2024

Look - it's true an AI approach may add value to machine translation, specifically with patterns across "translatable units". But

  • machine translations are still abysmally indigestible in so many cases (like microsh- ahem -soft machine-translating their technical documentation or exception messages or... - which always leads to: the programmer must look up the original English to have any chance of understanding, even if their native tongue is the language shown).
  • Those offers seem very much "Anti" to the open source idea, or at least that's the impression I'm getting. And we are Origo aperta. (Hey that would have fit a line in Latin.properties better)

The idea to take machine translations is not new. I did a version of Unciv once that even machine-translated on-the-fly, using the one open source MT engine! Massive delays of course, but a cute experiment. Yup, still accessible: Apertium. (I never converted that into a run-once tool to "seed" a new language, though, as most languages I deemed interesting to get this way have missing to superficial support from that project.)

awhillas commented on June 7, 2024

@SomeTroglodyte
By GPT I mean ChatGPT, which does a very good job of translating (depending on how rare the language is).

@Caballero-Arepa

the computer does not have the level of comprehension a human does.

Oh yes it does. I guess you haven't played with ChatGPT? But I don't have to convince you, just give it a try.

Why? Because there are game terms, unit formatting, bracket variables, and all of that.

Yeah, the model is smart enough to handle that. I've been using it to do markdown and it translates the formatting there flawlessly. If you have some special format, you can include the rules in the prompt and it will get it right nearly all the time. The few times it doesn't, it will be obvious and easily fixed.

It would certainly give a good first draft. Speakers of a given language who think the translation needs work could then chime in without having to translate everything themselves; correcting small details is less intimidating than starting from scratch.

Anyway, it might be an interesting experiment and if it's a failure you can just remove the language files. Low risk, high reward.

SomeTroglodyte commented on June 7, 2024

"ChatGPT" - Crappy hack abusing trust Gruesomely (while) Perusing Tetrahydrocannabinol"?

Low risk high threshold - I'm not going to open an account with a commercial entity I don't trust and where I see no good reason to evaluate that trust. So - you go ahead. I'm out.

SomeTroglodyte commented on June 7, 2024

Actually, the argument may be - how will Goblin Partisan Terror output be redistributable under an indisputable open license? I guess not at all.

That makes this a feature request to support adding languages as a mod.

awhillas commented on June 7, 2024

The licence on the output of ChatGPT, or of any LLM or other AI model (language, image, audio, whatever), is ambiguous at the moment. Considering that it is trained on a lot of open content, I would say they don't have a leg to stand on if they tried to claim copyright on any of its generated output. Also, considering that many enterprises have built products on it, that cannot be the case. Anyway, if you can find any information about copyright on LLM output, please let me know! (My job depends on it.) But just thinking about it: if OpenAI did try to claim copyright on anything it generated, its whole business model would collapse, for who would continue to use them? So there is that.

But if that is still a concern, there are plenty of open source models, such as Bloom (haven't tried it for translation, as hosting a 70-billion-parameter model is tricky, but there is a way) and Llama 2 (again, haven't tried it for translation, but Google Vertex hosts it; you just need a GCP account). I haven't used any of these (yet), as I pay for OpenAI and it's easy.

So I don't really see the "high risk low reward" side of it. Translating to 59 languages on the fly for each update also doesn't seem like a "low reward" to me. I'm sure once the industry cottons on to this, every game will do it; why wouldn't they?

To get started, I'd generally just send one block of related text at a time, with context and perhaps some examples of how to handle tricky formatting. But it's pretty smart and can figure out most things from patterns and examples.

Give me something like a CSV with the source text in one column and I'll write a script to pass it through OpenAI and write the output to another column of the same CSV (or to another file keyed by the same text ID, or whatever you're using). Should take me 30 minutes.
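
Roughly, the script described here could be shaped like the sketch below. It deliberately leaves the model call out: `translate_batch` is a hypothetical callable you would supply (wrapping whichever API or local model you use), and the batch size of 50 is an arbitrary assumption, not a known limit of any service.

```python
from typing import Callable, Dict, Iterable, List


def batched(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_all(
    sources: List[str],
    translate_batch: Callable[[List[str]], List[str]],  # hypothetical: wraps the actual model call
    batch_size: int = 50,  # assumption: small enough to fit the model's input limit
) -> Dict[str, str]:
    """Map every source string to its translation, one batch at a time."""
    result: Dict[str, str] = {}
    for chunk in batched(sources, batch_size):
        translated = translate_batch(chunk)
        # Refuse to guess the alignment if the model merged or dropped lines.
        if len(translated) != len(chunk):
            raise ValueError("translator returned a different number of lines")
        result.update(zip(chunk, translated))
    return result
```

Keeping the model call behind a plain function makes it easy to swap OpenAI for an open model later, and to test the batching with a fake translator first.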

yairm210 commented on June 7, 2024

https://github.com/yairm210/Unciv/blob/master/android/assets/jsons/translations/template.properties
Not exactly a CSV, delimited by =, but close enough
Example translated file: https://github.com/yairm210/Unciv/blob/master/android/assets/jsons/translations/Russian.properties
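
For anyone scripting against these files, a minimal reading sketch might look like this. It assumes each entry is one line of the form `source = translation`, that an untranslated entry ends right after the `=`, and that lines starting with `#` are comments; check those assumptions against the actual files before relying on it. The function name is hypothetical.

```python
def parse_translation_lines(lines):
    """Parse 'source = translation' lines into a dict, skipping blanks and # comments."""
    entries = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first " = " only; assumption: source texts never contain " = ".
        source, sep, translation = line.partition(" = ")
        if not sep and line.endswith(" ="):
            # Untranslated entry such as "Some text = " (trailing space stripped above).
            source, translation = line[:-2], ""
        elif not sep:
            continue  # not a source/translation line
        entries[source] = translation
    return entries
```

Usage would be along the lines of `parse_translation_lines(open(path, encoding="utf-8"))`, with `path` pointing at a local copy of one of the files above.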

awhillas commented on June 7, 2024

cool, I'll do a Russian one so we can compare. And a language you don't have?

SomeTroglodyte commented on June 7, 2024

language you don't have

Bork! Or Klingon in a pinch. Problem is, the Klingon alphabet ™️ is defined in unicode, but outside the UCS-16 range, so not trivial to include with a libGdx leg to stand on.

SomeTroglodyte commented on June 7, 2024

Or just look in https://github.com/yairm210/Unciv/blob/master/android/assets/jsons/translations/ and try to think of one that's missing. Or Greek: it's at 14% coverage, hasn't been updated in a while, and is in a sense Civ-relevant.

yairm210 commented on June 7, 2024

language you don't have

Bork! Or Klingon in a pinch. Problem is, the Klingon alphabet ™️ is defined in unicode, but outside the UCS-16 range, so not trivial to include with a libGdx leg to stand on.

...no
Greek is a good pick, though :)

SomeTroglodyte commented on June 7, 2024

...no

😭 - the Monty Python dialect of Hungarian, then? Volapük? Loglan? Dravidian? Blissymbols? Anything that would have stumped Sapir-Whorf?

yairm210 commented on June 7, 2024

Ah yes, the famous *checks notes* Whorf effect
Actually never heard of Blissymbols before
"My [unitName] is full of [unitName]s"

awhillas commented on June 7, 2024

ok, so Greek? I guess you don't have anyone who can check it? I could also do English in a pirate voice or make every longish bit of text a rap :D

yairm210 commented on June 7, 2024

Greek is good, pirate would be acceptable as a POC but Greek would be better, we can push it to 'prod' and see what people say
