
texttotalk's Introduction

TextToTalk

Chat TTS plugin for Dalamud. Has support for triggers/exclusions, several TTS providers, and more!

Commands

  • /tttconfig: Opens the configuration window.
  • /canceltts: Cancel all queued TTS messages.
  • /toggletts: Turns TTS on or off.
  • /disabletts: Turns TTS off.
  • /enabletts: Turns TTS on.

Lexicons

TextToTalk supports custom lexicons to modify how words are pronounced. For more information, please join our community lexicons discussion.

Direct links to information will be added here eventually.

Supported TTS providers

  • System (Windows)
  • AWS Polly
  • Azure (Microsoft Cognitive Services)
  • Uberduck
  • WebSocket

WebSocket interfacing

TextToTalk can optionally open a WebSocket server to serve messages over. There are currently two JSON-format messages that can be sent (see IpcMessage):

TTS prompt:

{
  "Type": "Say",
  "Payload": "Firstname Lastname says something",
  // Will replace the logged-in player's name with {{FULL_NAME}}, {{FIRST_NAME}}, or {{LAST_NAME}} as appropriate.
  // Does not currently apply to players other than the logged-in player.
  "PayloadTemplate": "{{FULL_NAME}} says something",
  "Voice": {
    "Name": "Gender"
  },
  "Speaker": "Firstname Lastname",
  // or "AddonTalk", or "AddonBattleTalk"
  "Source": "Chat",
  "StuttersRemoved": false,
  // or null, for non-NPCs
  "NpcId": 1000115,
  // Refer to https://dalamud.dev/api/Dalamud.Game.Text/Enums/XivChatType
  "ChatType": 10,
  // Refer to https://dalamud.dev/api/Dalamud/Enums/ClientLanguage
  "Language": "English"
}

TTS cancel:

{
  "Type": "Cancel",
  "Payload": "",
  "PayloadTemplate": "",
  "Voice": null,
  "Speaker": null,
  // or "Chat", "AddonTalk", or "AddonBattleTalk"
  "Source": "None",
  "StuttersRemoved": false,
  "NpcId": null,
  "ChatType": null,
  "Language": null
}
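
As a rough illustration of consuming the two message types above, here is a minimal client-side sketch in Python. It only covers decoding and dispatch; how the text frames are received (which WebSocket library, host, and port) is left out. The `IpcMessage` dataclass here is a hypothetical Python mirror of a subset of the fields shown above, not the plugin's own class, and clearing the whole queue on "Cancel" regardless of "Source" is an assumption made for brevity. Real frames are assumed to be plain JSON without the `//` comments used in the samples.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class IpcMessage:
    """Hypothetical mirror of a subset of the fields in the samples above."""
    type: str
    payload: str
    speaker: Optional[str]
    source: str
    npc_id: Optional[int]


def parse_message(raw: str) -> IpcMessage:
    """Decode one JSON text frame into an IpcMessage."""
    data = json.loads(raw)
    return IpcMessage(
        type=data["Type"],
        payload=data.get("Payload") or "",
        speaker=data.get("Speaker"),
        source=data.get("Source") or "None",
        npc_id=data.get("NpcId"),
    )


def handle_message(msg: IpcMessage, queue: list) -> None:
    """Queue the payload on "Say"; drop everything pending on "Cancel".

    Assumption: "Source" is ignored here, so a cancel clears all pending text.
    """
    if msg.type == "Say":
        queue.append(msg.payload)
    elif msg.type == "Cancel":
        queue.clear()
```

A client would call `parse_message` on each received frame and pass the result to `handle_message` together with its own list of pending lines.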

Development

Refer to the wiki for dev documentation.

texttotalk's People

Contributors

ayuei, cidan, johnysandels, karashiiro, kaxlabs, lazerl0rd, passivemodding, ryankhart, spinda

texttotalk's Issues

Websocket Server - Connection problems and static port

I tried using Python and kept getting a 501 response, so I tried WebSocketSharp instead, and now I'm getting a "|Fatal|WebSocket.connect:0|WebSocketSharp.WebSocketException: Not a WebSocket handshake response" on connection.

The plan is to use Coqui TTS as a back-end.

Do you have a sample client that you used for testing?

Amazon Polly crash

Hello, first of all I would like to thank you for this plugin; it's absolutely amazing. I'm having an issue where the game crashes when I try to enable Amazon Polly text to speech. I tried reinstalling the addon, and as soon as I try to switch to Polly it just crashes.

Ungendered voice selected isn't used

I'm having an issue where the ungendered voice I selected isn't properly registered and (I'm assuming) the default US English speaker voice is used instead.

See this vid for example. For some reason, the game recognizes the report on the table as a male speaker, but that's probably how they assigned gender in the game? Not that I have any issue with that. Anyway, you can see around 0:44 where an unknown speaker enters the scene and, even though I selected Takumi, a JP voice, for ungendered in the config, an English voice was played.

TTT sometimes stops working and game freezes

Hello,

I've uninstalled the TTT plugin and removed the settings, but I still have the problem.
Sometimes, after one sentence is read successfully, the TTT plugin stops working.
Then, if I try to disable it or exit the game, the game freezes.

I never had the problem before, but I've switched to Windows 11 recently, on a new computer.
I'm using the standard Windows voices.
It seems to happen when I alt-tab to Chrome.

EDIT:
When I type /xldebug, open the log window in verbose mode, and then launch a dialogue, I get the following message many times:
"Unhandled SetStringChunkType: 16", then the same message with 29, 19, 41, 19, etc.

EDIT 2:
After uninstalling all plugins and installing TTT fresh, it seems to be more stable now. So I suppose that there was a conflict with another plugin or setting.

Support for Additional Voices

Version: 1909 Windows 10

I'm wondering if this plugin pulls from the available voices in the Control Panel on Windows 10, or if it is limited to Zira / David when using Canadian English (EN) as the default language. In my Control Panel I have a couple of extra voices available that were installed from a third party, but they don't show up in the addon.

Amazon Polly settings gone

I copied over the wrong user key, and after it failed to authenticate I was unable to fix it. All Polly settings seem to have gone away. I have manually deleted the config files and completely reinstalled Dalamud, but haven't been able to get the Polly settings to show up again.


Often, but not always repeating

This seems to be happening more and more often: maybe one time in three, it will repeat the text it recently spoke. It will usually start from the beginning of the dialogue tree as opposed to where I currently am.

I also don't have that translating plugin installed, which I think I saw mentioned before.

Disable "Character Name says:" completely

Perhaps this is a rare use case, but I would really appreciate if there was an option to remove "Character Name says:" completely, even for the first time a character speaks.

Community lexicons

Migrated to #60 - please continue there!

I don't use lexicons myself, so I don't have one I'm maintaining, but if anyone else has lexicons they're willing to share I'd appreciate it if they could drop a link so I can provide them to anyone who wants them and doesn't know how to make them themselves. Alternatively, feel free to post them in the #preset-sharing channel in the goat place Discord, and I'll relink them somewhere here.

Long "—" used in FFXIV dialogue leads to individual pronunciation of characters.

As the title suggests, the "—" character, also known as an em dash (at least I think that's what is used), makes TTS read each character individually rather than reading the words in front of and after the "—". So something like

behold Great King moogle — first of his name etc. sounds really silly.

Would there be any way to make "—" be interpreted as "-" instead? "-" is treated as a normal dash and doesn't cause words to be read as individual characters.
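
The substitution requested above could be sketched as a small preprocessing step applied to the text before it reaches the TTS backend. This is only an illustration in Python: `normalize_dashes` is a hypothetical helper, not part of the plugin, and whether the replacement actually changes pronunciation depends on the TTS engine in use.

```python
EM_DASH = "\u2014"  # the "—" character


def normalize_dashes(text: str) -> str:
    """Replace every em dash with a plain hyphen, leaving spacing untouched."""
    return text.replace(EM_DASH, "-")
```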

Configuration not saving

Every time I log out and back in, I have to go into TextToTalk and reselect the channels I want. Uninstalling the plugin and reinstalling gives the same behavior and everything gets set to defaults.

(PLS) Longer graphemes do not take precedence

This may be related to #46, but it seemed different enough to open a new issue.

When matching entries in a lexicon file (for the System backend, at least), TTT appears to give priority to shorter grapheme matches. This prevents longer matches from working entirely.

E.g., given two entries, one with <grapheme>Ixal</grapheme> and the other with <grapheme>Ixali</grapheme>, when given the string "Ixali", TTT matches the first and ignores the second. The backend then sees the result as Ixal and i, and pronounces it as ['ɪk.sɑːl 'aɪ].

This is true regardless of the order of lexemes in the lexicon file. It seems like LexiconManager sorts the entries and then applies them in the order of shortest to longest.

FWIW, this seems to run counter to a couple guidelines in the PLS specification at https://www.w3.org/TR/pronunciation-lexicon/#AppC:

Precedence should be given to the retrieval of lexemes having a <grapheme> element whose content exactly matches the longest possible sequence of consecutive tokens. Thus, a lexeme for "they'll" should have precedence over a lexeme for "they" given the input "they'll".

Lexical retrieval should be performed on the basis of tokens rather than characters. Thus, a lexeme for "do" should not match the beginning of "done".

The current implementation in LexiconManager doesn't appear to bother with any tokenization at the moment, so that might be worth pursuing. How exactly tokenization is typically implemented for speech synthesis is a bit beyond my depth, though.

Either way, I've found a workaround with the current version. Because aliases are applied before phonemes, you can use an alias to replace the longer grapheme with a string that doesn't match the shorter one, and then create a separate lexeme that matches that alias to the correct phoneme, like so:

  <lexeme>
    <grapheme>Ixali</grapheme>
    <grapheme>ixali</grapheme>
    <alias>I_x_a_l_i</alias>
  </lexeme>
  <lexeme>
    <grapheme>I_x_a_l_i</grapheme>
    <phoneme>ɪkˈsɑːli</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>Ixal</grapheme>
    <grapheme>ixal</grapheme>
    <phoneme>ɪksɑːl</phoneme>
  </lexeme>

This is super hacky, though.
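
For comparison, the longest-match, token-based precedence described in the PLS guidelines quoted above could be sketched like this in Python. This is not how LexiconManager works; `apply_lexicon` is a hypothetical helper, the replacement strings are made-up respellings rather than real phoneme data, and regex word boundaries are used as a crude stand-in for proper tokenization.

```python
import re


def apply_lexicon(text: str, lexicon: dict[str, str]) -> str:
    """Replace whole tokens with their lexicon entry, longest grapheme first."""
    if not lexicon:
        return text
    # Try longer graphemes first, so "Ixali" wins over "Ixal".
    graphemes = sorted(lexicon, key=len, reverse=True)
    # The \b anchors stop "Ixal" from matching inside a longer word.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(g) for g in graphemes) + r")\b"
    )
    return pattern.sub(lambda m: lexicon[m.group(0)], text)
```

With entries for both "Ixal" and "Ixali", the input "Ixali" is replaced by the "Ixali" entry, and a longer word such as "Ixalion" is left alone.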

Option to speak current dialogue bubble immediately upon TTS enable

This is a bit of a nitpick, but I love this plugin too much not to give more feedback.

TextToTalk used to speak the NPC dialogue currently on the screen when I used the keyboard shortcut to enable TTS. This functionality, unfortunately, got removed in this commit.

Was it sometimes annoying hearing TTS NPC dialogue from an old cutscene I had 15 minutes ago? Sure, I guess, but to me, it's more disappointing that, if I forget to enable TTS before an NPC conversation, then I have to manually read the first line of dialogue before TTS kicks in for the next line of dialogue.

Furthermore, even if I never forget to enable TTS before NPC dialogue, I don't know which NPC dialogue will be voice acted before it starts, so, before this commit, I had previously disabled TTS just in case there was voice acting, and if not, then I'd enable it with the shortcut.

However, if there is imminent functionality to automatically disable TTS during voiced cutscenes as described in this issue, my issue described here is more or less a non-issue, as I would have no reason to ever manually disable TTS.

More German Voices for Amazon Polly

At the moment the only German Voice in the option list for Amazon Polly is "Vicki".

According to this site there are two more German voices "Hans" and "Marlene". Could you please enable those in the option list as well?

I took a look into the AWSSDK.Polly.dll binary that you use, and the strings "Hans" and "Marlene" can be found inside, so I hope it is just a change in how the TTS plugin uses the Amazon Polly API.

Nested Replacements (w priority system) for lexicon

This would allow replacements to be given a priority, so that the name Y'shtola takes priority over 's.

To be honest, this isn't that important, since lexicons are mainly used for names, not for recreating a whole language. The only reason this would be helpful is for a possessive 's after a name.

Voice Unlocker Error

I get an error when trying to use the built-in voice unlocker, both when clicking "Manual tutorial" and when clicking "Enable all system voices", on versions 1.9.4 and 1.9.7. Other versions were not tested.

Nothing indicating an issue shows up in the output or the Dalamud logs.


Addon Crashes Launcher

If the addon is loaded, the game crashes. At least 15 more people are reporting this issue on the XIVLauncher Discord.

1.8.0.1 Bugs

  • Deleting the plugin in the plugin installer runs into an error.

  • /echo chat doesn't trigger TTS when it's supposed to.

.NET SpeechSynthesizer bugs

  • With a custom lexicon, the voice sometimes gets stuck on one voice, and changing the voice crashes the game.

  • One time this happened, the game kept crashing until I deleted my config (the voice didn't get stuck after that). I haven't reproduced the game crashing, but sometimes the voice still gets stuck even when changing it. I think it might have something to do with lexicons and gendered voice presets not playing nice. (I can force the voice I want by enabling gendered presets and putting them all on the preset I'm currently using.)


Listed below are issues that are fixed by deleting and reinstalling the plugin or restarting the game. I haven't had time to see if these have been fixed in 1.8.4.0:

  • With a working custom lexicon selected, after some time TTS can't read anything out loud at all anymore.

  • The custom lexicon sometimes stops being used, and the default pronunciation is used instead.


FIXED - when using <phoneme> in the lexicon.xml, TTS can't read anything out loud at all. Phonemes are working with some bugs (not tested for all the bugs yet).

FIXED - it also seems like European voices don't use the lexicon pronunciation when xml:lang="en", only when it's set to xml:lang="en-GB".

FIXED - first time selecting a lexicon file, pronunciation isn't used.

FIXED - deleting and re-selecting lexicons to use updated version of same lexicon.xml file won't use new pronunciation.

All Speech is interrupted by new dialogue box text and chat box messages when "Cancel the current speech when new text is available or text is advanced" is enabled.

The following bug occurs when "Cancel the current speech when new text is available or text is advanced" is enabled.

  • All text from dialogue boxes and triggered chat messages interrupts the queue and cancels the current speech. The newest triggered chat box text or dialogue text is played.

     Ideally this feature should only cancel dialogue text when new dialogue text is available or advanced.

A possible solution to this is to label dialogue text in the queue as dialogue, and when new dialogue text is available or text is advanced, clear the text labeled "dialogue". Then, move triggered chat messages up the queue so they aren't lost.

I think people probably only want dialogue messages to interrupt dialogue messages, and nothing else should interrupt anything else.
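
The labeling idea proposed above could be sketched roughly as follows in Python. This is not the plugin's queue: `SpeechQueue`, `QueuedLine`, and the "Dialogue"/"Chat" labels are hypothetical names for illustration, and the sketch only models pending lines, not cancelling the line that is currently being spoken.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class QueuedLine:
    source: str  # hypothetical labels: "Dialogue" or "Chat"
    text: str


class SpeechQueue:
    """Pending TTS lines, where only dialogue can displace dialogue."""

    def __init__(self) -> None:
        self._items = deque()

    def enqueue(self, source: str, text: str) -> None:
        if source == "Dialogue":
            # New dialogue drops pending dialogue; chat lines keep their order.
            self._items = deque(
                item for item in self._items if item.source != "Dialogue"
            )
        self._items.append(QueuedLine(source, text))

    def pending(self) -> list[str]:
        """Return the queued texts in the order they would be spoken."""
        return [item.text for item in self._items]
```

Queuing a dialogue line, then a chat line, then a second dialogue line leaves the chat line first in the queue, followed by the second dialogue line.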

Here is a recording using /echo and a dialogue box as an example of chat messages interrupting dialogue, dialogue interrupting chat messages, and chat messages interrupting chat messages.

https://streamable.com/8rpyoh

Feel free to reach out if you need any clarification! this one was a bit wordy because it was hard to explain 😅

Stop reading current NPC dialogue box when dialogue box is advanced or closed

It would be great if TextToTalk would cancel reading the previous dialogue box when advancing to the next dialogue, and also when a text box is closed once you're done talking to an NPC. Video example: after I click the text box closed once I've finished reading it, the voice continues, and when a new text box opens it has to wait for the previous voice to finish: https://streamable.com/saeger

Similarly, when toggling TextToTalk to disabled, it would be a nice addition to clear the TTS queue and cancel the current TTS! example : when disabling TTS, the voice continues https://streamable.com/8d72l1

It can be annoying when skipping through a bunch of unimportant text from a NPC, that the text drones on in the background which needs to be waited through, or manually cancelled.

(Also, side note: thank you so much for adding the dialogue box reading! It's been making the non-voiced cutscenes and their accompanying missions so much easier to follow, and they feel more important! Thanks for making such a great plugin!)

Amazon Polly's Neural Engine voice options not listed...at first

Amazon Polly's Neural Engine voice options are not listed, and when I tried "standard" engine and picked voices, and then later switched back to the "neural" engine, the default neural engine voice that had worked before no longer works, and I'm still stuck with whichever "standard" engine voices that I had picked.

Gendered speakers being read at same time

If you advance through multiple dialogue boxes that include different gendered speakers (even with the gendered speakers option off), the dialogue for the different speakers will be read at the same time.

Read current text when Enabling TTS while dialogue box is open, don't read current text when enabling TTS while dialogue box is closed.

In the XLDev Addon Inspector it is possible to see if the dialogue box is visible or not, so it should be possible to detect if the dialogue box is open.
I think that if the dialogue window is open, enabling TTS through the keybind shortcut or through chat commands should start reading the currently open dialogue box.
If the dialogue window is not open when enabling the TTS through the keybind or chat command, it should not read the current dialogue.

I hope that all makes sense! feel free to reach out for clarification if I said it weird!
Still really enjoying this plugin! it's been great for the Hildebrand missions!

User defined profiles with each their own hotkey?

Normally, I'd prefer to hear TTS for unvoiced NPC dialogue and maybe status effect changes on myself, but sometimes, when I'm just hanging out in a city chatting occasionally, I'd like to hear TTS from a number of additional sources. I'm not always paying close attention to the chatbox.

I think it would be great as a user, to be able to create separate profiles with different TTS settings, each having its own hotkey to switch to it.

As it is now, the hotkey is just a simple toggle, so if each profile's hotkey also worked as a toggle, it would probably need to disable any other previously active profile.

I do already enjoy FFXIV a lot more with this plugin as is. Thank you for creating it!

TTS repeated twice in most dialogues

I am using testing version 1.8.0.4, and many of the dialogues are read twice. I am using the Amazon Polly Neural Engine, and it's only free for 1 million characters; I'm not sure if repeated speech is counted or not.

Amazon Polly Voices Rate Not Working

Amazon Polly Voices work, but changing the rate in the slider doesn't affect the voice speed for them. Is it possible to have rate supported for third party 64-bit voices?

Read quest text.

I have an idea for an accessibility feature where this addon could read from the quest text window directly for the non voice acted quests.
Right now, there is an option to enable npc dialogue, however the text is printed to chat after skipping to next line making it hard to follow.
A hotkey could be added and it would read on press so it doesn't automatically read everything.
I found where the text is stored if that is of any help.

Awkward pronunciation of some abbreviations with Amazon Polly

I'm using the Amazon Polly voices, which are great, but sometimes it extrapolates abbreviations like "nin" to mean "Nine Inch Nails" or "res" to mean "residential", and it throws me for a loop every time I hear it. It would be pretty sweet to be able to disable that and/or be able to set our own custom pronunciations for words.

I don't know how the backend of this is set up, but assuming it's an Amazon Polly issue, then this issue isn't really a bug with this plugin but more of a feature request to be able to better utilize Amazon Polly.

I'll still love and use this plugin regardless.

TextToTalk Error

After installing the plugin and attempting to open the configuration window by clicking on the config button, I get this error popup:
https://i.imgur.com/F4T0atz.png

I tried restarting the game, but that didn't work. Please advise.
