Numen omen

unparsed thoughts

Using TTS technology to facilitate language learning


So you have decided to learn a foreign language, let's say a less popular one. OK, classes are hard to find, so you decide to learn it on your own, no big deal. You find a self-study course that suits you, obviously with accompanying audio to serve as a model, and you are all set to pick up the language, right?

Err, not really. What if it soon turns out that the CDs or tapes lack some of the word lists and text passages, some words and sentences are spoken too fast or unclearly, and several dialects are recorded? You end up confused about the proper pronunciation or, worse, you acquire the pronunciation of different dialects and regions depending on the word...

Obviously the solution is to get more language input, so you listen to radio or TV programmes (luckily the Internet offers a lot of streaming media), hoping to hear the words you are not sure of. Tricky, especially if you are a beginner. You can also try to find a native speaker of the language with good, neutral pronunciation. Even harder, eh? So what else can you do?

Enter the land of Text to Speech technology (TTS)! No, I am not suggesting the robotic-sounding speech of formant synthesis, but the more natural unit selection (or at least diphone) synthesis, where a database of natural recorded speech (often much longer than an hour) is used as a model. The recorded speech is then 'chopped' into smaller units: phrases, words, morphemes, syllables or phones. Later, a sophisticated algorithm uses those units and digital signal processing (DSP) to recreate a continuous chain of speech from the input, namely our text.
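To make the idea of picking units from a database a bit more concrete, here is a toy sketch in Python. It is not how any of the commercial engines work internally. The `Unit` fields, the pitch numbers and the recording labels are all made up for illustration, and the only thing the search optimises is a "join cost" (how big the pitch jump is between neighbouring snippets). Real systems also weigh how well each unit fits the target context and use far richer acoustic features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    phone: str    # the sound this recorded snippet contains
    pitch: float  # average pitch in Hz (made-up numbers)
    source: str   # hypothetical label of the recording it was cut from


def join_cost(left: Unit, right: Unit) -> float:
    """Penalty for gluing two snippets together: here, just the pitch jump."""
    return abs(left.pitch - right.pitch)


def select_units(phones, database):
    """Pick one recorded unit per phone so that the total join cost is minimal.

    Uses dynamic programming (a Viterbi-style search) over the candidates.
    Returns the chosen chain of units and its total join cost.
    """
    if not phones:
        return [], 0.0
    candidates = [[u for u in database if u.phone == p] for p in phones]
    for phone, options in zip(phones, candidates):
        if not options:
            raise ValueError(f"no recorded unit for phone {phone!r}")

    # best[i][u] = (cheapest cost of a chain ending in unit u at position i,
    #               the unit chosen at position i - 1 on that chain)
    best = [{u: (0.0, None) for u in candidates[0]}]
    for i in range(1, len(phones)):
        layer = {}
        for u in candidates[i]:
            layer[u] = min(
                ((best[i - 1][p][0] + join_cost(p, u), p) for p in candidates[i - 1]),
                key=lambda pair: pair[0],
            )
        best.append(layer)

    # Walk back from the cheapest final unit to recover the whole chain.
    last = min(best[-1], key=lambda u: best[-1][u][0])
    total = best[-1][last][0]
    chain = [last]
    for i in range(len(phones) - 1, 0, -1):
        chain.append(best[i][chain[-1]][1])
    chain.reverse()
    return chain, total


# A tiny invented "speech database": two recordings of each phone.
DATABASE = [
    Unit("k", 120.0, "rec1"), Unit("k", 180.0, "rec2"),
    Unit("a", 125.0, "rec1"), Unit("a", 200.0, "rec3"),
    Unit("t", 130.0, "rec1"), Unit("t", 190.0, "rec2"),
]

chain, total = select_units(["k", "a", "t"], DATABASE)
print([u.source for u in chain], total)
```

With this invented data the search prefers the three snippets from the same recording, because their pitches sit close together and the joins are the smoothest.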

Sounds fantastic, but how much does it cost? Very good and sophisticated systems used by telecoms or the car industry can be extremely expensive, but desktop solutions are usually much cheaper. The good news is that you can often try a free 30-day demo version, for example Infovox Desktop, or use an online demo without downloading and installing a thing! Sure, those online demos are usually limited to 100-250 characters or play background music, but they are more than adequate for the purpose of language learning.
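If a passage from your coursebook is longer than a demo allows, you can cut it into pieces before pasting. The sketch below is one way to do it in Python, assuming a limit of 200 characters (the function name `split_for_demo` and the default limit are my own choices, so adjust them to the demo you use). It keeps sentences whole where it can and only breaks inside a sentence when that sentence alone is too long.

```python
import re
import textwrap


def split_for_demo(text: str, limit: int = 200) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Sentences are kept whole where possible; a sentence longer than
    the limit is broken at word boundaries.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # textwrap.wrap returns the sentence unchanged if it already fits,
        # and an empty list for an empty string.
        for piece in textwrap.wrap(sentence, width=limit):
            if not current:
                current = piece
            elif len(current) + 1 + len(piece) <= limit:
                current += " " + piece
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


passage = "One two three. Four five six! Seven eight nine?"
for chunk in split_for_demo(passage, limit=20):
    print(len(chunk), chunk)
```

Splitting at sentence boundaries matters here because the engine works out intonation per sentence, so a chunk that ends mid-sentence may sound odd.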

If you are eager to use one of those TTS systems, here are some interactive demos to get you started:
  1. Nuance RealSpeak
  2. AT&T TTS
  3. Acapela HQ TTS
  4. Loquendo TTS

The last one can even produce lively expressive intonation patterns and phrases for a number of languages. Check it out!

If you got interested in TTS, you can read the Wikipedia article "Speech synthesizer" and follow the links there.

One more remark. Many words are context sensitive, i.e. their pronunciation can change depending on their place in the sentence, and TTS engines often "know" that. So it is a good idea to give the engine some context along with the words in doubt, for example the whole sentence in which a word appears.

I wonder if there are more people using this technology as an aid in language learning. Or maybe there are better ways of acquiring correct pronunciation? Let me know what you think.