In this paper we evaluate our conversational TTS model with prosody control in a contextual, interactive setting.
To do this we developed a 20 questions game using GPT-4 as a back-end, Google ASR and our TTS server.
We also developed an interaction manager that kept track of the progress in the game
and controlled the prosodic realization accordingly.
The first table illustrates this prosody control with two example sentences.
| Bad progress | Moderate progress | Good progress |
| Speaking style | Positive text | Negative text |
| Slow speech rate and low pitch | | |
| Slower speech rate and lower pitch | | |
| Moderate speech rate and pitch | | |
| Higher speech rate and higher pitch | | |
| High speech rate and high pitch | | |
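As a rough illustration of how an interaction manager could tie game progress to the five prosody settings listed in the table, here is a minimal Python sketch. It is not the authors' implementation: the `progress` score in [0, 1], the integer step scale, and the names `ProsodySetting` and `prosody_for_progress` are all assumptions made for this example, as is the idea that the settings run in table order from bad to good progress.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProsodySetting:
    description: str
    rate_step: int   # hypothetical scale: -2 (slowest) to +2 (fastest)
    pitch_step: int  # hypothetical scale: -2 (lowest) to +2 (highest)


# The five settings from the table, kept in table order and assumed
# (not confirmed by the source) to run from bad to good game progress.
SETTINGS = (
    ProsodySetting("slow speech rate and low pitch", -2, -2),
    ProsodySetting("slower speech rate and lower pitch", -1, -1),
    ProsodySetting("moderate speech rate and pitch", 0, 0),
    ProsodySetting("higher speech rate and higher pitch", 1, 1),
    ProsodySetting("high speech rate and high pitch", 2, 2),
)


def prosody_for_progress(progress: float) -> ProsodySetting:
    """Map a hypothetical progress score in [0, 1] to one of the five settings."""
    clamped = min(max(progress, 0.0), 1.0)
    # Split [0, 1] into five equal bins; a score of exactly 1.0 falls in the last bin.
    index = min(int(clamped * len(SETTINGS)), len(SETTINGS) - 1)
    return SETTINGS[index]


if __name__ == "__main__":
    for score in (0.0, 0.5, 1.0):
        print(score, "->", prosody_for_progress(score).description)
```

How the progress score itself would be derived (for example from the answers received so far) is left open here, since the page does not describe it.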
Using an animal guessing game, we evaluated a Tacotron 2 TTS voice with prosody control (TTS-Pros), which controls speaking style (from read to conversational) and prosody (mean pitch and speaking rate).
We compared it with a Tacotron voice without prosody control (TTS-T2) and a Matcha TTS voice (TTS-Matcha).
In the study, 21 subjects were invited to our smart home lab to play an animal guessing game.
In the game, the user and the system took turns thinking of an animal or asking questions and guessing.
Each user played 4 games with a system that used one of the three TTS systems.
Below are some examples of these interactions. The tables contain only the ASR transcript of the user input.
    
| Time (start-end) | Speaker | Audio | Text | 
| 00:00:04,096 -- 00:00:05,798 | TTS-Pros: | Hi there Lets play 10 questions. | |
| 00:00:06,096 -- 00:00:08,798 | TTS-Pros: | Is the animal you're thinking of typically kept as a pet? | |
| 00:00:09,638 -- 00:00:09,858 | USER: | Yes | |
| 00:00:12,359 -- 00:00:13,119 | TTS-Pros: | Ah, fantastic. | |
| 00:00:13,640 -- 00:00:14,820 | TTS-Pros: | That certainly narrows it down. | |
| 00:00:22,584 -- 00:00:25,825 | TTS-Pros: | Is the animal you're thinking of small enough to be held in your arms? | |
| 00:00:27,406 -- 00:00:27,906 | USER: | Of course, yes. | |
| 00:00:30,069 -- 00:00:31,991 | TTS-Pros: | Oh wonderful that narrows it down quite a bit. | |
| 00:00:36,615 -- 00:00:38,336 | TTS-Pros: | Is this animal a type of rodent? | |
| 00:00:40,878 -- 00:00:41,739 | USER: | I'm not quite sure. | |
| 00:00:41,759 -- 00:00:46,063 | TTS-Pros: | Okay, so then back to the drawing board it seems. | |
| 00:00:49,486 -- 00:00:50,627 | TTS-Pros: | Does this animal have fur? | |
| 00:00:51,548 -- 00:00:51,748 | USER: | Yes. | |
| 00:00:54,210 -- 00:00:54,830 | TTS-Pros: | Ah, splendid. | |
| 00:00:56,712 -- 00:00:57,573 | TTS-Pros: | That narrows it down. | |
| 00:01:01,428 -- 00:01:02,549 | TTS-Pros: | Does this animal have fur? | |
| 00:01:03,630 -- 00:01:03,850 | USER: | Yes! | |
| 00:01:06,192 -- 00:01:06,792 | TTS-Pros: | Ah, that's great. | |
| 00:01:07,413 -- 00:01:08,614 | TTS-Pros: | Seems like I'm getting closer. | |
| 00:01:14,759 -- 00:01:16,520 | TTS-Pros: | Is it known for its ability to purr? | |
| 00:01:20,444 -- 00:01:21,444 | USER: | Can you repeat that again? | |
| 00:01:23,626 -- 00:01:25,888 | TTS-Pros: | Is it known for its ability to purr? | |
| 00:01:26,769 -- 00:01:35,342 | USER: | To purr? What does that mean? I'm gonna skip this. | |
| 00:01:31,921 -- 00:01:35,342 | TTS-Pros: | Is the animal primarily a carnivore known for eating meat? | |
| 00:01:35,982 -- 00:01:37,502 | USER: | Noo, I don't think so. | |
| 00:01:39,843 -- 00:01:40,283 | TTS-Pros: | Uh, no? | |
| 00:01:40,923 -- 00:01:43,084 | TTS-Pros: | Alright, I need to consider other options. | |
| 00:01:43,724 -- 00:01:45,645 | USER: | I mean, yes, but not raw meat. | |
| 00:01:48,186 -- 00:01:49,286 | TTS-Pros: | Is it a type of bird? | |
| 00:01:50,907 -- 00:01:51,307 | USER: | No. | |
| 00:01:51,327 -- 00:01:57,869 | TTS-Pros: | Oof I thought I was out to do something interesting let's see what else could it be? | |
| 00:02:02,540 -- 00:02:04,862 | TTS-Pros: | Does this animal commonly reside indoors? | |
| 00:02:05,402 -- 00:02:11,286 | USER: | yes yes uh both but more or less indoor yeah | |
| 00:02:11,546 -- 00:02:12,667 | TTS-Pros: | Phew okay I'm getting warmer with this. | |
| 00:02:17,610 -- 00:02:22,113 | TTS-Pros: | Is this animal prone to climbing or jumping onto elevated surfaces like furniture? | |
| 00:02:24,815 -- 00:02:33,115 | USER: | it depends on what species species | |
| 00:02:39,785 -- 00:02:43,010 | TTS-Pros: | I have asked my questions and have failed to guess correctly. | |
| Time (start-end) | Speaker | Audio | Text | 
| 00:05,754 - 00:07,134 | TTS-Pros: | I am thinking about an animal. | |
| 00:11,537 - 00:13,117 | TTS-Pros: | Please start asking me questions. | |
| 00:15,879 - 00:17,180 | USER: | Does it have four legs? | |
| 00:21,822 - 00:23,123 | TTS-Pros: | No, it does not have four legs. | |
| 00:24,103 - 00:26,825 | TTS-Pros: | Keep on your toes. You might just stumble on the answer. | |
| 00:27,219 - 00:28,100 | USER: | sir | |
| 00:28,926 - 00:29,026 | TTS-Pros: | Um... | |
| 00:35,698 - 00:39,219 | TTS-Pros: | Looks like you took a tiny misstep there, don't trip over your own questions. | |
| 00:39,219 - 00:45,100 | USER: | /SILENCE/ | |
| 00:45,100 - 00:46,001 | TTS-Pros: | You have to say something? | |
| 00:46,741 - 00:48,621 | USER: | Does it have more than four legs? | |
| 00:53,102 - 00:54,923 | TTS-Pros: | No, it does not have more than four legs. | |
| 00:56,163 - 00:58,944 | TTS-Pros: | Definitely not crawling with possibilities, are we? | |
| 01:03,740 - 01:05,001 | USER: | Is it a reptile? | |
| 01:05,101 - 01:06,001 | TTS-Pros: | You have to say something. | |
| 01:10,623 - 01:11,603 | USER: | Is it a reptile? | |
| 01:13,224 - 01:14,144 | TTS-Pros: | Just a second. | |
| 01:16,605 - 01:17,865 | TTS-Pros: | No, it is not a reptile. | |
| 01:23,608 - 01:27,009 | TTS-Pros: | guess we're not slithering towards the answer this time. | |
| 01:34,349 - 01:35,790 | USER: | Does it live in the water? | |
| 01:37,370 - 01:39,071 | TTS-Pros: | Engaging deeper thought. | |
| 01:40,071 - 01:41,411 | TTS-Pros: | Yes, it resides in the water. | |
| 01:42,312 - 01:43,812 | TTS-Pros: | You're not floundering anymore. | |
| 01:44,572 - 01:45,833 | TTS-Pros: | Keep swimming for the answer. | |
| 01:46,973 - 01:48,513 | USER: | Is it a type of fish? | |
| 01:49,370 - 01:50,071 | TTS-Pros: | Let's see | |
| 01:52,654 - 01:54,215 | TTS-Pros: | Yes, you're asking if it's a fish. | |
| 01:55,195 - 01:58,096 | TTS-Pros: | Cast a wider net, but you're swimming in the right school now. | |
| 02:08,692 - 02:10,714 | USER: | Is it larger than a cat? | |
| 02:12,535 - 02:13,075 | TTS-Pros: | Hold on. | |
| 02:15,517 - 02:17,158 | TTS-Pros: | Yes, it is larger than a cat. | |
| 02:18,179 - 02:21,021 | TTS-Pros: | Seems like you're not fishing for compliments, just bigger fish. | |
| 02:26,805 - 02:29,047 | USER: | Would you consider this animal dangerous? | |
| 02:30,788 - 02:31,368 | TTS-Pros: | In a jiffy. | |
| 02:33,390 - 02:34,811 | TTS-Pros: | Yes, it is considered dangerous. | |
| 02:35,836 - 02:38,418 | TTS-Pros: | Careful now, don't bite off more than you can chew with your guesses. | |
| 02:41,940 - 02:44,701 | USER: | Is it extra dangerous for surfers? | |
| 02:46,823 - 02:47,463 | TTS-Pros: | uh, let me see. | |
| 02:51,545 - 02:53,707 | TTS-Pros: | Yes, it is particularly risky for those who surf. | |
| 02:54,547 - 02:56,589 | TTS-Pros: | You're definitely riding the wave to success now. | |
| 03:02,092 - 03:03,733 | USER: | Does it have very many teeth? | |
| 03:05,666 - 03:06,227 | TTS-Pros: | in a jiffy. | |
| 03:07,871 - 03:09,676 | TTS-Pros: | You have asked ten questions and have to guess. | |
| 03:10,678 - 03:11,500 | USER: | Is it a shark? | |
| 03:13,906 - 03:15,710 | TTS-Pros: | Yes, it is a shark, you guessed correctly. | |
| 03:16,612 - 03:18,316 | TTS-Pros: | This concludes this round of the game. | |
| Time (start-end) | Speaker | Audio | Text | 
| 00:00:07,469 -- 00:00:09,070 | TTS-T2: | I am thinking about an animal. | |
| 00:00:11,752 -- 00:00:13,974 | USER: | Is the animal larger than a human? | |
| 00:00:19,697 -- 00:00:21,859 | TTS-T2: | Yes, it is larger than a human. | |
| 00:00:22,699 -- 00:00:25,201 | TTS-T2: | You're not sizing me up to quickly, are you? | |
| 00:00:28,143 -- 00:00:29,524 | USER: | Does the animal live on land? | |
| 00:00:31,167 -- 00:00:31,707 | TTS-T2: | Let's see. | |
| 00:00:36,948 -- 00:00:39,069 | TTS-T2: | Yes, it does reside on land. | |
| 00:00:40,269 -- 00:00:43,550 | TTS-T2: | Absolutely towering over the competition with that guess. | |
| 00:01:04,378 -- 00:01:07,259 | USER: | yeah is it uh domesticated a domestic animal | |
| 00:01:08,760 -- 00:01:09,300 | TTS-T2: | Hold on. | |
| 00:01:11,481 -- 00:01:13,703 | TTS-T2: | No, it is not a domestic animal. | |
| 00:01:14,803 -- 00:01:18,125 | TTS-T2: | Looks like someone's barking up the wrong family tree. | |
| 00:01:22,227 -- 00:01:23,848 | USER: | uh is it uh carnivore | |
| 00:01:25,369 -- 00:01:26,009 | TTS-T2: | Quick scan. | |
| 00:01:28,835 -- 00:01:30,116 | TTS-T2: | No, not a carnivore. | |
| 00:01:31,477 -- 00:01:34,319 | TTS-T2: | uh you might want to sharpen those claws of thought. | |
| 00:01:34,319 -- 00:01:40,123 | USER: | /SILENCE/ | |
| 00:01:40,123 -- 00:01:41,284 | TTS-T2: | You have to say something. | |
| 00:01:41,304 -- 00:01:43,705 | USER: | Is it a lot heavier than a human? | |
| 00:01:45,286 -- 00:01:46,207 | TTS-T2: | Just a second. | |
| 00:01:50,730 -- 00:01:53,512 | TTS-T2: | Yes, it is significantly heavier than a human. | |
| 00:01:54,560 -- 00:01:57,581 | TTS-T2: | You're definitely not lightweight in your questioning. | |
| 00:01:58,781 -- 00:02:00,121 | USER: | Is it an elephant? | |
| 00:02:01,702 -- 00:02:02,202 | TTS-T2: | Hold on. | |
| 00:02:04,982 -- 00:02:07,183 | TTS-T2: | No, your guess is not correct. | |
| 00:02:08,063 -- 00:02:10,623 | TTS-T2: | keep trunk ting along with those questions. | |
| 00:02:18,414 -- 00:02:19,414 | USER: | does uh is the animal dangerous to humans | |
| 00:02:15,164 -- 00:02:15,885 | TTS-T2: | Quick scan. | |
| 00:02:27,472 -- 00:02:30,533 | TTS-T2: | No, it's not considered dangerous to humans. | |
| 00:02:31,553 -- 00:02:34,394 | TTS-T2: | Don't worry, you won't need any armor for this one. | |
| 00:02:39,916 -- 00:02:44,797 | USER: | Does the animal have any patterns on itself? | |
| 00:02:46,458 -- 00:02:47,618 | TTS-T2: | Reflecting on that. | |
| 00:02:59,710 -- 00:03:02,355 | TTS-T2: | You have asked 10 questions and have to guess. | |
| 00:03:03,076 -- 00:03:04,539 | USER: | It's the animal, a giraffe. | |
| 00:03:06,923 -- 00:03:08,867 | TTS-T2: | I am sorry you guessed wrong. | |
| Time (start-end) | Speaker | Audio | Text | 
| 00:00:06,915 -- 00:00:08,436 | TTS-Matcha: | I am thinking about an animal. | |
| 00:00:11,218 -- 00:00:13,459 | TTS-Matcha: | Please start asking me questions. | |
| 00:00:14,240 -- 00:00:15,320 | USER: | Is it able to swim? | |
| 00:00:21,024 -- 00:00:23,005 | TTS-Matcha: | No, it's not known for swimming. | |
| 00:00:23,926 -- 00:00:27,928 | TTS-Matcha: | I'd throw you a life jacket, but we're on dry land with this one. | |
| 00:00:29,645 -- 00:00:30,525 | USER: | Is it a carnivore? | |
| 00:00:32,266 -- 00:00:32,926 | TTS-Matcha: | Hold on. | |
| 00:00:35,647 -- 00:00:37,147 | TTS-Matcha: | Yes, it is a carnivore. | |
| 00:00:37,828 -- 00:00:38,808 | TTS-Matcha: | Chomp chomp. | |
| 00:00:39,568 -- 00:00:41,909 | TTS-Matcha: | You've got some sharp teeth on that one. | |
| 00:00:43,529 -- 00:00:44,450 | USER: | Does it have scales? | |
| 00:00:46,250 -- 00:00:47,331 | TTS-Matcha: | Just a second. | |
| 00:00:51,872 -- 00:00:53,573 | TTS-Matcha: | No, it does not have scales. | |
| 00:00:54,453 -- 00:00:57,974 | TTS-Matcha: | Looks like you're fishing for clues in the wrong pond. | |
| 00:01:00,330 -- 00:01:01,191 | USER: | Is it a K9? | |
| 00:01:03,072 -- 00:01:03,712 | TTS-Matcha: | In a jiffy | |
| 00:01:05,374 -- 00:01:07,055 | TTS-Matcha: | No, it is not a canine. | |
| 00:01:07,875 -- 00:01:10,277 | TTS-Matcha: | Maybe try howling up a different tree. | |
| 00:01:11,658 -- 00:01:16,321 | USER: | Is it part of the cat family? | |
| 00:01:18,658 -- 00:01:20,321 | TTS-Matcha: | Let's see.. | |
| 00:01:22,506 -- 00:01:24,887 | TTS-Matcha: | Yes, it is part of the cat family. | |
| 00:01:25,708 -- 00:01:28,350 | TTS-Matcha: | purr fect you're clawing your way to the top. | |
| 00:01:31,195 -- 00:01:33,196 | USER: | Is it known to reside in a jungle? | |
| 00:01:35,116 -- 00:01:36,377 | TTS-Matcha: | Reflecting on that. | |
| 00:01:38,677 -- 00:01:41,938 | TTS-Matcha: | No, it's not typically associated with the jungle. | |
| 00:01:54,282 -- 00:01:58,944 | TTS-Matcha: | Looks like you're not quite the king of this particular guessing jungle yet. | |
| 00:02:00,151 -- 00:02:01,331 | USER: | Does it live in the savannah? | |
| 00:02:03,052 -- 00:02:05,312 | TTS-Matcha: | Excuse me, I need to think a bit. | |
| 00:02:09,053 -- 00:02:11,353 | TTS-Matcha: | Yes, it does reside in the savannah. | |
| 00:02:12,073 -- 00:02:13,053 | TTS-Matcha: | Hot, hot, hot. | |
| 00:02:13,853 -- 00:02:16,774 | TTS-Matcha: | Your instincts are on a safari to success. | |
| 00:02:18,414 -- 00:02:19,414 | USER: | Does it have a mane? | |
| 00:02:21,115 -- 00:02:21,715 | TTS-Matcha: | Let's see. | |
| 00:02:24,975 -- 00:02:26,496 | TTS-Matcha: | Yes, it does have a mane. | |
| 00:02:27,276 -- 00:02:28,176 | TTS-Matcha: | Looking sharp. | |
| 00:02:29,013 -- 00:02:31,616 | TTS-Matcha: | You're nearly the king of this guessing game. | |
| 00:02:37,942 -- 00:02:40,725 | TTS-Matcha: | You have asked 10 questions and have to guess. | |
| 00:02:42,166 -- 00:02:42,847 | USER: | Is it a lion? | |
| 00:02:45,149 -- 00:02:47,792 | TTS-Matcha: | Yes, it is a lion, You guessed correctly. | |
| Time (start-end) | Speaker | Original (T2) | Alternative (prosody) | Text | 
| 00:00:07,470 -- 00:00:09,031 | SYSTEM: | I am thinking about an animal. | ||
| 00:00:15,316 -- 00:00:17,678 | SYSTEM: | Please start asking me questions. | ||
| 00:00:18,899 -- 00:00:20,980 | USER: | Does the animal live in water? | ||
| 00:00:25,243 -- 00:00:27,385 | SYSTEM: | No, it does not reside in water. | ||
| 00:00:28,388 -- 00:00:31,369 | SYSTEM: | Hope you're not fishing for clues in the wrong pond. | ||
| 00:00:32,569 -- 00:00:34,350 | USER: | Does your animal fly? | ||
| 00:00:36,050 -- 00:00:37,210 | SYSTEM: | Swiftly checking. | ||
| 00:00:38,911 -- 00:00:40,351 | SYSTEM: | No, it does not fly. | ||
| 00:00:41,692 -- 00:00:45,073 | SYSTEM: | Seems like your guesses are having a bit of turbulence. | ||
| 00:01:04,378 -- 00:01:07,259 | USER: | Does your animal usually reside in Europe? | ||
| 00:00:51,194 -- 00:00:53,255 | SYSTEM: | Excuse me, I need to think a bit. | ||
| 00:00:53,975 -- 00:00:55,676 | SYSTEM: | Yes, it can reside in Europe. | ||
| 00:00:56,436 -- 00:00:56,796 | SYSTEM: | Eureka. | ||
| 00:00:57,802 -- 00:00:59,943 | SYSTEM: | somewhat closer to the truth you are. | ||
| 00:01:02,245 -- 00:01:05,146 | USER: | Is your animal usually kept as a pet? | ||
| 00:01:07,148 -- 00:01:07,908 | SYSTEM: | Let me see. | ||
| 00:01:11,530 -- 00:01:13,732 | SYSTEM: | Yes, it is usually kept as a pet. | ||
| 00:01:14,932 -- 00:01:16,593 | SYSTEM: | you're barking up the right wait i'm not supposed to say that. | ||
| 00:01:19,115 -- 00:01:20,896 | SYSTEM: | Keep fetching those questions. | ||
| 00:01:20,896 -- 00:01:32,557 | USER: | /SILENCE/ | ||
| 00:01:32,557 -- 00:01:33,818 | SYSTEM: | You have to say something. | ||
| 00:01:34,499 -- 00:01:35,940 | USER: | Is your animal carnivorous? | ||
| 00:01:39,884 -- 00:01:41,345 | SYSTEM: | Yes, it is carnivorous. | ||
| 00:01:42,386 -- 00:01:45,069 | SYSTEM: | salivating at the right scent i see. | ||
| 00:01:48,152 -- 00:01:50,536 | USER: | uh does your animal | ||
| 00:01:50,936 -- 00:01:52,536 | SYSTEM: | Swiftly checking. | ||
| 00:01:56,265 -- 00:01:59,286 | SYSTEM: | uh I think you might have missed a part of your question there. | ||
| 00:01:59,666 -- 00:02:00,166 | USER: | Yes, I did. | ||
| 00:02:00,186 -- 00:02:02,687 | SYSTEM: | That was certainly a mysterious attempt. | ||
| 00:02:06,147 -- 00:02:07,928 | USER: | Is your animal a type of cat? | ||
| 00:02:09,928 -- 00:02:10,128 | SYSTEM: | uh.. | ||
| 00:02:13,589 -- 00:02:15,609 | SYSTEM: | No, it is not a type of cat. | ||
| 00:02:16,530 -- 00:02:19,070 | SYSTEM: | You're barking up the wrong tree with that guess. | ||
| 00:02:19,990 -- 00:02:22,351 | SYSTEM: | Try a different category, perhaps. | ||
| 00:02:23,751 -- 00:02:25,416 | USER: | Is your animal a type of dog? | ||
| 00:02:27,864 -- 00:02:30,050 | SYSTEM: | Yes, it is a dog you guessed correctly. | ||
| 00:02:30,933 -- 00:02:33,200 | SYSTEM: | This concludes this round of the game. | ||
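For readers who want to analyse the transcripts above, for example to measure how long the system took to respond after a user turn, the rows can be parsed with a few lines of Python. This is only a sketch under stated assumptions: the helper names (`to_seconds`, `parse_row`, `response_gaps`) are invented for this example, timestamps are assumed to be `HH:MM:SS,mmm` or `MM:SS,mmm` with three millisecond digits, and the start and end times are assumed to be separated by `-` or `--`.

```python
import re
from dataclasses import dataclass

# Optional hours, then minutes, seconds and milliseconds, e.g. "00:01:07,148" or "01:07,148".
_STAMP = re.compile(r"(?:(\d+):)?(\d+):(\d+),(\d+)")


@dataclass(frozen=True)
class Turn:
    start: float  # seconds from the start of the recording
    end: float
    speaker: str
    text: str


def to_seconds(stamp: str) -> float:
    """Convert 'HH:MM:SS,mmm' or 'MM:SS,mmm' to seconds."""
    match = _STAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised timestamp: {stamp!r}")
    hours, minutes, seconds, millis = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_row(line: str) -> Turn:
    """Parse one data row; header rows raise ValueError and should be skipped by the caller."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    start, end = re.split(r"\s+-{1,2}\s+", cells[0])
    return Turn(to_seconds(start), to_seconds(end), cells[1].rstrip(":"), cells[2])


def response_gaps(turns: list[Turn]) -> list[float]:
    """Gap in seconds between the end of each user turn and the start of the next system turn."""
    gaps = []
    for previous, current in zip(turns, turns[1:]):
        if previous.speaker == "USER" and current.speaker != "USER":
            # A few rows in the tables are out of order, so a gap can be negative.
            gaps.append(current.start - previous.end)
    return gaps


if __name__ == "__main__":
    rows = [
        "| 00:00:09,638 -- 00:00:09,858 | USER: | Yes | |",
        "| 00:00:12,359 -- 00:00:13,119 | TTS-Pros: | Ah, fantastic. | |",
    ]
    turns = [parse_row(row) for row in rows]
    print([round(gap, 3) for gap in response_gaps(turns)])
```

Because some rows overlap or appear out of chronological order, any gap statistics computed this way should be checked against the recordings before being reported.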