By Randy Smith

February 1, 2010

Talking With Computers: Part Two

You may wish to read Randy's previous columns on character interaction, Is The Holodeck The Holy Grail? and Talking With Computers before reading this one.

On one hand you’ve got static, preauthored dialogue trees, which are seriously limited in the amount of freedom they can provide. On the other, natural language processing (NLP), which would work great if it wasn’t all but impossible to implement. Those extremes aren’t promising, but there’s been plenty of activity in the middle. Façade. Galatea. The Sims. Masq. Siboot. Ultima. Civilization. Quantic Dream. BioWare. Bethesda.

Might as well analyse Fallout 3, since it’s having trouble not winning every award imaginable lately. Fallout’s conversation trees are dynamic in that they vary depending on your character’s stats and behaviour. Like if you have the Child At Heart perk, your Karma is Very Evil, you’ve discovered the ID tags of a dead woman and you’re holding a toaster, then you’re able to tell the orphan that his mom Janet is trapped inside the toaster and will come back from the dead if he can free her (notice to politicians: don’t go after Bethesda – I just made that up). So the player has input via game actions and choices; if you want to say something special then you must ‘unlock’ that dialogue option. Since the designers can craft sophisticated dynamics between the regular game and the conversation, such as encouraging players to sign up for quests that are over their head, you get a feeling that overall such an interaction can have a lot of range.
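To make the 'unlock' idea concrete, here is a minimal sketch of dialogue options gated by player state. It is not how Bethesda implements anything; `PlayerState`, `DialogueOption`, `available_options` and the item names are all hypothetical, invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PlayerState:
    # Hypothetical snapshot of the stats the conversation system can inspect.
    perks: set = field(default_factory=set)
    karma: str = "Neutral"
    inventory: set = field(default_factory=set)


@dataclass
class DialogueOption:
    text: str
    # Predicate deciding whether this option is unlocked; always on by default.
    condition: Callable = lambda state: True


def available_options(options, state):
    """Return the text of every option whose condition holds for this player."""
    return [opt.text for opt in options if opt.condition(state)]


ORPHAN_OPTIONS = [
    DialogueOption("Hello there."),
    DialogueOption(
        "Your mom Janet is trapped inside this toaster.",
        lambda s: "Child At Heart" in s.perks
        and s.karma == "Very Evil"
        and {"Janet's ID tags", "Toaster"} <= s.inventory,
    ),
]

player = PlayerState(
    perks={"Child At Heart"},
    karma="Very Evil",
    inventory={"Janet's ID tags", "Toaster"},
)
print(available_options(ORPHAN_OPTIONS, player))
```

Every unlockable line still has to be written by hand, along with the condition that guards it, which is why this approach stays pre-authored however dynamic it feels.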

Though dynamic, those dialogue trees are still pre-authored. I’ve experienced the potential for drama-drenched moments in Fallout 3, like when I failed to save the sheriff’s life but looted his keys thinking I’d finally have a safe place to sleep, then entered his house to discover, to my heart-wrenching shock, a young boy. At which point, I wanted to ask: “Oh no, are you the sheriff’s son?” And then say: “I’m so sorry. I have horrible news for you.” However, my dialogue options were written as though I’d met the son previously and he’d already heard about his dad’s fate from some jerk who sprinted from the scene of the crime to the house even faster than I did. This isn’t because Bethesda is unaware of these cases; it’s because pre-authoring dialogue to cover every possible combination of variables is too much of a brute-force approach and bumps up against technical limitations.

But how can I craft my own dialogue if we’ve already skewered the dream of NLP and real AI? A starting point is adding more types of direct input, such as selecting moods or tones of voice, so at least I can choose between being gentle versus harsh when talking to the sheriff’s son. You can also time the player’s responsiveness, a class of input I term ‘microexpressions’, not quite a full player action but mineable for meaning regardless. Combining these techniques might allow the player to author the difference between a firm and immediate “NO!” versus a hesitant “I guess not” and anything in between as analogue flavour on what would otherwise be a single discrete option.
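As a rough sketch of how tone and timing might combine, assume the game records which tone the player picked and how many seconds they took to answer. The function names, the five-second window and the firmness thresholds below are all made up for illustration.

```python
def delivery(tone, response_seconds, max_wait=5.0):
    """Attach analogue flavour to what is otherwise one discrete choice."""
    # Clamp the delay into [0, max_wait], then scale it to hesitation in [0, 1].
    clamped = min(max(response_seconds, 0.0), max_wait)
    hesitation = clamped / max_wait
    return {"tone": tone, "hesitation": hesitation, "firmness": 1.0 - hesitation}


def render_refusal(d):
    """Pick a surface line for a refusal; the thresholds are arbitrary assumptions."""
    if d["firmness"] > 0.8:
        return "NO!"
    if d["firmness"] > 0.4:
        return "No."
    return "I guess not..."


print(render_refusal(delivery("harsh", 0.3)))
print(render_refusal(delivery("gentle", 4.5)))
```

The tone is passed through untouched here; an NPC model could read it alongside the firmness value when deciding how to react.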

What if you could say “I really hate ____” and could fill in the blank with any game noun – person, place, or thing? Do you know anything about blank? When is the last time you saw blank? Are you related to blank? Or fill in a game action, such as completing a quest or changing the condition of something: I intend to change the state of blank to blank. Bank to robbed. Widow to happy. Triforce to restored. Chris Crawford calls it an “inverse parser”. Combine all of these types of input – tone, timing and constructed statement of fact – and I could craft the dialogue option I was looking for, if perhaps crudely.
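A fill-in-the-blank system like this could be sketched as a handful of sentence templates whose slots accept only nouns and states the game already knows about. This is an assumption-laden toy rather than Crawford's actual design; `GAME_NOUNS`, `STATES`, `TEMPLATES` and `build_statement` are hypothetical names.

```python
# Hypothetical world vocabulary: the only things the player can talk about.
GAME_NOUNS = {
    "person": ["the sheriff", "the widow"],
    "place": ["the bank"],
    "thing": ["the Triforce"],
}
# States each noun can legally be changed to.
STATES = {
    "the bank": ["robbed"],
    "the widow": ["happy"],
    "the Triforce": ["restored"],
}
TEMPLATES = {
    "hate": "I really hate {noun}.",
    "last_seen": "When is the last time you saw {noun}?",
    "intend": "I intend to change the state of {noun} to {state}.",
}


def all_nouns():
    return [noun for group in GAME_NOUNS.values() for noun in group]


def build_statement(template_id, noun, state=None):
    """Build a sentence plus a structured form the game can interpret directly."""
    if noun not in all_nouns():
        raise ValueError("unknown game noun: " + noun)
    if template_id == "intend" and state not in STATES.get(noun, []):
        raise ValueError("{} cannot become {}".format(noun, state))
    text = TEMPLATES[template_id].format(noun=noun, state=state)
    return {"act": template_id, "noun": noun, "state": state, "text": text}


print(build_statement("intend", "the bank", "robbed")["text"])
```

Because the player can only choose from lists the game supplies, every sentence arrives already parsed: the structured dictionary is the meaning, and the text is just for show.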

What this solution gains you over straight NLP is a chance in hell of interpreting the dialogue, because you’ve limited it to an established range of statements and questions that can only be about things in the game. To generate meaningful output you still have to encode a crapload of per-character knowledge: what they know and what they don’t, what they think about certain conditions being true, how likely they are to believe you, and how they feel about the way you speak to them. This in turn implies systems for calculating the moods of NPCs, tracking their trust levels, and generally simulating their mental space. It sounds like a lot of work, but so is populating a dungeon with monsters and combat systems, and it’s actually the exciting part, because it empowers player-authored play. Gain someone’s trust to deceive them. Misinform them that you’re going to rob the bank to distract them. Break the news about their dad in the most compassionate way possible.
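The per-character bookkeeping described above might look something like the following toy model. The class name, the numeric ranges and the rules for how tone shifts trust and mood are all assumptions chosen for illustration; a real game would need far richer rules.

```python
from dataclasses import dataclass, field


@dataclass
class NPCMind:
    name: str
    trust: float = 0.5   # 0 = distrusts the player, 1 = trusts them fully
    mood: float = 0.0    # -1 = upset, +1 = content
    beliefs: dict = field(default_factory=dict)  # what this NPC holds to be true

    def hear(self, fact, value, tone):
        """Process a claim from the player; return True if the NPC believes it."""
        # Hypothetical rule: a harsh delivery costs mood and trust,
        # a gentle one earns a little trust.
        if tone == "harsh":
            self.mood = max(-1.0, self.mood - 0.3)
            self.trust = max(0.0, self.trust - 0.1)
        elif tone == "gentle":
            self.trust = min(1.0, self.trust + 0.1)
        believed = self.trust >= 0.5
        if believed:
            # True or not, the claim is now part of the NPC's picture of the world.
            self.beliefs[fact] = value
        return believed


guard = NPCMind("guard", trust=0.45)
print(guard.hear("the bank", "about to be robbed", tone="gentle"))
print(guard.beliefs)
```

Note that the model stores what the NPC believes, not what is true, which is exactly what makes deception possible: earn enough trust and a false claim lands in `beliefs` just as readily as an honest one.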

Civilization can fill in the blanks back atcha. Abraham Lincoln demands the secret of pottery, where both the verb and object are plugged in from a finite list of possibilities. What if an NPC could fill in your reputation, the most valuable item you’re carrying, or the most insulting thing you’ve ever said to anyone? It might start to feel like not only did they understand you, but they also are noticing you.
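The same trick in the other direction, sketched under the assumption that the game exposes the player's reputation and inventory as plain data (the field names and the line of dialogue are hypothetical):

```python
def npc_remark(player):
    """Fill an NPC line's blanks from facts the game tracks about the player."""
    line = "So you're the one they call {}.".format(player["reputation"])
    if player["inventory"]:
        # Single out the most valuable item the player is carrying.
        best = max(player["inventory"], key=lambda item: item["value"])
        line += " Careful where you flash that {}.".format(best["name"])
    return line


player = {
    "reputation": "Very Evil",
    "inventory": [
        {"name": "toaster", "value": 5},
        {"name": "plasma rifle", "value": 900},
    ],
}
print(npc_remark(player))
```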

Randy Smith is the co-owner of developer Tiger Style. He’d like to thank the artgame list for their help with this column.

SuperApe

I'm sure you've acknowledged that these three interaction methodologies could be mixed and matched to varying degrees. A branching narrative with some procedural elements that trigger sub-branches, for example, allowing for some flexibility at the cost of some extra design work? Love this line of discussion, Randy.

EDIT: Whoops. Meant to comment on previous post, because of course, in this post that's precisely what you're doing. :)

Verbal_Oz

First of all, I have to say I am really enjoying these columns; they have certainly helped to highlight how far games still have to go, particularly in the areas of story and NPC interaction.

With regard to this particular column, I agree that it would be great if games allowed greater freedom to craft a response, but I believe that even if this were possible the results would inevitably disappoint. The reason I believe this is that it is not the response that players care about but the reaction to said response. If the sheriff's son reacts the exact same way, regardless of whether I say 'Hey sonny, your Dad's dead but don't worry I looted his corpse!' or something more compassionate, then it only highlights the limited scope of his character. Yes, some number representing 'Trust' may change in the background, but I want to see that subtly reflected in the animations - I don't want a 'Trust +1' popup. The amount of work involved in creating multiple VOs, animations & consequences for the simple example given is as vast as the freedom you allow the player - and I'm not sure the expense could be justified when applied to a world with thousands of NPCs.

I would prefer that my available responses be more limited if this means that I get a greater impact from the choices I make. I wouldn't even care about the longer-term impacts if I saw a genuine difference in the boy's reaction when I chose the 'comfort' response over 'taunt'. If I knew I had gained his trust by his mannerisms and expression, that's when the impact of my response becomes real.