The following analysis is performed from a position of practicality and (surprise!) authoring for novice-intermediate users. Consider this an author-centric design, akin to player-centric or user-centric design philosophies of game and HCI design.
By “practicality,” I mean attempting to maximize the interest of potential authors both inside and outside academia. To increase the number of potential authors (one of my primary goals), I am choosing to include non-academics wanting to create dramatic interactive characters in my target audience. This means that I cannot assume my users will swallow a system if it sacrifices too much usability or common-sense logic in favor of a particular academic theory. For example, Haskell is a beautifully constructed language that is worth learning and studying, but it is used far less often in the job market than Java, C++, or other mainstream imperative and object-oriented languages. Phrased another way, the more of the system that can be explained in terms of common-sense or folk psychology (rather than scientific psychology), the better.
So, while “practicality” has expanded my target audience tremendously, I must attend to those new (novice) potential authors. I can assume they have some programming knowledge — a BS in computer science, self-taught a language or two, or fundamentals for an entry-level position in CS — and have familiarity with logic and the flow of code. The price of admission into using whatever architecture I choose must be as low as possible to ease these authors into its use, and I fully expect to have to reduce its price of admission beyond what it is currently.
However, I also want the architecture to be able to satisfy the promise on the box: to be capable of making realistic, dramatically interactive characters. I do not want to make a training program with training wheels that cannot be removed, a program that is discarded once users reach intermediate level because it is too simple. The author should be able to create whatever behavior they can reasonably imagine, given enough experience with the system. The system should also not break down, in either expressive power or runtime efficiency, as an author reaches a satisfying complexity for their agent. There are, of course, reasonable upper limits: we cannot expect the user’s agent to solve AI-hard or AI-complete problems. However, if the user did their best and spent hours coding the door-opening behavior, and the performance then lagged so badly when deciding how to open the door that the experience was unplayable, that would be a catastrophic failure.
From an authoring-for-authors perspective (mine, not the user’s), the behaviors should be hierarchical and capable of modularization. For example, once the user creates the fully fledged behavior to open the door, they can forget about it and trust that it will work. An even lower-level example would be head-tracking: telling the agent to “LOOK HERE” and trusting that its head won’t spin around Exorcist-style and seem to break its spine, but that it will instead move its body appropriately. This enables authors to create libraries of behaviors that can be shared, reducing future authoring burden. When authoring at a higher level, such as a scene, the less the author needs to worry about the low-level details, the better.
In short, the architecture should meet the following requirements:
- Favors usability and common-sense logic
- Low price of admission
- Capable of higher complexity
- Hierarchical and modular
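The “hierarchical and modular” requirement can be sketched in a few lines of Python. This is only an illustration of the idea, not a proposal for any particular architecture: the `Behavior` class and every behavior name below (`look_at_target`, `open_door`, `enter_room_scene`, and so on) are hypothetical names made up for this example.

```python
class Behavior:
    """A named behavior that may be composed of lower-level sub-behaviors."""

    def __init__(self, name, steps=None):
        self.name = name
        self.steps = steps or []  # child behaviors, run in order

    def run(self, trace=None, depth=0):
        """Run this behavior and its children, recording an indented trace."""
        trace = [] if trace is None else trace
        trace.append("  " * depth + self.name)
        for step in self.steps:
            step.run(trace, depth + 1)
        return trace


# Low-level library behavior: written once, then reused and trusted.
look_at = Behavior("look_at_target", [Behavior("turn_body"), Behavior("turn_head")])

# Mid-level behavior built from library pieces.
open_door = Behavior(
    "open_door",
    [Behavior("walk_to_door"), look_at, Behavior("turn_handle"), Behavior("push_door")],
)

# Scene-level authoring only mentions high-level behaviors.
scene = Behavior("enter_room_scene", [open_door, Behavior("greet_player")])

trace = scene.run()
print("\n".join(trace))
```

The author of the scene never touches `turn_body` or `turn_head`; those details live inside the shared `look_at` behavior, which is the kind of reuse the requirement is after.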
I have been, up until now, using a few choice buzzwords to describe the type of agent I want to support: virtual (a given), dramatic, interactive, embodied. “Embodied” means that the agent is not just a mind in a theoretical body making decisions; there must be a physical form (in virtual space) being driven by the mind. While I have punted on the implementation details of the agent’s inputs/outputs, the agent should still have functional signals and embodied movements. If the architecture is not capable of handling the inputs/outputs of an embodied character, the author (or I) would have to build or support them. “Interactive” implies that the agents can interact not only with one another, but also with a human player/user. The interface to that human can be anything (mouse, keyboard, Kinect, brain waves) so long as there are no crazy hoops to jump through in order to include a human in the play loop. Finally, I describe what I mean by “dramatic” agents, the gooiest word of the bunch. To quote merriam-webster.com:
sudden and extreme; greatly affecting people’s emotions; attracting attention : causing people to carefully listen, look, etc.
At first glance, this word sounds more like an authoring challenge than an architecture specification. Indeed, I will be tackling how to support authoring dramatic characters shortly. But authors, especially non-academic ones, do not want to make boring/lifeless characters in most cases. It should be as easy as possible to make punchy, sensational, and exciting characters. The less work that I have to do to make the architecture support dramatic realizations of the embodied characters, the more attractive that architecture is for me to use.
What exactly does “supporting dramatic realizations of embodied characters” mean? I need to do more research in this area, for starters. For now, I would describe it as providing an architecture that treats different personalities, extreme emotions, and the general expressiveness of the agent as important concepts. It goes back to folk psychology and the hierarchical concept I described above: authors know that an angry person might enact “opening a door” differently from a sad one, regardless of the other logic involved. It has to be possible (and hopefully intuitive) for the author to change the expressiveness of the performance (which relates specifically to output, but is still part of the agent’s AI) without tangling it up with the functional decision-making process.
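One way to picture “expressiveness that is not tangled up with functional decision-making” is as two separate layers: one decides what to do, the other decides how it is performed. The following is a minimal sketch under that assumption. The `Expression` fields, the emotion table, and the `decide`/`perform` functions are all hypothetical and exist only for this example; they are not taken from any existing architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    """Hypothetical performance parameters applied to any action."""
    speed: float   # multiplier on movement speed
    force: float   # 0.0 (gentle) to 1.0 (violent)
    posture: str


# Assumed emotion-to-expression table; the values are illustrative only.
EXPRESSIONS = {
    "neutral": Expression(1.0, 0.5, "upright"),
    "angry": Expression(1.5, 1.0, "tense"),
    "sad": Expression(0.6, 0.2, "slumped"),
}


def decide(door_locked, has_key):
    """Functional layer: choose WHAT to do. Knows nothing about emotion."""
    if not door_locked:
        return ["grasp_handle", "push_door"]
    if has_key:
        return ["unlock_door", "grasp_handle", "push_door"]
    return ["knock"]


def perform(steps, emotion):
    """Expressive layer: choose HOW each step is enacted."""
    style = EXPRESSIONS.get(emotion, EXPRESSIONS["neutral"])
    return [
        f"{step} (speed x{style.speed}, force {style.force}, posture {style.posture})"
        for step in steps
    ]


plan = decide(door_locked=True, has_key=True)
angry = perform(plan, "angry")
sad = perform(plan, "sad")
print("\n".join(angry))
print("\n".join(sad))
```

The angry and sad agents follow the same plan, and the author can retune either layer without editing the other.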
The architecture should support creating agents with the following properties as much as possible:
- Embodied
- Interactive
- Dramatic
Now that we have our requirements (with a touch of gooey nice-to-haves), we can examine a variety of architectures that concern themselves with decision-making AI and choose the best candidate.