PhD in Crafts

Since changing my research focus, I’ve been doing a lot of reading, speaking with people, and musing on ideas. Tomorrow I owe Michael a draft of my research questions. Join me on my journey of stitching everything together!

The process going forward will (should) look something like this:

  1. Pick a sub-community to assist
  2. Determine unique contribution to that sub-community (a plan/goal)
  3. Determine the questions to be answered (that address/form the unique contribution)
  4. Find answers to those questions (making some unique contribution a reality(?))
  5. Evaluate the answers to show that they sufficiently answered the questions (\o/)

A paper reports on parts of those steps, but also aims to tell a mini version of that story.
For example, I just finished reading “Reflections on Craft: Probing the Creative Process of Everyday Knitters” by Rosner & Ryokai. Their (1) was knitters, their (2) was adding media recording/tagging throughout the knitting process. Their goals were more exploratory in this stage: “…elicit their reflections on their craft practices and learn from their interactions with material, people, and technology.” Because of that, it’s hard to find any clear (3) that isn’t trivially restating their exploration goals. Their (4) involved building Spyn, a set of hardware that enables recording of image, sound, and video and tagging it in a specific location on the garment via yardage-use detectors and infrared ink marking on the yarn. (5) They then taught 7 knitters how to use the hardware and left them to their own devices to knit various garments for various recipients. The knitters engaged with the technology very well, using it as “a little time capsule,” “emotional blackmail,” a means to “write a story” and “embed memories.” One knitter sang a lullaby to the future baby recipient, another documented the recipient’s favorite foods/places as a kind of scrapbook. Overall, it was a very effective ethnographic paper/technology.

So yeah, I thought that paper was pretty cool >.> <.<

More to the point, it’s a great working example of the kind of stuff I’ll be aiming to do. Probably less physical-technology based, and probably not in the domain of knitting (since it’s not my particular favorite), but close. That raises the question…

(1) What is my domain?

My inability to pick things led me to consider a broad approach… but I have been leaning away from that since talking to more folk. It’s much easier to do deep dives and push on the PhD blip of progress and ‘new knowledge’ with a specific domain.
I used to do crochet — see Pattern, my crochet pattern generator, and my ideas to make a 3D crochet pattern modeler. Doing crochet OR knitting would help me leverage the Ravelry community. But that hasn’t been what’s captured my interest as of late.

My blackwork embroidery generator has gotten quite a bit of interesting push and was the closest to being polished. However, blackwork is TOO limited of a domain… in style and color. Both hand and machine embroiderers work in many colors and stitch styles. It’s part of what helps make the possibilities so broad and vibrant. But the two (machine vs. hand) domains are also possibly very different. Hand embroidery deals much more with fancy stitches/shapes/inclusions (like beads), while machine embroidery has more restrictions on stitch type/size and its application. Websites like UrbanThreads provide primarily machine embroidery patterns with hand-embroidery patterns as large-scale black-and-white line images for users to do with as they wish (like the old days of iron-on black-and-white embroidery patterns).

Do I have to pick? And if I do, which one? The only embroidery machine I have to use and test is a little cheapo… that may restrict the claims I can confidently make in the future. On the other hand, websites like UrbanThreads may be a really great resource. Automatically digitizing images is a well-known problem to make progress toward… although seeing/testing those other super expensive software packages is, again, frightening.
*phew* okay, I just sent off an email after reading one of their articles on digitizing. That sounds like an interesting enough domain with the right balance of need, community, and potential computer science magic.

(2) Unique Contribution

I admittedly don’t know much about digitizing because of the price requirements for many of the software packages out there. The links below are some I’ve gathered to help me patch up that knowledge. As far as I can tell, most tools allow you to edit individual stitches. Some offer higher-level tools like “fills” which are dense satin stitches along a particular direction. These can work to fill large spaces or be used as thick lines. I’m not sure if going over an area multiple times is something done for coverage or puff (like in hand embroidery). I also know there is a particular knack for making a ‘free-standing lace’ design — something that won’t fall apart when you use water-soluble stabilizer. There are other steps involved for doing applique, reverse applique, and alignment marks (for too-big designs).

SO, there are lots of different things to keep in mind depending on the end use of the design, the size of the design (things can’t easily be ‘resized’, they need to be re-filled with proper stitch densities/distribution), the fabric it gets stitched on, the potential thread being used… These are all things a tool can help with, even if it’s just reminding the user about common pitfalls.
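To make the resizing point concrete, here is a minimal sketch, assuming a fill is just a stack of parallel stitch rows at a fixed spacing. The function names and the 0.5 mm spacing are my own illustrative assumptions, not taken from any digitizing package.

```python
import math

ROW_SPACING_MM = 0.5  # assumed spacing between fill rows (hypothetical value)


def rows_for_fill(height_mm, spacing_mm=ROW_SPACING_MM):
    """Number of parallel stitch rows needed to cover a region of this height."""
    return math.ceil(height_mm / spacing_mm)


def naive_resize_spacing(scale, spacing_mm=ROW_SPACING_MM):
    """Row spacing after stretching existing stitches without re-filling."""
    return spacing_mm * scale


def refill_rows(height_mm, scale, spacing_mm=ROW_SPACING_MM):
    """Row count after scaling the shape and re-filling at the original spacing."""
    return rows_for_fill(height_mm * scale, spacing_mm)


original_rows = rows_for_fill(20.0)     # a 20 mm tall region
stretched = naive_resize_spacing(2.0)   # same rows, now twice as far apart
refilled_rows = refill_rows(20.0, 2.0)  # more rows, same coverage as before
```

Under these assumptions, doubling the design by stretching the stitches halves the coverage, while re-filling doubles the row count instead. Real fills also depend on fabric, thread, and stitch type, which this sketch ignores.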

There are also other things. Digitizing generally means taking an existing image/idea and making it into a design. But what about designing digitally from the beginning? With the under-the-hood stitch representation keeping up with the design? Blackwork is one style of embroidery this could work for. Arbitrary SVGs… cross stitch designs (pixel images)… these would certainly be newer options escaping the standard definition of “digitizing.”

Other resources describing digitizing (and its challenges/needs): one is convinced software cannot replace humans

Other resources providing tips on digitizing: setting up expectations; recommended software by UrbanThreads; “Link says it all”

Other resources on the (embroidery) community: magazine; magazine; craftster needlework

(3) Questions — my current goal

Alright, so, we have some ideas of the solution space and the problems. We need to form research questions that are evaluatable, well-defined/unambiguous, and that are answerable by the kind of work I want to do. Totally easy. >.> <.<
After some offline scribbling and brainstorming in the dentist’s office, I have my first draft for Michael:

  1. How can we design intelligent digital authoring tools that enable artists’ traditional craft practice?

  2. How can the application of AI techniques in the form of mixed-initiative craft software support the ideation and creation of traditional crafts?

  3. How can we evaluate the contributions of AI techniques to traditional craft practice in the context of mixed-initiative craft software?

Expanding on them a bit…

#1 is the main, short, sweet goal. Intelligent implies artificial intelligence. Authoring tools is a blanket term for crafting software, although I realize that the concept of a tool is misleading, since there are many non-digital tools involved in craft practice (*goes back and adds term*). The concept of “experience” (i.e., novice vs. expert) in crafting is so muddy, I dodged the term altogether and just labeled my users artists, which implies a probable novice-to-tool approach but a probable non-novice background in the craft itself. I use the term traditional craft as a well-known label for physical handicrafts. While I’ll likely be focusing on one craft heavily, this makes room for mentioning other crafts.

#2 is a much more specific and expanded version of #1. Instead of intelligence, I use the term AI techniques because I will likely be using multiple… grammars, machine learning, maybe even more general AI stuff like behavior trees. I am not pigeonholing into a specific technique here because different techniques may be useful in different contexts or for different crafts. I specifically drop in the term mixed-initiative here to imply that my crafting software, which includes intelligence, will also be interactive. Finally, the aforementioned “craft practice” is expanded into both ideation and creation, which touches upon the creativity support tool approach (of ideas and exploration) as well as my more unique approach: that there is a component of physical making that occurs after design with the tool is complete.

#3 is primarily to address the ill-defined area of evaluating both machine “creativity” with regard to design and the effect that the software has on the actual user. This is a two-pronged problem without clearly defined solutions. Previous work in the area of evaluating crafts takes, at best, an exploratory ethnographic approach and, at worst, attempts to measure creativity with an utterly too-narrow definition. I’m not quite sure how yet, but I expect there to be some research contribution on trying to measure the outcomes of the tool and the crafters (if they make the items designed in the tool).


Creativity Literature

More literature surveying.

I’ve started to pull together a spreadsheet of what I’ve read/want to read. I’m working my way through it, thinking about how it all fits together, what of it applies to my work, and what I want to move forward with.

I’ll continue to edit this post (and the spreadsheet) as I continue to read.

Interesting Reads

Ben Shneiderman, 2009, Creativity Support Tools: A Grand Challenge for HCI Researchers: Probably a better summary than what I pulled together last post about the theory of creativity, design principles, and next steps. A summary of recent gatherings on creativity support tools.

Brenda Massetti, 1996, An Empirical Examination of Creativity Support Systems on Idea Generation: A painstakingly detailed examination of non-significant results of a study using creativity support tools. LOTS of great citation resources. Most interesting to me to theorize about why it failed…
Was the software (and thus domain) too general (broad ‘idea formation’)?
Was the domain not open-ended enough, or possibly too political (solving the homeless problem)?
Were the experts/domain a weird fit (a human rights attorney, a community council president with a masters in public administration)?
Was the judging metric too harsh or polarized (novelty & value)?
Could a proposal made by novices to seasoned veterans be both highly novel and valuable in such a limited domain of feasibility?

Mark A. Runco & Garrett J. Jaeger, 2012, The Standard Definition of Creativity: A tiny rant about the history of our “standard definition” including validity(/effectiveness/usefulness/fit/appropriateness/utility/worthwhile/practical) and novelty(/originality/uniqueness/compelling). “Stein (1953) was the first to offer the standard definition in an entirely unambiguous fashion, and unlike his predecessors, he was without a doubt talking about creativity per se.”

Interesting Concepts

A “creativity inventory” (from Hellriegel and Slocum, 1991) assesses an individual’s perceived self-confidence, need for individuality, abstract critical thinking ability, analysis capability, desire for task achievement, and degree of environmental control. These have been consistently cited as characteristics of a creative person (Amabile and Tighe, 1993; Barron and Harrington, 1981; Tardif and Sternberg, 1988; Torrance, 1988).
Later in the article, it’s said: “the inventories did not directly account for a subject’s ability to generate ideas” [from Massetti, 1996]

Idea Fluency (high/low): “characterized as a creative ability that remains relatively constant over time (Guilford, 1950; Torrance, 1988; and Wallach, 1983)… Subjects were placed into the high-fluency category if they generated more than the mean number of responses (i.e., four or more) and into the low-fluency category if they generated three or fewer responses.” [from Massetti, 1996]
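
The split described in that quote can be sketched in a few lines. This is only my reading of the rule (above the mean is high, everything else is low). The response counts are made-up sample data and `classify_fluency` is a hypothetical name, not something from Massetti’s study.

```python
from statistics import mean


def classify_fluency(response_counts):
    """Label each subject high or low fluency relative to the mean count."""
    threshold = mean(response_counts)
    # Assumption: a count exactly at the mean is treated as low fluency.
    return ["high" if count > threshold else "low" for count in response_counts]


counts = [1, 2, 3, 4, 5, 6]  # hypothetical response counts, mean 3.5
labels = classify_fluency(counts)
```

With a mean of 3.5, subjects with four or more responses land in the high group and those with three or fewer in the low group, which matches the cut-off described in the quote.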

Genex: “A four-phase framework for generating excellence.” I would characterize it as a kind of creative agile idea development approach (meant to be highly fluid and iterative). “The name genex … was chosen to echo Vannevar Bush’s memex (memory extender)”. The four phases include:
Collect: learn from previous works stored in digital libraries, the web, etc. Visualize data and processes
Relate: consult with peers and mentors at early, middle, and late stages.
Create: explore, compose, evaluate possible solutions. Think with free associations, explore what if solutions and tools, compose artifacts and performances, and review/replay session histories (or should that be counted as collecting?)
Donate: disseminate the results and contribute to the digital libraries
[from Shneiderman 1999]

Interesting Quotes

“Researchers will study to understand its remarkable success in bringing together hundreds of thousands of editors and writers to create a compelling resource. Similarly open source software communities, such as Linux… give an indication of much more ambitious collaborations… Understanding the determinants of success will be central. How important is lowering/raising barriers to entry, providing/limiting a hierarchy of administrators, or stability/change in content and interface? Other questions include the impact of rewarding active contributors, recognizing quality, and preventing malicious attacks?” (emphasis added) Ben Shneiderman, 2009, Creativity Support Tools: A Grand Challenge for HCI Researchers



Creativity Support Tools

Michael gave me the OK to change my research focus after my internship at Disney Research Pittsburgh. Before, it was a focus in embodied autonomous agents and the possible authoring tools to support them, with a minor in craft authoring support tools. Now, I think the focus will be craft tools with a minor in the work I’ve done with ABL and assessing agent AI authoring tools. The title he came up with was Authoring and Debugging Tools for Domain-Specific Expressive Languages. Now, I’d probably rephrase it to be something like “Creativity Support Tools for Domain-Specific Expressive Languages” or “Creativity Support Tools: Case Studies in Two Domains.” Maybe if I find some grand lesson from the two experiences, it’d be like “Creativity Support Tools: <insert lesson here>.”

In order to address my new focus, I’ve started a whole new literature review. It’s helped me come at this problem from a whole new (broader) perspective. I’ve got about 15 new books on the way, as well as a big stack of papers I’ve been tackling. I have been really surprised by A) how much good tool design overlaps with good game design (especially of open-world or simulation-based games) and B) how closely the general needs of these tools match the ABL-specific needs we came up with previously. Also, as a bonus, C) how much good tool design overlaps with experimental teaching/learning approaches. That shouldn’t be surprising, given how games often use tutorials, and how creativity (and its supporting tools) involves both domain-specific information, encoding the domain’s best practices, as well as extra support mechanisms for enabling the domain-specific artifact creation/dissemination.

The first major reading I’ve finished going through is the report from the NSF Creativity Support Tools workshop held in June 2005. There is a LOT of rich summary information on past research and proposals of open problems, which still sound relevant even 10 years later.

What is Creativity? How do you ‘Support’ it?

“Basically, creativity can be considered to be the development of a novel product that has some value to the individual and to a social group” [NSF page 10, Creativity Support Tool Evaluation Methods and Metrics]. In this and many other definitions, the concepts of novelty and value consistently reappear: that the created thing must be ‘new’ in some way (not formally known/explored) and useful/important/worthwhile by some metric. Boden [1990] draws a distinction between P-creative (personally novel, relatively common, since we do not know everything about our world) and H-creative (novel to the human race or culture as a whole, relatively rare). Measuring value, however, is extremely muddy, as the concept of value (/useful/important/worthwhile) is different between those synonyms, as well as between individuals, groups, and cultures.

Csikszentmihalyi’s work on Creativity defines the key components as:
Domain: “consists of a set of symbols, rules, and procedures” (like math or biology)
Field: “the individuals who act as gatekeepers to the domain… decide whether a new idea, performance, or product should be included” (peer-review and validation)
Individual: “when a person… has a new idea or sees a new pattern, and when this novelty is selected by the appropriate field for inclusion in the relevant domain” (like paper publication! or publishing a pattern book! or just having other people use your tool)

“The psychological literature provides no clear, unequivocal answer to whether or not creativity can be enhanced. There are many different variables that have been proposed as having a role, including individual abilities, interests, attitudes, motivation, intelligence, knowledge, skills, beliefs, values, and cognitive styles. Thus it seems that individual, social, societal, and cultural differences and factors may all matter, at some time or another and under some circumstance or another” [NSF page 13, Creativity Support Tool Evaluation Methods and Metrics]. Basically, there are too many interconnected variables involved in the squishy process of creativity for us to isolate/examine it as a phenomenon. Thus, we can’t really test to see if creativity is “happening” or to what degree, nor can we really measure how creative someone is (other than by their potentially creative output, which can be anything from ideas to artifacts).

However, people sometimes have a feel for the ‘flow of creative juices’ or when their creative process is interrupted. Some people have a particular mindset or environment in which they best ‘work their magic’. Still, at other times inspiration ‘strikes without warning’ or ‘can come from anywhere’. These are all phrases I’ve colloquially heard when people discuss their creative processes. Creativity is undeniably a force that can be helped or hindered, whether the creator is aware of those forces and their state of mind or not.

Without being able to measure or predict creativity, we can still attempt to improve it. Nickerson [1999] lists factors involved in teaching creativity, such as: “Build basic skills; Encourage acquisition of domain-specific knowledge; Stimulate and reward curiosity and exploration; Build motivation; Encourage confidence and risk-taking; Focus on mastery and self-competition; Provide opportunities for choice and discovery; and Develop self-management (meta-cognitive) skills” (my emphasis) [NSF page 14, Creativity Support Tool Evaluation Methods and Metrics]. We need tools to help with these tasks, which help improve the personal experience of the one creating, improve the outcomes and artifacts, and help with domain-specific process challenges.

A quick definition given of Creativity Support Tools (CSTs): “…tools that enable people to express themselves creatively and to develop as creative thinkers… software and user interfaces that empower users to be not only more productive, but more innovative… These advanced interfaces should also provide potent support in hypothesis formation, speedier evaluation of alternatives, improved understanding through visualization, and better dissemination of results” [NSF page 25, Design Principles for Tools to Support Creative Thinking].

One of the most fundamental (and repeated) examples of a creativity support tool is the pencil and paper (or whiteboard). It’s easy to use, fast to sketch ideas, and easily visualized (especially if you draw pictures rather than words). Other common examples are the telescope and sewing machine (likely due to Ben Shneiderman, as he’s used the examples previously).

Design Considerations/Principles/Criteria for Creativity Support

Consensus from the workshop gathered around the concepts of “low thresholds (easy entry to usage for novices), high ceilings (powerful facilities for sophisticated users), and wide walls (a small, well-chosen set of features that support a wide range of possibilities)” (my emphasis) [NSF page 2]. If this isn’t a summary for good open-world/simulation game design, I don’t know what is! Just think of Minecraft: a small set of fundamental features (harvesting/creating/placing voxels), easy entry (WASD/space/mouse controls, easy-to-understand metaphors and operations), and limitless possibilities. Honestly, the barrier to entry (figuring out all the block/item combinations) is the one I find most lacking in Minecraft, but that hasn’t stopped little kids from figuring it out. In tool-creation, though, this is a troublesome set of requirements. Tools restricted enough for the novice are often too restricted (or toy-ish) for experts. Everyone also learns at their own pace… just look at the utter failure that was Mario Maker’s tool unlock plan (where it used to be mandatory to play the game over 9 days to unlock all the level editing tools, until Nintendo released a day-1 patch offering an alternative unlock plan).

Trying not to get side-tracked… Candy, Edmonds, and Hewett have tried to understand the functional requirements and design criteria. In the report, they state: “…any Creativity Support Tool should allow the user: to take an holistic view of the source data or raw material with which their work; to suspend judgement on any matter at any time and be able to return to that suspended state easily; to be able to make unplanned deviations; return to old ideas and goals; formulate, as well as solve, problems; and to re-formulate the problem space as their understanding of the domain or state of the problem changes” [NSF page 14, Creativity Support Tool Evaluation Methods and Metrics].

All of the NSF Workshop article Design Principles for Tools to Support Creative Thinking (p. 25-38) is relevant to this section. A quick summary of their major points (other than the low thresholds, high ceilings, and wide walls, which were quoted from this article too):
1. Support Exploration: make it easy to change all aspects of the design. A tool must be trustworthy, so that users are comfortable trying new things, and progress will not be lost (UNDO!!!) A tool should also be “self-revealing” so that it is clear to users what can be done. Elements of the tool that are hard to use will not be used. The tool should be non-obstructive to exploration, and allow the user to exert partial effort to get a partial result quickly. FAIL FAST!
2. Low Threshold, High Ceiling, and Wide walls
3. Support Many Paths and Many Styles: some people have ADD and jump between all kinds of projects before circling back to a solution. Others focus and dive deep on a single task. The latter has typically been the one primarily supported, but the former is a perfectly viable creative process as well.
4. Support Collaboration: for both a single author getting feedback, or multiple authors working in tandem, communication and distribution of creations, tricks, techniques, and examples is extremely important for usability.
5. Support Open Interchange: seamlessly interoperate with other tools, both your own and others’. Working with common file formats, or enabling users to create “plug-ins” or “mods”, supports this point.
6. Make it as Simple as Possible – and Maybe Even Simpler: even though simpler and better-designed tools may be regarded as ‘toys’, the real trick is offering the simplest ways to do the most complex things. Think of the Minecraft example above. Don’t succumb to feature creep if your audience doesn’t want/need those tasks!
7. Choose Black Boxes Carefully: your black box is the lowest level of abstraction (primitives) your user can work with. If you’re not trying to teach physics, let a physics engine be a potential black box.
8. Invent Things that You Would Want to Use Yourself: your tool should be enjoyable to use. If you don’t enjoy using it, why would anyone else? External validation and community recognition are important.
9. Balance User Suggestions with Observation and Participatory Processes: users don’t know what they want, what tools are feasible, or what tools will result in a certain behavior. Designs with well-chosen parameters are often more successful than designs with fully-adjustable parameters, if fully-adjustable parameters are not needed by the user. Infer what users want/don’t want from their actions.
10. Iterate, Iterate – then Iterate Again: Just as we want users to work with rapid prototyping of ideas, so should your tool be a rapid prototype that responds directly and rapidly to user behaviors.
11. Design for Designers: design tools that enable others to design, create, and invent things. Writing software is a creative activity, and we can creatively write software for making creative software.
12. Evaluation of Tools: ….
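
Principle 1’s call for a trustworthy tool (UNDO!!!) can be sketched as a snapshot history. This is a toy illustration under my own assumptions: the `Canvas` class and its methods are hypothetical, and a real tool would likely store cheaper diffs or commands rather than full copies.

```python
class Canvas:
    """Toy design document that snapshots its state before every change."""

    def __init__(self):
        self.shapes = []
        self._history = []  # earlier states, most recent last

    def add_shape(self, shape):
        self._history.append(list(self.shapes))  # snapshot before mutating
        self.shapes.append(shape)

    def undo(self):
        """Restore the previous snapshot; do nothing if there is none."""
        if self._history:
            self.shapes = self._history.pop()


canvas = Canvas()
canvas.add_shape("circle")
canvas.add_shape("square")
canvas.undo()  # the square is gone, the circle remains
```

Because every change can be reversed, the user risks nothing by trying an idea, which is the sense in which undo supports exploration.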

Evaluation Metrics

“…[N]o single [evaluation] method or measure will be appropriate for all situations or all aspects of the complex phenomenon of creativity” [NSF page 15, Creativity Support Tool Evaluation Methods and Metrics]. The article lists a bunch of sample qualitative evaluation questions to be asked about the robustness, generalizability, effectiveness, and comparative strengths/weaknesses of the evaluated tool, as well as possible quantitative metrics to gather during studies. I am too lazy to write them all out here, since they’re like a page long. The article also compares pros and cons of long-term and short-term studies, surveys, and ethnographies. The short rule of thumb is: ethnographies give you the best coverage of usability and feedback, but take so much time/resources that you should use short studies in the beginning to iterate and produce a prototype worthy of the longer ethnographic study.

The main take-away of that article’s examples is how to extract why a user performs their actions and what the user’s goal is, and then help them reach it faster/easier/simpler. This approach applies at multiple levels of granularity. For example, at a low level, users may want to click-and-drag or click twice to move an object, and if you don’t support the one they use, you may understand why a user’s interaction fails. At a high level, the user may be moving the object in order to see a different configuration in comparison to one they already have. Maybe a comparative view, examining snapshots side-by-side, may help them compare configurations more easily. These examples also illustrate why qualitative AND quantitative data are crucial to understanding, for example, why the user tried doing the same operation twenty times in a row.

Current evaluation metrics (at the article’s writing, anyhow) on performance and efficiency may be important, but they are not the only measures of a creativity support tool’s effectiveness. How a tool influences a user’s problem-solving process or creative exploration needs to be better explored.

Other Cool Quotes

“work smarter, not harder” Beyond Productivity: Information, Technology, Innovation, and Creativity (2003)

“Creative ideas emerge from novel juxtaposition of concepts in working memory in the context of a creative task” NSF Workshop page 21, Creativity Support Tool Evaluation Methods and Metrics

“Almost by definition, creative work means that the final design is not necessarily known at the outset, so users must be encouraged to explore the space” From Fischer 1994, NSF Workshop page 26, Design Principles for Tools to Support Creative Thinking

“By creating you become more creative” NSF Workshop page 34, Design Principles for Tools to Support Creative Thinking


Paper Writing Tips

These were tips given in Jim Whitehead’s Generative Methods class at UCSC (specifically 2-27-12). It was an informal paper, but the notes are still useful.

Also a useful source by Joanna Bryson:

ACM SIG formatting style, 8-10 pages
1. Intro
– Broader relevance argument (Why did you just waste n weeks working on this?)
Better – Faster – Cheaper: Pick two!
– Super broad to the generic masses (Layman)
– Save $ on content equations (cheaper)
– More expressive, opens avenues of creativity (better)
– Adaptability to player actions (better)
– on-the-fly generation for replayability (faster, better)
– and/or inside a particular research community (Specific language and phrasing)
– An appeal to how the work contributes to a well-known research agenda (Holodeck and interactive narrative)
– Framing of your work against other work (if you don’t know the community, it can be hard)
– Statement of research questions
– Just bullet point it! “The research questions we address are…”
– Can number them and reference them
– Also good to have rationale text before/after (initially got this, which suggests that, etc)
– Road map paragraph
– The remainder of the paper is this… (related work, system, evaluation…)

2. Related Work
– Most related systems/algorithms/etc.
A. Enumeration Approach/Death March (if you do it wrong)
– 1-2 sentences up to 1-2 paragraphs on each relevant system/alg/etc
** – At the end, give differentiation from your work (can take the opportunity to take a pot shot!)
– Reference other papers that solved problems, or forward reference where you fix them
Ideal: problems with or gaps in prior work directly led to the current work
System A
System B
System C…
B. Framework Approach ** Better, but harder **, can often see a Table
– Characterize/Classify the work that’s come before (categorize)
– Slot in related work into those categories you defined
– Must describe classification scheme and categorize, and relate each category to your work *****
– Benefit: can fold many systems into one category, or use a category to dismiss a bunch of things at once
CategoryA (system, system, system)
CategoryB (System, system, system)…

3. Description of System/Approach (used to answer research questions)
– Architecture approach, major data structures, major representation details (grammar etc)
– Major components, broad algorithms,
– Typically: no code, few implementation details (ie language, design pattern)
– Architecture diagrams are good!  Data flow diagrams are also common

4. Evaluation
– Does the system answer/shed light on the research questions?
– A bunch of exemplars (cherry picking (success/failure cases)), or not
– Evaluation of generative space (usually better)
– Results for each research question (get down in the weeds, broader trends?)

5. Discussion
– Talk about broader trends if there are more to talk about!
– Talk about results in #4
(Not always present)

6. Future Work (less and less prevalent now)
– Criticize some things there, not criticized for NOT having one 😛
– Only if you have non-obvious conclusions
– Convincing other people to do it for you, not you doing it yourself
– Not a future task list for yourself
** PHD Dissertation is different: Branching out, thinking through, very good to have

7. Conclusion
– Restate what you learned about the research questions (Summary form, narrate it)
– Bring back to the broader relevance.
– How does knowledge learned relate to goals as stated in broader relevance discussion?

8. Abstract
– Write last, put first!

9. References ****
– Minimum: Convey the kind of thing that is being referenced
(book, webpage, conference page, journal, game)
– Do not trust BibTeX, do not trust EndNote, etc.
– Manually inspect references to make sure you know what KIND of thing it is
– Please include the year (ACM or IEEE style)


Hard Questions

I realized that my previous posts were elaborating on stuff I already knew and dodging the questions I had yet to answer, so I’ve been focusing on those harder questions in the past week.

First, my advisers said that I needed “an underlying metaphor” to tie my story together. I raged against this a while. Was this a metaphor for what I was going to build, or what the author was going to be building? Why did I even need one anyway? And what kind of metaphor could possibly work without over-simplifying the process? What did having a metaphor provide me that I didn’t already have?

I realized that the metaphor was for what I was going to be building, since that’s what the advancement document is all about, so that question got answered. And after spinning my wheels and coming up with absolutely nothing, I decided to just pick a metaphor and go with it: the storyboard metaphor. James Skorupski used it for Wide Ruled/Story Canvas, and he had cited others (Pizzi, Jhala, McDermott, Goldman, and Ronfard to name a few) who had successfully used it. Of course, it works best for visual and spatial concepts: cinematography, game movies, scenes, machinima, or even comic stills. The more I read James’ advancement proposal and some of his papers, however, the more I realized the benefit that storyboards had to offer me.

James had a structure very similar to mine: branching stories. Instead of a player choosing to take the high road or the low road, the character chooses whether to speak or take a drink. Character behaviors are like microscopic mini-stories with little plot structure. I can use stills of character action instead of story action, and the user can still read the interpolation in the gutters of the panels. Movies even have punchy emotional or internal conflict/choices that they must show through storyboards, so it’s not like it’s a character-unique problem that some choices would not be easily visible.  Michael’s first concern was the real-time aspect of the interaction, but once you  cut up real-time action into infinitesimally small time steps, it’s still time steps. There are still decision points. We just have to account for more of them than most branching story structures.

So I guess it wasn’t the storyboard metaphor itself that really got me excited; the research of all the people using it made me believe that I could use it too. It helped me come upon the comparison of interactive story structure and character behavior-making mechanisms. James even had a zoomable, collapsible, hierarchical tree of story points that showed various information by its structure AND content, EXACTLY what I had planned to do. James also used ABL to handle the tree. I don’t know how I should feel about him skipping off to eBay and not finishing his PhD, though >.> <.<

~~~ ***** ~~~

Then I talked in pod with Noah and Michael to iron out some more details on what the actual implementation would look like. I let myself go and had some pie in the sky dreams, which they had to ground me out in… Turns out that even though I ignore the method of input/output, it is still very important to be able to see the behaviors I’m trying to visualize. Procedural stick figures/procedural South Park would be a PhD in themselves, and I don’t want to get bogged down in thinking about the super low-level fundamentals. But at the same time, they feel necessary in order to work on a higher level that makes use of them! The further I get away from how people might actually want to use my authoring tool, the less likely that gap will ever get filled in by anyone, and the chance that my work might actually be relevant shrinks to near nothing. 2D or 3D is a huge question in and of itself, let alone all the troubles that each would provide. ARGH! I KNOW that I do not want just text, that much is certain.

I’ve previously focused on specific authoring bottlenecks to solve, both for IMMERSE (fighting with Unity) and ABL (flat language: no classes, no built-in hierarchy) in general, and recently I’ve been doing some iteration on imagining cool authoring visualizations and structures. Now I need to tie them together. Something like… adding some code hierarchy as built-in idioms (with patching capabilities for when people want to make their own) and allowing the author to see the generally invisible strings that tie behaviors together would help code reuse (libraries!) and organization immensely. Even people who have written behaviors forget how they work later! I need to make the tool such that authors are given a “box” that their behavior must function within, and that can act as the library’s boundaries, rather than just hoping authors conform to those standards. Giving authors idioms and patterns to work with. “Reifying idioms” is Michael’s phrase for it.

Noah thinks that would be an interesting cut-off point for me to define. You have to go back to Eclipse to define new idioms. But if you want to work with the ones the tool understands, you work in the tool. And have some kind of mark-up for new idioms that allows the tool to hook into them.

Another angle of possibilities is the blank page problem: what motivates my character and why? You can author a character to perform some specific tasks in a specific domain, but you feel as if you’re reaching into the dark. There is nothing about ABL that encourages long-term motivation or emotional responses (for example); you just kind of have to feel it out and add it ad hoc as you’re writing behaviors. Michael suggested borrowing Egri as a metaphor to have the tool guide you to fill in those missing pieces, similar to Dramatica. Operationalizing the blank fields that authors would plug in within the tool.

Can we ignore unanticipated interactions/take it as a given? Aim instead for describing higher-level stuff? The “picking up coffee cup” problem is not on the ABL side, but more on the engine side. It’s Unity and those kinds of engines that have trouble doing anything other than running and shooting. They simply aren’t made to, for example, hold a coffee cup such that the liquid inside wouldn’t spill everywhere while the character moves. There is no gesturing, there is little object attachment, there is no sitting in chairs. It’s all fudged. Docking is hard (Assassin’s Creed team is focused on making the tech to do it. Years and tons of people to make free running look good. And they aren’t in engines by default. What would happen if that amount of tech was put into dramatic performance? There isn’t gameplay around dramatic performance as an industry standard.). Another team spent 6 months trying to get characters to sit in a chair/stand up without it looking horrible.

The weird animations are not an authoring problem in terms of ABL; they’re an authoring problem in terms of the engine, which was never built to afford anything other than FPSes. It just becomes an ABL problem when we try to baby and band-aid the shortcomings and edge cases of the engine. AAA games have horrible leer-y rubber masks or put helmets on everyone.

But storyboards mean that you don’t necessarily need to animate it. Maybe just show it in Unity and not worry about it not looking too good (IT HURTS!)? Maybe simplify it like Storyteller? Noah’s suggestion: Start with the authoring stuff. By the time I’m getting along with the authoring stuff, there may be some project that I want people to author for. And then worry about the visualization of the actual output by then. But I NEED to have behaviors to test them… Argh!

Lots of stream of consciousness stuff there as I listened to my recording of Michael/Noah’s feedback. The best part of having Michael recordings is listening to him do raspberries as he thinks :).


Architecture Ramblings

It is so hard to get back on a horse that you’ve fallen off of, when it should be galloping away into the sunset. Good news is that the FDG paper is finally sent off, so at least I did something!  Sorry, it’s been kinda rough personally lately. I will do my best to get back into the swing of things for this post.

~~~ ***** ~~~

The caveats that I wrote and didn’t really need in the last few posts were meant for this one. Probably my weakest link so far in terms of research, but I can’t avoid it any more.

I have been assuming an agent flow something along the lines of:
Sensing -> Processing -> Deciding -> Actuating
And I am searching for some architecture that helps with Processing and Deciding. There have to be better AI words and definitions for what I mean, but in my paper I called it the “Decision-Making Mechanism.” Architectures that help with it can be of any model, discipline, or motivation: be it affective computing, cognitive science, planning, or something I’ve failed to mention so far. So long as it fits what I want. The following are listed in no particular order, just architectures I have heard of and my impressions of them. (Wikipedia lists so many more… Do I even want to try and look at them all? Apparently other people are a lot better at it than I am.)
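To make the Sensing -> Processing -> Deciding -> Actuating flow concrete, here is a minimal Python sketch of one agent tick. Everything in it (the `Agent` class, the door world, the action names) is hypothetical and made up for illustration; it is not how ABL or any particular architecture does it, just the four stages wired together in order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types and names, for illustration only.
Percept = Dict[str, float]
Beliefs = Dict[str, bool]


@dataclass
class Agent:
    sense: Callable[[dict], Percept]        # Sensing: read the world
    process: Callable[[Percept], Beliefs]   # Processing: interpret percepts
    decide: Callable[[Beliefs], str]        # Deciding: pick an action name
    actuate: Callable[[str, dict], None]    # Actuating: change the world

    def step(self, world: dict) -> str:
        """Run one pass through the four stages and return the chosen action."""
        percept = self.sense(world)
        beliefs = self.process(percept)
        action = self.decide(beliefs)
        self.actuate(action, world)
        return action


def apply_action(action: str, world: dict) -> None:
    """Toy actuator: walking closes the distance, opening sets a flag."""
    if action == "walk_to_door":
        world["door_distance"] = max(0.0, world["door_distance"] - 1.0)
    elif action == "open_door":
        world["door_open"] = True


def make_demo_agent() -> Agent:
    return Agent(
        sense=lambda world: {"door_distance": world["door_distance"]},
        process=lambda p: {"at_door": p["door_distance"] <= 0.0},
        decide=lambda b: "open_door" if b["at_door"] else "walk_to_door",
        actuate=apply_action,
    )


def run(world: dict, agent: Agent, max_steps: int = 10) -> List[str]:
    """Tick the agent until the door is open or we run out of steps."""
    trace = []
    for _ in range(max_steps):
        trace.append(agent.step(world))
        if world.get("door_open"):
            break
    return trace
```

The part I care about is the middle two callables, `process` and `decide`. In this toy they are one-liners, and the whole architecture question is what should replace them.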

I’m using Soar as a token cognitive architecture here.
From its FAQ: “Also, if you only have limited time that you can spend developing the system, then Soar is probably not the best choice. It currently appears to require a lot of effort to learn Soar, and more practice before you become proficient than is needed if you use other, simpler, systems, such as Jess.”
They list their pros as being able to handle learning, interruptibility, large-scale rule systems, parallel reasoning, and a design approach based on problem spaces (which I’m not entirely sure what that means, but okay!).
Their primary concerns seem to be functionality and adherence to their model, not necessarily usability. I’ve worked with large rule systems before, and it is difficult to make those rules make sense, to imagine them as a whole/working in conjunction, and to keep them organized in general. I would need to do a lot of work imposing structure on the undefined nature of everything and organizing the rules into something coherent.

My token representation of affective computing (and general planning). Again, more people have done better comparisons than me.
On the surface, it seems that using emotions as a model makes the authoring a bit more understandable and transparent — I am afraid of something, so I won’t want to do it! Simple.  However, the goal structures that determine scenarios feel weird, the personality files that control changes in opinion feel even weirder, and something feels off about distilling everything into a handful of positive/negative feelings. Dramatic stories are conflicted, not always so clear-cut, and it is difficult to be expressive when all agents are cut from the same decision-making cloth.

ABL, used to make Facade, seems to be the clearest answer out of the lot of them. That’s sorta what this whole thing has been leading to on purpose, but let’s see how well it all works out anyway.

  • Favors usability and common-sense logic – ABL’s structure is a Behavior Tree (ABL Behavior Tree or ABT), made up of choices. It’s one of the simplest, fundamental representations of branching logic.
  • Low price of admission – One of ABL’s biggest flaws (like any architecture on this list) is its complexity and capability. My paper talked about how novice authors need pre-defined idioms to even get started. However, with a fundamental set of idioms and structures written for an author, the difficulty of making behaviors is drastically lowered. I know, I’ve been there!
  • Capable of higher complexity – An agent can have its own ABT, or one tree can govern multiple agents. Reactive planning allows an ABT to grow to be as complex or simple as necessary in any given moment.
  • Scalable – Facade proves its capabilities of scaling to a satisfactory degree.
  • Hierarchical and Modular – Reactive planning ensures that sub-goals of a player are self-contained (as they have to be when they get added onto the ABT). A behavior tree made of behavior Legos.
  • Embodied – Hooks up into Unreal or whatever the user wants.
  • Interactive – Reactive planning is called such because it reacts to user’s actions… interactive by definition.
  • Dramatic – Facade shows that it’s at least capable of dramatic authoring
  • Expressive – Facade shows that it’s at least capable of expressiveness

~~~ ***** ~~~

I probably went about this the wrong way… But I felt I had to answer the questions: Why ABL? Why reactive planning? Why not another architecture or method? I don’t even think I really answered the questions with the above. I don’t think I could even enumerate every architecture I could find (although that might be a useful exercise in the future), and I don’t know how well that would answer the questions either. I shouldn’t take the power of ABL for granted.

Behavior Trees do not scale well on their own and are not capable of accounting for every possibility at every moment for an embodied agent. The ABT built of reactive planning makes the best of both worlds: it takes the simplicity of design of behavior trees and the potential complexity of situated AI and makes it work. ABL has no model or structure of the human mind, emotions, or any of that built-in, but it is capable of supporting whatever an author may choose is their most natural form of decision-making. ABL enables whatever folk psychology an author has, and it is up to the author whether it will work, make sense, or be designed well enough to function.  I get to show authors what is possible, what ABL and its agents are capable of, not chain them to a particular philosophy and force them to work within its boundaries.

That is why I chose ABL out of any others. That and I have experience with it, and access to Michael Mateas whenever I need it >=D.


Fields and Terminology of AI

The following is a collection of ongoing research into AI approaches and specific architectures that fit my previous criteria as closely as possible. Theoretically, any AI architecture or method could be considered here, but it would be a huge waste of time to elaborate on them all. Instead, I’ve chosen to discuss candidates that I know fit at least most of my criteria. And I know that I will need more (detailed) research here, not only into the architectures themselves, but into how people author for them.

First, a review of terminology that has caused many lines to be drawn in the sand of the AI landscape. Here are some AI keywords (from surfing Wikipedia) that will narrow down the field of all “artificial intelligence” to those close to my specifications:

Situated Artificial Intelligence
While situated often implies robots in the case of this field of AI, it has been accepted that virtual agents embedded in a dynamic virtual world that they can sense and manipulate also count as situated. I suppose that situated may more closely represent what I mean by embodied.  Situated AI tends to be represented bottom-up: composing tiny bits of behavior into less-tiny-bits of behavior in hierarchical steps until we reach a composition that will appear to look like “opening a door” to a viewer.

Traditional or Symbolic AI
I include this definition in contrast to the last. It is top-down: decomposing the idea of “opening a door” into many individual sub-steps until a list of actions is found. Static Behavior Trees (BTs) and Finite State Machines (FSMs) are included in this class of behaviors. It is impossible to foresee all situations an embodied agent will encounter, and traditional AI falls short when the agent encounters an unknown/missing/inappropriate symbol or behavior.
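Here is a small Python sketch of that top-down idea, under my own made-up assumptions: the task names, the `DECOMPOSITIONS` table, and the `decompose` function are all hypothetical, not taken from any real system. It expands “open a door” into a flat list of primitive actions, and it also shows the failure mode: ask for a task the table has never heard of and the whole thing falls over.

```python
from typing import Dict, List

# Hypothetical task library: each compound task maps to ordered sub-steps.
DECOMPOSITIONS: Dict[str, List[str]] = {
    "open_door": ["approach_door", "operate_handle", "push_door"],
    "approach_door": ["face_door", "walk_forward"],
    "operate_handle": ["reach_for_handle", "turn_handle"],
}

# Hypothetical primitive actions the agent can execute directly.
PRIMITIVES = {
    "face_door",
    "walk_forward",
    "reach_for_handle",
    "turn_handle",
    "push_door",
}


def decompose(task: str) -> List[str]:
    """Recursively expand a task into an ordered list of primitive actions."""
    if task in PRIMITIVES:
        return [task]
    if task not in DECOMPOSITIONS:
        # The unknown/missing symbol case: nothing was authored for this task.
        raise KeyError(f"no decomposition known for {task!r}")
    actions: List[str] = []
    for sub_task in DECOMPOSITIONS[task]:
        actions.extend(decompose(sub_task))
    return actions
```

Calling `decompose("open_door")` walks the table depth-first and returns the primitives in order, while `decompose("open_window")` raises a `KeyError` because nobody authored that symbol.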

Affective Computing
Psychology’s affect deals with emotions and feelings. AI in this field processes audio and visual signals relating to human reading and demonstration of a variety of emotions (which, in themselves, are difficult to classify). My concern with emotions is enabling the author to create agents that dramatically express (and possibly track) emotional state in a human-readable fashion. It would be a first-class research problem to try to make an AI read the player’s emotions (which is why I am dodging the input/output problems).

Artificial General Intelligence
While an author could make an animal or non-human agent enact their behaviors, we are aiming to author human-level intelligence in AI.  However, we want human-level intelligence in the service of human-like performance.  AGI is hypothetical and controversial; are we creating intelligent agents (AI-hard) or agents that act intelligent?  I am perfectly satisfied with simply acting intelligent.

Automated Planning and Scheduling
Often shortened to just “planning,” this branch of AI is concerned with composing a sequence of decisions, often in service of accomplishing a goal. Planning can go forward and backwards and be driven by any manner of decision-making theories such as reinforcement learning and statistical models. Our agent will need to have some method of planning in its architecture for sequencing and deconstructing behaviors (so that they are simpler to author).
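As a rough illustration of forward planning, here is a Python sketch that searches breadth-first from a start state to a goal. The actions, facts, and the `plan` function are hypothetical examples of my own, and breadth-first search is just the simplest strategy I could pick; real planners use far smarter ones.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# Hypothetical actions: name -> (preconditions, facts added, facts removed).
Action = Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]
ACTIONS: Dict[str, Action] = {
    "walk_to_table": (frozenset(), frozenset({"at_table"}), frozenset()),
    "pick_up_cup": (frozenset({"at_table"}), frozenset({"holding_cup"}), frozenset()),
    "drink": (frozenset({"holding_cup"}), frozenset({"thirst_quenched"}), frozenset()),
}


def plan(
    start: Set[str],
    goal: Set[str],
    actions: Dict[str, Action] = ACTIONS,
) -> Optional[List[str]]:
    """Forward breadth-first search for a shortest action sequence reaching goal."""
    start_state = frozenset(start)
    frontier = deque([(start_state, [])])
    seen = {start_state}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for name, (pre, add, remove) in actions.items():
            if pre <= state:  # action is applicable here
                next_state = (state - remove) | add
                if next_state not in seen:
                    seen.add(next_state)
                    frontier.append((next_state, steps + [name]))
    return None  # goal unreachable with the given actions
```

With these toy actions, `plan(set(), {"thirst_quenched"})` sequences walking, picking up, and drinking, and a goal no action can produce comes back as `None`.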

Cognitive Science
Brains and computers have shared a metaphor since computers were invented.  We can use our own mind’s inputs, outputs, and computations as inspiration for computational models, and once we make them accurate enough, we’ll have solved that hard AI problem.  Theoretically. Until then, we can only work with the closest approximations we have. If a cognitive science system can produce decent behavioral performances through its approximations, it deserves to be considered as another model of decision-making.

Next post will be about actual architectures. Promise!
