Creative Sense-Making: 

An Enactive Cognitive Theory, Method, and Analytical Framework for
Co-Creative AI


Nicholas Davis, PhD


Abstract

Creative sense-making (CSM) is a theoretical framework that applies the enactive frameworks of sense-making and participatory sense-making (PSM) to creativity and co-creation. CSM proposes that creativity, and co-creation in particular, are sense-making activities where meaning is built dynamically through autonomous interaction with the environment and other agents. The framework is distinguished from typical accounts of sense-making and PSM because it involves the production of a creative artifact, a piece of artificial media, whether it be ephemeral or concrete. CSM seeks to model the interaction dynamics of co-creation to learn more about the sense-making and PSM patterns present during co-creation. The framework utilizes the Enactive Model of Creativity to develop a cognitively rooted interaction coding scheme. The coding procedure is applied continuously via manual video coding of a recorded co-creative session or automatically by encoding the coding scheme into a co-creative system. The codes, when summed, produce a CSM curve that visually depicts trends and patterns in the interaction dynamics of a co-creative session. These curves can then be analyzed using statistical modeling techniques, such as stock market technical analysis. The turn-taking rhythm can also be modeled and visualized with the approach. The CSM can provide a shared methodology, theory, and analytical framework with which to conduct research in co-creative AI. 


Keywords: Co-Creative AI, Participatory Cognition, Enactive Cognition, Sense-making, Creativity, Human-AI Interaction



Introduction

Co-creative artificial intelligence (AI) is a field of study that focuses on designing and studying co-creative AI systems that contribute to a shared creative product with the users (Davis, 2013). The field of computational creativity can be segmented into three categories of systems (Davis, Karimi, Guzdial): autonomous computational creativity systems that generate creative products, creativity support tools that accelerate the user’s existing creativity, and co-creative systems that collaborate with the user on a shared creative product. In co-creative contexts, the interaction can be ephemeral, such as dance, or concrete, like design. Turn-taking can emerge between the user and AI where ideas successively build upon each other in a collaborative improvisation. The user and AI can take on different roles during the co-creation, such as definer (e.g. coming up with new ideas), refiner (e.g. elaborating on present ideas), and evaluator (e.g. reflecting on the creative product and offering critiques). 


In co-creation, the creative process progresses dynamically with shared meaning gradually and interactively built through a social interplay. There is a social interaction dynamic that has a degree of autonomy that influences the co-production of the shared artifact. The sense-making of the individual can be applied to the interaction itself or to the content of the interaction (e.g. the creative output). Making sense of the interaction regulates the interaction dynamics and entails activities such as modifying the rhythm of turn-taking, adjusting turn length, and providing feedback. Making sense of the creative output is about building meaning based on what the other collaborator contributed. In this context, goals are not predetermined, but rather emerge through a mutual and interactive process. The enactive concept of a directive is therefore more appropriate than that of a fixed goal: a directive provides some structure and constraints for the creative activity without imposing a scripted plan. The co-creative process would then be characterized by the emergence of directives that guide the creative process along certain trajectories that can be modified in the moment by either collaborator. 


The argument put forth by the creative sense-making framework (CSM) is that co-creation is a form of sense-making in which meaning is interactively constructed through the autonomous interaction of an agent with the environment and other agents within that environment, e.g. a co-creative AI. During co-creation, participants engage in participatory sense-making, where their interactions form a co-regulated interaction coupling (e.g. a structural correspondence between successive turns). The coupling is co-regulated because the autonomy of both parties is respected and each contributes to the flow of the interaction. An example in the domain of drawing would be one collaborator drawing a face and another collaborator adding ears and a hat. The turns are coupled because they have some semantic relationship with each other. Coupling can also come from similarity between turns, such as one person mimicking another. PSM investigates interaction couplings, feedback, coordination, and the rhythm of turn-taking, including periods of meaning expression and pauses. 


Creative sense-making applies the ideas of sense-making and participatory sense-making to creativity and co-creation. Creative sense-making is distinct from typical sense-making contexts because there is a shared creative product that can be ephemeral, such as a dance, or concrete, like an artwork. The actions taken by participants are reflected in the artifact and serve as further feedback for subsequent actions. These actions can be quantified to study the sense-making processes of co-creation. CSM demonstrates how it is possible to deduce cognitively rooted modes of interaction and code interactions through time to arrive at a curve describing the interaction dynamics of co-creativity. 


The CSM coding scheme and theory are based on the Enactive Model of Creativity (EMC). The EMC describes two main cognitive modes: clamped (e.g. fluidly executing actions) and unclamped (e.g. exploring the environment or interactively building a better mental model). It proposes that creativity is a fluctuation of clamped and unclamped cognition. The CSM’s analytical technique quantifies those fluctuations to define interaction patterns and trends that describe the sense-making and PSM processes present in co-creation. With this data, a model of co-creation can be derived that describes the temporally extended interaction dynamics of sense-making present in co-creation. This is a domain-independent model, and it can be used to compare the interactions of co-creative systems in different creative domains. 
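As a rough illustration of what quantifying these fluctuations could look like, the following Python sketch turns a sequence of coded modes into a running sum, one point per coded interval. It is a minimal sketch under stated assumptions, not the framework's actual coding procedure: the numeric values (+1 for clamped, -1 for unclamped) and the names MODE_VALUES and csm_curve are hypothetical and chosen only for illustration.

```python
from itertools import accumulate

# Hypothetical numeric values for each cognitive mode (an assumption for
# illustration; the CSM coding scheme defines the actual codes).
MODE_VALUES = {"clamped": 1, "unclamped": -1}


def csm_curve(modes):
    """Return the running sum of mode values, one point per coded interval."""
    return list(accumulate(MODE_VALUES[m] for m in modes))


# Hypothetical coded session: one mode label per time interval.
session = ["clamped", "clamped", "unclamped", "clamped", "unclamped", "unclamped"]
curve = csm_curve(session)
print(curve)  # [1, 2, 1, 2, 1, 0]
```

Under these assumed values, a rising curve reflects sustained clamped activity and a falling curve reflects sustained unclamped activity, so the shape of the curve depends only on the sequence of modes and not on the creative domain.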


This paper begins by providing some background on the cognitive theory of enaction and its frameworks of sense-making and participatory sense-making. Then, the Enactive Model of Creativity is described to arrive at the cognitively based interaction coding scheme. Next, the coding scheme and the technique for applying it are explained, including manual and automatic application procedures. The Codix video coding platform, which enables the manual CSM code application process, is described. The AI Drawing Partner, a co-creative drawing environment that automatically applies the CSM’s interaction mode codes, is then described. Next, the analysis process for the CSM curves is described, including stock market technical analysis and turn-taking rhythm analysis. A discussion follows that examines how the CSM is a domain-independent theoretical, methodological, and analytical framework to advance co-creative AI. 


The primary contribution of this article is the expansion of the existing CSM framework into a full theoretical, methodological, and analytical framework for studying human-AI co-creation. The secondary contribution is a novel analytical framework that can aid in the analysis and interpretation of results in co-creative AI research. The final contribution is the introduction of two freely available research platforms with which the research community can conduct CSM analyses: 1) Codix, a video analytics and coding platform for manually applying the CSM coding procedure, and 2) AI Drawing Partner, a quantified co-creative drawing system that automatically applies the CSM coding procedure.

Related Work

Theories of Co-Creation

Within computational creativity research, Guzdial and Riedl introduced a framework to represent turn-based co-creative systems, emphasizing the importance of understanding the start and end conditions, the actions each partner can take, the nature of the AI agent, the role of the user, and the turn-taking dynamics (Guzdial et al., 2019). This framework provides a structured way to analyze and design co-creative interactions, highlighting the nuances of different collaborative styles within the interaction. In a turn-taking system, the framework helps quantitatively define aspects of the system, which can then be used to evaluate similar co-creative systems based on these features systematically.


Kantosalo et al. (2020) extend this idea by introducing a layered model for describing interactions with co-creative AI systems: interaction modalities, styles, and strategies. Modalities refer to the channels of communication, styles describe the structure and behavior of interactions, and strategies govern the system's decision-making process. This layered approach allows for a more nuanced understanding of the complex interplay between humans and AI in co-creative processes. For example, a system might use visual and auditory modalities to communicate with the user, adopt a collaborative style that encourages back-and-forth interaction (turn-taking), and employ a strategy that prioritizes novelty and surprise in its creative output.


Wu et al. (2021) further extend this interaction modality understanding by introducing the "Human-AI Co-Creation Model," a circular process encompassing six phases: perceiving, thinking, expressing, collaborating, building, and testing. This model emphasizes that AI can augment human capabilities in each phase, leading to a more enhanced and efficient creative process. In the "perceiving" phase, the AI can analyze vast datasets to provide insights that might not be readily apparent to humans, while in the "expressing" phase, AI tools can help users generate and iterate on creative outputs more rapidly. The emphasis on collaboration in Wu et al.'s model aligns with the core tenet of co-creative AI, where humans and AI actively contribute to the creative process, fostering a sense of shared agency and ownership over the final artifact, whether concrete or ephemeral.


Rezwana (2022) further enriched this understanding by focusing on the technical systems underpinning human-AI co-creation in their Co-Creative Framework for Interaction Design (COFI). This framework emphasizes the role of the interactive system's design, the algorithms in use, and the level of automation in shaping how collaboration manifests. A tool offering basic suggestions might encourage a turn-taking style, while a complex, manipulative environment could allow for task-divided or even simultaneous collaboration. COFI also provides a structured approach to analyzing and designing co-creative interactions by examining various aspects of the interaction between humans and AI, such as the types of contributions made by each party, the communication channels used, and the overall creative process. By considering these features, COFI offers a comprehensive framework for understanding and designing effective co-creative AI systems.


While these frameworks offer valuable insights into co-creative AI, they primarily focus on the technical and interaction design aspects. On the other hand, the Creative Sense-Making (CSM) framework delves deeper into the cognitive and social dynamics of co-creation, emphasizing the importance of shared meaning-making and the emergence of directives through interaction. CSM's unique contribution lies in its ability to quantify the fluctuations between clamped and unclamped cognition, providing a nuanced understanding of the interaction patterns and trends in co-creative processes. This focus on the underlying cognitive processes and the dynamic interplay between humans and AI sets CSM apart from other frameworks, positioning it as a valuable tool for understanding and enhancing the co-creative experience.

Enactivism 

Enactivism stands in contrast to traditional cognitivist views of cognition as information processing systems that sense, plan, and act according to environmental input. In an enactive account, the sensing and acting occur in tandem through interaction with the environment. Perception is for action and guided through action. Perceptual sensing is an activity itself guided by affordances in the environment that represent the action potentials of objects. Cognition is said to be embodied and embedded in a dynamic environment that is leveraged to support cognition, making cognition extended to the environment as well. The enactive account cuts across the brain-body-world divide by viewing cognition as a sense-making activity in which interaction with the environment and the feedback from those interactions inform an intelligent and embodied perceptual process. 


Enactivism rests on five pillars: autonomy, the capacity of an agent to maintain its existence under precarious conditions through interaction and exchange with the environment; sense-making, imbuing the environment with meaning through interaction in order to continue the autonomous identity of the cognizer; emergence, the arising of meaning through interaction; embodiment, whereby the agent’s actions are afforded and constrained by its body configuration; and experience, in which an interaction flows and has a trajectory and a history of interactions. 


There are three strands of enactivism present in the literature: autopoietic enactivism, sensorimotor enactivism, and radical enactivism. Autopoietic enactivism argues for a continuity of life to mind and investigates the biodynamics of an organism as it interacts with its environment to continue its autonomous existence. These researchers seek to define the cognitive structures that emerge as a result of the unique biological imperatives of the cognizer. Central to this strand of enactivism is the concept of autonomy, which is the notion of an organism’s ability to survive under precarious conditions through building meaning by interacting and exchanging resources with the environment. The organism can also modify the environment to change the available actions it can take on the environment, making it adaptive. Cognitive agents, according to this account, are autonomous and adaptive. 


Sensorimotor enactivism investigates the interrelationship between perception and action. These theorists view perception as an activity in which the cognizer reaches out into the world to perceive elements of that world through sensorimotor feedback loops. They see perception as guided by sensorimotor feedback loops that form through interaction with the environment, and these sensorimotor loops constrain and inform cognition. These sensorimotor feedback loops, when activated through time, form sensorimotor contingencies, which are defined as “patterns of dependence obtaining between perception and exploratory activity” (Ward et al., 2017) that structure the manner in which perceptual activity is carried out. 


Radical enactivism, or radically enactive cognition (REC), argues against internal representations, and instead proposes that cognition is an embodied activity that is carried out without a mental model. “REC takes up the general enactivist project of rejecting cognitivism in favour of analysing minds in terms of dynamic patterns of adaptive environmental interactions” (Ward et al., 2017).


Applying enaction as a lens for co-creation involves investigating the dynamics of interaction between a user and an AI, such as the social interplay and shared meaning construction present during co-creation. In their work, The Five Pillars of Enaction as a Theoretical Framework for Co-Creative AI, Davis et al. (2024) define enactive co-creative AI as follows: 

“At least one human and one agent collaborating on a shared creative product where the autonomy of the user and agent is maintained and meaning is built through interaction, coordination, communication, and feedback. The agent and user engage in sense-making (regulating interaction with the environment) and participatory sense-making (regulating a social sense-making process) to understand each other’s creative intentions and enact or bring forth meaning in the environment. Both the user and agent are embodied, with interactions constrained and afforded by their bodies. The agent engages in improvised interaction to yield emergent interaction dynamics. The agent remembers its experience, storing the interaction history and utilizing that to inform the creative trajectory of the interaction.” (Davis et al., 2024)

This definition of co-creative AI emphasizes the dynamic nature of meaning construction through feedback and coordination in a social sense-making process, embodied in a dynamic environment, with rich affordances that guide interaction. Importantly, it recognizes the lived experience of a co-creative session and how a nuanced interaction history is developed through time. Enactivism, along with its constituent frameworks of sense-making and participatory sense-making, can be used to study, describe, and quantify co-creative AI. 

Sense-Making 

Sense-making relates to the ability to interact with the world to build meaning relative to continuing one’s autonomous identity. The interaction with the world ‘casts a web of significance’ () and imbues the environment with meaning through interaction. For example, De Jaegher (2013) writes: “Actions and their consequences constantly shape the underlying processes and modulate autonomy such that intentions, goals, norms, and significance in general change as a result. The significant world of the cognizer is therefore not pre-given but largely enacted, shaped as part of its autonomous activity.” (De Jaegher, 2013). De Jaegher goes on to summarize and define sense-making as: “a cognizer's adaptive regulation of its states and interactions with the world, with respect to the implications for the continuation of its own autonomous identity. In other words, sense-making is concerned acting and interacting, and the concern comes directly from the sense-maker's self-organization under precarious circumstances.” (De Jaegher, 2013). Similarly, Thompson and Stapleton (2009) define sense-making in relation to autonomy: “Sense-making is the interactional and relational side of autonomy…Sense-making is behaviour or conduct in relation to environmental significance and valence, which the organism itself enacts or brings forth on the basis of its autonomy” (Thompson & Stapleton, 2009). Sense-making can be applied to co-creation by investigating meaning construction based on the autonomous activities of the user and AI. 

Participatory Sense-Making (PSM)

PSM is a cognitive framework developed by De Jaegher and Di Paolo (2007) within the autopoietic enactive tradition that situates meaning construction as a social activity based on the dynamics of interaction and exchange between multiple agents. These interactions form a ‘co-regulated coupling’ where the autonomy of each interactor is respected and each contributes to the flow of the interaction. The co-regulated coupling develops an independent autonomy that influences each of the contributors, i.e. the dynamics of interaction. The cognitive agent can regulate both the interaction dynamics as well as the content of what is contributed to the interaction. When the agent regulates both interaction dynamics and the content of contributions, participatory sense-making emerges. In this approach, meaning is co-constructed through interaction with other agents and the environment in a way that respects the autonomy of each interactor. 


PSM can be applied to co-creation to describe the dynamics of interaction, in particular interaction couplings and shared meaning construction. In co-creation, turn-taking can emerge where each turn can be related to what came before. This relationship can be a semantic, visual, or structural correspondence between the content of the turns. PSM emerges in co-creation when multiple successive turns are related to each other and the collaborators build upon each other’s contributions in a collaborative improvisation while maintaining the dynamics of interaction through feedback and coordination. 
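One way the relationship between successive turns could be operationalized is sketched below in Python. This is an illustrative assumption rather than a measure defined by PSM or the CSM: each turn is represented by a hypothetical set of semantic tags, and the coupling between consecutive turns is approximated by the Jaccard overlap of those tag sets. The function names, the tags, and the choice of "any overlap" as the criterion for a coupled turn are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coupling_scores(turns):
    """Overlap between each turn and the turn immediately before it."""
    return [jaccard(prev, curr) for prev, curr in zip(turns, turns[1:])]


# Hypothetical tag sets for four alternating turns in a drawing session.
turns = [
    {"face", "head"},
    {"head", "ears", "hat"},
    {"hat", "feather"},
    {"tree"},
]
scores = coupling_scores(turns)
coupled = [score > 0 for score in scores]  # assumed criterion: any shared tag
print(scores)   # [0.25, 0.25, 0.0]
print(coupled)  # [True, True, False]
```

In this toy session the first three turns build on each other, while the final turn shares no tags with its predecessor and would mark a break in the coupling. A visual or structural correspondence would need a different representation of each turn than tag sets.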

Enactive Model of Creativity (EMC)

The EMC is a model for understanding how cognition and perception change dynamically during the creative process (). It proposes the creative process unfolds by discovering and defining directives that guide and constrain interactions. These directives influence what affordances are perceptually available in the environment, which in turn constrains the possible actions that can be executed in the environment. The EMC proposes perceptual logic as a cognitive mechanism that selects relevant affordances from all available affordances. Cognition clamps to a perceptual logic during execution and unclamps during exploration. 


Clamped cognition is knowing what to do, when to do it, and executing actions with minimal thought or effort (e.g. walking down a sidewalk). In the model, this is an equilibrium with the awareness rectangle centered in the middle of the cognitive continuum. This is referred to as everyday cognition and relates to the time spent performing activities that do not require extensive thought, such as routines or habits. It would feature a well-defined perceptual logic, where the individual can intelligently ‘read’ the environment to detect key pieces of information that inform the task at hand. Here, the individual is in a sensorimotor interaction coupling with the system in the environment, such that the system performs predictably and minimal higher-order thinking is involved. The individual is not relying on a mental model of the situation, but rather executing embodied and situated actions in a predictable yet dynamic environment. 


Clamped cognition corresponds to the state of flow identified in the psychological literature, which is an optimal state of cognition where skill and challenge are balanced, the individual can focus purely on creative expression, and time ceases to flow linearly (Csikszentmihalyi). An artist, for example, would be in a state of clamped cognition while they were painting in a state of flow. Cognition would be clamped to a perceptual logic that guided perception and action based on perceived affordances in the artwork. To a skilled artist, i.e. one who has developed expert perceptual logic, the artwork itself guides the execution of actions because of the artistic affordances present when a skilled artist looks at their own work in progress. Perceptual logic exists at the local, regional, and global level in an artwork. Changing some local details could perturb the regional or global balance and require correction, thus affording another change. When perceptual logic is violated, cognition can unclamp to explore ways to find balance in the artwork. 


Unclamped cognition is interacting with the world to update one’s mental model or explore the environment (e.g. recovering from tripping on the sidewalk). During an unclamp event, the individual may intentionally interact with the environment to update their mental model of the situation at hand. For example, after tripping on the sidewalk, the individual may inspect the upcoming regions of the sidewalk to ensure there are no other cracks or bumps. This inspection could be used to update the mental model of the sidewalk as either smooth or bumpy. Once the mental model has been updated, a new perceptual logic is put into place and the affordances of the environment change dynamically according to the new perceptual logic, i.e. the bumps of sidewalks become more salient in the perceptual process. The mental model is no longer actively used at this point once the perceptual logic is in place and cognition becomes clamped. 


Within the unclamped category, there are two ways cognition can unclamp: 1) Functional unclamp, which is regulating interaction with the environment (e.g. exploring the environment from new angles, inspecting details of the environment), and 2) Interactional unclamp, which is disengaging from the interaction, possibly to think, reflect, and update the mental model. Functional unclamping changes the sensory data available to the individual through embodied explorations to gain new perspectives, e.g. moving the head or body to get a different viewpoint. It can also involve experimentally perturbing the system in the environment to determine the relationship between actions and their causal effects in that system, e.g. stepping slowly on ice and studying the feedback from each step so as not to fall through into the water. Interactional unclamp is a metacognitive activity where cognition becomes about cognition. It can occur during periods of pausing and hesitation where the focus of cognition becomes inner processes rather than the sensory data coming into the human system.


In the artist example, an unclamped state would be stepping back from an artwork to get a better view of the whole piece. This would be an example of an interactional unclamp event. The artist could also engage in a functional unclamp event, which could be simulating brush strokes above the canvas to help visualize what they would look like if they were painted. This is an embodied activity where the artist attempts to feel what the brush stroke would feel like if it were painted on the canvas, through both proprioceptive feedback and visual feedback. The artist would then progress through states of clamped cognition, where they were fluidly painting on the canvas, and unclamped cognition, where they were engaged in metacognitive activities, such as evaluating, reflecting, and simulating different outcomes. This cycle of clamped and unclamped cognition is a sense-making process exhibited by the autonomous activity of the individual and characterizes the core of creative sense-making. 
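The cycle of clamped and unclamped cognition can be pictured as a sequence of mode episodes. The Python sketch below is illustrative only: the Mode enum mirrors the three modes named in this section, while the per-interval coding of the painter example and the episodes helper are hypothetical and are not part of the EMC itself.

```python
from enum import Enum
from itertools import groupby


class Mode(Enum):
    CLAMPED = "clamped"
    FUNCTIONAL_UNCLAMP = "functional unclamp"
    INTERACTIONAL_UNCLAMP = "interactional unclamp"


def episodes(modes):
    """Collapse a per-interval mode sequence into (mode, length) runs."""
    return [(mode, len(list(run))) for mode, run in groupby(modes)]


# Hypothetical coding of the painter example, one entry per time interval.
session = [
    Mode.CLAMPED, Mode.CLAMPED, Mode.CLAMPED,                # painting fluidly
    Mode.FUNCTIONAL_UNCLAMP,                                 # simulating a brush stroke
    Mode.CLAMPED, Mode.CLAMPED,                              # painting again
    Mode.INTERACTIONAL_UNCLAMP, Mode.INTERACTIONAL_UNCLAMP,  # stepping back
]
for mode, length in episodes(session):
    print(f"{mode.value}: {length}")
```

Grouping consecutive intervals in this way makes the alternation between modes explicit, which is the fluctuation that the EMC treats as characteristic of the creative process.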

Creative Sense-Making Coding Technique

The CSM coding technique maps the cognitive modes of the EMC to the domain of human-AI co-creation, as shown in Table 1