Artificial Sentience: Tonic Dopamine as the Engine of Motivation: A Computational Exploration

Introduction

Motivation is often conflated with reward, pleasure, or learning. Yet everyday experience — and clinical observation — shows that these concepts can dissociate. One can desperately want without enjoying, explore without committing, or remain apathetic despite available rewards.

In neuroscience, these distinctions are often discussed in terms of dopamine, but dopamine itself is not a unitary signal. In particular, tonic dopamine (DATonic) has been proposed to regulate the energization and vigor of behavior, rather than learning or reward prediction per se.

In this post, I present a minimal computational model designed to isolate the role of tonic dopamine in motivation, curiosity, and goal-directed behavior. Importantly, this model does not include cues or phasic dopamine, and therefore does not model addiction. Instead, it focuses on how different motivational regimes emerge from the interaction between tonic dopamine and hedonic adaptation.

The next post will build on this foundation by introducing phasic dopamine, cue sensitization, and addiction-like dynamics.

Conceptual framing

The guiding assumptions of the model are:

• DATonic modulates motivation globally, scaling the probability that actions are selected and executed.

• DATonic also modulates curiosity, determining whether the agent explores or remains inert.

• Hedonic value is interoceptive, tied to a target that produces pleasure when reached.

• Hedonic adaptation causes the subjective value of the target to decay with repeated consumption.

• No cue is present at the target location, ensuring that motivation remains hedonic rather than cue-driven.

Within an active inference perspective, DATonic can be interpreted as a form of policy precision or gain: higher values increase behavioral confidence and vigor, while lower values produce indecision and apathy.
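One way to make the precision reading concrete is a softmax over action values in which DATonic plays the role of an inverse temperature. The sketch below is an illustration under that assumption, not the model's actual action-selection code; the function names and the three action values are hypothetical.

```python
import math

def action_probabilities(values, da_tonic):
    """Softmax over action values; da_tonic acts as precision (assumed form)."""
    scaled = [da_tonic * v for v in values]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; higher means a less decisive policy."""
    return -sum(p * math.log(p) for p in probs if p > 0)

values = [1.0, 0.5, 0.0]  # hypothetical values for three candidate actions
low = action_probabilities(values, da_tonic=0.1)   # near-uniform policy
high = action_probabilities(values, da_tonic=5.0)  # sharply peaked policy
print(entropy(low), entropy(high))
```

With low DATonic the distribution is close to uniform, so no action is clearly preferred; with high DATonic nearly all probability mass falls on the best action.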

The simulated environment

The agent operates in a simple two-dimensional environment, introduced in a previous post:

• A target produces interoceptive hedonic feedback when reached.

• A barrier blocks direct access to the target, with a narrow opening.

• The agent must explore to discover the opening before it can repeatedly reach the target.

Crucially:

• The target does not move.

• There is no explicit reward prediction error.

• There is no cue colocated with the target.

This ensures that all observed behaviors emerge from motivation and adaptation, not learning or habit formation.
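The geometry described above can be sketched on a small grid. Everything in this block is an assumption made for illustration: the grid size, the barrier column, the position of the opening, and the start and target cells are hypothetical, and the original environment may well be continuous. The breadth-first search is only a way to check the layout; it is not how the agent finds the opening.

```python
from collections import deque

WIDTH, HEIGHT = 9, 7           # hypothetical grid size
BARRIER_X, OPENING_Y = 4, 5    # hypothetical barrier column and gap row
START, TARGET = (1, 3), (7, 3)

def is_blocked(x, y):
    """The barrier fills one column except for a single opening."""
    return x == BARRIER_X and y != OPENING_Y

def shortest_path_length(start, goal):
    """Breadth-first search over free cells; returns steps, or None if unreachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (x, y), steps = queue.popleft()
        if (x, y) == goal:
            return steps
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            inside = 0 <= nx < WIDTH and 0 <= ny < HEIGHT
            if inside and not is_blocked(nx, ny) and (nx, ny) not in seen:
                seen.add((nx, ny))
                queue.append(((nx, ny), steps + 1))
    return None

straight_line = abs(TARGET[0] - START[0]) + abs(TARGET[1] - START[1])
detour = shortest_path_length(START, TARGET)
print(straight_line, detour)
```

In this layout the route through the opening is longer than the direct distance, which is why an agent that only heads straight for the target stalls at the barrier.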

Two key parameters

After running extensive simulations, it became clear that motivational regimes are best characterized not by DATonic alone, but by the interaction of two parameters:

• DATonic level after target detection

This determines how strongly motivation and curiosity are energized once the agent knows the target exists.

• Hedonic adaptation rate

Implemented as an exponential decay of target gain with repeated visits; the adaptation rate multiplies the number of target encounters in the exponent.

Two simple equations characterize the motivational regimes:

a) Subjective Motivation is proportional to (distance to target) × (DATonic) × (Exteroceptive Gain of Target) × (Interoceptive Subjective Hedonic Value of target)

b) Interoceptive Subjective Hedonic Value of target = exp(-(hedonic adaptation decay rate) × (number of visits to target))

Together, these parameters define a two-dimensional motivational phase space.
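Equations (a) and (b) translate directly into a few lines of Python. This is a minimal sketch: the function and argument names are mine, and the proportionality constant in (a) is assumed to be 1.

```python
import math

def hedonic_value(decay_rate, visits):
    """Equation (b): subjective hedonic value after a number of target visits."""
    return math.exp(-decay_rate * visits)

def subjective_motivation(distance, da_tonic, exteroceptive_gain, decay_rate, visits):
    """Equation (a), with the proportionality constant assumed to be 1."""
    return distance * da_tonic * exteroceptive_gain * hedonic_value(decay_rate, visits)

# Hypothetical parameter values, chosen only to show the trend.
for visits in range(4):
    m = subjective_motivation(distance=1.0, da_tonic=1.0, exteroceptive_gain=1.0,
                              decay_rate=0.5, visits=visits)
    print(visits, round(m, 3))
```

Because the terms multiply, DATonic rescales motivation at every visit count, while the adaptation rate sets how quickly it falls off with repeated visits.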

Emergent motivational regimes

Four distinct behavioral regimes emerge naturally from the simulations; three of them, (a) to (c), are described below.

(a) Low DATonic, any adaptation rate

Observed behavior:

The agent shows minimal exploration, fails to overcome the barrier, and never reaches the target. Hedonic value remains constant because the target is never reached, so no adaptation occurs.

Phenomenological interpretation:

This regime closely resembles apathy, as seen in Parkinson’s disease, severe depression, or negative symptoms of schizophrenia.

Importantly, the failure here is not due to lack of reward value, but to insufficient motivational energy to act upon it.

Figure 1. Apathy regime.

Caption: Low DATonic produces minimal exploration and failure to reach the target, regardless of hedonic adaptation rate. Motivation collapses before goal-directed behavior can emerge.

(b) Medium DATonic, medium adaptation rate

Observed behavior:

The agent explores, finds the opening, reaches the target a few times, and then disengages as hedonic value decays.

Phenomenological interpretation:

This regime corresponds to healthy goal-directed behavior: curiosity-driven exploration, successful pursuit, and flexible disengagement once interest wanes.

This regime reflects a balance between motivational drive and hedonic adaptation.

Figure 2. Healthy motivation regime.

Caption: With moderate DATonic and moderate hedonic adaptation, the agent reaches the target a few times before disengaging as subjective value diminishes.

(c) Medium DATonic, low adaptation rate

Observed behavior:

The agent repeatedly reaches the target. Hedonic value diminishes slowly, sustaining prolonged engagement.

Phenomenological interpretation:

This regime resembles compulsive behavior, such as that seen in mania or stimulant intoxication. Motivation remains high despite diminishing novelty.

Crucially, this is not addiction. There are no cues, no sensitization, and no persistence in the absence of hedonic value.

Figure 3. Compulsion without addiction.

Caption: Slow hedonic adaptation combined with sufficient DATonic sustains repeated target engagement, producing compulsive-like behavior without cue dependence.

A unifying view: motivation lies in a Goldilocks zone

Taken together, these simulations highlight a key principle:

Motivation is optimal within a narrow range of tonic dopamine.

• Too little → apathy

• Too much → instability or compulsion

• Balanced → flexible, goal-directed behavior

Importantly, hedonic adaptation modulates whether motivation extinguishes or persists, but it cannot generate addiction on its own.
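The interaction between the two parameters can be illustrated with a toy calculation. The sketch below assumes that the agent keeps returning to the target as long as its motivation stays at or above a fixed threshold; that disengagement rule, the threshold of 0.2, and all parameter values are hypothetical. Distance and exteroceptive gain are held at 1 so that only DATonic and the adaptation rate vary.

```python
import math

THRESHOLD = 0.2  # hypothetical motivation level below which the agent disengages

def motivation(da_tonic, decay_rate, visits):
    """Motivation with distance and exteroceptive gain fixed at 1 (assumption)."""
    return da_tonic * math.exp(-decay_rate * visits)

def visits_before_disengaging(da_tonic, decay_rate, max_visits=1000):
    """Count target visits until motivation drops below THRESHOLD."""
    visits = 0
    while visits < max_visits and motivation(da_tonic, decay_rate, visits) >= THRESHOLD:
        visits += 1
    return visits

apathy = visits_before_disengaging(da_tonic=0.1, decay_rate=0.5)       # low DATonic
healthy = visits_before_disengaging(da_tonic=1.0, decay_rate=0.5)      # medium / medium
compulsive = visits_before_disengaging(da_tonic=1.0, decay_rate=0.05)  # medium / low
print(apathy, healthy, compulsive)
```

Under these assumptions, low DATonic never clears the threshold, moderate adaptation yields a handful of visits, and slow adaptation yields many more. In every case the visit count is finite, consistent with the point that adaptation alone does not produce persistence once hedonic value is gone.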

What this model does not explain

This model deliberately excludes:

• cue-driven behavior,

• sensitization,

• craving,

• persistence in the absence of reward.

As a result, it cannot explain addiction.

This limitation is intentional.

Looking ahead: phasic dopamine and cue sensitization

None of the regimes described above explain why neutral cues can acquire overwhelming motivational power, or why seeking persists even when pleasure fades.

To address those phenomena, we must introduce phasic dopamine, reward prediction errors, and cue sensitization.

That is the focus of the next post.

Conclusion

By isolating tonic dopamine and hedonic adaptation, this model demonstrates how diverse motivational phenotypes can emerge without learning, cues, or addiction. It provides a computational bridge between dopamine theory, active inference, and clinical phenomenology — and sets the stage for understanding how phasic dopamine reshapes motivation in far more pathological ways.

Monday, December 29, 2025