BaseAgent: Computation and Algorithms

BaseAgent is the main component for algorithms and computation in ros_base. It owns the algorithm logic, the internal state, and the organization of the computation flow.

1. What the base class provides today

The current BaseAgent implementation is intentionally lightweight:

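from abc import ABC

# CustomLogger is ros_base's own logger class; its import is omitted in this excerpt.
class BaseAgent(ABC):
    def __init__(self, manager=None, *args, **kwargs):
        self._manager = manager
        self._logger = CustomLogger(self.__class__.__name__)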

The base class exposes these context properties and helpers by default:

self.logger
self.nodes
self.agents
self.timestamp
self.state
self.node_freq_hz
self.get_clock()

In addition to these, the class defines two methods: reset() is abstract and must be implemented by every subclass, while handle() is left for the subclass to define when needed.
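The contract above can be sketched with a simplified stand-in for the base class. This is an illustration under assumptions, not the ros_base source: the stand-in BaseAgent uses the standard logging module in place of CustomLogger, and RunningMeanAgent and IncompleteAgent are hypothetical names.

```python
import logging
from abc import ABC, abstractmethod


class BaseAgent(ABC):
    """Simplified stand-in for ros_base's BaseAgent (assumption, not the real class)."""

    def __init__(self, manager=None, *args, **kwargs):
        self._manager = manager
        # The real class uses CustomLogger; standard logging is substituted here.
        self._logger = logging.getLogger(self.__class__.__name__)

    @abstractmethod
    def reset(self):
        """Subclasses must restore their internal state here."""


class RunningMeanAgent(BaseAgent):
    """Hypothetical agent that keeps a running mean of scalar samples."""

    def __init__(self, manager=None):
        super().__init__(manager)
        self.reset()

    def reset(self):
        self.count = 0
        self.mean = 0.0

    def handle(self, sample):
        # Optional entry point: incremental update of the mean.
        self.count += 1
        self.mean += (sample - self.mean) / self.count
        return self.mean


class IncompleteAgent(BaseAgent):
    """Omits reset(), so Python refuses to instantiate it."""


agent = RunningMeanAgent()
agent.handle(2.0)
agent.handle(4.0)
print(agent.mean)  # 3.0

try:
    IncompleteAgent()
except TypeError as exc:
    print("cannot instantiate:", exc)
```

Because reset() is declared with abstractmethod, a subclass that forgets it fails at construction time rather than later in the run.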

2. What a good agent usually looks like

A good agent typically satisfies three conditions:

Clear inputs
Clear outputs
Computation logic that can be tested independently

For example:

class TrackerAgent(BaseAgent):
    def handle_initial_bbox(self, initial_bbox, img):
        ...

    def handle_sigma(self, img, depth, timestamp):
        ...

    def reset(self):
        ...

The important point is that these core methods consume regular Python or NumPy data, not the ROS pub/sub process itself.
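To make this concrete, here is a minimal sketch of a core method that takes only NumPy data. DepthProbeAgent and handle_depth are hypothetical and are not part of SigLoMa-VLM; the class is written as a plain class here, whereas in ros_base it would subclass BaseAgent.

```python
import numpy as np


class DepthProbeAgent:
    """Hypothetical agent; in ros_base it would subclass BaseAgent."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.last_depth = None

    def handle_depth(self, bbox, depth):
        """Return the median valid depth inside bbox = (x0, y0, x1, y1), or None."""
        x0, y0, x1, y1 = bbox
        patch = depth[y0:y1, x0:x1]
        # Keep only finite, positive readings.
        valid = patch[np.isfinite(patch) & (patch > 0)]
        if valid.size == 0:
            return None
        self.last_depth = float(np.median(valid))
        return self.last_depth


# A synthetic depth map is enough to exercise the method; no ROS is involved.
depth = np.full((4, 6), 2.0)
depth[1, 2] = np.nan
agent = DepthProbeAgent()
result = agent.handle_depth((1, 0, 4, 3), depth)
```

Since the inputs are ordinary arrays and tuples, the same call works in a notebook, a unit test, or a handler.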

3. Two typical kinds of agents

Task-oriented agents

Examples from SigLoMa-VLM:

QwenVLMAgent: cloud VLM calls, low frequency, high latency
TrackerAgent: image tracking and sigma-point generation
UIAgent: user interface and on-screen interaction during the task flow

These agents often:

Are not suitable to run on every frame
Need to be scheduled by the handler in specific states
Must avoid blocking the whole main loop
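One possible way to keep a slow call from blocking the main loop is to run it on a worker thread and poll for the result. The sketch below uses Python's standard concurrent.futures module; SlowQueryAgent, request(), poll(), and fake_vlm are hypothetical names, and this is not how QwenVLMAgent is implemented.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class SlowQueryAgent:
    """Hypothetical agent wrapping a slow call (stand-in for a cloud VLM request)."""

    def __init__(self, query_fn):
        self._query_fn = query_fn
        self._pool = ThreadPoolExecutor(max_workers=1)
        self.reset()

    def reset(self):
        # Forget any pending request; a running call still finishes in the background.
        self._future = None

    def request(self, *args):
        """Start a query unless one is already pending. Returns True if started."""
        if self._future is not None:
            return False
        self._future = self._pool.submit(self._query_fn, *args)
        return True

    def poll(self):
        """Return the result once it is ready, otherwise None. Never blocks."""
        if self._future is None or not self._future.done():
            return None
        result = self._future.result()  # re-raises if the query failed
        self._future = None
        return result


def fake_vlm(prompt):
    time.sleep(0.2)  # simulated network latency
    return f"answer to {prompt}"


agent = SlowQueryAgent(fake_vlm)
agent.request("where is the toy?")

ticks = 0
answer = None
while answer is None:  # stands in for the handler's main loop
    answer = agent.poll()
    ticks += 1
    time.sleep(0.01)
```

The handler would call request() only in the states that need a new answer and call poll() on every cycle, so the loop keeps its cadence while the query is in flight.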

Control-oriented agents

Examples from quad_deploy:

StandAgent
SigLoMaLocoAgent
SigLoMaNavAgent
SigLoMaTurnAgent

Most of them inherit from BaseRLAgent, with these characteristics:

High runtime frequency
Stable behavior required on every cycle
A decimation factor that sets how often model inference actually runs

4. BaseRLAgent shows a useful pattern for high-frequency agents

BaseRLAgent in quad_deploy is a strong reference:

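def handle(self):
    # Run model inference only when timestamp is a multiple of decimation.
    if self.timestamp % self.decimation == 0:
        action, p_gains, d_gains, done = self.step()
    else:
        # Off-cycle call: no new inference result.
        action, p_gains, d_gains, done = None, None, None, None
    # send_action() is called on every cycle, including off-cycle ones.
    self.robot.send_action(action, p_gains, d_gains)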

This shows that inside the ros_base architecture, an agent does not need to perform a full recomputation every time handle() runs. Common strategies include:

Running inference every N frames
Doing only lightweight post-processing at high frequency
Decoupling heavy computation from control output
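The first strategy, running inference every N frames, can be sketched as follows. This is a minimal illustration and not BaseRLAgent's code: DecimatedPolicy, policy_fn, and the internal tick counter are hypothetical. It is also a deliberate variation on the excerpt above, which passes None between inference steps, whereas this version holds the last computed action.

```python
import numpy as np


class DecimatedPolicy:
    """Hypothetical wrapper: run the policy every `decimation` calls, reuse the result in between."""

    def __init__(self, policy_fn, decimation):
        if decimation < 1:
            raise ValueError("decimation must be >= 1")
        self.policy_fn = policy_fn
        self.decimation = decimation
        self.reset()

    def reset(self):
        self.tick = 0
        self.last_action = None
        self.inference_calls = 0

    def handle(self, obs):
        # Heavy step: only on every `decimation`-th call.
        if self.tick % self.decimation == 0:
            self.last_action = self.policy_fn(obs)
            self.inference_calls += 1
        self.tick += 1
        # Light step: every call returns the most recent action.
        return self.last_action


# A trivial "policy" stands in for model inference.
policy = DecimatedPolicy(lambda obs: 0.5 * obs, decimation=4)
actions = [policy.handle(np.array([float(t)])) for t in range(8)]
```

Over the eight calls the policy function runs twice, at calls 0 and 4, and the remaining calls reuse the stored action.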

5. About context access

Although an agent can access self.nodes and self.agents directly, it is better to treat that as a wiring convenience rather than the core design style.

A more stable pattern is:

Read data in the handler
Pass only the required data into the agent as arguments
Let the agent focus on the current task

For example:

img = self.camera.img
bbox = self.QVLM.handle_bbox(img_np=img, curr_target="toy", multi=False)

instead of making deep agent methods fetch self.nodes["camera"] everywhere.
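A minimal sketch of this wiring, assuming a handler that owns the node references, might look like the following. FakeCamera, BrightSpotAgent, and Handler are hypothetical stand-ins and not ros_base classes.

```python
import numpy as np


class FakeCamera:
    """Stand-in for a camera node; it only exposes the latest image."""

    def __init__(self, img):
        self.img = img


class BrightSpotAgent:
    """Hypothetical agent: finds the brightest pixel in an image array."""

    def reset(self):
        pass  # no internal state to clear

    def handle_image(self, img_np):
        # The agent sees only the array, never the node it came from.
        row, col = np.unravel_index(np.argmax(img_np), img_np.shape)
        return int(row), int(col)


class Handler:
    """Hypothetical handler: reads from nodes and passes plain data to agents."""

    def __init__(self, camera, agent):
        self.camera = camera
        self.agent = agent

    def step(self):
        img = self.camera.img  # read in the handler
        return self.agent.handle_image(img_np=img)  # pass only what is needed


img = np.zeros((3, 4))
img[2, 1] = 1.0
handler = Handler(FakeCamera(img), BrightSpotAgent())
print(handler.step())  # (2, 1)
```

With this split, swapping the camera for a recorded image or a synthetic array requires no change to the agent.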

6. Suggestions for offline testing

If you want an agent to be easier to test without ROS:

Limit constructors to model loading and lightweight configuration
Let core methods accept regular data types whenever possible
Do not hard-code topic names, QoS policies, or message types into the agent

With that structure, the agent can still be instantiated directly even without a manager, while continuing to use self._logger for logging.
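As a rough sketch of such an offline test, the example below builds an agent with no manager and checks its behavior with plain assertions. ThresholdAgent and its threshold parameter are hypothetical, standard logging stands in for CustomLogger, and the class is written as a plain class although in ros_base it would subclass BaseAgent.

```python
import logging


class ThresholdAgent:
    """Hypothetical agent whose constructor takes only lightweight configuration."""

    def __init__(self, manager=None, threshold=0.5):
        self._manager = manager
        # Standard logging stands in for ros_base's CustomLogger.
        self._logger = logging.getLogger(self.__class__.__name__)
        self.threshold = threshold
        self.reset()

    def reset(self):
        self.hits = 0

    def handle_score(self, score):
        """Plain float in, plain bool out; no topics or message types involved."""
        accepted = score >= self.threshold
        if accepted:
            self.hits += 1
        self._logger.debug("score=%.2f accepted=%s", score, accepted)
        return accepted


def test_runs_without_manager():
    agent = ThresholdAgent(threshold=0.7)  # manager defaults to None
    assert agent.handle_score(0.6) is False
    assert agent.handle_score(0.9) is True
    assert agent.hits == 1
    agent.reset()
    assert agent.hits == 0


test_runs_without_manager()
```

The test function can be called directly, as here, or collected by a test runner such as pytest.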