Presentation of the dissertation project and the research results to date (Kleesiek)

Introduction

Is it possible for an autonomous agent to infer sensorimotor laws through interaction with its environment and to develop cognitive behaviour? We investigate this question by combining approaches from the fields of developmental robotics, computational neuroscience, neurophysiology, and psychology.

In contrast to many traditional views resting on the idea that the brain stores an internal representation of the world, O'Regan and Noë propose a theory in which the outside world itself serves as an external memory [4]. Perceptual experience results from the learned mastery of what they call sensorimotor contingencies (SMCs), which combine physical properties of the environment with those stemming from the particular sensor system. This approach naturally accounts for the differences in the perceived quality of sensory experience across modalities and stresses the necessity of an interplay between sensory perception and motor action, which had already been suggested by von Helmholtz [1].

Creatures are not born endowed with a mature set of SMCs. These need to develop, and they are shaped by the given sensor and actuator properties in a lifelong learning process. Similarly, the field of developmental robotics deals with the progressive, incremental development of proficiencies in machines. Instead of carrying out a particular predefined task, the robot discovers its perceptual, cognitive, and behavioural capabilities through actions based on its own physical morphology and the dynamic structure of its environment. To achieve this goal, exploratory activity as well as some kind of motivation is crucial.

Materials & Methods

Our system architecture is (so far partially) implemented on the Robotino® platform (Festo Didactic, Germany, http://www.festo-didactic.com/). This robot is equipped with nine infrared distance sensors, a colour webcam, and two microphones. Movements are realized via an omnidirectional drive consisting of three modules. Monitoring their motor voltages thus results in a system with four modalities.

Approaches we have considered

We considered several approaches to technically realize a system capable of developing SMCs. Amongst others, we took a closer look at Predictive State Representations (PSRs) [2], which rely on core tests (a sequence of actions followed by an observation) to model a dynamical system. The probability of success for each core test is maintained and becomes a feature for the representation of the world that can be used to predict future observations. This intuitive approach seems to match the description of SMCs very nicely. However, there are several unsolved problems. PSRs are basically descriptive and do not generate actions (planning problem). Furthermore, the questions of how to identify core tests (discovery problem) and how to update their probability distribution (learning problem) are not solved satisfactorily.
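
To make the core-test idea concrete, the following is a minimal, hypothetical sketch (not taken from [2] or our implementation) of how a PSR state could be maintained as a vector of empirical core-test success probabilities; all class and variable names are illustrative:

```python
# Hypothetical sketch of a Predictive State Representation (PSR):
# a core test is an action sequence plus an expected observation, and
# the state is the vector of estimated success probabilities. How core
# tests are discovered and updated is exactly the open problem noted above.

class CoreTest:
    def __init__(self, actions, observation):
        self.actions = actions          # e.g. ["forward", "forward"]
        self.observation = observation  # e.g. "obstacle_near"
        self.successes = 0
        self.trials = 0

    def update(self, outcome_matches):
        """Update the empirical success probability after running the test."""
        self.trials += 1
        if outcome_matches:
            self.successes += 1

    @property
    def probability(self):
        # Uninformative prior of 0.5 before the test has ever been run.
        return self.successes / self.trials if self.trials else 0.5

def psr_state(core_tests):
    """The PSR state: one success probability per core test."""
    return [t.probability for t in core_tests]

tests = [CoreTest(["forward"], "obstacle_near"),
         CoreTest(["backward", "forward"], "obstacle_far")]
tests[0].update(True)
tests[0].update(False)
print(psr_state(tests))  # [0.5, 0.5]
```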

Another method we examined uses symbolic regression. Schmidt and Lipson used an evolutionary algorithm to derive physical laws purely from observations of a system [7]. Unfortunately, this approach does not seem feasible for an online system.

System architecture

The effect of the robot actuators on the external world produces a sensor reading which in turn can be processed internally by the robot. The system architecture is composed of strongly interacting modules. The prediction machine generates a prediction of the sensor values of the next time step associated with an action. For now an action is defined as a higher-level motor primitive, e.g. “move forward”, but it will be replaced by a low-level one, i.e. motor voltage, in the long run. The action selection module can use the prediction to choose a potential action, e.g. using ε-greedy or more sophisticated reinforcement techniques [8]. Once an action has been chosen and a corresponding action-perception loop has been carried out, the error between the prediction and the actual sensor reading can be computed and subsequently used to improve the prediction capabilities of the prediction machine. Besides that, the error contributes to the overall “well being” of the agent.
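
The action-perception loop can be sketched as follows. This is an illustrative simplification, not the project's actual code; the names (`ACTIONS`, `epsilon_greedy`, `prediction_error`) and the value of ε are assumptions:

```python
import random

# Illustrative sketch of the action selection step: a value function
# (here a stub) scores each action, epsilon-greedy selection trades
# exploration against exploitation, and the squared prediction error
# serves as the learning signal fed back to the prediction machine.

ACTIONS = ["forward", "backward", "left", "right", "stop"]
EPSILON = 0.1  # assumed exploration rate

def epsilon_greedy(value_of, actions, epsilon=EPSILON):
    """With probability epsilon pick a random action, else the best-valued one."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=value_of)

def prediction_error(predicted, observed):
    """Mean squared error between predicted and actual sensor readings."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)

# One step of the loop: choose an action (pure exploration here), act
# (omitted), then compare prediction against the observed sensor values.
chosen = epsilon_greedy(lambda a: 0.0, ACTIONS, epsilon=1.0)
```

In the full loop the error would be used both to train the prediction machine and as a contribution to the “well being” signal described below.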

This is realized by relating the inaccuracy of the prediction to novelty and curiosity (cf. [5, 6]). A high error indicates that the situation just encountered is not (well) known. Furthermore, looking at the rate of change of the error over a sequence of similar action-perceptions yields an estimate of either learning progress or frustration, which in turn can be used to influence the action selection. The “well being” can be augmented with a variety of internal and external rewards or punishments, e.g. for low power consumption, boredom, or a preferred zone like a “petting zoo”. From a neuroscience point of view, the internal state of our robot is strongly related to the dopamine cell system of the midbrain, which is involved in the control of movements, the signalling of (reward) prediction error, motivation, and cognition [3].
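
One simple way to operationalize this (an illustrative sketch under our own assumptions, not the formulation of [5] or [6]) is to read learning progress as the negative trend of the recent error history and to fold it into a scalar “well being” together with external rewards and punishments:

```python
# Illustrative intrinsic-motivation signal: a falling prediction error
# across similar situations reads as learning progress, a persistently
# high or rising error as frustration. Window size and the additive
# combination in well_being are assumptions for the sake of the sketch.

def learning_progress(errors, window=3):
    """Drop in mean error between the two most recent windows (positive = progress)."""
    if len(errors) < 2 * window:
        return 0.0  # not enough history yet
    recent = sum(errors[-window:]) / window
    earlier = sum(errors[-2 * window:-window]) / window
    return earlier - recent

def well_being(progress, external_reward=0.0, boredom_penalty=0.0):
    """Combine intrinsic progress with external rewards and punishments."""
    return progress + external_reward - boredom_penalty

print(learning_progress([1.0, 1.0, 1.0, 0.5, 0.5, 0.5]))  # 0.5
```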

Currently the prediction machine is realized as a recurrent neural network (RNN). However, many other methods, e.g. an SVM, could be used to learn a relation between actions and sensor readings. Another approach we are considering as a basis for the prediction machine relies on generic sensory coding principles, e.g. sparse coding, temporal coherence, and predictability. Identical computational units, each optimizing the same objective function, are combined in a hierarchical manner. Based solely on the statistical properties of the sensory modality and their position in the hierarchy (and thus differing inputs), the information processing leads to the detection of different invariances depending on the respective hierarchy level. In fact, Wyss et al. showed that a (simulated) robot exposed to a continuous visual stream develops receptive fields with properties matching the ventral visual system, and at the highest level of the hierarchy even exhibits properties similar to place fields observed in the entorhinal cortex [9] (despite these convincing findings, this approach might not be suitable for a fast online procedure). It remains to be determined which method and which parameterization is most suitable for our model.
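
As a rough illustration of the RNN variant, the following minimal Elman-style network predicts the next sensor reading from the current reading plus a one-hot action. The layer sizes match the robot's nine infrared sensors and five motor primitives, but the training rule (a delta rule on the output weights only, instead of full backpropagation through time) is a deliberate simplification and not our actual implementation:

```python
import numpy as np

# Minimal stand-in for the prediction machine: hidden state carries
# temporal context, predict() maps (sensors, action) -> next sensors,
# train_step() nudges only the output weights with the prediction error.

rng = np.random.default_rng(0)
N_SENSORS, N_ACTIONS, N_HIDDEN = 9, 5, 16  # sizes assumed from the setup

W_in = rng.normal(0.0, 0.1, (N_HIDDEN, N_SENSORS + N_ACTIONS))
W_rec = rng.normal(0.0, 0.1, (N_HIDDEN, N_HIDDEN))
W_out = rng.normal(0.0, 0.1, (N_SENSORS, N_HIDDEN))
hidden = np.zeros(N_HIDDEN)

def predict(sensors, action_onehot):
    """One forward step: update the hidden state, emit predicted sensor values."""
    global hidden
    x = np.concatenate([sensors, action_onehot])
    hidden = np.tanh(W_in @ x + W_rec @ hidden)
    return W_out @ hidden

def train_step(prediction, target, lr=0.01):
    """Delta rule on the output weights; returns the mean squared error."""
    global W_out
    err = target - prediction
    W_out += lr * np.outer(err, hidden)
    return float(np.mean(err ** 2))
```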

Results

Preliminary results from a simplified system are promising. Currently, only the infrared sensors are considered, and the actions of the robot are restricted to forward, backward, left, right, and stop. Experimenting with the robot in a simple environment (a box of ≈ 1 m², no obstacles) and narrowing down the possible actions even further (only forward and backward) leads to a decrease of the prediction error to satisfactorily low values after a few iterations. However, when all possible actions are permitted, this effect can no longer be observed. Instead, after an initial decrease the error rises again and the prediction machine is not able to capture the dynamics of the system.

Discussion & Outlook

In contrast to many other approaches, no goal for our robot is predefined. Instead, the error of the prediction of the sensor readings and the evolution of the resulting behaviour can be used to monitor the success of the system. Initial results show that our system struggles when the environmental complexity increases. A potential reason for this failure could be that the increasing number of states exceeds the capacity of the RNN. However, a major problem seems to be the slip of the robot’s wheels, which leads to a substantial rotation. By construction, the agent is not able to compensate for this. Furthermore, it is not able to distinguish between a rotation of the world and a self-rotation. Thinking about the causes and consequences of this undesired rotation actually leads us to the heart of the problem we want to solve. Mastery of SMCs should be a suitable means to compensate for the slip and even represent a generic approach for various kinds of surfaces (of course, for that the robot must be capable of performing a rotation). Hence, we are currently focusing on this sub-problem.

In the long run, we plan to extend our approach to further modalities. This will help to discriminate between ambiguous situations, and the question of early versus late integration can be addressed. Furthermore, the construction of sensoritopic maps, e.g. with an information-theoretic approach, could help to investigate the relation between stimuli of different modalities. An artificial deprivation or substitution of senses during the course of an experiment is also conceivable, as is an exchange of sensorimotor laws between different simulations. Beyond the active (by the robot itself) and passive (by the experimenter) manipulation of the environment, we can look for object affordances, permanence, recognition, and identification.

Connections to other CINACS projects

This project is connected to several research projects in China and Germany within the CINACS framework. In particular, the interconnection with the project of Mario Maiworm (I.1.5.2) should be mentioned. Mario Maiworm utilizes probabilistic models to a) predict (human) behaviour and b) integrate sensory parameters by weighting them by their reliability. The results obtained in his project can be related to our robot system in two ways. First, his models can be used to assess the “human-like” behaviour of our model (which can be seen as a sort of benchmark). Second, they can be integrated into our specific framework and (after the parameters have been adjusted) used to weight sensory information across different modalities.
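
A common formalization of such reliability weighting is inverse-variance (maximum-likelihood) cue combination; whether Maiworm's models take exactly this form is an assumption, and the sketch below is only meant to illustrate the principle:

```python
# Reliability-weighted cue integration: each modality's estimate is
# weighted by the inverse of its variance, so more reliable cues
# dominate; the fused variance is never larger than the best cue's.

def integrate(estimates, variances):
    """Fuse per-modality estimates via inverse-variance weighting."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fused = sum(w * e for w, e in zip(weights, estimates)) / total
    fused_variance = 1.0 / total
    return fused, fused_variance

# Two equally reliable cues: the fused estimate is their midpoint.
print(integrate([0.0, 1.0], [1.0, 1.0]))  # (0.5, 0.5)
```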

Furthermore, the project of Ning Chen (I.1.3.2), dealing with multimodal data mining, shares machine-learning techniques with a common methodological basis, which led to mutual inspiration and discussions during the 2009 CINACS summer school in Beijing. Although we do not want to pursue the canonical approach commonly used in technical systems (feature extraction followed by early or late sensory fusion), as found in the project of Ning Chen, the results of the two methods can be compared, and some algorithms from her work, e.g. Bayesian models for time series analysis, could be valuable for the interpretation of our approach.

Since this project is a continuation of Tobias Kringe’s project (I.1.1.2.), it shares the same relations with that project (see description there).

References

[1] H. v. Helmholtz. Handbuch der physiologischen Optik, Voss: Leipzig 1867.

[2] M. Littman, R. Sutton and S. Singh. Predictive representations of state. In: Advances in Neural Information Processing Systems 14, pp. 1555-1561. MIT Press: Cambridge, MA 2002.

[3] M. Matsumoto and O. Hikosaka. Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature, 459: 837-841, 2009.

[4] J. K. O'Regan and A. Noë. A sensorimotor account of vision and visual consciousness. Behav Brain Sci, 24: 939-973, 2001.

[5] P. Y. Oudeyer, F. Kaplan, et al. Intrinsic motivation systems for autonomous mental development. IEEE Transactions on Evolutionary Computation, 11: 265-286, 2007.

[6] J. Schmidhuber. Curious model-building control systems. Proc. IJCNN 1991 - IEEE International Joint Conference on Neural Networks, 2: 1458-1463, Singapore, Nov 18-21, 1991.

[7] M. Schmidt and H. Lipson. Distilling free-form natural laws from experimental data. Science, 324: 81-85, 2009.

[8] R. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press: Cambridge 1998.

[9] R. Wyss, P. König, et al. A model of the ventral visual system based on temporal stability and local memory. PLoS Biol, 4: e120, 2006.