Original Articles

On the design of neuro-controllers for individual and social learning behaviour in autonomous robots: an evolutionary approach

Pages 211-230 | Published online: 20 May 2008

Figures & data

Figure 1. Depiction of the (a) individual and (b) social learning tasks. In (a), R indicates the robot's starting position at the beginning of each trial T_i, LS indicates the position of the light source, and S indicates the emission of a tone. Before the emission of the tone, a successful robot should perform phototaxis (see continuous arrows). After the emission of the tone, the robot should perform antiphototaxis (see dashed arrows). In (b), L and D refer to the starting positions of the learner and the demonstrator, respectively. During the first trial T_1, learner and demonstrator are placed close to each other. In the second trial T_2, the demonstrator is removed and a successful learner should imitate the behaviour shown by the demonstrator in the previous trial by performing either phototaxis or antiphototaxis.

Figure 2. (a) The simulated robot. IR_i with i ∈ [0, 7] are the infrared proximity sensors; AL_i with i ∈ [0, 1] are the ambient light sensors; SS is the sound sensor; Ml is the left motor and Mr the right motor. (b) Network architecture. Only the efferent connections for the first node of each layer are drawn.

Figure 3. First evolutionary phase. Each genotype is evaluated for 120 trials: 40 trials in the single-robot case (first set), 40 trials in the two-robot case (second set), and 40 trials in the three-robot case (third set). S indicates the trials in which a tone is emitted. In the trials that precede the emission of the tone, the robots are rewarded for performing phototaxis; these trials are marked with the + sign. In the trials that follow the emission of the tone, the robots are rewarded for performing antiphototaxis; these trials are marked with the − sign.

Figure 4. Second evolutionary phase. Each genotype undergoes a first set of 32 trials in which it is evaluated on the individual learning task (see the left side of the picture, and the caption of Figure 3 for details). If the genotype successfully completes 25 out of the 32 individual learning trials, it is allowed to undergo a subsequent set of trials (16×2 trials for phototaxis and 16×2 trials for antiphototaxis) in which it is evaluated on the social learning task (see the right side of the picture). The demo-trial is the one in which the demonstrator and the learner are placed in the arena close to each other. The copy-trial is the one in which the learner (alone) is required to imitate the behaviour shown by the demonstrator in the demo-trial. Before the demo-trial, the demonstrator is taught what action to display to the learner.
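The gating described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, "25 out of 32" is read as "at least 25", "16×2" is read as 16 demo-trial/copy-trial pairs per behaviour, and the ordering of the pairs is an assumption.

```python
INDIVIDUAL_TRIALS = 32           # individual learning trials (from the caption)
PASS_THRESHOLD = 25              # assumed to mean "at least 25 successes"
PAIRS_PER_BEHAVIOUR = 16         # assumed reading of "16x2": 16 demo/copy pairs


def social_trial_schedule():
    """Build the list of social learning trials as (kind, behaviour) tuples.

    The interleaving (demo then copy) and the behaviour order are assumptions.
    """
    schedule = []
    for behaviour in ("phototaxis", "antiphototaxis"):
        for _ in range(PAIRS_PER_BEHAVIOUR):
            schedule.append(("demo", behaviour))  # learner next to demonstrator
            schedule.append(("copy", behaviour))  # learner alone, must imitate
    return schedule


def plan_evaluation(individual_successes):
    """Return the social trials a genotype is allowed to undergo.

    individual_successes: list of 32 booleans, one per individual trial.
    An empty list means the genotype did not pass the individual gate.
    """
    if len(individual_successes) != INDIVIDUAL_TRIALS:
        raise ValueError("expected one outcome per individual learning trial")
    if sum(individual_successes) >= PASS_THRESHOLD:
        return social_trial_schedule()
    return []
```

Under these assumptions, a genotype that passes the gate faces 64 social learning trials in total (32 per behaviour), while a genotype with 24 or fewer successes is scored on the individual task only.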

Figure 5. (a) First evolutionary phase: fitness of the best genotypes of each generation of the best three out of 12 evolutionary runs. (b) Second evolutionary phase: fitness of the best genotypes in each generation of the best four out of 12 evolutionary runs.


Table 1. Results of post-evaluation tests assessing the individual learning capabilities of the best four evolved agents (I1, I2, I3, and I4), taken from four different evolutionary runs (E1, E2, E3, and E4; see ). The agents are evaluated in condition F (the light is placed in front of the agent) and condition B (the light is placed behind the agent). Each condition refers to 8 different groups of 8000 trials in which the emission of sound is systematically varied from trial 1 to trial 8. For each condition, the table shows: (1) the success rate (S); (2) the rate of error type E_1 (the agent does phototaxis instead of antiphototaxis); (3) the rate of error type E_2 (the agent does antiphototaxis instead of phototaxis).
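As an illustration of how the three rates in Table 1 relate to one another, the sketch below tallies S, E_1, and E_2 from a list of (expected, observed) behaviour pairs. The function names and the data layout are hypothetical, and the "other" category for outcomes that match neither error type is an assumption not mentioned in the caption.

```python
from collections import Counter


def classify(expected, observed):
    """Label one trial as a success or as one of the two error types."""
    if observed == expected:
        return "S"
    if expected == "antiphototaxis" and observed == "phototaxis":
        return "E_1"  # phototaxis instead of antiphototaxis
    if expected == "phototaxis" and observed == "antiphototaxis":
        return "E_2"  # antiphototaxis instead of phototaxis
    return "other"    # assumption: any outcome not covered by the caption


def rates(trials):
    """Return the fraction of trials falling under S, E_1, and E_2."""
    if not trials:
        raise ValueError("at least one trial is required")
    counts = Counter(classify(expected, observed) for expected, observed in trials)
    return {label: counts[label] / len(trials) for label in ("S", "E_1", "E_2")}
```

When every trial ends in either phototaxis or antiphototaxis, the three rates sum to 1.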

Table 2. Results of post-evaluation tests, limited to the antiphototaxis response, assessing the social learning capabilities of the best four evolved agents (I1, I2, I3, and I4), taken from four different evolutionary runs (E1, E2, E3, and E4; see ). The agents are evaluated in four different starting conditions: FL, FR, BL, and BR. F (front) and B (behind) refer to the position of the light with respect to the heading of the agents. L (the demonstrator is on the left side of the learner) and R (the demonstrator is on the right side of the learner) refer to the relative positions of the agents. Each condition refers to 2000 post-evaluation trials. For each condition, the table shows: (1) the success rate (S); (2) the rate of error type E_3 (the learner does not follow the demonstrator in the demo-trial); (3) the rate of error type E_4 (the demonstrator does not show the correct response to the light in the demo-trial); (4) the rate of error type E_5 (the learner does not replicate in the copy-trials what was previously shown by the demonstrator).

Figure 6. The graphs refer to post-evaluation tests in which the learner, controlled by genotype I2, is evaluated in copy-trials following demo-trials of different lengths (from 1 time step to 30 time steps) in which the demonstrator performed phototaxis (graph (a)) and antiphototaxis (graph (b)). The graphs show the learner's performance (i.e. success rate) in the copy-trials. For phototaxis and antiphototaxis, the learner is evaluated in four different starting conditions: FL, FR, BL, and BR (see the caption of Table 2 for details).

Figure 7. The graphs refer to post-evaluation tests in which the learner, controlled by genotype I2, is evaluated in copy-trials following demo-trials of different lengths (from 1 time step to 30 time steps) in which the demonstrator performed phototaxis (graph (a)) and antiphototaxis (graph (b)). The graphs show the average frequency of forward and backward movements of the learner in the corresponding copy-trials. For phototaxis and antiphototaxis, the learner is evaluated in four different starting conditions: FL, FR, BL, and BR (see the caption of Table 2 for details).
