Original Articles

Specialization and generalization of robot behaviour in swarm energy foraging

Pages 131-152 | Received 02 May 2011, Accepted 06 May 2011, Published online: 01 Aug 2011

Abstract

Energy supply is one of the most serious problems for micro-mechatronic devices. For collective systems, such as sensor networks or swarms of autonomous micro-robots, collective energy management is especially hard. This work describes a kinetic model of energy foraging and an application of bio-inspired harvesting behaviour to a real robot swarm. The heuristic strategy derived allows proper collective management of energy resources without using global knowledge and guarantees a good swarm efficiency. Despite the whole swarm having the same behavioural rules, some robots specialize in only a few foraging activities, whereas others are more universal in their behaviour. Such emergence of ‘specialists’ and ‘generalists’ is observed in animal groups and can indicate common behavioural principles underlying natural and artificial systems.

1. Introduction

Modern micro-technological research faces several essential challenges related to the size, available energy and functional capabilities of micro-mechatronic systems [Citation1]. Because of these limitations, micro-devices are relatively simple, with minuscule sensing, computation, communication and actuation capabilities [Citation2]. There is a limited number of ways to satisfy the continuously increasing demands made on these devices in human and industrial environments, for example, creating a large number of simple autonomous systems, such as micro-robots, so that their collective work and apparent emergent phenomena provide extended functionality and high reliability for the whole system [Citation3,Citation4].

Designers of collective micro-systems face many problems related to coordination, information transfer and decision-making [Citation5,Citation6]; of these, autonomous energy management is the most critical [Citation7,Citation8]. Autonomous energy management comprises several issues: the recharging of equipment, managing the docking approach, individual energy homeostasis and the behavioural strategy of the whole group (e.g. [Citation9,Citation10]). This article focuses on collective energy foraging, using an example implementation in a swarm of autonomous ‘Jasmine’ micro-robots [Citation11]. These robots have energy-level sensors; when they become ‘hungry’, they seek a docking station. Since their charging and discharging times are almost equal, a docking station with N/2 slots can provide enough energy for N robots. If we calculate efficiency as a relationship between the main working state and the auxiliary non-working states, the best achievable swarm efficiency in ‘Jasmine’ robots is 50%. However, the more robots are involved in the collective behaviour, the more energy they require for auxiliary non-working activities. As shown in this work, the expected collective energy consumption at a constant swarm density is proportional to N², where N is the number of robots, while the swarm efficiency (with some hand-coded greedy foraging strategies) varies between 14.5% and 21%.

The low efficiency of greedy strategies can be explained by the unoptimized dynamics of working and recharging robots in a swarm [Citation12]. The derived kinetic models indicate that variable swarm density and a minimal number of waiting robots are necessary conditions for improving the efficiency of energy foraging. Moreover, individual energy thresholds should be adaptable to the energy level of the swarm. To implement these conditions, we apply biological foraging strategies. Optimal foraging theory (OFT) has been used by some researchers in the field of robotics: Ulam and Balch [Citation13], for example, used OFT to evaluate the efficiency of robots. Simple reactive ‘shortcut’ rules, based on biological data, were also investigated in artificial agents, to achieve an optimally efficient search for resources by solitary agents [Citation14,Citation15], as well as almost optimal distribution of non-communicating agents between patches of different profitability [Citation16].

The idea behind bio-inspired foraging strategies is derived from the observation that some animals are known to spontaneously change their foraging sites, despite the sites still having food resources available. Usually, these animals' hunger levels are not critical, and so they are able to spend some of their remaining energy in exploring for new territory. This is known as a ‘spontaneous’ foraging strategy. Other animals remain in the same place for a while, even after all the food there has been consumed. This is known as a ‘persistent’ foraging strategy. The persistence of individual behavioural acts has been instantiated in robots in various ways and has been proven to enhance agents' performance [Citation17–19]. Interaction between persistence and spontaneity has been shown to be useful in control of odour gradient following in simple simulated agents [Citation20]. To our knowledge, however, no attempt has been made to explore interaction between persistent and spontaneous behavioural sequences in robots which perform collective tasks. As shown experimentally, bio-inspired energy foraging, based on kinetic models, provides a swarm efficiency of 33–38%.

During robot experiments using a bio-inspired foraging strategy, we encountered another interesting result. Generally, all the robots execute four main roles: working, looking for the docking station, waiting and recharging. Our observation is that some robots change their roles less frequently than others, becoming ‘specialists’, that is, specialized in a particular behaviour. Other robots that demonstrate more frequent changes of role are ‘generalists’, which can adopt many roles. The emergence of specialization has been noted by many researchers and it has been shown that the specialization of agents can come about through artificial evolution. The specialist or generalist outcome of evolution depends on the energy value of the available food and the agents' ability to discriminate different food objects and extract energy from them [Citation21]. In [Citation22], when accomplishing a multi-foraging task, it was shown how learning results in the division of a robotic team into specialists and generalists. As demonstrated later in this work, the appearance of specialists and generalists in swarm-foraging behaviour seems to be an emergent phenomenon originating in the spatial interactions between robots.

This article is structured as follows: Section 2 introduces the individual energy homeostasis of the ‘Jasmine’ micro-robot and Section 3 develops a theoretical model of foraging behaviour for the simulated micro-robot. Sections 4 and 5 are devoted to the bio-inspired foraging strategy and its implementation. Section 6 describes swarm experiments performed with real robots. Section 7 concludes this work.

2. Individual energy homeostasis for the ‘Jasmine’ micro-robot

The ‘Jasmine’ micro-robot, shown in Figure 1(a), is an open-source hardware robot (see Footnote 1). The robot measures 30 × 30 × 20 mm, uses two Atmel AVR Mega micro-controllers (Atmel, San Jose, CA, USA) and has six infrared (IR)-based communication/proximity channels, covering 360° with maximum/minimum ranges of 200/100 mm. It uses two geared DC motors with a maximum velocity of about 0.5 m/s. Extension boards provide wireless communication (ZigBee), gradient light perception, an ego-positioning system and so on. The ‘Jasmine’ uses a single-cell 250 mAh Li–Po accumulator with an internal energy sensor and consumes about 200 mA when moving and sensing, about 20 mA when sensing only (communicating) and about 10 mA when listening only. The autonomous work period is about 1.25 h. The Li–Po accumulator works optimally when discharging until it reaches 75–80% of full capacity; the critical level occurs when the voltage drops below 3 V. The recharging current is 1C (250 mA); full recharging takes about 90 min, and partial recharging time is almost equal to discharging time. The autonomous energetic homeostasis of the robot has five states, as shown in Table 1.
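The quoted autonomous work period follows from the accumulator capacity and the current draw. A minimal sketch of that arithmetic is given here, assuming a constant current and that the full nominal capacity is usable (both simplifications); the function and variable names are illustrative only.

```python
# Figures quoted in the text for the 'Jasmine' robot.
CAPACITY_MAH = 250.0  # single-cell Li-Po accumulator
CURRENT_MA = {
    "moving_and_sensing": 200.0,
    "sensing_only": 20.0,
    "listening_only": 10.0,
}


def runtime_hours(capacity_mah: float, current_ma: float) -> float:
    """Idealized runtime: capacity divided by a constant current draw."""
    return capacity_mah / current_ma


# Runtime per operating mode under the constant-current assumption.
runtimes = {mode: runtime_hours(CAPACITY_MAH, i) for mode, i in CURRENT_MA.items()}

for mode, hours in runtimes.items():
    print(f"{mode}: {hours:.2f} h")
```

Under these assumptions, the moving-and-sensing mode gives 1.25 h, which matches the work period stated above.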

Table 1. Main energetic states of the ‘Jasmine’ robot

Figure 1. (a) The ‘Jasmine III’ micro-robot; (b) Collective energy foraging in a swarm of 50 ‘Jasmine’ micro-robots, showing docking stations, recharging robots and robots waiting near the docking station.

A swarm of ‘Jasmine’ robots can be thought of as an autonomous sensor network capable (due to its self-recharging capacity) of long-term autonomy. The ability to behave as such a network is the main motivation for using swarm robots in this work. We consider the main working states to be movement in the robot arena, collection of sensor data (from IR sensors) and transmission of these data to the main ZigBee server. This scenario is general enough to be transferred to other robot swarms, for example, underwater or aerial collective systems.

The robot can manage its own behaviour, as shown in Figure 2. First, in the critical state, robots halt the activity currently being executed and seek a docking station. Second, the robots prioritize between the activity currently being executed, Pr(Task), and the seeking of food, Pr(Sh) (feeling hungry). When, for example, the priority of the current activity is 60%, but hunger is 70%, the robot will seek the docking station. Finally, a robot can have the so-called collective instinct, in which case it will recharge only until it reaches the satisfied state Ss. This takes less time than recharging until full (Sf) and frees the slot for another robot. While foraging, the robot can exhibit any of four roles, which can change dynamically (see Table 2).
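The three rules above can be sketched as a small decision function. This is only an illustration of the described logic, not the robot firmware; the names `Action`, `choose_action` and `recharge_target` are hypothetical, and priorities are assumed to be comparable numbers in the range 0 to 1.

```python
from enum import Enum


class Action(Enum):
    CONTINUE_TASK = "continue current activity"
    SEEK_DOCKING = "seek docking station"


def choose_action(is_critical: bool, pr_task: float, pr_hungry: float) -> Action:
    """Pick the next action from the energetic state and the two priorities."""
    # Rule 1: in the critical state the current activity is always halted.
    if is_critical:
        return Action.SEEK_DOCKING
    # Rule 2: otherwise the higher priority wins (hunger vs. current task).
    if pr_hungry > pr_task:
        return Action.SEEK_DOCKING
    return Action.CONTINUE_TASK


def recharge_target(collective_instinct: bool) -> str:
    """Rule 3: with the collective instinct, stop at 'satisfied' (Ss), else 'full' (Sf)."""
    return "Ss" if collective_instinct else "Sf"


# Example from the text: task priority 60%, hunger 70%.
example = choose_action(is_critical=False, pr_task=0.60, pr_hungry=0.70)
print(example.value)
```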

Figure 2. Structural scheme of energetic homeostasis.

Table 2. Four main roles in energetic homeostasis

All the robots first execute the working role R0. When robots become ‘hungry’, they change to role R1 and start seeking the docking station. The docking station [Citation9,Citation23] has two copper strips, with 5 V across them (see Figure 1(b)). Each slot of the docking station is equipped with a communication system, like the robots themselves. Thus, the docking station and the robots can communicate (the robots can ‘smell’ free slots from 200 mm away). The robots approach the docking station; when it is busy, they remain close by in the ‘waiting’ role, R2. After recharging (role R3), the robots resume the working role R0.
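The role cycle described in this paragraph can be written as a simple transition function. The sketch below is a hedged illustration: the Boolean inputs (`hungry`, `at_station`, `slot_free`, `recharged`) are assumed abstractions of the robot's sensing, not identifiers from the original implementation.

```python
from enum import Enum


class Role(Enum):
    R0_WORKING = 0
    R1_SEEKING = 1
    R2_WAITING = 2
    R3_RECHARGING = 3


def next_role(role: Role, hungry: bool, at_station: bool,
              slot_free: bool, recharged: bool) -> Role:
    """Return the role for the next step, following R0 -> R1 -> (R2) -> R3 -> R0."""
    if role is Role.R0_WORKING:
        return Role.R1_SEEKING if hungry else role
    if role is Role.R1_SEEKING:
        if not at_station:
            return role  # keep searching
        # At the station: dock if a slot is free, otherwise wait close by.
        return Role.R3_RECHARGING if slot_free else Role.R2_WAITING
    if role is Role.R2_WAITING:
        return Role.R3_RECHARGING if slot_free else role
    # Role.R3_RECHARGING: resume work once recharged.
    return Role.R0_WORKING if recharged else role
```

For example, a hungry robot that reaches a busy station passes through R1 and R2 before it recharges in R3 and returns to R0.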

We denote the number of robots that execute role $R_i$ as $N_{R_i}$ and the duration of role $R_i$ in the robot $j$ as $t^{j}_{R_i}$, or in general, $t_{R_i}$. The available individual energy $E_i$ is estimated in ADC values of the corresponding voltage of the Li–Po accumulator. The efficiency $\Phi_j$ of the robot $j$ can be calculated as

(1)

The charging and discharging current (i.e. the time) are almost the same, that is, . When , the efficiency achieves Φ = 1/2 = 50%. When , that is, the robot does not recharge and only works, its efficiency is Φ = 100%. Thus, the value of Φ is useful for the cases and expresses a general relationship in a robot's energetic balance. Swarm efficiency Φ s and the collective energy level of the swarm E s are

(2)
where N is the number of robots and Ei is the individual energy level of a robot.
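The two limiting cases quoted above (50% when working and recharging times are equal and no time is spent seeking or waiting, 100% when the robot never recharges) are consistent with an efficiency defined as the fraction of time spent in the working role. The sketch below assumes that form; it is an illustration consistent with the stated limits, not a transcription of Equation (1), and the argument names are hypothetical.

```python
def robot_efficiency(t_r0: float, t_r1: float, t_r2: float, t_r3: float) -> float:
    """Assumed form: working time divided by the total time over all four roles.

    t_r0: time working, t_r1: time seeking the docking station,
    t_r2: time waiting, t_r3: time recharging (all in the same unit).
    """
    total = t_r0 + t_r1 + t_r2 + t_r3
    if total <= 0:
        raise ValueError("total time must be positive")
    return t_r0 / total


# Equal working and recharging time, no seeking or waiting: 50%.
print(robot_efficiency(60.0, 0.0, 0.0, 60.0))
# No recharging at all: 100%.
print(robot_efficiency(60.0, 0.0, 0.0, 0.0))
```

Any time spent seeking or waiting lowers the value below 50%, which is the effect the kinetic model in Section 3 sets out to quantify.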

3. Kinetic model of swarm foraging

We demonstrated in Section 2 that the best swarm efficiency occurs when . Obviously, efficiency in swarm and . In this section, we will estimate swarm efficiency for different cases of R1 and R2 and formulate the requirements for a good collective foraging strategy.

3.1. Global energy homeostasis for a constant swarm density

Swarm density Dsw is defined as a relationship between the number of robots N and the area S they occupy. The critical swarm density can be derived from the assumption that robots cover the whole area S, that is, from , where Rs is the sensing radius of a robot:

(3)

For the ‘Jasmine’ robots, in an arena measuring S = 140 × 115 cm², the critical maximum number of robots is 52. We can also estimate an optimal swarm density using the assumption that for best swarm reactivity (see [Citation24]), the robots should be within a communication radius Rc of each other:

(4)

For the same conditions, N_opt = 23. Maintaining N ≈ N_opt is advantageous, as it allows several super-scalable swarm parameters to be reached [Citation24]. Therefore, in this section, we calculate global energy homeostasis under the condition N ≈ N_opt, that is, at a constant swarm density.
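The two robot counts quoted in this subsection can be approximately reproduced if each robot is assumed to account for a disc of area πR². This is a sketch under that assumption, not a restatement of Equations (3) and (4): the sensing radius of 10 cm (the shorter IR range given in Section 2) and the communication radius of 15 cm (the value quoted in Section 3.2) are assumed inputs.

```python
import math

ARENA_CM2 = 140 * 115  # arena area S in cm^2
R_S_CM = 10.0          # assumed sensing radius Rs
R_C_CM = 15.0          # assumed communication radius Rc


def robots_for_radius(area_cm2: float, radius_cm: float) -> float:
    """Number of discs of the given radius whose total area equals the arena area."""
    return area_cm2 / (math.pi * radius_cm ** 2)


n_crit = robots_for_radius(ARENA_CM2, R_S_CM)
n_opt = robots_for_radius(ARENA_CM2, R_C_CM)

print(f"critical estimate: {n_crit:.1f} robots")
print(f"optimal estimate:  {n_opt:.1f} robots")
```

With these inputs, the estimates come out near 51 and 23 robots, close to the values of 52 and 23 stated in the text.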

Let Ψ be the amount of energy coming into the swarm from outside. The inequality of the energy balance

(5)

says that the energy consumption should be less than, or at least equal to, the energy input. Energy consumption Ec comprises the finding of the energy source by N robots , the waiting/docking of N robots and finally the moving of N/2 robots , while other N/2 robots are recharging . Using the values for the ‘Jasmine’ robots , we derive

(6)
where the numeric coefficient m = 0.1 (see for differences between ω in R2 and R3). The time required to find energy can be estimated in a linear approximation from the covering problem (see, for example, [Citation25]): when Rs is the sensing radius of the robot and υ is the velocity of motion, during a robot can cover the area . We assume that for a subcritical swarm density , a robot has to cover an area of SS r N to find energy, where S r is the area of the robot itself, that is,
(7)

To satisfy the condition NN opt, when increasing N, we must also increase S, that is, to S(N). We finally derive

(8)

Thus, the finding time can be approximated as a linear function of the number of robots when the swarm density is kept constant. This differs importantly from Equation (15) in Section 3.2, which calculates the finding time for variable swarm densities and for overlapping trajectories of multiple robots.

The docking approach is relatively quick; however, the docking station can become a bottleneck when too many robots are moving to and from the energy source. To estimate the docking time, we can assume that it is sublinear in the number of robots, that is, , where the small coefficient λ has the dimension of time and is estimated experimentally. Returning to Equation (6), we derive

(9)

Equation (9) requires explanation. The linear term in Equation (9) represents useful work, required, for example, for sensor network activities. The quadratic terms, however, represent the energy needed for exploration of territory and for the mutual hindering of robots, that is, for supporting system-internal activities. This means that swarm (and, more generally, collective) systems have an optimal size at a constant swarm density. When a swarm grows, that is, covers more territory, the system-internal activities consume much more energy than is required for useful outputs from the system. Equation (9) can be reformulated in terms of swarm efficiency (Equation (1)). For a constant swarm density, energetic efficiency Φ_E

(10)

limits the spatial growth of swarm systems, thus representing a natural size limit. Efficiency for the ‘Jasmine’ robot parameters is shown in Figure 3(a). Equations (1) and (10) differ in that, following Equation (6), N/2 robots are used during and for the calculation of Φ_E. For the individual Φ_j, the number of robots N is irrelevant.

Figure 3. (a) Energetic efficiency of the ‘Jasmine’ swarm for Equation (6) at a constant swarm density; (b) plot of covering rate from Equation (14).

3.2. Collective strategies for a variable swarm density

As shown in Section 3.1, maintaining a constant swarm density costs a high energetic price. Therefore, the main consideration for a good collective foraging strategy is to make swarm density variable and to make use of the effects that appear. In particular, a higher swarm density allows the reduction of , using the effect of overlapping trajectories. As shown in Section 3.1, when the search area is S, the covering area S cov of randomly moving robots can be estimated as the sum of non-overlapping local areas S l = υt2Rs (shown in (a)) minus overlapping between S l. There are two reasons for overlapping: first, the swarm density, when one robot overlaps the trajectory of another robot (S ov1); and second, the collision-avoiding behaviour of a robot, when it overlaps its own trajectory (S ov2), that is,

(11)

Figure 4. Coverage by (a) 2 and (b) 13 robots. Ten seconds of motion (10 images) are shown, as difference images extracted from the video sequences.

Figure 4. Coverage by (a) 2 and (b) 13 robots. Ten seconds of motion (10 images) are shown, as difference images extracted from the video sequences.

Equation (11) defines the quality of the covering strategy. In Figure 4, we demonstrate two cases, in which 2 and 13 robots move randomly in an area measuring 140 × 115 cm². When t is large enough, it is assumed ; whereas for short t, . During random motion, the local areas S_l overlap, so the efficiency of coverage decreases. Obviously, a good strategy must minimize overlapping between the areas S_l. Overlapping is difficult to calculate exactly; however, it can be estimated: the N moving robots can be represented as Nn static robots when calculating a ‘differential image’, as shown in Figure 4, where n is the number of snapshots. The value of n increases continuously, so that n = kt, where k is a coefficient describing how often snapshots are taken during t. We assume that a snapshot is taken only when robots have moved more than 2R_s, that is, k = υ/2R_s. Randomly moving robots behave very similarly to gas molecules, that is, they are uniformly distributed over the area covered as t → ∞. Statistically, the areas S_l do not overlap when . Therefore, the value S_ov1 can be estimated as

(12)

The value S ov1 makes sense when S ov1 ≥ 0. The value of S ov2 depends on the collision-avoiding behaviour of a robot. At each collision, the robot rotates, so that it moves somewhat over its former trajectory. The area lost is proportional to the number of robot–robot contacts. At each contact, a robot loses a triangle of area, as shown in (a), which can be calculated as or (α is the collision-avoiding angle, ≈30°). The number of contacts C is equal to the average number of robots N/S within the area S l (with Maxwell coefficient ), that is,

(13)

Finally, we derive this expression for the area covered

(14)

In Figure 3(b), we plot the relationship in Equation (14) as a function of time, for S = 16,100 cm², υ = 30 cm/s and Rc = 15 cm, for the two cases N = 2 and N = 13 shown in Figure 4. As seen, 2 robots can cover the area in about 1 min, and 13 robots in about 10 s. Both results correlate very well with the experimental data.
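As a rough plausibility check, one can compute the time the robots would need if their swept strips never overlapped, using only the local area S_l = υt·2R swept by one robot. This is a lower bound under stated assumptions, not Equation (14): the overlap terms are ignored entirely, and the 15 cm radius is taken from the parameters quoted above.

```python
def sweep_time_s(area_cm2: float, n_robots: int,
                 speed_cm_s: float, radius_cm: float) -> float:
    """Time for n robots to sweep the area once, assuming no overlap at all."""
    strip_rate = 2.0 * radius_cm * speed_cm_s  # cm^2 swept per second per robot
    return area_cm2 / (n_robots * strip_rate)


t2 = sweep_time_s(16100.0, 2, 30.0, 15.0)
t13 = sweep_time_s(16100.0, 13, 30.0, 15.0)

print(f"2 robots:  {t2:.1f} s")
print(f"13 robots: {t13:.1f} s")
print(f"ratio:     {t2 / t13:.1f}")
```

The idealized times (about 9 s and 1.4 s) are several times shorter than the reported 1 min and 10 s, as expected once overlapping trajectories are taken into account, while the ratio of 6.5 between the two cases is close to the reported ratio of about 6.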

Equation (14) allows us to estimate the time needed to cover a given area with a variable swarm density. Setting S_l = S, solving for t and simplifying, we derive the equation

(15)
where m = 0.268. Of the two solutions, we must choose the positive one.

3.3. Requirements for a good swarm foraging strategy

A good foraging strategy should minimize times and as well as the number of robots and . This can be done in several ways, bearing in mind that no foraging strategy may involve centralized elements, global knowledge or unrealistic sensor data.

i. There are several mechanisms which force swarms to . In this case the energy balance should not be considered as but as

(16)

Equation (16) means that energy input should be proportional to S and uniformly distributed in S. This is an important consequence, allowing improvement of the foraging strategy. Uniform distribution enables the swarm to minimize system-internal activities.

ii. Constant swarm density does not provide efficient energetic performance. This in turn means that for a swarm member will die because of bottlenecks and insufficient energy input. Individual energetic death is a self-regulating mechanism, allowing maximal collective energetic performance. Therefore, a robot swarm should allow for the killing of some robots (e.g. switching them into stand-by mode) to achieve better energetic performance.

iii. Equation (15) for collective searching provides a shorter time . However, the more robots involved in a collective search, the worse their collective efficiency. A good strategy should maximize Φ s by varying in a collective search.

iv. A good foraging strategy should minimize and by managing the number of robots that go to recharging. In the ideal case, should equal the number of free slots in the docking station. The ‘buffered’ robots should not occlude the docking station.

v. A good foraging strategy should adapt the ‘critical energy’ and ‘hungry’ thresholds Sc and Sh to the current energetic state of the swarm and thus reduce .

Of these mechanisms, the last two have the strongest impact on collective foraging. In Section 4, we demonstrate a bio-inspired heuristic that can provide a good strategy for optimizing Φ s by managing , and , .

4. Bio-inspired strategy for optimal foraging

Optimal behaviour models. Animal behaviour is adaptive, and the notion of adaptation may be formalized if one introduces a ‘currency’ in which animals invest and gain when accomplishing a task. If an animal forages for food, the currency could represent the energy it gains from its prey and the energy it invests in finding and processing it. If the animal achieves the maximum ratio of gain to investment possible in a given environment and with given foraging capabilities, it could be viewed as an ‘optimal forager’. Models which predict how optimal foragers should search for and select prey are developed within the frame of OFT [Citation13,Citation16]. Consider an environment in which prey is concentrated in patches, with prey density differing among the patches. Foraging models predict that an optimal forager will leave a patch once the rate of encounters with prey in that patch falls below a certain threshold. This threshold will be lower, and the forager will stay longer on a patch, if the average prey-encounter rate on a patch is low and/or the average time to find the next patch is long. Similarly, optimal time allocation among several tasks (e.g. feeding, courtship and territory-guarding) could, in principle, be calculated based on ideas of gain and investment.

Specialists and generalists. The basic tenets of OFT (currency, gain/investment ratio) are applicable not only to individual animals, but also at the level of swarms and colonies. Specialization in insect colonies can be temporary and depends on the state of the colony and the resources available. For example, foraging bees usually visit those nectariferous flowers which provide a maximum gain/investment ratio at the level of whole colony [Citation26]. The emergence of generalists and specialists can be observed not only in food foraging, but also in the allocation of colony members to various tasks such as building, attending to progeny and so on [Citation27]. An approach based on OFT makes it possible to develop a model for optimal division of labour among colony members in terms of maximal gain/investment ratio at the colony level.

Heuristics instead of optimal decisions. Models of optimal behaviour usually deal with ‘all-knowing’ animals that possess global information about the environment within which they live. When choosing a patch, they would know the rate at which they will encounter patches of different prey density. When choosing a task to accomplish, the members of a colony would know the relative priorities of different tasks for the colony, as well as the number of members already engaged in these tasks. Thus, models of optimal foraging do not take into account the restrictions suffered by real living beings. Animals have only limited access to global information about their environment and only a limited time in which to make decisions. Furthermore, nervous systems, at least in the lower animals, have limited processing capacity. Finally, natural environments change over time, so a long and thorough learning and analysis of global information might make no sense in an unstable environment.

With these restrictions in mind, one can hardly expect animals to be ‘all-knowing’. For this reason, OFT is not used as a realistic description of animal behaviour; rather, it describes an idealized efficient behaviour, used as a standard against which we may compare real animals and evaluate their efficiency. It is known that, instead of the rational decisions predicted by optimization models, animals use simple shortcut rules – heuristics – which result in suboptimal but still efficient behaviours. For example, an individual bee forager does not analyse global information about colony needs, available food sources and the current specializations of other bees in the colony. Instead, in deciding which food source she should switch to, she responds to a few local cues available to her. This simple decision-making process results in a nearly optimal choice of food source by the whole colony [Citation26].

We investigated two simple heuristics in robots: the persistence of current behaviour and spontaneous switching among behaviours. We also investigated whether these heuristics lead to the efficient behaviour of robotic teams, including the emergence of generalists and specialists.

4.1. Persistence and spontaneity in animal behaviour

Persistence. When an animal starts a particular behaviour, for example, feeding or searching, it performs that behaviour persistently for a while, even if the causal factors that initially evoked it fall to a low level and the causal factors for another competing behaviour rise. The mechanisms that make persistence possible differ in different animals. In mammals, persistence is caused by a positive feedback loop via the basal ganglia – thalamocortical circuit. The result is hysteresis, in which an act currently being performed remains active with lower levels of causal factors than were initially required to start it [Citation19]. The persistence of a current behaviour obviously has an adaptive value. First, it helps the animal accomplish an already-started act, despite a temporary break in incentive stimulation, and therefore satisfies the organism's needs. Second, it prevents ‘shuttling’ between competing behaviours. Third, it results in a sort of anticipation, because the positive feedback increases responsiveness to those future events which are relevant to current behaviour [Citation18].

A heuristic inspired by foraging in ants was used to simulate division into foragers and ‘loafers’ within a colony of agents. If a forager happens to find more prey than the others, its foraging motivation increases and this agent keeps foraging. The motivation of the less-successful agents falls and they stop foraging. As a result, the collective efficiency increases, because, when there are many foragers and not much prey, unsuccessful agents stay in the nest and do not interfere [Citation28].

Spontaneity. As well as persistence, animals exhibit another behavioural feature: spontaneous switching among behaviours. We define a switch as spontaneous if it is not a reaction to external cues, but is based solely on the internal state of the animal. Thus, spontaneity could serve as a counterbalance to persistence. Bees, for example, show ‘flower constancy’: once they have begun to take nectar from particular flowers (e.g. blue), they mostly ignore others (e.g. yellow) even if the nectar content of the blue flowers falls (persistence). Yet, from time to time and for no obvious reason, the bees sample the yellow flowers, and some individuals are more apt to sample than others [Citation29]. Such spontaneous sampling may help a bee colony to track environmental changes and eventually switch to more profitable flowers.

4.2. Bio-inspired modelling of the interaction between persistence and spontaneity

To model persistence in the ‘Jasmine’ robot, we adopted an approach previously used to simulate searching behaviour heuristics in caddisfly larvae [Citation30]. The priority of any particular task Pr(Task)_t at a time t is a function F of the robot's current energy level E_t and a function G of the current signals I_t (from both the environment and other robots). Persistence is introduced as a dependence of Pr(Task)_t on the priority Pr(Task)_{t−Δt} that existed at the previous time step t − Δt:

(17)
where k is the ‘inertness coefficient’. The larger the coefficient, the more the priority of the current task can be expected to increase as the robot keeps performing that task. As a result, the robot may continue to perform the task even if its energy level and the external signals from the environment and the other robots fall to low levels, or even temporarily vanish, and the priority of other tasks increases. This will cause the robot to focus on a particular task in spite of unpredictable accidental fluctuations of stimuli. We call such a persistent robot ‘inertial’. As a counterbalance to inertial robots, we introduce ‘spontaneous’ robots. In these robots, the priority of a task depends only on the robot's internal state: its current energy level. We expect that inertial robots will switch among tasks R0–R3 less frequently than spontaneous robots. In this way, a separation of the robotic team into specialists and generalists could arise.
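The effect of the inertness coefficient can be illustrated with a simple recurrence. The sketch below assumes an additive form, in which the new priority is the sum of the energy term F, the signal term G and k times the previous priority; this form is an assumption chosen for illustration and may differ from Equation (17). The function names and the numeric drive values are hypothetical.

```python
from typing import List, Tuple


def update_priority(prev_priority: float, energy_drive: float,
                    signal_drive: float, k: float) -> float:
    """Assumed additive update: F(E_t) + G(I_t) + k * Pr(Task)_{t - dt}."""
    return energy_drive + signal_drive + k * prev_priority


def simulate(drives: List[Tuple[float, float]], k: float,
             initial: float = 0.0) -> List[float]:
    """Apply the update over a sequence of (F, G) values and record the priority."""
    priority = initial
    history = []
    for energy_drive, signal_drive in drives:
        priority = update_priority(priority, energy_drive, signal_drive, k)
        history.append(priority)
    return history


# Three steps with stimulation, then three steps in which all stimuli vanish.
drives = [(0.5, 0.5)] * 3 + [(0.0, 0.0)] * 3
inertial = simulate(drives, k=0.5)    # persistence: priority decays gradually
memoryless = simulate(drives, k=0.0)  # no persistence: priority drops at once

print(inertial)
print(memoryless)
```

With k = 0.5 the priority stays above zero for several steps after the stimuli vanish, whereas with k = 0 it falls to zero immediately; this is the hysteresis-like behaviour that the text attributes to inertial robots.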

5. Implementation of bio-inspired foraging

In implementing a bio-inspired strategy [Citation31], we intend to prove the following assumptions:

i. Varying swarm density will lead to increased efficiency, but also to dead robots.

ii. The combination of individual inertial and spontaneous strategies can minimize the number of robots recharging. This will in turn minimize and and should increase swarm efficiency.

iii. Individual thresholds for critical and hungry states Sc and Sh can be adapted by considering the task Pr(Task)_t related to local sensor data and local communication with neighbours. This can lead, at a collective level, to a reduction in the number of waiting robots and so to higher efficiency.

Spontaneous robots follow a simple threshold model to decide whether they are ‘hungry’. As soon as their internal energy value falls below a predefined level Th_hungry, they start to ‘feel hungry’ and look for ‘food’. The robots perform a random search and dock when they reach an available station. If a robot encounters any waiting robots during this search, it assumes that the waiting robots need energy more urgently than it does itself, and it returns to its working task. If the robot is not able to dock, it stops at a nearby buffer zone and waits until a free station is available. The spontaneous robots consider themselves to be recharged and ready to leave the station when a predefined constant amount of energy is reached. The behaviour model of the spontaneous robot is shown in Figure 5(a).
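The threshold model for spontaneous robots can be sketched as a small state machine. This is an illustrative reading of the paragraph above, not the code that ran on the robots: the threshold values, the normalized energy scale and the Boolean sensing inputs are all assumptions.

```python
from enum import Enum


class State(Enum):
    WORK = "work"
    SEARCH = "search"
    WAIT = "wait"
    RECHARGE = "recharge"


TH_HUNGRY = 0.3     # hypothetical 'hungry' threshold (normalized energy)
TH_RECHARGED = 0.8  # hypothetical 'recharged' level (normalized energy)


def spontaneous_step(state: State, energy: float, sees_waiting_robot: bool,
                     at_station: bool, slot_free: bool) -> State:
    """One decision step of a spontaneous robot."""
    if state is State.WORK:
        return State.SEARCH if energy < TH_HUNGRY else State.WORK
    if state is State.SEARCH:
        # Waiting robots are assumed to need energy more urgently.
        if sees_waiting_robot:
            return State.WORK
        if at_station:
            return State.RECHARGE if slot_free else State.WAIT
        return State.SEARCH
    if state is State.WAIT:
        return State.RECHARGE if slot_free else State.WAIT
    # State.RECHARGE: leave once the predefined energy level is reached.
    return State.WORK if energy >= TH_RECHARGED else State.RECHARGE
```

Note that every decision depends only on the robot's own energy level and on what it senses locally, which is what distinguishes the spontaneous model from the inertial one.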

Figure 5. Behavioural model for (a) spontaneous and (b) inertial robots.

Inertial robots require a more complex model (see Figure 5(b)). We introduce a priority Pr(Task)_t for each activity R0–R3, which can change following interactions with other team members. While working, the robot starts to ‘feel hungry’ as soon as its energy level falls below the ‘hungry’ threshold. From that point, the ‘hungry feeling’ incrementally steps up the priority of the search task. The search task will be executed when its priority exceeds the priority of the work task. The feedback from the swarm allows the robot to make assumptions regarding the collective energy level of the swarm and influences its decision-making process. In our model, the number of working robots decreases the priority of the work task and the number of waiting robots increases it: the current energy level of the robot increases the search priority, the number of working teammates decreases the work priority, and the number of waiting teammates increases the work priority. When the search task priority exceeds the work task priority, the robot switches to the recharge role and seeks the docking station. As soon as a slot becomes available, the robot docks; otherwise, it conserves its energy in the buffer zone. While recharging, the robot starts to ‘feel full’ after its energy level exceeds a predefined recharged energy threshold. As above, the ‘full feeling’ increases the robot's need to change its role. The number of waiting teammates affects the need to change, whereas the number of working robots positively influences the priority of recharging.
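The priority bookkeeping for an inertial robot in the working role can be sketched as follows. The update directions follow the description above (hunger steps up the search priority; working teammates lower and waiting teammates raise the work priority), but the linear form and all weights and thresholds are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Priorities:
    work: float
    search: float


HUNGRY_THRESHOLD = 0.3  # hypothetical normalized energy threshold
HUNGER_STEP = 0.1       # hypothetical increment of the search priority per step
W_WORKING = 0.02        # hypothetical weight per working teammate
W_WAITING = 0.05        # hypothetical weight per waiting teammate


def update(p: Priorities, energy: float, n_working: int, n_waiting: int) -> Priorities:
    """One update step while the robot is in the working role."""
    # Below the 'hungry' threshold the search priority is stepped up.
    search = p.search + (HUNGER_STEP if energy < HUNGRY_THRESHOLD else 0.0)
    # Working teammates lower the work priority; waiting teammates raise it.
    work = p.work - W_WORKING * n_working + W_WAITING * n_waiting
    return Priorities(work=work, search=search)


def should_seek_docking(p: Priorities) -> bool:
    """The robot switches roles when the search priority exceeds the work priority."""
    return p.search > p.work


# A hungry robot with five working teammates and nobody waiting.
p = Priorities(work=0.5, search=0.0)
steps_until_switch = 0
while not should_seek_docking(p):
    p = update(p, energy=0.2, n_working=5, n_waiting=0)
    steps_until_switch += 1
print(steps_until_switch)
```

In this sketch, a queue of waiting teammates keeps a robot in the working role longer, which is one way the local feedback could reduce crowding at the docking station.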

6. Experiments with real robots

Implementing ‘not-knowing’ and ‘limited’ strategies assumes realistic capabilities in the robots. Their implementation in the ‘Jasmine’ robots is similar to the simulation, but with some additional restrictions due to the embedded platform. The robots perform in two arenas: a smaller one of 110 cm × 85 cm = 0.935 m2 and a larger one of 140 cm × 115 cm = 1.61 m2, as shown in Figure 6(a) and (b). A recharging station with five docking slots was installed on one wall of each arena, as shown in Figure 6(c). Two robots, positioned near each end of the recharging station, served as waiting stations and continuously sent a waitMessage. To find the docking station, the robots used a random search. The number of robots in the small arena varied from 3 to 10 (Dsw = 3.2–10.69). To confirm the optimal and maximal swarm densities, as mentioned in Section 3, several preliminary experiments with 40, 50 and 60 robots were performed in the large arena. With 50 and 60 robots, almost all the robots remained immobile, due to continuous ‘collisions’. This experimentally confirms the maximal swarm density for this size of arena. In general, sets of 30 and 40 robots demonstrated the predicted decrease of . Due to the energy-limited environment and self-regulation through robots' ‘death’, it was decided to perform the final experiments with 30 robots in the large arena (Dsw = 18.63; with 15 dead robots, Dsw = 9.25).
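The quoted densities for 3, 10 and 30 robots are consistent with reading Dsw as the number of robots per square metre of arena. That reading is an assumption made here (the definition itself belongs to Section 3), and the function name is hypothetical. A short check:

```python
def swarm_density(n_robots, width_cm, length_cm):
    """Robots per square metre, assuming Dsw = N / arena area."""
    area_m2 = (width_cm / 100) * (length_cm / 100)
    return n_robots / area_m2


SMALL_ARENA = (110, 85)   # cm, area 0.935 m^2
LARGE_ARENA = (140, 115)  # cm, area 1.61 m^2

for n, arena in [(3, SMALL_ARENA), (10, SMALL_ARENA), (30, LARGE_ARENA)]:
    print(n, round(swarm_density(n, *arena), 2))
```

Under this assumption the small arena spans roughly 3.2 to 10.7 robots per square metre, and 30 robots in the large arena give about 18.63.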

Figure 6. (a) Experimental set-up for the small arena; (b) experimental set-up for the large arena; and (c) docking station with IR slots, allowing an ‘energy-smelling’ approach.


Alternative docking station in the large arena. Because of uncertainties arising from their rotations and from reflection of IR light, robots can need several attempts at docking. In experiments in the large arena, with increased swarm density, this can affect the results. To remove this problem, we assume docking is successful when a robot finds a free slot and sends the signal dockingSuccessful to the docking station (i.e. the slot stops sending the signal attractMessage). Thus, the recharging robots gather together in the ‘recharge zone’; see Figure 6(c). This approach was also selected following experiments with an inductive recharging procedure, in which a strong electromagnetic field is concentrated in the ‘recharge zone’.

Additional restrictions imposed on scenarios. To take a decision, a robot needs feedback from the swarm. This feedback is collected by counting the messages received from working or waiting teammates and is based on the synergetic approach of collective decision-making for randomly changing neighbours [Citation32]. To estimate a robot's need to stay in or leave its current state, three priorities were introduced, similar to those in the simulation: prioWorkTask, prioSearchTask and prioRechargeTask. During work, the robot increases prioWorkTask in steps, until the ‘hungry’ threshold is reached, when the robot starts to increase prioSearchTask. When prioSearchTask exceeds prioWorkTask, the inertial robot switches to the recharging role. The same procedure is followed while the robot is recharging and its energy level exceeds Th recharged. Then, prioWorkTask competes with prioRechargeTask. When prioWorkTask exceeds prioRechargeTask, the robot switches back to the working role.

6.1. Experiments

The first experiments were performed in the small arena, using physical recharging of the robots. The available docking slots were reduced to three and the experiments were executed with 3, 6 and 10 robots. For the small arena, one waiting station was used (otherwise, the search time for finding the waiting station would not be comparable). To deliver comparable results, many of the common parameters were set to be equal for both types of experiments: maximum energy 185 (the energy value of an accumulator recharged up to 90%); Th dead = 120 (the ‘dead’ energy threshold); and Th crit = 150.

After comparing the performance of the two foraging strategies, the experiments were extended to the large arena, with 30 robots. Two strategies, ‘not-knowing’ and ‘limited’ robots, were explored. Two waiting stations were installed, one at each end of the docking station. All the experiments were designed to take 10 min to complete. The speed of energy reduction had to be controlled, to allow for several recharge phases during any one experiment. Therefore, in addition to measuring the physical energy value, a variable energyValueSim was introduced, which can be used to simulate the recharge and discharge cycle. In all experiments with real robots, the energyValueSim was updated every 4 s. Thus, a full discharge cycle of a moving robot from maximal energy to the ‘dead’ threshold takes around 4.4 min. The reverse full recharge cycle takes just as long. Each experiment in the small arena was repeated 10 times and in the large arena 5 times. Evaluation of the experiments was performed by reading logfiles from the robots. Since all the experiments demonstrated good repeatability, Table 3 shows only mean values.
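As a rough plausibility check of the discharge time, the sketch below assumes that energyValueSim drops by one unit at each 4 s update while the robot is moving. The step size is not given in the text, so this is an assumption, as are the constant and function names. With a maximum energy of 185 and Th dead = 120, this gives a little over 4 min per full discharge, in the same range as the quoted figure of around 4.4 min.

```python
MAX_ENERGY = 185       # maximum energy value (from the text)
TH_DEAD = 120          # 'dead' threshold (from the text)
UPDATE_PERIOD_S = 4    # energyValueSim update interval (from the text)
UNITS_PER_UPDATE = 1   # assumption: one unit lost per update while moving


def full_discharge_seconds(max_energy=MAX_ENERGY, th_dead=TH_DEAD,
                           period_s=UPDATE_PERIOD_S,
                           units_per_update=UNITS_PER_UPDATE):
    """Time for a moving robot to go from maximum energy to 'dead'."""
    updates = (max_energy - th_dead) / units_per_update
    return updates * period_s


print(full_discharge_seconds() / 60)  # duration in minutes
```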

Table 3. Parameters and results for experiments with 3, 6, 10 and 30 robots

‘Not-knowing’ strategy in a small arena. The parameters and results of these experiments are collected in Table 3. A sample run in the ‘not-knowing 10’ experiment is shown in Figure 7. The influence of energy reduction on the performance of ‘not-knowing’ robots was studied by comparing the results of the ‘not-knowing 3’, ‘not-knowing 6’ and ‘not-knowing 10’ experiments. In ‘not-knowing 3’, the docking slots provide energy for 3 robots, so enough energy is readily available, and the swarm maintains a high collective energy. Doubling the number of agents (‘not-knowing 6’) reduces both collective energy and efficiency but the swarm manages to stay alive despite inhabiting an environment with limited resources. In the ‘not-knowing 10’ experiment, where energy is very constrained, efficiency remains almost equal to ‘not-knowing 6’, but the collective energy is further reduced, falling below the critical threshold. Also, 3–4 of the 10 robots die, which reduces the swarm size to about 6 agents and leads to an efficiency comparable to the ‘not-knowing 6’ run.

Figure 7. Experimental run for the ‘not-knowing’ strategy, with 10 robots. First, the robots execute regular work (1), until Th hungry is exceeded and all the robots ‘become hungry’ (2). Eventually, all recharging slots are occupied and the rest of the swarm clusters around the waiting station (3). Some robots have recharged and the waiting ones again seek a slot (4). As slots are re-occupied, the ‘hungry’ robots again collect at the buffer zone (5). This sequence cycles until some of the robots ‘die’ while waiting or searching (6).


To summarize, ‘not-knowing’ robots offer the best performance when enough energy is present, marked by a very high collective energy level, but an efficiency of only 38.56%. Reducing the energy source leads to extreme reduction in efficiency, whereas collective energy falls slowly.

‘Limited robots’ strategy. Three types of experiments were conducted to examine the performance of the ‘limited robots’ strategy: inertial robots only, spontaneous robots only and a mixed society of spontaneous and inertial robots. The energetic thresholds used were, for inertial robots, Th hungry = 170 and Th recharged = 178; and, for spontaneous robots, Th hungry = 180 and Δrecharged = 5. The inertness coefficient was set in accordance with the simulated value of medium inertness. As in the ‘not-knowing’ experiments, three docking slots and one waiting station were made available to the swarm. Table 3 gives an overview of the ‘limited’ strategy experiments.

Limited strategy I: experimental run of an entirely inertial swarm. Inertial robots achieve almost the best possible efficiency when no restrictions exist on energy input. Since more work is done, the collective energy level is not especially high but that does not particularly affect swarm behaviour in the ‘inertial 3’ experiments. Increasing the number of swarm members leads to the reduction of efficiency and collective energy. However, feedback from the swarm and inertness allows the robots to work for longer, which leads to ‘exhaustion’ and the robots ‘die’ even when the swarm has only 6 members. Further reduction of available energy leads to the collective energy falling below the critical threshold; many robots ‘die’ and the efficiency achieved is about 20%. Therefore, inertial robots offer very good efficiency when sufficient energy resources exist but constraining the energy source leads to the reduction of both efficiency and collective energy.

Limited strategy II: experimental run of an entirely spontaneous swarm. Where sufficient energy is available to the swarm, spontaneous robots can recharge whenever they wish, so they recharge frequently and maintain their energy at a high level. However, frequent recharging means that less work is done, so the efficiency is only about 39.33%. Increasing the number of agents changes the results considerably: the agents cannot recharge as much as they want to and so work until real ‘exhaustion’ (Th crit) sets in. Thus, an efficiency of 47.80% is achieved, but collective energy falls severely. Constricting the available energy still further reduces the collective energy again, but the efficiency stays comparatively high.

Limited strategy III: experimental run of a mixed swarm. Since the swarm consists of two different subclasses of robots, it can benefit from both strategies. A swarm of 6 robots achieves an efficiency of 37.80% at a collective energy level above the critical threshold. Within a more constrained environment, both parameters decrease slowly and some robots ‘die’.

Experimental run of a ‘not-knowing’ swarm in a large arena. To test the behaviour in a heavily energy-restricted environment, two experiments were performed with 30 robots in the large arena. A sample run of the ‘not-knowing’ scenario is shown in Figure 8. The experimental run is characterized on the one hand by long periods of little or no movement and on the other by periods of high levels of movement, in which the agents essentially hinder each other.

Figure 8. Experimental run of the ‘not-knowing’ strategy with a swarm of 30 robots. All the robots start the experiment with equal energy autonomy and execute sensor network activities (1). When an individual's energy level falls below Th hungry, the robot switches to the recharging role (2). Since all the robots switch almost simultaneously, the docking station is soon occupied by recharging robots and the rest cluster around the two waiting stations (3). Eventually, the recharging robots are ‘full’ and the cluster around the docking station dissolves. At the same time, the waiting robots re-start their search and the clusters at the waiting stations likewise melt away (4). New recharging robots stop at the docking station and block it. The remaining ‘hungry’ robots again collect at the waiting station (5).


Experimental run of a mixed swarm in a large arena. For the experiments with large mixed swarms, an equal proportion of inertial and spontaneous robots was assumed. The collective energy and swarm efficiency obtained exactly reflected the behaviour described above. As shown in Table 3, half the swarm was ‘dead’ at the end of both experiments and the resulting collective energy was also equal. A difference occurred in the efficiency obtained, which represents the time in which the swarm was free for useful activities. ‘Limited’ robots worked 23% of their time, whereas ‘not-knowing’ robots managed to work for only 13% of theirs. Since the recharging role comprises waiting, recharging and searching, the ‘not-knowing’ robots spent most of their time halted in clusters, either at the waiting station or at the recharging station.
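Efficiency in this sense, the share of time the swarm is free for useful activities, could be computed from per-robot role logs along the following lines. The log layout (a list of `(role, seconds)` pairs per robot), the role labels and the function name are hypothetical, since the actual logfile format is not described. Counting time spent ‘dead’ toward the total but not toward work is also an assumption.

```python
def swarm_efficiency(logs):
    """Fraction of total logged time spent in the working role.

    logs maps a robot id to a list of (role, seconds) entries (assumed format).
    """
    work = sum(sec for entries in logs.values()
               for role, sec in entries if role == "work")
    total = sum(sec for entries in logs.values() for _, sec in entries)
    return work / total if total else 0.0


# Made-up example data for two robots over a 10 min (600 s) run.
example_logs = {
    "robot_1": [("work", 200), ("search", 100), ("recharge", 300)],
    "robot_2": [("work", 76), ("wait", 524)],
}
print(swarm_efficiency(example_logs))
```

In this invented example the two robots work 276 s out of 1200 s in total, i.e. 23% of their time.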

6.2. Discussion of results

Inertial and spontaneous robots differ in their reaction to energy reduction. Figure 9 summarizes the results of the ‘limited robots’ experiments.

Figure 9. Comparison between the energetic performance of the ‘limited robots’ experiments, showing (a) the collective energy level and (b) the swarm efficiency.


Experiments with three docking slots and six spontaneous robots demonstrate the best efficiency of 47.8%. However, the collective energy of the swarm is very low, even falling below the critical threshold; this energetic homeostasis is unstable. Inertial robots keep their collective energy level above the critical threshold, but their efficiency is much lower at 24.7%. Mixing inertial and spontaneous robots provides a compromise – in the ‘mixed 6’ experiments, collective energy, at 154.87, stays above the critical threshold and the robots are free for work for 37.8% of their time.

Figure 10 is an overview of the ‘not-knowing’ and ‘limited robots’ strategies with available energy restricted. The collective energy level is comparable in both scenarios but differentiation occurs in the resulting swarm efficiency. ‘Not-knowing’ robots manage to work in both environments for around 16% of their time, whereas a mixed swarm of three inertial and three spontaneous robots achieves an efficiency of 37.8%, and in the more restricted environment around 30%. So, using a bio-inspired strategy gives the swarm about twice as much time to execute useful activities as a strategy in which the agents follow a threshold model.

Figure 10. Comparison between the results of experiments with ‘not-knowing’ and ‘limited’ robots, showing (a) the collective energy level and (b) the swarm efficiency.


Figure 11 includes a sample of the number of role switches per robot in the ‘inertial 10’, ‘spontaneous 10’ and ‘mixed 10’ experiments, where energy is constrained. In a swarm of inertial robots, each individual follows its behaviour model, which necessitates long working times followed by long recharge times. Such long periods in each role result in a low number of changes, with low dynamics at the docking station. All agents behave similarly, so the number of role changes is almost equal for all members. The only exceptions are robots that cannot find the station at all and die while waiting or searching (e.g. robots 2 and 8 in Figure 11(a)). The role dynamic is completely different for a swarm of spontaneous robots. Their behaviour model requires short recharge times, so role switching occurs more often.

Figure 11. Role dynamics in experiments with ‘limited robots’: (a) ‘inertial 10’, (b) ‘spontaneous 10’ and (c) ‘mixed 10’.


Two robots did not find the docking station (3 and 5 in Figure 11(b)), but a difference in the resulting behaviour also developed in the rest of the agents. Some (robots 4, 9, 10) changed between the two roles more often than did the others (e.g. 7). That is, robot 7 specialized in one of the two tasks and avoided switching between them, whereas 4, 9 and 10 usually switched. Such emergence of generalists and specialists in the swarm is clearly visible in a mixed society. As Figure 11(c) shows, the distribution of the number of role changes is highly non-uniform. Since robots 1–5 are all inertial, it may be expected that all will exhibit a similar number of changes. The same holds for the spontaneous robots, 6–10, where small differences may also be expected. But the resulting behaviour is completely different. Role dynamics emerge even with the inertial robots – robot 2 changes as often as would a spontaneous one. Moreover, spontaneous robots increase their discrimination: robot 6 switched very frequently, whereas robot 7 managed to recharge just once. Specialists and generalists emerge from the ‘crowd’, even though the implemented behaviour model stays the same within the subclasses.
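The number of role changes per robot, the quantity compared in this discussion, can be obtained from a time-ordered sequence of logged roles. The sketch below assumes such a sequence is available for each robot; the data layout, the robot labels and the sample values are invented for illustration and do not reproduce the measured logs.

```python
from itertools import groupby


def count_role_switches(role_sequence):
    """Count transitions between consecutive, different roles."""
    # groupby collapses runs of identical consecutive roles into one entry.
    runs = [role for role, _ in groupby(role_sequence)]
    return max(len(runs) - 1, 0)


# Invented samples: a 'generalist' that alternates and a 'specialist' that does not.
samples = {
    "generalist": ["work", "recharge", "work", "recharge", "work"],
    "specialist": ["work", "work", "work", "recharge", "recharge"],
}
switches = {name: count_role_switches(seq) for name, seq in samples.items()}
print(switches)
```

A strongly non-uniform distribution of these counts across robots that share one behaviour model is what the text refers to as the emergence of specialists and generalists.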

Figures 11(c) and 12(a) show the ‘mixed 10’ and the ‘not-knowing 10’ experiments, respectively. ‘Not-knowing’ robots distribute the available energy equally among the membership – everyone receives as much as it needs. In a mixed society, everyone gets as much as the society gains. So, ‘not-knowing’ robots change their roles less frequently and no individualism ever develops. Spontaneous and inertial robots interact and switch more often between roles. Some are specialists in one of the two roles, others are generalists and can do both. Regardless, the choice of which role to play is highly dynamic and situation-dependent, which seems to be the premise for an adaptable and efficient society.

Experiments with large swarms in an extremely energy-constrained environment demonstrated similar role dynamics. Figure 12(b) and (c) show a sample of role dynamics for two experiments using the ‘not-knowing 30’ and ‘mixed 30’ strategies with 30 robots. The ‘not-knowing’ swarm is characterized by a low number of changes, while the ‘limited’ robots demonstrate completely different role dynamics. Since the spontaneous robots recharge only for short periods, the exchange rate at the docking station is high. Because of the irregular distribution of recharging and working roles within the swarm, the robots do not all try to reach the station at the same time.

Figure 12. Role dynamics in experiments with ‘limited’ robots: (a) ‘not-knowing’, (b) ‘not-knowing 30’ and (c) ‘mixed 30’.


7. Conclusions

The comparison of energetic performance between a swarm executing a simple threshold model and a swarm following a bio-inspired model showed that, within the same level of energy homeostasis, the swarm efficiency achieved Φ s was doubled by the use of inertial and spontaneous agents. Real robot experiments confirmed that a bio-inspired approach minimized robots recharging. This optimized and , diminishing the energetic bottleneck and delivering better swarm efficiency. It was also demonstrated that adaptation of individual thresholds for the critical and hungry states Sc and Sh, by changing the priorities of tasks Pr(Task), leads to better collective performance. Finally, the experiments confirmed that a variable swarm density, Dsw  = 3.2–18.63, leads to unscalable behaviour (e.g. a bottleneck around ), but achieves energetic self-regulation through robot death. This mechanism increases collective energetic efficiency. In addition, the following points are considered:

Collective knowledge greatly influences the energy‐foraging performance of robots, whereas exact localization abilities do not have much impact.

Social robots, as in the ‘best’ and ‘average’ strategy, cannot satisfy the requirements of an energy-foraging strategy. They achieve very good swarm efficiency but a poor collective energy level. The high tolerance of the needs of other robots leads to self-destruction. The agents work until exhaustion and do not try to preserve their energy. Thus, many agents ‘die’ energetically and collective homeostasis is achieved only at a very low level.

Egoistic robots, as in the ‘not-knowing’ strategy, are also not able to satisfy the requirements. They maintain their collective energy homeostasis at a high level but the swarm efficiency obtained is minimal. In this society, the agents try to work as little as possible. Although it delivers an excellent energy homeostasis, this society is unusable.

In environments where sufficient energy is available, inertial robots outperform spontaneous robots, since in a society of inertial robots, energy is divided equally.

In environments where energy is constrained, spontaneous robots outperform inertial robots. In a society of spontaneous robots, individuals acquire as much energy as they need.

The study of role dynamics showed that, although following the same behavioural model, some agents preferred to remain in one of two roles. Others swapped roles more frequently. Such behaviour was clearly observed when the swarm had to survive in an energy-constrained environment. Since such collective behaviour is not preprogrammed, we can say it emerges from specific spatial interactions in the swarm. The emergence of specialists and generalists is a very important characteristic of a swarm society and although the question of how it is provoked is hard to answer, passive interactions within the swarm and spatial distribution of agents within the arena are two possibilities.

To summarize, the adoption of bio-inspired spontaneous and inertial behaviours as a strategy for energy foraging based on a kinetic model coordinates the energetic needs of individuals to obtain a better collective performance. This coordination is achieved without central control and complex communication, using a very simple algorithm. The bio-inspired approach addresses only the distribution of working/recharging roles and proves that adaptive role distribution can achieve a major improvement in a swarm's energetic performance.

Acknowledgement

Serge Kernbach and Olga Kernbach were supported by the following grants: ‘SYMBRION’, the GA No. 216342 (Future and Emergent Technologies Proactive); and ‘REPLICATOR’, GA No. 216240 (Cognitive Systems, Interaction, Robotics).

Notes

1. www.swarmrobot.org.

References

  • Kernbach, S. 2011. Handbook of Collective Robotics: Fundamentals and Challenges, Edited by: Kernbach, S. Singapore: Pan Stanford Publishing.
  • I-Swarm. 2003–2007. I-Swarm: Intelligent Small World Autonomous Robots for Micro-manipulation, I-Swarm, Germany: European Union 6th Framework Programme Project No. FP6-2002-IST-1.
  • Kernbach, S. 2008. Structural Self-organization in Multi-Agents and Multi-Robotic Systems, Berlin: Logos Verlag.
  • Sahin, E. 2004. Swarm Robotics: From Sources of Inspiration to Domains of Application, Heidelberg: Springer-Verlag.
  • Lüth, T. 1998. Technische Multi-Agenten-Systeme, München, Wien: Carl Hanser Verlag.
  • Weiss, G., ed. 1999. Multiagent Systems. A Modern Approach to Distributed Artificial Intelligence, Cambridge, MA: MIT Press.
  • Silverman, M., Nies, D., Jung, B. and Sukhatme, G. Staying Alive: A Docking Station for Autonomous Robot Recharging. Proceedings of the IEEE International Conference on Robotics and Automation. 11–15 May, Washington, DC. pp. 1050–1055.
  • Liu, Y., Rafailovich, M., Malal, R., Cohn, D. and Chidambaram, D. 2009. Engineering of Bio-Hybrid Materials by Electrospinning Polymer-Microbe Fibers. Proc. Nat. Acad. Sci. U.S.A., 106: 14201–14206.
  • Jebens, K. 2006. Development of a Docking Approach for Autonomous Recharging System for Micro-Robot ‘Jasmine’, Stuttgart: Studienarbeit, University of Stuttgart.
  • Sempé, F., Muñoz, A. and Drogoul, A. 2002. “Autonomous robots sharing a charging station with no communication: A case study”. In Distributed Autonomous Robotic Systems, Edited by: Asama, H. 91–100. Cambridge, MA: MIT Press.
  • Kornienko, S., Kornienko, O. and Levi, P. Minimalistic Approach towards Communication and Perception in Microrobotic Swarms. Proceedings of the International Conference on Intelligent Robots and Systems (IROS-2005). Edmonton, Canada. 2–6 August. pp. 2228–2234.
  • Lerman, K. and Galstyan, A. 2002. Mathematical model of foraging in a group of robots: Effect of interference. Auton. Robots, 13: 127–141.
  • Ulam, P. and Balch, T. 2004. Using optimal foraging models to evaluate learned robotic foraging behavior. Adapt. Behav., 12: 213–222.
  • McFarland, D. and Spier, E. 1997. Basic cycles, utility and opportunism in self-sufficient robots. Rob. Auton. Syst., 20: 179–190.
  • Spier, E. and McFarland, D. 1998. Possibly optimal decision making under self-sufficiency and autonomy. J. Theor. Biol., 189: 317–331.
  • Seth, A. 2002. Modelling group foraging: Individual suboptimality, interference, and a kind of matching. Adapt. Behav., 9: 67–91.
  • Avila-Garcia, O. and Canamero, L. Using hormonal feedback to modulate action selection in a competitive scenario. From Animals to Animats 8: Proceedings of the 8th International Conference on Simulation of Adaptive Behavior. Los Angeles, CA. 13–17 July. pp. 243–252. MIT Press.
  • Girard, B., Cuzin, V., Guillot, A., Gurney, K. and Prescott, T. Comparing a bio-inspired robot action selection mechanism with winner-takes-all. From Animals to Animats 7: Proceedings of the 7th International Conference on Simulation of Adaptive Behavior. Edinburgh, UK. 4–11 August. pp. 75–94. MIT Press.
  • Gonzalez, F.M., Prescott, T., Gurney, K., Humphries, M. and Redgrave, P. An embodied model of action selection mechanisms in the vertebrate brain. From Animals to Animats 6: Proceedings of the 6th International Conference on Simulation of Adaptive Behavior. Paris, France. 11–15 September. pp. 157–166. MIT Press.
  • Nepomnyashchikh, V. and Podgornyj, K. 2003. Emergence of adaptive searching rules from the dynamics of a simple nonlinear system. Adapt. Behav., 11: 245–265.
  • Lund, H. and Parisi, D. Generalist and specialist behavior due to individual energy extracting abilities. Artificial Life V, Proceedings of the 5th International Workshop on the Synthesis and Simulation of Living Systems. Nara, Japan. Edited by: Langton, C. and Shimohara, K. 16–18 May 1996. pp. 335–345. Bradford Books/MIT Press.
  • Ulam, P. and Balch, T. Niche Selection in Foraging Tasks in Multi-Robot Teams Using Reinforcement Learning. Proceedings of the 2nd International Workshop on the Mathematics and Algorithms of Social Insects. 15–17 December, Atlanta, GA. pp. 161–167.
  • Attarzadeh, A. 2006. Development of advanced power management for autonomous micro-robots, Stuttgart: Master thesis, University of Stuttgart.
  • Kernbach, S., Thenius, R., Kernbach, O. and Schmickl, T. 2009. Re-embodiment of honeybee aggregation behavior in artificial micro-robotic system. Adapt. Behav., 17: 237–259.
  • Wagner, I., Lindenbaum, M. and Bruckstein, A. 1999. Distributed covering by ant-robots using evaporating traces. IEEE Trans. Rob. Autom., 15: 918–933.
  • Schmickl, T. and Crailsheim, K. 2004. Costs of environmental fluctuations and benefits of dynamic decentralized foraging decisions in honey bees. Adapt. Behav., 12: 263–277.
  • Bonabeau, E., Dorigo, M. and Theraulaz, G. 1999. Swarm Intelligence: From Natural to Artificial Systems, New York: Oxford University Press.
  • Labella, T.H., Dorigo, M. and Deneubourg, J.L. 2006. Division of labor in a group of robots inspired by ants’ foraging behavior. ACM Trans. Auton. Adapt. Syst., 1: 4–25.
  • Hill, P., Wells, P. and Wells, H. 1997. Spontaneous flower constancy and learning in honey bees as a function of colour. Anim. Behav., 54: 615–627.
  • Nepomnyashchikh, V., Popov, E. and Redko, V. 2008. Biologically inspired model of adaptive searching behavior. Opt. Mem. Neural Net. (Inf. Opt.), 17: 69–74.
  • Kancheva, T. 2007. Adaptive role dynamics in energy foraging behavior of a real micro-robotic swarm, Stuttgart: Master thesis, University of Stuttgart.
  • Kernbach, O. 2011. The synergetic approach towards analysing and controlling the collective phenomena in multi-agents systems, Stuttgart: Ph.D. thesis, University of Stuttgart.
