Original Articles

Flexible word meaning in embodied agents

Pages 173-191 | Published online: 20 May 2008

Figures & data

Figure 1. Sony QRIO humanoid robots play a language game about physical objects in a shared scene.

Figure 2. Increasing complexity in the nature of the coupling between form and meaning. Hypothetical example lexicons of one agent are shown for four different models of lexicon formation. Line widths denote different connection weights (scores). (a) One-to-one mappings between names and individuals in the naming game. There can be competing mappings involving the same individual (synonyms). (b) One-to-one mappings between words and single features in guessing games. In addition to synonymy, there can be competing mappings involving the same words (homonymy). (c) Many-to-one mappings between sets of features and words. In addition to synonymy and homonymy, words can be mapped to different competing sets of features that partially overlap each other. (d) Associations as proposed in this article. Competition is not explicitly represented but words have flexible associations to different features that are shaped through language use.

Figure 3. Visual perception of an example scene for robots A and B. At the top, the scene as seen through the cameras of the two robots and the object models constructed by the vision system are shown. The coloured circles denote objects, the width of the circles represents the width of the objects and the position in the graph shows the position of the objects relative to the robot. Black arrows denote the position and orientation of the two robots. At the bottom, the features that were extracted for each object are shown. Since the two robots view the scene from different positions and under different lighting conditions, their perceptions of the scene, and consequently the features extracted from their object models, differ. Features that differ between the two robots are shown in italics.

Figure 4. A possible representation for the word ‘dog’ in English. Every feature associated with the form ‘dog’ is scored separately.
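
The representation described in the Figure 4 caption, one word form with a separate score per associated feature, can be sketched as a small data structure. This is only an illustrative sketch: the class name `Word`, the method `features_above`, the feature names, and all score values are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Word:
    """A word form whose associated features are each scored separately."""
    form: str
    # Maps a feature name to its own association score (hypothetical values).
    feature_scores: Dict[str, float] = field(default_factory=dict)

    def features_above(self, threshold: float) -> List[str]:
        """Return features scoring above `threshold`, strongest first."""
        ranked = sorted(self.feature_scores.items(),
                        key=lambda item: item[1], reverse=True)
        return [name for name, score in ranked if score > threshold]


# Hypothetical example: feature names and scores are invented for illustration.
dog = Word("dog", {"animal": 0.9, "four-legged": 0.8, "barks": 0.7, "brown": 0.2})
strong = dog.features_above(0.5)
```

Because each feature carries its own score, a single association can be strengthened or weakened without replacing the whole meaning of the word.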

Figure 5. Flow of one language game. A speaker and a hearer follow a routinised script. The speaker tries to draw the attention of the hearer to a physical object in their shared environment. Both agents are able to monitor whether they reached communicative success and thus learn from the interaction by pointing to the topic of the conversation and giving non-linguistic feedback. Populations of agents gradually reach consensus about the meanings of words by taking turns being speaker and hearer in thousands of such games.

Figure 6. Dynamics of the language games in a population of 25 agents averaged over 10 runs of 50,000 interactions. Values are plotted for each interaction along the x-axis. Communicative success: for each successful interaction (the hearer understands the utterance and is able to point to the object that was chosen as topic by the speaker), the value 1 is recorded; for each failure, 0. Values are averaged over the last 100 interactions. Average lexicon size: the number of words each agent knows is averaged over the 25 agents of the population. Lexicon coherence: this is a measure of how similar the lexicons of the agents are. For each word form known in the population, the similarity function described in Section 3.1 is applied to all pairs of words known by different agents and the results are averaged. The value 1 means that all 25 agents have identical lexicons, −1 means that they are completely different (each agent associates completely different feature sets to each word form), and the value 0 means that the number of shared and non-shared features in the words of different agents is equal. Error bars are standard deviations across the 10 different experimental runs.
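
Two of the measures in the Figure 6 caption, communicative success averaged over the last 100 interactions and average lexicon size, can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, and averaging over fewer games while fewer than 100 have been played is an assumption. Lexicon coherence is left out because it depends on the similarity function of Section 3.1, which is not reproduced here.

```python
from collections import deque
from typing import Deque, Iterable, List, Sequence, Set


def running_success(outcomes: Iterable[bool], window: int = 100) -> List[float]:
    """For each interaction, average the 1/0 outcomes of the last `window` games."""
    recent: Deque[int] = deque(maxlen=window)
    total = 0
    averages: List[float] = []
    for outcome in outcomes:
        value = 1 if outcome else 0  # 1 for success, 0 for failure
        if len(recent) == window:
            total -= recent[0]  # oldest value is about to be dropped
        recent.append(value)
        total += value
        # Assumption: early on, average over the games played so far.
        averages.append(total / len(recent))
    return averages


def average_lexicon_size(lexicons: Sequence[Set[str]]) -> float:
    """Mean number of word forms known per agent."""
    return sum(len(lexicon) for lexicon in lexicons) / len(lexicons)
```

For example, `running_success([True, False, True, True], window=2)` yields `[1.0, 0.5, 0.5, 1.0]`, and three agents knowing 2, 1, and 3 words give an average lexicon size of 2.0.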

Figure 7. The meanings of the first three words of agent 1 (out of a population of 25 agents) and the corresponding meanings in the lexicons of agents 2, 3, and 4 after 10,000 interactions. The numbers on the right side are scores of the association to the feature.

Figure 8. Examples of flexible word meanings. A population of 25 agents played 50,000 language games. Each graph shows, for one particular word in the lexicon of agent 1, the strength of the association to different features. In order to keep the graphs readable, the agents have access only to a subset of the 10 sensory channels (width, height, luminance, green-red, yellow-blue).

Figure 9. (a) The interpretation performance of one new agent that is added to a stabilised population. For each word this agent adopts, the communicative success at the first, second, third, etc. exposure is measured and averaged over all the words in the lexicon of that agent. (b) The impact of the different perceptions on the lexicon: for each sensory channel, the average association score for channel features is shown, given all words in the population. In the legend, for each channel the average difference between the perception of robots A and B for all scenes in the data set is shown.

Figure 10. The effect of the amount of structure in a simulated world on the structure of the emerging language. Features are represented as nodes in a directed graph, and feature nodes that are connected by edges will occur together in simulated perceptions of the world. (a)–(c) The co-occurrence graphs used in conditions 1 (highly unstructured world), 3, and 5 (highly structured world). (d) The average number of features associated with each word for conditions 1–5. Values are averaged over all words in the population. Error bars are standard deviations over 10 repeated series of 50,000 language games each.
