Swarm Intelligence with Evolutionary Learning for Unmanned Vehicle Control

Dr. David P. Brown, Senior Principal Analyst

Unmanned vehicles continue to expand their roles in many applications.  This trend will likely accelerate, as Congress has directed the FAA to develop a plan for the safe integration of unmanned aircraft into the U.S. national airspace system by September 30, 2015.  To operate autonomously, unmanned vehicles must be capable of making complex decisions.  One method receiving increased attention for solving these types of complex problems is swarm intelligence, an approach derived from the collective behavior of biological organisms such as ants, bees, and birds.  It may provide superior results at much lower cost than rule-based programming.  Since unmanned vehicles are not birds or ants, their decision making must be optimized to their sensors and other onboard systems.  Again, nature provides a model for optimizing these systems: evolutionary learning.  The approach also extends to applications far beyond those found in nature by enabling system-of-systems logic and control.

Swarm Intelligence

Swarm intelligence is used by many types of social creatures in nature, such as colonies of ants and bees.  While an individual ant or bee is a very simple creature with little actual intelligence or problem-solving skill, the colony, working as a self-organizing group with no leader or command-and-control system, can perform very complex tasks through collective local interaction with its environment.  Additionally, by working as a team these insects create synergies well beyond their individual capabilities.  As an example, ants working together as a group have been found to carry over 10 times more weight than the same number of ants could carry individually.[i]

Within nature, species that use swarm intelligence can self-organize to efficiently search for and locate a food source and then recruit other members to exploit it.  If they sense a threat, they can “mark” it (using pheromones), signaling danger to other members and enlisting their help in defending the colony.  Many species have specialized castes dedicated to specific tasks.  While members of a caste will be the first to perform their specialty task when it is required, they will assist with other tasks when there is no need for their specialty.

Applications

There are a number of possible applications for swarm intelligence in both civil and military unmanned vehicle operations.  Similar to social insects searching for food, swarm intelligence may be useful for moving vehicles in a group, missions requiring search over a large area by multiple units, tracking items of interest, and offensive or defensive military operations.  Unmanned vehicles come in a variety of types, each with its own mission specialties and strengths, and operations may benefit from the self-organizational aspects of swarm intelligence across these different vehicles.  Swarm intelligence would likely provide the most value if implemented in unmanned vehicles that are currently in development.  It may not be feasible or tactically desirable to remain in constant contact with these systems, so the vehicles must have at least some autonomous capabilities.  It is also possible that operating multiple vehicles together using swarm intelligence might gain synergies of effectiveness, similar to the ants cited above, that would far exceed the capabilities of any individual vehicle.  An added advantage is that since individual elements of a swarm do not need complex decision processes (complexity is derived through group action), these systems should take less time to develop and cost less to acquire.  Additionally, use of artificial intelligence allows these systems to evolve in response to an adaptable enemy in ways that systems using rule-based programming cannot.

Artificial Intelligence Techniques for Swarm Agents

One of the key questions for implementing swarm intelligence in machines is determining the best mechanism.  There has been considerable research on social insects, and researchers have developed algorithms that model individual behaviors mimicking these biological units.  These algorithms can generally be classified as rule-based techniques, as each unit is programmed with a set of mathematical formulae and a set of rules for making decisions.  While it has been shown that complex behaviors can be produced by combining multiple simple behaviors, such an approach can result in large programming requirements.  Additionally, these elements would lack the evolutionary adaptability of their biological counterparts, which is essential to the survival of these species in nature.  Case-based reasoning, a generalization of rule-based reasoning, belongs to the same family of techniques.

A more promising approach is the use of artificial intelligence to create the individual elements of the swarm.  One AI technique adaptable to implementing a swarm is the Bayesian network.  A Bayesian network consists of a directed acyclic graph that represents dependencies among variables, together with local probability distributions defined for small clusters of directly related variables.  The graph's nodes represent the variables, and its arcs (directed edges) describe cause-and-effect relationships or statistical associations between the variables.[ii]  When decision and utility nodes are added to the graph to represent the available options and the utilities of their consequences, the result is referred to as an influence diagram.
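To make the idea concrete, the sketch below implements a two-node Bayesian network in plain Python and computes a posterior by direct enumeration.  The variables (FoodNearby, ScentDetected) and all probability values are illustrative assumptions, not values taken from the network described later in this article.

    # Minimal Bayesian network: FoodNearby -> ScentDetected.
    # One arc in the DAG; each node carries a local probability table.
    # All numbers below are assumed for illustration.

    p_food = {True: 0.2, False: 0.8}              # P(FoodNearby)
    p_scent_given_food = {True: 0.9, False: 0.1}  # P(ScentDetected | FoodNearby)

    def posterior_food(scent_detected):
        """P(FoodNearby | evidence) by enumerating the joint distribution."""
        joint = {}
        for food in (True, False):
            p_scent = p_scent_given_food[food]
            if not scent_detected:
                p_scent = 1.0 - p_scent
            joint[food] = p_food[food] * p_scent
        return joint[True] / (joint[True] + joint[False])

    print(posterior_food(True))    # ~0.69: a scent raises belief that food is near
    print(posterior_food(False))   # ~0.03: no scent lowers it

An influence diagram extends exactly this structure with decision and utility nodes, as sketched later for the ant's turn decision.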

Optimization through Evolutionary Learning

When constructing Bayesian networks, values are typically obtained from learning cases or elicited from experts.  As an alternative, Innovative Decisions, Inc. (IDI) has pioneered the capability to train artificial intelligence applications through interaction with modeling and simulation.  Using this approach, networks make random decisions during multiple simulation scenarios.  Based on the outcome of each scenario, they receive positive and/or negative feedback.  Over many simulation runs, the artificial intelligence applications adjust their numeric values through an evolutionary optimization algorithm as they learn how to make good decisions.  One research project demonstrated that a Bayesian network trained with this technique was better able to identify and classify aircraft during simulated surface-to-air engagements than a Bayesian network using probabilities elicited from a group of U.S. Navy surface warfare officers.[iii]
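IDI's actual training algorithm is not detailed here; the sketch below shows only the general shape of such an evaluate-and-adjust loop.  The stand-in run_scenario scoring function, the single tuned value, and the mutation size are all assumptions for illustration, not the published method.

    import random

    def run_scenario(value):
        """Stand-in for one simulation run: returns a noisy score (higher is
        better) with an optimum at value = 0.3 -- purely illustrative."""
        return -(value - 0.3) ** 2 + random.gauss(0.0, 0.01)

    def evolve(generations=200, runs_per_eval=50):
        """Keep a best-so-far decision value; mutate it, re-test it over many
        scenario runs, and keep the mutant only on positive feedback."""
        best = random.random()
        best_score = sum(run_scenario(best) for _ in range(runs_per_eval))
        for _ in range(generations):
            candidate = min(max(best + random.gauss(0.0, 0.05), 0.0), 1.0)
            score = sum(run_scenario(candidate) for _ in range(runs_per_eval))
            if score > best_score:
                best, best_score = candidate, score
        return best

    print(evolve())   # converges near 0.3 after many simulated scenarios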

SIM ANT Demonstration

In order to demonstrate the utility of this approach, a simulation was created that mimics a group of ants collectively searching for food.  The simulation can be run with one to four ants in the search area.  Ants search for food by using their antennae which provide both touch and smell.  Ants search as a group by leaving a pheromone trail as they move.  If an ant smells pheromones, it knows this area has already been searched and will move away from the smell to search in an uncovered area.  If an ant smells food, it moves in a direction toward stronger scents until it finds the food.  Ants use their antennae to locate and avoid obstacles. 
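One simple way to represent the shared "this area has been searched" signal is a pheromone map over grid cells.  The sketch below is an assumed data structure for illustration, not the implementation used in the demonstration.

    from collections import defaultdict

    pheromone = defaultdict(float)        # (x, y) cell -> scent level

    def lay_pheromone(cell, amount=1.0):
        """Each ant marks the cell it moves through."""
        pheromone[cell] += amount

    def least_searched(neighbours):
        """Antennae sample nearby cells; prefer the one with the least
        pheromone, i.e. the direction of uncovered ground."""
        return min(neighbours, key=lambda c: pheromone[c])

    lay_pheromone((5, 5))
    print(least_searched([(4, 4), (5, 5), (6, 4)]))   # -> (4, 4): avoids the marked cell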

For the simulation, the ants are confined to a 100 by 100 enclosed area that contains five obstacles.  The start location, food location, and obstacle locations are randomly positioned at the start of each simulation, as shown in figure 1.

Figure 1: Simulation Setup
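A minimal version of this randomized setup can be expressed in a few lines of Python.  For simplicity the sketch treats each obstacle as a single grid cell, which is an assumption; the article does not specify the obstacle shapes.

    import random

    GRID = 100          # 100 x 100 enclosed search area
    N_OBSTACLES = 5

    def setup_arena():
        """Randomly place start, food, and obstacles for one simulation run."""
        occupied = set()
        def place():
            while True:
                cell = (random.randrange(GRID), random.randrange(GRID))
                if cell not in occupied:
                    occupied.add(cell)
                    return cell
        start, food = place(), place()
        obstacles = [place() for _ in range(N_OBSTACLES)]
        return start, food, obstacles

    print(setup_arena())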

As an ant searches for food, it must decide which direction to go based on the following hierarchy:

  • Turn away from obstacles or the search area boundary
  • Turn toward stronger smells of food
  • Turn away from the smell of pheromones

The decision hierarchy is controlled by the assignment of values in the utility nodes of the network.  The search decision network is shown in figure 2, and a simplified sketch of the utility weighting follows it.

Figure 2: Search Decision Network
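Collapsed to its essentials, the hierarchy can be enforced by choosing utility values whose magnitudes keep each tier dominant over the tiers below it, then selecting the action with the highest utility.  The sketch below is a deterministic caricature of the decision network in figure 2; the specific numbers are assumptions.

    # Tiered utilities: obstacle avoidance dominates food scent, which
    # dominates pheromone avoidance (values are assumed for illustration).
    U_OBSTACLE  = -100.0   # penalty for moving toward an obstacle/boundary
    U_FOOD      =   10.0   # reward per unit of food-scent gradient
    U_PHEROMONE =   -1.0   # penalty per unit of pheromone scent

    def choose_move(sensors):
        """sensors maps each candidate action to the (obstacle, food_scent,
        pheromone) readings from the antenna pointing that way."""
        def utility(action):
            obstacle, food, pheromone = sensors[action]
            return U_OBSTACLE * obstacle + U_FOOD * food + U_PHEROMONE * pheromone
        return max(sensors, key=utility)

    # Obstacle dead ahead, faint food scent right, pheromone trail left:
    readings = {"forward": (1, 0.0, 0.0),
                "left":    (0, 0.0, 0.2),
                "right":   (0, 0.1, 0.0)}
    print(choose_move(readings))   # -> "right"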

The simulation was constructed using ExtendSim, a modeling and simulation tool from Imagine That Inc.  The decision network is interfaced to the simulation through a custom library of simulation blocks.  At each step in the simulation, each ant uses its antennae to feel for obstacles and to smell for food or pheromones.  The antennae inputs are fed into the decision network as the node inputs shown in the top row of figure 2.  The network then calculates whether to continue forward or to turn left or right on the next move.  The process repeats until one ant moves into the same cell as the food.
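The control flow of that loop, independent of ExtendSim and of the network itself, looks roughly like the sketch below.  The decide callback stands in for the Bayesian decision network, and the grid is shrunk from 100 x 100 so the random-turn demo finishes quickly; both are assumptions for illustration.

    import random

    GRID = 30   # smaller than the article's 100 x 100 so the demo runs fast

    HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]   # N, E, S, W

    def step(pos, heading, turn):
        """Apply a turn (-1 left, 0 straight, +1 right), then move one cell,
        clamped to the boundary of the enclosed area."""
        heading = (heading + turn) % 4
        dx, dy = HEADINGS[heading]
        return (min(max(pos[0] + dx, 0), GRID - 1),
                min(max(pos[1] + dy, 0), GRID - 1)), heading

    def search(n_ants, food, decide):
        """Repeat sense/decide/move for every ant until one reaches the food."""
        ants = [((random.randrange(GRID), random.randrange(GRID)),
                 random.randrange(4)) for _ in range(n_ants)]
        moves = 0
        while True:
            for i, (pos, heading) in enumerate(ants):
                ants[i] = step(pos, heading, decide())
                moves += 1
                if ants[i][0] == food:
                    return moves

    food = (random.randrange(GRID), random.randrange(GRID))
    print(search(4, food, lambda: random.choice((-1, 0, 1))))   # total moves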

Baseline performance for searches using one to four ants is shown in figure 3.  Each data point represents the average of 10,000 simulations.

Figure 3: Baseline Search Performance

The total number of moves is roughly the same for all numbers of ants.  Search time improves as ants are added, but each additional ant provides less improvement than the last, indicating that a point will eventually be reached where adding another ant provides no improvement.

When the simulation was run, the simulated ants found the food but did not search like ants in nature.  They tended to move in straight lines until they sensed something that caused them to turn.  Anyone who has observed real ants searching for food has noticed that they use a random, zig-zag search pattern.  To improve the performance of the simulated ants, they must use a similar pattern.  The problem is that the patterns appear completely random, defying a direct mathematical description, and we cannot ask the ants how they search.

Evolutionary Learning

The unique combination of the ExtendSim simulation software with a Netica Bayesian network allows for discovery of the optimal search pattern using the same technique as the ants: evolutionary learning.  The ants are now allowed to turn left or right at random intervals.  The primary variable to be solved is how long an ant should move in a set direction before changing direction.  Each ant selects a random probability of turning in the absence of any smell or obstacle inputs and tests that probability over multiple runs.  It then selects other probabilities and tests them over multiple runs, keeping track of which ones allowed it to find the food in the shortest time with the fewest total moves.  As mentioned earlier, the food, start position, and obstacles are randomly placed in the search area during each run.  This is necessary so the ants learn the optimal search pattern rather than a single optimal search for one set of conditions.  The simulation continues until the ants can no longer significantly improve upon the best solution.  In nature, this is how biological species have learned to accomplish tasks.  Fortunately, with modern computers we can accomplish hundreds or even thousands of years of evolutionary development in hours.

Test and feedback remain critical elements throughout the learning process.  For example, it was discovered that if a Bayesian network calculates an equal value for turning left or right, it will always select the first option on its list of decision options.  If right appears as a decision option before left, the ants will move in a clockwise circle.  To eliminate this problem, which is an artificiality of the software, a turn bias node was added (see figure 2) so that if the ant turns right, all other things being equal, it will be biased to turn left on the next turn.
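The tie-breaking artifact and its fix can be shown in a few lines.  This sketch uses a plain Python max() in place of Netica's option ordering, and a tiny bias constant; both are illustrative assumptions rather than the article's actual implementation.

    def decide(utilities, last_turn):
        """Pick the highest-utility action.  A bare max() breaks ties by
        option order, so equal left/right values always return the first
        option and the ant circles.  The bias term mimics the turn bias
        node: after a right turn, prefer left on the next tie (and vice
        versa).  BIAS is kept small so it never overrides a real preference."""
        BIAS = 1e-6
        biased = dict(utilities)
        if last_turn == "right":
            biased["left"] += BIAS
        elif last_turn == "left":
            biased["right"] += BIAS
        return max(biased, key=biased.get)

    flat = {"right": 0.0, "left": 0.0, "forward": 0.0}   # no sensor inputs
    print(decide(flat, None))      # -> "right": first option always wins the tie
    print(decide(flat, "right"))   # -> "left": the bias breaks the circling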

Simulation Results

After insertion of the bias node and completion of evolutionary learning, the simulated ants exhibited search patterns similar to real ants as shown in figure 4.

Figure 4: Four Ant Search Optimization

The optimized ants were again run through 10,000 simulations, with the results shown in figure 5.

Figure 5: Comparison of Search Patterns

As can be seen in figure 5, evolutionary learning of an optimized search pattern allows the ants to find the food using 30% fewer moves, with much more consistent results.  The mean time to find the food also improved significantly; one ant using the optimized search pattern found the food nearly as quickly as two ants using the baseline, non-optimized search.

Optimizing Systems of Systems

While use of swarm intelligence for individual and collective unmanned vehicle control holds great promise, the methods described above also offer exciting possibilities for capabilities far beyond those found in nature.  While both ants and bees have developed swarm intelligence patterns allowing them to perform different functions that collectively exhibit complex behaviors, they still operate as separate biological systems.  An interesting question is how much better these two species (or any number of biological creatures) could perform if they worked together, leveraging their different capabilities in a system of systems.  For example, how much faster could ants find food if bees provided aerial reconnaissance, giving them situational awareness of the search area?  What if bees and ants built their nests in the same location and mounted a collective defense against threats, enabling both ground and air detection of intruders and a combined air/ground response?  While such experiments cannot be performed in nature, they can easily be accomplished using the simulation technique described above.  System-of-systems optimization can be performed just as easily as single-system optimization by simulating the capabilities of the different vehicles and optimizing them to work together toward collective goals.

[i] Bonabeau, E., Dorigo, M., and Theraulaz, G. (1999). Swarm Intelligence: From Natural to Artificial Systems. Oxford University Press.

[ii] Jensen, Finn V. (1996). An Introduction to Bayesian Networks. UCL Press Limited, London, UK.

[iii] Brown, David. (2008). Rapid, Low Cost Modeling and Simulation: Integrating Bayesian and Neural Networks as an Alternative to an Equation-Based Approach. VDM Verlag Dr. Müller, Saarbrücken, Germany.