There are many algorithms that fall into the category of CAS; Genetic Algorithms (definition; my C# sample; another good C# sample) are one, Neural Networks are a commonly explored one as well (sample).   However, there are many thousands of algorithms (and variants thereof) that qualify as complex adaptive systems. Most of them have at least one key requirement in common; they require high performance platforms to produce meaningful output in a reasonable time span.

Here’s an example of how high the sheer number of calculations can be for a real-world application, but don’t be too concerned about following what each step here means – just know each is a factor of the final number of calculations.  I am using a fully connected neural network with 40 input neurons, 2 hidden layers and 8 neurons per layer, plus one output neuron, which equals 409 weighted connections between the nodes.  Let’s assume 2 calculations per connection (weight multiplication and addition to the total).  Okay, now I create 20 instances of this (each being an ‘agent’ that will compete with the others while searching for the goal I’ve assigned them).  The estimated number of calculations each agent will do each ‘turn’ for those original 409 values is, conservatively, 12.  So far, that means I’ll have to do approximately 196,320 operations just to run ONE iteration on ONE test data set.  Consider that, in this case, I have over 4000 test data sets (individual stocks) across 1800 days of data and that I will typically want to do 1000 iterations.  This factors up to having to do about 1500 Trillion floating-point operations to exhaustively train the above system.  This would take many weeks on the fastest of current single-processor systems.  Of course, there are methods of eliminating some of these calculations to make the tests work, but this gives you one small example the scope of the problem.

In this system, as in many others, the issue is that the algorithms operate on an analogy with living processes and they contain an implication of parallel execution.  The training algorithm I use is based on the swarming activity of insects – of course, each bug has its own ‘processor’ and a real insect swarm operates in a parallel fashion.  In PCs, obviously the number of processors is going to be limited and at some point each ‘bug’ or ‘agent’ will be sharing a processor with its peers.  Parallel computing in C#/.NET is possible by threading operations using the System.Threading namespace to run independent agents on their own Thread and processor.  And of course you can always achieve parallelism by reaching out beyond your machine to other networked machines with .NET Remoting.

(On a side note, in the mid 1980’s there was a development of an extremely parallel-processing machine called The Connection Machine which was designed for AI algorithms and ran with 65,536 (216) processors.)

So the bottom line is that we want C# to be as efficient as possible.  Fortunately, it appears performance was a big concern for the .NET and C# teams, as most performance tests seem to bear out.  I’ll point to the  C# performance comparison because the tests are at the level of intensity of many evolutionary algorithms and because it breaks performance down by type of calculation.  Double math and trig are the common types of calculation you’ll see in algorithms.  The double calculations should be obvious, but I’ll point out that trig/transcendental functions are often used to introduce nonlinearity into algorithms and are an important factor in evaluating performance.

It’s my opinion that raw C# performance is good enough for most everything you’d ever want to do, even in the processor-hogging world of CAS.  But there is another critical performance factor that is not as obvious but that gives C# a distinct ease-of-use advantage --- IMO.  In C#, we primarily deal with objects that represent the concept or entity we are logically modelling.  Certainly at the method level we write procedural/imperative code to create methods, but a lot of C# work is object-oriented.  The more OO you do, the more concepts you turn into objects, the more method calls you are going to do.  Just look at the stack trace from an ASP.NET page request; you can go perhaps dozens of method calls deep at times and create hundreds of objects for a single request.  While C# and .NET handle this extremely well, the fact is that tons of nested virtual method calls can have an impact on performance when you are working at the extremes CAS algorithms require.   My prime example is a neural network.   If a neural network is implemented in an OO way, it usually has a core node-like object that represents a neuron and which can be linked to other neurons.  Each neuron will generally have some sort of method that gets the output value for that neuron.  Getting the output for a single neuron usually means getting the value for all the underlying neurons it is linked to.  What this turns into, then, is a vast number of recursive method calls to evaluate a single neuron and all its children, grandchildren etc., down to the inputs.

C# along with the .NET framework has at least two major methods of creating dynamic code at runtime (Reflection.Emit and CodeDOM) that will help get around the performance issues of many nested virtual function calls.  What they allow you to do is take loose, object-oriented structures that have been arbitrarily composed at run-time and compile them into a highly optimizable/cacheable single method.  This is exactly what I set out to do in my ArithEmitic project, which simplifies that process of using Reflection.Emit to create a single method from an object graph of interconnected objects.  I use it in my compilable neural network project in which you can assemble Neurons into a feed-forward network, then compile them into a single method (no recursive calls) which the .NET JIT will optimize and run much faster than the nested virtual method calls the object graph of neurons requires.   How much faster?  I took a simple neural net of 8 input neurons and 2 hidden layers of 8 neurons each, fully connected.  It ran in just over 5 seconds ‘uncompiled’ – or 37,643 activations per second.  When I emit a single method that calculates the same thing, it runs in 0.29 seconds or 667,103 – almost 18 times faster.  This kind of gain is possible because by compiling at run time, you are removing all flexibility from the structure and nailing down exactly what it is you want calculated. 

Beyond my performance example, C#’s dynamic code generation opens the doors to all kinds of potentially self-modifying and growing code applications, which are key to some areas of CAS.  There are few limits on what can be accomplished using these dynamic aspects of the language.

Given C#’s native performance characteristics, multithreading and remoting capabilities, and it’s relatively easy options for generating and compiling dynamic code, I feel safe in concluding that C# is a very good choice for implementing most any type of complex adaptive system.  Certainly I have found it to be.