Neural networks and genetic algorithms III
With this article I conclude the series dedicated to the application of genetic algorithms to the design of neural networks. I will explain the most relevant code of the sample application provided with these articles, mainly the classes dedicated to gene management and the selection process. You can find more information in the previous articles of the series.
In this link, you can read the first article in this series, where I explain the ideas on which this application is based. In this other link, you can download the source code of the GANNTuning application, written in C# using Visual Studio 2017.
The source code corresponding to the neural network and its training is based on this other article about the backpropagation algorithm, so you can use it as a reference.
The NNGene class
The NNGene class represents the genes. It is in the GANNTuning.Genes namespace. This class implements the IEquatable and IComparable interfaces, so that the genes can be sorted in a list. These genes represent properties of the structure and configuration of the neural network. The number of neurons in the first layer is stored in the _input variable. The number of hidden layers and the number of neurons in each of them are stored in the _hidden array; if this variable is null, there are no hidden layers; otherwise, there are as many hidden layers as elements in the array, and the value of each element is the number of neurons in that layer. The _alpha variable contains the value of the parameter that configures the bipolar sigmoid function of the network's neurons. The rest of the class variables store different network error statistics, which represent its efficiency; the most important is _fcost, the value of the cost function, which is accessed through the Cost property.
The cost function is calculated with the ComputeCost method, which receives the network to be assessed as a parameter. This method returns true if the cost of the network is less than the current cost stored in the gene.
public bool ComputeCost(Network net)
{
    string[] pnet = net.Params.Split(';');
    if ((Input != pnet[0]) ||
        (Hidden != pnet[1]) ||
        (_alpha != net.Alpha))
    {
        throw new Exception("Network and gene params do not match");
    }
    double cost = net.FailRate +
        (0.00005 * (net.Layers - 1) * (net.Neurons - 1));
    if (_fcost > cost)
    {
        _fcost = cost;
        _minerrNN = net.TrainErrorMin;
        _maxerrNN = net.TrainErrorMax;
        _meanerrNN = net.TrainError;
        _miderrNN = net.TrainErrorMedian;
        _dverrNN = net.TrainErrorDev;
        return true;
    }
    return false;
}
The cost calculation is based on the error rate of the network, that is, the percentage of samples that the network has classified incorrectly. This value is penalized according to the complexity of the network, calculated by multiplying the number of neurons by the number of layers, excluding the output layer; in this way, when comparing two networks with the same failure rate, the simpler one is preferred.
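To see the effect of the penalty term, consider two hypothetical networks with the same failure rate; the figures and the helper below are mine, made up for illustration, using the same formula as ComputeCost:

```csharp
using System;

class CostDemo
{
    // Same formula as in ComputeCost: fail rate plus a small complexity
    // penalty based on the number of layers and neurons (output layer excluded).
    static double Cost(double failRate, int layers, int neurons) =>
        failRate + (0.00005 * (layers - 1) * (neurons - 1));

    static void Main()
    {
        // Two hypothetical networks that both misclassify 10% of the samples.
        double small = Cost(0.10, layers: 2, neurons: 10); // 0.10 + 0.00005 * 1 * 9
        double big = Cost(0.10, layers: 4, neurons: 40);   // 0.10 + 0.00005 * 3 * 39

        Console.WriteLine(small < big); // True: the simpler network wins the tie
    }
}
```

The penalty is small on purpose: it only breaks ties between networks of similar accuracy, it never outweighs a real difference in failure rate.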
This class has two constructors. In the first one, the gene is built from an existing network, which makes it possible to calculate the cost function of any neural network:
public NNGene(Network net, bool compute)
{
    _input = net.Inputs;
    _hidden = net.Hidden;
    _alpha = net.Alpha;
    if (compute)
    {
        ComputeCost(net);
    }
}
The second constructor allows creating new genes randomly, or crossing two existing genes:
public NNGene(NNGene g1, NNGene g2, int maxn, int maxh)
{
    Random r = new Random();
    if ((g1 == null) || (g2 == null))
    {
        CreateNewGene(maxn, maxh);
    }
    else
    {
        double p = r.NextDouble();
        if (p < 0.5)
        {
            Cross(g1, g2);
        }
        else
        {
            Cross(g2, g1);
        }
        p = r.NextDouble();
        if ((p < 0.05) || SameAs(g1) || SameAs(g2))
        {
            Mutate(maxn, maxh);
        }
    }
}
The parameters g1 and g2 are the two genes to cross, or null if we want to create a new gene from scratch; maxn is the maximum number of neurons per layer, and maxh is the maximum number of hidden layers.
The CreateNewGene method simply creates a new gene using random values within the range indicated:
private void CreateNewGene(int maxn, int maxh)
{
    Random r = new Random();
    _input = r.Next(3, maxn + 1);
    int nh = r.Next(maxh + 1);
    if (nh > 0)
    {
        _hidden = new int[nh];
        for (int l = 0; l < nh; l++)
        {
            _hidden[l] = r.Next(1, maxn + 1);
        }
    }
    double da = r.NextDouble();
    double sgn = r.NextDouble() > 0.5 ? 1 : -1;
    _alpha = 2 + (sgn * Math.Round(da * da * da, 2));
}
The value of _alpha is centered on 2, with a small random deviation up or down.
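Cubing a value smaller than 1 pulls it toward zero, which is what keeps the deviation small. The sketch below (my own demo, not part of the application) applies the same sampling rule in isolation to show that _alpha always stays within [1, 3]:

```csharp
using System;

class AlphaDemo
{
    static void Main()
    {
        var r = new Random(42); // fixed seed so the run is repeatable
        for (int i = 0; i < 5; i++)
        {
            double da = r.NextDouble();
            double sgn = r.NextDouble() > 0.5 ? 1 : -1;
            // da is in [0, 1), so da^3 is also in [0, 1) but concentrated
            // near zero; alpha therefore lands close to 2.
            double alpha = 2 + (sgn * Math.Round(da * da * da, 2));
            Console.WriteLine(alpha >= 1 && alpha <= 3); // always True
        }
    }
}
```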
Crossing genes
Gene crossing in this implementation is not symmetric, so the crossing order is inverted with a probability of 0.5. The Cross method performs the crossing:
private void Cross(NNGene g1, NNGene g2)
{
    _input = g1._input;
    if (g1._hidden == null)
    {
        _hidden = g2._hidden;
    }
    else
    {
        if (g2._hidden == null)
        {
            _hidden = g1._hidden;
        }
        else
        {
            _hidden = new int[Math.Min(g1._hidden.Length,
                g2._hidden.Length)];
            for (int h = 0; h < _hidden.Length; h++)
            {
                if ((h & 1) != 0)
                {
                    _hidden[h] = g2._hidden[h];
                }
                else
                {
                    _hidden[h] = g1._hidden[h];
                }
            }
        }
    }
    _alpha = g2._alpha;
}
The number of input neurons is taken from the first gene. If one of the genes has no hidden layers, the resulting gene takes the hidden layers from the other one. If both genes have hidden layers, the new gene has as many hidden layers as the parent with fewer of them, and the number of neurons in each layer is taken alternately from the first and second genes. The value of the _alpha parameter is taken from the second gene.
Of course, these rules are arbitrary; I have set them only as an example. In a real problem they can be much more elaborate, designed to maximize the efficiency of the algorithm.
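As a worked example of the crossing rule for the hidden layers, here is a standalone re-implementation of just that part of Cross (the real NNGene fields are private, so this helper is my own sketch):

```csharp
using System;

class CrossDemo
{
    // Same rules as Cross applied to the hidden-layer arrays only:
    // a null parent yields the other parent's layers; otherwise the child
    // has as many layers as the shorter parent, taking even indices from
    // the first gene and odd indices from the second.
    static int[] CrossHidden(int[] h1, int[] h2)
    {
        if (h1 == null) return h2;
        if (h2 == null) return h1;
        var child = new int[Math.Min(h1.Length, h2.Length)];
        for (int h = 0; h < child.Length; h++)
        {
            child[h] = (h & 1) != 0 ? h2[h] : h1[h];
        }
        return child;
    }

    static void Main()
    {
        // Parents with 3 and 2 hidden layers: the child has 2 layers,
        // the first from parent 1 and the second from parent 2.
        int[] child = CrossHidden(new[] { 4, 7, 2 }, new[] { 5, 3 });
        Console.WriteLine(string.Join(",", child)); // 4,3
    }
}
```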
Mutations
If, as a result of the crossing, the resulting gene is equal to one of its parents, or otherwise with a probability of 5%, the new gene is mutated using the Mutate method:
private void Mutate(int maxn, int maxh)
{
    Random r = new Random();
    double da = r.NextDouble();
    double p = r.NextDouble();
    _alpha = Math.Max(0.5, _alpha +
        (p < 0.5 ? Math.Round(da * da * da, 2) :
        -Math.Round(da * da * da, 2)));
    p = r.NextDouble();
    if (p < 0.3)
    {
        _input = r.Next(3, maxn + 1);
    }
    p = r.NextDouble();
    if (p < 0.3)
    {
        int nh = r.Next(maxh + 1);
        if (nh > 0)
        {
            _hidden = new int[nh];
            for (int l = 0; l < nh; l++)
            {
                _hidden[l] = r.Next(1, maxn + 1);
            }
        }
    }
}
The value of _alpha is increased or decreased by a small amount. With a probability of 30%, the number of input neurons is changed and, again with a probability of 30%, the number of hidden layers and the number of neurons in them are randomly modified.
These rules are also arbitrary and are just an example. The process of designing good crossover and mutation algorithms can be laborious. Genes can also manage more parameters and properties of the network than those used in this example. It would also be possible, for example, to have different types of genes with different rules. The system can be made as complex as necessary, in order to optimize the exploration of the entire space of possibilities.
To generate the neural network represented by the gene, the BuildNN method is used:
public Network BuildNN(int inputs)
{
    return new Network(inputs, _input, _hidden, _alpha);
}
Where the inputs parameter represents the number of values in each sample.
Implementation of the genetic algorithm
The selection of the genes is implemented in the asynchronous handler of the Click event of the corresponding command button in the main form, implemented by the MainForm class. Beforehand, you have to load a data set using the CSVReader class, which will be stored in the _data variable. To obtain a sample with a given percentage of the total data, proceed as follows:
int pc = GetTrainParams();
_data.GetSample((pc * _data.MinSampleCount) / 100);
Where pc is the percentage of samples you want to use for training the network, and it is obtained from the configuration controls in the form, along with the rest of the parameters.
The TrainGene method is used to train and evaluate the network generated by a given gene:
private async Task<Network> TrainGene(NNGene gen, int trials)
{
    Network nn = null;
    _network = gen.BuildNN(_data.ValueCount);
    for (int fc = 0; fc < trials; fc++)
    {
        _network.InitializeWeights();
        InitializeFeedback();
        await _network.TrainNetwork(_data.Inputs,
            _data.Outputs,
            _learningrate,
            _momentum,
            _iterations);
        _network.ComputeErrors(_data.Inputs,
            _data.Outputs,
            _data.MaxError);
        if (gen.ComputeCost(_network))
        {
            nn = _network;
        }
    }
    return nn;
}
The parameters of this method are the gene to be evaluated and the number of times the training will be performed. The weights of the network's neurons are initialized randomly each time using the InitializeWeights method of the Network class, which represents the neural network generated by the gene. The efficiency of the network can vary considerably depending on the initial weights, so performing several assessments can be useful.
The TrainNetwork method executes the training itself, performing as many iterations as indicated in the variable _iterations. In each iteration, the input samples are randomly shuffled, which optimizes the convergence of the learning process.
Once the network has been trained, the error statistics are calculated with the ComputeErrors method, and the ComputeCost method of the gene is used to determine whether the cost of the network is less than the last calculated cost. If it is, the improved network is the return value of the method; if not, null is returned.
Going back to the main algorithm, these are the most important variables:
_genes = new List<NNGene>();
double mincost = double.MaxValue;
double maxcost = double.MinValue;
NNGene gene = null;
List<NNGene> nwgen = new List<NNGene>();
int generation = 0;
_genes is a global variable of type List<NNGene>, which contains the entire gene population. The local variables mincost and maxcost hold the minimum and maximum costs found in the last generation, used to report the evolution of the algorithm in the user interface. The local variable gene is used to create new genes, and nwgen is the list of genes of the new generation, which will replace the worst genes of the previous one. The local variable generation is the generation counter.
The gene population is initialized as follows:
for (int i = 0; i < pop; i++)
{
    gene = new NNGene(null, null, maxnn, maxh);
    _genes.Add(gene);
    Network nn = await TrainGene(gene, _trials);
    if ((nn != null) && (bestgene.Cost > gene.Cost))
    {
        bestnn = nn;
        bestgene = new NNGene(nn, true);
    }
    mincost = Math.Min(mincost, gene.Cost);
    maxcost = Math.Max(maxcost, gene.Cost);
    tbResults.Text = "Population: " + _genes.Count.ToString();
}
From here, in each iteration, a new generation is created using the following rules:
- The first 3 best genes are crossed, each with the next 5 best. In total, 15 new genes are generated.
- The first 3 best genes are crossed, each with 10 randomly selected genes from the entire population. In total 30 new genes are generated.
- 25% of the genes, randomly selected, are crossed with another gene, also randomly selected. The number of new genes will be 25% of the total population.
- A number of genes equal to 25% of the total population are generated at random.
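Under these rules, the size of the new generation is fully determined by the current population size; this small helper (my own, not part of the application) adds up the counts:

```csharp
using System;

class NewPopDemo
{
    // 15 genes from crossing the 3 best with the next 5 best,
    // 30 from crossing the 3 best with 10 random genes each,
    // plus two blocks of 25% of the population (random crossings
    // and brand-new random genes).
    static int NewGenesPerGeneration(int population) =>
        15 + 30 + (population / 4) + (population / 4);

    static void Main()
    {
        Console.WriteLine(NewGenesPerGeneration(100)); // 95
        Console.WriteLine(NewGenesPerGeneration(200)); // 145
    }
}
```

Note that roughly half of the new genes involve randomness, which keeps the algorithm exploring the search space instead of converging too early on the current best genes.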
This fragment of code, for example, corresponds to the first of the previous operations:
for (int g = 0; g < 3; g++)
{
    for (int ix = 1; ix <= 5; ix++)
    {
        gene = new NNGene(_genes[g], _genes[g + ix], maxnn, maxh);
        nwgen.Add(gene);
        Network nn = await TrainGene(gene, _trials);
        if ((nn != null) && (bestgene.Cost > gene.Cost))
        {
            bestnn = nn;
            bestgene = new NNGene(nn, true);
        }
        mincost = Math.Min(mincost, gene.Cost);
        maxcost = Math.Max(maxcost, gene.Cost);
        tbResults.Text = "Min Cost: " +
            mincost.ToString() +
            " / Max Cost: " + maxcost.ToString() +
            " / Gen.: " + generation.ToString() +
            " / New Pop.: " + nwgen.Count.ToString();
    }
}
Next, a number of genes of the current population equal to the size of the new population are eliminated, selecting the genes with the worst scores:
int sup = Math.Max(0, (_genes.Count - nwgen.Count));
_genes.RemoveRange(sup, _genes.Count - sup);
The remaining genes are trained again, to give them a new opportunity to score better, and the new population is added to the list, to restart the process.
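The replacement step amounts to keeping the best-scoring genes and appending the newcomers. Sketched on a plain list of costs sorted best first (the real code works on sorted NNGene objects; the helper is mine):

```csharp
using System;
using System.Collections.Generic;

class ReplaceDemo
{
    // Drop as many of the worst genes as there are newcomers,
    // then append the new generation; the population size is unchanged.
    static List<double> Replace(List<double> genes, List<double> nwgen)
    {
        int sup = Math.Max(0, genes.Count - nwgen.Count);
        genes.RemoveRange(sup, genes.Count - sup);
        genes.AddRange(nwgen);
        return genes;
    }

    static void Main()
    {
        // Population of 5 costs, best first, and 2 newcomers.
        var genes = new List<double> { 0.10, 0.12, 0.20, 0.35, 0.50 };
        var nwgen = new List<double> { 0.15, 0.18 };
        Replace(genes, nwgen);

        Console.WriteLine(genes.Count);          // still 5
        Console.WriteLine(genes.Contains(0.35)); // False: the worst were removed
    }
}
```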
Again, these rules are arbitrary. They are selected only as an example. In a real implementation, they must be defined in a thoughtful way, and based on the characteristics of the problem in which we are working.
That's all for this series on genetic algorithms applied to neural networks. There is much to improve in the procedures I have explained, but they can serve as a base for building more serious applications, or just for playing around a bit with this technology.