Genetic Algorithms and Neural Networks Based Optimization Applied to the Wastewater Decolorization by Photocatalytic Reaction

This paper proposes a procedure based on a genetic algorithm and a neural network to estimate the optimal conditions for a dyestuff wastewater treatment process consisting of a heterogeneous photocatalytic oxidation. A simulated dyestuff effluent containing the azo dye Reactive Black 5 is decolorized by a photocatalytic reaction using TiO₂ P-25 as catalyst in the presence of Fe³⁺ and H₂O₂. A simple feed forward neural network with one hidden layer was designed and used to predict the evolution in time of the decolorization of this type of wastewater. The neural model was included in an optimization procedure solved with a simple genetic algorithm. The goal of the optimization is to calculate the optimal reaction conditions (illumination time and amounts of reagents) that ensure an imposed value for the transmittance.

Heterogeneous and homogeneous solar photocatalytic detoxification methods (TiO₂/H₂O₂, Fe³⁺/H₂O₂) have recently shown great promise for the treatment of industrial wastewater, groundwater and contaminated air. Additionally, the semiconductor-mediated photocatalytic process has also shown great potential for the disinfection of air and water, thus making possible a number of applications [1][2][3]. A general description of heterogeneous and homogeneous photocatalysis under artificial or solar irradiation is presented in several excellent review articles [4][5][6][7].
The photocatalytic degradation of azo dyes containing different functionalities using TiO₂ or the photo-Fenton reagent as photocatalysts in aqueous solutions under solar and UV-A irradiation has been described in a series of studies [8][9][10]. The combination of both processes, as reported by other authors [11,12], enhances the removal rate of the pollutants because Fe³⁺ ions and H₂O₂ act as scavengers of the electrons photogenerated in the conduction band of TiO₂, while the Fe²⁺ ions produced can participate again in the Fenton reaction.
The phenomenological treatment of such photochemical systems is very complex. In general, the rate of reaction in heterogeneous photocatalytic systems is a complex nonlinear function of catalyst loading, light intensity, initial solution pH, reactant and oxidant concentrations, etc. For these reasons, the ability of systems such as neural networks (NN) to recognize and reproduce cause-effect relationships through training, for multiple input-output systems, has recently gained popularity in various areas of chemical engineering, including the photocatalytic treatment of wastewater [13][14][15][16].
In recent years, neural networks have been much studied because of their capability to approximate any continuous nonlinear function. Neural networks possess the ability to learn what happens in a process without actually modeling the physical and chemical laws that govern the system [17]. The use of neural networks has become increasingly recommended for applications where the mechanistic description of the interdependence between variables is either unknown or very complex. They are now the most popular artificial intelligence tool, with applications in areas such as pattern recognition, classification, process control and optimization [18][19][20]. Different types of neural network applications are reviewed in our previous work [21].
* email: silvia_curteanu@yahoo.com
In correlation with the aspects considered in the present paper, several applications of neural networks can be mentioned.
D. Salari et al. [22] used a three-layer neural network to predict the methyl tert-butyl ether concentration in a photooxidative degradation process in the presence of H₂O₂ under UV light illumination. S. Göb et al. [23] developed an empirical model based on artificial neural networks for fitting the experimental data obtained in a laboratory batch reactor for the degradation of 2,4-dimethyl aniline (2,4-xylidine), chosen as a model pollutant. A. Duran et al. [24] applied neural network modeling to the degradation of Reactive Blue 4 dye solutions in order to evaluate the use of the Fenton reagent under UV irradiation conditions. In a recent paper, A. Duran and J.M. Monteagudo [25] determined the influence of four parameters (pH and initial concentrations of TiO₂, Fe(II) and H₂O₂) on the value of the decoloration kinetic rate constant using neural networks.
The solution of an optimisation problem can be found through, among others, deterministic or stochastic approaches. The former comprise the traditional optimisation methods (direct and gradient-based methods) and have the disadvantages of requiring the first and/or second-order derivatives of the objective function and/or constraints, or of being inefficient in non-differentiable or discontinuous problems. Furthermore, deterministic methods depend on the chosen initial solution and tend to converge towards local extrema of the fitness function, which is clearly unsatisfactory for problems where the fitness varies non-monotonically with the parameters. Stochastic methods, such as genetic algorithms (GA), do not have these drawbacks. GAs are part of the so-called evolutionary algorithms and constitute a search and optimisation tool with increasing application in scientific problems. They do not need any information about the search space, only an objective/fitness function that assigns a value to any solution [26].
Because of their flexibility, ease of operation, minimal requirements and global perspective, GAs have been successfully used in a wide variety of problems [27]. GAs have found various applications in chemical engineering, including process control, gas pipeline design, pattern recognition of multivariate chemical data, optimization of reaction rate parameters, multipurpose chemical batch plant design, and scheduling [28]. Details about different types of genetic algorithms (the simple genetic algorithm or different adaptations for problems with multiple constraints) and a series of their applications in the optimization of chemical processes have been described in the reviews of Bhaskar [29] and Nandasana [30].
The main goal of this paper is to develop a general procedure based on neural networks and genetic algorithms which could be applied to complex optimization problems.
A series of experiments was carried out to study the decolorization of a synthetic dyestuff effluent containing Reactive Black 5 (RB5) as a model azo dye, by photocatalytic degradation using TiO₂ P-25 as catalyst in the presence of Fe³⁺ and H₂O₂. Neural networks are used as an efficient modeling tool and genetic algorithms as the solving method for the optimization; a neural model of the process, trained with experimental data, is included in the optimization procedure. The case study is the estimation of the evolution in time of a dyestuff wastewater treatment process consisting of a photocatalytic oxidative reaction. The aim is to reach a certain decolorization degree under optimal working conditions: illumination time and amounts of catalyst TiO₂ P-25, H₂O₂ and Fe³⁺. The GA optimization procedure has proved easy to apply, with useful and accurate results.

Reagents
The synthetic wastewater (CSW) was made according to the recipe used for dyeing cotton fabrics and had the following composition: 0.07 g L⁻¹ Reactive Black 5, 0.1 g L⁻¹ HCOOH, 0.250 g L⁻¹ Perydrol FHB 15 %, 0.375 g L⁻¹ Sequion, 0.5 g L⁻¹ Na₂CO₃, 0.5 g L⁻¹ NaOH, 3 g L⁻¹ NaCl. The CSW has an initial pH value of 12 and a DOC of 0.15 g L⁻¹ (0.5 g L⁻¹ COD and 0.12 g L⁻¹ BOD₅).
The chemical structure of Reactive Black 5 (C₂₆H₂₁N₅O₁₉S₆Na₄) is given in figure 1.

Procedures and analysis
Experiments were carried out in a closed Pyrex cell of 500 mL capacity, provided with ports at the top for bubbling the air necessary for the reaction to take place. The reaction mixture was maintained in suspension by magnetic stirring. Prior to irradiation, the reaction mixture was left for 15 minutes in the dark in order to achieve the maximum adsorption of the dye onto the semiconductor catalyst surface. The irradiation was performed with a 9 W lamp. The spectral response of the irradiation source (Osram Dulux S 9W/78 UV-A), according to the producer, ranges between 340 and 400 nm, with a maximum at 366 nm and two additional weak lines in the visible region. The photon flow per unit volume of the incident light was determined by chemical actinometry using potassium ferrioxalate. The initial light flux, under exactly the same conditions as in the photocatalytic experiments, was evaluated to be 7.16 × 10⁻⁴ Einstein min⁻¹.
In all cases, during the experiments, 500 mL of acidified (pH₀ = 3.2) dye solution containing the appropriate amounts of semiconducting powder, H₂O₂ and Fe³⁺ was magnetically stirred before and during irradiation. Specific quantities of sample were withdrawn at periodic intervals and filtered through a 0.45 µm filter (Schleicher and Schuell) in order to remove the catalyst particles. To assess the extent of color removal, changes in the concentration of the dye were followed from its characteristic absorption band using a Shimadzu UV-160 A UV-VIS spectrophotometer. Since a linear dependence between the initial concentration of the dye solution and its optical density at 585 nm was observed, the photodecomposition was monitored spectrophotometrically at this wavelength. The pH values of the solution were monitored with a Metrohm pH-meter, while the reaction temperature was kept constant at 25 ± 0.1 °C.

Results and discussions

Processing of experimental data
Table 1 presents the experimental results obtained under different conditions of initial RB5, TiO₂ P-25, H₂O₂ and Fe³⁺ concentrations.
The first column of table 1 gives the time intervals for collecting experimental data. For the first ten minutes, samples were collected every two minutes; then the time interval increases to five minutes. The last column of table 1 contains the RB5 concentration of the sample, calculated as a function of transmittance using an initial reference curve.
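The conversion from transmittance to dye concentration can be sketched as follows. This is only an illustration: it assumes the Beer-Lambert relation A = 2 − log10(T %) and a linear calibration line through the origin, in keeping with the linear dependence observed at 585 nm. The function name and the slope value are hypothetical placeholders, not the reference curve used in this work.

```python
import math


def concentration_from_transmittance(t_percent, slope_l_per_g):
    """Estimate dye concentration (g/L) from percent transmittance.

    Assumes absorbance A = 2 - log10(T%) and a linear calibration
    A = slope * C through the origin (the slope is a hypothetical value).
    """
    if not 0.0 < t_percent <= 100.0:
        raise ValueError("transmittance must be in (0, 100] percent")
    absorbance = 2.0 - math.log10(t_percent)
    return absorbance / slope_l_per_g


# Hypothetical calibration slope, in L/g
SLOPE = 20.0
conc = concentration_from_transmittance(10.0, SLOPE)
print(f"estimated concentration: {conc:.4f} g/L")
```

With a measured reference curve, the slope would be replaced by the fitted value and, if needed, an intercept term.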
For the 16 series of experimental data in table 1, the influence of reaction conditions on the rate of dye elimination was studied using different graphical representations.
Figures 2 and 3 show the influence of the ratio between H₂O₂ and Fe³⁺ in the presence of 0.25 g L⁻¹ TiO₂ P-25 (fig. 2) and 1 g L⁻¹ TiO₂ P-25 (fig. 3) on the decolorization process. The best results in both cases (shortest reaction time) correspond to the maximum amounts of the components, namely 800 mg L⁻¹ H₂O₂ and 56 mg L⁻¹ Fe³⁺.
The influence of the TiO₂ P-25 amount on the dye elimination rate is illustrated in figure 4 and the best result corresponds to 1.08 g L⁻¹ TiO₂ P-25 + 7 mg L⁻¹ Fe³⁺. Figure 5 presents the influence of Fe³⁺ and catalyst amounts on the wastewater treatment process when working with the maximum amount of H₂O₂ (800 mg L⁻¹). The minimum time corresponds to 1 g L⁻¹ TiO₂ P-25 and the maximum amount of Fe³⁺ (56 mg L⁻¹). The same influence is shown in figure 6, but for the minimum amount of H₂O₂ (200 mg L⁻¹). The best result is obtained under the same conditions as in the previous case, namely 1 g L⁻¹ TiO₂ P-25 and 56 mg L⁻¹ Fe³⁺. Further, comparing figures 5 and 6 for 1 g L⁻¹ TiO₂ P-25 and 56 mg L⁻¹ Fe³⁺, 94 % of the dye is eliminated after 2 minutes when the amount of H₂O₂ was 800 mg L⁻¹ (fig. 5), compared to 92 % for 200 mg L⁻¹ H₂O₂ (fig. 6).

Neural network modeling
Neural network based modeling was applied to predict the rate of the process as a function of the reaction conditions. Experimental data from table 1 were used to train different neural networks which model the transmittance as a function of reaction conditions. 10 % of these data represent the validation data set and the remaining data form the training data set.
One major problem in the development of a neural network model is determining the network architecture, i.e. the number of hidden layers and the number of neurons in each hidden layer. Firstly, potentially good topologies must be identified. However, no established theory or rule dictates the neural network topology that should be used, and trial-and-error is still required. This is done by testing several topologies and comparing the prediction errors. Smaller errors indicate potentially good architectures, i.e. neural network topologies with chances to train well and to output good results [31].
In this study, the four inputs of the neural networks are: time (min), amount of catalyst (TiO₂ P-25, g L⁻¹), and amounts of H₂O₂ and Fe³⁺ (mg L⁻¹); the network output is the transmittance (%).
Table 2 contains the different feed forward topologies tested with the selected training data and the main performance measures for these networks: MSE (mean squared error), r (correlation between experimental and neural network outputs) and E_p (percent error). The structure of a network of MLP (multilayer perceptron) type is given by the number of neurons in the input layer, corresponding to the four input variables, then the number of hidden neurons (in one or two layers) and, finally, the number of neurons in the output layer for the output variable.
Hidden neurons, as well as the output layer neuron, use the hyperbolic tangent as nonlinear activation function. This type of function, compared to linear or logistic activation functions, produces better learning. All the network weights were initialized as random numbers in the interval (-0.5, 0.5). The network was trained using the back-propagation algorithm. For better performance, momentum is used to allow the network to respond not only to the local gradient, but also to recent trends in the error surface. Without momentum, the neural network may get stuck in a shallow local minimum; with momentum, it can slide through such a minimum. The training is considered finished at the point where the network error (MSE) becomes sufficiently small (less than 0.001). Consequently, a configuration of 4 input neurons, one hidden layer with 10 hidden neurons and an output layer of 1 neuron is selected, having MSE = 0.000678, r = 0.9992 and percentage error E_p = 5.2863 %.
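The training scheme described above (tanh units, weights drawn from (-0.5, 0.5), back-propagation with momentum, stopping when MSE drops below 0.001) can be illustrated with a minimal NumPy sketch. It is not the software used in this work: the learning rate, momentum coefficient, epoch limit and the synthetic data are assumptions chosen only to make the example run, and inputs and targets are taken as already scaled to the range of the tanh output.

```python
import numpy as np


def forward(params, X):
    """Forward pass of a 4:hidden:1 MLP with tanh in both layers."""
    W1, b1, W2, b2 = params
    h = np.tanh(X @ W1 + b1)
    return h, np.tanh(h @ W2 + b2)


def train_mlp(X, y, hidden=10, lr=0.02, momentum=0.9,
              mse_goal=1e-3, max_epochs=2000, seed=0):
    """Batch back-propagation with momentum; lr and momentum are assumed values."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    # Weights and biases initialised uniformly in (-0.5, 0.5)
    params = [rng.uniform(-0.5, 0.5, (n_in, hidden)),
              rng.uniform(-0.5, 0.5, hidden),
              rng.uniform(-0.5, 0.5, (hidden, 1)),
              rng.uniform(-0.5, 0.5, 1)]
    velocity = [np.zeros_like(p) for p in params]
    history = []
    for _ in range(max_epochs):
        h, out = forward(params, X)
        err = out - y
        history.append(float(np.mean(err ** 2)))
        if history[-1] < mse_goal:          # stop once the MSE goal is met
            break
        # Gradients of the MSE through the tanh output and hidden layers
        d_out = 2.0 * err / len(X) * (1.0 - out ** 2)
        d_h = (d_out @ params[2].T) * (1.0 - h ** 2)
        grads = [X.T @ d_h, d_h.sum(axis=0), h.T @ d_out, d_out.sum(axis=0)]
        for p, v, g in zip(params, velocity, grads):
            v *= momentum                   # keep part of the previous step
            v -= lr * g                     # add the current gradient step
            p += v                          # in-place parameter update
    return params, history


# Synthetic, already-scaled data standing in for the experimental series
data_rng = np.random.default_rng(1)
X = data_rng.uniform(-1.0, 1.0, (200, 4))
y = (0.8 * np.tanh(1.5 * X[:, 0] + 0.5 * X[:, 1])).reshape(-1, 1)
params, history = train_mlp(X, y)
print(f"MSE: {history[0]:.4f} -> {history[-1]:.4f} after {len(history)} epochs")
```

The velocity term carries a fraction of the previous update into the current one, which is what lets the search pass through shallow local minima.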

Table 2 DIFFERENT TOPOLOGIES TESTED FOR THE FEED FORWARD NEURAL NETWORKS
From table 2 we select the neural network that best balances generalization performance against network size and complexity, namely MLP(4:10:1) (marked in grey in table 2). This neural model does not have the best performance, but the differences from the networks with two hidden layers are not significant. We therefore prefer a simple architecture with sufficiently good performance.
Good predictions are obtained with the neural model MLP(4:10:1) on the training data: the average relative error for transmittance was 1.2660 % and the correlation between experimental data and network predictions was 0.9997. Relative errors were calculated using the following formula:
$$E_r = \frac{p_{exp} - p_{net}}{p_{exp}} \cdot 100 \qquad (1)$$
where p represents the parameter under study (transmittance), and the indexes exp and net denote experimental and network values.
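As a small worked example of the relative error formula, the sketch below computes the error for a few made-up transmittance values (not data from table 1). Averaging the absolute values of the errors is an assumption about how the reported mean was obtained; the correlation is computed with the Pearson coefficient.

```python
import numpy as np


def relative_errors(p_exp, p_net):
    """Relative error in percent: (p_exp - p_net) / p_exp * 100."""
    p_exp = np.asarray(p_exp, dtype=float)
    p_net = np.asarray(p_net, dtype=float)
    return (p_exp - p_net) / p_exp * 100.0


# Made-up transmittance values (%), for illustration only
p_exp = [50.0, 80.0, 100.0]
p_net = [49.0, 82.0, 100.0]

errors = relative_errors(p_exp, p_net)
mean_abs_error = float(np.mean(np.abs(errors)))   # assumed averaging convention
correlation = float(np.corrcoef(p_exp, p_net)[0, 1])
print(errors, mean_abs_error, correlation)
```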
Several examples are presented in figures 7 and 8, which compare the two sets of data, experimental and network outputs. The series numbered 7 and 8 in table 1 were chosen for these figures.
White bars represent the experimental values and grey bars the neural network results. The good agreement between the two data sets shows that the neural model has learned the behavior of the process well: the increase of transmittance with time.
The zero value in figures 7 and 8 is a relative value, namely the moment the UV lamp is turned on. In fact, before the zero moment on the graphs, the catalyst was mixed with the wastewater for 15 min. During this time, a certain amount of dye (approximately 5 %) is adsorbed on the TiO₂ P-25 catalyst.
A key issue in neural network based process modeling is the robustness or generalization capability of the developed models, i.e. how well the model performs on unseen data. Thus, a serious examination of the accuracy of the neural network results requires comparison with experimental data that were not used in the training phase (previously unseen data). That is why the validation data set is taken as a part of the experimental data from table 1. The predictions of the network on the validation data are given in table 3.
A satisfactory agreement can be noticed between the two categories of data, experimental and neural network predictions, with an average relative error of 1.3083 %, a maximum relative error of less than 4 % and a correlation of 0.9982. For this reason, the designed neural model MLP(4:10:1) can be used to make predictions under different reaction conditions, substituting for experiments that are time and material consuming.

Genetic algorithm optimization
The general solution of an optimization problem can be obtained in terms of the following four elements: an accurate model of the process, a selected number of control variables, an objective function and a suitable numerical method for solving the specified optimization problem.
In this study, the optimization problem includes the neural model, represented by equation (2). The vector of control variables, u, has the components given in equation (3). An admissible control input u* should be formed in such a way that the performance index, J, defined by equation (4), is minimized subject to the constraints (5), with t representing the illumination time, D_f the transmittance at the end of the process and D_d an imposed value for this parameter. The transmittance, D, is a measure of the dye elimination from the wastewater, with values between 0 and 100 %. High values of this parameter mean a high degree of dye elimination.
The constraints are very important for defining the range of variation of the parameters and for discarding possible solutions that could be of interest only in a theoretical approach to the problem.
In other words, the optimization problem to be solved can be formulated as follows: what are the optimal working conditions (time, amount of catalyst TiO₂ P-25, amounts of H₂O₂ and Fe³⁺) necessary to obtain an imposed transmittance value under the given experimental conditions?
The optimization procedure includes a neural network (NN) model and is solved with a simple genetic algorithm. The fitness function of the GA is the scalar objective function (4). Figure 9 illustrates this optimization procedure. The genetic algorithm provides, after an iterative calculation, the optimal values for the decision variables (time, TiO₂, H₂O₂, Fe³⁺), which are the inputs of the neural network model. With these inputs, the neural network computes the final value of the transmittance, D_f, which is compared with the desired value, D_d. If the two values are identical or differ only very slightly, we can conclude that the task of the optimization, represented by the minimum of the objective function, J, is achieved.
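The coupling between the process model and the fitness function can be sketched as below. Both ingredients are assumptions made for illustration: the objective is taken as the squared deviation between the predicted and the imposed transmittance, which is one plausible form and not necessarily the performance index of this work, and `surrogate_transmittance` is a hypothetical closed-form stand-in for the trained network.

```python
import math


def surrogate_transmittance(u):
    """Hypothetical stand-in for the trained neural model.

    u = (time in min, TiO2 in g/L, H2O2 in mg/L, Fe3+ in mg/L);
    returns a transmittance in [0, 100) percent. The coefficients are
    invented so that the example runs; they are not fitted values.
    """
    t, tio2, h2o2, fe = u
    rate = 0.05 + 0.1 * tio2 + 0.0002 * h2o2 + 0.003 * fe
    return 100.0 * (1.0 - math.exp(-rate * t))


def make_fitness(model, d_desired):
    """Build J(u) = (D_f - D_d)^2, an assumed form of the objective."""
    def fitness(u):
        d_final = model(u)
        return (d_final - d_desired) ** 2
    return fitness


fitness = make_fitness(surrogate_transmittance, d_desired=90.0)
j_start = fitness((0.0, 1.0, 800.0, 56.0))    # lamp just turned on
j_later = fitness((10.0, 1.0, 800.0, 56.0))   # after 10 min of illumination
print(j_start, j_later)
```

Any optimizer that only needs function values, such as a genetic algorithm, can minimize `fitness` directly; swapping the surrogate for the trained network requires no other change.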
Genetic algorithms are intelligent stochastic optimization techniques based on the mechanism of natural selection and genetics. GAs start with an initial set of solutions, called the population. Each solution in the population is called a chromosome (or individual), which represents a point in the search space. The chromosomes evolve through successive iterations, called generations, by means of genetic operators (selection, crossover and mutation) that mimic the principle of natural evolution. A set of solutions is analyzed and modified by genetic operations simultaneously: the selection operator selects some "good" solutions as seeds, the crossover operator generates new solutions, hopefully retaining good features from the parents, and the mutation operator enhances diversity and provides a chance to escape from local optima [32]. In a GA, a fitness value is assigned to each individual according to a problem-specific objective function. Generation by generation, new individuals, called offspring, are created and survive alongside chromosomes of the current population, called parents, to form a new population.

Fig. 9. The optimization method based on NN and GA.

Table 4 OPTIMAL DECISION VARIABLES OBTAINED FOR DIFFERENT VALUES OF THE IMPOSED TRANSMITTANCE
In our GA model, we used real-value encoding for the chromosomes. Other approaches to genetic algorithm based optimization use a binary solution representation, the simplest type of encoding, in which chromosomes are composed only of 1's and 0's. Even though the number of alleles is thus rather small (two), this encoding is very common because it is very easy to use. However, value encoding is more general, because genes are real numbers. Some experiments [33] have shown that real-value encoding is more efficient, with better precision of the solutions.
The initial population is generated randomly. Offspring are created by genetic operators and stored in a population pool, which is a collection of offspring and their parents.
Selection compares the chromosomes in the population in order to choose those which will take part in the reproduction process. There are different methods for the selection phase; our paper uses rank selection, which first ranks the population and then assigns every chromosome a fitness based on this ranking. The main purpose of recombination (crossover) is to combine the features of two randomly selected parents from the mating pool with the aim of producing better offspring. The variant of crossover used in this study uses different points for all genes, which means that the new individual will no longer lie on the line segment that links its parents. After recombination, the offspring undergo mutation. Generally, mutation refers to the creation of a new chromosome from one and only one individual with a predefined probability. Mutation is used to produce small perturbations on chromosomes in order to promote the diversity of the population. Our GA includes a variant of mutation named resetting: a gene value is reset to a random value in its search interval. The purpose is to refresh the search process when the genetic diversity of the population decreases (so that the algorithm no longer converges to the solution) or when the algorithm has converged to a local optimum.
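The three operators can be sketched for real-valued chromosomes as follows. The details are assumptions, since the text does not fix them: rank selection is implemented with linear rank weights for a minimization problem, and the crossover is read as an arithmetic blend with a separate random coefficient for every gene. The bounds are illustrative placeholders for the search intervals.

```python
import random


def rank_weights(fitness_values):
    """Selection weights from ranks only: worst gets 1, best gets N (minimization)."""
    order = sorted(range(len(fitness_values)),
                   key=lambda i: fitness_values[i], reverse=True)
    weights = [0] * len(fitness_values)
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank
    return weights


def rank_selection(population, fitness_values, rng):
    """Draw one parent with probability proportional to its rank weight."""
    return rng.choices(population, weights=rank_weights(fitness_values), k=1)[0]


def per_gene_crossover(parent1, parent2, rng):
    """Blend each gene with its own coefficient, so the child generally
    leaves the line segment joining the two parents."""
    child = []
    for a, b in zip(parent1, parent2):
        alpha = rng.random()
        child.append(alpha * a + (1.0 - alpha) * b)
    return child


def resetting_mutation(chromosome, bounds, rate, rng):
    """Reset each gene, with probability `rate`, to a random value in its interval."""
    return [rng.uniform(lo, hi) if rng.random() < rate else gene
            for gene, (lo, hi) in zip(chromosome, bounds)]


# Illustrative search intervals: time, TiO2, H2O2, Fe3+ (placeholder values)
BOUNDS = [(0.0, 60.0), (0.25, 1.0), (200.0, 800.0), (7.0, 56.0)]
rng = random.Random(0)
p1 = [10.0, 0.25, 200.0, 7.0]
p2 = [30.0, 1.0, 800.0, 56.0]
child = per_gene_crossover(p1, p2, rng)
mutant = resetting_mutation(child, BOUNDS, rate=0.1, rng=rng)
print(child, mutant)
```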
After the three operators have been applied, the offspring are inserted into the population, replacing the parent chromosomes from which they were derived and producing a new generation. The best individual is copied directly into the new population (the elitism technique) and the rest of the individuals are replaced by the new generation.
The termination criterion determines when the GA will stop. In other words, the genetic operations are repeated until a termination condition is met. In our implementation, the GA is stopped when the maximum number of generations has been executed or when a pre-set number of generations has passed without improvement in the best solution.
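A compact sketch of a generational loop with elitism and the two stopping conditions is given below. It is an illustration under assumptions, not the implementation used in this work: all parameter values are placeholders, the operators follow one simple reading of rank selection, per-gene crossover and resetting mutation, and the quadratic `demo_fitness` on normalized variables merely stands in for the objective evaluated through the neural model.

```python
import random


def run_ga(fitness, bounds, pop_size=30, max_generations=200, patience=30,
           crossover_prob=0.9, mutation_prob=0.1, seed=0):
    """Minimize `fitness` over real-valued chromosomes (all defaults are assumed)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=fitness)
    best_fit = fitness(best)
    stale = 0            # generations without improvement of the best solution
    generations = 0
    for generations in range(1, max_generations + 1):
        ranked = sorted(pop, key=fitness, reverse=True)   # worst first
        weights = list(range(1, pop_size + 1))            # rank-based weights
        new_pop = [list(best)]                            # elitism
        while len(new_pop) < pop_size:
            p1, p2 = rng.choices(ranked, weights=weights, k=2)
            if rng.random() < crossover_prob:
                child = []
                for a, b in zip(p1, p2):                  # per-gene blend
                    alpha = rng.random()
                    child.append(alpha * a + (1.0 - alpha) * b)
            else:
                child = list(p1)
            child = [rng.uniform(lo, hi) if rng.random() < mutation_prob else g
                     for g, (lo, hi) in zip(child, bounds)]   # resetting
            new_pop.append(child)
        pop = new_pop
        gen_best = min(pop, key=fitness)
        gen_fit = fitness(gen_best)
        if gen_fit < best_fit:
            best, best_fit, stale = gen_best, gen_fit, 0
        else:
            stale += 1
        if stale >= patience:        # second stopping condition
            break
    return best, best_fit, generations


# Hypothetical test objective on normalized decision variables
TARGET = (0.3, 0.7, 0.5, 0.9)
BOUNDS = [(0.0, 1.0)] * 4


def demo_fitness(u):
    return sum((g - t) ** 2 for g, t in zip(u, TARGET))


best, best_fit, generations = run_ga(demo_fitness, BOUNDS)
print(best, best_fit, generations)
```

Because the result depends on the random seed, calling `run_ga` with several seeds and keeping the solution with the lowest fitness is a natural way to use it.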
Population size, number of generations, crossover probability and mutation probability are known as the control parameters of genetic algorithms. The values of these parameters must be specified before execution. The GA should then be run many times, keeping the best solution obtained over the runs.
For a binary GA, the "schema theorem" proves that there is a nonzero probability of obtaining the solution of a problem. For a GA with real encoding, however, this theorem is not always valid, so multiple executions of the optimization algorithm are recommended, followed by the choice of the best solution among those obtained.
The optimization procedure based on a simple genetic algorithm and a neural network model applied in this paper is easy to use and provides accurate results. In this way, a complete theoretical analysis of the decolorization process approached here is performed, with useful information for practical applications.

Conclusions
This paper provides a general and simple optimization methodology, based on genetic algorithms and neural networks, applied to a wastewater decolorization process. The genetic algorithm solves the optimization problem (attaining an imposed transmittance at the end of the process) and the neural network constitutes the model included in the optimization procedure, which estimates the evolution in time of a dyestuff wastewater treatment process consisting of a photocatalytic oxidative reaction. A simple neural network architecture is proposed for process modeling: a feed forward neural network with one hidden layer. The training and testing phases of the modeling procedure are conducted with experimental data obtained under different reaction conditions (catalyst and Fenton reagent amounts). Good predictions are obtained with the neural model in the validation phase, so this neural network gives a very good representation for the analysis of the wastewater treatment process.
The simple genetic algorithm proves to be a good tool for solving the optimization problem, providing important information useful in experimental practice.
The method can be easily extended and adapted to other environmentally oriented processes, with a high chance of providing accurate results through simple handling.