Groundwater Quality Variation and Regression Analysis – a Case Study Around Municipal Dumpsite in India

The quality of groundwater around a municipal dumpsite is greatly affected by chemicals leaching from the landfill. The aim of this study is to assess the groundwater quality around a municipal dumpsite in Tamil Nadu, India, and to develop and compare the performance of Statistical Package for the Social Sciences (SPSS) regression and Artificial Neural Network (ANN) models. Groundwater samples were collected every month from 16 sampling points during the study period from January 2013 to December 2017. The physico-chemical parameters of the samples, namely pH, acidity, alkalinity, hardness, chloride, sulphate and Total Dissolved Solids (TDS), were analysed and the Water Quality Index (WQI) was computed. From these data, S14 and S5 were identified as the most and least polluted of the 16 sampling points, respectively. Correlation analysis showed that TDS exhibited a high positive correlation with chloride and hardness. Two models using SPSS regression and one model using ANN were developed to predict the TDS at the sampling points. The prediction capabilities of the ANN were compared with those of the SPSS regression models. The maximum percentage errors obtained from ANN and SPSS were 7.5% and 15.6%, respectively, at the S5 sampling point. The ANN model was more accurate than the SPSS multi nonlinear regression model having the same inputs and output.


1. Introduction
Water is the most basic element on which life on Earth depends for survival. Freshwater accounts for 3% of the earth's water, of which only 1% is available for human use [1]. This freshwater is present as both surface water and groundwater. We depend on these freshwater sources to meet our daily needs. Due to poor management of surface water sources, urbanisation and frequent pollution, the availability of surface water is decreasing [2]. At present, groundwater is more sought after than surface water as a freshwater source. This is because groundwater is a replenishable, comparatively safe and reliable water source [3].
In India, the municipal solid waste (MSW) collected from most cities and towns is disposed of by the open landfill method. Organic matter such as food waste, paper, and plant and animal residue constitutes about 80% of MSW, while the remaining contents are inorganic in nature [4]. The inorganic materials present in MSW include electronic waste, plastic and textile waste products; these materials affect the quality of the environment, leading to pollution of land and water resources [5]. The solid waste also contains heavy metals and toxic chemicals which leach out from the open dumpsite [6]. The major cause of leachate generation from a landfill is rainwater percolating through the solid waste. This leachate varies widely in composition with the nature of the waste and the age of the landfill. It usually contains both dissolved and suspended material [7]. Landfill leachate is characterized by high concentrations of organic and inorganic compounds, xenobiotics and heavy metals, and is extremely toxic to groundwater [8].
The artificial neural network (ANN) model is regarded as a highly useful tool for establishing complicated non-linear relations between parameters [9]. Warren McCulloch and Walter Pitts first proposed the conceptual theory of artificial neurons in 1943 [10]. Neural networks are composed of highly interconnected, simple processing units which are inspired by the neural processes observed in the human brain.
The geological strata in Ramaiyanpatti are of hornblende-biotite gneissic formation overlaid by a weathered rock formation followed by thin soil. The general rock formation strikes in the east-west direction and dips towards the south at an angle of 75° S. Limestone flanked by Kankar followed by quartzite is found on the northern side, and magnesium limestone, calcareous quartzite and calc-gneiss on the southern side. The study area falls under the pediment geomorphic unit, with no lineaments. In general, pediments are hard rock terrains forming outcrops with or without soil cover.

Sampling method
The water samples were collected from 16 sampling points every month for 5 years (January 2013 to December 2017) (Table 1). From the open wells, the water samples were collected at a depth of 0. All the collected samples were analysed for their quality. The results obtained were recorded immediately and stored safely. For quality analysis, pH, acidity, alkalinity, sulphate, hardness, chloride and TDS were determined for each sample.

Water quality index (WQI)
The Water Quality Index (WQI) converts large quantities of water quality data into a single number which represents the overall drinking water quality status [12]. The WQI was computed for each sampling point. In this study, the weighted arithmetic water quality index method has been used for computing the WQI [13], which was calculated in three stages. In stage 1, each of the 6 parameters has been assigned a weight ($w_i$) according to its influence on the overall quality of water for drinking purposes (Table 2). The maximum weight of 4 has been assigned to the parameters pH, TDS and sulphate due to their importance in water quality assessment. A weight of 3 has been assigned to the parameters hardness and chloride. For alkalinity the minimum weight of 2 is assigned, which indicates that it may not be deleterious. In stage 2, the relative weight ($W_i$) is computed from the following equation:
$$W_i = \frac{w_i}{\sum_{j=1}^{n} w_j} \qquad (1)$$
where $W_i$ is the relative weight of each parameter and $n$ is the number of parameters. In stage 3, a quality rating scale ($q_i$) is assigned for each parameter by dividing its concentration in each water sample by its respective standard according to the guidelines of IS 10500:2012 and multiplying the result by 100:
$$q_i = \frac{C_i}{S_i} \times 100 \qquad (2)$$
where $q_i$ is the quality rating, $C_i$ is the concentration of each parameter in each water sample, and $S_i$ is the IS 10500:2012 drinking water standard for each parameter. For computing the WQI, first the sub-index $SI_i$ for each parameter is determined, which is then used to determine the WQI by the following equations:
$$SI_i = W_i \, q_i \qquad (3)$$
$$WQI = \sum_{i=1}^{n} SI_i \qquad (4)$$
where $SI_i$ is the sub-index of the $i$th parameter, $q_i$ is the rating based on the concentration of the $i$th parameter and $n$ is the number of parameters [14].
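The three stages described above can be illustrated with a minimal Python sketch. The weights are those stated in the text; the standard values used for each parameter and the sample concentrations are assumptions chosen only for illustration (in particular the single pH standard of 8.5 and the alkalinity standard of 200 mg/L), and the function names are hypothetical.

```python
# Weights (w_i) as stated in the text for the six WQI parameters.
WEIGHTS = {"pH": 4, "TDS": 4, "sulphate": 4,
           "hardness": 3, "chloride": 3, "alkalinity": 2}

# Assumed standards (S_i) for illustration only; consult IS 10500:2012
# for the values actually applicable to a given study.
STANDARDS = {"pH": 8.5, "TDS": 500.0, "sulphate": 200.0,
             "hardness": 300.0, "chloride": 250.0, "alkalinity": 200.0}


def relative_weights(weights):
    """Stage 2: W_i = w_i / sum of all w_j."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def water_quality_index(concentrations, weights=WEIGHTS, standards=STANDARDS):
    """Stage 3: q_i = C_i / S_i * 100, SI_i = W_i * q_i, WQI = sum of SI_i."""
    rel = relative_weights(weights)
    wqi = 0.0
    for name, concentration in concentrations.items():
        q = concentration / standards[name] * 100.0  # quality rating
        wqi += rel[name] * q                         # sub-index SI_i
    return wqi


# Hypothetical sample (mg/L except pH), not taken from the study data.
sample = {"pH": 7.5, "TDS": 1000.0, "sulphate": 100.0,
          "hardness": 450.0, "chloride": 300.0, "alkalinity": 250.0}
wqi_value = water_quality_index(sample)
print(f"WQI = {wqi_value:.2f}")
```

Because the relative weights sum to 1, a sample in which every parameter sits exactly at its standard gives a WQI of 100 under this formulation.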

Correlation analysis
In this study, the total number of data obtained by analysing the water samples was 704. Because of this large data set, descriptive analysis was carried out, from which the minimum, maximum, mean, range, standard deviation, variance, skewness and kurtosis were obtained [15]. Correlation analysis is a preliminary bivariate technique adopted to establish the degree of relationship between the parameters involved in a physico-chemical process [16]. If the correlation coefficient is close to +1 or -1, the relationship between the parameters is strong, and if the coefficient is near zero there is no relationship between the parameters, at a significance level (p) of < 0.01 [17].
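As an illustration of the bivariate correlation step, the following sketch computes a Pearson correlation matrix with NumPy. The parameter values are hypothetical and are not the study data; the helper name is likewise an assumption.

```python
import numpy as np


def pearson_matrix(data):
    """Return parameter names and their Pearson correlation matrix.

    `data` maps each parameter name to a sequence of measurements taken
    on the same samples (equal lengths are assumed).
    """
    names = list(data)
    values = np.array([data[name] for name in names], dtype=float)
    # np.corrcoef treats each row as one variable by default.
    return names, np.corrcoef(values)


# Hypothetical measurements (mg/L) for six samples.
data = {
    "TDS": [520, 610, 700, 850, 990, 1200],
    "chloride": [95, 120, 150, 190, 230, 290],
    "sulphate": [60, 45, 80, 52, 75, 58],
}
names, r = pearson_matrix(data)
for i, name in enumerate(names):
    print(name, np.round(r[i], 3))
```

A coefficient close to +1 between two rows, as between the hypothetical TDS and chloride series here, would be read as a strong positive relationship in the sense described above.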

Statistical analysis
The Statistical Package for the Social Sciences (SPSS) is used by researchers in various fields for analysing complex statistical data. It supports advanced inferential and multivariate statistical procedures such as factor analysis, discriminant analysis and analysis of variance [18]. These techniques help to simplify and organize large sets of data in order to draw useful generalizations and insights [19]. In this study, a relation was established using SPSS regression analysis. TDS was considered the dependent parameter, whereas sulphate, chloride and temporary hardness were the independent parameters.

ANN modelling
An ANN is used to learn complex input-output relationships. ANNs require no clearly defined algorithm or theory; rather, they acquire knowledge through the presentation of examples. A neural network consists of at least three layers, i.e., an input layer, a hidden layer and an output layer. The inputs are fed to the input layer, the outputs are obtained at the output layer, and learning is achieved when the associations between a specified set of input-output pairs are established [20].
In this study, an ANN model was developed to establish the relationship between TDS and the related parameters sulphate, chloride and temporary hardness. The ANN architecture consists of an input layer with the number of processing elements equal to the number of predictor variables and an output layer with the number of processing elements equal to the number of predicted variables. Between the input and output layers there are hidden layers; the number of processing elements in each hidden layer was fixed by trial and error and depends on the desired accuracy of the model [21]. The ANN was used to model the relationship between the independent variables and the dependent variable for the sets of data obtained from the water quality analysis. The Back Propagation (BP) algorithm was used to predict the target. Two hidden layers, each with 10 neurons, were used.
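To make the architecture concrete, the following is a minimal NumPy sketch of a feed-forward network with three inputs, two hidden layers of 10 neurons and one output, trained by back propagation. It is a sketch under stated assumptions and not the authors' implementation: it uses plain full-batch gradient descent, tanh hidden units and a linear output, and it is trained on synthetic, scaled data rather than the study measurements.

```python
import numpy as np


def forward(X, weights, biases):
    """Return the activations of every layer (tanh hidden, linear output)."""
    acts = [X]
    last = len(weights) - 1
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = acts[-1] @ W + b
        acts.append(z if i == last else np.tanh(z))
    return acts


def train(X, y, hidden=10, epochs=2000, lr=0.01, seed=0):
    """Train an inputs-10-10-1 network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    sizes = [X.shape[1], hidden, hidden, 1]
    weights = [rng.normal(0.0, 0.3, (a, b))
               for a, b in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]
    losses = []
    for _ in range(epochs):
        acts = forward(X, weights, biases)
        err = acts[-1] - y
        losses.append(float(np.mean(err ** 2)))  # mean squared error
        delta = 2.0 * err / len(X)               # dMSE / d(output)
        for i in reversed(range(len(weights))):
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:
                # Back-propagate through the tanh units of the layer below,
                # using the weights before they are updated.
                delta = (delta @ weights[i].T) * (1.0 - acts[i] ** 2)
            weights[i] -= lr * grad_W
            biases[i] -= lr * grad_b
    return weights, biases, losses


# Synthetic inputs standing in for scaled chloride, sulphate and
# temporary hardness; the target stands in for scaled TDS.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(60, 3))
y = (0.6 * X[:, 0] + 0.1 * X[:, 1] + 0.3 * X[:, 2] ** 2).reshape(-1, 1)

weights, biases, losses = train(X, y)
predicted = forward(X, weights, biases)[-1]
print(f"MSE: first epoch {losses[0]:.4f}, last epoch {losses[-1]:.4f}")
```

In practice the inputs and target would be scaled to a comparable range before training, and a held-out set of sampling points would be used to check the predictions.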

3. Results and discussion
The physico-chemical parameters of the groundwater are summarised as descriptive statistics (Table 3).
Table 3. Descriptive statistics of groundwater samples

pH
The intensity of acidity or alkalinity of water is measured by pH. It is a major factor which influences all chemical and biological processes [22]. The water quality analysis showed a periodic variation in the values of the physico-chemical parameters over the study period of five years. The pH values ranged from 6.71 to 8.56 during the period January 2013 to December 2017. The lowest pH value of 6.71 was found at sampling point S4 in May 2013, and the highest value of 8.56 was found at station S11 in December 2017. The pH values of all the groundwater samples around the dumpsite are within the permissible limit of 6.5-8.5 as per IS 10500:2012, except at sampling stations S9 and S11 during December 2017.

Acidity and alkalinity
The capacity of water to neutralize an acid is termed alkalinity. It is mostly due to the presence of carbonate, bicarbonate and hydroxyl ions. Acidity is known as the number of ionizable hydrogen ions present in one molecule. Acidity may also be induced by carbon dioxide, in which case it is called carbonic acidity. If pH is more than 7, carbonic acidity may be present in the sample [23]. The acidity and alkalinity values vary from 6.00 mg/L to 73.00 mg/L and from 84.00 mg/L to 488.00 mg/L, respectively, during the period January 2013 to December 2017. There is a regular increase of alkalinity in the rainy season and a decrease in the summer season. The lowest acidity (6.00 mg/L) and alkalinity (84.00 mg/L) values were observed at station S6 in March 2013 and September 2013, respectively. The highest acidity (73.00 mg/L) and alkalinity (488.00 mg/L) values were observed at stations S11 and S15, respectively, during December 2017.

TDS
TDS is mainly induced by the anionic and cationic substances dissolved in the water sample [24]. The permissible limit for TDS in drinking water prescribed by BIS is 500 mg/L. The concentration of TDS varied from 334 mg/L to 3448 mg/L during the period January 2013 to December 2017. The TDS concentration was lowest at station S5 in August 2013, with a value of 334 mg/L. The highest TDS concentration was observed at station S14 in December 2017, with a value of 3439 mg/L. The high concentrations of TDS in groundwater near the dumpsite are due to the percolation of leachate, and the flow of leachate is greater during the monsoon season. The results show a regular increase of TDS in the rainy season and a decrease in the summer season. TDS depends upon the dissolution of leachate and rock minerals by water which penetrates through the rock strata. During the rainy season, the penetration of water is high, and hence the TDS values are higher. During the summer season, due to the lack of rainfall and the evaporation of moisture from the municipal solid waste, the penetration of water is low, and therefore the TDS value in groundwater is lower.

Hardness
Hardness is induced by divalent metallic cations, mainly calcium and magnesium [25]. BIS has prescribed 300 mg/L as CaCO3 as the acceptable limit and 600 mg/L as CaCO3 as the permissible limit for total hardness. The hardness value varies from 38.00 to 1353.00 mg/L as CaCO3 during the period January 2013 to December 2017. There is a regular increase of the hardness value in the rainy season and a decrease in the summer season. The hardness value was lowest at station S5 in January 2013, at 38.63 mg/L as CaCO3. The highest hardness value was observed at station S15 in December 2017, at 1353.69 mg/L as CaCO3. The hardness exceeded the permissible limit at sampling points S4, S10, S11, S13, S14, S15 and S16 throughout the sampling period.

Chloride
Chloride is an anion formed by dissolving chloride compounds, primarily hydrogen chloride, in water or other polar solvents [26]. The parameter is compared with the drinking water standard IS 10500:2012, which sets a permissible limit of 250 mg/L for chloride. The chloride value varies from 54.5 mg/L to 909.5 mg/L during the period January 2013 to December 2017. There is a regular increase of the chloride concentration in the rainy season and a decrease in the summer season. The chloride value was lowest at station S9 in September 2013, at 54.5 mg/L. The highest chloride value was observed at station S13 in December 2017, at 909.5 mg/L. During the sampling period, the chloride concentration at sampling points S1, S2, S5, S6, S9 and S12 was within the permissible limit, while it was above the permissible limit at sampling points S4, S7, S13, S14, S15 and S16 throughout the sampling period.

Sulphate
The presence of sulphate in the groundwater around the dumpsite is generally due to the dissolution of the waste products formed from decaying organic matter [27]. The parameter is compared with the drinking water standard IS 10500:2012, which sets a permissible limit of 200 mg/L. The sulphate value varies from 31.07 to 359.82 mg/L during the period January 2013 to December 2017. There is a regular increase of the sulphate concentration in the rainy season and a decrease in the summer season. The sulphate value was lowest at station S1 in September 2013, at 31.07 mg/L. The highest sulphate value was observed at station S12 in December 2017, at 359.82 mg/L.

WQI
The WQI value varies from 39.25 to 303.13 during the period January 2013 to December 2017. There is a regular increase of the WQI value in the rainy season and a decrease in the summer season. The WQI value was lowest at station S5 in September 2013, at 39.25. The highest WQI value was observed at station S14 in December 2017, at 303.13. From the observations, it is clear that sampling points S7, S13, S14, S15 and S16 are severely affected. The quality of water at these sampling points ranges from very poor to unsuitable for drinking purposes. The WQI values at these stations indicate that remedial measures must be taken immediately to safeguard the groundwater in the study area (Table 4).
Poor Water: S1, S2, S3, S4, S7, S8, S9, S10, S11, S12
4. WQI 200-300, Very Poor Water: S13, S16
5. WQI >300, Water unsuitable for drinking: S14, S15
In the rainy season, the higher concentration of the water quality parameters increases the WQI, and in the summer season it decreases. The WQI increased every year during the study period. Based on the interpretation of the WQI values, it is clear that sampling points S13, S14, S15 and S16 are severely affected.

Correlation analysis
Correlation analysis measures the extent of the linear relationship between any pair of the physico-chemical parameters, as computed by Pearson's product moment correlation [28]. The relationship between the groundwater parameters is estimated by the correlation coefficient, and the significant correlations are shown in Table 5.
TDS showed a high degree of positive correlation with groundwater quality parameters such as chloride (0.962) and total hardness (0.906) and a weak correlation with sulphate (0.112). These values reveal that the higher TDS was mainly due to the presence of ions, which further substantiates the fact that groundwater contamination was occurring due to the generation and migration of landfill leachate [29]. Total hardness and chloride showed a high degree of positive correlation (0.837). Sulphate showed a moderate positive correlation with alkalinity (0.467) and a very weak correlation with chloride (0.052). The WQI showed a high degree of positive correlation with TDS, chloride and total hardness, a moderate correlation with alkalinity and acidity, and a weak correlation with sulphate and pH. This shows that the WQI values are mostly associated with TDS, chloride and total hardness. In this study, the relations between all the parameters show a positive correlation [30].
https://doi.org/10.37358/RC.21.1.8410

SPSS statistical analysis
SPSS regression analysis was used to arrive at the regression equation to predict TDS from the independent parameters chloride, sulphate and temporary hardness. A model is said to be the best model if the sum of the squares of the errors between the predicted and observed values is minimum [31,32]. Due to the wide range of the available data and the high percentage error, a single equation was not suitable for predicting TDS. A multi linear regression (MLR) equation was therefore used for the highly polluted sampling points and a multi nonlinear regression (MNLR) equation for the least polluted sampling points, as this gives a low percentage error. In the selected multi linear regression equation (5) for TDS, k is the regression constant and x, y and z are the predictor coefficients. The values of the regression constant and predictor coefficients of the MLR equation are listed in Table 6a. The correlation between the parameter estimates of the MLR equation is listed in Table 6b. The positive sign of the beta coefficients of chloride, sulphate and temporary hardness indicates that a positive relationship exists between TDS and each of these constituents of the groundwater for the highly polluted sampling points. The TDS value at sampling point S14 (the most polluted sampling point) was predicted using equation (5); the minimum and maximum percentage errors obtained were 2.85% and 13.37%, respectively. The R² value for the MLR equation was found to be 0.937, which shows that the MLR equation is well suited to predicting the TDS values of highly polluted sampling points.
In the selected MNLR equation (6) for TDS prediction, a is the regression constant and b, c and d are the predictor coefficients. The values of the regression constant and predictor coefficients of the MNLR equation are listed in Table 7a. The correlation between the parameter estimates of the MNLR equation is listed in Table 7b. The positive sign of the beta coefficient of chloride indicates that a positive relationship exists between TDS and chloride of the groundwater, and the negative sign of the beta coefficients of sulphate and temporary hardness indicates that a negative relationship exists between TDS and each of these constituents for the least polluted sampling points. The TDS value at sampling point S5 (the least polluted sampling point) was predicted using equation (6); the minimum and maximum percentage errors obtained were 0.12% and 15.6%, respectively. The R² value for the MNLR equation was found to be 0.897, which shows that the MNLR equation is well suited to predicting the TDS values of the least polluted sampling points.
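The least-squares criterion described above can be sketched for a linear model with one constant and three predictor coefficients. This is an illustration using NumPy rather than SPSS; the pairing of x, y and z with chloride, sulphate and temporary hardness is an assumption, and the data are synthetic values generated from known coefficients rather than the study measurements.

```python
import numpy as np


def fit_mlr(chloride, sulphate, temp_hardness, tds):
    """Fit TDS = k + x*chloride + y*sulphate + z*temp_hardness.

    Ordinary least squares minimises the sum of squared errors between
    predicted and observed TDS. Returns the array [k, x, y, z].
    """
    design = np.column_stack(
        [np.ones(len(tds)), chloride, sulphate, temp_hardness]
    )
    coefficients, *_ = np.linalg.lstsq(design, tds, rcond=None)
    return coefficients


# Synthetic data (mg/L) built from assumed coefficients, for illustration.
rng = np.random.default_rng(0)
chloride = rng.uniform(50.0, 900.0, 12)
sulphate = rng.uniform(30.0, 360.0, 12)
temp_hardness = rng.uniform(40.0, 600.0, 12)
tds = 50.0 + 2.5 * chloride + 0.8 * sulphate + 1.2 * temp_hardness

k, x, y, z = fit_mlr(chloride, sulphate, temp_hardness, tds)
predicted = k + x * chloride + y * sulphate + z * temp_hardness
sum_squared_error = float(np.sum((tds - predicted) ** 2))
print(f"k={k:.3f}, x={x:.3f}, y={y:.3f}, z={z:.3f}")
```

Since the synthetic TDS values contain no noise, the fit recovers the generating coefficients; with field data the residual sum of squares would be non-zero and is the quantity being minimised.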

ANN modelling
ANNs are characterised by their topology, weight vectors and the activation functions used in the hidden layers and the output layer [33]. In this study, a multilayer perceptron, with each layer consisting of a number of computing neurons, has been used. A multilayer perceptron trained with the back-propagation algorithm can be considered a practical way of performing a nonlinear input-output mapping of a general nature. The activation function used in both the hidden layers and the output layer is a non-linear function, whereas no activation function is used for the input layer since no computation is involved there. All neurons in a particular layer are fully linked to the neurons in the adjacent layers. Information flows from one layer to the next in a feed-forward manner. The Feed Forward Back Propagation (FFBP) network is a popular architecture among the different types of neural networks and finds application in several areas of engineering [34].
ANNs need large volumes of data in order to learn well and predict the output accurately. The data obtained were used for training the ANN for assessment of TDS. The data for sampling points S5 and S14 were taken as test data to predict the TDS value, which was compared with the expected TDS value to obtain the percentage error. In this study, models were developed using ANN. The parameters chloride, sulphate and temporary hardness were chosen as the inputs and TDS was the target to train the neural network. After completion of training, the weights between the input layer and the hidden layer and between the hidden layer and the output layer were generated. Testing was done for sampling points S5 and S14. The R² values obtained from training and testing are 0.99845 and 0.996244, respectively, which shows that the model is well suited to predicting TDS (Figure 3).
https://doi.org/10.37358/Rev.Chim.1949 Rev. Chim., 72 (1)

SPSS and ANN comparison
Data from sampling points S1 to S8 were used for training the network, for which the algorithm, training function, learning function and performance function were feed forward back propagation, TRAINLM, LEARNGDM and MSE, respectively [35]. In this modelling, 2 hidden layers with 10 neurons in each layer were chosen. The performance of the SPSS regression models was compared with that of the ANN model using their R² values and percentage errors. The percentage error was calculated by comparing the predicted TDS value with the observed TDS value. For this comparison, sampling points S5 and S14 were chosen. The R² values obtained from the SPSS MNLR model and the ANN model for sampling point S5 were 0.897 and 0.99624, respectively. Likewise, for sampling point S14, the R² values obtained from the SPSS MLR model and the ANN model were 0.937 and 0.99624, respectively. The variation between the observed TDS and the TDS predicted by the SPSS MNLR model and the ANN model for sampling point S5 is depicted in Figure 4. The minimum percentage errors obtained from the SPSS MNLR model and the ANN for sampling point S5 were 0.12% and 0.1%, respectively. The maximum percentage errors obtained from the SPSS MNLR model and the ANN were 15.6% and 7.5%, respectively. The variation between the observed TDS and the TDS predicted by the SPSS MLR model and the ANN model for sampling point S14 is depicted in Figure 5.

Figure 5. Comparative graph of TDS for sampling point S14

For sampling point S14, the minimum percentage errors obtained from the SPSS MLR and ANN models were 2.85% and 0.14%, respectively. The maximum percentage errors obtained from the SPSS MLR and ANN models were 13.37% and 5.44%, respectively. This shows that the ANN model gives a lower percentage error than the SPSS MLR model, so the ANN model was best suited to predicting TDS around the dumpsite.
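The two comparison metrics used in this section can be sketched as follows. The exact formulas used in the study are not stated, so the absolute percentage error relative to the observed value and the coefficient of determination computed from residuals are assumptions, and the observed and predicted values are hypothetical.

```python
import numpy as np


def percentage_error(observed, predicted):
    """Absolute error of each prediction as a percentage of the observed value."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(predicted - observed) / observed * 100.0


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted TDS values (mg/L).
observed_tds = [400.0, 500.0, 800.0]
predicted_tds = [380.0, 510.0, 800.0]

errors = percentage_error(observed_tds, predicted_tds)
r2 = r_squared(observed_tds, predicted_tds)
print("min error %:", errors.min(), "max error %:", errors.max())
print("R^2:", round(float(r2), 4))
```

Reporting both the minimum and maximum of the per-sample percentage errors, as done in the text, shows the spread of the prediction quality rather than only its average.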

Conclusions
The Ramaiyanpatti dumpsite is a non-engineered landfill. It has no bottom liner and no leachate collection and treatment system. Therefore, the leachate is transported easily into the surrounding environment. Leachate emerging from the waste dumped in the unlined dumpsite has severely affected the groundwater quality of the surrounding areas. The WQI for every sampling point increased in the monsoon season and decreased in the summer season. The WQI values at stations S4, S7, S13, S14, S15 and S16 indicate high pollution due to leachate. The highest WQI value is 303.13 at sampling point S14. The groundwater quality along the downstream side of the solid waste dumping site of Tirunelveli Municipal Corporation at Ramaiyanpatti is very poor.
The prediction capabilities of the ANN were compared with those of the SPSS regression models. The maximum percentage errors obtained from ANN and SPSS were 7.5% and 15.6%, respectively, at the S5 sampling point. The ANN model was more accurate than the SPSS multi nonlinear regression model having the same inputs and output. The study findings revealed that the groundwater is unacceptable for drinking and agricultural purposes at sampling points S4, S7, S13, S14, S15 and S16. It is recommended that borewell water in the study area be treated before it is used for drinking. The municipality should implement a proper leachate collection system to avoid the contamination of groundwater.