Available online at http://icsejournal.com/index.php/JCSE
Journal of Computer Science and Engineering (JCSE), e-ISSN 2721-0251, Vol. No. August 2025, pp.

Statistical Analysis of Adaptive Thresholding Algorithms for Denoising Signature Images

Ruhiteswar Choudhury 1, Tanusree Deb Roy 2*
1 Department of Statistics, Assam University, Silchar 788011, India
2 Department of Statistics, Assam University, Silchar 788011, India
1 choudhury2015@gmail.com, 2 tanusree.roy@gmail.com
* corresponding author

ARTICLE INFO
Article History: Received December 23, 2024; Revised July 19, 2025; Accepted August 30, 2025
Keywords: Digital Image Processing; Biometry; Gaussian Mixture Model; Otsu Thresholding; Histogram Thresholding
Correspondence: E-mail: tanusree.roy@gmail.com

ABSTRACT
This study explores the efficacy of adaptive thresholding techniques in denoising signature images captured under varying lighting conditions. Signature images from multiple individuals were obtained under different illumination scenarios, and three prominent adaptive thresholding algorithms, namely histogram thresholding, Otsu's method, and the Gaussian Mixture Model (GMM), were applied to the noisy images. The performance of each technique was rigorously evaluated using root mean square error (RMSE) and correlation coefficient metrics. The findings reveal that the Gaussian Mixture Model significantly outperformed both histogram thresholding and Otsu's method, achieving superior noise reduction and better preservation of essential information, as evidenced by lower RMSE values and higher correlation coefficients. These results suggest that the Gaussian Mixture Model is a highly effective technique for denoising signature images, particularly under varying lighting conditions. Its superior performance underscores its potential as a robust tool for enhancing the clarity and accuracy of signature verification systems.
This study provides valuable insights into the application of adaptive thresholding techniques in image processing, highlighting the advantages of the Gaussian Mixture Model over traditional methods. The implications of this research are substantial for fields that rely on precise signature recognition and verification, such as banking, legal documentation, and security systems. This study specifically focuses on signature segmentation as a preprocessing step for signature verification systems. It does not directly address full document verification but aims to improve segmentation accuracy under varying lighting conditions, which is a foundational component of document authentication.

Introduction

Biometric authenticity and security have been a major concern in today's world, given the advancement of technology and science, from data security to all other essentials. From important documentation to financial security, most things are now protected by biometric authentication or encryption, so keeping the biometric system safe while preserving its integrity and originality is essential. In biometric authentication, the handwritten signature has been used extensively and has a long history as a medium for verifying users or individuals for important documentation, cheque verification, and many other applications. A handwritten signature is the name of an individual written by hand. The signature is a behavioral trait employed in automated user verification systems within the biometric framework. Since most users view signatures as non-threatening and inconspicuous, as they are a common part of daily life, they are among the most widely accepted biometric attributes. Experts in forensic document analysis can check a signature on a document to verify its authenticity and to prevent fraud.
Large database storage and upkeep are standard organizational practice these days as part of the transition to paperless offices. Much administrative paperwork is scanned and saved as images. Consequently, this practice has created an enormous need for reliable methods of accessing and modifying the data these images contain. Thus, signatures may serve as an important procedure for document authentication and retrieval. Because of these numerous applications, image segmentation based on signature verification is an essential and important task. Digital image processing is not a new endeavour in the advancement of science and technology: its first applications were observed as early as the 1920s in the newspaper industry, while the first digital image was produced by the National Institute of Standards and Technology (NIST) using a scanning device. Since then, much research has been carried out in the field, and many statistical tools and techniques have been used extensively in the development of digital image processing, including studies of various statistical measures in the digital image processing paradigm at the root level and their applications in further research areas. Image segmentation is one of the core parts of image processing, and thresholding is among the most useful techniques: an adaptive thresholding algorithm has been proposed to segment the image based on the color histogram and the mean/average color intensity, and an algorithm based on interclass variance has been proposed for the binarization or segmentation of digital images using histogram thresholding, where automatic thresholding is also applicable to some extent; many researchers have since used this technique.
Subsequent work applied Otsu's thresholding and showed that the resulting threshold equals the average of the mean levels of the two classes, and that when the class variances differ it is biased towards the class with the larger variance. An image segmentation algorithm based on the Gaussian mixture model, with parameters estimated via the EM algorithm, has also been introduced, and many researchers have found this to be among the most convenient and reliable methods. Related studies include the multi-scale segmentation of mid-infrared images by a feature-trimming algorithm using a GMM and a convolutional U-net, and the segmentation of sonar images using a fast level-set algorithm driven by the Gaussian Mixture Model. Whichever thresholding algorithm is used, a quantitative measure is needed to estimate its accuracy. The correlation coefficient is one of the basic measures of the interrelationship between variables and can quantify the correlation between images. The Structural Similarity Index Measure (SSIM) was proposed to compute the amount of information retained from the original image in the segmented image and has since been used extensively; several studies use the correlation coefficient and SSIM index to quantify image segmentation algorithms. Although various thresholding algorithms have been applied to image segmentation, limited research has compared the performance of GMM versus Otsu on synthetic signature images, especially under non-uniform lighting conditions. This study addresses this gap by evaluating the effectiveness of these algorithms through structural similarity and other relevant metrics. It differs from prior work by quantitatively benchmarking adaptive thresholding methods under simulated uneven illumination, a scenario rarely examined in signature-based image segmentation.
Unlike prior studies that evaluate global binarization methods on uniformly illuminated datasets, our work introduces a controlled illumination perturbation framework and evaluates algorithmic robustness using SSIM and RMSE metrics. The primary objectives of this study are: (i) to evaluate the denoising and segmentation performance of three thresholding algorithms, Histogram Thresholding, Otsu's Method, and the Gaussian Mixture Model, on signature images, and (ii) to compare their effectiveness under varied lighting conditions using SSIM, RMSE, and correlation coefficient metrics.

Method

The methodology adopted in this study comprises five key stages, beginning with the acquisition of synthetic signature images and concluding with the performance evaluation of segmentation results. The flowchart in Figure 1 outlines the sequential steps of the research process:

Signature Image Acquisition → Noise Simulation (various lighting conditions) → Apply Thresholding (Histogram, Otsu, GMM) → Performance Metrics (SSIM, Correlation, RMSE) → Segmentation Output

Figure 1. Research workflow of the adaptive thresholding-based signature image segmentation process

Image thresholding is a technique underlying several algorithms in image processing. Image segmentation divides the image into two or more regions, such as foreground and background, according to the requirements of the representative image, by fixing a certain threshold value; pixels with intensities greater or less than the specified threshold are classified accordingly. Lighting variation was simulated by adjusting brightness and contrast within ±30% using MATLAB's imadjust function to mimic real-world uneven illumination. Several different types of thresholding algorithms have been used; among them, the study focuses on Histogram Thresholding,
Otsu Thresholding, and the Gaussian Mixture Model.

Histogram Thresholding: This is one of the most basic forms of thresholding, based on the histogram of the pixel intensities of a grayscale image. When the histogram shows a bimodal distribution, an initial threshold value splits the image into two regions; the averages of both regions are then computed, and a new threshold is taken as the mean of those two averages. This iterative method proceeds as follows:

Step 1: An initial threshold value $T$ is chosen, randomly or by any other desired method.
Step 2: The image is partitioned into object and background pixels, creating two distinct sets
$G_1 = \{f(m, n) : f(m, n) > T\}$ (pixels of the object)
$G_2 = \{f(m, n) : f(m, n) \le T\}$ (pixels of the background)
where $f(m, n)$ is the value of the pixel located in the $m$th row and $n$th column.
Step 3: The average of each set $G_1$ and $G_2$ is computed: $\mu_1$ = average pixel value of $G_1$, $\mu_2$ = average pixel value of $G_2$.
Step 4: A new threshold value is computed by averaging the two averages: $T' = (\mu_1 + \mu_2)/2$.
Step 5: Steps 2–4 are repeated until convergence is reached.

Otsu Thresholding: Otsu's thresholding algorithm is based on a linear discriminant criterion which assumes only an object and a background; any disturbance or diversity in the background is ignored. Based on this understanding, the image is divided into two regions by a particular threshold value $t$: class $C_0$ contains the pixel intensities in $[0, t]$ and class $C_1$ the intensities in $[t+1, l]$, for instance $l = 255$ for an image with 256 grey levels.
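The five-step iterative procedure of the histogram method described above can be sketched as follows; this is a minimal illustration with our own function names and synthetic bimodal data, not the paper's implementation:

```python
import numpy as np

def iterative_threshold(img, tol=0.5):
    """Steps 1-5 of the iterative (mean-based) histogram thresholding."""
    t = img.mean()                        # Step 1: initial guess for T
    while True:
        g1 = img[img > t]                 # Step 2: object pixels, f(m, n) > T
        g2 = img[img <= t]                # Step 2: background pixels
        m1 = g1.mean() if g1.size else t  # Step 3: class averages mu1, mu2
        m2 = g2.mean() if g2.size else t
        t_new = 0.5 * (m1 + m2)           # Step 4: T' = (mu1 + mu2) / 2
        if abs(t_new - t) < tol:          # Step 5: stop at convergence
            return t_new
        t = t_new

# illustration on a synthetic bimodal set of grey levels
rng = np.random.default_rng(0)
img = np.concatenate([rng.normal(50, 5, 500), rng.normal(200, 5, 500)])
T = iterative_threshold(img)
binary = img > T  # foreground mask
```

On such well-separated bimodal data the threshold settles roughly midway between the two modes after a few iterations.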
Otsu's thresholding method scans all potential threshold values and, for each, evaluates the pixel levels on both sides of the threshold. The objective is to find the threshold for which the combined spread of the foreground and background, i.e. the weighted within-class variance, is lowest. The variances of the two clusters, the foreground and the background, are computed using the statistical information in the image, and the best threshold value is found by minimizing the sum of the group variances weighted by the likelihood of each group. Using $p(i)$ for the observed grey-value histogram probabilities,

$p(i) = \dfrac{\#\{(r, c) : f(r, c) = i\}}{N}, \qquad i = 0, 1, \ldots, l, \qquad N = R \times S$

where $r$ and $c$ are the row and column indices, and $R$ and $S$ are the numbers of rows and columns of the image, respectively. Let $\omega_0(t)$, $\mu_0(t)$, $\sigma_0^2(t)$ denote the weight, mean, and variance of class $C_0$ with intensity values from $0$ to $t$, and $\omega_1(t)$, $\mu_1(t)$, $\sigma_1^2(t)$ the weight, mean, and variance of class $C_1$ with intensity values from $t+1$ to $l$. Let $\sigma_w^2$ denote the weighted sum of group variances. The optimal threshold $t^*$ minimizes the within-class variance

$\sigma_w^2(t) = \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t)$

where

$\omega_0(t) = \sum_{i=0}^{t} p(i), \qquad \omega_1(t) = \sum_{i=t+1}^{l} p(i)$

$\mu_0(t) = \sum_{i=0}^{t} \dfrac{i\,p(i)}{\omega_0(t)}, \qquad \mu_1(t) = \sum_{i=t+1}^{l} \dfrac{i\,p(i)}{\omega_1(t)}$

$\sigma_0^2(t) = \sum_{i=0}^{t} \bigl(i - \mu_0(t)\bigr)^2 \dfrac{p(i)}{\omega_0(t)}, \qquad \sigma_1^2(t) = \sum_{i=t+1}^{l} \bigl(i - \mu_1(t)\bigr)^2 \dfrac{p(i)}{\omega_1(t)}$

Computing all the functions above, the threshold value is determined from the within-class variance $\sigma_w^2(t)$; the value of $t$ for which $\sigma_w^2(t)$ is minimum is taken as the optimum threshold, which yields the segmented image.
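The exhaustive scan over candidate thresholds described above can be sketched as follows; this is a minimal illustration under our own assumptions (a 256-level histogram and a synthetic bimodal test image), not the paper's implementation:

```python
import numpy as np

def otsu_threshold(img, levels=256):
    """Scan all candidate thresholds t and return the one minimizing
    the weighted within-class variance sigma_w^2(t)."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()                    # grey-level probabilities p(i)
    i = np.arange(levels)
    best_t, best_var = 0, np.inf
    for t in range(1, levels):
        w0, w1 = p[:t].sum(), p[t:].sum()    # class weights omega_0, omega_1
        if w0 == 0 or w1 == 0:
            continue                         # skip empty classes
        mu0 = (i[:t] * p[:t]).sum() / w0     # class means
        mu1 = (i[t:] * p[t:]).sum() / w1
        v0 = (((i[:t] - mu0) ** 2) * p[:t]).sum() / w0   # class variances
        v1 = (((i[t:] - mu1) ** 2) * p[t:]).sum() / w1
        within = w0 * v0 + w1 * v1           # sigma_w^2(t)
        if within < best_var:
            best_t, best_var = t, within
    return best_t

# illustration on a synthetic bimodal 8-bit image
rng = np.random.default_rng(1)
img = np.clip(np.concatenate([rng.normal(60, 8, 2000),
                              rng.normal(190, 8, 2000)]), 0, 255).astype(np.uint8)
t_opt = otsu_threshold(img)
```

For two well-separated classes with equal weights and variances, the minimizer falls near the midpoint of the two class means.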
Gaussian Mixture Model: An image is a matrix in which every element is a pixel. A pixel's value is a numerical representation of the image's colour or intensity. Assume that $X$ is a random variable taking these values. A mixture of Gaussian distributions has the form

$f(x) = \sum_{i=1}^{k} \pi_i\, \mathcal{N}(x;\, \mu_i, \sigma_i^2)$

where $\pi_i > 0$ are the weights and $k$ is the number of components or regions in the image, with $\sum_{i=1}^{k} \pi_i = 1$. The density function of the normal (Gaussian) component is

$\mathcal{N}(x;\, \mu_i, \sigma_i^2) = \dfrac{1}{\sigma_i \sqrt{2\pi}} \exp\!\left(-\dfrac{(x - \mu_i)^2}{2\sigma_i^2}\right)$

where $\mu_i$ is the location and $\sigma_i$ the scale parameter. The mean vectors, covariance matrices, and mixture weights of the component densities parameterize the whole Gaussian mixture model; the notation

$\lambda = \{\pi_i, \mu_i, \sigma_i^2\}, \qquad i = 1, \ldots, k$

serves as a collective representation of these factors. This formulation admits several variations of the GMM. The covariance matrices may be restricted to be diagonal or full rank, and parameters may be connected or shared between the Gaussian components; for example, all components could share a common covariance matrix. An important factor in selecting the model configuration (number of components, full or diagonal covariance matrices, and parameter tying) is the quantity of data available to estimate the GMM parameters.

Maximum Likelihood Estimation: Conventionally, the maximum likelihood estimate (MLE) is obtained by maximizing the log-likelihood of the particular density form, in this case the pdf

$f(x) = \sum_{i=1}^{k} \pi_i\, \mathcal{N}(x;\, \mu_i, \sigma_i^2)$

The likelihood for a sample $x_1, \ldots, x_n$ can be written as

$L(\lambda) = \prod_{j=1}^{n} \left\{ \sum_{i=1}^{k} \pi_i\, \mathcal{N}(x_j;\, \mu_i, \sigma_i^2) \right\}$
The corresponding log-likelihood for the function is

$\log L(\lambda) = \sum_{j=1}^{n} \log \left\{ \sum_{i=1}^{k} \pi_i\, \dfrac{1}{\sigma_i \sqrt{2\pi}} \exp\!\left(-\dfrac{(x_j - \mu_i)^2}{2\sigma_i^2}\right) \right\}$

From this equation it is clear that the logarithm cannot pass through the inner summation, so differentiating the expression with respect to the parameters is practically intractable. To overcome this difficulty and find the parameter estimates, the EM (Expectation Maximization) algorithm is utilized: an iterative scheme that repeatedly updates the parameters until the optimization converges.

Expectation Maximization: An MLE of $(\mu_i, \sigma_i)$ may be found numerically and iteratively using the EM algorithm. The general plan is to start with an initial estimate of $\pi_i$, use it together with the observed data $X$ to "complete" the data set by postulating a value for the latent component label $Y$ given $X$ and the estimated $(\mu_i, \sigma_i)$, and then find an MLE by the standard method. The real concept is a little more involved: rather than a single value, a whole conditional distribution for $Y$ is postulated using the initial approximation of $(\mu_i, \sigma_i)$, and the unknown $Y$ is averaged out. Specifically, the expected complete log-likelihood $E\bigl[\log L(\mu_i, \sigma_i \mid X, Y)\bigr]$ is examined, where the expectation is taken with respect to the conditional distribution of $Y$ given $X$ and the current estimates of $(\mu_i, \sigma_i)$. The procedure is as follows:

(a) Let $r = 0$ and set initial estimates $\mu_i^{(0)}, \sigma_i^{(0)}$.
(b) Assuming the current estimates $\mu_i^{(r)}, \sigma_i^{(r)}$ are correct and given the observed data $X$, compute the conditional density $f(y \mid x, \mu_i^{(r)}, \sigma_i^{(r)})$ and from it the conditional expected log-likelihood, or "Q-function":

$Q\bigl(\mu_i, \sigma_i \mid \mu_i^{(r)}, \sigma_i^{(r)}\bigr) = E\left[\sum_{j=1}^{n} \log \left\{ \sum_{i=1}^{k} \pi_i\, \mathcal{N}(x_j;\, \mu_i, \sigma_i^2) \right\}\right]$
The expectation is taken with respect to the conditional distribution of $Y$ given $X$.
(c) Find the $(\mu_i, \sigma_i)$ that maximize $Q\bigl(\mu_i, \sigma_i \mid \mu_i^{(r)}, \sigma_i^{(r)}\bigr)$.
(d) Set $r = r + 1$, call the maximizers $\bigl(\mu_i^{(r+1)}, \sigma_i^{(r+1)}\bigr)$, and return to step (b).

The EM algorithm is repeated until the estimates of $(\mu_i, \sigma_i)$ cease to fluctuate; typically, it is iterated until a tolerance $\varepsilon$ is reached:

$\bigl|\mu_i^{(r+1)} - \mu_i^{(r)}\bigr| < \varepsilon, \qquad \bigl|\sigma_i^{(r+1)} - \sigma_i^{(r)}\bigr| < \varepsilon$

Using all the proposed steps and computing the parameter values of the Gaussian Mixture Model, the threshold value can be calculated, and hence the binarization of the images is carried out.

Correlation Coefficient: The standard correlation coefficient measures the correlation or interrelation between two variables, which in this case are the original image and the resultant (segmented) image. It ranges over $[-1, 1]$; values near 1 indicate that the images are highly similar, while values near 0 indicate that they are unrelated. The correlation coefficient $C$ can be computed using the following formula:

$C = \dfrac{\sum_{m}\sum_{n} (A_{mn} - \bar{A})(B_{mn} - \bar{B})}{\sqrt{\Bigl(\sum_{m}\sum_{n} (A_{mn} - \bar{A})^2\Bigr)\Bigl(\sum_{m}\sum_{n} (B_{mn} - \bar{B})^2\Bigr)}}$

where $A_{mn}$ and $B_{mn}$ are the pixel values in row $m$ and column $n$ of the original and resultant (segmented) images, and $\bar{A}$, $\bar{B}$ are their respective means.

Structural Similarity Index Measure (SSIM): The SSIM is a quantitative measure of the perceived quality of a digital image relative to the resultant image. It is a similarity-based feature between the two images, providing a comparative measure across the different thresholding or segmentation methods:

$\mathrm{SSIM}(x, y) = \dfrac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$
where $x$ represents the original image and $y$ the resultant (binary or segmented) image, $\mu_x$ and $\mu_y$ are the means of the original and segmented images, $\sigma_x$ and $\sigma_y$ are their standard deviations, $\sigma_{xy}$ is the covariance of the two images, and $c_1$ and $c_2$ are constants that stabilize the fraction when the denominator terms are weak.

Root Mean Square Error (RMSE): RMSE is a measure of goodness of fit for checking the best-fitting probability distribution among those considered; the smallest RMSE value indicates the best-fitting model and also gives the standard deviation of the model prediction error. It measures the differences between observed and estimated values, and can be obtained using the following relation:

$\mathrm{RMSE} = \sqrt{\dfrac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2}$

where $x_i$ denotes the observed and $\hat{x}_i$ the estimated value. With its help the best method of parameter estimation can also be identified.

Results and Discussion

Since showing or publishing a personal signature may raise concerns of misuse and personal dissatisfaction, a hypothetical signature has been used for displaying the results of the algorithms. The synthetic signature is shown in Figure 2.

Figure 2. A hypothetical signature image

The names of the participants are not mentioned anywhere in the study; the participants have been coded for recognition and identification.

Histogram Thresholding: Following the methodology above, the results of the histogram thresholding algorithm are obtained from the image histogram. The threshold value is fixed based on the bimodal distribution of the original image; the resultant image is shown in Figure 3, where the pixel values are inverted
(blacks turned to white and vice versa) to identify the noise captured by the applied method:

Figure 3. Segmented image using Histogram Thresholding

From Figure 3 it is noted, by qualitative evaluation, that some noise has been captured in the resultant image. To evaluate this further, various quantitative measures have been computed; the results are shown in Table 1.

Table 1. Performance measures of different participants using Histogram Thresholding
Participants | Correlation Coefficient | SSIM Index | RMSE values
RC001 | | |
DD002 | | |
JB003 | | |
SA004 | | |
JS005 | | |

From Table 1, it can be noted that the various evaluation measures have been computed for the image segmented using the histogram thresholding algorithm. In some cases the correlation coefficient is very high (e.g., for participants RC001 and JB003), whereas for the others it is much lower, and similarly for the other measures; i.e., the thresholding algorithm does not work consistently or significantly for all types of images and under various lighting conditions. On average, the correlation coefficient is 0.646716, which is fairly high, with a standard deviation of 0.323065; the average SSIM and RMSE values are 0.675946 and 5.53882, respectively.

Otsu Thresholding: Following the Otsu thresholding algorithm, an initial threshold value is fixed and then adjusted according to the steps described above. The threshold minimizing the within-class variance turns out to be 99, so taking that as the threshold, the resultant image is shown in Figure 4, with the pixel values inverted to identify the noise more precisely:

Figure 4. Segmented image using Otsu Thresholding

From Figure 4, the qualitative evaluation of the results can be interpreted from the appearance of the image.
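The three performance measures defined in the Method section, which populate Tables 1–3, can be computed as follows; this is a minimal sketch in which the SSIM is evaluated globally over the whole image (a simplification of the usual windowed SSIM), with constants $c_1, c_2$ chosen under the common $(k L)^2$ convention for $L = 255$:

```python
import numpy as np

def correlation(a, b):
    """Pearson correlation between two images, flattened to vectors."""
    a, b = a.ravel().astype(float), b.ravel().astype(float)
    return float(np.corrcoef(a, b)[0, 1])

def rmse(a, b):
    """Root mean square error between observed and estimated pixel values."""
    return float(np.sqrt(np.mean((a.astype(float) - b.astype(float)) ** 2)))

def ssim_global(a, b, c1=6.5025, c2=58.5225):
    """Single-window (global) SSIM; c1 = (0.01*255)^2, c2 = (0.03*255)^2."""
    a, b = a.astype(float), b.astype(float)
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()   # covariance sigma_xy
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (va + vb + c2))
```

As a sanity check, comparing any non-constant image with itself yields a correlation of 1, an RMSE of 0, and an SSIM of 1.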
The method appears to be better than the histogram-based algorithm, as it incorporates an iterative process of updating the threshold value based on the class variances of the foreground and background. The quantitative measures computed for this algorithm are given in Table 2.

Table 2. Performance measures of different participants using Otsu Thresholding
Participants | Correlation Coefficient | SSIM Index | RMSE values
RC001 | | |
DD002 | | |
JB003 | | |
SA004 | | |
JS005 | | |

From Table 2 it is clear that, compared with the previous method, the performance indices mirror the qualitative evaluation: the correlation coefficient increases for all individuals compared with histogram thresholding, meaning the images are more similar; the structural similarity likewise increases for all individuals, and the RMSE shows the same pattern. On average the correlation coefficient, SSIM index, and RMSE value are 0.808868, 0.768776 and 7. respectively, which compared with the histogram algorithm is quite impressive, so Otsu's can be concluded to be the better algorithm for image thresholding.

Gaussian Mixture Model: The Gaussian Mixture Model is among the most widely used models, with diverse applicability, feasibility, and reliability across frameworks ranging from financial and economic modelling to signal processing and image thresholding. The GMM has been an essential tool wherever two or more normal densities are present. Here, using the GMM, the image of the hypothetical signature has been segmented, as displayed in Figure 5.
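The EM fit of a two-component GMM described in the Method section can be sketched as follows; the median-based initialisation and the grid-based choice of threshold (the intensity where the two weighted component densities cross) are our own illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def gmm_em_threshold(pixels, iters=100, tol=1e-6):
    """Fit a two-component 1-D Gaussian mixture by EM and return a threshold
    taken where the two weighted component densities cross."""
    x = pixels.ravel().astype(float)
    med = np.median(x)                       # crude initialisation about the median
    mu = np.array([x[x <= med].mean(), x[x > med].mean()])
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    ll_old = -np.inf
    for _ in range(iters):
        # E-step: responsibilities r_ji = pi_i N(x_j; mu_i, var_i) / f(x_j)
        dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) \
               / np.sqrt(2 * np.pi * var)
        total = dens.sum(axis=1, keepdims=True)
        r = dens / total
        # M-step: update weights, means, variances
        nk = r.sum(axis=0)
        pi = nk / x.size
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        ll = np.log(total).sum()             # log-likelihood, monotonically rising
        if ll - ll_old < tol:
            break
        ll_old = ll
    # threshold: grid point between the means where the densities are closest
    grid = np.linspace(mu.min(), mu.max(), 512)
    d = pi * np.exp(-(grid[:, None] - mu) ** 2 / (2 * var)) \
        / np.sqrt(2 * np.pi * var)
    return grid[np.argmin(np.abs(d[:, 0] - d[:, 1]))]

# illustration on synthetic bimodal grey levels
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(60, 5, 1000), rng.normal(190, 5, 1000)])
t_hat = gmm_em_threshold(x)
```

For equal weights and variances, the crossing point (and hence the threshold) lies near the midpoint of the two component means.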
Figure 5. Segmented image using Gaussian Mixture Model

From Figure 5, judging the appearance of the image, the qualitative evaluation indicates that this is clearly a better method than the histogram-based algorithm, since it incorporates an iterative process of updating the threshold value via the expectation-maximization method. The quantitative measures computed for this algorithm are given in Table 3.

Table 3. Performance measures of different participants using GMM
Participants | Correlation Coefficient | SSIM Index | RMSE values
RC001 | | |
DD002 | | |
JB003 | | |
SA004 | | |
JS005 | | |

From Table 3 it is clear that, compared with the previous methods, the performance indices again mirror the qualitative evaluation: the correlation coefficient increases for all individuals relative to histogram thresholding, meaning the images are highly similar; the structural similarity likewise increases for all individuals, and the RMSE shows the same pattern. On average the correlation coefficient, SSIM index, and RMSE value are 0.972996, 0.980273 and 1. respectively, impressively high compared with the histogram algorithm and Otsu's method of thresholding. It is clear from the above analysis and evaluation that, of the three methods under consideration, the GMM is a far superior approach to both the histogram thresholding algorithm and the Otsu thresholding algorithm, with an average SSIM index of 0.980273, indicating that the segmented image and the ground-truth image are more than 98% similar, whereas the other methods achieve only around 76% and 67% similarity; the correlation coefficient and RMSE comparisons lead to the same conclusion.

Figure 6.
Comparative Performance Metrics across Thresholding Algorithms

The comparative analysis presented in Figure 6 clearly demonstrates that the Gaussian Mixture Model (GMM) consistently outperforms both the Histogram and Otsu thresholding algorithms across all quantitative measures. The GMM achieved the highest mean correlation coefficient, the highest SSIM value, and the lowest RMSE, confirming its superior capability in preserving structural information and reducing noise under varying illumination. By contrast, the Otsu method shows moderate performance improvements over the Histogram approach, indicating that its inter-class variance optimization provides partial robustness against uneven lighting but remains limited for complex background variations. This finding reinforces the conclusion that GMM-based adaptive thresholding is a more resilient and accurate segmentation strategy, especially for signature images captured under non-uniform lighting. The results also align with recent studies emphasizing the advantages of probabilistic and iterative methods over static thresholding schemes in image segmentation tasks.

Conclusion

In this study, signature image segmentation using various adaptive thresholding techniques has been evaluated and analyzed using several quantitative performance evaluation techniques. The results of the analysis provide a detailed examination of the performance of the adaptive thresholding algorithms under various lighting conditions. The findings indicate that the Histogram thresholding algorithm exhibits mixed results, with some images showing adequate segmentation performance while others, particularly those with poor lighting conditions, yield below-average results. In contrast, the Otsu thresholding algorithm demonstrates relatively better performance across all
performance evaluation techniques, including the structural similarity index and correlation coefficient, which are significantly higher than those of the Histogram thresholding method. However, the most significant and consistent results are observed with the Gaussian Mixture Model (GMM) thresholding algorithm. Regardless of the lighting conditions, the GMM method consistently produces impressive results in the qualitative evaluation, accurately segmenting the signature from the original image. This superiority is further substantiated by the quantitative measures, which reveal that the GMM method outperforms the Otsu thresholding method by approximately 22% and the Histogram thresholding method by around 31% on average in terms of the structural similarity index. Additionally, the results from the other performance evaluation techniques, such as the correlation coefficient and RMSE, also demonstrate a significant advantage of the GMM method over the other two algorithms. These findings collectively highlight the exceptional performance of the GMM thresholding algorithm under diverse lighting conditions, making it a promising solution for image segmentation tasks. From the detailed study, it can be concluded that the GMM approach is superior to the other adaptive thresholding algorithms for signature image segmentation in raw images, both qualitatively and quantitatively. Although the study demonstrates significant results, it has some limitations. The sample size is relatively small. Additionally, lighting variation was simulated, not captured naturally, which may limit generalizability. Future studies should validate these methods on larger, real-world datasets with natural lighting inconsistencies.

References