TELKOMNIKA Telecommunication Computing Electronics and Control, April 2026. ISSN: 1693-6930. DOI: 10.12928/TELKOMNIKA.

Application of the traveling salesman problem to optimize skeletonization and stroke reconstruction

Alifah (1), Dian Andriana (2), Muhammad Zulhaj Aliansyah (3), Lukman Hakim (1), Kholid Murtadlo (1)
(1) Department of Informatics Engineering, Faculty of Engineering, Universitas Yudharta Pasuruan, Pasuruan, Indonesia
(2) Research Center for Artificial Intelligence and Cybersecurity, National Research and Innovation Agency (BRIN), Bandung, Indonesia
(3) Department of Data Science, Faculty of Computer Science, Universitas Pembangunan Nasional Veteran Jawa Timur, Surabaya, Indonesia

ABSTRACT
The preservation of Turots Nusantara manuscripts written in Pegon script faces significant challenges due to physical deterioration and the complexity of handwritten styles. This study proposes a novel digitization approach based on image processing to extract and reconstruct handwriting strokes by combining skeletonization and the travelling salesman problem (TSP). The novelty of this research lies in the application of a modified greedy TSP algorithm capable of recognizing the branching and cyclic structures typical of Arabic-Pegon characters, enabling accurate reconstruction of handwritten stroke sequences. The process involves preprocessing (grayscale conversion, thresholding, and morphological operations), skeleton extraction using a thinning method, and weighted graph construction based on the Euclidean distance between skeleton points. The proposed system achieved an average precision of 0.552, an F1-score of 0.657, and an accuracy of 82%. These results demonstrate the method's effectiveness in detecting and reconstructing character shapes from Pegon manuscripts.
Practically, this approach offers potential applications in the automatic digitization, preservation, and analysis of Pegon script, contributing to the conservation of Indonesia's Islamic intellectual and cultural heritage.

Article history: Received Aug 25, 2025; Revised Dec 27, 2025; Accepted Jan 30, 2026

Keywords: Digitalization; Greedy algorithm; Pegon script; Skeletonization; Traveling salesman problem

This is an open access article under the CC BY-SA license.

Corresponding Author:
Lukman Hakim
Department of Informatics Engineering, Faculty of Engineering, Universitas Yudharta Pasuruan
Jl. Yudharta No. Kembangkuning, Sengonagung, Purwosari, Pasuruan, East Java 67162, Indonesia
Email: lukman@yudharta.
Journal homepage: http://journal.id/index.php/TELKOMNIKA

INTRODUCTION
The development of Islam in Indonesia is closely linked to written traditions as a medium for recording, preserving, and transmitting knowledge. Many works of the "Ulama Nusantara", collectively known as Turots, were written using the Arabic Pegon script, an adapted Arabic writing system used for local languages such as Javanese, Sundanese, and Madurese. As a form of Islamic intellectual heritage, Pegon manuscripts contain valuable historical, religious, and cultural knowledge. However, analyzing handwritten Pegon manuscripts remains a challenging task. Arabic-based handwriting is inherently complex due to its cursive nature, contextual letter shapes, overlapping strokes, loops, and diacritical marks. These challenges are further exacerbated by the physical deterioration of Nusantara manuscripts caused by aging, humidity, and inadequate preservation, which often results in faded ink, broken strokes, and damaged paper. Digitization has therefore become an essential effort for preserving Turots Nusantara manuscripts. Beyond long-term conservation, digitization improves accessibility, supports scholarly analysis, and enables the transmission of cultural heritage to future generations.
A critical stage in this process is the reconstruction and analysis of handwriting strokes, which aims to recover the original writing structure and stroke sequence from static digital images. Several studies have addressed handwriting skeletonization and stroke reconstruction. Kryzhanovskaya et al. proposed reconstructing pen strokes from skeleton images using heuristic graph traversal, achieving paths similar to natural handwriting but struggling with branched or complex characters. Meanwhile, Li et al. introduced instance-aware skeleton embedding for curved text detection in scene images, improving detection accuracy but not focusing on stroke reconstruction. Diaz et al. modeled skeletons as graphs and extracted writing paths based on continuity and curvature criteria, producing smooth stroke sequences. However, this approach fails to handle branched or multi-stroke characters commonly found in Arabic-derived scripts such as Pegon or Jawi. To address these limitations, this study proposes a stroke reconstruction method based on the travelling salesman problem (TSP). TSP is well suited for modeling continuous paths and has the potential to produce stroke representations that closely resemble human writing behavior. Nevertheless, standard TSP formulations restrict each node to a single visit, making them unsuitable for characters with loops or branching. Similarly, greedy TSP solvers, which prioritize local shortest distances, often generate implausible stroke orders for complex handwritten characters. Therefore, this research introduces a modified greedy TSP algorithm that allows controlled revisiting of skeleton nodes for characters with looping and branching structures, while maintaining single-visit constraints for simpler characters.
This modification enables more accurate reconstruction of handwritten Pegon strokes and better represents the original writing sequence, supporting the digitization and preservation of Nusantara manuscripts.

METHOD
The research process consists of several stages: pre-processing, skeletonization, the travelling salesman problem, letter segmentation based on bounding boxes, and the Freeman chain code (FCC). The flowchart of these stages is shown in Figure 1.

Figure 1. Flow of research

Dataset
The dataset for this study originates from an ancient manuscript written in the Pegon script, the "Kitab Syair Perahu". The initial stage was scanning the manuscript. The scanned pages were then manually cut by line, with 13 lines per page. Each line was then cut again into individual words to facilitate the segmentation process. An example of the image fragments analyzed in this study can be seen in Figure 2.

Figure 2. Image line cutouts from the "Kitab Syair Perahu"

Preprocessing
Pre-processing is performed to make the data more structured and cleaner, facilitating character shape segmentation and analysis in the next stage. The first step is grayscale conversion. A grayscale image represents each pixel with a single intensity value, where the red, green, and blue components are equal, producing shades between black and white. According to Zeger et al., grayscale is a digital image representation that stores only luminance intensity information, typically in 8-bit format with values ranging from 0 to 255. This process aims to extract important structures or objects from images, particularly under uneven lighting conditions, and is carried out using (1).
Y = 0.2989 R + 0.5870 G + 0.1141 B    (1)

The variable Y denotes the grayscale intensity value, while R, G, and B represent the red, green, and blue components of the RGB color model. After grayscale conversion, binarization is performed using the Otsu thresholding method, an intensity-based segmentation technique for grayscale images. This method classifies each pixel as foreground or background depending on whether its intensity is above or below the threshold value. Otsu thresholding effectively separates images into regions with clear intensity differences, improving segmentation accuracy with minimal computational cost. The conversion from grayscale to a binary image is achieved by determining an optimal threshold value using (2).

g(x, y) = 1 if f(x, y) > T; 0 if f(x, y) ≤ T    (2)

The function f(x, y) represents the pixel intensity at coordinates (x, y), while T denotes the threshold value. Subsequently, morphological opening and closing operations are applied. Opening, which consists of erosion followed by dilation using the same structuring element, removes small foreground noise while preserving the main object shape. Conversely, closing is a dilation followed by an erosion that fills small holes and connects fragmented objects, resulting in smoother object contours. The formulas for morphological opening and closing are presented in (3) and (4).

A ∘ B = (A ⊖ B) ⊕ B    (3)
A • B = (A ⊕ B) ⊖ B    (4)

A represents the input image, B denotes the structuring element used in morphological processing, ⊖ signifies the erosion operation, and ⊕ indicates the dilation operation.

Skeletonization
After preprocessing, the character image is skeletonized to reduce each letter to its medial axis while preserving its essential morphological structure.
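The grayscale conversion, Otsu thresholding, and morphological opening/closing described above can be sketched in plain NumPy. This is a minimal illustration (exhaustive-search Otsu, a 2x2 square structuring element), not the authors' exact implementation:

```python
import numpy as np

def to_grayscale(rgb):
    """Luminance conversion: Y = 0.2989 R + 0.5870 G + 0.1141 B."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.2989 * r + 0.5870 * g + 0.1141 * b

def otsu_threshold(gray):
    """Pick the threshold T that maximizes between-class variance."""
    hist, _ = np.histogram(gray.astype(np.uint8), bins=256, range=(0, 256))
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = hist[:t].sum(), hist[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t] * hist[:t]).sum() / w0
        mu1 = (levels[t:] * hist[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def dilate(img, k=2):
    """Binary dilation with a k x k square element (OR over shifts)."""
    pad = np.pad(img, (0, k - 1))          # pad bottom/right with background
    H, W = img.shape
    return np.logical_or.reduce([pad[dy:dy + H, dx:dx + W]
                                 for dy in range(k) for dx in range(k)])

def erode(img, k=2):
    """Binary erosion with the reflected k x k element (AND over shifts)."""
    pad = np.pad(img, (k - 1, 0))          # pad top/left with background
    H, W = img.shape
    return np.logical_and.reduce([pad[dy:dy + H, dx:dx + W]
                                  for dy in range(k) for dx in range(k)])

# Opening = erosion then dilation; closing = dilation then erosion.
opening = lambda img, k=2: dilate(erode(img, k), k)
closing = lambda img, k=2: erode(dilate(img, k), k)
```

With this sketch, opening removes an isolated noise pixel while closing bridges a one-pixel gap in a stroke, matching the roles the text assigns to (3) and (4).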
Skeletonization extracts a one-pixel-wide representation that maintains the object's topology and geometric form, providing a structural basis for stroke extraction from static images. In this study, skeletonization is performed using the Zhang–Suen algorithm, a parallel thinning method that iteratively removes pixels based on local connectivity rules within a 3x3 neighborhood, resulting in a continuous, one-pixel-wide skeleton. This representation reduces data redundancy and supports efficient subsequent processing, as formally expressed in (5).

Skeleton(A) = lim_{k→∞} T2(T1(A_k))    (5)

A_k represents the binary image produced in the k-th iteration, while T1 and T2 are the two subiterations of the thinning process in which pixels are removed based on logical rules. The limit k→∞ indicates that the process continues until convergence, meaning no further pixel removal occurs and the result no longer changes.

TSP
The next step applies the TSP, an NP-hard combinatorial optimization method that seeks the shortest path visiting each point exactly once. However, this formulation is not suitable for Arabic letter segmentation, because many characters contain loops and branching structures that produce non-linear skeleton paths; the single-visit constraint prevents accurate traversal of such complex shapes. Carrabs et al. introduced the carousel greedy algorithm, which allows local node revisits and achieves near-optimal solutions under high route complexity with low computational cost. Building on this idea, this study adopts a greedy-with-revisit TSP approach, enabling selective revisits for letters with loops or branches, while maintaining single-visit traversal for simpler letters. Comparative experiments with the nearest neighbor method show that the greedy algorithm yields more stable trajectories and better represents natural stroke flow, especially for complex letters containing loops.
Unlike nearest neighbor, which often causes abrupt jumps or premature loop closure, the greedy method considers global connectivity, resulting in more coherent paths. This modified TSP approach effectively reconstructs strokes in ancient Arabic manuscripts by preserving stroke continuity, branching behavior, and structural connectivity, producing geometrically accurate and handwriting-consistent results. Mathematically, the TSP objective is defined by (6).

min Σ_{i=0}^{n-1} d(p_i, p_{i+1})    (6)

The term d(p_i, p_{i+1}) denotes the Euclidean distance between two consecutive points, where p_i and p_{i+1} are visited points and n is the total number of points. The greedy revisit algorithm works by selecting the location closest to the current position, without restricting the choice to locations that have not yet been visited. This strategy is relevant for branching, merging, or complex cyclic structures. The steps of this algorithm are as follows:
- Determine the starting point p_start of P (usually an end point of the skeleton).
- Initialization: the path is initialized as [p_start], all points p ∈ P are marked as unvisited (visited(p) = 0), the starting point is marked as visited (visited(p_start) = 1), and the variable current is set to p_start.
- While there is still a point p ∈ P such that visited(p) < limit(p):
  - Find the closest point p_next to current using the Euclidean distance in (7).

    d(i, j) = √((x_i - x_j)² + (y_i - y_j)²)    (7)

  - Add p_next to the path. Once the closest point is found, it is appended to the path list; this is the key step in forming the sequence of points that represents the stroke path.
  - Increment visited(p_next), a marker indicating that p_next has been visited.
  - The visit count is recorded so that the algorithm can determine whether a point needs to be visited again (for example, at a branching point) or whether a single visit is sufficient.
  - Set current ← p_next. The point that was just visited becomes the current point, so that the next nearest-point search starts from the latest position.
- Return path.

Description: the visit limit limit(p) defines the maximum number of allowable visits for each skeleton point based on its local structural characteristics. For endpoints (degree = 1), limit(p) = 1, since only a single visit is required. For linear points (degree = 2), limit(p) = 1, as no revisits are necessary in straight segments. For branching points (degree ≥ 3), limit(p) > 1, to allow revisiting these points so that all connected branches can be explored, with the value typically set equal to or slightly greater than the point's degree. Furthermore, limit(p) can be adjusted according to local skeleton complexity or density: regions with high connectivity or curved structures such as loops may require higher visit limits to ensure complete path reconstruction, while simpler or straighter regions can use lower limits to avoid unnecessary revisits and improve computational efficiency.

Letter segmentation based on bounding box
At this stage, Arabic letter segmentation is performed using skeletonization and the modified TSP. Each letter is automatically separated from a line of Arabic text by utilizing the skeleton-based sub-paths generated in the previous TSP modification step. According to Aanchal et al., a bounding box is an axis-aligned rectangle defined by x-y coordinates, width, and height that encloses handwriting regions in scanned document images. Accordingly, this study applies bounding-box-based letter cropping as the initial segmentation step.
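The greedy-with-revisit traversal described above can be sketched as follows. The helper names (visit_limit, greedy_revisit_path) and the tuple-based point representation are illustrative assumptions, not the authors' code; degrees are computed over 8-connected neighbors, matching the visit-limit rules in the text:

```python
import math

def degree(p, points):
    """Number of 8-connected skeleton neighbours of p (Chebyshev dist 1)."""
    return sum(1 for q in points
               if q != p and max(abs(p[0] - q[0]), abs(p[1] - q[1])) == 1)

def visit_limit(p, points):
    """Endpoints and linear points: 1 visit; branch points: up to degree."""
    d = degree(p, points)
    return d if d >= 3 else 1

def greedy_revisit_path(points, start):
    """Greedy nearest-point traversal that may revisit branch points."""
    limits = {p: visit_limit(p, points) for p in points}
    visits = {p: 0 for p in points}
    path, current = [start], start
    visits[start] += 1
    while any(visits[p] == 0 for p in points):   # until all points covered
        candidates = [p for p in points
                      if p != current and visits[p] < limits[p]]
        if not candidates:                       # no legal move left
            break
        nxt = min(candidates, key=lambda p: math.dist(current, p))
        path.append(nxt)
        visits[nxt] += 1
        current = nxt
    return path
```

On a straight three-point segment this reduces to ordinary nearest-neighbor traversal; on a T-shaped junction the branch point is passed through twice, which a single-visit TSP would forbid.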
Unlike previous approaches, cropping is performed for each sub-path produced by the modified TSP algorithm, allowing segmentation to consider both spatial location and the stroke order that reflects natural handwriting structure. This approach effectively separates individual letters from the background and from irrelevant elements such as noise. The bounding box for each sub-path is computed using the minimum and maximum x and y coordinate values, as defined in (8).

x_min = min(x) - δ,  x_max = max(x) + δ
y_min = min(y) - δ,  y_max = max(y) + δ    (8)

The symbol δ is an additional margin used to pad the cropped area so that it is not too narrow. The image area inside the bounding box is cut directly from the combined skeleton (combined_skeleton), producing a cropped letter image for each sub-path.

Freeman chain code
The next step encodes the movement direction between skeleton points using the FCC method for each sub-path generated by the TSP algorithm. FCC represents object contours in binary or skeletonized images using eight discrete directions (0-7), providing a concise description of character structure. This method is widely used in image compression, pattern recognition, and shape analysis because it effectively captures directional changes. Fethi et al. applied FCC to extract contour features of handwritten Arabic characters and proposed a modified FCC that simplifies the code sequence without losing essential structural information, leading to improved recognition accuracy. In this study, FCC is used to transform skeleton paths into numerical movement sequences that can be spatially analyzed, such as identifying writing direction, loops, and characteristic turns in Arabic letters. The eight-directional system of FCC is illustrated below:

3 2 1
 \|/
4-----0
 /|\
5 6 7

The numbers 0 to 7 above represent the eight directions of movement relative to a point.
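The eight-direction coding can be sketched as a small lookup table, analogous to the direction_code() helper the paper mentions; the mapping below assumes the usual image convention that x grows right and y grows downward, so "up" is dy = -1:

```python
# Freeman chain code: map a unit step (dx, dy) to one of 8 direction digits.
FCC_DIRS = {
    (1, 0): 0,   (1, -1): 1,  (0, -1): 2,  (-1, -1): 3,
    (-1, 0): 4,  (-1, 1): 5,  (0, 1): 6,   (1, 1): 7,
}

def chain_code(points):
    """Encode a sequence of 8-connected (x, y) points as FCC digits."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in FCC_DIRS:
            raise ValueError(f"points not 8-connected: step {step}")
        codes.append(FCC_DIRS[step])
    return codes
```

For example, a step to the right, then down, then down-left encodes as the digits 0, 6, 5 under this convention.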
In the FCC representation, each number indicates the direction of pixel movement: 0 indicates right, 1 upper right, 2 up, 3 upper left, 4 left, 5 lower left, 6 down, and 7 lower right.

Evaluation metrics
Evaluation is conducted quantitatively using standard multi-class classification metrics, including accuracy, precision, recall, and F1-score, computed from true positive (TP), false positive (FP), true negative (TN), and false negative (FN) values. Accuracy measures overall classification correctness; precision evaluates the reliability of positive predictions; recall measures the ability to detect all actual positive samples; and the F1-score provides a balanced measure combining precision and recall, particularly under class imbalance. In addition, intersection over union (IoU) is used to assess the overlap between predicted and ground truth (GT) regions in segmentation tasks. Beyond quantitative metrics, a qualitative node-based analysis is performed on the skeletonization results to evaluate spatial structure and inter-node connectivity, demonstrating the system's effectiveness in separating letter shapes and diacritics. To verify that the classification results are not due to random chance, a binomial test is applied. This non-parametric statistical test determines whether the observed classification accuracy is significantly higher than a random baseline (typically 50%), thereby confirming the validity of the system's performance. The binomial test formulation is presented in (9).

P(X = k) = C(n, k) p^k (1 - p)^(n-k)    (9)

n denotes the total number of trials, such as the number of samples or predictions; k represents the number of successes, for example the number of correct predictions; p is the probability of success under the null hypothesis (commonly set to 0.5 for a random baseline); and C(n, k) is the binomial coefficient, computed as n! / (k! (n - k)!).
For a one-tailed test, the p-value is calculated using (10).

p-value = Σ_{i=k}^{n} C(n, i) p^i (1 - p)^(n-i)    (10)

This test is used to determine whether the actual proportion of successes is significantly greater (or smaller, depending on the direction of the test) than the probability assumed under the null hypothesis.

RESULTS AND DISCUSSION

Pre-processing results
The initial image processing step is grayscale conversion, which removes color information and retains only pixel intensity to facilitate subsequent processing. As shown in Figure 3(a), the grayscale image preserves the basic shapes of the Arabic letters in the Syair Perahu manuscript despite variations in ink and background color, allowing clearer separation between text and background based on intensity differences. Thresholding is then applied to convert the grayscale image into a binary (black-and-white) image, enabling effective separation of letter objects from the background. The Otsu thresholding results display the letters in white against a black background, making the Arabic script clearly distinguishable and highlighting the curved morphological features of Pegon characters, as shown in Figure 3(b).

Figure 3. Grayscale and thresholding results: (a) grayscale image and (b) binary image after thresholding

Before skeletonization, an advanced pre-processing stage is applied using morphological opening and closing to improve letter structure quality. These operations clean and complete the binary image, thereby enhancing skeleton extraction. Opening removes small noise introduced during digitization, while closing fills small gaps to preserve stroke continuity. In this study, a 2x2 square structuring element
is used, as it effectively removes noise while preserving fine stroke details and preventing adjacent letters from merging, which is suitable for Pegon manuscripts with thin strokes and narrow spacing. The parameter selection was determined empirically through experiments on multiple samples. For manuscripts with higher noise levels, larger structuring elements (e.g., 3x3 or 5x5) may be used, provided that stroke details are preserved. The results of morphological opening and closing are shown in Figure 4.

Figure 4. Morphological opening and closing results: (a) image after morphological opening and (b) image after morphological closing

Skeletonization results
The skeleton extraction process in this study uses the morphological thinning method with the Zhang–Suen algorithm, implemented through the skeletonize() function from the skimage library. This method works iteratively by selectively removing edge pixels without altering the main topological structure of the letters, resulting in a one-pixel skeleton that represents the medial axis of each letter. The skeletonization results can be seen in Figure 5.

Figure 5. Skeletonization results

Figure 5 shows that the skeleton (frame line) successfully follows the main contours of the Arabic letters, including curves and connections between letters. The strokes have been reduced to the medial axis while retaining the shape and direction of the original strokes. This shows that the morphological thinning method using the Zhang–Suen algorithm is effective in extracting important features from complex letter shapes.

TSP results
In this study, the TSP algorithm is applied to skeletonized Pegon word images to generate sequential stroke paths that closely reconstruct the original handwriting.
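The Zhang–Suen thinning that produces these skeletons can be sketched in plain NumPy. In practice the study uses skimage's skeletonize(); this minimal version (which assumes a background border around the image) is for illustration only:

```python
import numpy as np

def zhang_suen(img):
    """Zhang–Suen parallel thinning of a binary image (True = foreground).
    Border pixels are assumed to be background."""
    img = img.astype(bool).copy()

    def neighbours(y, x, im):
        # P2..P9, clockwise starting from the pixel above (north).
        return [im[y-1, x], im[y-1, x+1], im[y, x+1], im[y+1, x+1],
                im[y+1, x], im[y+1, x-1], im[y, x-1], im[y-1, x-1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):                       # the two subiterations
            to_delete = []
            for y in range(1, img.shape[0] - 1):
                for x in range(1, img.shape[1] - 1):
                    if not img[y, x]:
                        continue
                    P = neighbours(y, x, img)
                    B = sum(P)                    # foreground neighbours
                    # A = number of 0->1 transitions around the pixel
                    A = sum((not P[i]) and P[(i + 1) % 8] for i in range(8))
                    if step == 0:                 # P2*P4*P6 = 0, P4*P6*P8 = 0
                        cond = (not (P[0] and P[2] and P[4])) and \
                               (not (P[2] and P[4] and P[6]))
                    else:                         # P2*P4*P8 = 0, P2*P6*P8 = 0
                        cond = (not (P[0] and P[2] and P[6])) and \
                               (not (P[0] and P[4] and P[6]))
                    if 2 <= B <= 6 and A == 1 and cond:
                        to_delete.append((y, x))
                for_ = None  # pixels are removed after the full pass (parallel)
            for y, x in to_delete:
                img[y, x] = False
            changed |= bool(to_delete)
    return img
```

Running this on a thick horizontal bar leaves a one-pixel-high line along its middle, which is exactly the medial-axis behaviour described for Figure 5.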
The greedy revisit algorithm enables the TSP path to follow complex letter structures, including loops and branches, while preserving connectivity between stroke segments. Each resulting path is visualized as a directed graph to analyze its correspondence with the original letter shapes. For evaluation, one word containing multiple looping letters was selected from each segmented line to facilitate clear observation and analysis of the generated paths. The detailed TSP path results are shown in Figure 6(a), while loop handling and revisiting behavior are highlighted in Figure 6(b).

Figure 6. TSP results: (a) overall generated TSP path and (b) detail view of loop traversal and revisiting

The generated TSP path is able to follow the entire structure of a letter without breaking, even when the structure is circular or branched. Points visited more than once are marked in purple, indicating that revisiting occurs in those areas due to the complexity of the letter structure. Unlike conventional TSP, which allows only one visit per point, this method creates paths that more closely resemble the hand-drawn strokes that form the letters.

Letter segmentation results
After obtaining the TSP paths and sub-paths, letter segmentation is performed using bounding boxes for each sub-path. The process starts by extracting skeleton points and generating a right-to-left TSP path based on Euclidean distance. The TSP path is then divided into sub-paths using the split_tsp_path_by_distance() function, where a new sub-path is created whenever the distance between consecutive points exceeds a maximum threshold (e.g., 3 pixels). For each sub-path, a bounding box is computed using the minimum and maximum x and y coordinates, enabling individual letters to be cropped into separate image segments. These segmented letter images are saved in a designated directory, with a "label.txt" file prepared for subsequent labeling.
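The sub-path splitting and bounding-box computation just described can be sketched as follows. split_tsp_path_by_distance() is the function named in the text; bounding_box and its margin parameter (the δ of the Method section) are illustrative reconstructions, not the authors' code:

```python
import math

def split_tsp_path_by_distance(path, max_dist=3.0):
    """Split a TSP point sequence into sub-paths wherever the jump between
    consecutive points exceeds max_dist (a new letter/stroke begins)."""
    sub_paths, current = [], [path[0]]
    for prev, pt in zip(path, path[1:]):
        if math.dist(prev, pt) > max_dist:
            sub_paths.append(current)
            current = []
        current.append(pt)
    sub_paths.append(current)
    return sub_paths

def bounding_box(sub_path, margin=1):
    """Axis-aligned box around a sub-path, padded by `margin` pixels (δ)."""
    xs = [p[0] for p in sub_path]
    ys = [p[1] for p in sub_path]
    return (min(xs) - margin, max(xs) + margin,
            min(ys) - margin, max(ys) + margin)
```

A path with a large gap between its third and fourth points, for instance, splits into two sub-paths, each of which then yields its own crop box.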
An example of the bounding-box-based letter segmentation results is shown in Figure 7.

Figure 7. Letter segmentation results: (a) first, (b) second, (c) third, (d) fourth, and (e) fifth segmented letter

Figure 7 shows that this method is effective in following the complex shapes of Arabic script through skeleton paths and in separating letters based on the bounding box of each TSP sub-path. The test results are presented as letter segment visualizations and extracted image files, which serve as the basis for the letter recognition stage in the next process.

FCC results
In this study, FCC is used to analyze reconstructed letter strokes by examining the shape and direction of pen movement. After skeletonization and TSP-based path tracking, multiple sub-paths representing letter skeletons are obtained. Each sub-path is then converted into an FCC through the following steps. First, each sub-path is processed sequentially as a series of (x, y) skeleton coordinates. Next, direction codes are assigned for each pair of consecutive points using the direction_code() function when the points are direct 8-connected neighbors. If consecutive points are not adjacent, stepwise interpolation is applied by incrementally moving toward the target point based on the sign of the coordinate differences, converting each step into a Freeman direction code. A maximum step limit is imposed to avoid infinite loops on invalid paths. All generated direction codes are stored as an array representing the FCC of each sub-path. An example of FCC extraction for the word "Perahu" is shown in Figure 8, where the small white numbers on each path indicate the direction of movement based on the FCC, and the blue and yellow dots indicate the starting and ending points of each sub-path.

Figure 8.
FCC generated from the word, starting from the top right and moving left

Evaluation metrics results
After completing all pre-processing, segmentation, and TSP-based processing stages, an evaluation is conducted to assess the system's ability to detect and segment letters. The evaluation compares the expected number of characters in the ground truth (GT) with the number of characters detected by the system. GT values are obtained manually from reference data, while predictions are derived from the segmentation results produced by the algorithm. In this study, GT data are taken from the first page of the Syair Perahu manuscript by randomly selecting three lines and evaluating them manually. Prior to line-level evaluation, a preliminary word-level experiment is performed using the word "Perahu", as discussed in the previous section. A detailed breakdown of the evaluation results for "Perahu" is presented in Table 1, and a detailed explanation of each evaluation value per row is given in Table 2. As a complement to the visual analysis, Table 3 presents a quantitative evaluation of the segmentation results for each line of the manuscript, including the number of GT characters, detected characters (DT), correctly segmented characters (TP), FP, FN, and performance metrics such as accuracy, precision, recall, F1-score, and IoU.

Table 1. Evaluation results of the word "Perahu" (columns: Figure, Precision, Recall, F1-score, Accuracy, IoU)

Table 2. Evaluation results per row (columns: Original image, Processed result, Precision, Recall, F1-score, Accuracy, IoU)

Table 3. Detailed evaluation results of the segmentation system (rows: P01-1, P01-9, P01-10, Average; columns: Precision, Recall, F1-Score, Accuracy, IoU)

As a result, the evaluation metrics show that the system performs reasonably well at recognizing the number of letters based on the number detected.
However, because the evaluation only considers agreement in character counts, without verifying the identity or position of the letters, the results do not fully reflect the overall accuracy of the system. This focus was chosen because the research is still at the stage of reconstructing basic stroke forms through skeletonization and path optimization, rather than full character recognition. To evaluate the temporal accuracy of the stroke reconstruction sequence, this study adds nodes at each point of the TSP trajectory. Each node represents the algorithm's traversal sequence, so that the reconstruction process can consider both the spatial dimension and the temporal order of the strokes. The results of the node addition strategy are shown in Figure 9(a), which presents the overall word-level trajectory, and Figure 9(b), which provides a detailed view of the reconstructed stroke sequence.

Figure 9. Implementation of adding nodes on the TSP path: (a) global trajectory after node insertion and (b) magnified view illustrating the refined and continuous stroke reconstruction

The sequential node numbering along the TSP path records the traversal order (temporal order) of stroke reconstruction. Each node is assigned a global sequence number across all sub-paths, allowing every skeleton point to be represented not only spatially but also temporally based on its visit order. This enables evaluation from two perspectives:
- Spatial accuracy, which assesses whether the TSP path fully covers the character shape without missing stroke components.
- Temporal accuracy, which evaluates whether the reconstructed stroke order matches natural handwriting patterns, such as correct traversal direction or loop ordering.
Thus, node numbering functions as a timestep indicator in stroke reconstruction. To further validate the evaluation results, a binomial test is applied to confirm that the observed accuracy is not due to chance and exceeds the random baseline (50%). The hypotheses are defined as: null hypothesis (H0): accuracy ≤ 0.50; alternative hypothesis (H1): accuracy > 0.50. For the word "Perahu", the obtained p-value, although slightly above the 0.05 significance threshold, indicates that the model's accuracy is marginally but consistently higher than random guessing, supporting the claim that the proposed method demonstrates meaningful generalization capability.

Stroke reconstruction evaluation
To complement the character-count-based evaluation, this study adds a quantitative evaluation of the accuracy of the stroke paths generated by the TSP algorithm. The evaluation uses two main metrics: average path deviation (APD), which measures the spatial accuracy of the stroke paths, and temporal sequence accuracy (TSA), which measures the correspondence of the stroke sequence to the original writing. The results of this quantitative evaluation are presented in Table 4.

Table 4. Quantitative accuracy assessment of reconstructed stroke paths (rows: P01-1, P01-9, P01-10, Average; columns: APD (px), TSA (%); per-line TSA values include 86.3% and 82.5%)

The evaluation results in Table 4 show that the TSP reconstruction path has low spatial deviation, with an average APD of 2.03 pixels, indicating that the stroke shape is close to the ground-truth pattern. In addition, the average TSA of 85.6% indicates that most of the traversal sequence of the TSP path follows the actual letter writing pattern. Thus, the stroke reconstruction is not only visually correct but also quantitatively measurable in terms of both shape and writing sequence.
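The one-tailed binomial test used above can be computed directly from its definition; a minimal sketch:

```python
from math import comb

def binomial_p_value(k, n, p0=0.5):
    """One-tailed binomial test: probability of observing k or more
    successes in n trials if the true success rate were p0."""
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i)
               for i in range(k, n + 1))
```

As an illustration (with hypothetical counts, not the paper's data): 8 correct out of 10 against a 0.5 baseline gives p = 0.0546875, a value that, like the result reported for "Perahu", lands just above the 0.05 threshold.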
CONCLUSION
This study demonstrates a skeleton-based image processing framework for Pegon script that integrates thinning-based skeletonization, a modified TSP, and greedy revisit optimization to reconstruct handwritten stroke paths. The proposed approach effectively handles complex Arabic letter structures involving loops and branches, enabling accurate stroke traversal and letter segmentation using Euclidean-distance-based sub-path separation and bounding box cropping. Experimental results show that the system achieves an average precision of 0.552, an F1-score of 0.657, and an accuracy of 82%, indicating reliable performance in detecting and segmenting Pegon letters based on character count. To further validate the effectiveness of stroke reconstruction, a quantitative evaluation using APD and TSA was conducted. The results show a low average APD of 2.03 pixels and a high TSA of 85.6%, confirming that the reconstructed paths closely follow both the spatial shape and the temporal writing sequence of the original handwriting. These findings demonstrate that the proposed TSP-based stroke reconstruction is not only visually coherent but also quantitatively accurate. Although the current evaluation focuses on segmentation and stroke reconstruction rather than full character recognition, the results provide a strong foundation for future work incorporating character classification and spatial context analysis to support complete Pegon manuscript processing.

FUNDING INFORMATION
The authors are grateful for the funding support from Universitas Yudharta Pasuruan and the Research Center for Artificial Intelligence and Cybersecurity, National Research and Innovation Agency (BRIN), Bandung, which contributed to the implementation of this research.

AUTHOR CONTRIBUTIONS STATEMENT
This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.
Authors: Alifah, Dian Andriana, Muhammad Zulhaj Aliansyah, Lukman Hakim, Kholid Murtadlo

CRediT roles: C: Conceptualization; M: Methodology; So: Software; Va: Validation; Fo: Formal analysis; I: Investigation; R: Resources; D: Data Curation; O: Writing - Original Draft; E: Writing - Review & Editing; Vi: Visualization; Su: Supervision; P: Project administration; Fu: Funding acquisition

CONFLICT OF INTEREST STATEMENT
The authors state no conflict of interest.

DATA AVAILABILITY
The data supporting the findings of this study are publicly available at https://data.id/dataset.xhtml?persistentId=hdl:20.12690/RIN/1ESGFP.

REFERENCES