Journal of Intelligent System and Telecommunications, Volume 1, Issue 1, Dec 2024, 84-95

Journal homepage: https://journal.id/index.php/jistel/index

Imitation Learning Based Obstacle Avoidance for MSL Soccer Robot in Offensive Scenario

Achmad Akmal Fikri1*

Graduate School of Science and Technology, Kumamoto University, Japan

*Correspondence: E-mail: akmal@st.kumamoto-u.

ARTICLE INFO

Article History: Submitted/Received 09 December 2024; First Revised 31 December 2024; Accepted 31 December 2024; First Available Online 31 December 2024; Publication Date 31 December 2024

Keywords: Soccer Robot, Obstacle Avoidance, Offensive Scenario, Imitation Learning, GAIL

© 2024 Tim Pengembang Jurnal UNESA

ABSTRACT

Kontes Robot Sepak Bola Indonesia Beroda (KRSBI-B), inspired by the Middle-Size League (MSL) of RoboCup, serves as a platform to push advancements in autonomous soccer robots in Indonesia. A key requirement for these robots is the ability to perceive their environment and make independent decisions, which involves recognizing field features, planning navigation, and playing offensive and defensive strategies. Among these, obstacle avoidance during offensive play is critical, as robots should dynamically navigate while targeting the goal. In line with the theme "Toward Robot Soccer League 2050," this study focuses on developing robots capable of human-like performance in dynamic and competitive settings. To achieve this, we utilize Generative Adversarial Imitation Learning (GAIL), a method that enables robots to learn adaptive navigation strategies from expert demonstrations. Equipped with an omnidirectional camera, the robot identifies obstacles, field lines, and goal positions, integrating this sensory data into its decision-making framework. The system was tested in four scenarios: no obstacles, one obstacle, two obstacles, and three obstacles, with randomized obstacle positions. Success rates of 100%, 99.5%, 92.5%, and 82.5% were recorded, demonstrating the system's effectiveness in navigating complex environments and its potential to enhance robotic soccer performance.

INTRODUCTION

Kontes Robot Indonesia (KRI) is a prestigious robotics competition held annually in Indonesia, designed to encourage university students to innovate and solve real-world problems using robotics. Among its various divisions, the Kontes Robot Sepak Bola Indonesia Beroda (KRSBI-B) stands out as a challenging competition that focuses on wheeled soccer robots. KRSBI-B adopts the rules of the RoboCup Middle-Size League (MSL), where fully autonomous wheeled robots compete in soccer matches. By aligning with RoboCup standards, KRSBI-B aims to improve the competitiveness of Indonesian teams at the global level.

A fundamental aspect of KRSBI-B and RoboCup MSL is the requirement for a fully autonomous system. The robots should independently perceive their environment, plan their movements, and execute both offensive and defensive strategies in real time without human intervention. Robust sensor integration, effective algorithms, and sophisticated decision-making frameworks are essential for these capabilities. In offensive scenarios, robots should navigate toward the goal while keeping ball possession on a field full of dynamic obstacles, such as opponents and teammates. Because of these challenges, advanced techniques that combine offensive play and obstacle avoidance must be developed.

In recent years, obstacle avoidance has been widely studied using both traditional methods, such as rule-based systems and path-planning algorithms, and learning-based approaches. One of the Indonesian MSL teams applies a fuzzy logic controller to avoid collisions with obstacles while catching the ball. Another research
implements a type-2 fuzzy logic system collaborating with a behavior tree for making decisions and avoiding obstacles in SSL robot matches. Our team also proposes an artificial potential field (APF) based on a fuzzy logic controller for obstacle avoidance. Another team designs an obstacle avoidance system by combining an improved dynamic window approach (IDWA) and an artificial potential field (APF). Other research proposes path-planning approaches for collision avoidance using search-based and sample-based methods. Furthermore, recent advances in machine learning, including reinforcement learning and imitation learning, demonstrate significant promise for this purpose. One study introduces a combination of artificial neural networks (ANNs) and a standardization technique to improve the performance of the trained network. Example studies of reinforcement learning take different approaches, such as mobile robot collision avoidance learning (MCAL) based on reinforcement learning integrated with path planning, and an end-to-end map-based deep reinforcement learning algorithm using dueling double DQN with prioritized experience replay. Imitation learning is also commonly chosen for obstacle avoidance, for example a local policy based on an egocentric local occupancy map and imitation learning based path planning (ILPP) in dynamic pedestrian environments.

On the basis of recent advances in imitation learning, this study proposes a generative adversarial imitation learning (GAIL) based framework relying on an image-based virtual range finder with an omnidirectional camera for obstacle avoidance in offensive scenarios with wheeled soccer robots. GAIL is a method that trains policies by imitating expert demonstrations without the need for a predefined reward function, setting it apart from traditional reinforcement learning, which relies on manually designed rewards.
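As a rough illustration of this idea, the sketch below shows how a logistic discriminator over state-action pairs can induce a surrogate reward for the policy. The dimensions, weights, and sign convention are assumptions for illustration only, not the paper's implementation:

```python
import numpy as np

# Minimal sketch of GAIL's implicit reward signal (illustrative assumptions).
# A logistic discriminator D(s, a) scores state-action pairs; the policy is
# rewarded when the discriminator cannot tell its pairs from the expert's.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator(state_action, w, b):
    # Probability that the pair came from the generator (one common convention)
    return sigmoid(state_action @ w + b)

def gail_reward(state_action, w, b):
    # Low confidence that the pair is generated means it looks expert-like,
    # so the surrogate reward is high.
    d = discriminator(state_action, w, b)
    return -np.log(d + 1e-8)

rng = np.random.default_rng(0)
w = rng.normal(size=4)        # toy 4-dim state-action vector (assumed size)
b = 0.0
pair = rng.normal(size=4)
print(gail_reward(pair, w, b))  # scalar surrogate reward for this pair
```

In the full algorithm the discriminator is trained adversarially against the policy, so this reward shifts as training progresses rather than being fixed in advance.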
An image-based virtual range finder is also integrated into the GAIL framework, providing a 360-degree spatial understanding of the environment. This enables the agent to effectively identify obstacles and map state-action pairs, which is essential for navigating dynamic environments and maintaining offensive strategies in wheeled soccer robot scenarios. This work aims to enhance the capabilities of soccer robots in KRSBI Beroda and RoboCup MSL.

DOI: http://dx.org/10.x/jistel.vXiX | p-ISSN 2528-1410 | e-ISSN 2527-8045

METHODS

Image-Based Virtual Range Finder with Omnidirectional Camera

In robotic soccer, an effective perception system is essential for robots to navigate dynamic scenarios while maintaining awareness of their surroundings. Our robots use an omnidirectional camera as their main sensor, providing a full 360-degree field of view. This configuration allows the robots to detect key elements in their environment, including the ball, obstacles, and field lines, enabling them to respond effectively to rapidly changing situations.

Algorithm 1: Image-Based Virtual Range Finder Approach for Obstacle Avoidance

Input: camera stream
Output: ranges

Function roi_field(img, vertices):
  1. Create a mask of zeros with the same shape as img
  2. Fill a polygon on the mask using vertices
  3. Perform bitwise AND between img and the mask
  4. Return masked_image

Function drawLineRadial(point, x, y, radius):
  1. Initialize empty list pts
  2. Compute angle as 360/point
  3. for i = 0 to point do
  4.   Compute xp = x + radius * cos(angle * i / 360 * 2pi)
  5.   Compute yp = y + radius * sin(angle * i / 360 * 2pi)
  6.   Append (int(xp), int(yp)) to pts
  7. Return pts

Function radialPoints(angle, x, y, radius):
  1. Compute xrp = x + radius * cos(angle / 360 * 2pi)
  2. Compute yrp = y + radius * sin(angle / 360 * 2pi)
  3. Store (xrp, yrp) as ptsr
  4. Return ptsr

Main:
  1. Capture frame and preprocess it (crop, resize, and convert to HSV)
  2. Retrieve HSV thresholds using calibration sliders
  3. Generate masks (mask1, mask2) using HSV thresholds
  4. Refine masks with erosion and dilation
  5. Combine masks into mask_region
  6. Detect contours in mask_region
  7. for each contour do: compute convex hull and append to region_field
  8. Mask the region of interest using roi_field
  9. Highlight obstacles in region_obstacle
  10. Convert region_obstacle to grayscale and threshold it to create thresh_obs
  11. Compute inward radial points ptsIn using drawLineRadial
  12. for each radial line do
  13.   for each radius step from 0 to max radius do
  14.     Compute radial points using radialPoints
  15.     if thresh_obs at (xrp, yrp) is an obstacle then append the point to ptsOut and break
  16. for each pair of ptsIn and ptsOut do: draw a line on the frame, compute the distance, and store it in ranges
  17. Return ranges

For obstacle detection, we propose a range finder approach that searches for obstacle features in the video stream. Algorithm 1 describes the implementation of this virtual range finder, detailing the steps involved in generating radial obstacle detection lines using image processing techniques. The method combines region-of-interest masking, HSV-based segmentation, and contour analysis to accurately determine the location and distance of obstacles relative to the robot.

Based on Algorithm 1, the frame from the omnidirectional camera is processed to detect obstacles. It begins by preprocessing the image and creating masks using HSV thresholds, refined with morphological operations. Contours are extracted to identify obstacle regions, which are masked to isolate the area of interest.
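The radial-scan core of Algorithm 1 (the per-ray search for the first obstacle pixel) can be sketched in Python as follows. The mask size, ray count, and obstacle placement are illustrative assumptions, and the HSV/contour preprocessing that would produce thresh_obs from the camera frame is omitted:

```python
import numpy as np

# Sketch of the radial-scan step of Algorithm 1 (names and sizes are
# illustrative assumptions, not the paper's code). Given a binary obstacle
# mask centered on the robot, cast n_rays radial lines and return the
# distance (in pixels) to the first obstacle pixel along each ray, or
# max_radius if the ray is clear.

def radial_ranges(thresh_obs, cx, cy, n_rays=20, max_radius=100):
    ranges = []
    h, w = thresh_obs.shape
    for i in range(n_rays):
        angle = 2.0 * np.pi * i / n_rays
        hit = max_radius
        for r in range(1, max_radius):
            xp = int(cx + r * np.cos(angle))
            yp = int(cy + r * np.sin(angle))
            if not (0 <= xp < w and 0 <= yp < h):
                break  # ray left the image; treat as clear up to the border
            if thresh_obs[yp, xp] > 0:  # first obstacle pixel on this ray
                hit = r
                break
        ranges.append(hit)
    return ranges

# Synthetic 200x200 mask with one obstacle block to the robot's right
mask = np.zeros((200, 200), dtype=np.uint8)
mask[95:105, 140:160] = 255          # obstacle roughly 40 px to the right
ranges = radial_ranges(mask, cx=100, cy=100)
print(ranges[0])                     # ray at angle 0 points right -> 40
```

In the real pipeline the hit points would also be drawn on the frame for visualization, and the pixel distances converted to metric ranges via the camera's mirror calibration.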
Afterwards, the virtual range finder is implemented by drawing radial lines from a central point and detecting obstacles based on the pixel values along these lines. Detected points are visualized with lines, and the computed ranges are returned for obstacle avoidance, providing a vision-based solution for spatial awareness and navigation. Finally, the obstacle detection frame is shown in Figure 1.

Figure 1. Obstacle Detection Visualization.

Generative Adversarial Imitation Learning

Generative Adversarial Imitation Learning (GAIL) is a framework enabling robots to imitate expert behavior by learning policies directly from demonstrations, without requiring explicit reward signals as in traditional reinforcement learning. Instead, GAIL infers an implicit reward function from expert trajectories, allowing robots to navigate complex environments and replicate expert strategies. This approach is particularly suited to dynamic tasks such as soccer.

In MSL soccer, obstacle avoidance is crucial for effective navigation in offensive scenarios. Robots should adapt to changing environments, avoiding collisions while maintaining offensive positioning. GAIL facilitates this by learning obstacle avoidance and strategic behaviors through expert demonstrations.

The field size used in KRI is 12 x 8 meters, divided into two areas: a defensive area and an offensive area. In this scenario, three obstacles are placed in the offensive area, as depicted in Figure 2. The objective is for the robot (blue mark) to navigate to the opponent's penalty box (red area) without colliding with the obstacles before scoring the goal.

Figure 2. Field Visualization.

Based on this problem statement, GAIL employs a generator-discriminator framework to learn policies that mimic expert behavior, as illustrated in the architecture diagram (Figure 3).
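As a rough sketch, the trial objective just described (reach the opponent's penalty box without touching any obstacle) can be expressed as a success predicate over a trajectory. The penalty-box rectangle, the robot and obstacle radii, and the sample coordinates below are illustrative assumptions, not the official field dimensions:

```python
import math

# Toy success predicate for one trial on a 12 x 8 m field (x along the
# field length). The penalty-box rectangle and radii are assumed values
# for illustration only.

FIELD_LENGTH, FIELD_WIDTH = 12.0, 8.0
GOAL_BOX = (10.25, 12.0, 2.75, 5.25)   # x_min, x_max, y_min, y_max (assumed)
ROBOT_RADIUS, OBSTACLE_RADIUS = 0.25, 0.25

def collided(pos, obstacles):
    # Circle-circle overlap test between the robot and each obstacle
    return any(math.dist(pos, o) < ROBOT_RADIUS + OBSTACLE_RADIUS
               for o in obstacles)

def reached_goal(pos):
    x, y = pos
    x_min, x_max, y_min, y_max = GOAL_BOX
    return x_min <= x <= x_max and y_min <= y <= y_max

def trial_success(trajectory, obstacles):
    # Success: no collision anywhere along the path, and the final
    # position lies inside the penalty box.
    if any(collided(p, obstacles) for p in trajectory):
        return False
    return reached_goal(trajectory[-1])

obstacles = [(7.0, 4.0), (8.5, 2.5), (9.5, 5.5)]
path = [(6.0, 4.0), (7.0, 5.0), (9.0, 4.0), (11.0, 4.0)]
print(trial_success(path, obstacles))   # True: path clears all obstacles
```

A predicate of this form is what the evaluation section later counts over repeated trials to produce a success rate.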
The generator, denoted as pi_theta(a|s), represents the policy model parameterized by theta. It interacts with the environment, generating actions (a_t) based on the current state (s_t) and producing trajectories. These trajectories are fed into the discriminator, represented by D(s, a), which distinguishes the generator's trajectories from the expert demonstrations.

Figure 3. GAIL Architecture in this Scenario.

The discriminator outputs a reward, providing feedback to the generator by evaluating the similarity between its trajectories and the expert's. The generator updates its policy by minimizing the discriminator's negative log-loss, improving its ability to mimic expert behavior. Meanwhile, the discriminator is trained to maximize its accuracy in classifying expert (s, a) pairs and generated (s, a) pairs. Through this adversarial process, the generator gradually learns to produce trajectories similar to the expert demonstrations, achieving expert-level behavior without explicit reward functions.

The neural network architecture in Figure 5 is designed to facilitate robot navigation on a soccer field, directly correlating with the problem representation illustrated in Figure 4. Figure 4 highlights the goal area in red and depicts the key inputs to the network: actions (v_x, v_y, omega), the robot's current state derived from 20 image-based virtual omnidirectional range finders, and the relative position (dx, dy) between the robot and the goal area. These inputs enable spatial awareness, obstacle detection, and goal-directed navigation.

Figure 4. Problem Statement Representation: Environment, Action, and Current State.

As shown in Figure 5, the network's input layer consists of 25 neurons categorized into three components: the current velocities (v_x, v_y, omega), the radial distances from the virtual range finders (d_1 ... d_20), and the relative position to the goal (dx, dy). Two hidden layers process these inputs, extracting patterns for obstacle avoidance and goal navigation. The output layer generates motion commands, where v_a(x, y) defines the linear velocity and omega_a specifies the angular velocity, guiding the robot to the goal area in Figure 4 while avoiding obstacles dynamically and efficiently.

Figure 5. Neural Network Architecture.

Evaluation

Success Rate

The success rate is a key evaluation metric used to measure the robot's performance in its offensive scenarios. This metric is calculated as the ratio of successful trials to the total number of trials, expressed as a percentage. A trial is successful if the robot reaches the goal within a defined area while avoiding obstacles.

Success Rate (%) = (Number of Successful Trials / Total Number of Trials) x 100

The success rate is tested across scenarios of varying obstacle placement. This metric reflects the robot's reliability and effectiveness in obstacle avoidance, with higher rates indicating better navigation performance.

EXPERIMENT

Experimental Setup

In this experiment, the robot was simulated in Gazebo as the world simulation. As visualized in Figure 6, the robot design was an omnidirectional robot with a 4-wheel configuration based on the real robot's design, with an omnidirectional camera as the main sensor. The simulation was executed on a laptop running Ubuntu 22.04 LTS and ROS 2, equipped with an Intel Core i7-14650H CPU, an NVIDIA GeForce RTX 4060 laptop GPU, and 16 GB of RAM.

Figure 6. Robot Visualization: (a) Real Robot, (b) Gazebo Simulation.
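The policy network described in the Methods section (25 inputs: three current velocities, twenty range-finder readings, and the goal-relative position; two hidden layers; linear and angular velocity outputs) can be sketched as a minimal forward pass. The hidden width of 64 and the random weights are assumptions, since the trained parameters are not reproduced here:

```python
import numpy as np

# Minimal sketch of the policy network's shape: 25 inputs -> two hidden
# layers -> 3 outputs (v_x, v_y, omega). Hidden width and weights are
# illustrative assumptions, not the trained model.

rng = np.random.default_rng(42)

def layer(n_in, n_out):
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

W1, b1 = layer(25, 64)
W2, b2 = layer(64, 64)
W3, b3 = layer(64, 3)

def policy(obs):
    h = np.maximum(obs @ W1 + b1, 0.0)   # ReLU hidden layer 1
    h = np.maximum(h @ W2 + b2, 0.0)     # ReLU hidden layer 2
    return h @ W3 + b3                   # motion command (v_x, v_y, omega)

obs = np.concatenate([
    np.zeros(3),            # current velocities (v_x, v_y, omega)
    np.full(20, 5.0),       # 20 virtual range-finder distances
    np.array([2.0, 0.5]),   # relative position to the goal (dx, dy)
])
action = policy(obs)
print(action.shape)         # (3,)
```

In GAIL, this forward pass plays the role of the generator pi_theta; its weights are what the adversarial training updates.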
Data Collection

User control data was collected using a joystick in the Gazebo simulator. The user operated the soccer robot to reach the goal area while avoiding obstacles. As illustrated in Figure 2, three obstacles were selected from 15 possible positions, marked as orange points. This process was repeated for 100 episodes, with obstacle positions shuffled randomly within each row to ensure diverse expert demonstrations, improving the learning effectiveness.

Train Results

The training experiment was performed with an omnidirectional soccer robot in a virtual environment simulated in Gazebo. As shown in Figure 4, an image-based 360-degree virtual range finder was used, displayed at the top center. Additionally, in each training episode the three cube obstacles were placed randomly within their rows. The hyperparameter settings for imitation learning are shown in Table 1.

Table 1. Hyperparameter Settings

Hyperparameter        Value
Maximum Steps
Batch Size
Learning Rate
Weight Decay
Number of Layers
Hidden Units
Input Dropout
Dropout
Activation            ReLU

Figure 7 shows the training performance over time, with significant initial fluctuations in rewards, including sharp dips, as the model explores suboptimal policies. Over time, the rewards improved and stabilized, indicating successful learning and convergence to an effective strategy, although occasional dips might result from the exploration of new strategies or challenging scenarios.

Figure 7. Reward on Imitation Learning.

Test Results

To evaluate the performance of the trained model, we conducted tests across four different scenarios: no obstacles, one obstacle, two obstacles, and three obstacles. Each scenario was designed to assess the robot's ability to navigate toward the goal area while avoiding collisions with obstacles.

- No Obstacle: The robot navigates freely, serving as the baseline.
- One Obstacle: A single obstacle was placed in the first row, shuffled randomly per trial.
- Two Obstacles: Obstacles were placed in the first and second rows, with positions shuffled within their rows.
- Three Obstacles: Obstacles were placed in the first, second, and third rows, with random shuffling in each row for every trial.

For each scenario, we conducted 200 trials to ensure statistical significance. A trial was deemed successful if the robot reached the goal area without colliding with any obstacles. The success rate for each scenario was then calculated as the percentage of successful trials out of the total trials. The results are shown in Table 2.

Table 2. Results of Each Scenario

Scenario                 Success Rate
Without Obstacle         100%
With One Obstacle        99.5%
With Two Obstacles       92.5%
With Three Obstacles     82.5%

These results demonstrate the robot's capability to navigate effectively under varying levels of environmental complexity. As the number of obstacles increases, the success rate decreases, indicating the challenge posed by more dynamic and cluttered environments.

Figure 8. Sample Trajectories: (a) No Obstacle, (b) One Obstacle, (c) Two Obstacles, (d) Three Obstacles.

Additionally, we provide a sample trajectory plot (Figure 8) for each scenario, illustrating the robot's path as it approaches the goal area while avoiding obstacles. These trajectories highlight the model's ability to generate efficient and collision-free paths.

CONCLUSION

This study implements Generative Adversarial Imitation Learning (GAIL) to teach MSL soccer robots obstacle avoidance in offensive scenarios.
Using expert demonstrations, the robot learned effective navigation policies, achieving a 100% success rate without obstacles and 82.5% with three randomly placed obstacles. The results demonstrate GAIL's capability to handle dynamic environments and ensure efficient, collision-free trajectories. This study highlights the potential of GAIL in advancing robotic soccer navigation and imitation learning applications. For future work, more complex obstacle setups, dynamic opponents, and multi-agent strategies can be explored to further enhance performance.

ACKNOWLEDGMENT

We would like to thank the Graduate School of Science and Technology, Kumamoto University, Japan, for providing data for this study.

AUTHOR'S NOTE

The authors declare that there is no conflict of interest regarding the publication of this article. The authors confirmed that the paper was free of plagiarism.

AUTHOR'S CONTRIBUTION/ROLE

Achmad Akmal Fikri: Conceptualization, Methodology, Formal Analysis, Writing Original Draft, Investigation.

REFERENCES