Science and Technology Indonesia
e-ISSN: 2580-4391, p-ISSN: 2580-4405
Vol. 10, October 2025

Research Paper

The Kernel Function of Reproducing Kernel Hilbert Space and Its Application on Support Vector Machine

Bernadhita Herindri Samodera Utami1,2*, Warsono2, Mustofa Usman2, Fitriani2

1Doctoral Student of Mathematics and Natural Science, Faculty of Mathematics and Natural Science, Universitas Lampung, Lampung, 35145, Indonesia
2Department of Mathematics, Faculty of Mathematics and Natural Science, Universitas Lampung, Lampung, 35145, Indonesia
*Corresponding author: bernadhita@fmipa.

Abstract
Reproducing Kernel Hilbert Space (RKHS) is a Hilbert space consisting of functions that can be represented or reproduced by a kernel function. With the development of data science, RKHS has become a method, that is, an approach or technique using the concept of reproducing kernels in certain applications, especially machine learning. Support Vector Machine (SVM) is one of the machine learning methods in the supervised learning category for classification and regression tasks. This research aims to determine the form of the linear, polynomial, and Gaussian kernel functions in Support Vector Machine analysis and to analyze their performance in Support Vector Machine classification and regression. Applying the RKHS method to SVM classification of the World Disaster Risk Dataset published by the Institute for International Law of Peace and Armed Conflict (IFHV) of Ruhr-University Bochum in 2022, and comparing predictions on training and testing data using the linear, polynomial, and Gaussian kernel functions, shows that classification using the linear kernel provides the best prediction performance.

Keywords
Linear Kernel Function, Polynomial Kernel Function, Gaussian Kernel Function, Reproducing Kernel Hilbert Space

Received: 19 January 2025, Accepted: 9 July 2025
https://doi.org/10.26554/sti.

1. INTRODUCTION

Reproducing Kernel Hilbert Space (RKHS) is a Hilbert space consisting of functions that can be represented or reproduced by a kernel function (Awan et al.). The main components of the RKHS concept include the Hilbert space, the reproducing property, and the kernel function (Chen et al., 2021; Paulsen and Raghupathi). A Hilbert space has the special property that every Cauchy sequence is convergent, and it is equipped with an inner product to calculate the length of a vector and the distance and angle between vectors (Utami et al.). The reproducing property means that the evaluation of a function at a point can be defined as the inner product between the original function and the kernel function (Yang). The contribution of the RKHS method is explained in the research of Paiva et al., Akgül, and Rosipal and Trejo, where it plays a role in neurophysiology, chemical engineering, and non-linear regression models in high-dimensional spaces. Along with its development, the RKHS method has become a strong theoretical basis for understanding and developing machine learning algorithms involving kernels (Arqub et al.). Applying the RKHS method in machine learning can solve non-linear regression and classification problems in high-dimensional feature spaces (in terms of the independent variables) without explicitly calculating the transformations. The development of machine learning has been relatively fast, spanning only tens of years.
Briefly, the machine learning era can be grouped into three eras: the era before 1980, the 1980s, and the 1990s to the present (Shukla et al.). Support Vector Machine (SVM) is a machine learning algorithm that falls under the broader field of artificial intelligence. Machine learning focuses on creating computer systems capable of learning from data, identifying patterns, and making decisions with minimal human involvement (Muthukrishnan et al.). Its primary objective is to design algorithms or models that enable computers to carry out specific tasks autonomously by learning from training data. The SVM method is used for classification analysis and regression analysis, where the model is trained using pairs of input (independent variable) and output (dependent variable) data that have been labeled. After the model is trained, the machine can be used to classify and build regression models from new data that has never been seen before (Pisner and Schnyer, 2020; Rampisela and Rustam, 2018; Anjani et al.). Initially, SVM was used to classify data into only two classes, but in its development SVM can be extended to multi-class classification (Cervantes et al., 2020; Azarnavid et al., 2015; Pisner and Schnyer, 2020). This is supported by the research of Mohan et al., who studied pattern recognition using data mining classification techniques; their results showed that the SVM algorithm produced good pattern prediction and low data classification error. Furthermore, Zhu et al. proposed the Huberized Pinball Support Vector Machine (HPSVM) method to improve the resilience of SVM to outliers. This motivates the present study to examine the application of the RKHS method in SVM.

One of the interesting things about the Support Vector Machine (SVM) algorithm is that it can be applied to analyze text categorization data, face detection, and bioinformatics, and even to study natural disaster areas that contain patterns or characteristics that can be identified by SVM (Ghosh et al.; Tuo et al.; Muthukrishnan et al.; Shukla et al.). In this study, the kernel function is applied in Support Vector Machine classification and regression analysis using disaster risk data from 197 countries from 2011 to 2021. The Support Vector Machine is believed to be a powerful and effective classification algorithm, especially in handling high-dimensional data (Rampisela and Rustam, 2018), so it is suitable for the analysis of disaster risk data from 197 countries from 2011 to 2021, which is panel data. Panel data is a type of data that involves observations made on the same units over several periods (Baltagi et al.). In the panel data context, SVM can be used for classification or regression tasks by considering the time structure, the dependencies between observations, the kernel choices, and the dimensions (Lestari et al.). If the panel data has a time structure, SVM can model the relationship between variables at a particular time based on historical data.

This study aims to (1) construct the properties of RKHS as a special case of a Hilbert space containing a unit element; (2) examine the closure properties of kernel functions applicable in RKHS; (3) construct linear, polynomial, and Gaussian kernel functions on RKHS; (4) implement linear, polynomial, and Gaussian kernel functions on Support Vector Machine classification and regression; and (5)
apply and compare the performance of linear, polynomial, and Gaussian kernel functions on Support Vector Machine classification and regression using a dataset.

Chakraborty introduces a Bayesian nonlinear multivariate regression model utilizing RKHS for high-dimensional datasets, particularly in near-infrared spectroscopy. The model adeptly handles multiple correlated responses and employs a robust Bayesian support vector regression framework, highlighting the efficacy of RKHS in complex, high-dimensional regression scenarios. Rincón and Ruiz-Medina propose a classification methodology combining wavelet bases with RKHS theory to analyze gene expression profiles, showcasing the adaptability of RKHS in functional data analysis. Unlike previous studies such as Chakraborty and Rincón and Ruiz-Medina, which focused primarily on domain-specific applications of RKHS in spectroscopy and gene expression analysis, this study presents a mathematically grounded construction of classical kernel functions in RKHS and applies them systematically to both classification and regression using high-dimensional panel data on disaster risk. This dual theoretical-practical approach, combined with a comparative analysis of kernel performance, offers both novel insights into RKHS theory and practical recommendations for kernel-based learning in real-world, time-dependent data settings.

The linear kernel function corresponds to the standard inner product in Euclidean space. A polynomial kernel introduces non-linearity while retaining interpretability, and a Gaussian kernel maps data into an infinite-dimensional space and is universal under certain conditions. Unlike previous studies that focused on the empirical implementation of kernel functions, this study provides a mathematical analysis of kernel properties in RKHS and systematically compares the performance of kernel functions on SVM using a data set.

There are four points of novelty in this study. First, this study constructs RKHS properties based on a Hilbert space equipped with a unit element; that is, it presents the development of theoretical properties of RKHS as a special case of a Hilbert space equipped with a unit element. Second, this study examines the closure properties of kernel functions in RKHS and aims to provide new insights into the operations applicable to RKHS. Third, this study discusses the linear, polynomial, and Gaussian kernel functions and constructs their formulations and properties in RKHS, which have not been widely explored mathematically in previous studies. Fourth, this study carries out the implementation and comparative analysis of kernel functions in Support Vector Machine (SVM) using a data set: it applies the linear, polynomial, and Gaussian kernel functions to the classification and regression of SVMs and, by comparing the performance of the different kernels on a dataset, provides recommendations for selecting the best kernel according to the problem context. By integrating rigorous mathematical analysis with applied machine learning, this study provides both a theoretical advancement in RKHS and practical insights for disaster data analysis using SVM.
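To make the three kernels concrete before their formal construction in Section 3, the following short Python sketch (not part of the original analysis) evaluates each of them for a pair of vectors; the polynomial kernel is written in the common $(\langle x, z \rangle + c)^d$ form, and the parameter values (degree, coef0, sigma) are illustrative assumptions.

```python
# Minimal sketch of the three kernel functions discussed in this study.
import numpy as np

def linear_kernel(x, z):
    # K(x, z) = <x, z>
    return np.dot(x, z)

def polynomial_kernel(x, z, degree=4, coef0=1.0):
    # K(x, z) = (<x, z> + c)^d, a polynomial in the linear kernel (illustrative form)
    return (np.dot(x, z) + coef0) ** degree

def gaussian_kernel(x, z, sigma=1.0):
    # K(x, z) = exp(-||x - z||^2 / (2 sigma^2)), the RBF kernel
    return np.exp(-np.linalg.norm(x - z) ** 2 / (2.0 * sigma ** 2))

x = np.array([1.0, 2.0])
z = np.array([0.5, -1.0])
print(linear_kernel(x, z), polynomial_kernel(x, z), gaussian_kernel(x, z))
```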
In this paper, Section 2 provides the methods or research procedures. Section 3 presents the results and discussion, which consist of three main parts: reproducing kernel Hilbert space on a support vector machine, the kernel function on a support vector machine, and a performance comparison of kernel functions on a support vector machine. Section 4 concludes the findings and provides recommendations for future research and for practical kernel selection in high-dimensional and temporal data settings.

2. EXPERIMENTAL SECTION

2.1 Methods
This section describes the steps taken.
1. Provide the remark on a Hilbert space with a unit element based on the given definition.
2. Provide examples of feature mappings in RKHS and the remark based on the examples.
3. Provide the proposition on the Gram matrix, which is useful for proving theorems on the properties of kernel functions.
4. Provide the theorems on the properties of the kernel function in RKHS.
5. Construct the mathematical formulation of the linear, polynomial, and Gaussian kernel functions within the RKHS framework on SVM.
6. Utilize the dataset with a Support Vector Machine (SVM) for classification and regression tasks and evaluate the performance of the linear, polynomial, and Gaussian kernel functions.

3. RESULTS AND DISCUSSION

3.1 Reproducing Kernel Hilbert Space on Support Vector Machine
Unlike previous studies that focused on the empirical implementation of kernel functions, this study provides a mathematical analysis of kernel properties in RKHS and systematically compares the performance of kernel functions on SVM. Thus, the novelty of this study can be described as follows. This study constructs more specific properties of RKHS by utilizing the concept of a unit element in a Hilbert space, which has not been discussed by previous researchers. By studying the closure properties of kernel functions, this study provides new insights into the operations applicable to RKHS. This study applies the linear, polynomial, and Gaussian kernel functions to SVM classification and regression. By comparing the performance of different kernels using datasets, this study provides recommendations for selecting the best kernel according to the problem context.

A Hilbert space is a key component of RKHS. The following is the definition of a Hilbert space with a unit element (Small and McLeish).

Definition 3.1 Given a Hilbert space $H$ and an element $1 \in H$. The ordered pair $(H, 1)$ is called a Hilbert space equipped with a unit element if $\langle 1, 1 \rangle = 1$; thus, the inner product of the unit element with itself is equal to one.

Remark 3.2 In a Hilbert space induced by an inner product, the unit element is not unique.

Reproducing Kernel Hilbert Space (RKHS) is fundamentally linked to kernel functions, as this specific type of Hilbert space is defined and constructed through these functions. The following is the definition of RKHS as presented by Berlinet and Thomas-Agnan.

Definition 3.3 Let $X$ be any set. A kernel function $K : X \times X \to \mathbb{R}$ is a reproducing kernel of a Hilbert space $H$ of functions on $X$ if and only if it satisfies the following conditions: (i) for every $v \in X$, $K(\cdot, v) \in H$; (ii) for every $v \in X$ and for every $f \in H$, $\langle f, K(\cdot, v) \rangle = f(v)$, where $f : X \to \mathbb{R}$.

Remark 3.4 Based on Definition 3.3, if $f : X \to \mathbb{R}$ and $K(\cdot, v)$ are chosen in a Hilbert space such that $\langle f, K(\cdot, v) \rangle = f(v)$ holds, then the Hilbert space in which this inner product holds is called the RKHS.

Kernel functions in RKHS make it possible to perform inner product computations in high-dimensional spaces without explicitly mapping the data to those spaces. The following defines the kernel function (Bowman and Azzalini, 1997; Shawe-Taylor and Cristianini, 2004; Ghojogh et al.).

Definition 3.5 Let $X$ be any set. A function $K : X \times X \to \mathbb{R}$ is called a kernel function if for every $v, w \in X$ it satisfies $K(v, w) = \langle \varphi(v), \varphi(w) \rangle$, where $\varphi : X \to \mathbb{R}^n$ is a mapping from $X$ to an $n$-dimensional feature space.

Here are examples of a kernel function.
Example 3.6 Let $X$ be the two-dimensional set of real numbers $\mathbb{R}^2$ and define a mapping $\varphi : \mathbb{R}^2 \to \mathbb{R}^3$ by
$$\varphi(v) = \varphi(v_1, v_2) = (v_1^2, v_2^2, \sqrt{2}\, v_1 v_2) \in \mathbb{R}^3.$$
Let $v, w \in X$. By Definition 3.5,
$$K(v, w) = \langle \varphi(v), \varphi(w) \rangle = \langle (v_1^2, v_2^2, \sqrt{2}\, v_1 v_2), (w_1^2, w_2^2, \sqrt{2}\, w_1 w_2) \rangle = v_1^2 w_1^2 + v_2^2 w_2^2 + 2 v_1 v_2 w_1 w_2 = (v_1 w_1 + v_2 w_2)^2 = \langle v, w \rangle^2.$$
Thus, the function $K(v, w) = \langle v, w \rangle^2$ is a kernel function corresponding to the feature space $\mathbb{R}^3$.

Example 3.7 Let $X = \mathbb{R}^2$ and define a feature mapping $\varphi : \mathbb{R}^2 \to \mathbb{R}^4$ by
$$\varphi(v) = \varphi(v_1, v_2) = (v_1^2, v_2^2, v_1 v_2, v_2 v_1) \in \mathbb{R}^4.$$
Let $v, w \in X$. By Definition 3.5,
$$K(v, w) = \langle \varphi(v), \varphi(w) \rangle = \langle (v_1^2, v_2^2, v_1 v_2, v_2 v_1), (w_1^2, w_2^2, w_1 w_2, w_2 w_1) \rangle = v_1^2 w_1^2 + v_2^2 w_2^2 + v_1 v_2 w_1 w_2 + v_2 v_1 w_2 w_1 = (v_1 w_1 + v_2 w_2)^2 = \langle v, w \rangle^2.$$
Thus, the function $K(v, w) = \langle v, w \rangle^2$ is a kernel function corresponding to the feature space $\mathbb{R}^4$.

Remark 3.8 Based on Example 3.6 and Example 3.7, the kernel function does not uniquely determine the feature space; in other words, the same kernel function can be produced by feature mappings of different dimensions.

Kernel functions can be represented by a Gram matrix, because this matrix simplifies calculations in high-dimensional feature spaces without having to calculate the coordinates in that space directly (Shawe-Taylor and Cristianini, 2004; Ghojogh et al.).

Definition 3.9 Let $V$ be a set containing $\{x_1, x_2, \ldots, x_l\}$. The Gram matrix is defined as the $l \times l$ matrix $G$ with entries $G_{ij} = \langle x_i, x_j \rangle$.

Definition 3.10 Let $X$ be any set. If $K$ is a function that evaluates the inner product based on the feature mapping $\varphi$, then the corresponding Gram matrix (kernel matrix) has entries $G_{ij} = \langle \varphi(x_i), \varphi(x_j) \rangle = K(x_i, x_j)$.

Definition 3.11 Let $A$ be a symmetric matrix. The matrix $A$ is called positive semi-definite if each eigenvalue is non-negative or, equivalently, $v^{\top} A v \geq 0$ for every vector $v$.

Based on Definition 3.9, Definition 3.10, and Definition 3.11, the following proposition is obtained.

Proposition 3.12 The Gram matrix and the kernel matrix are positive semi-definite matrices.

Proof. Let $G$ be a Gram matrix with entries $G_{ij} = K(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle$ for $i, j = 1, 2, \ldots, l$ and $x_i, x_j \in X$. Let $v$ be any vector. Then
$$v^{\top} G v = \sum_{i,j=1}^{l} v_i v_j \langle \varphi(x_i), \varphi(x_j) \rangle = \Big\langle \sum_{i=1}^{l} v_i \varphi(x_i), \sum_{j=1}^{l} v_j \varphi(x_j) \Big\rangle = \Big\| \sum_{i=1}^{l} v_i \varphi(x_i) \Big\|^2 \geq 0.$$
The same argument applies to the kernel matrix, so both are positive semi-definite.
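The following Python sketch (an illustration added here, not the paper's code) verifies Examples 3.6 and 3.7 numerically: the two feature maps into $\mathbb{R}^3$ and $\mathbb{R}^4$ reproduce the same kernel $K(v, w) = \langle v, w \rangle^2$, and the resulting Gram matrix is positive semi-definite as stated in Proposition 3.12.

```python
import numpy as np

def phi3(v):   # Example 3.6: R^2 -> R^3
    return np.array([v[0] ** 2, v[1] ** 2, np.sqrt(2) * v[0] * v[1]])

def phi4(v):   # Example 3.7: R^2 -> R^4
    return np.array([v[0] ** 2, v[1] ** 2, v[0] * v[1], v[1] * v[0]])

rng = np.random.default_rng(0)
V = rng.normal(size=(5, 2))                      # five sample points in R^2

K_direct = (V @ V.T) ** 2                        # K(v, w) = <v, w>^2
K_phi3 = np.array([[phi3(v) @ phi3(w) for w in V] for v in V])
K_phi4 = np.array([[phi4(v) @ phi4(w) for w in V] for v in V])

print(np.allclose(K_direct, K_phi3), np.allclose(K_direct, K_phi4))  # True True
print(np.all(np.linalg.eigvalsh(K_direct) >= -1e-10))                # PSD check
```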
The following properties of kernel functions are then obtained.

Theorem 3.13 Let $X \subseteq \mathbb{R}^n$ be any set and let $\varphi$ be a mapping from $X$ to $\mathbb{R}^n$. Let $K_1$ and $K_2$ be any kernel functions. Then the following properties hold:
1. $K(x, z) = K_1(x, z) + K_2(x, z)$ is a kernel function, for any $x, z \in X$;
2. $K(x, z) = a K_1(x, z)$ is a kernel function, for any $x, z \in X$ and any $a \geq 0$;
3. $K(x, z) = K_1(x, z)\, K_2(x, z)$ is a kernel function, for any $x, z \in X$.

Proof. 1. By Proposition 3.12, it suffices to show that $v^{\top}(G_1 + G_2) v \geq 0$ for any $v$. Based on Definition 3.10, $G_1 = \big[K_1(x_i, x_j)\big]_{i,j=1}^{l} = \big[\langle \varphi_1(x_i), \varphi_1(x_j) \rangle\big]_{i,j=1}^{l}$, and since $K_1$ is a kernel function, $v^{\top} G_1 v \geq 0$. Likewise, $G_2 = \big[K_2(x_i, x_j)\big]_{i,j=1}^{l} = \big[\langle \varphi_2(x_i), \varphi_2(x_j) \rangle\big]_{i,j=1}^{l}$, and since $K_2$ is a kernel function, $v^{\top} G_2 v \geq 0$. Thus $v^{\top} G_1 v + v^{\top} G_2 v = v^{\top}(G_1 + G_2) v \geq 0$, so $K(x, z) = K_1(x, z) + K_2(x, z)$ is a kernel function.
2. Let $K_1$ be a kernel function and $a \geq 0$. It will be shown that $v^{\top}(a G_1) v \geq 0$. Since $K_1$ is a kernel function, $v^{\top} G_1 v \geq 0$, so $v^{\top}(a G_1) v = a\, v^{\top} G_1 v \geq 0$. Hence $K(x, z) = a K_1(x, z)$ is a kernel function.
3. Given kernel functions $K_1$ and $K_2$, we have $v^{\top} G_1 v \geq 0$ and $v^{\top} G_2 v \geq 0$. The Gram matrix of the product kernel is the Schur (elementwise) product $G_1 \circ G_2$, and by the Schur product theorem the elementwise product of two positive semi-definite matrices is positive semi-definite, so $v^{\top}(G_1 \circ G_2) v \geq 0$. Hence $K(x, z) = K_1(x, z)\, K_2(x, z)$ is a kernel function.

Based on Theorem 3.13, new kernel functions can be formed as follows.

Theorem 3.14 Let $X \subseteq \mathbb{R}^n$ be any set and let $\varphi$ be a function from $X$ to $\mathbb{R}^n$. If $K_1, K_2, \ldots, K_n$ are finitely many kernel functions and $\lambda_1, \lambda_2, \ldots, \lambda_n$ are any positive real numbers, then the finite linear combination $K(x, z) = \sum_{i=1}^{n} \lambda_i K_i(x, z)$ is a kernel function.

Proof. For $i = 1, 2, \ldots, n$, every kernel function $K_i(x, z)$ has a Gram matrix $G^{(i)}$ with entries $G^{(i)}_{jk} = K_i(x_j, x_k)$, $j, k = 1, 2, \ldots, m$, for all $x_1, x_2, \ldots, x_m \in X$. Since $K_i$ is a kernel function, every $G^{(i)}$ satisfies $v^{\top} G^{(i)} v \geq 0$ for all $v$. The Gram matrix of the linear combination $K(x, z) = \sum_{i=1}^{n} \lambda_i K_i(x, z)$ is $G = \sum_{i=1}^{n} \lambda_i G^{(i)}$, so $v^{\top} G v = \sum_{i=1}^{n} \lambda_i\, v^{\top} G^{(i)} v \geq 0$ because $\lambda_i > 0$ and $v^{\top} G^{(i)} v \geq 0$. Hence the finite linear combination is a kernel function.

Proposition 3.15 If $K(x, z)$ is defined over $X \times X$ with $x, z \in X$, then $K(x, z) = x^{\top} z$ is a kernel function.

Proof. Based on Definition 3.5, define $\langle \varphi(x), \varphi(z) \rangle = \langle x, z \rangle = x^{\top} z = \sum_{i} x_i z_i$. According to Definition 3.11, for every set of coefficients $\lambda_1, \ldots, \lambda_m$ and points $x_1, \ldots, x_m \in X$,
$$\sum_{i=1}^{m} \sum_{j=1}^{m} \lambda_i \lambda_j K(x_i, x_j) = \sum_{i=1}^{m} \sum_{j=1}^{m} \lambda_i \lambda_j (x_i^{\top} x_j) = \Big\langle \sum_{i=1}^{m} \lambda_i x_i, \sum_{j=1}^{m} \lambda_j x_j \Big\rangle = \Big\| \sum_{i=1}^{m} \lambda_i x_i \Big\|^2 \geq 0.$$
So $K(x, z) = x^{\top} z$ is a kernel function.

Remark 3.16 The kernel function $K(x, z) = x^{\top} z$ is hereinafter called the linear kernel function.
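Theorem 3.13 can also be checked empirically on sample data; the following sketch (illustrative, with arbitrary sample points) confirms that the sum, a non-negative scalar multiple, and the Schur (elementwise) product of two kernel Gram matrices remain positive semi-definite.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 3))

K1 = X @ X.T                                               # linear kernel Gram matrix
sq = np.sum(X ** 2, axis=1)
K2 = np.exp(-(sq[:, None] + sq[None, :] - 2 * K1) / 2.0)   # Gaussian kernel, sigma = 1

def is_psd(K, tol=1e-10):
    # symmetrize for numerical safety, then check that all eigenvalues are >= -tol
    return np.all(np.linalg.eigvalsh((K + K.T) / 2) >= -tol)

print(is_psd(K1 + K2))        # sum of kernels
print(is_psd(3.0 * K1))       # non-negative scalar multiple
print(is_psd(K1 * K2))        # Schur (elementwise) product
```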
Proposition 3.17 For $x, z \in X$, if $K_1(x, z)$ is a kernel over $X \times X$ and $p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n$ is a polynomial with positive real coefficients, then $K(x, z) = p(K_1(x, z))$ is a kernel function.

Proof. The polynomial $p(K_1(x, z))$ is expanded as
$$p(K_1(x, z)) = a_0 + a_1 K_1(x, z) + a_2 K_1(x, z)^2 + \cdots + a_n K_1(x, z)^n = \sum_{i=0}^{n} a_i K_1(x, z)^i.$$
Since $K_1$ is a kernel function and, by Theorem 3.13, the product of kernel functions is a kernel function, every power $K_1(x, z)^i$ is a kernel function; since $a_i \geq 0$, each term $a_i K_1(x, z)^i$ is also a kernel function for $i = 1, 2, \ldots, n$. Based on Theorem 3.13, the sum of kernel functions produces a kernel function, so $K(x, z) = p(K_1(x, z))$ is a kernel function.

Remark 3.18 The kernel function $K(x, z) = a_0 + a_1 K_1(x, z) + a_2 K_1(x, z)^2 + \cdots + a_n K_1(x, z)^n$ is hereinafter called the polynomial kernel function.

Proposition 3.19 For $x, z \in X$, if $K_1(x, z)$ is a kernel over $X \times X$, then $K(x, z) = \exp(K_1(x, z))$ is a kernel function.

Proof. The exponential function $\exp(K_1(x, z))$ is expressed as a Taylor series expansion,
$$\exp(K_1(x, z)) = \sum_{n=0}^{\infty} \frac{K_1(x, z)^n}{n!}.$$
Since $K_1$ is a kernel function and $n \geq 0$, every power of the kernel function is a kernel function. Based on Theorem 3.13, the sum of kernel functions produces a kernel function, so $K(x, z) = \exp(K_1(x, z))$ is a kernel function.

Proposition 3.20 If $x, z \in X$, then $K(x, z) = \exp\!\left(-\dfrac{\|x - z\|^2}{2\sigma^2}\right)$ is a kernel function.

Proof. Note that $\|x - z\|^2 = \|x\|^2 + \|z\|^2 - 2\langle x, z \rangle$, so that
$$K(x, z) = \exp\!\left(-\frac{\|x - z\|^2}{2\sigma^2}\right) = \exp\!\left(-\frac{\|x\|^2}{2\sigma^2}\right) \exp\!\left(-\frac{\|z\|^2}{2\sigma^2}\right) \exp\!\left(\frac{\langle x, z \rangle}{\sigma^2}\right).$$
The values of $\exp(-\|x\|^2/2\sigma^2)$ and $\exp(-\|z\|^2/2\sigma^2)$ depend only on $x$ and on $z$, respectively, so they do not affect the positive semi-definiteness of the kernel matrix (each acts as a scalar). Based on Proposition 3.19, $\exp(\langle x, z \rangle/\sigma^2)$ is a kernel, since $\langle x, z \rangle/\sigma^2$ is a non-negative multiple of the linear kernel. So $K(x, z) = \exp\!\left(-\dfrac{\|x - z\|^2}{2\sigma^2}\right)$ is a kernel function.

Remark 3.21 The kernel function $K(x, z) = \exp\!\left(-\dfrac{\|x - z\|^2}{2\sigma^2}\right)$ is hereinafter called the Gaussian kernel function or Radial Basis Function.

3.2 Kernel Function on Support Vector Machine
Support Vector Machine (SVM) is a machine learning algorithm that identifies the optimal hyperplane that maximally separates the different data classes within the feature space. The principle of SVM when using the linear kernel function $K(x, z) = x^{\top} z$ is as follows. Let a dataset $\{(x_i, y_i)\}_{i=1}^{n}$ be given, with $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, +1\}$. SVM aims to find a hyperplane of the form $f(x) = w^{\top} x + b$, where $w \in \mathbb{R}^d$ is the weight vector and $b \in \mathbb{R}$ is the bias. Classification is defined by: predicted label $= \operatorname{sign}(f(x))$. SVM maximizes the margin between the two classes, namely the distance between the hyperplane and the nearest data points (support vectors), which equals $\dfrac{2}{\|w\|}$. Optimization is done by solving the primal problem
$$\min_{w, b} \frac{\|w\|^2}{2} \quad \text{subject to} \quad y_i(w^{\top} x_i + b) \geq 1, \qquad i = 1, 2, \ldots, n.$$
The objective function $\frac{\|w\|^2}{2}$ represents the optimization goal, which is to minimize the norm of $w$; this corresponds to maximizing the margin between the two classes in SVM. The smaller $\|w\|^2$, the larger the separation margin. The constraint $y_i(w^{\top} x_i + b) \geq 1$ for all $i$ ensures that each training point $x_i$ is correctly classified, i.e., lies on the correct side of the margin with a distance of at least 1. When the data dimension is low or the number of constraints is small, solving the primal problem is more efficient than the dual. However, for SVM problems with high-dimensional data, the dual approach is preferred due to its computational simplicity. The dual problem takes the form
$$\max_{\alpha} \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{subject to} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \ \ \alpha_i \geq 0 \ \ \forall i,$$
where $\alpha_i$ is a Lagrange multiplier. The kernel function $K(x_i, x_j) = x_i^{\top} x_j$ is substituted by the linear kernel function, the polynomial kernel function, or the Gaussian kernel function, so that the weight $w$ is expressed by
$$w = \sum_{i=1}^{n} \alpha_i y_i x_i$$
for the linear kernel function, or
$$w = \sum_{i=1}^{n} \alpha_i y_i \left( a_0 + a_1 (x_i^{\top} x) + a_2 (x_i^{\top} x)^2 + \cdots + a_n (x_i^{\top} x)^n \right)$$
for the polynomial kernel function, or
$$w = \sum_{i=1}^{n} \alpha_i y_i \exp\!\left(-\frac{\|x_i - z\|^2}{2\sigma^2}\right)$$
for the Gaussian kernel function. The bias $b$ can be calculated as
$$b = y_k - \sum_{i=1}^{n} \alpha_i y_i K(x_i, x_k),$$
where $k$ is the index of a support vector.
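In practice the dual problem above is solved by standard SVM software. The following sketch (assuming scikit-learn is available; the data and the value $\gamma = 1/(2\sigma^2) = 0.5$ are illustrative) fits an SVC with the Gaussian kernel and reconstructs its decision function from the dual coefficients $\alpha_i y_i$, the support vectors, and the bias $b$, matching the formulation above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Stand-in data; in the study the predictors would come from the disaster risk dataset.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
model = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X, y)

def rbf(a, b, gamma=0.5):
    # Gaussian kernel with gamma = 1 / (2 sigma^2)
    return np.exp(-gamma * np.sum((a - b) ** 2, axis=-1))

x_new = X[:3]
# dual_coef_ stores alpha_i * y_i for each support vector, intercept_ stores b
manual = np.array([
    np.sum(model.dual_coef_[0] * rbf(model.support_vectors_, x, gamma=0.5))
    for x in x_new
]) + model.intercept_[0]

print(np.allclose(manual, model.decision_function(x_new)))  # True
```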
3.3 The Performance Comparison of Kernel Function on Support Vector Machine
This analysis uses the World Disaster Risk Dataset published by the Institute for International Law of Peace and Armed Conflict (IFHV) of the Ruhr-University Bochum in 2022. The World Risk Report is an annual technical report published in German and English on global disaster risk and disaster risk management. The World Risk Report publication includes the World Risk Index, which identifies the risk of extreme natural events that constitute disasters for many countries worldwide. The World Risk Index uses 27 publicly available aggregated indicators to determine the disaster risk of 181 countries worldwide, and the dataset can be accessed at Kaggle.

3.4 SVM Classification
SVM classification aims to find the best hyperplane that separates two classes in a feature space with maximum margin. A hyperplane is a decision boundary or separator between different classes. In general, a hyperplane is an $(n-1)$-dimensional subspace of an $n$-dimensional space. For example, if the data is two-dimensional, the hyperplane formed is a line, while if the data is in a three-dimensional space, the hyperplane formed is a plane, and so on. The steps taken in conducting the SVM classification analysis include data input, data pre-processing, feature selection, division into training data and testing data, model formation, model training, model validation, and model evaluation. The SVM classification analysis compares the linear kernel function, polynomial kernels, and Gaussian kernels using three data splitting schemes: training and testing data of 70%-30%, 80%-20%, and 90%-10%. Comparing the performance of different kernel functions aims to evaluate different types of kernels and determine which one gives the best results based on the data used. In the data splitting schemes 70%-30%, 80%-20%, and 90%-10%, there is a possibility that the training and testing data do not represent the proper distribution of the data, which can lead to inaccurate performance assessments. To avoid biased random sample selection for the training data, the k-fold cross-validation method is used, as sketched below.
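The splitting and cross-validation protocol described here can be outlined as follows (assuming scikit-learn; synthetic stand-in data are used in place of the World Risk Dataset, and the 70%-30% split with $k = 5$ mirrors one of the schemes above).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=6, random_state=0)  # stand-in data

# 70:30 split of training and testing data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

kernels = {
    "linear": SVC(kernel="linear"),
    "poly (d=4)": SVC(kernel="poly", degree=4),
    "gaussian (rbf)": SVC(kernel="rbf"),
}
for name, clf in kernels.items():
    model = make_pipeline(StandardScaler(), clf)
    scores = cross_val_score(model, X_tr, y_tr, cv=5)          # k = 5 folds
    print(f"{name}: mean accuracy={scores.mean():.3f}, std={scores.std():.3f}")
```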
The k-fold cross-validation method splits the dataset into $k$ equally sized subsets. Each subset is used once as the test set, while the remaining $k - 1$ subsets serve as the training set. This process ensures that every data point is used for testing exactly once and for training $k - 1$ times, leading to a more reliable assessment of the model's performance. The evaluation results from each iteration are averaged to provide a more accurate picture of performance. The selection of the value of $k$ is adjusted to the amount of data. In this SVM classification analysis, $k = 5$ was chosen.

Table 1. Comparison of Classification Results Using Linear Kernel Functions, Polynomial Kernels, and Gaussian Kernels (mean accuracy, standard deviation of accuracy, and p-value for the linear kernel, polynomial kernels with d = 4 to 9, and the Gaussian kernel under the 70:30, 80:20, and 90:10 split ratios).

The performance of SVM classification using the linear kernel function, polynomial kernel functions, and Gaussian kernel function for each scheme 70%-30%, 80%-20%, and 90%-10% is expressed by the average accuracy value, the standard deviation of the accuracy, and the p-value shown in Table 1. Accuracy is a metric used to measure the extent to which a classifier correctly predicts the class against the total number of samples. The p-value indicates the statistical significance of the classification results; very small p-values indicate statistically significant results. Based on Table 1, the p-values of all kernel functions show statistically significant results. Still, the linear and Gaussian kernel functions produce better performance than the polynomial kernel functions. Classification using the linear kernel function and the Gaussian kernel function produces an average accuracy of more than 52%, and the accuracy increases with the increase in the training data ratio. The worst performance is seen for the polynomial kernel function: its accuracy is much lower than that of the other kernels, ranging from 32% to 39% in each training scheme. There is a tendency for the classification accuracy not to improve as the polynomial degree increases. This can be interpreted as the polynomial kernel function with degrees four to nine being too complex for the data pattern and unable to capture relevant patterns well.

Based on the data split ratios, the results in Table 1 indicate that the accuracy remains relatively consistent across the different splits. This stability may be attributed to the effectiveness of the k-fold cross-validation method. Consequently, the SVM classification applied to the dataset demonstrates that the linear and Gaussian kernel functions outperform the polynomial kernel function in terms of performance. Based on Table 1, it can be concluded that the linear kernel is an excellent choice for this data set because of its high accuracy, good efficiency, and performance stability. The Gaussian kernel provides excellent performance and high efficiency but does not show significant advantages over the linear kernel. The polynomial kernel is unsuitable for this data set because of its low accuracy and high complexity.
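The paper does not state which statistical test produces the p-values in Table 1; one plausible choice, shown below purely as an illustration (assuming scipy and scikit-learn, with stand-in data), is a paired t-test on the fold-wise accuracies of two kernels.

```python
from scipy.stats import ttest_rel
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=6, random_state=0)  # stand-in data

acc_linear = cross_val_score(SVC(kernel="linear"), X, y, cv=5)       # 5 fold accuracies
acc_poly = cross_val_score(SVC(kernel="poly", degree=4), X, y, cv=5)

t_stat, p_value = ttest_rel(acc_linear, acc_poly)  # paired test across the same folds
print(f"paired t-test p-value (linear vs. poly d=4): {p_value:.4f}")
```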
Analysis of Support Vector Machine regression, also called Support Vector Regression (SVR), is one of the methods in machine learning used to model the relationship between dependent and independent variables. SVR aims to find a regression function that minimizes the deviation between predictions and actual values while considering the adjusted margin constraints. This section compares regression models using the linear kernel function, polynomial kernels, and Gaussian kernels. In this SVR analysis, six quantitative variables are used, namely the disaster risk index (world risk index), disaster exposure risk (exposure), vulnerability, sensitivity (susceptibility), coping capabilities (lack of coping capabilities), and the adaptive capabilities of the government and community (lack of adaptive capacities). Based on the understanding of the data characteristics, the variables selected as independent variables are the exposure variable ($X_1$), the vulnerability variable ($X_2$), the susceptibility variable ($X_3$), the lack of adaptive capacities variable ($X_4$), and the lack of coping capabilities variable ($X_5$), with the world risk index variable ($Y$) as the dependent variable.

Table 2 shows the parameter estimation results using the linear kernel function. The linear kernel models the linear relationship between the input features of the variables $X_1, X_2, X_3, X_4, X_5$ and the target output $Y$. Therefore, SVM regression with a linear kernel explicitly produces coefficients for each feature, showing each feature's direct contribution to the prediction.

Table 2. Parameter Estimation Results Using the Linear Kernel Function (intercept and coefficients of $X_1$ to $X_5$ for the 70:30, 80:20, and 90:10 split ratios).

The intercept is the average predicted value of the model when all independent variables $X_1$ to $X_5$ are zero. The intercept and coefficient values of $X_1$ to $X_5$ do not change significantly in the model, even though the data splitting ratio changes. From the aspect of the split ratio, Table 2 shows that the coefficient of $X_1$ remains stable at all data ratios. In contrast, the coefficients of $X_2$, $X_3$, $X_4$, and $X_5$ show greater fluctuations between split ratios, indicating a weaker relationship or sensitivity to data variation.

Unlike the estimation results using the linear kernel, the estimation of the regression parameters using the polynomial kernel function cannot determine the coefficients of the variables $X_1, X_2, X_3, X_4, X_5$, as shown in Table 3. This is due to the nature of the polynomial kernel function, which projects data into a feature space expanded by a polynomial of a certain degree, so that the relationship between features is no longer linear but involves a polynomial combination of the original features. Table 3 shows that only the intercept parameter is available for the polynomial kernel, with no coefficient values for the independent variables $X_1$ to $X_5$. This is a common characteristic of nonlinear kernels, such as polynomial kernels, since the relationship between input and output is not directly represented in explicit linear form.

Table 3. Parameter Estimation Results Using the Polynomial Kernel Function (intercept for polynomial kernels with d = 4 to 9 under the 70:30, 80:20, and 90:10 split ratios).

Table 4. Parameter Estimation Results Using the Gaussian Kernel Function (intercept under the 70:30, 80:20, and 90:10 split ratios).

The consistency of the intercept value despite changes in the split ratio indicates that changes in the amount of training data do not significantly affect the intercept estimate. However, since polynomial kernels tend to provide suboptimal performance, the intercept alone is insufficient to indicate the goodness of fit of the polynomial kernel to the data set.
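A minimal sketch of this parameter-extraction step (assuming scikit-learn, with stand-in data for the predictors $X_1$ to $X_5$ and target $Y$): a linear-kernel SVR exposes explicit feature coefficients and an intercept, whereas a non-linear SVR exposes only dual coefficients, support vectors, and an intercept.

```python
from sklearn.datasets import make_regression
from sklearn.svm import SVR

# Stand-in data; in the study X would hold X1..X5 and y the world risk index Y.
X, y = make_regression(n_samples=150, n_features=5, noise=0.1, random_state=0)

svr_lin = SVR(kernel="linear").fit(X, y)
print("coefficients:", svr_lin.coef_.ravel())   # one weight per feature X1..X5
print("intercept:", svr_lin.intercept_[0])

svr_rbf = SVR(kernel="rbf").fit(X, y)
# Non-linear kernels expose no coef_; predictions rely on dual_coef_ and support vectors.
print("dual coefficients shape:", svr_rbf.dual_coef_.shape)
print("intercept:", svr_rbf.intercept_[0])
```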
The lack of explicit coefficients for $X_1$ to $X_5$ in the polynomial kernel results from how this kernel operates: it transforms the data into a higher-dimensional feature space using polynomial combinations, where the model's predictions are based on the interaction between the kernel function and the support vectors rather than the direct influence of the original input features. Similar parameter estimates are also produced by SVM regression using the Gaussian kernel function, shown in Table 4. The Gaussian kernel is a distance-based kernel that projects the data into an infinite-dimensional feature space. The absence of coefficients for the variables $X_1$ to $X_5$ in the Gaussian kernel, as shown in Table 4, is a consequence of the nature of the Gaussian kernel itself, which works in a nonlinear feature space and does not use a linear representation of the original features in the model. The intercept stability across data ratios indicates that the Gaussian kernel produces a model that is robust against changes in the amount of training data. The intercept value that does not differ much across the various data division ratios indicates that the Gaussian kernel is relatively stable in modelling the data patterns even though the sizes of the training and testing data change.

Based on Table 2, Table 3, and Table 4, it is concluded that the appropriate kernel function for the data set is the linear kernel function. The recommendation for the proper kernel function is made by examining the performance of the linear kernel function, polynomial kernel, and Gaussian kernel based on the Mean Square Error (MSE), the coefficient of determination ($R^2$), and the Mean Absolute Percentage Error (MAPE).

Table 5. Comparison of Regression Model Performance Using Linear Kernel Functions, Polynomial Kernels, and Gaussian Kernels (MSE, $R^2$, and MAPE for the linear kernel, polynomial kernels with d = 4 to 9, and the Gaussian kernel under the 70:30, 80:20, and 90:10 split ratios).

Table 5 presents the performance evaluation of the SVM regression model using three kernel types (linear, polynomial, and Gaussian) across different data split ratios (70:30, 80:20, and 90:10). The Mean Squared Error (MSE) represents the average of the squared differences between actual and predicted values; lower MSE values indicate better model performance. The coefficient of determination ($R^2$) assesses how well the model captures the variability in the target variable, with values closer to 1 signifying stronger explanatory power. Meanwhile, the Mean Absolute Percentage Error (MAPE) reflects the average absolute error percentage relative to the actual values, where a MAPE below 20% denotes a highly accurate model.

Based on Table 5, the linear kernel consistently produces the smallest MSE, the highest $R^2$, and the smallest MAPE compared to the other kernels at all split ratios. The polynomial kernel has the highest MSE, the lowest $R^2$, and the largest MAPE. This may be due to the higher complexity of the model, so that the model is overfit or not flexible enough to fit the data optimally. The Gaussian kernel performs better than the polynomial kernel but not better than the linear kernel. These results indicate that although the relationships in the data may contain nonlinear elements, the complexity of the Gaussian kernel does not provide a significant advantage over the linear kernel. It can be concluded that the data have a strong linear relationship between the features and the target. Therefore, the linear kernel is more suitable for this data set.
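The three evaluation metrics in Table 5 can be computed as in the following sketch (assuming scikit-learn; the MAPE helper and the example values are illustrative, and in practice y_test and y_pred would come from the fitted SVR model).

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

def mape(y_true, y_pred):
    # Mean Absolute Percentage Error, in percent
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

y_test = np.array([3.2, 5.1, 7.4, 2.8, 6.0])     # illustrative values only
y_pred = np.array([3.0, 5.4, 7.1, 3.1, 5.8])

print("MSE :", mean_squared_error(y_test, y_pred))
print("R^2 :", r2_score(y_test, y_pred))
print("MAPE:", mape(y_test, y_pred), "%")
```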
Viewed from the aspect of the split ratio, Table 5 shows that the performance decreases with a larger training data ratio. At the 90:10 split ratio, the performance of all kernels decreases compared to the 70:30 or 80:20 ratios. This is evident from the increase in MSE and MAPE and the decrease in $R^2$. A larger split ratio for the training data reduces the amount of test data, which can cause the evaluation results to be less representative.

Figure 1. Evaluation of the SVR Model Using the Linear Kernel Function, Polynomial Kernel, and Gaussian Kernel.

Figure 2. Schoenfeld Plot of the Linear Kernel Function and Schoenfeld Plot of the Gaussian Kernel Function.

Theoretically, the smaller the MSE, the better the model. Based on Figure 1, it appears that the MSE of the linear kernel and the Gaussian kernel are close to 0 for each scheme, while for the polynomial kernel, the MSE value increases as the degree of the polynomial increases. The polynomial kernel of degree 9 shows a very large MSE, especially in the 70:30 scheme, which means the model is overfitting or fails to capture the pattern. Based on the MSE value, the linear kernel function is the best choice because it produces the lowest mean squared error, indicating a strong linear relationship in the data. Polynomial kernels with high degrees are unstable and less suitable for this data set.

A low value (closer to 0) is better for the MAPE metric. Based on Figure 1, the linear and Gaussian kernels have low and stable MAPE values for each scheme, while the polynomial kernel tends to show a high MAPE, especially at degree 9. Thus, in terms of percentage error, the linear and RBF kernels show excellent and consistent performance across the data partitioning schemes.

When analyzed based on the $R^2$ value, a value close to 1 indicates a good fit. The linear kernel function provides an $R^2$ close to 1 for each scheme, making the model very good. The Gaussian kernel function also performs well but is more sensitive to the proportion of data. The polynomial kernel (degrees 4 to 9) produces a negative $R^2$, meaning its performance is worse than the mean-only model. Based on Figure 1, only the linear and Gaussian kernels are worthy of consideration for this model. In contrast, the polynomial kernel is unsuitable and causes underfitting or overfitting.

Figure 3. Schoenfeld Plots of the Polynomial Kernel Functions with d = 4, 5, 6, 7, 8, and 9.

Based on Figure 1, the best kernel function is the linear one, as it produces the lowest MSE, RMSE, and MAPE and the highest and most stable $R^2$. The Gaussian kernel function performs reasonably well but is somewhat inconsistent across schemes. High-degree polynomial kernel functions are not recommended as they perform poorly across all evaluation metrics. The 90:10 scheme tends to produce the best results due to the larger amount of training data, but the linear kernel still holds the best performance consistency.

The residual analysis of each kernel function is displayed graphically using the Schoenfeld plot. In Figure 2, the green dots are the actual data values, while the blue dots are the predicted values.
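A residual plot of the kind shown in Figures 2 and 3 can be produced as in the following sketch (assuming matplotlib; the values are illustrative, and in practice y_test and y_pred would come from the fitted model).

```python
import numpy as np
import matplotlib.pyplot as plt

y_test = np.array([1.2, 3.4, 5.0, 7.8, 10.1, 15.6])   # illustrative values only
y_pred = np.array([1.0, 3.1, 5.4, 8.5, 12.0, 20.3])
residuals = y_test - y_pred

plt.scatter(y_pred, residuals)
plt.axhline(0.0, linestyle="--")
plt.xlabel("SVM regression prediction")
plt.ylabel("Residual")
plt.title("Residuals vs. predicted values")
plt.show()
```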
Based on Figure 2, it appears that most of the residuals are around zero, especially for small SVM regression predictions. The residuals become very large, especially for higher SVM regression predictions. Based on Figure 3, it can be seen that as the degree of the polynomial increases (from 4 to 9), the residual values become more extreme, on both the positive and negative scales. At a larger degree (for example, 8 or 9), the SVM regression prediction takes values that are very far from the actual data, which causes the residuals to jump drastically. This shows that the model is becoming more complex and can cause overfitting, so the SVM regression prediction becomes less good. So, both numerically and graphically, the linear kernel function is the more appropriate one to implement on the data set used.

4. CONCLUSIONS
Based on the research and discussion results, it is concluded that the construction of a Hilbert space with a unit element is not unique. In implementing Reproducing Kernel Hilbert Space, the same kernel function can produce different feature mapping dimensions. The analysis reveals that a finite linear combination of kernel functions retains the kernel property, which supports the construction of both polynomial and Gaussian kernel functions. In the application to the dataset, the linear kernel function emerges as the most suitable option, consistently delivering the most accurate and stable results across all data split ratios. This suggests the presence of a strong linear relationship within the dataset. The Gaussian kernel offers the flexibility to handle non-linear elements but is not better than the linear kernel. The polynomial kernel performs poorly because its complexity does not match the data pattern. This study provides strong evidence for the suitability of linear kernels in modeling disaster risk panel data. However, questions remain regarding the behavior of kernels in time-evolving models and whether hybrid kernels could improve long-term generalization. Future work is needed to address these challenges.

ACKNOWLEDGEMENT
The authors want to thank Universitas Lampung, the supervisor, the co-supervisors, and the reviewers for their contributions to producing a good article. This research was funded by the Directorate of Research, Technology, and Community Service, Directorate General of Higher Education, Research, and Technology, Ministry of Education, Culture, Research, and Technology, in accordance with Research Contract Number: 076/C3/DT.00/PL/2025.

REFERENCES