INTERNATIONAL JOURNAL OF EDUCATION AND HUMANITIES
e-ISSN: 2829-8675
p-ISSN: 2830-4578
November, pp. 221-230
DOI: https://doi.org/10.56314/ijoleh.
Fidelity Analysis of Artificial Intelligence-Powered Poetry Translation: A Comparison of Google Translate, DeepL Translate, and Libre Translate
Muhammad Chairil Imran
Pendidikan Bahasa Inggris, Universitas Islam Makassar, Indonesia
*Correspondence e-mail: muh.imran@uimmakassar.
Received: 31 October 2025. Accepted: 25 November 2025. Published: 25 November 2025.
Copyright 2025 Author: Muhammad Chairil Imran. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Abstract
The research aims to analyze the fidelity of Google Translate, DeepL Translate, and Libre Translate in conveying the meaning and intent of poetry. Employing a mixed-method approach, data were collected from ten purposively selected lecturers at Universitas Islam Makassar and Universitas Megarezky Makassar. Quantitative data were obtained through a structured questionnaire using a five-point Likert scale to measure fidelity, while qualitative insights were gathered via thematic content analysis of the AI-powered poetry translations. Findings reveal a clear fidelity ranking: DeepL Translate (mean = 4.5, excellent), Google Translate (mean = 3.7, good), and Libre Translate (mean = 1.8, low). The researcher concludes that AI-powered translation tools exhibit considerable variability in their capacity to maintain the message, intent, and stylistic attributes of poetry. Of the three systems assessed, DeepL Translate shows the highest level of fidelity, supported by both quantitative evaluations and qualitative themes.
Keywords: Artificial Intelligence; DeepL; Fidelity; Google Translate; Poetry
Published by: CV. Eureka Murakabi Abadi | https://jurnal-eureka.com | Email: ijoleh.journal@gmail.
INTRODUCTION
Artificial intelligence (AI) has changed the way translation is done in the last few years. Machine translation techniques have made it possible for platforms like Google Translate, DeepL Translate, and Libre Translate to perform complicated language tasks with remarkable fluency and sensitivity to context. These systems, trained on large multilingual data sets, can produce translations that look and sound as if they were made by people (Alowedi & Al-Ahdal, 2023; Naveen & Trojovsky, 2024). The rapid adoption of AI in translation has changed how students, teachers, and researchers read texts in other languages.
Even with these improvements, translating literary works, especially poetry, using AI is still quite hard. Poetry packs a great deal of substance into a small space, along with metaphors, rhythm, and emotional power that other types of writing do not have (Almaktary, 2025; De la Rosa et al.). To translate poetry, one has to recreate both the meaning and the art. It entails not only linguistic precision but also the maintenance of the poet's tone, imagery, and cultural identity. This complexity makes poetry a useful lens for examining how AI systems deal with the creative and emotional parts of language.
Fidelity is an important idea in translation studies. It refers to how well a translation stays true to the source material in both meaning and style (Leiter et al., 2024; Zakhir). In poetry translation, fidelity goes beyond word-for-word equivalence. It also includes semantic accuracy, stylistic consistency, emotional depth, and cultural resonance (Ghazi, 2025). To find this balance, human translators use interpretation, intuition, and knowledge of other cultures. AI technologies, on the other hand, use computational predictions that might not pick up on nuanced connotations, analogies, or artistic details (Gao et al., 2024). This raises an essential question about whether AI can recreate the fidelity that readers demand from poetry translation.
Earlier research on AI translation (Doan, 2025; Farghal & Haider, 2024; Karaban & Karaban, 2024a; Mohammed, 2025; Naveen & Trojovsky, 2024; Yuliasri et al.) has predominantly concentrated on quantitative assessment, highlighting measurable factors like accuracy, lexical similarity, and grammatical correctness. These evaluations show how well a system works technically, but they seldom show how people feel about its artistic and emotional quality. This is especially true for poetry, where rhythm, imagery, and tone are very important (Almaktary, 2025).
The increasing use of AI translation technologies in schools and in literature shows how important human judgment remains. Students and teachers are using these systems more and more to read poetry in many languages, often without realizing how the technology changes the tone or meaning (Ghazi, 2025; Mauladiana & Juniardi, 2024; Resende & Hadley).
Examining what people think about AI translation can help us learn more about how they judge, trust, or question machine-generated writing.
The development of AI technology in translation has resulted in significant improvements in context processing, lexical accuracy, and grammatical structure. However, most existing research still focuses on quantitative, metrics-based evaluations such as word equivalence, grammatical form, and automated scoring. This approach fails to capture the aesthetic and emotional complexity that characterizes poetry, including metaphor, rhythm, imagery, and inherent cultural nuances. Furthermore, comparative research specifically assessing the quality of poetry translations by various AI platforms such as Google Translate, DeepL Translate, and Libre Translate remains very limited, particularly research that uses a comprehensive fidelity assessment framework encompassing meaning, style, form, and the reader's emotional experience. Meanwhile, the use of AI-based translation technology in education and literary literacy continues to grow, while the way translation quality affects reader perception, interpretation, and experience remains largely unexplored. This situation highlights the need for more in-depth and multidimensional research to evaluate AI's ability to translate poetry while maintaining fidelity to the form, meaning, and aesthetic value of the original work. This research aims to examine the evolving interplay between artificial intelligence and literary translation by focusing on the fidelity of AI-powered poetry translation.
METHOD
The research employed a mixed-method design to provide a comprehensive evaluation of AI-powered poetry translation. Poems were selected from Turikale dan Kisah Perjalanan (Rijal), chosen for their rich imagery and cultural nuance, which make them suitable for assessing translation fidelity. Ten lecturers from Universitas Islam Makassar and Universitas Megarezky Makassar were purposively selected to ensure that all respondents possessed sufficient expertise in translation analysis and literary interpretation.
Quantitative data were collected through a structured questionnaire in which participants rated the fidelity of translations produced by Google Translate, DeepL Translate, and Libre Translate using a five-point Likert scale (1 = very low to 5 = excellent), following established instruments used in translation evaluation (Chen & Lin, 2025; Gao et al., 2024). Mean scores were calculated to identify the overall fidelity level of each system.
To complement these numerical findings, thematic content analysis was conducted to examine how each AI system rendered semantic meaning, imagery, tone, and the overall spirit of the original poems.
This qualitative component enabled a deeper exploration of contextual precision, metaphorical accuracy, and stylistic fluency, offering richer insights into the strengths and limitations of each translation engine.
Quantitative analysis was conducted by calculating the mean and standard deviation and by comparing scores between platforms to identify the fidelity level of each machine translation. Where necessary, comparative statistical tests were applied to determine the significance of differences between groups of translations.
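The descriptive step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the instrument used in the study: the interpretation cut-offs in `BANDS`, the function names `interpret` and `summarise`, and the sample ratings are all hypothetical. The article reports only the resulting means and labels (4.5 excellent, 3.7 good, 1.8 low), and the assumed equal-width bands are merely one scheme consistent with those labels. The sample standard deviation is used here, which is also an assumption.

```python
from statistics import mean, stdev

# Assumed lower bounds for each interpretation band on a 1-5 scale.
# The article does not state its cut-offs; these equal-width bands are
# a hypothetical scheme that agrees with the labels it reports.
BANDS = [
    (4.2, "excellent"),
    (3.4, "good"),
    (2.6, "moderate"),
    (1.8, "low"),
    (1.0, "very low"),
]


def interpret(score):
    """Map a mean Likert score to a verbal band (assumed cut-offs)."""
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    raise ValueError("score is below the 1-5 Likert range")


def summarise(ratings):
    """Return mean, sample standard deviation, and band for one system."""
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 Likert scale")
    m = mean(ratings)
    return {
        "mean": round(m, 2),
        "sd": round(stdev(ratings), 2),
        "label": interpret(m),
    }


# Placeholder ratings from ten raters; NOT the study's data.
placeholder_ratings = [2, 3, 3, 2, 4, 3, 3, 2, 3, 3]
print(summarise(placeholder_ratings))
```

Running `summarise` once per translation system and sorting the results by mean would reproduce the kind of ranking the study reports, provided the raw ratings are available.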
In the qualitative section, the analysis was conducted using a thematic analysis approach. Respondents were asked to provide narrative descriptions of the strengths and weaknesses of each translation based on the dimensions of meaning, imagery, tone, rhythm, and the spirit of the poem.
The analysis process involved three stages: open coding, categorization, and theme abstraction to generate patterns of findings regarding the performance of each AI system in translating the stylistic and aesthetic elements of poetry.
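The bookkeeping behind the three stages might look like the following Python sketch. Everything in it is hypothetical: the system labels, the open codes, the code-to-category mapping, and the function name `tally_categories` are illustrative placeholders and are not taken from the respondents' actual comments. The interpretive work of coding and theme abstraction remains a human task; the sketch only shows how coded comments could be counted per category.

```python
from collections import Counter

# Stage 1 (open coding): hypothetical codes attached to respondent comments.
coded_comments = [
    ("system_a", "literal wording"),
    ("system_a", "flat metaphor"),
    ("system_b", "natural phrasing"),
    ("system_b", "flat metaphor"),
    ("system_c", "wrong word choice"),
]

# Stage 2 (categorization): hypothetical mapping from codes to categories.
CODE_TO_CATEGORY = {
    "literal wording": "style",
    "natural phrasing": "style",
    "flat metaphor": "imagery",
    "wrong word choice": "meaning",
}


def tally_categories(comments):
    """Count category occurrences per system as input to theme abstraction."""
    counts = {}
    for system, code in comments:
        category = CODE_TO_CATEGORY[code]
        counts.setdefault(system, Counter())[category] += 1
    return counts


for system, categories in tally_categories(coded_comments).items():
    print(system, dict(categories))
```

Categories that recur across respondents for the same system are candidates for themes in the third stage.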
The entire research process adhered to principles of academic ethics.
Participation was voluntary, with guarantees of data confidentiality, respondent anonymity, and data use solely for research purposes.
RESULT AND DISCUSSION
The structured questionnaire data indicated differing levels of perceived fidelity among the three AI translation systems.
The mean ratings were derived from ten respondents utilizing the five-point Likert scale.
Table 1. Mean Fidelity Score

Translation System     Mean Fidelity Score     Interpretation
DeepL Translate        4.5                     Excellent
Google Translate       3.7                     Good
Libre Translate        1.8                     Low

Table 2. Sample Results from AI-Powered Translation

Source text 1 (Angin Timur laut in Turikale dan Kisah Perjalanan):
Asyik belalang bertengger di alang-alang
Kupu-kupu bertengger di puncak menghisap sari mawar
Tangkainya dan ranting tak patah
Kuncupnya tetap menghadap ke timur laut

Google Translate:
The grasshopper is happily perched in the
The butterfly is perched on the top, sucking the rose's nectar
The stem and twig are unbroken
The bud remains facing northeast

DeepL Translate:
Crickets chirp in the reeds
Butterflies perch on the tips, sucking the nectar from roses
The stems and branches remain unbroken
The buds still face northeast

Libre Translate:
The grasshopper roars in the tarts
The butterfly perched at the peak sucking the rose essence
The trap and twigs don't break
The lock remains facing northeast

Source text 2 (Balasan Cinta Dilan in Turikale dan Kisah Perjalanan):
Teratai di samping sekolah SMA kita
Kini tak lagi kuncup mekar ke ufuk timur
Mentari pagi
Kini ia layu
Mungkin terlalu lama merindu

Google Translate:
The lotus beside our high school
Now it's no longer a bud blooming towards the eastern horizon of the morning sun
Now it's withered
Perhaps I've been missing it for too

DeepL Translate:
The lotus beside our high school
No longer blooms toward the eastern horizon of the morning sun.
Now it withers.
Perhaps it has longed too long.

Libre Translate:
The lotus next to our high school
No longer the bud blooms to the early east horizon
Now it's withered.
Maybe too long homesick

Respondents regarded DeepL Translate as the most dependable tool for preserving both poetic tone and meaning.
Its sentence structure and lexical choices were perceived as more natural and fluid, consistent with findings from recent comparative studies that highlight DeepL Translate's superior contextual handling and fluency in literary translation tasks (Chen & Lin, 2025; Telaumbanua et al.). Google Translate generated translations that were clearly comprehensible; however, it was frequently criticized for its lack of emotional nuance and tendency toward literal rendering, a limitation also observed in cross-linguistic poetry translation analyses (Anuar, 2025; Gao et al., 2024). Libre Translate received the lowest evaluations due to frequent lexical mismatches, inconsistent rhythm, and awkward phrasing that obscured poetic imagery, which aligns with previous reports about open-source models' limited vocabulary depth and contextual adaptability (Abdelhalim et al., 2025; Dost, 2023).
Descriptive statistics indicate a clear preference hierarchy: DeepL > Google > Libre Translate. Nevertheless, respondents emphasized that even the most advanced system did not achieve complete fidelity, especially in capturing figurative language and rhythmic patterns central to poetic expression. This reflects the inherent limitations of neural machine translation (NMT) models when processing creative texts that require affective and symbolic interpretation rather than structural precision (Gao et al., 2024).
Participants frequently noted that DeepL Translate communicated meaning with higher precision and contextual accuracy. Its success is attributed to its ability to capture metaphorical expressions and retain semantic depth, consistent with Chen & Lin (2025), who reported that DeepL performs better at preserving contextual meaning across creative genres. In contrast, Google Translate's literal tendency often resulted in flattened symbolism, while Libre Translate's restricted lexicon led to occasional lexical mismatches. These outcomes reaffirm that richer training data and more sophisticated embedding models lead to more faithful renderings of metaphor and imagery (Dost, 2023; Telaumbanua et al.).
In terms of stylistic continuity and rhythm, DeepL Translate maintained smoother phrasing and more cohesive diction. Similar to the results reported in Humanities and Social Sciences Communications, DeepL's translations were judged as more rhythmically balanced, while Google Translate often disrupted flow with abrupt lexical choices (Gao et al., 2024). Libre Translate was viewed as fragmented and lacking stylistic unity, echoing Abdelhalim et al.'s (2025) argument that open-access AI translators often fail to model prosodic structures found in literary texts.
Respondents widely agreed that AI translations lack emotional intensity. DeepL was seen as the most emotionally resonant, yet still fell short of the poet's original tone. This finding mirrors Ghazi (2025), who concluded that while AI can convey propositional meaning, it struggles to reproduce affective texture. Google Translate's neutral tone and Libre Translate's inconsistent expressiveness further highlighted that current AI models prioritize cognitive comprehension over emotional recreation (Abdelhalim et al., 2025).
A recurring issue was the AI systems' inability to interpret culturally loaded metaphors and local symbolism accurately. Participants observed that expressions rooted in regional spirituality or tradition were mistranslated or rendered literally. This is consistent with broader concerns about the cultural blind spots of AI translation models, which rely on probabilistic inference rather than interpretive cultural awareness (Gao et al., 2024; Ghazi, 2025). Consequently, human intervention remains essential in preserving cultural and spiritual fidelity within poetic translation.
DeepL Translate proved to be the most reliable in both semantic and stylistic fidelity. The elevated mean fidelity score corresponds with qualitative feedback highlighting naturalness and contextual precision. This finding is consistent with studies showing that DeepL Translate's contextualized neural network architecture allows for more natural lexical selection and smoother syntactic generation (Chen & Lin, 2025; Gao et al., 2024). The moderate evaluations of Google Translate reflect its literal yet dependable conveyance of fundamental meaning, aligning with observations that Google Translate's statistical corpus approach often favors accuracy over aesthetic quality (Telaumbanua et al.). Libre Translate, although operational, exhibited significant deficiencies in lexical diversity, fluency, and stylistic consistency, supporting the notion that open-source models with smaller datasets produce translations with reduced semantic depth and stylistic fluidity (Abdelhalim et al., 2025; Agirrezabal et al.).
Participants concurred that none of the AI technologies could fully reproduce the aesthetic and emotional core of the poems. This limitation mirrors the broader challenge identified in literary translation research: AI systems can imitate linguistic structures but struggle to replicate the interpretive, emotional, and cultural dimensions of human creativity (Alowedi & Al-Ahdal, 2023; Ghazi, 2025).
Both the quantitative and the qualitative findings in this research reaffirm that current neural translation systems exhibit strong linguistic competence but limited poetic sensitivity. Of the three translation tools assessed, DeepL Translate consistently achieved the highest fidelity ratings and received the most favourable qualitative feedback. Participants observed that DeepL's translations were more natural, contextually coherent, and stylistically refined. This outcome corresponds with previous comparative research indicating that DeepL's Transformer-based network demonstrates superior performance in lexical coherence and contextual accuracy (Chen & Lin, 2025; Gao et al., 2024). However, despite its advanced handling of syntax and diction, DeepL remained limited in reproducing poetic aesthetics, particularly rhythm, meter, and musicality. AI can approximate form, but it cannot replicate emotional essence. This observation echoes findings by Karaban and Karaban, who noted that AI translation remains largely descriptive rather than interpretive, capable of conveying information but not imagination.
Google Translate received moderate scores for both accuracy and affective quality. Participants described its outputs as grammatically correct but literal, a trait also observed in earlier analyses of Google's literal translation tendencies (Telaumbanua et al.). The system's reliance on massive multilingual corpora enables high lexical coverage but sacrifices stylistic adaptation. Consequently, metaphors, idioms, and personifications were often rendered into plain, denotative language, diminishing their artistic resonance. This phenomenon aligns with findings in recent studies suggesting that NMT systems such as Google Translate prioritize semantic clarity over stylistic nuance, particularly in creative texts (Gao et al., 2024; Ghazi, 2025). While effective for narrative or descriptive content, Google Translate struggled with abstract, metaphorical, or symbolic poetry, consistent with the broader view that current AI lacks interpretive awareness of literary tone and rhythm.
Libre Translate, as an open-source translation system, showed the weakest fidelity and stylistic coherence. Its smaller corpus and limited neural architecture resulted in grammatically inconsistent outputs and frequent lexical repetition. This supports Abdelhalim et al.'s (2025) argument that data scale and model complexity are major determinants of translation quality, especially in tasks demanding interpretive sensitivity. Libre Translate's restricted handling of idioms and metaphors further underscores its limitations. Similar conclusions were drawn by Ghazi (2025), who found that lightweight AI models often misrepresent symbolic imagery due to inadequate exposure to figurative language during training. Thus, this research confirms that poetic fidelity requires not just linguistic capacity but aesthetic intelligence, an area still beyond current open-source systems.
The inclusion of human evaluators offered deeper insights into how fidelity is perceived beyond measurable linguistic equivalence. Participants emphasized that fidelity involves not only the accurate transfer of meaning but also the recreation of experience, an emotional and cultural resonance that AI cannot yet achieve. Similar to earlier findings, participants described feeling emotionally detached from AI-generated poems, even when the text was grammatically sound (Abdelhalim et al., 2025; Ghazi, 2025). This underscores a persistent gap between cognitive precision and aesthetic empathy. While AI can replicate syntactic patterns, it lacks creativity, cultural intuition, and emotional awareness, the very traits that define human translators (Imran, 2023; Karaban & Karaban). The respondents' reflections affirm that AI translation remains an assistive tool rather than a replacement for human literary interpretation. In poetry, meaning, structure, and emotion are inseparable; thus, machine translation can approximate the first two but not the last.
CONCLUSION AND RECOMMENDATION
The researcher concludes that DeepL Translate offers the best and most natural translations of poetry, followed by Google Translate and Libre Translate. All three tools could get across the main idea, but they often lost the poems' beauty, rhythm, and emotion. This illustrates that AI translation still has trouble with creative or artistic language. AI translation systems should be used to support students' learning, not to replace human translators. Students and teachers can use them to see how different translations express the same meaning. Developers should work on making AI better at understanding metaphorical language, emotion, and style in poetry. Future research may use additional poems and participants to yield more comprehensive results.
REFERENCES