JEELS
(Journal of English Education and Linguistics Studies)
P-ISSN: 2407-2575 E-ISSN: 2503-2194 https://jurnalfaktarbiyah.
id/index.
php/jeels

A CORPUS-BASED STUDY OF LEXICAL BUNDLES IN THE ABSTRACTS OF HIGH-IMPACT RESEARCH ARTICLES ACROSS DISCIPLINES
Umi Khoirunnisa1, *Eri Kurniawan2
1-2English Language and Literature Study Program, Universitas Pendidikan Indonesia, Bandung, West Java, Indonesia
khoirunnisa@upi.
*eri_kurniawan@upi.
(*) Corresponding Author

Citation in APA style:
Khoirunnisa, U., & Kurniawan, E. (2025). A corpus-based study of lexical bundles in the abstracts of high-impact research articles across disciplines. JEELS, 12, 1109-1140. DOI: 10.30762/jeels.
Submission: January 2025. Revision: April 2025. Publication: November 2025.

Abstract: Lexical bundles are essential for creating coherent academic writing and for forming high-quality research articles. Researchers have previously examined lexical bundles in a number of academic fields. However, there is a lack of investigation on cross-disciplinary comparisons in research article abstracts. Thus, this study seeks to examine the use of lexical bundles (LBs) in abstracts across disciplines. The analysis used a corpus-based study design to investigate the frequency, structural patterns, and functional distributions of lexical bundles in abstracts across disciplines categorized as soft sciences (Linguistics, ELT, Psychology) and hard sciences (Electrical Engineering, Medicine, Biochemistry). To guarantee the representativeness of high-impact research articles, 180 abstracts of research articles, 30 from each discipline, were chosen from high-citation Scopus-indexed journals.
A descriptive quantitative approach was employed within Biber's structural taxonomy and Hyland's functional classification, along with an understanding of LB manifestations. The results indicate significant disciplinary commonalities in LB usage, albeit with numerous noticeable variations. Structurally, both the Hard and Soft Sciences favored phrasal over clausal bundles, with noun phrase-based bundles being the most frequent. In terms of functionality, research-oriented bundles predominated in both domains, with Soft Sciences preferring description bundles and Hard Sciences stressing topic bundles. This study sheds light on disciplinary conventions in abstract writing and emphasizes the significance of understanding the structural and functional differences in LBs in cross-disciplinary engagement and effective academic writing.

Keywords: functions, high-impact, research article abstract, structures

INTRODUCTION

In the academic realm, research articles have become significant. As written documents that outline their authors' investigations, research articles present new knowledge by exploring theoretical and/or methodological concerns and typically compare the findings to those of others (Swales, 1.
Furthermore, research articles help scientists strengthen their credibility and convince others of their discoveries by taking into account the audience and social implications (Hyland, 1.
Ultimately, research articles are vital for expanding knowledge and fostering productive scholarly communication. Thus, a research study should feature the best multidisciplinary content and be the most recent (Kurniawan et al.
Once the significance of research articles has been recognized, particular focus must be placed on the kind of publications that significantly impact the academic community, i.e., high-impact research papers.
One critical factor contributing to a study's high impact is the output and the citation frequency associated with the researchers' work (Hirsch, 2.
High citation counts indicate that an article has contributed significantly to the field's academic community (Kurniawan et al.
, 2.
Additionally, journal indexing is also taken into account.
Journal indexation is regarded as a portal to excellent research and publications (Kurniawan et al.
, 2.
since it is seen as a general indicator that the journal is of high quality.
Moreover, the primary determinant of a study's high-impact status is its publication in a journal with a high impact factor, followed by reader interest, which is affected by the study's topic and design (Verma & Yuvaraj, 2.
As the first point of contact between readers and journal research articles, the abstract is considered important for several reasons.
First, abstracts are an essential component that provides the basis for assessing whether a paper merits further reading (Ghasempour & Farnia.
Hyland, 2.
By providing a brief summary of a study's objectives, methodology, results, and discussion, abstracts give readers a general understanding of the nature and scope of the research; they highlight the topic and key conclusions of the work without providing a thorough, step-by-step breakdown (Lorys, 2.
In doing so, abstracts seek to draw readers in and facilitate their rapid evaluation of the article's relevance to their own research (Fauzan et al.
Kafes, 2.
Second, a global academic community has the ability to read online research articles due to the expanding use of online scholarly web indexes and the public access to abstracts (Tocalo.
Given their significance, abstracts have now been incorporated into the introduction, methods, results, and discussion (IMRAD) structure, which is the main benchmark for research articles (Wu, 2.
As a result, analyzing abstracts becomes critical for understanding their role in academic writing as well as their contribution to excellent scientific communication.
Building on the fundamental role of abstracts, it is necessary to examine more closely the linguistic elements that support their efficacy, such as lexical bundles, which are vital resources for creating trustworthy and persuasive scholarly discourse.
The term lexical bundles, referring to a kind of multi-word expression, was first introduced by Biber et al.
Lexical bundles are clusters of three or more words that statistically occur together in a register, or extended collocations (Biber et al.
, 1.
Some examples of lexical bundles include "as a result of," "on the other hand," "in the case of the," "the context of the," and "it is likely to" (Cortes, 2.
In academic writing, lexical bundles serve as discourse building elements (Biber et al.
, 1.
Frequency, fixedness, idiomaticity, and structural status are the important characteristics that distinguish lexical bundles from other multi-word expressions (Biber et al.
, 1999.
Cortes, 2.
Frequency refers to the high occurrence rate of these expressions within a particular discourse, with LBs being more frequent than pure idioms, which tend to appear less often or not at all.
Most bundles are formally regular and semantically transparent, which makes them the foundation of coherent conversation, in contrast to colloquial phrases.
In other words, because they usually span structural units, their frequency, rather than their structure, is the only empirical basis for their identification (Hyland, 2008.
This frequent occurrence is directly linked to fixedness, where LBs are defined by their stable word combinations that meet specific frequency criteria, regardless of alternative forms (Biber et al.
, 1.
However, LBs are not rigid; they exhibit a certain degree of flexibility, allowing them to adapt across different discourses.
For example, some LBs, such as "I don't know what," are commonly found in spoken discourse, while others, like "as a result of," are more prevalent in written academic texts (Biber, 2.
In terms of idiomaticity, the majority of LBs are non-idiomatic, which means that their meanings are obtained from their constituent parts rather than from a collective, figurative interpretation (Cortes, 2.
Moreover, the majority of LBs bridge two structural units rather than forming complete ones; they usually start at the boundary of a phrase or clause and continue into the next unit.
Because of their structural adaptability, LBs can be used in a variety of syntactic configurations, which enhances their usefulness in academic writing (Biber et al.
, 1.
In order to better understand how LBs function in academic texts, several later studies have used the frameworks for classifying LBs by their structural and functional roles provided by Biber's work, which is widely regarded as foundational in LB studies.
Table 1.
Structural classification of LBs (Biber et al.
, 2.
Structures (LBs incorporating) | Substructures | Example
Verb phrase (VP) fragments | (connector +) 1st/2nd person pronoun + VP fragment | 'well I don't know'
Verb phrase (VP) fragments | (connector +) 3rd person pronoun + VP fragment | 'it's going to be'
Verb phrase (VP) fragments | Discourse marker + VP fragment | 'I mean I don't'
Verb phrase (VP) fragments | Verb phrase (with non-passive verb) | 'take a look at'
Verb phrase (VP) fragments | Verb phrase with passive verb | 'can be used to'
Verb phrase (VP) fragments | yes-no question fragment | 'do you want to'
Verb phrase (VP) fragments | WH-question fragment | 'what do you think'
Dependent clause (DC) fragments | 1st/2nd person pronoun + dependent clause fragment | 'you might want to'
Dependent clause (DC) fragments | WH-clause fragment | 'what I want to'
Dependent clause (DC) fragments | If-clause fragment | 'if we look at'
Dependent clause (DC) fragments | (verb/adjective +) to-clause fragment | 'to be able to'
Dependent clause (DC) fragments | That-clause fragment | 'that there is a'
Noun phrase (NP) and prepositional phrase (PP) fragments | (connector +) Noun phrase with of-phrase fragment | 'the end of the'
Noun phrase (NP) and prepositional phrase (PP) fragments | Noun phrase with other postmodifier fragments | 'the way in which'
Noun phrase (NP) and prepositional phrase (PP) fragments | Other noun phrase expressions | 'a little bit more'
Noun phrase (NP) and prepositional phrase (PP) fragments | Prepositional phrase expressions | 'of the things that'
Noun phrase (NP) and prepositional phrase (PP) fragments | Comparative expressions | 'as well as the'

In terms of functionality, Biber et al. state that the three primary functional categories of lexical bundles are referential expressions, discourse organizers, and stance expressions.
Hyland (2008) expanded on this framework by modifying the classification to better fit the needs of academic writing, namely research papers, master's theses, and doctoral dissertations.
Since then, this revised taxonomy has gained widespread acceptance and has been used in later research on scholarly discourse.
Table 2 provides an overview of these functional categories.
Table 2. Functional classification of LBs (Biber et al., 2004; Hyland, 2008)

Concerning the research
  Referential expressions (Biber et al., 2004)
    Identification/focus: 'one of the most'
    Imprecision: 'or something like that'
    Specification of attributes: 'a lot of the', 'the size of the'
    Time/place/text reference: 'in the United States', 'at the same time', 'as shown in figure'
  Research-oriented (Hyland, 2008)
    Location: 'at the beginning of'
    Procedure: 'the use of the'
    Quantification: 'one of the most'
    Description: 'the structure of the'
    Topic: 'the currency board system'

Concerning the text
  Discourse organizers (Biber et al., 2004)
    Topic introduction/focus: 'in this chapter we'
    Topic elaboration/clarification: 'on the other hand'
  Text-oriented (Hyland, 2008)
    Transition signal: 'on the other hand'
    Resultative signal: 'as a result of'
    Structuring signal: 'in the next section'
    Framing signal: 'in the case of'

Concerning the author/audience
  Stance expressions (Biber et al., 2004)
    Epistemic stance: 'I don't know if'
    Attitudinal/modality stance: 'can be used to'
  Participant-oriented (Hyland, 2008)
    Stance features: 'may be due to'
    Engagement features: 'it should be noted that'
Lexical bundles are crucial elements, particularly when the author is attempting to create coherence and communicate information successfully.
These expressions, which help readers recognize a specific context or variety, such as academic or legal, are made up of phrases or word sequences that typically appear repeatedly in a text (Biber et al.
, 1999.
Hyland, 2008.
Previous research has shown that lexical bundles play a pivotal role in scholarly communication, where these patterns suggest subject-matter competence in addition to influencing the structure of academic narratives (Hyland, 2008.
Up to this point, LBs have been investigated from various perspectives. In order to find patterns of variance and similarity, LBs have been investigated both within and across disciplines (Farhang-Ju et al., 2024; Kashiha, 2023;
Samraj, 2.
This cross-disciplinary approach has made clear how LBs meet the communicative needs of different fields.
Moreover, numerous studies have examined LBs in particular research article sections, such as introductions, discussions, and conclusions (Deng & Liu, 2023; Goodarzi et al., 2024;
Kurniawan & Haerunisa, 2.
, or in more than one section (Shahmoradi et al.
, 2021.
Shahriari, 2.
, demonstrating their tactical value in bolstering rhetorical devices and accomplishing communication objectives.
Within the existing LB research, however, only a few studies have paid attention to the abstract. For instance, Shahmoradi et al. examined LBs in information technology and applied linguistics articles, focusing on the abstract and conclusion sections, whereas Varghaei and Khodadadi compared LBs in medical abstracts from Iranian and international journals. Abdollahpour and Gholami examined the manifestation of LBs in medical sciences abstracts, whereas Qi and Pan investigated LB variation across moves in medical abstracts.
These investigations demonstrate how discipline norms and abstract-specific limitations are reflected in LBs.
Nevertheless, there is still a gap in research providing a comprehensive, cross-disciplinary analysis of LBs in abstracts that addresses both their structure and function.
Following previous studies in disciplinary discourse (Biglan.
Becher & Trowler, 2.
, this research adopts the distinction between hard and soft sciences.
While the dichotomy is sometimes viewed as reductive, it remains a widely recognized framework for examining cross-disciplinary variation in academic writing.
Accordingly, the present study focuses on how lexical bundles (LBs) are manifested, structurally and functionally, within abstracts from both hard sciences (Medicine, Biochemistry, Electrical Engineering) and soft sciences (English Language Teaching, Psychology, Linguistics). This focus is addressed through the following research questions:
1. What are the most frequent LBs in the abstract sections of hard sciences and soft sciences?
2. Do structural and functional differences exist between lexical bundles in hard science and soft science research article abstracts? If so, how are they expressed?
By addressing these research questions, the study seeks to learn more about how LBs are used in article abstracts and how they increase the chances of acceptance and visibility in the journals being targeted.

METHOD

Research Design

This study applied a corpus-based study design.
Researchers are able to find hidden meanings in lexical elements, such as through the study of collocations, and detect recurring linguistic patterns in language usage by using quantifiable data from a corpus-based analysis to validate the presence of discourses (Baker, 2.
Furthermore, a descriptive quantitative approach is used since it is characteristic of corpus-based studies.
That is, quantitative analysis uncovers key patterns by assessing the frequency and relative usage of language features, whereas descriptive analysis interprets the communicative functions associated with those observed quantitative patterns (Conrad, 1.
The approach aims to provide a better understanding of how LBs are manifested in six disciplines (Electrical Engineering, English Language Teaching, Linguistics, Biochemistry, Psychology, and Medicine). In addition, a Z-test is applied to calculate the differences in the proportion of LBs between the two corpora, Hard Science and Soft Science (Kurniawan & Haerunisa, 2023), testing the following hypotheses:
H0: There is no significant difference in the proportion of LB occurrences in the two corpora.
H1: There is a significant difference in the proportion of LB occurrences in the two corpora.
The alpha threshold of the Z-test was set at 0.05, meaning that a p-value higher than 0.05 does not provide sufficient evidence to reject H0.
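As a minimal sketch of the test described above, the following Python function computes a two-proportion z statistic and its two-sided p-value. The study does not report the exact formula or software used for the Z-test, so the pooled-variance form shown here is an assumption, and the function name, argument names, and counts in the usage line are hypothetical.

```python
from math import erfc, sqrt


def two_proportion_z_test(count_a, total_a, count_b, total_b):
    """Return (z, two-sided p-value) for H0: the two proportions are equal."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("totals must be positive")
    p_a = count_a / total_a
    p_b = count_b / total_b
    # Pooled proportion under H0 (assumed form of the test).
    pooled = (count_a + count_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("z is undefined when the pooled proportion is 0 or 1")
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only (not taken from the study).
z, p = two_proportion_z_test(50, 100, 30, 100)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0 at 0.05: {p < 0.05}")
```

With an alpha of 0.05, H0 is retained whenever the returned p-value exceeds 0.05, which mirrors the decision rule stated above.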
To give a better understanding of the findings, the results are presented using figures, tables, and explanations.

Corpora

The data used in this study were research articles (RAs) from various disciplines, including Electrical Engineering (Maswana et al.
, 2.
Biochemistry (Kanoksilapatham, 2.
Medicine (Fryer, 2.
English Language Teaching (Rochma et al.
, 2.
Psychology (Yang, 2.
, and Linguistics (Alamri, 2.
All RAs were sourced from Scopus-indexed databases accessed through the Scopus.com website.
The data comprised 180 high-impact RAs, with 30 abstracts from each discipline (Alamri, 2020;
Wannaruk & Amnuai, 2.
, which are grouped into two corpora, as seen in Table 3.
Table 3. Description of the Corpora
(Corpus: Hard Science, comprising Electrical Engineering, Biochemistry, and Medicine; Soft Science, comprising English Language Teaching (ELT), Psychology, and Linguistics. Rows: Number of RAs, Word counts, Total.)

With a total of 33,380 words, this corpus is considered small, as it falls within the range of corpus sizes studied by Chen and Baker, namely 26,000 to 88,000 words.
Data Collection and Analysis

To identify the LBs, this study used a computer software program named AntConc 4.
1 (Anthony, 2.
AntConc is a computer program commonly used to analyze language corpora, especially lexical bundles (Farhang-Ju et al.
, 2024.
Goodarzi et al.
, 2024.
Jasim, 2023.
Kurniawan & Haerunisa, 2023.
Richter et al.
, 2.
Many features in AntConc are useful for in-depth textual analysis, particularly the N-Gram feature, which is used to identify the occurrence of lexical bundles.
To achieve a successful analysis with the program, the abstract section of each RA was separated and converted into a plain text (.txt) file. The .txt files of abstracts were imported into the software and separated into two corpora named Hard Science and Soft Science.
Then, the imported corpora were analyzed with the N-Gram feature.
This study employed cut-off criteria based on the guidelines established by Biber and Barbieri, which state that, in order to prevent the influence of individual author styles, word combinations must appear at least three times and be dispersed across at least three different texts.
Moreover, this study only generated three- and four-word bundles, as they are more common than longer lexical bundles (Biber et al.
, 1.
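The cut-off criteria above can be illustrated with a short Python sketch. It is not AntConc's implementation: the tokenizer, the function name `extract_bundles`, and the sample texts are assumptions made for illustration, and only the two thresholds (at least three occurrences, spread over at least three texts) come from the description above.

```python
import re
from collections import Counter, defaultdict


def extract_bundles(texts, n, min_freq=3, min_texts=3):
    """Return {n-gram: frequency} for n-grams meeting both cut-off criteria."""
    freq = Counter()
    seen_in = defaultdict(set)  # n-gram -> indices of the texts containing it
    for index, text in enumerate(texts):
        # Simple lowercase word tokenizer (an assumption, not AntConc's rules).
        tokens = re.findall(r"[a-z0-9]+(?:[-'][a-z0-9]+)*", text.lower())
        for start in range(len(tokens) - n + 1):
            gram = " ".join(tokens[start:start + n])
            freq[gram] += 1
            seen_in[gram].add(index)
    return {
        gram: count
        for gram, count in freq.items()
        if count >= min_freq and len(seen_in[gram]) >= min_texts
    }


# Hypothetical mini-corpus, for illustration only.
sample_texts = [
    "As a result of the test",
    "As a result of this",
    "It came as a result of that",
]
print(extract_bundles(sample_texts, n=4))
```

Tracking the set of texts in which each n-gram occurs is what enforces the dispersion requirement; a sequence repeated three times within a single abstract would pass the frequency threshold but fail the text-range threshold.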
FINDINGS
In order to address the RQs, this section reports the three- and four-word lexical bundles found in the Hard Science and Soft Science corpora, including a discussion of their structural and functional characteristics based on the taxonomies.
After the preset criteria for identifying LBs were applied, the N-Gram feature of AntConc returned a different number of LBs for each corpus.
The program produced a list of 166 three-word and 47 four-word LBs used in Hard Science RAs.
In contrast, the program generated a list of 70 three-word and 14 four-word LBs in Soft Science RAs.
However, a significant number of the initial lexical bundles in both corpora exhibited repetitive patterns.
Consequently, a manual elimination process was conducted based on the exclusion criteria outlined by Salazar. The applied criteria for exclusion were:
1. Fragments of other bundles. For example, the three-word bundle "on the other" and the four-word bundle "on the other hand" both appeared eight times. "On the other" always appeared as a fragment of "on the other hand," so "on the other" was not included.
2. Bundles ending in articles. For example, "in accordance with the" was a four-word bundle that was an extension of "in accordance with," and both occurred with the same frequency. "In accordance with the" was ignored.
3. Bundles that lack semantic meaning or textual support, such as "years and were."
4. Bundles that contain arbitrary numbers, like "two or more."
5. Meaningless groupings, such as "et al in."
However, this study included topic-specific bundles such as "patients with Covid," following Kurniawan and Haerunisa.
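The first two exclusion criteria are mechanical enough to sketch in Python. The study carried out the elimination manually, so the code below is only an illustrative approximation: the function name `filter_bundles`, the article list, and the sample frequencies other than the two worked examples are hypothetical, and the remaining criteria (semantic meaning, arbitrary numbers, meaningless groupings) still require human judgment.

```python
ARTICLES = {"a", "an", "the"}


def filter_bundles(bundles):
    """Drop bundles ending in an article, then fragments of longer bundles.

    `bundles` maps each bundle (space-separated words) to its frequency.
    """
    # Step 1: remove bundles whose final word is an article.
    no_articles = {
        bundle: freq
        for bundle, freq in bundles.items()
        if bundle.split()[-1] not in ARTICLES
    }
    # Step 2: remove a bundle when a longer remaining bundle contains it
    # and has the same frequency (it only ever occurs as a fragment).
    kept = {}
    for bundle, freq in no_articles.items():
        is_fragment = any(
            len(other.split()) > len(bundle.split())
            and f" {bundle} " in f" {other} "
            and other_freq == freq
            for other, other_freq in no_articles.items()
        )
        if not is_fragment:
            kept[bundle] = freq
    return kept


# Frequencies for "in accordance with" and "as well as" are hypothetical.
candidates = {
    "on the other": 8,
    "on the other hand": 8,
    "in accordance with": 5,
    "in accordance with the": 5,
    "as well as": 12,
}
print(filter_bundles(candidates))
```

Article-final bundles are removed before the fragment check so that "in accordance with" is not mistaken for a fragment of "in accordance with the," which matches the worked example in the criteria.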
After the elimination procedure, the final lists comprised 115 three-word and 38 four-word LBs in Hard Science, and 55 three-word and 9 four-word LBs in Soft Science, as shown in Table 4.

Table 4. LBs occurrences information
(Columns: Three-word, Four-word, Total. Rows: No. of bundles and Overall freq. for Hard Science and for Soft Science.)

There appears to be a difference in the number of LB occurrences between Hard Science and Soft Science, as shown in Table 4. This finding implies that Hard Science and Soft Science differ noticeably, at least in the quantity of LB occurrences.
In addition, Tables 5 and 6 present an overview of LBs in each corpus by listing the three most common three- and four-word bundles in this study.
Table 5. Most frequent three-word LBs in Hard Science and Soft Science
(Columns: Rank, Hard Science LBs, Freq., Soft Science LBs, Freq. Listed bundles include "patients with COVID-19," "coronavirus disease," and "of patients" for Hard Science, and "as well as" and "based on" for Soft Science.)

Table 6. Most frequent four-word LBs in Hard Science and Soft Science
(Columns: Rank, Hard Science LBs, Freq., Soft Science LBs, Freq. Listed bundles include "in patients" for Hard Science, and "as well as," "on the basis," and "in terms" for Soft Science.)

Tables 5 and 6 show that Hard Science lexical bundles are primarily composed of specific medical terms, including "severe acute respiratory syndrome" and "patients with COVID," which reflect the discipline's exacting and technical nature.
In contrast, Soft Science lexical bundles, including "as well as" and "on the basis of," are typically less specialized and more general, which fits the discipline's more interpretive approach.
This result indicates the differing linguistic and rhetorical demands of Hard Science and Soft Science.
LBs Structures in Hard Science and Soft Science

This subsection examines the structural variation of lexical bundles in Hard Science and Soft Science using the framework by Biber.
This framework divides LBs into three categories: verb phrases, noun phrases, and prepositional phrases.
Additionally, to gain a deeper understanding of LBs in both corpora, this analysis incorporated the other categories presented by Nasrabady et al., which include adjectival phrases, adverbials, be adjective/adverb to structures, and bundles beginning with conjunctions.

Figure 1. Distribution of LBs structures in Hard Science and Soft Science

The results of this study align with a number of earlier studies that highlight the prevalence of phrasal lexical bundles (LBs) in academic abstracts, including those by Abdollahpour and Gholami, Niu, and Varghaei and Khodadadi.
Figure 1 revealed that noun phrase (NP) fragments are used most frequently in both disciplines (. 44% in Hard Science and 43.75% in Soft Science), followed by prepositional phrase (PP) fragments (. 25% in Soft Science and 25.49% in Hard Science) and verb phrase (VP) fragments (. 22% in Hard Science and 21.88% in Soft Science).
These results are in line with those of Abdollahpour and Gholami and Varghaei and Khodadadi, which emphasize the predominance of NP and PP fragments in medical abstracts.
However, this analysis also yields a new finding: the frequency of PP fragments is higher in Soft Science than in Hard Science.
In contrast, other categories, such as dependent clause (DC) fragments, adjectival phrases, and bundles that start with conjunctions, are either barely represented or absent in both corpora.
DC fragments (e.g., "to describe the") served as a modifier.
We aimed to describe the CT findings across different time points throughout the disease course. (HS)
Those DC-based bundles with to-clause fragments were used to explain an action (Kurniawan & Haerunisa, 2.
Table 7. Z-test calculation of LBs main structures
(Rows: VP fragments, DC fragments, NP fragments, PP fragments, Adjectival, Adverbials, Be adj/adv to, Bundles beginning with conjunctions. Columns: Hard Science, Soft Science, z score, p value.)

Moreover, only Hard Science uses adverbial phrases (e.g., "more likely to"), which add details about degree or likelihood.
No adjectival phrases or bundles with be adj/adv structures were found in either corpus. Lastly, bundles beginning with a conjunction also appeared at a low percentage in both corpora (e.g., "and discriminant validity" and "and ground glass"), introducing additional or complementary information.
As seen in Table 7, the most common lexical bundle structures were noun phrase (NP) fragments, with comparable percentages in Hard Science (. 44%) and Soft Science (43.75%).
Indeed, with all p-values above 0.05, the data showed consistent usage of structural categories in lexical bundles across Hard and Soft Science, with no structure showing a significant difference in proportion.
Lexical bundle (LB) substructures were further examined to allow for a more thorough comparison, as seen in Table 8.
To improve structure and clarity, the classification underwent a number of modifications. Based on research by Berktien and Pearson, the first modification was changing the subcategory "connector 3rd person pronoun VP fragments" to "pronoun/noun VP fragments." Another modification, initiated by Nasrabady et al., added adverbial and adjective clauses as new substructures for bundles with DC fragments.
However, bundles containing adverbials, adjectival phrases, or be adjective/adverb to structures, and those beginning with conjunctions, were not included in the following analysis because they had no substructures.
Table 8. LBs substructures distribution
(Columns: Structures, Substructures, Hard Science, Soft Science. Substructures listed: Pronoun/noun VP, VP with non-passive verb, VP with passive verb, WH-question, To-clause, That-clause, Adverbial clause, Adjective clause, NP with of-phrase, NP with other post modifier, Other NP expressions, PP expressions, Comparative.)

Table 8 shows that there is no statistically significant difference between the two groups in any of the lexical bundle substructures, as all p-values exceed the threshold of 0.05.
In other words, no single substructure shows a distinct distributional pattern; the groups' use of substructures is generally consistent.
Thus, H0 could not be rejected for any of the structures.
NP-based Bundles

According to the NP-based bundle analysis, Hard Science and Soft Science have different dominant NP substructures.
The most common NP substructure in Hard Science is "other NP expressions" (. 18%), with bundles like "science and technology," "an intensive care unit," and "the median age" helping to succinctly indicate specific topics (Abdollahpour & Gholami, 2.
This is not in line with Kim and Lee and Niu, which focused on medical texts. Their studies showed that the most frequent bundles are NPs with of-phrase fragments.
The median age of deceased patients .
was significantly older than recovered patients .
(HS)
In comparison, NP of-phrase fragments account for 50.00% of the total in Soft Science.
Often, these bundles outline relationships and areas of concentration.
Discussion focuses on the role of behavioral regulation in early academic achievement and preparedness for kindergarten.
(SS)
Lastly, NP with post-modifier fragments, like "ways in which," contribute 22.06% in Hard Science and 25.00% in Soft Science, indicating their function in elaborating on the central noun. This is consistent with Niu's findings that these bundles give academic abstracts more context and precision.
PP-based Bundles

The analysis of prepositional phrase (PP)-based bundles showed that "prepositional phrase (PP) expressions" are the most common substructure, and they are highly prevalent in both Hard Science and Soft Science.
PP expressions make up 92.31% of PP-based bundles in Hard Science and 90.00% in Soft Science.
Bundles like "in patients with," which underline the links between variables or components in research, can offer crucial locative, descriptive, or comparative information.
This predominance is consistent with research by Kim and Lee and Niu, who highlighted the importance of prepositional phrases in abstracts, especially in interdisciplinary and medical research where clear links between components are crucial.
Both helper T (Th)
cells and suppressor T cells in patients with COVID-19 were below normal levels, with lower levels of Th cells in the severe group.
(HS)
Moreover, comparative expressions, the second substructure, make up 7.69% of PP-based bundles in Hard Science and 10.00% in Soft Science.
These bundles, such as "as well as," highlight comparisons or contrasts.
This is in line with the findings of Abdollahpour and Gholami and Varghaei and Khodadadi, who emphasized the value of comparative constructions in medical and scientific writing when presenting findings and discussing implications.
Also, the bundle "as well as" was used to designate two components as being equally significant (Kurniawan & Haerunisa, 2.
An individual's self-efficacy and outcome expectations were found to be positively influenced by the encouragement of others in their work group, as well as others' use of computers.
(SS)
VP-based Bundles

The analyses in Table 8 are consistent with a finding from a previous study by Niu, which concentrates on LBs in English abstracts of Chinese and international journals.
The table reveals an intriguing dominance among the VP fragment substructures, where pronoun/noun VP fragments appeared as the most frequent substructure in Hard Science .
00%).
This substructure is frequently employed to explain goals, conclusions, or circumstances.
Results suggest that a phonological deficit can appear in the absence of any other sensory or motor disorder, and is sufficient to cause a literacy impairment, as demonstrated by five of the (HS)
These results differ from previous research in which passive VP was the most frequently occurring structure in medical RAs (Kim & Lee, 2021; Varghaei & Khodadadi).
In this study,
VP with passive verbs appeared as the second most frequent occurring in Hard Science .
while in Soft Science.
VP with a passive verb was the most frequently occurring structure .
%), followed by pronoun/noun VP fragment .
%).
Older age was associated with greater risk of development of ARDS and death likely owing to less rigorous immune (HS)
In addition, two other substructures of VP fragments, namely VP with non-passive verbs and VP with WH-questions, appeared less frequently in both corpora.
In fact, no VP with WH-questions occurred in either corpus.
DC-based Bundles
Analyzing bundles based on dependent clauses (DC) shows intriguing trends in how they are used in both soft and hard sciences.
With 71.43% in Hard Science and 100% in Soft Science, to-clauses are the most prevalent substructure among DC fragments.
These bundles express purposes or objectives, which reflects their function in outlining the goals and intentions of a study.
Based on the mode analysis, an approximation method is developed to estimate the peak gain point, which is useful in LLC design.
Only 14.25% of academic abstracts in the Hard Sciences, and none at all in the Soft Sciences, contain adverbial clauses (such as "who were admitted to") or that-clauses (such as "those of the").
Adverbial clauses help develop connections between concepts to create cohesive writing (Fang), whereas that-clauses offer information by referencing or comparing features.
Their restricted and specialized importance in these fields is reflected in their low usage.
[Figure 2: chart of the percentage of research-oriented, text-oriented, and participant-oriented bundles in Hard Science and Soft Science.]
Figure 2. Distribution of LBs functions in Hard Science and Soft Science
Initially, the final lists of lexical bundles (LBs) were functionally classified using Hyland's framework.
The chart illustrates that research-oriented bundles predominate in both corpora, making up 66% in Hard Science and 71.88% in Soft Science.
This is consistent with earlier research, including Kim and Lee (2021), Varghaei and Khodadadi, Niu, and Abdollahpour and Gholami, which also found that research-oriented bundles are commonly used in academic writing to describe procedures, present results, or contextualize research within a particular framework.
On the other hand, as a result of the greater interpretive and explanatory demands of Soft Science writing, text-oriented bundles, which aid in discourse organization and reader guidance are more common in Soft Science .
13%) compared to Hard Science .
69%).
This finding is consistent with observations made by Niu and Hyland, who pointed out that text-oriented bundles are frequently used in Soft Science disciplines to build relationships and make arguments more understandable for readers.
Moreover, participant-oriented bundles, which indicate the author's position or interaction with the audience, are hardly present: they make up less than 1% of Hard Science abstracts and 0% of Soft Science abstracts.
Table 9 presents the sub-functions of LBs, which were examined to gain a better understanding of their presence in the two corpora.
To classify bundles that did not fit Hyland's framework, other frameworks were also used.
The additional sub-functions were grouping, citation, generalization, and objective from Salazar, as well as doubling, exemplifier, and questioning from Nasrabady et al.
Additionally, Salazar's categorization of Hyland's transition and resultative signal sub-functions into additive, comparative, inferential, and causal categories was also used.
Table 9. LBs sub-functions distribution.
[Table 9: columns are Functions, Sub-functions, Hard Science, Soft Science, Z score, and p value; the functions are research-oriented, text-oriented, and participant-oriented; the sub-functions listed are Location, Procedure, Quantification, Description, Topic, Grouping, Doubling, Additive, Comparative, Inferential, Causative, Structuring, Framing, Citation, Generalization, Objectives, Exemplifier, Questioning, Stance, and Engagement.]
Table 9 shows the apparent statistical differences in the distribution of LB functions and sub-functions between research articles in Hard Science and Soft Science.
Each sub-function appeared differently in Hard Science and Soft Science.
The results are discussed in greater detail in the ensuing subsections.
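The percentages reported in the following subsections appear to be shares of each sub-function within its parent function (for example, topic bundles as a share of all research-oriented bundles). A minimal Python sketch of that calculation is given here; the function name `subfunction_shares` and the toy labels are hypothetical illustrations, not the study's data or its actual procedure.

```python
from collections import Counter


def subfunction_shares(labels):
    """Return each sub-function's percentage share within its parent function.

    `labels` holds one (function, sub_function) pair per classified bundle.
    """
    labels = list(labels)  # accept any iterable, since it is counted twice
    per_function = Counter(function for function, _ in labels)
    per_pair = Counter(labels)
    return {
        pair: round(100 * count / per_function[pair[0]], 2)
        for pair, count in per_pair.items()
    }


# Hypothetical toy input (not the study's data).
toy_labels = (
    [("research-oriented", "topic")] * 3
    + [("research-oriented", "description")]
    + [("text-oriented", "framing")] * 2
)
shares = subfunction_shares(toy_labels)
print(shares[("research-oriented", "topic")])  # 75.0
```

Computing shares within each function, rather than over all bundles, keeps the sub-function percentages of one function summing to roughly 100%, which is how the figures in this section read.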
Research-oriented Bundles
As the figure shows, the examination of research-oriented bundles identifies significant distinctions between Hard Science and Soft Science in their sub-functions.
In Hard Science, topic bundles account for 66.41% of research-oriented bundles, while in Soft Science, they only make up 19.57%.
Bundles like "patients with Covid-19" or "respiratory distress syndrome" are commonly used in Hard Science abstracts to precisely identify and highlight the primary research topic.
This is consistent with findings by Niu, who highlighted the use of topic-oriented bundles in Hard Science to convey the research focus.
The lungs from patients with Covid-19 also showed distinctive vascular features, consisting of severe endothelial injury associated with the presence of intracellular virus and disrupted cell membranes.
Unlike Hard Science abstracts, which tend to emphasize procedural or result-oriented bundles, Soft Science abstracts make extensive use of descriptive bundles, such as "the development of," to offer richer contextual and interpretive information (Varghaei & Khodadadi).
This article presents the development of a brief, self-report measure of female sexual function.
Initial face validity testing of questionnaire items, identified by an expert panel, was followed by a study aimed at further refining the questionnaire.
The Z-test shows that Hard Science and Soft Science differ significantly in three sub-functions of research-oriented bundles: topic, description, and location (data marked in bold).
The prevalence of topic bundles is much higher in Hard Science (66.41%) than in Soft Science (19.57%), with a p-value of 0.000 and a Z score of 5.467, highlighting Hard Science's emphasis on precisely specifying study topics and areas.
On the other hand, the p-value of 0.000 and the Z score of -5.822 show that description bundles are substantially more common in Soft Science than in Hard Science.
This is consistent with Soft Science's interpretative focus on detailed and contextual descriptions.
Additionally, with a Z score of -3.216, location bundles also exhibit a significant difference, being more prevalent in Soft Science than in Hard Science, highlighting the contextual framing frequently observed in Soft Science.
These notable distinctions show how the two fields employ different rhetorical methods, with Soft Science emphasizing in-depth explanations and contextualization and Hard Science concentrating on succinct, topic-driven discourse.
Yet Anthony also used his discursive inquiry to "trouble the water" in his classroom and in the study group workshops.
Text-oriented Bundles
The examination of text-oriented bundles reveals different distributions across sub-functions in Hard Science and Soft Science.
Objective bundles, like "to evaluate the," are more frequent in Hard Science than in Soft Science, indicating that Hard Science places a greater focus on clearly articulating study objectives.
Similarly, structuring bundles, like "in this paper," make up 25.00% of Hard Science and 11.11% of Soft Science, indicating their function in directing readers and structuring discourse (Hyland).
Moreover, comparative bundles, such as "was the most," are more common in Hard Science than in Soft Science, highlighting Hard Science's emphasis on accurate comparisons to properly convey study findings.
This study aimed to evaluate the clinical characteristics of COVID-19 in pregnancy and the intrauterine vertical transmission potential of COVID-19 infection.
In this paper, we introduce a framework for VHR scene .
On admission, ground-glass opacity was the most common radiologic finding on chest computed tomography (CT) .
4%).
In contrast, the other sub-functions of text-oriented bundles reveal notable variations between Hard Science and Soft Science.
Inferential bundles, such as "findings indicate that," are far more common in Soft Science than in Hard Science, which highlights the interpretive character of Soft Science and its emphasis on making connections and deductions from data.
Moreover, Soft Science places a greater emphasis than Hard Science on causative bundles such as "as a result of," which indicate cause-and-effect relationships (Salazar).
Furthermore, the findings indicate that a classification system that is based on the simple view has advantages over standard systems that focus only on word recognition and/or reading (SS) .
As a result, much research is still based on the old Kučera and Francis frequency norms.
(SS)
Furthermore, framing bundles like "in terms of" are commonly used in the Soft Sciences to contextualize arguments, and additive bundles ("and the results"), which are less prevalent overall, are used slightly more in the Soft Sciences than in the Hard Sciences to help connect ideas.
Certain text-oriented bundles, such as exemplifier and generalization bundles, are absent from both fields, suggesting their limited use in academic abstracts.
Lastly, the Z-test results suggest that none of the text-oriented sub-functions exhibit statistically significant differences between Hard Science and Soft Science, with all p-values over 0.05, despite these apparent distributional discrepancies.
This lack of statistical significance implies that both fields use text-oriented bundles in a similar way overall, even though there are differences in the presence or absence of certain sub-functions.
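The Z scores and p-values are reported without the underlying computation. One common way to compare two proportions is the pooled two-proportion z-test, sketched here in Python on the assumption that this is the kind of test intended; the function name `two_proportion_z_test` and the counts in the example are hypothetical and are not the study's data.

```python
import math


def two_proportion_z_test(count_a, total_a, count_b, total_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both totals must be positive")
    p_a = count_a / total_a
    p_b = count_b / total_b
    # Pooled proportion under the null hypothesis of equal proportions.
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        # Both groups are all-zero or all-one: no detectable difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 60 of 100 bundles in group A vs. 20 of 100 in group B.
z, p = two_proportion_z_test(60, 100, 20, 100)
print(round(z, 3), p < 0.05)
```

With the first group taken as Hard Science, a positive z would mean the sub-function is proportionally more frequent in Hard Science and a negative z more frequent in Soft Science, which matches the sign pattern of the scores reported for topic and description bundles.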
Participant-oriented Bundles
Participant-oriented bundles show little variation between Hard Science and Soft Science, with engagement features completely missing from both corpora and stance features making up 100% of Hard Science participant-oriented bundles.
Stance bundles like "more likely to" highlight the authors' role in delivering the research by expressing their views, level of confidence, or assessment of their findings.
This supports the findings of Niu, who observed that participant-oriented bundles, in particular stance features, are more common in fields that place a high value on findings that are straightforward and easy to understand.
The 2019-nCoV infection was of clustering onset, is more likely to affect older males with comorbidities, and can result in severe and even fatal respiratory diseases such as acute respiratory distress syndrome.
The lack of engagement elements, which usually entail communication with the audience (such as instructions or inquiries), highlights that academic abstracts typically maintain an impersonal tone, minimizing direct interaction with readers.
CONCLUSION
This study explored the use of lexical bundles in abstracts of high-impact research articles across disciplines, especially comparing Hard Science and Soft Science.
The analysis revealed information about the frequency, structural patterns, and functional distribution of lexical bundles by utilizing Biber's structural taxonomy and Hyland's functional classification, along with other categories to support the investigation.
The results demonstrated distinct disciplinary similarities and variations in both structural and functional aspects.
Structurally, both Hard Science and Soft Science were dominated by noun phrase-based bundles, followed by PP-based bundles and VP-based bundles.
This finding indicates that both corpora utilize more phrasal bundles than clausal ones.
However, the frequency of PP-based bundles is higher in Hard Science.
This higher use of PP-based bundles in Hard Science abstracts is indicative of the field's demand for exact locational, temporal, and relational markers in order to effectively communicate technical and factual information.
Functionally, both fields had the highest concentration of research-oriented bundles, especially topic bundles in the Hard Sciences and description bundles in the Soft Sciences.
The higher use of topic bundles indicates a greater emphasis on defining the topic or focus of the investigation, while the higher use of description in Soft Science indicates a focus on elaborating attributes.
This study emphasizes the variety of ways lexical bundles (LBs) are employed in academic writing, pointing out that their functions and forms are influenced by discourse communities and might not be appropriate in all contexts.
Despite the limited size of its corpus, the study identifies recurrent patterns that serve as the basis for recommendations on how abstracts can align with disciplinary standards.
For more accurate and thorough insights into LB usage, future studies should employ sophisticated analytical tools and expand the corpus size.
ACKNOWLEDGMENTS
The research program offered by Indonesia's Ministry of Research, Technology, and Higher Education made this study feasible.
We also appreciate the collaboration and support of our colleagues and friends, which contributed significantly to the success of this work.
DECLARATION OF AI AND AI-ASSISTED TECHNOLOGIES
While preparing this paper, the authors utilized Grammarly to reduce writing errors and ChatGPT to help with data calculation.
The writers have used these tools to thoroughly evaluate and revise the content as needed, and they are solely responsible for the final version of this publication.
REFERENCES