Frequency and structures of lexical bundles in the German Goethe-Institut website

Amalia Qurrota Ayuni, Universitas Padjadjaran, Indonesia
Susi Yuliawati, Universitas Padjadjaran, Indonesia
Dian Ekawati, Universitas Padjadjaran, Indonesia


Corpus linguistics allows researchers to investigate how language is used by examining lexical bundles across genres, registers, and language varieties. With the rapid development of the internet, the language of websites has become a compelling variety to study. In German language studies, the Goethe-Institut is a worldwide German cultural institution dedicated to teaching German and promoting German culture, and its website is a well-known source for German-related studies. This research therefore analyzes German language patterns on the Goethe-Institut website by examining the frequency and structure of its lexical bundles. The study used a mixed-method approach. The corpus was dominated by three- and four-word lexical bundles, while five-word bundles were the least frequent. Most of the four-word lexical bundles on the site fell into the noun, preposition, or verb groups, whereas the adverb, conjunction, and adjective groups appeared least often. The language of the Goethe-Institut website was shown to contain semi-formal expressions, judging by the frequency of prepositions, nouns, verbs, and active sentences. The use of standard German and basic vocabulary indicates that the website is designed to be accessible to everyone, including learners of German. The language use also shows that the Goethe-Institut website is strongly user-oriented, with expressions that evoke a ‘sense of belonging’.


Lexical Bundles; Website; Corpus Linguistics; Goethe Institute




Copyright (c) 2023 Amalia Qurrota Ayuni, Susi Yuliawati, Dian Ekawati

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

LingTera is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.