The constructive nature and functions of the corpus



Authors

DOI:

https://doi.org/10.32523/2616-678X-2025-150-1-143-153

Keywords:

computational linguistics, corpus, corpus composition, National Corpus, corpus assembly, lexicographic studies

Abstract

The study examines corpora and parallel corpora: their purpose, structural features, and role in machine translation. It gives an overview of corpus creation and proposes ways to apply computational linguistics to machine translation. The description of corpus structure and operational algorithms provides a theoretical and practical foundation for research in this field. The methods used are theoretical analysis, generalization, and questionnaires. The main findings indicate high demand for the capabilities of the National Corpus of the Kazakh Language. However, the corpus currently supports only a limited range of text-processing functions and has several basic and technical shortcomings. The authors argue that machine translation should become a key tool for extending the corpus's functionality, and that this requires developing parallel corpora within the existing framework. The study contributes to the improvement of the National Corpus of the Kazakh Language. Its practical value lies in its guidance on corpus structure, algorithms, and functionality, which can support future practical work on machine translation.

Published

2025-03-30

How to Cite

Shokabayeva, S., & Serikzhan, F. (2025). The constructive nature and functions of the corpus. Bulletin of L.N. Gumilyov Eurasian National University. PHILOLOGY Series, 150(1), 143–153. https://doi.org/10.32523/2616-678X-2025-150-1-143-153