Корпусқа енгізілетін мәтіндердегі сөздерге морфологиялық белгіленім қою және оларды компьютерлік бағдарламаға енгізу мәселелері

S.K. Kulmanov; A.A. Zhanabekova; N.M. Ashimbayeva; A.Z.-G. Bisengali; N.K. Shulenbayev; B.K. Kordabay

doi:10.32523/2616-678X-2022-140-3-103-113

Problems of morphological markup of words in corpus texts, and their inclusion in a computer program

Authors

S.K. Kulmanov, A. Baitursynuly Institute of Linguistics
A.A. Zhanabekova, A. Baitursynuly Institute of Linguistics
N.M. Ashimbayeva
A.Z.-G. Bisengali, A. Baitursynuly Institute of Linguistics
N.K. Shulenbayev, A. Baitursynuly Institute of Linguistics
B.K. Kordabay, A. Baitursynuly Institute of Linguistics

Keywords:

corpus, corpus linguistics, text, morphology, conditional marking, markup, computer program

Abstract

The article gives a brief overview of the history of corpus creation in linguistics and of the characteristics of corpus linguistics, and sets out the theoretical and practical tasks and requirements of morphological markup.

Morphological markup of words in corpus texts was originally done manually. The article explains the basic principles of morphological analysis and marking of individual words. Morphological analysis is carried out mainly without reference to context. The article also highlights various features encountered when analyzing the morphological structure of parts of speech and placing morphological markings on words.

Automatic analysis of the morphological system of the language is carried out by fulfilling several conditions in sequence: 1) identifying the morphological structure of words (single-root word, affixes); 2) entering a word list and pre-prepared affixes into the computer's memory; 3) entering electronic texts of various language styles that contain morphological markings into the computer's memory. Then, with the help of a computer program, the following tasks are performed: a) assigning parts of speech to words that do not yet have them; b) manually correcting isolated errors in part-of-speech assignment while processing registry words; c) keeping only one of the homonyms, assigned to one part of speech, in the list of registry words; d) distinguishing word-forming suffixes from formative affixes.
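
The stepwise procedure described in the abstract can be illustrated with a minimal sketch. It assumes a tiny registry of root words with part-of-speech marks and a small affix list. The tag names (`N`, `PL`, `DAT`), the two dictionaries, and the function `analyze` are hypothetical and chosen only for illustration; they are not the authors' actual markup scheme or program.

```python
from typing import Optional

# Toy data (assumption): a registry of roots with part-of-speech marks
# and a list of affixes with conditional marks. Not the authors' lists.
ROOTS = {"кітап": "N", "бала": "N"}
AFFIXES = {"лар": "PL", "тар": "PL", "ға": "DAT"}


def analyze(word: str) -> Optional[str]:
    """Strip known affixes from the right until a registry root remains.

    Returns a string such as 'root+POS+TAG+TAG', or None when the word
    cannot be analyzed and must be left for manual markup.
    """
    tags = []
    stem = word.lower()
    while stem not in ROOTS:
        for affix, tag in AFFIXES.items():
            # Require a non-empty remainder so the loop always terminates.
            if stem.endswith(affix) and len(stem) > len(affix):
                stem = stem[: -len(affix)]
                tags.append(tag)
                break
        else:
            return None  # neither a root nor a known affix matched
    tags.reverse()  # affixes were collected right to left
    return "+".join([stem, ROOTS[stem]] + tags)


print(analyze("кітаптарға"))  # кітап+N+PL+DAT
print(analyze("балалар"))     # бала+N+PL
```

Like the analysis described above, this sketch works without reference to context, so homonymous forms and roots that happen to end in an affix-like string would still need the manual correction the abstract mentions.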

Author Biographies

S.K. Kulmanov, A. Baitursynuly Institute of Linguistics

– Candidate of Philology, Associate Professor

A.A. Zhanabekova, A. Baitursynuly Institute of Linguistics

– Doctor of Philology, Professor

N.M. Ashimbayeva

– Candidate of Philological Sciences

A.Z.-G. Bisengali, A. Baitursynuly Institute of Linguistics

– Doctor of Philosophy (PhD)

N.K. Shulenbayev, A. Baitursynuly Institute of Linguistics

– Master of Humanities

B.K. Kordabay, A. Baitursynuly Institute of Linguistics

– Master of Humanities

Downloads

PDF (Kazakh)

Published

2022-12-17

How to Cite

Kulmanov, S. K., Zhanabekova, A. A., Ashimbayeva, N. M., Bisengali, A. Z.-G., Shulenbayev, N. K., & Kordabay, B. K. (2022). Problems of morphological markup of words in corpus texts, and their inclusion in a computer program. Bulletin of L.N. Gumilyov Eurasian National University. PHILOLOGY Series, 140(3), 103–113. https://doi.org/10.32523/2616-678X-2022-140-3-103-113

Issue

Vol. 140 No. 3 (2022)

Section

Linguistics

License

The academic journal “Bulletin of L.N. Gumilyov Eurasian National University. Philology Series” adheres to an Open Access policy for all published materials, based on the principle of free and equitable dissemination of scholarly knowledge. The Editorial Board believes that open access to research results contributes to the advancement of philological science, strengthens academic communication, and promotes the integration of national research into the international scientific community.

1. Free and Open Access

All articles published in the journal are made openly available on the official website of the journal and are accessible to all users without restrictions, registration, or payment.

Users are entitled to:

freely read and download materials;
copy and distribute the texts of publications;
print articles;
use materials for scientific and educational purposes, provided that proper attribution is given to the author(s) and the original source of publication.

2. Licensing

Journal materials are distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license:
https://creativecommons.org/licenses/by-nc/4.0/

This license permits the use, copying, distribution, and adaptation of the materials for non-commercial purposes, provided that appropriate credit is given to the author(s) and a link to the original publication is included.

3. Benefits of Open Access

The Open Access policy ensures:

increased visibility and citation of scholarly publications;
prompt dissemination of research findings in the fields of philology, linguistics, literary studies, and translation studies;
expansion of international academic cooperation;
access for readers to up-to-date scientific information without financial or technical barriers.

The Editorial Board is committed to ensuring transparency in editorial processes, maintaining high standards of peer review, and providing broad accessibility to research outcomes in the field of philological studies.