Word-based largest chunks for Agreement Groups processing: Cross-linguistic observations

László Drienkó

SZSZC-Jáky Székesfehérvár, Hungary


The present study reports results from a series of computer experiments seeking to combine word-based Largest Chunk (LCh) segmentation and Agreement Groups (AG) sequence processing. The AG model is based on groups of similar utterances that enable combinatorial mapping of novel utterances. LCh segmentation is concerned with cognitive text segmentation, i.e. with detecting word boundaries in a sequence of linguistic symbols. Our observations are based on the text of Le petit prince (The little prince) by Antoine de Saint-Exupéry in three languages: French, English, and Hungarian. The data suggest that word-based LCh segmentation is not very efficient with respect to utterance boundaries; however, it can provide useful word combinations for AG processing. Typological differences between the languages are also reflected in the results.


cognitive computer modelling, segmentation, syntactic processing, language acquisition

Drienkó, L. (2020). Word-based largest chunks for Agreement Groups processing: Cross-linguistic observations. Linguistics Beyond and Within (LingBaW), 6(1), 60–73.

László Drienkó 
SZSZC-Jáky Székesfehérvár http://orcid.org/0000-0002-6749-2017


