The Effect of Encoder and Decoder Stack Depth of Transformer Model to Performance of Machine Translator for Low-resource Languages

Heryadi, Yaya; Tho, Cuk; Wijanarko, Bambang Dwi; Murad, Dina Fitria; Hashimoto, Kiyota

doi:10.46254/AP03.20220479

Track: IoT

Abstract

Automated language translation has wide potential applications, especially in multilingual countries. Research on low-resource languages is crucial for making information accessible to people living in less-connected and technologically underdeveloped areas, making more digital content available, and making Natural Language Processing models more accessible to low-resource languages. The vanilla transformer model has achieved excellent performance on machine translation tasks. Despite its high performance, the model has adjustable hyperparameters, such as the depth of the encoder and decoder stacks. This paper presents exploratory results on the effect of encoder-decoder stack depth on the performance of the vanilla transformer model as a neural machine translator for the Bahasa Indonesia-Sundanese language pair. The empirical results of fine-tuning a pretrained vanilla transformer model show that the average performance of the model with a stack depth of 2, 4, or 6 is higher than that of the model with a stack depth of 8. The highest performance, achieved by the transformer model with a stack depth of 2, is 0.99 average training accuracy, 0.97 average validation accuracy, and 0.99 average testing similarity. Interestingly, according to non-parametric significance tests at the 95% confidence level, there is no significant difference in performance among vanilla transformer models with stack depths of 2, 4, 6, and 8. These results suggest that a vanilla transformer with a shallower stack is favourable for machine translation because it has fewer model parameters yet delivers acceptable performance. The experimental results indicate that the vanilla transformer model with a stack depth of 2 merits further exploration.

Keywords

Neural Machine Translation, Transformer Model, Natural Language Processing, Encoder, Decoder, Stack Depth.
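
The abstract's point that a shallower stack carries fewer parameters can be illustrated with a rough count. The sketch below is not taken from the paper: it assumes the layer layout of the standard vanilla transformer (one self-attention block per encoder layer, self-attention plus cross-attention per decoder layer, a two-layer feed-forward block, and layer normalization after each sub-layer), and the values d_model = 512 and d_ff = 2048 are illustrative assumptions, not the authors' reported settings. Embedding and output-projection parameters are left out because they do not depend on depth.

```python
def attention_params(d_model):
    # Query, key, value, and output projections: four d_model x d_model
    # weight matrices, each with a bias vector of length d_model.
    return 4 * (d_model * d_model + d_model)


def feed_forward_params(d_model, d_ff):
    # Two linear layers (d_model -> d_ff -> d_model), each with a bias.
    return (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)


def layer_norm_params(d_model):
    # One scale and one shift vector.
    return 2 * d_model


def encoder_layer_params(d_model, d_ff):
    # Self-attention + feed-forward, each followed by a layer norm.
    return (attention_params(d_model)
            + feed_forward_params(d_model, d_ff)
            + 2 * layer_norm_params(d_model))


def decoder_layer_params(d_model, d_ff):
    # Self-attention + cross-attention + feed-forward, three layer norms.
    return (2 * attention_params(d_model)
            + feed_forward_params(d_model, d_ff)
            + 3 * layer_norm_params(d_model))


def stack_params(depth, d_model=512, d_ff=2048):
    # Assumed sizes (hypothetical, not from the paper): d_model=512, d_ff=2048.
    # The encoder and decoder are assumed to share the same depth.
    per_depth = (encoder_layer_params(d_model, d_ff)
                 + decoder_layer_params(d_model, d_ff))
    return depth * per_depth


if __name__ == "__main__":
    for depth in (2, 4, 6, 8):
        print(f"stack depth {depth}: {stack_params(depth):,} stack parameters")
```

Under these assumptions the depth-dependent parameter count grows linearly, so a model with a stack depth of 8 carries four times as many stack parameters as a model with a stack depth of 2.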

Publisher: IEOM Society International
