Investigating the Impacts of LSTM-Transformer on Classification Performance of Speech Emotion Recognition

thesis
posted on 2024-07-12, 20:45 authored by Felicia Andayani
Speech Emotion Recognition (SER) is the task of recognizing emotions from features extracted from speech signals. This research designs and develops an LSTM-Transformer hybrid model for SER that learns the long-term dependencies in speech signals, and investigates the model's impact on SER classification performance. The resulting recognition accuracy shows that the hybrid model can learn temporal information from the frequency distributions captured by the Mel-Frequency Cepstral Coefficients (MFCCs) of each emotion, on both language-independent and language-dependent datasets.

Thesis type

  • Thesis (Masters by research)

Thesis note

A thesis submitted in fulfilment of the requirements for the degree of Master of Science (Research) performed at Swinburne University of Technology, Sarawak, June 2022.

Copyright statement

Copyright © 2022 Felicia Andayani.

Supervisors

Lau Bee Theng

Language

eng
