Speech processing
Information collected from users is another important input to the SMART system. Such information is usually gathered through textual interfaces (such as Twitter). In SMART we go a step further by allowing users to send us voice messages. We are currently working on two aspects of this technology:
- Speech transcription: Speech recordings and textual data will be used for training models for the speech transcription engine.
- Speaker verification: Some of the speech recordings will also be used to train and test the speaker verification engine. In parallel, the IBM team is improving the existing speaker verification algorithms and adding new ones to the engine. Recently, a light JFA (Joint Factor Analysis) module and a light i-vector module were added to the speaker verification engine. These modules achieve a verification rate almost as good as that of the classical JFA and i-vector methods while requiring far fewer computational resources.
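
To make the verification step concrete, the following is a minimal sketch of how a claimed identity might be checked once two recordings have been reduced to fixed-length vectors such as i-vectors. It uses cosine scoring, a common scoring method for i-vectors; the text above does not say which scoring method the SMART engine uses, so this is an assumption. The function names, the example vectors, and the threshold value are hypothetical and chosen only for illustration.

```python
import math


def cosine_score(enrolled, test):
    """Cosine similarity between two speaker vectors (e.g. i-vectors).

    Assumption: both recordings have already been reduced to
    fixed-length vectors by some front end, which is not shown here.
    """
    if len(enrolled) != len(test):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(enrolled, test))
    norm_e = math.sqrt(sum(a * a for a in enrolled))
    norm_t = math.sqrt(sum(b * b for b in test))
    if norm_e == 0.0 or norm_t == 0.0:
        raise ValueError("cannot score a zero vector")
    return dot / (norm_e * norm_t)


def verify(enrolled, test, threshold=0.5):
    """Accept the claimed identity if the score reaches the threshold.

    The threshold here is a hypothetical placeholder; in practice it
    would be tuned on held-out recordings.
    """
    return cosine_score(enrolled, test) >= threshold


if __name__ == "__main__":
    enrolled_vec = [0.2, -0.1, 0.7]      # hypothetical enrollment vector
    same_speaker = [0.25, -0.05, 0.65]   # hypothetical test vector
    print(verify(enrolled_vec, same_speaker))
```

Because the score depends only on the angle between the two vectors, a decision of this kind costs a single dot product and two norms per trial, which fits the goal of keeping computational requirements low.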