Speech Recognition with Python

A practical introduction to speech recognition: theory, concepts and implementations

Javier Jorge

Algorithms, Human-Machine-Interaction, Machine-Learning

Nowadays we are surrounded by devices that can listen to us, such as Alexa, Siri, and Cortana, and interacting with them has become steadily easier and more intuitive. The first challenge in communicating colloquially with these devices is converting the voice signal to text. To do this, several approaches based on search methods, algorithmic techniques, and machine learning are combined in clever and interesting ways.

In this talk, I introduce the speech recognition systems that underlie these devices. This will be illustrated with a guided example in which we develop a system to recognize isolated words in Python.

Finally, I will show how we are implementing these and more advanced techniques in our production systems, which provide transcriptions for different companies and institutions and use Python in different parts of the process.

Type: Talk (30 mins); Python level: Intermediate; Domain level: Beginner


Javier Jorge

Universidad Politécnica de Valencia

I'm a PhD student in Computer Science at Universidad Politécnica de Valencia (UPV). I received a B.Sc. in Computer Science from the UPV in 2014 and a Master's degree in Artificial Intelligence, Pattern Recognition and Digital Imaging (MIARFID) from the UPV in 2015. Currently, I'm finishing a Master's degree in Parallel and Distributed Computing while completing the PhD program.

I work as a researcher with the "Machine Learning and Language Processing" Group (MLLP) of the UPV, and my research interests include Computer Vision, Natural Language Processing, Pattern Recognition and Machine Learning. More information can be found on my website (http://jjorge.es), my Google Scholar profile (https://scholar.google.com/citations?user=qKPZU50AAAAJ&hl=en) or LinkedIn (https://www.linkedin.com/in/javier-jorge-cano-0555a721/).