Speech recognition is the process of converting spoken audio into text. It is used in voice assistants such as Alexa and Siri. In Python, the third-party SpeechRecognition library lets us convert audio into text for further processing.
Currently, SpeechRecognition supports the following file formats:
- WAV: must be in PCM/LPCM format
- AIFF
- AIFF-C
- FLAC: must be native FLAC format; OGG-FLAC is not supported
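To make the WAV and FLAC constraints above concrete, here is a minimal sketch that inspects a file's first bytes before handing it to the library. The helper names `sniff_audio_format` and `transcribe` are hypothetical and not part of SpeechRecognition. The header check assumes the common WAV layout where the `fmt ` chunk comes first, and it does not cover other formats. The `transcribe` function assumes the package is installed (`pip install SpeechRecognition`) and that network access is available for the Google Web Speech recognizer.

```python
import struct


def sniff_audio_format(header: bytes) -> str:
    """Roughly classify an audio file from its first bytes (hypothetical helper)."""
    if header[:4] == b"fLaC":
        return "flac"  # native FLAC stream marker
    if header[:4] == b"OggS":
        return "ogg"  # OGG container, which is how OGG-FLAC files start
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        # Assumption: the "fmt " chunk directly follows the RIFF header.
        # Its first field is a little-endian 16-bit format tag; 1 means PCM.
        if header[12:16] == b"fmt " and len(header) >= 22:
            (format_tag,) = struct.unpack("<H", header[20:22])
            # Simplification: extensible headers (tag 0xFFFE) count as non-PCM here.
            return "wav-pcm" if format_tag == 1 else "wav-non-pcm"
        return "wav-unknown"
    return "unknown"


def transcribe(path: str) -> str:
    """Transcribe a WAV or FLAC file after a basic format check (a sketch)."""
    # Third-party import kept local so the format check works without the package.
    import speech_recognition as sr

    with open(path, "rb") as f:
        kind = sniff_audio_format(f.read(64))
    if kind not in ("wav-pcm", "flac"):
        raise ValueError(f"unsupported or unrecognized audio format: {kind}")

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    return recognizer.recognize_google(audio)  # sends the audio to a web API
```

Calling `transcribe("speech.wav")` on a PCM WAV recording (the file name is a placeholder) would return the recognized text as a string, while an OGG-FLAC file would be rejected by the check before the library is invoked.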
To learn more about SpeechRecognition, see the library's documentation.