I was looking for an application that could take an audio file as input, extract the speech, and write it to a text file. OpenAI's whisper does exactly that.
Instructions for installing whisper are found at https://github.com/openai/whisper.
Using Jupyter Notebook, I have about a dozen Python projects, each with its own environment, and each environment carries Jupyter Notebook and a ton of modules.
If you install a Python module outside of a specific environment, it installs into your user site-packages under your home directory and is generally available.
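If you want to see exactly where those user-level modules land, Python can print the directory itself; the output shown is just the typical default on Linux, yours may differ:
python3 -m site --user-site
which prints something like /home/jerry/.local/lib/python3.x/site-packages.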
However, whisper is a big app and I didn't want or need to install it in all of my projects, so from my home account I issued
python3 -m pip install -U openai-whisper
and a few minutes later it was installed without issues.
pip list
showed whisper in my list of modules. It also showed several nvidia driver packages. It seems that whisper is designed to use an nvidia GPU if it can find one; otherwise it defaults to using the CPU.
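Whisper does this through PyTorch, so one quick sanity check (just a sketch of what I'd run, not anything whisper requires) is to ask PyTorch which device it can see:
import torch  # installed as a whisper dependency

# True means an nvidia GPU (CUDA) is usable; otherwise whisper falls back to the CPU
print(torch.cuda.is_available())
print("cuda" if torch.cuda.is_available() else "cpu")  # the device whisper should pick by default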
pip show openai-whisper
showed the modules that whisper requires and the modules that require it.
Then I tried it out on an mp3 file:
whisper /home/jerry/Videos/2016_12_24_8_christmas_service.mp3 --model medium --language English
and a few minutes later I had the text of the speech on the mp3. It was 100% accurate.
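The same thing can be done from Python using the package's documented API; here is a minimal sketch using the same file and model I used above (the transcript.txt name is just my choice for this example):
import whisper

model = whisper.load_model("medium")  # downloads the model weights on first use
result = model.transcribe("/home/jerry/Videos/2016_12_24_8_christmas_service.mp3")
print(result["text"])  # the transcribed speech as one long string

# write the transcript out, similar to what the command-line tool does
with open("transcript.txt", "w") as f:
    f.write(result["text"])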
After reading the documentation I discovered that English is the default language, so specifying it is unnecessary. Also, "medium.en" is an English-only model designed specifically to speed the tool up on files spoken in English.
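So, assuming the same file, the command could presumably be shortened to:
whisper /home/jerry/Videos/2016_12_24_8_christmas_service.mp3 --model medium.en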
The full list of options is given using
whisper --help