Mozilla Voice STT
Mozilla Voice STT (formerly the research project “Deep Speech”) is an advanced open-source Speech-to-Text engine that aims to make speech recognition technology openly available to developers.
Supported by a community of like-minded developers, companies, and researchers, we have applied sophisticated machine learning techniques and a variety of innovations to build a deep learning-based STT engine that approaches human accuracy. Implemented with Google’s TensorFlow framework, it can run on anything from an offline Raspberry Pi 4 to a server-class machine, obviating the need to pay patent royalties or exorbitant fees for existing STT services.
Together with the growing Common Voice dataset, we believe this technology can and will enable a wave of innovative products and services, and that it should be available to everyone.
Mozilla Voice STT is a vibrant open-source tool. Building on this foundation guarantees future developer support and continued performance optimizations. Its architecture allows for easy localization: adapting it to a particular language requires only data, not advanced linguistic knowledge. This is in stark contrast to more traditional STT architectures, where the need for advanced linguistic knowledge hinders adoption for under-resourced languages.
If that doesn’t cover what you’re looking for, you can also use our discussion forum to engage with the rest of the community.