It just dawned on me while using my Apple Watch that you could improve dictation software with a simple tweak, probably really easily. You create a dataset targeted at people who have speech impediments, probably broken out by type (e.g., a lisp versus a stammer), where the training data is generated by people correcting text produced by the existing dictation software. That is, you use whatever dictation software you have, then build a new dataset mapping its outputs to the corrected text. I wouldn’t even call this machine learning; it’s just a simple added layer that accounts for the fact that the user has a speech impediment. In fact, this should probably be a feature in any dictation device, since someone could just have a funny voice that doesn’t map well to the training data, or, more generally, a voice the software simply doesn’t handle well for whatever reason. It’s a trivial adjustment that could have real utility.
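Here’s a minimal sketch of what I mean, assuming a hypothetical setup where you can see the dictation engine’s raw output and log the user’s correction (the class and method names are made up for illustration, not any real dictation API):

```python
# Minimal sketch of the "correction layer" idea: record what the dictation
# engine produced alongside what the user actually meant, then reuse those
# corrections on future transcripts. All names here are hypothetical.
from collections import Counter


class CorrectionLayer:
    def __init__(self):
        # Maps a misrecognized phrase to a counter of user-supplied fixes.
        self.corrections: dict[str, Counter] = {}

    def record(self, raw: str, fixed: str) -> None:
        """Store one (dictation output, user correction) pair."""
        self.corrections.setdefault(raw.lower(), Counter())[fixed] += 1

    def apply(self, transcript: str) -> str:
        """Rewrite a new transcript using the most common past fix for each phrase."""
        out = transcript
        for raw, fixes in self.corrections.items():
            if raw in out.lower():
                best_fix = fixes.most_common(1)[0][0]
                # Naive case-insensitive replacement; a real system would align words.
                idx = out.lower().find(raw)
                out = out[:idx] + best_fix + out[idx + len(raw):]
        return out


# Usage: the user dictates, sees the raw output, and corrects it once;
# the fix is applied automatically the next time the same phrase shows up.
layer = CorrectionLayer()
layer.record("thimple", "simple")        # e.g., a lisp mapping s -> th
print(layer.apply("a thimple tweak"))    # -> "a simple tweak"
```

Obviously a real product would want something smarter than literal phrase substitution, but the point stands: the per-user correction data does most of the work.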
I did a quick Google search, and it looks like a simple version of this product doesn’t exist; instead, people are tackling it with machine learning. There’s no way this is a hard problem, especially with machine learning. A lot of people stutter, so there should be demand for exactly this software. I don’t have time to implement it (in particular because it requires people to generate the dataset), so you can keep this one. There it is, public domain.