Startup harnesses self-supervised learning to tackle speech recognition biases

The technique dramatically increased the software’s pool of training data

Training school

The advantage of self-supervised models is that they don’t require all their training data to be labeled by humans. As a result, they can enable AI systems to learn from a much larger pool of information.
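The core idea can be sketched in a few lines. This is a hypothetical illustration, not Speechmatics' actual code: in self-supervised learning, the training targets are derived from the raw data itself (here, each sample is predicted from the samples before it), so untranscribed audio becomes usable training material without human labeling.

```python
def make_self_supervised_pairs(signal, context=3):
    """Turn an unlabeled sequence into (input, target) training pairs:
    each target is simply the next sample, to be predicted from the
    preceding `context` samples. No human-written labels are needed."""
    pairs = []
    for i in range(len(signal) - context):
        pairs.append((signal[i:i + context], signal[i + context]))
    return pairs

# Any raw, untranscribed audio-like sequence becomes training data "for free".
unlabeled_audio = [0.1, 0.4, 0.35, 0.2, 0.6, 0.55]
pairs = make_self_supervised_pairs(unlabeled_audio)
# pairs[0] is ([0.1, 0.4, 0.35], 0.2)
```

Because the labels come from the data rather than from annotators, the size of the usable training pool is bounded only by how much raw audio can be collected.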

This helped Speechmatics increase its training data from around 30,000 hours of audio to around 1.1 million hours.

Will Williams, the company’s VP of machine learning, told TNW that the approach improved the software’s performance across a variety of speech patterns.

Learning like a child

One of the technique’s benefits was closing Speechmatics’ accuracy gap across speakers of different ages.

Based on the open-source project Common Voice, the software had a 92% accuracy rate on children’s voices. The Google system, by comparison, had an accuracy of 83.4%.

Williams said enhancing the recognition of kids’ voices was never a specific objective.

That doesn’t mean that self-supervised learning alone can eliminate AI biases. Allison Koenecke, the lead author of the Stanford study, noted that other issues also need to be addressed.

Story by Thomas Macaulay

Thomas is a senior reporter at TNW. He covers European tech, with a focus on AI, cybersecurity, and government policy.
