Many new companies start out with a clever idea for gathering and using data but lack the expertise to analyze it themselves. Whether these companies achieve their stated goals depends on how well they use that data. Effective algorithms let a company deliver its service efficiently, while poor ones lead to high error rates and low customer satisfaction.
One SFL client was a startup trying to detect sleep apnea from audio data. Many cases of sleep apnea go undiagnosed because there is no blood test for it and symptoms occur only while the patient is asleep. If an app on a person's phone can flag that person as a potential sleep apnea case, a commonly underdiagnosed condition can be treated much more effectively, and many people can be spared costly, time-consuming visits to sleep clinics. This kind of product succeeds or fails on its ability to use its data correctly and efficiently for its stated purpose: diagnosing sleep apnea.
SFL works with clients to understand their specific data needs and data sources. Many startups have novel ideas for sources and uses of data, which calls for developing new techniques and intelligently repurposing existing algorithms. In this case, SFL was given raw audio recordings of people sleeping, in the form of MP3 files. These eight-hour audio clips had been tagged by medical experts for potential sleep apnea events. The first step was to split each recording into sliding, overlapping two-second windows, each of which was analyzed separately. We then computed features on the raw audio using a combination of frequency-domain and amplitude-domain transformations. These features included hand-crafted rules, such as the percentage of time spent above a threshold, and algorithmically generated ones, such as mel-frequency cepstral coefficients (MFCCs). With these features, a supervised learning algorithm learned which regions of the feature space corresponded to apnea-like events. After extensive feature extraction, generation, and refinement, we were able to build an accurate model that detects apnea with a very low false positive rate.
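The windowing and feature steps described above can be sketched roughly as follows. This is a minimal illustration and not the pipeline SFL built: the one-second hop, the amplitude threshold of 0.1, the function names, and the choice of RMS energy and spectral centroid as stand-in features are all assumptions made for the example. MFCCs are left out to keep the sketch dependent on NumPy alone, and the audio is assumed to be already decoded into a one-dimensional array of samples.

```python
import numpy as np


def sliding_windows(signal, sample_rate, window_s=2.0, hop_s=1.0):
    """Split a 1-D signal into overlapping windows.

    window_s=2.0 follows the two-second windows in the text;
    hop_s=1.0 (50% overlap) is an assumed value.
    Returns an array of shape (n_windows, window_len).
    """
    win = int(window_s * sample_rate)
    hop = int(hop_s * sample_rate)
    if len(signal) < win:
        return np.empty((0, win))
    starts = range(0, len(signal) - win + 1, hop)
    return np.stack([signal[s:s + win] for s in starts])


def window_features(window, sample_rate, amp_threshold=0.1):
    """Compute a few illustrative features for one window.

    amp_threshold is a hypothetical value chosen for the example.
    """
    # Hand-crafted rule: fraction of samples whose magnitude exceeds a threshold.
    frac_above = np.mean(np.abs(window) > amp_threshold)
    # Amplitude-domain feature: root-mean-square energy.
    rms = np.sqrt(np.mean(window ** 2))
    # Frequency-domain feature: spectral centroid in Hz.
    spectrum = np.abs(np.fft.rfft(window))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / sample_rate)
    total = spectrum.sum()
    centroid = float((freqs * spectrum).sum() / total) if total > 0 else 0.0
    return np.array([frac_above, rms, centroid])


def extract_features(signal, sample_rate):
    """Return one feature row per window, in time order."""
    windows = sliding_windows(signal, sample_rate)
    return np.array([window_features(w, sample_rate) for w in windows])
```

Each row of the resulting matrix describes one two-second window. Paired with the experts' tags as labels, rows like these are the kind of input a supervised classifier would be trained on.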
Startups that solve their data problems are able to live up to the potential of their idea, both in profitability and in making positive changes in the world. The model described above was good enough to achieve FDA approval as a diagnostic technique. This allowed the app to tell users in the morning whether they had experienced apnea the night before, dramatically cutting the time between developing sleep apnea and having it diagnosed and treated. Further, apps like this that enable remote diagnostics could save hundreds of millions of dollars and change how medical care is delivered.