Disruptive developments in modern speech technology and how they may transform healthcare for the better

Communication is key and always will be.

Language and verbal communication as a whole serve as humans’ main tool for social interaction, yet they are incredibly error-prone. Misspeaking, misunderstanding and being lost for words are part of our daily experience, and that is before considering the immense barriers created by the roughly 7000 different languages spoken worldwide¹. In the medical field, communication mistakes are suggested to be a major factor among all human errors², with inadequate verbal communication often being the root cause of impaired patient safety³.

Fellow pioneers in technology-assisted communication

Today, modern technology is used to optimise interpersonal communication in most areas of life, a development that has accelerated in recent years with the growing use of artificial intelligence. One example is Facebook’s unveiling of its AI-based speech recognition technology, wav2vec 2.0.

Images retrieved from: https://ai.facebook.com/blog/wav2vec-state-of-the-art-speech-recognition-through-self-supervision/ 

Traditional speech recognition systems rely on large amounts of transcribed audio to learn how to turn speech into text⁴. Such datasets are difficult to gather for many languages, which is why the wav2vec system was built to be self-supervised and to learn from unannotated data⁵. This opens up new prospects in many application areas, such as the continued development of speech translation systems⁵.

Similarly, Fujitsu’s new multilingual speech translation device seems highly promising and could enable doctors and patients to overcome language barriers in clinical settings. With the help of multidirectional microphones, this hands-free device picks up users’ voices as well as their positions and uses pauses in speech to translate immediately between conversation partners⁶. Fujitsu’s translation service was developed specifically with the healthcare context in mind, which is reflected in its hands-free design and its high accuracy in environments with background noise⁷.

Fujitsu’s multilingual, hands-free speech translation device; Retrieved from: https://www.fujitsu.com/global/about/resources/featurestories/2017112001.html
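
As a rough illustration of how pauses in speech can mark the end of one speaker’s turn, the sketch below splits a stream of per-frame loudness values into separate utterances. It is not Fujitsu’s implementation: the function name `split_on_pauses`, the silence threshold and the minimum pause length are all assumptions chosen for this example.

```python
from typing import List, Tuple


def split_on_pauses(
    energies: List[float],
    silence_threshold: float = 0.1,  # assumed loudness cut-off for "silence"
    min_pause_frames: int = 3,       # assumed pause length that ends a turn
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges of utterances, with end exclusive.

    An utterance is closed once at least `min_pause_frames` consecutive
    frames fall below `silence_threshold`.
    """
    segments: List[Tuple[int, int]] = []
    start = None        # first voiced frame of the current utterance
    last_voiced = None  # most recent voiced frame
    silent_run = 0      # consecutive silent frames since last_voiced

    for i, energy in enumerate(energies):
        if energy >= silence_threshold:
            if start is None:
                start = i
            last_voiced = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                # The pause is long enough: close the utterance.
                segments.append((start, last_voiced + 1))
                start = None
                silent_run = 0

    # Close an utterance that runs to the end of the recording.
    if start is not None:
        segments.append((start, last_voiced + 1))
    return segments


# Hypothetical loudness values: a short dip (frame 3) does not end the turn,
# but the three silent frames (5 to 7) do.
frames = [0.0, 0.5, 0.6, 0.05, 0.7, 0.0, 0.0, 0.0, 0.4, 0.3, 0.0]
print(split_on_pauses(frames))  # [(1, 5), (8, 10)]
```

In a real device, each detected utterance would then be passed on to speech recognition and translation. The choice of threshold and pause length trades responsiveness against the risk of cutting a speaker off mid-sentence.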

Medudoc’s digital solution for better doctor-patient communication

At medudoc, we too believe that adequate language plays a vital role in doctor-patient interactions. These interpersonal exchanges not only serve to transfer knowledge and information on an intellectual level; they are also the means to communicate empathy, to reassure worried patients and to build trust. Thus, there is a need for clear, understandable information in the patient’s native language and in accordance with their individual level of education. However, the appropriate choice of words and a sense of empathy, alongside an equally important professional distance, matter just as much. Such sophistication is difficult for anybody to achieve, but especially so in busy clinic settings with high workloads and low procedural flexibility. This seems paradoxical, since it is difficult to imagine a context in which a clear understanding of facts would be more important than when discussing potentially life-altering medical procedures.

For this reason, medudoc aims to bridge communication gaps by providing intelligible patient education videos. Our digital solution is built on genuine patient-centred care and practical doctor-centred development. While our videos are personalised, with the treating physician tailoring them to each individual patient, they also help to standardise the education process and save physicians valuable time by allowing more focused in-person conversations.

Thus, by improving communication, medudoc offers the key to a clinical work environment with satisfied patients and unburdened doctors.

If we have caught your interest, please find further information about medudoc and our purpose in the world of digital health here.

About the author:

Evelyn Lange works as a Medical Education Writer at the digital health start-up “medudoc” in Berlin. With a background in psychology, she is now working on bridging the digital gap between doctors and patients through the creation of intelligible medical education content.

You can contact Evelyn via e-mail and LinkedIn.

References:

¹Eberhard, D. M., Simons, G. F., & Fennig, C. D. (2021). Ethnologue: Languages of the World. Retrieved from: https://www.ethnologue.com/guides/how-many-languages

²Brindley, P., & Reynolds, S. (2011). Improving verbal communication in critical care medicine. Journal of Critical Care, 26(2), 155–159.

³Rabøl, L. I., Andersen, M. L., Østergaard, D., Bjørn, B., Lilja, B., & Mogensen, T. (2011). Descriptions of verbal communication errors between staff. An analysis of 84 root cause analysis-reports from Danish hospitals. BMJ Quality & Safety, 20(3), 268–274.

⁴Synced. (2020). Facebook AI Wav2Vec 2.0: Automatic Speech Recognition From 10 Minute Sample. Retrieved from: https://syncedreview.com/2020/09/25/facebook-ai-wav2vec-2-0-automatic-speech-recognition-from-10-minute-sample/

⁵Baevski, A., Conneau, A., & Auli, M. (2020). Wav2vec 2.0: Learning the structure of speech from raw audio. Retrieved from: https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/

⁶Fujitsu. (2021). A Wearable Speech Translation Device That Translates Voices by Easy, Hands-free Operation. Retrieved from: https://www.fujitsu.com/global/about/resources/featurestories/2017112001.html

⁷The Japan Times. (2021). Fujitsu releases hands-free speech translation service. Retrieved from: https://www.japantimes.co.jp/news/2021/05/14/business/corporate-business/fujitsu-translation-service/