The continued rise in health care costs for much of the 21st century has put pressure on efficiency and
development at the national level. Attempts are being made to harness information technology, at least in
part, to curb rising costs. Artificial intelligence in particular, and the opportunities it brings, is still almost
at the starting line in healthcare. Automatic speech recognition is considered one of the artificial
intelligence-based solutions.
Speech recognition in this context refers to the conversion of spoken words and sounds into text using
artificial intelligence. In practice, the professional speaks and the computer listens and converts the speech
it hears into text almost immediately on the computer screen, for example directly in the patient information
system. The purpose of the solution is to speed up the making of patient record entries and to simplify the
process so that in the future no separate transcriptionists would be needed to type up dictations. A
professional can check their own text as soon as it has been dictated.
Put briefly and simply, automatic speech recognition works as follows: the software receives sound, uses
ready-made algorithms to break it into very short parts, and then combines these into words and sentences
in text format. The software also learns how the speaker speaks, which speeds up processing.
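The first step described above, cutting the incoming sound into very short parts, can be sketched as follows. This is only a minimal illustration and not how any particular product works: the function name split_into_frames, the 25 ms part length and the 10 ms step are assumptions chosen for the example, and the later stage that turns the parts into words is left out entirely.

```python
from typing import List, Sequence


def split_into_frames(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 25,  # assumed length of one short part, illustrative only
    step_ms: int = 10,   # assumed distance between the starts of two parts
) -> List[List[float]]:
    """Cut an audio signal into short, overlapping parts (frames)."""
    frame_len = sample_rate * frame_ms // 1000
    step_len = sample_rate * step_ms // 1000
    if frame_len <= 0 or step_len <= 0:
        raise ValueError("frame and step must each cover at least one sample")

    frames: List[List[float]] = []
    # Only full frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, step_len):
        frames.append(list(samples[start:start + frame_len]))
    return frames


if __name__ == "__main__":
    one_second = [0.0] * 16000  # one second of silence at 16 kHz
    frames = split_into_frames(one_second, sample_rate=16000)
    print(len(frames), len(frames[0]))  # prints: 98 400
```

With these assumed values, one second of sound becomes 98 overlapping parts of 400 samples each. A real system would pass each part on to a trained model rather than simply collecting them in a list.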
In Finland, speech recognition in the social and health sector has been used mainly in certain special fields,
above all in radiology. However, its use has now expanded nationwide and an increasing number of social and
health professionals have at least tried speech recognition. Traditionally, the idea has been to replace the
transcription stage, in which the dictated audio recording is typed up into a patient record. Nowadays, uses
for speech recognition have also been identified among other professionals, such as in social work, where
customer documentation can be long and, depending on the professional’s typing ability, also
time-consuming.
Documentation is one of the most time-consuming tasks for professionals. Speech recognition has been
sought as a solution, and according to some studies, the workload in clinical work is reduced when
documentation can be done using speech recognition. However, most of the work that professionals
document takes place in direct interaction with the patient. In the future, automated speech recognition
systems may extract clinically important information from what happens throughout the appointment and
document it in the patient information system.
Speech recognition can also be used for purposes other than documentation. For example, in residential
services or home care, voice-guided support functions can respond to a spoken command from a
professional or resident. This can be, for example, a call to an on-call nurse initiated by a voice command.
The possibilities in this regard are vast, and Amazon’s Alexa, familiar from home automation, is already used
in part for this purpose.
Existing speech recognition systems in healthcare are also already able to build sequences of actions. At its
simplest, this means that the software is first given a voice command, such as “shut down the computer”.
The software is then taught step by step what to do with this command. After that, when the user says
aloud “shut down the computer,” the software shuts down the computer by following the taught steps.
From here, scripts can be taken as far as your own expertise and the software’s support allow.
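The command sequences described above can be pictured as a lookup from a spoken phrase to an ordered list of taught steps. The sketch below is hypothetical: the class VoiceMacros, its methods and the example steps are invented for illustration and do not correspond to any real speech recognition product. The steps only report what they would do and do not actually shut anything down.

```python
from typing import Callable, Dict, List

# A step is any function that performs one action and reports what it did.
Step = Callable[[], str]


class VoiceMacros:
    """Hypothetical registry that maps a spoken phrase to taught steps."""

    def __init__(self) -> None:
        self._macros: Dict[str, List[Step]] = {}

    @staticmethod
    def _normalize(phrase: str) -> str:
        # Ignore case, surrounding spaces and trailing punctuation.
        return phrase.strip().lower().rstrip(".,!?")

    def teach(self, phrase: str, steps: List[Step]) -> None:
        """Store the ordered steps to run when the phrase is recognized."""
        self._macros[self._normalize(phrase)] = list(steps)

    def handle(self, recognized_text: str) -> List[str]:
        """Run the taught steps in order; unknown phrases do nothing."""
        steps = self._macros.get(self._normalize(recognized_text), [])
        return [step() for step in steps]


if __name__ == "__main__":
    macros = VoiceMacros()
    macros.teach(
        "shut down the computer",
        [
            lambda: "save open documents",
            lambda: "close applications",
            lambda: "power off",
        ],
    )
    print(macros.handle("Shut down the computer."))
```

Keeping the phrase matching separate from the steps means new commands can be taught without changing the recognition side, which is the idea behind extending such scripts further.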
In general, speech recognition in social and health care is mainly focused on documentation by certain
special groups. However, as technology advances, its use could be extended to other specialized sectors
as well as to social care. One might also ask what possibilities speech recognition offers beyond mere
documentation.
Author
Atte Nieminen, Digital Health student, Savonia UAS
Sources
BERGSTRÖM, Mikko. 2020. Puheentunnistusteknologian käyttö terveydenhuollon potilaskertomuksissa. Kandidaatin tutkielma. Jyväskylän yliopisto, informaatioteknologian tiedekunta. Available: https://jyx.jyu.fi/bitstream/handle/123456789/68390/1/URN%3ANBN%3Afi%3Ajyu-202003312604.pdf
GOSS, Foster; BLACKLEY, Suzanne; ORTEGA, Carlos; KOWALSKI, Adam; LIN, Chen-Tan; METEER, Marie; BAKES, Samantha; GRADWOHL, Stephen; BATES, David & ZHOU, Li. 2019. A clinician survey of using speech recognition for clinical documentation in the electronic health record. International Journal of Medical Informatics. Vol. 130.
MATVEINEN, Petri. 2021. Terveydenhuollon menot ja rahoitus 2019. THL Tilastoraportti 15/2021. Terveyden ja hyvinvoinnin laitos. Available: https://www.julkari.fi/bitstream/handle/10024/142578/Tr15_21.pdf?sequence=1&isAllowed=y
REPONEN, Jarmo; KANGAS, Maarit; HÄMÄLÄINEN, Päivi; KERÄNEN, Niina & HAVERINEN, Jari. 2018. Use of information and communications technology in Finnish health care in 2017. Current situation and trends. Terveyden ja hyvinvoinnin laitos (THL). National Institute for Health and Welfare (THL). Report 5/2018. 207 pages. Helsinki. ISBN 978-952-343-107-2 (printed); ISBN 978-952-343-108-9 (online publication). Available: https://www.julkari.fi/bitstream/handle/10024/136278/URN_ISBN_978-952-343-108-9.pdf?sequence=1&isAllowed=y
VOGEL, Markus; KAISERS, Wolfgang; WASSMUTH, Ralf & MAYATEPEK, Ertan. 2015. Analysis of Documentation Speed Using Web-Based Medical Speech Recognition Technology: Randomized Controlled Trial. Journal of Medical Internet Research. Vol. 17(11), e247. Available: http://www.jmir.org/2015/11/e247/