Data61 develops ‘voice liveness detection’ to protect against voice spoofing attacks

    Researchers at Data61, the digital specialist arm of CSIRO, have developed a new technique to protect consumers from voice spoofing attacks.

    Voice assistants offer consumers great conveniences, from playing music and controlling smart homes to shopping online, making phone calls and sending messages. Unfortunately, those conveniences often come with a risk.

    Fraudsters can record a person’s voice commands to assistants like Amazon Alexa or Google Assistant and replay them to impersonate that individual. They can also stitch samples together to mimic a person’s voice in order to spoof, or trick, third parties.

    The new solution, called Void (Voice liveness detection), can be embedded in a smartphone or voice assistant software and works by identifying the differences in spectral power between a live human voice and a voice replayed through a speaker, in order to detect when hackers are attempting to spoof a system. 

    Muhammad Ejaz Ahmed, Cybersecurity Research Scientist at CSIRO’s Data61 and lead author of the research paper, said privacy-preserving technologies are becoming increasingly important in enhancing consumer privacy and security as voice technologies become part of daily life.

    “Voice spoofing attacks can be used to make purchases using a victim’s credit card details, control Internet of Things connected devices like smart appliances and give hackers unsolicited access to personal consumer data such as financial information, home addresses and more.  

    “Although voice spoofing is known as one of the easiest attacks to perform, as it simply involves a recording of the victim’s voice, it is incredibly difficult to detect because the recorded voice has similar characteristics to the victim’s live voice. Void is game-changing technology that allows for more efficient and accurate detection, helping to prevent people’s voice commands from being misused.”

    Muhammad Ejaz Ahmed, Cybersecurity Research Scientist at CSIRO

    Unlike existing voice spoofing detection techniques, which typically use deep learning models, Void was designed around insights from spectrograms – a visual representation of the spectrum of frequencies of a signal as it varies with time – to detect the ‘liveness’ of a voice.

    This technique provides a highly accurate outcome, detects attacks eight times faster than deep learning methods, and uses 153 times less memory, making it a viable and lightweight solution that could be incorporated into smart devices.
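
    To make the idea of comparing spectral power more concrete, the sketch below builds a small spectrogram and measures what share of a signal's power sits below a cutoff frequency. It is only an illustration under stated assumptions, not Void's actual feature set or classifier: the function names, the 1,000 Hz cutoff, the frame sizes and the synthetic test tones are all hypothetical choices made for this example.

```python
import cmath
import math


def power_spectrum(frame):
    """Power at each non-negative frequency bin, using a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectrogram(signal, frame_len=128, hop=64):
    """Power spectra of overlapping, Hann-windowed frames (time x frequency)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(power_spectrum(frame))
    return frames


def low_band_power_share(signal, sample_rate, cutoff_hz,
                         frame_len=128, hop=64):
    """Fraction of total spectral power at or below cutoff_hz.

    Hypothetical feature for illustration only; a real detector would
    feed several such features into a trained classifier.
    """
    spec = spectrogram(signal, frame_len, hop)
    # Sum power over time for each frequency bin.
    per_bin = [sum(frame[k] for frame in spec)
               for k in range(frame_len // 2 + 1)]
    total = sum(per_bin)
    if total == 0:
        return 0.0
    cutoff_bin = int(cutoff_hz / (sample_rate / frame_len))
    return sum(per_bin[:cutoff_bin + 1]) / total


def tone_mix(sample_rate, n_samples, components):
    """Synthetic signal: a sum of sine tones given as (frequency_hz, amplitude)."""
    return [sum(a * math.sin(2 * math.pi * f * t / sample_rate)
                for f, a in components)
            for t in range(n_samples)]


# Two synthetic signals with the same tones but different power balance.
RATE = 8000
low_heavy = tone_mix(RATE, 1024, [(250, 1.0), (2000, 0.2)])
high_heavy = tone_mix(RATE, 1024, [(250, 0.2), (2000, 1.0)])

share_low_heavy = low_band_power_share(low_heavy, RATE, cutoff_hz=1000)
share_high_heavy = low_band_power_share(high_heavy, RATE, cutoff_hz=1000)
print(share_low_heavy, share_high_heavy)  # roughly 0.96 and 0.04
```

    The two synthetic signals contain the same pair of tones, yet the feature separates them clearly because their power is distributed differently across frequency. A lightweight detector could, in principle, apply a simple classifier to a handful of features of this kind rather than running a deep learning model.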

    Void has been tested using datasets from Samsung and the Automatic Speaker Verification Spoofing and Countermeasures challenges, achieving accuracies of 99% and 94% respectively.

    Research estimates that by 2023, as many as 275 million voice assistant devices will be used to control homes across the globe – a growth of 1,000% since 2018. 

    Data security expert Dr Adnene Guabtni, Senior Research Scientist at CSIRO’s Data61, shares tips for consumers on how to protect their data when using voice assistants: 

    • Always change your voice assistant settings to only activate the assistant using a physical action, such as pressing a button. 
    • On mobile devices, make sure the voice assistant can only activate when the device is unlocked. 
    • Turn off all home voice assistants before you leave your house, to reduce the risk of successful voice spoofing while you are out of the house. 
    • Voice spoofing requires hackers to get samples of your voice. Make sure you regularly delete any voice data that Google, Apple or Amazon store. 
    • Try to limit the use of voice assistants to commands that do not involve online purchases or authorisations – hackers or people around you might record you issuing payment commands and replay them at a later stage. 

    The paper, ‘Void: A fast and light voice liveness detection system’, was co-authored by Muhammad Ejaz Ahmed (CSIRO’s Data61), Il-Youp Kwak (Chung-Ang University), Jun Ho Huh and Iljoo Kim (Samsung Research), Taekkyung Oh (KAIST and Sungkyunkwan University) and Hyoungshick Kim (Sungkyunkwan University), and was published in USENIX Security 2020.

    The de-identified datasets from Samsung and Automatic Speaker Verification Spoofing and Countermeasures challenges included:  

    • 255,173 voice samples generated with 120 participants, 15 playback devices and 12 recording devices 
    • 18,030 publicly available voice samples generated with 42 participants, 26 playback devices and 25 recording devices 
    Jason Cartwright
    Creator of techAU, Jason has spent a dozen+ years covering technology in Australia and around the world. Bringing a background in multimedia and passion for technology to the job, Cartwright delivers detailed product reviews, event coverage and industry news on a daily basis. Disclaimer: Tesla Shareholder from 20/01/2021
