Forensic Perspective on Voice Biometrics and AI: A Review

Pragati Jain; Poorvi Ujjainia; Anshika Srivastava; Kajal Shrivastav; Ishu Rani; Akshat Vashisht; Rudranarayan Behera; Bhavika Moza; Debhjit Mukherjee

doi:10.32628/IJSRST2411581

Authors

Pragati Jain, Department of Forensic Science, University Institute of Applied Health Sciences, Chandigarh University, NH-95, Ludhiana, Punjab, India. ORCID: https://orcid.org/0009-0000-8666-7129
Poorvi Ujjainia, Department of Forensic Science, University Institute of Applied Health Sciences, Chandigarh University, NH-95, Ludhiana, Punjab, India. ORCID: https://orcid.org/0009-0000-1170-4589
Anshika Srivastava, Department of Forensic Science, College of Paramedical Sciences, Teerthanker Mahaveer University, Moradabad, Uttar Pradesh, India. ORCID: https://orcid.org/0000-0003-0132-1060
Kajal Shrivastav, Department of Forensic Science, University Institute of Applied Health Sciences, Chandigarh University, NH-95, Ludhiana, Punjab, India. ORCID: https://orcid.org/0000-0002-2819-421X
Ishu Rani, Department of Forensic Sciences, Chandigarh School of Business, Chandigarh Group of Colleges, Jhanjeri, Punjab, India. ORCID: https://orcid.org/0009-0001-7889-5401
Akshat Vashisht, Department of Computer Application, Silver Oak College of Computer Application (SOCCA), Silver Oak University, Ahmedabad, Gujarat, India. ORCID: https://orcid.org/0009-0005-2823-7952
Rudranarayan Behera, Department of Forensic Science, University Institute of Applied Health Sciences, Chandigarh University, NH-95, Ludhiana, Punjab, India. ORCID: https://orcid.org/0009-0004-7040-733X
Bhavika Moza, Department of Forensic Science, University Institute of Applied Health Sciences, Chandigarh University, NH-95, Ludhiana, Punjab, India. ORCID: https://orcid.org/0009-0001-9019-6267
Debhjit Mukherjee, Department of Forensic Sciences, Chandigarh School of Business, Chandigarh Group of Colleges, Jhanjeri, Punjab, India. ORCID: https://orcid.org/0009-0009-5673-2554

Keywords:

Voice Biometrics, Artificial Intelligence, Deepfake Voice Scams, Deep Learning, Neural Networks, Multilingual Recognition, Forensic Applications

Abstract

The modern internet has given rise to a range of voice-related crimes worldwide, notably deepfake voice scams in which perpetrators use artificial intelligence to forge a voice and deceive victims. This review discusses the advancements and challenges in voice biometrics, focusing on the impact of AI and deep learning on the field. It traces the evolution of voice biometrics from early methods to modern AI-enhanced techniques, highlighting significant improvements in accuracy, security, and adaptability. The key finding is that while AI-driven advances have addressed many challenges, including robustness and multilingual recognition, new threats such as deepfake audio require ongoing innovation. Integrating deep learning, neural networks, and advanced feature extraction has shown considerable potential for improving system resilience. However, voice variability, privacy concerns, and the forensic application of these technologies remain critical issues for future research. The review recommends multidisciplinary research to bridge the gap between voice biometrics and forensic science, and emphasizes the need for continued development to address and prevent emerging threats efficiently.
