Research Article

LIP READING USING CNN FOR TURKISH NUMBERS

Year 2022, Volume: 5 Issue: 2, 155 - 160, 31.12.2022
https://doi.org/10.46238/jobda.1100903

Abstract

Recently, lip reading has become one of the most important fields of study in artificial intelligence. In this study, lip reading was performed for the Turkish language using convolutional neural networks (CNNs). For this purpose, participants were asked to record videos of the numbers (61 videos), and 9 more videos were collected from YouTube. The dataset covers 20 numbers. Only the video was used; the audio was removed completely. Because the dataset was small, attempts were made to enlarge it using different methods. The model was trained on the training set and achieved a success rate of 56.25% on the test set.
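
The abstract does not say which methods were used to enlarge the dataset. As an illustration only, the following minimal sketch shows two common ways of producing extra training clips from one recorded clip: mirroring each frame and shifting its brightness. The function names, the choice of transforms, and the brightness offset of 30 are all assumptions made for this example and are not taken from the paper.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities in 0..255
Video = List[Frame]      # a clip is an ordered list of frames


def flip_horizontal(video: Video) -> Video:
    """Mirror every frame left to right; the frame order is unchanged."""
    return [[row[::-1] for row in frame] for frame in video]


def shift_brightness(video: Video, delta: int) -> Video:
    """Add delta to every pixel and clamp the result to 0..255."""
    return [
        [[min(255, max(0, pixel + delta)) for pixel in row] for row in frame]
        for frame in video
    ]


def augment(video: Video) -> List[Video]:
    """Return the original clip plus three hypothetical variants of it."""
    return [
        video,
        flip_horizontal(video),
        shift_brightness(video, 30),   # assumed offset, brighter copy
        shift_brightness(video, -30),  # assumed offset, darker copy
    ]


# Toy clip: two frames of 2 x 3 pixels each.
clip: Video = [
    [[0, 100, 250], [10, 20, 30]],
    [[5, 5, 5], [200, 210, 220]],
]
variants = augment(clip)
print(len(variants))  # 4
```

Each variant keeps the label of the clip it was derived from, so one recorded number yields four training examples under these assumptions. Variants should be generated only after the train/test split, so that copies of a test clip do not leak into the training set.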

Turkish title: Türk Rakamları İçin CNN İle Dudak Okuma

Details

Primary Language: English
Section: Original Scientific Articles
Authors

Hadı Pourmousa 0000-0001-6713-5872

Üstün Özen 0000-0002-7595-4306

Publication Date: 31 December 2022
Published in Issue: Year 2022, Volume: 5, Issue: 2

Cite

APA Pourmousa, H., & Özen, Ü. (2022). LIP READING USING CNN FOR TURKISH NUMBERS. Journal of Business in The Digital Age, 5(2), 155-160. https://doi.org/10.46238/jobda.1100903
AMA Pourmousa H, Özen Ü. LIP READING USING CNN FOR TURKISH NUMBERS. JOBDA. December 2022;5(2):155-160. doi:10.46238/jobda.1100903
Chicago Pourmousa, Hadı, and Üstün Özen. “LIP READING USING CNN FOR TURKISH NUMBERS”. Journal of Business in The Digital Age 5, no. 2 (December 2022): 155-60. https://doi.org/10.46238/jobda.1100903.
EndNote Pourmousa H, Özen Ü (01 December 2022) LIP READING USING CNN FOR TURKISH NUMBERS. Journal of Business in The Digital Age 5 2 155–160.
IEEE H. Pourmousa and Ü. Özen, “LIP READING USING CNN FOR TURKISH NUMBERS”, JOBDA, vol. 5, no. 2, pp. 155–160, 2022, doi: 10.46238/jobda.1100903.
ISNAD Pourmousa, Hadı - Özen, Üstün. “LIP READING USING CNN FOR TURKISH NUMBERS”. Journal of Business in The Digital Age 5/2 (December 2022), 155-160. https://doi.org/10.46238/jobda.1100903.
JAMA Pourmousa H, Özen Ü. LIP READING USING CNN FOR TURKISH NUMBERS. JOBDA. 2022;5:155–160.
MLA Pourmousa, Hadı and Üstün Özen. “LIP READING USING CNN FOR TURKISH NUMBERS”. Journal of Business in The Digital Age, vol. 5, no. 2, 2022, pp. 155-60, doi:10.46238/jobda.1100903.
Vancouver Pourmousa H, Özen Ü. LIP READING USING CNN FOR TURKISH NUMBERS. JOBDA. 2022;5(2):155-60.

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.