Research for background knowledge on speech analysis
YouTube lectures
- https://www.kaggle.com/kcs93023/keras-sequential-conv1d-model-classification
- https://www.youtube.com/watch?v=HzgCnlre4EE
- https://www.youtube.com/watch?v=XhjPqGKF9Zs
- https://www.youtube.com/watch?v=mAjvfIh2iXw
Blog posts summarizing speech processing
https://brunch.co.kr/@kakao-it/180
https://wikidocs.net/30651 -> very well organized
https://medium.com/@jongdae.lim/%EA%B8%B0%EA%B3%84-%ED%95%99%EC%8A%B5-machine-learning-%EC%9D%80-%EC%A6%90%EA%B2%81%EB%8B%A4-part-6-eb0ed6b0ed1d
https://engineering.linecorp.com/ko/blog/voice-waveform-arbitrary-signal-to-noise-ratio-python/
https://heartbeat.fritz.ai/a-2019-guide-to-speech-synthesis-with-deep-learning-630afcafb9dd -> comprehensive 2019 overview of speech synthesis
https://medium.com/@saxenauts/speech-synthesis-techniques-using-deep-neural-networks-38699e943861
Speech/music signals: a guide for machine learning beginners
- Part 1
http://keunwoochoi.blogspot.com/2016/01/blog-post.html
- Part 2
http://keunwoochoi.blogspot.com/2016/03/2.html
- Part 3
http://keunwoochoi.blogspot.com/2016/12/3.html
- Part 4
http://keunwoochoi.blogspot.com/2017/06/4.html
Augmentations
link
https://towardsdatascience.com/state-of-the-art-audio-data-augmentation-with-google-brains-specaugment-and-pytorch-d3d1a3ce291e
paper
https://ai.googleblog.com/2019/04/specaugment-new-data-augmentation.html
code
https://github.com/zcaceres/spec_augment
Kaggle
Freesound Audio Tagging 2019 Solutions
- 1st place solution (with code)
https://www.kaggle.com/c/freesound-audio-tagging-2019/discussion/95924#latest-586969
- 2nd place solution (with code)
https://www.kaggle.com/c/freesound-audio-tagging-2019/discussion/97815#latest-582300
- 3rd place solution
https://www.kaggle.com/c/freesound-audio-tagging-2019/discussion/97926#latest-583269
- 4th place solution (with code)
https://www.kaggle.com/c/freesound-audio-tagging-2019/discussion/96440#latest-561393
- 6th place solution (with code)
https://www.kaggle.com/c/freesound-audio-tagging-2019/discussion/96680#latest-623999
- 7th place solution (with code)
https://www.kaggle.com/c/freesound-audio-tagging-2019/discussion/97812#latest-564533
Beginner guide to Audio data
- https://www.kaggle.com/maxwell110/beginner-s-guide-to-audio-data-2
Audio representation - what it’s all about
- https://www.kaggle.com/davids1992/audio-representation-what-it-s-all-about
In-depth introduction to audio for beginners
- https://www.kaggle.com/deepaksinghrawat/in-depth-introduction-to-audio-for-beginners
Beginner’s Visualization and Removing Uniformative Part
- https://www.kaggle.com/dude431/beginner-s-visualization-and-removing-uniformative
Papers
- A Study on Speech Recognition Technology
https://www.researchgate.net/publication/278811438_A_Study_on_Speech_Recognition_Technology
- SpecAugment
https://ai.googleblog.com/2019/04/specaugment-new-data-augmentation.html?m=1
- Unsupervised speech representation learning using WaveNet autoencoders
https://arxiv.org/abs/1901.08810v2
- wav2vec: Unsupervised Pre-training for Speech Recognition
https://arxiv.org/abs/1904.05862v4
- Learning Discriminative features using Center Loss and Reconstruction as Regularizer for Speech Emotion Recognition
https://arxiv.org/abs/1906.08873v2
- Two-Pass End-to-End Speech Recognition
https://arxiv.org/abs/1908.10992v1
- Advancing Speech Recognition With No Speech Or With Noisy Speech
https://arxiv.org/abs/1906.08871v5
- Coarse-to-fine Optimization for Speech Enhancement
https://arxiv.org/abs/1908.08044v1
Other background knowledge
Spectrogram explained
https://ko.wikipedia.org/wiki/스펙트로그램
Spectrum explained
https://ko.wikipedia.org/wiki/스펙트럼
Ok Google: How to do Speech Recognition?
https://towardsdatascience.com/ok-google-how-to-do-speech-recognition-f77b5d7cbe0b
Voice representation
- paper
https://www.frontiersin.org/articles/10.3389/fpsyg.2017.01180/full
A well-documented speech recognition project - Project DeepSpeech
- https://github.com/mozilla/DeepSpeech/blob/master/README.rst
Browse ACL papers
- ACL 2019 paper
https://www.aclweb.org/anthology/P19-1039.pdf