Blind Speech Separation and Enhancement With GCC-NMF

Wood, S. U. N.; Rouat, Jean; Dupont, Stéphane; Pironkov, Gueorgui

Article (Scientific journals)

2017 • In IEEE/ACM Transactions on Audio, Speech and Language Processing

Peer Reviewed verified by ORBi

Permalink
https://hdl.handle.net/20.500.12907/42001

Files

Full Text

wood2017.pdf

Publisher postprint (6.38 MB)

All documents in ORBi UMONS are protected by a user license.

Details

Research center :

CRTI - Centre de Recherche en Technologie de l'Information

Disciplines :

Library & information sciences

Author, co-author :

Wood, S. U. N.

Rouat, Jean

Dupont, Stéphane ; Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle

Pironkov, Gueorgui ; Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle

Language :

English

Title :

Blind Speech Separation and Enhancement With GCC-NMF

Publication date :

01 April 2017

Journal title :

IEEE/ACM Transactions on Audio, Speech and Language Processing

ISSN :

2329-9290

eISSN :

2329-9304

Publisher :

Institute of Electrical and Electronics Engineers (IEEE), New York, United States

Peer reviewed :

Peer Reviewed verified by ORBi

Research unit :

F105 - Information, Signal et Intelligence artificielle

Research institute :

R300 - Institut de Recherche en Technologies de l'Information et Sciences de l'Informatique
R450 - Institut NUMEDIART pour les Technologies des Arts Numériques

Available on ORBi UMONS :

since 05 April 2017

Statistics

Number of views

93 (0 by UMONS)

Number of downloads

0 (0 by UMONS)
