😊 Hi, this is Linfeng Feng. I am currently pursuing a Ph.D. degree in information and communication engineering at Northwestern Polytechnical University, Xi’an, China. Concurrently, I am an intern at the Institute of Artificial Intelligence (TeleAI), China Telecom. My research interests include spatial audio, speech enhancement, and multimodal learning.

🎓️ Education

2024.03 - now, Ph.D. candidate in Information and Communication Engineering, Northwestern Polytechnical University, Xi’an, China.
2021.09 - 2024.03, M.S. in Information and Communication Engineering, Northwestern Polytechnical University, Xi’an, China.
2017.09 - 2021.06, B.S. in Communication Engineering, Guangdong Polytechnic Normal University, Guangzhou, China.

👔 Internships

2024.01 - now, Institute of Artificial Intelligence (TeleAI), China Telecom, China.

📝 Publications

2025

DualSpec: Text-to-spatial-audio Generation via Dual-Spectrogram Guided Diffusion Model [pdf] [demo]
Lei Zhao*, Sizhou Chen*, Linfeng Feng*, Xiao-Lei Zhang, Xuelong Li
Preprint
AudioSpa: Spatializing Sound Events with Text [pdf] [demo]
Linfeng Feng*, Lei Zhao*, Boyu Zhu, Xiao-Lei Zhang, Xuelong Li
Preprint
UniForm: A Unified Multi-Task Diffusion Transformer for Audio-Video Generation [pdf] [demo]
Lei Zhao*, Linfeng Feng*, Dongxu Ge*, Rujin Chen, Fangqiu Yi, Chi Zhang, Xiao-Lei Zhang, Xuelong Li
Preprint
Deep learning based stage-wise two-dimensional speaker localization with large ad-hoc microphone arrays [pdf]
Shupei Liu*, Linfeng Feng*, Yijun Gong, Chengdong Liang, Chen Zhang, Xiao-Lei Zhang, Xuelong Li
Speech Communication
Co-Attention Based Multi-Channel TF-GridNet for Speech Separation with Ad-Hoc Microphone Arrays
Hongmei Guo, Linfeng Feng, Yijiang Chen, Xueqing Li, Boyu Zhu, Hao-Yu Wang, Xiao-Lei Zhang, Xuelong Li
Proceedings of the 50th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025)

2024

Quantization-error-free soft label for 2D sound source localization [pdf]
Linfeng Feng, Xiao-Lei Zhang, and Xuelong Li
Proceedings of the 14th International Symposium on Chinese Spoken Language Processing (ISCSLP 2024)
Learning Multi-dimensional Speaker Localization: Axis Partitioning, Unbiased Label Distribution, and Data Augmentation [pdf]
Linfeng Feng, Yijun Gong, Zhi Liu, Xiao-Lei Zhang, and Xuelong Li
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Eliminating Quantization Errors in Classification-Based Sound Source Localization [pdf] [code]
Linfeng Feng, Xiao-Lei Zhang, and Xuelong Li
Neural Networks

2023

Soft Label Coding for End-to-end Sound Source Localization with Ad-hoc Microphone Arrays [pdf]
Linfeng Feng, Yijun Gong, and Xiao-Lei Zhang
Proceedings of the 48th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023)

🎖️ Competitions

First place in ASR and third place in overall score, URGENT speech enhancement challenge at NeurIPS 2024 [Leaderboard]
Linfeng Feng, Hao Ma, Haocheng Dong, Xiao-Lei Zhang, and Xuelong Li

💻 Others

Reviewer

Neural Networks,
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP),
IEEE Signal Processing Letters (SPL),
EURASIP Journal on Audio, Speech, and Music Processing,
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
Conference of the International Speech Communication Association (INTERSPEECH),
National Conference on Man-Machine Speech Communication (NCMMSC).