Forum
Jun 04 - 08:30 AM - 05:30 PM
PROGRESS Forum
Room
Salon des Roses A
Session Code
PW
Track
Forum
Tutorial
Jun 04 - 08:30 AM - 12:00 PM
Rethinking Sparsity-Aware Bayesian Learning for Signal Processing and Machine Learning
Room
Nafsika A
Session Code
T01
Track
Tutorials
Tutorial
Jun 04 - 08:30 AM - 12:00 PM
Advances in Federated Optimization: Efficiency, Resiliency, and Privacy
Room
Executive Room Gamma
Session Code
T02
Track
Tutorials
Tutorial
Jun 04 - 08:30 AM - 12:00 PM
No Touch Needed: Contact-Free Physiological Sensing for Fitness and Healthcare Using Cameras and RF Signals
Room
Executive Room Alpha
Session Code
T03
Track
Tutorials
Tutorial
Jun 04 - 08:30 AM - 12:00 PM
Fixed point theory: From proximal algorithms to deep learning
Improved Calibration method for CML Humidity Retrievals Over Complex Terrain
Yoav Rubin, Alpert Pinhas
A Cramer–Rao based study of 2-D fields retrieval by measurements from a random sensor network
Shay Sagiv, Hagit Messer
PERFORMANCE OF A LOW-COST DUAL-FREQUENCY GNSS RECEIVER FOR NEAR REAL-TIME WATER VAPOR ESTIMATION
Christina Oikonomou, Ion-Anastasios Karolos, Stylianos Bitharis, Christos Pikridas, Haris Haralambous
OPPORTUNISTIC RAINFALL SENSING: STATE OF THE ART AND PERSPECTIVES IN ITALY
Filippo Giannetti, Vincenzo Lottici, Fabiola Sapienza, Federico Porcù, Giacomo Roversi, Pier Paolo Alberoni, Elia Covi, Roberto Nebuloni, Greta Cazzaniga, Carlo De Michele, Cristina Deidda, Matteo Colli, Sara Zani, Christian Gianoglio, Daniele D. Caviglia, Elisa Adirosi
IMPROVED WATER VAPOR DENSITY ESTIMATION WITH COMMERCIAL MICROWAVE LINKS ATTENUATION AND TEMPERATURE
Generating Artistic Images via Few-Shot Style Transfer
Itay Buchnik, Or Berebi, Tammy Riklin Raviv, Nir Shlezinger
Incremental Image Labeling via Iterative Refinement
Fausto Giunchiglia, Xiaolei Diao, Mayukh Bagchi
FACE-DUBBING++: LIP-SYNCHRONOUS, VOICE PRESERVING TRANSLATION OF VIDEOS
Alexander Waibel, Moritz Behr, Dogucan Yaman, Fevziye Irem Eyiokur Yaman, Tuan-Nam Nguyen, Carlos Mullov, Mehmet Arif, Alperen Kantarcı, Stefan Constantin, Hazim Kemal Ekenel
State-of-the-Art in Nudity Classification: A Comparative Analysis
Fatih Cagatay Akyon, Alptekin Temizel
Scalable Missing Data Imputation with Graph Neural Networks
Guillaume Lachaud, Patricia Conde Cespedes, Maria Trocan
Evaluation of a Marine Mesoscale Events Classifier
Marco Reggiannini, Oscar Papini, Gabriele Pieri
Collaborative visual-inertial localization of teams with floorplan extraction
Sándor Gazdag, Dániel Pásztornicky, Zsolt Jankó, Tamás Szirányi, András Majdik
SieveNet: AN EFFICIENT MODEL UTILIZING H.265 CODEC STRUCTURE FOR VIDEO OBJECT DETECTION
A new Multiway MFDM based technique for EEG Source Localisation and Interpretation
Anchal Yadav, Monika Agrawal, S D Joshi
Topological analysis of low dimensional phase space trajectories of high dimensional EEG signals for classification of interictal epileptiform discharges
Annika Stiehl, Martina Flammer, Fabienne Anselstetter, Nicole Ille, Harald Bornfleth, Stefan Geißelsöder, Christian Uhl
Enabling Large-Scale Probabilistic Seizure Detection with a Tensor-Network Kalman Filter for LS-SVM
Seline J de Rooij, Kim Batselier, Borbala Hunyadi
Novel Approach Explains Spatio-Spectral Interactions in Raw Electroencephalogram Deep Learning Classifiers
Charles A Ellis, Abhinav Sattiraju, Robyn Miller, Vince Calhoun
Hypercomplex Multimodal Emotion Recognition from EEG and Peripheral Physiological Signals
Invariant Adversarial Imitation Learning from Visual Inputs
Haoran Zhang, Yinghong Tian, Liang Yuan, Yue Lu
SpectraNet-SO(3): Learning Satellite Orientation from Optical Spectra by Implicitly Modeling Mutually Exclusive Probability Distributions on the Rotation Manifold
Matthew Phelps, Ryan Swindle, Zack Gazak, Andrew Vandenberg, Justin Fletcher
STRUCTURED-ANCHOR PROJECTED CLUSTERING FOR HYPERSPECTRAL IMAGES
Guozhu Jiang, Jie Zhang, Yongshan Zhang, Xinwei Jiang, Zhihua Cai
Learning sparse auto-encoders for green AI image coding
Cyprien Gille, Frederic Guyard, Marc Antonini, Michel Barlaud
Learning to Generate 3D Representations of Building Roofs Using Single-View Aerial Imagery
Maxim Khomiakov, Alejandro Valverde Mahou, Alba Reinders Sánchez, Jes Frellsen, Michael Andersen
Robust Monocular Localization of Drones by Adapting Domain Maps to Depth Prediction Inaccuracies
Priyesh Shukla, Sureshkumar Senthilkumar, Alex C Stutts, Sathya Ravi, Theja Tulabandhula, Amit R Trivedi
Large dimensional analysis of LS-SVM transfer learning: Application to POLSAR classification
UCONV-CONFORMER: HIGH REDUCTION OF INPUT SEQUENCE LENGTH FOR END-TO-END SPEECH RECOGNITION
Andrei Andrusenko, Rauf Nasretdinov, Aleksei Romanenko
A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale
Charles C Peyser, Michael Picheny, Kyunghyun Cho, Tara Sainath, W. Ronny Huang, Rohit Prabhavalkar
Improving Contextual Biasing with Text Injection
Tara Sainath, Rohit Prabhavalkar, Diamantino Caseiro, Pat Rondon, Cyril Allauzen
STRUCTURED STATE SPACE DECODER FOR SPEECH RECOGNITION AND SYNTHESIS
Koichi Miyazaki, Masato Murata, Tomoki Koriyama
JEIT: JOINT END-TO-END MODEL AND INTERNAL LANGUAGE MODEL TRAINING FOR SPEECH RECOGNITION
Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Jan Silovsky, Ruchir Travadi, Xiaodan Zhuang
Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers
Duc Le, Frank Seide, Yuhao Wang, Yang Li, Kjell Schubert, Ozlem Kalinli, Mike Seltzer
Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames
Chengdong Liang, Zhang XiaoLei, Binbin Zhang, Di Wu, Shengqiang Li, Xingchen Song, Zhendong Peng, Fuping Pan
Ensemble prosody prediction for expressive speech synthesis
Tian Huey Teh, Vivian Hu, Devang Mohan, Zack Hodari, Christopher Wallis, Tomás Gómez Ibarrondo, Alexandra Torresquintero, James Leoni, Mark Gales, Simon King
EXPRESSIVE-VC: HIGHLY EXPRESSIVE VOICE CONVERSION WITH ATTENTION FUSION OF BOTTLENECK AND PERTURBATION FEATURES
Ziqian Ning, Qicong Xie, Pengcheng Zhu, Zhichao Wang, Liumeng Xue, Jixun Yao, Lei Xie, Mengxiao Bi
HIGH-ACOUSTIC FIDELITY TEXT TO SPEECH SYNTHESIS WITH FINE-GRAINED CONTROL OF SPEECH ATTRIBUTES
Rafael Valle, João Felipe Santos, Kevin Shih, Rohan Badlani, Bryan Catanzaro
Embedding a differentiable mel-cepstral synthesis filter to a neural speech synthesis system
Generating Sound Effects, Music, Speech, and Beyond, with Text
Haohe Liu, Zehua Chen, Yi Yuan, Xinhao Mei, Xubo Liu, Danilo P. Mandic, Wenwu Wang, Mark D. Plumbley
DisCoHeadTV: Disentangled Control of Head Pose and Facial Expressions for Text-to-Video Synthesis
Sungwoo Park, GeumByeol Hwang, Kihyeok Lee, Sunwon Hong, Gyeongsu Chae
Intelligent Dialogue-based Tutoring System for Second Language Reading Comprehension
Jin-Xia Huang, Byung Ok Kang, Minsoo Cho, Oh-Woog Kwon, Yunkeun Lee
Optimize for my Voice with Speaker Identification
Marcin Ciolek, Michal Sulewski, Rafal Pilarczyk, Raul Casas, Samer Hijazi, Scott Plude, Dror Maydan, Michelle Mao, Guoqing Zhang, Nathan Rickey, Mahesh Godavarti, Kamil Wojcicki, Ali Mouline, Savita Kini, Marta Chelkowska, Taha Emara, Yusuf Isik, Amir Abdelwahed
Mitigating Unintended Memorization in Language Models via Alternating Teaching
Zhe Liu, Xuedong Zhang, Fuchun Peng
ITERATIVE SHALLOW FUSION OF BACKWARD LANGUAGE MODEL FOR END-TO-END SPEECH RECOGNITION
Atsunori Ogawa, Takafumi Moriya, Naoyuki Kamo, Naohiro Tawara, Marc Delcroix
PROCTER: PRONUNCIATION-AWARE CONTEXTUAL ADAPTER FOR PERSONALIZED SPEECH RECOGNITION IN NEURAL TRANSDUCERS
Rahul Pandey, Roger Ren, Qi Luo, Jing Liu, Ariya Rastrow, Ankur Gandhe, Denis Filimonov, Grant Strimel, Andreas Stolcke, Ivan Bulyko
Adaptive Multi-Corpora Language Model Training for Speech Recognition
Yingyi Ma, Zhe Liu, Xuedong Zhang
SUFFIX RETRIEVAL-AUGMENTED LANGUAGE MODELING
Zecheng Wang, Yik-Cheung Tam
Large-scale Language Model Rescoring on Long-form Data
Tongzhou Chen, Cyril Allauzen, Yinghui Huang, Daniel S Park, David Rybach, W. Ronny Huang, Rodrigo Cabrera, Kartik Audhkhasi, Bhuvana Ramabhadran, Pedro J Moreno, Michael Riley
Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis
Odysseas S Chlapanis, Georgios Paraskevopoulos, Alexandros Potamianos
Improving Disfluency Detection with Multi-scale Self Attention and Contrastive Learning
Peiying Wang, Chaoqun Duan, Meng Chen, Xiaodong He
Dialog act guided contextual adapter for personalized speech recognition
Feng-Ju Chang, Thejaswi Muniyappa, Kanthashree Mysore Sathyendra, Kai Wei, Grant Strimel, Ross McGowan
OUTSIDE KNOWLEDGE VISUAL QUESTION ANSWERING VERSION 2.0
Benjamin Reichman, Anirudh S Sundar, Christopher G Richardson, Tamara Zubatiy, Prithwijit Chowdhury, Aaryan Shah, Jack Truxal, Micah Grimes, Dristi Shah, Woo Ju Chee, Saif Punjwani, Atishay Jain
SERI: SkEtching-Reasoning-Integrating Progressive Workflow for Empathetic Response Generation
E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model
W. Ronny Huang, Shuo-yiin Chang, Tara Sainath, Yanzhang He, David Rybach, Robert David, Rohit Prabhavalkar, Cyril Allauzen, Charles C Peyser, Trevor Strohman
Analyzing Acoustic Word Embeddings from Pre-trained Self-supervised Models
Ramon R Sanabria, Hao Tang, Sharon Goldwater
Unsupervised Word Segmentation Using Temporal Gradient Pseudo-Labels
Tzeviya S Fuchs, Yedid Hoshen
Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences
Yuan Tseng, Cheng-I Lai, Hung-yi Lee
Towards trustworthy phoneme boundary detection with autoregressive model and improved evaluation metric
Hyeongju Kim, Hyeong-Seok Choi
Integrating Syntactic and Semantic Knowledge in AMR Parsing with Heterogeneous Graph Attention Network
Level-line Guided Edge Drawing for Robust Line Segment Detection
Xinyu Lin, Yingjie Zhou, Yipeng Liu, Ce Zhu
Dynamic Local and Global Context Exploration For Small Object Detection
Ziji Zhang, Ping Gong, Haotian Sun, Pingping Wu, Xuanyuan Yang
INFORMATION EXTRACTION FROM PILL BOTTLE IMAGES VIA TEXT STITCHING
Rahul Kumar Gupta, Shilka Roy, Sujit Jos, Unni V.S., Lauren Lavoie, Frederic Medous, Walter Smith
EI2SR: LEARNING AN ENHANCED INTRA-INSTANCE SEMANTIC RELATIONSHIP FOR ARBITRARY-SHAPED SCENE TEXT DETECTION
Yan Shu, Shaohui Liu, Yu Zhou
Hanzi Wang, F Richard Yu
Room
Poster Area 11 - Dome
Session Code
IVMSP-P14
Track
Image, Video, and Multidimensional Signal Processing
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval
Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogerio Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James Glass
The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR
Ramon R Sanabria, Nikolay Bogoychev, Nina Markl, Andrea Carmantini, Ondrej Klejch, Peter Bell
Adaptive Knowledge Distillation between Text and Speech Pre-trained Models
Jinjie Ni, Yukun Ma, Wen Wang, Qian Chen, Dianwen Ng, Han Lei, Trung Hieu Nguyen, Chong Zhang, Bin Ma, Erik Cambria
A processing framework to access large quantities of whispered speech found in ASMR
Pablo Pérez Zarazaga, Gustav Eje Henter, Zofia Malisz
CLASS-GUIDED TRIPLE HEAD PREDICTION NETWORK FOR LONG-TAIL OBJECT DETECTION
Xuyang Liu, Yuan Zheng
SMCL: SALIENCY MASKED CONTRASTIVE LEARNING FOR LONG-TAILED VISUAL RECOGNITION
Sanglee Park, Seung-won Hwang, Jungmin So
COMPLEMENTARY LEARNING SYSTEM BASED INTRINSIC REWARD IN REINFORCEMENT LEARNING
Zijian Gao, Kele Xu
Promoting Cooperation in Multi-Agent Reinforcement Learning via Mutual Help
A UNIFIED UNCERTAINTY-AWARE EXPLORATION: COMBINING EPISTEMIC AND ALEATORY UNCERTAINTY
Parvin Malekzadeh, Ming Hou, Konstantinos N Plataniotis
MEET: A Monte Carlo Exploration-Exploitation Trade-off for Buffer Sampling
Julius Ott, Lorenzo Servadei, Jose Arjona-Medina, Enrico Rinaldi, Gianfranco Mauro, Daniela Sanchez Lopera, Michael Stephan, Thomas Stadelmayer, Avik Santra, Robert Wille
CONVERGENCE ANALYSIS OF GRAPHICAL GAME-BASED NASH $Q$-LEARNING USING THE INTERACTION DETECTION SIGNAL OF $\mathcal{N}$-STEP RETURN
Yunkai Zhuang, Shangdong Yang, Wenbin Li, Yang Gao
DEEP REINFORCEMENT LEARNING FOR GREEN UAV-ASSISTED DATA COLLECTION
SEQUENCE-BASED DEVICE-FREE GESTURE RECOGNITION FRAMEWORK FOR MULTI-CHANNEL ACOUSTIC SIGNALS
Zhizheng Yang, Xun Wang, Dongyu Xia, Wei Wang, Haipeng Dai
CAT: Causal Audio Transformer for Audio Classification
Xiaoyu Liu, Hanlin Lu, Jianbo Yuan, Xinyu Li
CNEG-VC: Contrastive Learning using Hard Negative Example in Non-parallel Voice Conversion
Bima Prihasto, YiXing Lin, Le Phuong, Chien-Lin Huang, Jia-Ching Wang
TOWARDS ROBUST DATA-DRIVEN UNDERWATER ACOUSTIC LOCALIZATION: A DEEP CNN SOLUTION WITH PERFORMANCE GUARANTEES FOR MODEL MISMATCH
Amir Weiss, Andrew C Singer, Gregory W Wornell
AD-YOLO: YOU LOOK ONLY ONCE IN TRAINING MULTIPLE SOUND EVENT LOCALIZATION AND DETECTION
Jin Sob Kim, Hyun Joon Park, Wooseok Shin, Sung Won Han
Framewise multiple sound source localization and counting using binaural spatial audio signals
Lei Wang, Zhibin Jiao, Qiyong Zhao, Jie Zhu, Yang Fu
Learning Speech Representations with Flexible Hidden Feature Dimensions
Huaizhen Tang, Xulong Zhang, Ning Cheng
Guided Speech Enhancement Network
Yang Yang, Shao-Fu Shih, Hakan Erdogan, Jamie Menjay Lin, Chehung Lee, Yunpeng Li, George Sung, Matthias Grundmann
Blind Estimation of Audio Processing Graph
Sungho Lee, Jaehyun Park, Seungryeol Paik, Kyogu Lee
MarginNCE: Robust Sound Localization with a Negative Margin
Sooyoung Park, Arda Senocak, Joon Son Chung
Improving Noisy Student Training on Non-target Domain Data for Automatic Speech Recognition
Yu Chen, Wen Ding, Junjie Lai
Large-Scale Nonverbal Vocalization Detection Using Transformers
Panagiotis Tzirakis, Alice Baird, Jeff Brooks, Chris Gagne, Lauren Kim, Michael Opara, Christopher Gregory, Jacob Metrick, Garrett Boseck, Vineet Tiruvadi, Bjoern W. Schuller, Dacher Keltner
AUDIO QUALITY ASSESSMENT OF VINYL MUSIC COLLECTIONS USING SELF-SUPERVISED LEARNING
Alessandro Ragano, Emmanouil Benetos, Andrew Hines
Speech Intelligibility Classifiers from 550k Disordered Speech Samples
Subhashini Venugopalan, Jimmy Tobin, Samuel J. Yang, Katie Seaver, Richard Cave, Pan-Pan Jiang, Neil Zeghidour, Rus Heywood, Jordan Green, Michael Brenner
NORD: Non-Matching Reference Based Relative Depth Estimation From Binaural Audio
Pranay Manocha, Israel D Gebru, Anurag Kumar, Dejan Markovic, Alexander Richard
Ensemble of Deep Neural Network Models for MOS Prediction
Marie Kunešová, Jindrich Matousek, Jan Lehečka, Jan Svec, Josef Michalek, Daniel Tihelka, Martin Bulin, Zdenek Hanzlicek, Marketa Rezackova
On Crowdsourcing-Design with Comparison Category Rating for Evaluating Speech Enhancement algorithms
Angélica Stephania Zambrano Suárez, Clement Laroche, Line Clemmensen, Sneha Das
EFFICIENT INTELLIGIBILITY EVALUATION USING KEYWORD SPOTTING: A STUDY ON AUDIO-VISUAL SPEECH ENHANCEMENT
Cassia Valentini, Andrea L Aldana, Ondrej Klejch, Peter Bell
TORCHAUDIO-SQUIM: REFERENCE-LESS SPEECH QUALITY AND INTELLIGIBILITY MEASURES IN TORCHAUDIO
On the detection of synthetic images generated by diffusion models
Riccardo Corvi, Davide Cozzolino, Giada Zingarini, Giovanni Poggi, Koki Nagano, Luisa Verdoliva
TRUSTERA: A LIVE CONVERSATION REDACTION SYSTEM
Evandro Gouvea, Ali Dadgar, Shahab Jalalvand, Rathi Chengalvarayan, Badrinath Jayakumar, Ryan Price, Nicholas Ruiz, Jennifer McGovern, Srinivas Bangalore, Ben Stern
COMBINING THE SILHOUETTE AND SKELETON DATA FOR GAIT RECOGNITION
Likai Wang, Ruize Han, Wei Feng
Learning from the raw domain: cross modality distillation for compressed video action recognition
Yufan Liu, Jiajiong Cao, Weiming Bai, Bing Li
GAITCOTR: improved spatial-temporal representation for gait recognition with a hybrid convolution-transformer framework
Raw Ultrasound-based Phonetic Segments Classification Via Mask Modeling
Kang You, Bo Liu, Kele Xu, Yunsheng Xiong, Qisheng Xu, Ming Feng, Tamás G Csapó, Boqing Zhu
Pitch Mark Detection from Noisy Speech Waveform using Wave-U-Net
Hyun-Joon Nam, Hong-June Park
Leveraging Multiple Sources in Automatic African American English Dialect Detection for Adults and Children
Alexander Johnson, Vishwas Shetty, Mari Ostendorf, Abeer Alwan
Does human speech follow Benford's Law?
Leo Hsu, Visar Berisha
Real-Time MRI Video synthesis from time aligned phonemes with sequence-to-sequence networks
Sathvik Udupa, Prasanta Ghosh
Exploring Subgroup Performance in End-to-End Speech Models
Alkis Koudounas, Eliana Pastor, Giuseppe Attanasio, Vittorio Mazzia, Manuel Giollo, Thomas Gueudre, Luca Cagliero, Luca de Alfaro, Elena Baralis, Daniele Amberti
EFFICIENT STUTTERING EVENT DETECTION USING SIAMESE NETWORKS
Deekshitha G, Prasanta Ghosh, Hema A Murthy, Philipp Olbrich, Pranaw Kumar, Keiichi Tokuda, Mark Hasegawa-Johnson, Heiga Zen, Sathvik Udupa, Abhayjeet Singh, Jesuraj Bandekar, Sandhya Badiger
Multi-speaker Multi-lingual VQTTS System for LIMMITS 2023 Challenge
Chenpeng Du, Yiwei Guo, Feiyu Shen, Kai Yu
VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation
Rohan Badlani, Akshit Arora, Subhankar Ghosh, Rafael Valle, Kevin Shih, João Felipe Santos, Boris Ginsburg, Bryan Catanzaro
LeanSpeech: The Microsoft Lightweight Speech Synthesis System for LIMMITS Challenge 2023
Chen Zhang, Shubham Bansal, Aakash Lakhera, Jinzhu Li, Gag Wang, Sandeep Kumar Satpal, Sheng Zhao, Lei He
Lightweight Prosody-TTS for multi-lingual multi-speaker scenario
Giridhar Pamisetty, Chaitanya Varun Sahukari, Sri Rama Murty Kodukula
Region-awared transformer with asymmetric loss in multi-label classification
Long Feng, Guohua Geng, Chen Guo, Longquan Yan, Xingrui Ma, Zhan Li, Kang Li
CF-VTON: Multi-Pose Virtual Try-On with Cross-domain Fusion
Chenghu Du, Shengwu Xiong
LEARNING TO LOCATE VISUAL ANSWER IN VIDEO CORPUS USING QUESTION
Bin Li, Yixuan Weng, Bin Sun, Shutao Li
An End-to-End Framework for Partial View-aligned Clustering with Graph Structure
Liang Zhao, Qiongjie Xie, Songtao Wu, Shubin Ma
Detecting Out-of-distribution Examples via Class-conditional Impressions Reappearing
Jinggang Chen, Xiaoyang Qu, Jiguang Wan, Jing Xiao
Guide and Select: A Transformer-based Multimodal Fusion Method for Points of Interest Description Generation
Hanqing Liu, Wei Wang, Niu Hu, Hai-Tao Zheng, Rui Xie, Wei Wu, Yang Bai
Boosting Fine-grained Sketch-based Image Retrieval with Self-supervised Learning
Zhaolong Zhang, Yangdong Chen, Yuejie Zhang, Rui Feng, Tao Zhang
Papez: Resource-efficient Speech Separation with Auditory Working Memory
Hyunseok Oh, Juheon Yi, Youngki Lee
MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-head Transformer with Convolution-augmented Joint Self-Attentions
Shengkui Zhao, Bin Ma
A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement
Zhepei Wang, Ritwik Giri, Devansh Shah, Jean-Marc Valin, Michael M Goodwin, Paris Smaragdis
ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting
Zexu Pan, Wupeng Wang, Marvin Borsdorf, Haizhou Li
QUANTITATIVE EVIDENCE ON OVERLOOKED ASPECTS OF ENROLLMENT SPEAKER EMBEDDINGS FOR TARGET SPEAKER SEPARATION
Xiaoyu Liu, Xu Li, Joan Serra
Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation
PAAPLoss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement
Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj
D2Former: A Fully Complex Dual-Path Dual-Decoder Conformer Network using Joint Complex Masking and Complex Spectral Mapping for Monaural Speech Enhancement
Shengkui Zhao, Bin Ma
Semi-supervised speech enhancement based on speech purity
The Multimodal Information Based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization and Recognition
Zhe Wang, Shilong Wu, Hang Chen, Mao-Kui He, Jun Du, Chin-hui Lee, Shinji Watanabe, Sabato M Siniscalchi, Odette Scharenborg, Baocai Yin, Jia Pan, Cong Liu
DOPPLER-CODED JOINT DIVISION MULTIPLE ACCESS WAVEFORM FOR AUTOMOTIVE MIMO RADAR
Yanhua Wang, Xueyao Hu, Jiamin Long, Hao Yu
EARLY DETECTION OF COGNITIVE DECLINE USING VOICE ASSISTANT COMMANDS
Eli Kurtz, Youxiang Zhu, Tiffany Driesse, Bang Tran, John Batsis, Robert Roth, Xiaohui Liang
EXPLORING THE ROLE OF FRICATIVES IN CLASSIFYING HEALTHY SUBJECTS AND PATIENTS WITH AMYOTROPHIC LATERAL SCLEROSIS AND PARKINSON’S DISEASE
Tanuka Bhattacharjee, Yamini BK, Nalini Atchayaram, Ravi Yadav, Prasanta Ghosh
STATIC AND DYNAMIC SOURCE AND FILTER CUES FOR CLASSIFICATION OF AMYOTROPHIC LATERAL SCLEROSIS PATIENTS AND HEALTHY SUBJECTS
Tanuka Bhattacharjee, Chowdam Venkata Thirumala Kumar, Yamini BK, Nalini Atchayaram, Ravi Yadav, Prasanta Ghosh
Transferring Quantified Emotion Knowledge for the Detection of Depression in Alzheimer's Disease Using ForestNets
Paula Andrea Pérez-Toro, Dalia Rodríguez-Salas, Tomas Arias-Vergara, Sebastian P Bayerl, Philipp Klumpp, Korbinian Riedhammer, Maria Schuster, Elmar Noeth, Andreas K Maier, Juan Rafael Orozco-Arroyave
Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection
Jinchao Li, Kaitao Song, Junan Li, Bo Zheng, Dongsheng Li, Xixin Wu, Xunying Liu, Helen Meng
Synthetic Aperture RF Reception using Rydberg Atoms
Nikunjkumar Prajapati, Alexandra Artusio-Glimpse, Matthew Simons, Samuel Berweger, Andrew Rotunno, Maitreyi Jayaseelan, Kaleb Campbell, Christopher Holloway
Towards Rydberg atom synthetic apertures: Wide-area high-resolution RF amplitude and phase imaging with Rydberg probes
David Anderson, Luis Goncalves, Remy Legaie, Georg Raithel
Joint Waveform and Wavefront Engineering for Terahertz Communications in 6G
Duschia Bodet, Josep Jornet
DEEP DENOISING PRIOR-BASED SPECTRAL ESTIMATION FOR PHASELESS SYNTHETIC APERTURE RADAR
Samia Kazemi, Bariscan Yonel, Birsen Yazici
Fast Cauchy-Rician Modelling of SAR Images with Method of Algebraic Moments Estimator
Mutong Li, Ercan E Kuruoglu
Row-Column Beamformer for Fast Volumetric Imaging
Lasse Thurmann Jørgensen, Sebastian Præsius, Nathalie Panduro, Sofie Andersen, Charlotte Sørensen, Jørgen Jensen
SCOPING A DOCUMENT ON RECOMMENDED PRACTICES FOR SYNTHETIC APERTURE RADIOMETRY
Brian Sequeira, Corina Nafornita
Federated Multi-Task Learning for THz Wideband Channel and DoA Estimation
Ahmet M Elbir, Wei Shi, Kumar Vijay Mishra, Symeon Chatzinotas
AN IMPROVED AUTOFOCUS ALGORITHM WITH BAYESIAN TRACKING OF RESIDUAL MOTION FOR AUTOMOTIVE MIMO-SAR IMAGING
Gabriele Balducci, Marco Manzoni, Stefano Tebaldini, Andrea Virgilio Monti-Guarnieri, Claudio Maria Prati, Ivan Russo
Speeding Up Detection and Imaging Using Quantum Radars
David Luong, Bhashyam Balaji, Sreeraman Rajan
Bistatic MIMO Radar Sensing of Specularly Reflecting Surfaces for Wireless Power Transfer
Benjamin J. B. Deutschmann, Maximilian Graber, Thomas Wilding, Klaus Witrisal
Satellite-to-satellite linear array SAR 3D backward projection super-resolution imaging algorithm with compressed sensing
Zhexian Liu, Shuai Shao, Hongwei Liu
Synthetic aperture sonar micronavigation with variational inference of a state-space model
Angeliki Xenaki, Yan Pailhas, Alessandro Monti
EXPLOITATION OF SINGLE-CHANNEL SPACE-BORNE SAR DATA FOR SHIP TARGETS IMAGING AND MOTION PARAMETERS ESTIMATION