
If I may insert a relevant plug: we (MERL) put out a paper last week reporting a SOTA 7.0% WER on LibriSpeech test-other (vs. wav2letter@anywhere's 7.5%) with 590 ms theoretical latency, using a joint CTC-Transformer with parallel time-delayed LSTM and triggered attention. Check it out: https://arxiv.org/abs/2001.02674


Correction: there is no PTDLSTM in this paper; the encoder uses time-restricted self-attention (duh). PTDLSTM was our previous encoder setup, published at ASRU.



