A Survey of Spoken Dialogue Models (60 pages)
streaming duplex speech moshi speech-representation encodec gpt-4o speech-language-model spoken-dialogue-models modal-alignment intreaction mini-omni llama-omni wavtokenizer
A curated list of full-duplex spoken dialogue models & benchmarks
FastLongSpeech is a framework that extends Large Speech-Language Models to process long speech efficiently without requiring dedicated long-speech training data.