AudioMorphix: Training-free audio editing using diffusion models
As a training-free audio editor, the proposed AudioMorphix has the following features: (1) Tuning-free: AudioMorphix is a zero-shot editing method that requires no extra training to fit task-specific data; (2) Audio-referenced: instead of a text instruction, which can be ambiguous in some use cases, AudioMorphix takes an additional audio clip as the reference for editing; (3) Versatile: AudioMorphix is a universal framework capable of diverse editing tasks, including addition, removal, replacement, and style transfer; (4) Region-specific: AudioMorphix can edit a particular region of the audio spectrogram while keeping the rest unchanged.
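The region-specific property in (4) can be pictured with a small sketch. This is not the AudioMorphix implementation, which works with a diffusion model; it only illustrates, under the assumption that both spectrograms are plain NumPy arrays of shape (frequency bins, time frames), what "edit a region and keep the rest unchanged" means. The function name `blend_region` and its parameters are hypothetical.

```python
import numpy as np

def blend_region(original, edited, t_start, t_end, f_start=0, f_end=None):
    """Take `edited` inside a time-frequency box and `original` everywhere else.

    Hypothetical helper for illustration only; both inputs are assumed to be
    spectrograms of identical shape (n_freq, n_time).
    """
    if original.shape != edited.shape:
        raise ValueError("spectrograms must have the same shape")
    n_freq, _ = original.shape
    f_end = n_freq if f_end is None else f_end

    # Boolean mask that is True only inside the region to be edited.
    mask = np.zeros(original.shape, dtype=bool)
    mask[f_start:f_end, t_start:t_end] = True

    # Edited values inside the mask, untouched original values outside it.
    return np.where(mask, edited, original)

# Toy example: a silent 4x6 "spectrogram" and an all-ones edit candidate.
original = np.zeros((4, 6))
edited = np.ones((4, 6))
out = blend_region(original, edited, t_start=2, t_end=4)
```

In the toy example only time frames 2 and 3 take values from `edited`; every other cell is copied from `original` bit for bit.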
Demos on the addition task
Demos on the removal task
Demos on the replacement task
Demos on the time shift & stretching tasks
"Add a 4-second delay to the thunder sound."

"Stretch the acoustic guitar sound by 2.0 times and add a 0.2-second delay to the output."

"Stretch the automobile horn sound by 2.5 times."
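To make the instructions above concrete, here is a minimal sketch of what a delay and a stretch mean in terms of spectrogram frames. It is not how AudioMorphix performs these edits; it assumes a magnitude spectrogram stored as a NumPy array of shape (frequency bins, time frames), and the helper names `delay_frames` and `stretch_frames` as well as the frame rate are hypothetical.

```python
import numpy as np

def delay_frames(spec, delay_s, frames_per_second):
    """Delay a sound by prepending silent frames, keeping the clip length fixed."""
    n_freq, n_time = spec.shape
    n = min(int(round(delay_s * frames_per_second)), n_time)
    pad = np.zeros((n_freq, n), dtype=spec.dtype)
    # Content pushed past the end of the clip is truncated.
    return np.concatenate([pad, spec], axis=1)[:, :n_time]

def stretch_frames(spec, factor):
    """Stretch along time by linearly interpolating each frequency bin."""
    n_freq, n_time = spec.shape
    new_len = max(1, int(round(n_time * factor)))
    # Positions in the original frame axis sampled by the stretched output.
    src = np.linspace(0, n_time - 1, new_len)
    out = np.empty((n_freq, new_len), dtype=float)
    for k in range(n_freq):
        out[k] = np.interp(src, np.arange(n_time), spec[k])
    return out

# Toy spectrogram with 2 frequency bins and 4 frames, at an assumed 10 frames/s.
spec = np.arange(8.0).reshape(2, 4)
delayed = delay_frames(spec, delay_s=0.2, frames_per_second=10)
stretched = stretch_frames(spec, factor=2.0)
```

Because the interpolation acts only on the time axis, the frequency content of each frame is left where it was, which is the sense in which a stretch differs from a pitch shift.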
Demos on the pitch shift task
"Increase the pitch of the piano sound."

"Lower the pitch of the high-pitched sound."

"Increase the pitch of the woman’s speech."
Page updated on 1 Aug 2025.