Directional audio coding (DirAC) is a technique for various tasks in spatial sound reproduction. It is based on spatial impulse response rendering (SIRR) and shares the same principles and, in part, the same methods. The processing can be divided into three steps:
Analysis: the sound signal is divided into frequency bands using a filter bank or the STFT. The direction of arrival and the diffuseness of the sound in each frequency band are then estimated as functions of time.
Transmission: a mono channel is transmitted together with the directional information, or, in applications targeting the best quality, all recorded channels are transmitted.
Synthesis: the sound in each frequency band is first divided into diffuse and non-diffuse streams. The diffuse stream is reproduced with a method that produces a maximally diffuse perception of sound, and the non-diffuse stream with a technique that produces as point-like a perception of the sound source as possible.
Synthesis can be implemented in various ways, depending on the microphone technique, the transmission type, and the reproduction system.
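As an illustration of the analysis step, the sketch below estimates per-band azimuth and diffuseness from a B-format signal via the active intensity vector. The STFT parameters, the sign convention of the intensity-based direction estimate, and the diffuseness normalization are assumptions for illustration, not the exact formulation of the DirAC publications.

```python
import numpy as np

def dirac_analysis(W, X, Y, Z, nfft=1024, hop=512):
    """DirAC-style analysis of a B-format signal (sketch).

    Returns per-frame, per-bin azimuth (radians) and diffuseness
    in [0, 1], estimated from the active intensity vector.
    """
    win = np.hanning(nfft)
    az, psi = [], []
    for n in range(0, len(W) - nfft, hop):
        # STFT of the four B-format channels
        Wf = np.fft.rfft(win * W[n:n + nfft])
        Xf = np.fft.rfft(win * X[n:n + nfft])
        Yf = np.fft.rfft(win * Y[n:n + nfft])
        Zf = np.fft.rfft(win * Z[n:n + nfft])

        # Active intensity per bin, Re{W* . [X, Y, Z]}.  With the
        # channel signs assumed here it points toward the source;
        # sign conventions differ between B-format definitions.
        Ix = np.real(np.conj(Wf) * Xf)
        Iy = np.real(np.conj(Wf) * Yf)
        Iz = np.real(np.conj(Wf) * Zf)
        az.append(np.arctan2(Iy, Ix))

        # Diffuseness: one minus the ratio of net intensity to
        # energy density, clipped to [0, 1].  The exact
        # normalization depends on the B-format gain convention.
        E = 0.5 * np.abs(Wf) ** 2 \
            + 0.25 * (np.abs(Xf) ** 2 + np.abs(Yf) ** 2 + np.abs(Zf) ** 2)
        In = np.sqrt(Ix ** 2 + Iy ** 2 + Iz ** 2)
        psi.append(1.0 - np.minimum(1.0, In / (E + 1e-12)))
    return np.array(az), np.array(psi)
```

For a single plane wave, the estimated azimuth matches the source direction and the diffuseness is near zero; for an ideal diffuse field, the net intensity cancels and the diffuseness approaches one.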
Reproduction of B-format recordings. Demos are available for a 5.0 loudspeaker setup. Traditionally, B-format recordings are reproduced using, e.g., Ambisonics, which produces coherent loudspeaker signals; this blurs the spatial image and shrinks the optimal listening area. In DirAC, this coherence is avoided in both the diffuse and the non-diffuse reproduction, which yields less blurring and a larger listening area.
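A minimal sketch of the synthesis idea for loudspeaker reproduction, assuming per-bin azimuth and diffuseness estimates are already available: a normalized cosine-lobe panning gain stands in for VBAP in the non-diffuse stream, and random-phase spreading stands in for proper decorrelation filters in the diffuse stream.

```python
import numpy as np

def dirac_synthesis_frame(Wf, az, psi, ls_az):
    """Render one STFT frame of the omni (W) channel to a
    loudspeaker ring as non-diffuse plus diffuse streams (sketch).

    Wf    : complex spectrum of the omni channel
    az    : per-bin azimuth estimates (radians)
    psi   : per-bin diffuseness in [0, 1]
    ls_az : loudspeaker azimuths (radians)
    """
    n_ls = len(ls_az)
    # Non-diffuse stream: cosine-lobe panning gains per speaker,
    # normalized per bin so the squared gains sum to one
    # (a simple stand-in for VBAP).
    G = np.maximum(0.0, np.cos(az[None, :] - np.asarray(ls_az)[:, None]))
    G = G / np.sqrt(np.sum(G ** 2, axis=0, keepdims=True) + 1e-12)
    out = G * (np.sqrt(1.0 - psi) * Wf)[None, :]

    # Diffuse stream: equal level to all loudspeakers with random
    # phase so the signals are mutually incoherent (a crude
    # stand-in for decorrelation filters).
    rng = np.random.default_rng(0)
    phase = np.exp(1j * rng.uniform(0.0, 2 * np.pi, (n_ls, len(Wf))))
    out = out + (np.sqrt(psi) * Wf)[None, :] / np.sqrt(n_ls) * phase
    return out
```

Because the panning gains are energy-normalized and the diffuse stream is split evenly, the total loudspeaker energy matches the input energy in both the fully non-diffuse and the fully diffuse cases.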
Transmission of spatial information as a side band to a mono signal in teleconferencing. Demos are available for a 5.0 loudspeaker setup. The microphone setup is a custom B-format microphone composed of four miniature capsules. Sound is transmitted as a mono signal, with a narrow side band containing the azimuth direction for each frequency band as a function of time.
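The side band can be sketched as a coarse quantization of the per-band azimuths; the 6-bit resolution below is an arbitrary assumption for illustration, not a rate taken from the DirAC publications.

```python
import numpy as np

def encode_side_info(az, bits=6):
    """Quantize per-band azimuths (radians) into small integers
    for a low-rate side channel (sketch)."""
    levels = 2 ** bits
    step = 2 * np.pi / levels
    # Map [-pi, pi) onto 0..levels-1, wrapping at the seam
    return np.round((az + np.pi) / step).astype(int) % levels

def decode_side_info(idx, bits=6):
    """Map the quantizer indices back to azimuths in radians."""
    step = 2 * np.pi / (2 ** bits)
    return idx * step - np.pi
```

With 6 bits the round-trip error is at most half a quantizer step, i.e. under about 3 degrees per band.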
The teleconferencing demos also show the application of DirAC as a new type of directional microphone for noisy recording environments, implemented by reproducing only the sound arriving from the direction of the speech source. Even when the SNR drops from 0 dB to -25 dB, speech remains somewhat intelligible, although the reproduced speech signal contains a lot of distortion.
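This directional-microphone behaviour can be sketched as a per-band gain derived from the direction estimates: bands whose azimuth lies near the target direction pass through, and the rest are attenuated. The beam width and the roll-off curve below are assumptions for illustration.

```python
import numpy as np

def directional_gain(az, target_az, width=np.radians(30)):
    """Per-band gain that keeps sound arriving near target_az and
    attenuates the rest (sketch of the 'directional microphone'
    use of DirAC parameters)."""
    # Wrapped angular distance between estimate and target
    d = np.abs((az - target_az + np.pi) % (2 * np.pi) - np.pi)
    # Unity inside the beam, smooth squared-cosine roll-off outside
    return np.where(d < width, 1.0, np.maximum(0.0, np.cos(d - width)) ** 2)
```

Multiplying each band of the mono signal by this gain before synthesis suppresses sound from other directions; hard gating like this also explains the distortion heard at very low SNRs.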
Upmixing of stereo files to multichannel files. The stereophonic file is recorded with a simulated B-format microphone under simulated anechoic conditions, and the sound can then be decoded to arbitrary reproduction systems.
Virtual-world sound synthesis can be enhanced with the parametric tools of DirAC. In this case, the direction of arrival and diffuseness are defined by the sound designer or the audio engine and the spatial sound is synthesized accordingly. For this purpose, novel algorithms have been developed that include: virtual-source positioning, synthesis of spatial extent, and efficient synthesis of reverberation.
| Selected publications | Short description |
| --- | --- |
| Laitinen, M.-V., Pihlajamäki, T., Erkut, C., and Pulkki, V., "Parametric time-frequency representation of spatial sound in virtual worlds", ACM Transactions on Applied Perception, 9(2), 2012. | Describes how the parametric spatial audio tools of DirAC can be applied to virtual-world applications. |
| Del Galdo, G., Taseska, M., Thiergart, O., Ahonen, J., and Pulkki, V., "The diffuse sound field in energetic analysis", The Journal of the Acoustical Society of America, 131, 2141, 2012. | Addresses different methods to measure the diffuseness of a sound field for DirAC. |
| Ahonen, J., Del Galdo, G., Kuech, F., and Pulkki, V., Journal of the Audio Engineering Society, 2012. ISSN 0004-7554. http://www.aes.org/e-lib/browse.cfm?elib=16322 | Derives methods to utilize microphone arrays with a rigid object inside for parametric spatial audio processing. |
| Laitinen, M.-V., Kuech, F., Disch, S., and Pulkki, V., "Reproducing Applause-Type Signals with Directional Audio Coding", J. Audio Eng. Soc., 59(1/2), 2011. | Surrounding applause-type signals are very hard for many parametric spatial audio reproduction methods. The article shows that such coding is nevertheless possible, although it requires a very fine time-frequency resolution. |
| Vilkamo, J., Lokki, T., and Pulkki, V., "Directional audio coding: Virtual microphone-based synthesis and subjective evaluation", J. Audio Eng. Soc., 57(9), 2009. | Presents the use of virtual microphones in DirAC processing; extensive listening tests show that DirAC produces very good quality. |
| Pulkki, V., Laitinen, M.-V., and Erkut, C., "Efficient spatial sound synthesis for virtual worlds", AES 35th International Conference, London, UK, February 11-13, 2009. | Shows the use of DirAC in virtual-world audio rendering: positioning virtual sources, controlling their spatial extent, and providing reverberation efficiently. Recorded spatial sound scenes can also easily be augmented with virtual sources. |
| Laitinen, M.-V., and Pulkki, V., "Binaural reproduction for directional audio coding", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, USA, October 18-21, 2009. | Shows that DirAC provides a nicely externalized perception of spatial sound when head tracking is used in headphone listening. |
| Pulkki, V., "Directional audio coding in spatial sound reproduction and stereo upmixing", AES 28th International Conference, Piteå, Sweden, June 2006. | Presents the use of DirAC in high-fidelity reproduction of B-format recordings, together with the idea of using DirAC in stereo-to-multichannel upmixing. |
| Pulkki, V., and Faller, C., "Directional audio coding: Filterbank and STFT-based design", 120th AES Convention, Paris, France, May 20-23, 2006. Paper 6658. | Presents the use of DirAC in teleconferencing, with some discussion on the selection of time-frequency analysis methods for different applications. |