Somatone VR Audio Tutorials – Unity+FMOD+GVR – Part 1 – Introduction
Introduction
Welcome to the first of a three-part series of in-depth tutorials about the art and technology of creating audio for virtual reality. In this first part, we give a general overview and talk about some of the key concepts of audio in 360-degree surround. In the following articles, we will go into the details of setting up that audio in FMOD, and then integrating that audio into Unity. We don’t leave out any of the steps, and even include the FMOD and Unity packages for you to play around with!
In this series of articles, we have created videos that demonstrate the pipeline for working with the Google VR (GVR) audio plugins in FMOD and Unity. We will be using the Viking Village Demo from Unity and a Vive to demonstrate, but FMOD and GVR will work with any device. This video is a collaboration between FMOD, Google and Somatone, and we would like to express our gratitude to everyone who helped to make these videos possible.
VR Audio Concepts
There are a few concepts I want to go over before we dive in, which I cover in this video.
The first concept is Ambisonics. Ambisonics is a technique for representing the full sphere of sound around a listener; this includes both the left-to-right panning of the horizontal plane and the up and down of the vertical plane. We will be using first-order ambisonic files. These are 4-channel files in which the channels do not correspond to speakers; for the purposes of VR, they should be decoded to a stereo output and heard on headphones.
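To make the "4 channels that are not speakers" idea concrete, here is a minimal sketch of how a single mono sample could be encoded into first-order ambisonics. It assumes the AmbiX convention (channel order W, Y, Z, X with SN3D normalization), which the article does not state; the function name and angle conventions are illustrative only.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into four first-order ambisonic channels.

    Assumption: AmbiX conventions (ACN order W, Y, Z, X, SN3D normalization).
    Azimuth is measured counterclockwise from straight ahead (positive = left),
    elevation upward from the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)  # left/right component
    z = sample * math.sin(el)                 # up/down component
    x = sample * math.cos(az) * math.cos(el)  # front/back component
    return (w, y, z, x)

# A source straight ahead only excites W and X; one directly above only W and Z.
front = encode_first_order(1.0, 0.0, 0.0)
above = encode_first_order(1.0, 0.0, 90.0)
```

Each channel describes a direction of the sound field rather than a loudspeaker feed, which is why the file has to be decoded before it can be heard.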
GVR in FMOD
The next concept is the different GVR plugins. There are 3 plugins in FMOD: the listener, source, and soundfield.
- The Listener takes information from both the Source and the Soundfield and applies binaural spatialization and room effects, then converts the information to a stereo output. This should be put on the master bus in FMOD.
- The Source takes mono sound files and applies distance attenuation and directivity patterns, then sends that information to the Listener.
- The Soundfield decodes first-order ambisonic files based on rotation and sends the result to the Listener.
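The "based on rotation" part of the Soundfield can be sketched as follows. This is not the plugin's actual code; it is a simplified illustration that handles only head yaw (turning left or right) and assumes the W, Y, Z, X channel order with azimuth measured counterclockwise. The function name is hypothetical.

```python
import math

def rotate_for_head_yaw(w, y, z, x, head_yaw_deg):
    """Counter-rotate a first-order sound field for a listener's head yaw.

    Assumptions: channels are W, Y, Z, X; positive yaw means the head turns
    to the left. When the head turns left, the world appears to turn right,
    so the field is rotated by the negative of the yaw angle.
    """
    phi = math.radians(-head_yaw_deg)
    x_rot = x * math.cos(phi) - y * math.sin(phi)
    y_rot = x * math.sin(phi) + y * math.cos(phi)
    # W (omni) and Z (up/down) are unaffected by a rotation about the vertical axis.
    return (w, y_rot, z, x_rot)

# A sound directly in front (W=1, X=1) ends up on the listener's right
# (negative Y) after the head turns 90 degrees to the left.
w, y, z, x = rotate_for_head_yaw(1.0, 0.0, 0.0, 1.0, 90.0)
```

A full implementation would also handle pitch and roll, but the principle is the same: the sound field stays fixed in the world while the listener's head moves inside it.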
GVR Effects
Spatial Audio in GVR is comprised of 3 effects:
- Interaural time differences – This is the difference in time from when one ear hears a sound to when the other ear hears that same sound.
- Interaural level differences – This is the difference in volume between one ear and the other.
- Spectral filtering – These are the changes our outer ear makes to a sound depending on the angle at which it reflects off of the outer ear and into our inner ear. This is the primary way we hear the elevation of sounds.
These effects, which our ears produce for us in the real world, must be recreated in a virtual world.
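To give a feel for the size of the first two effects, here is a rough sketch. It uses Woodworth's spherical-head approximation for the time difference, which is a textbook model and not necessarily what the GVR plugins use. The head radius, speed of sound, and function names are all assumptions chosen for illustration.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air

def interaural_time_difference(azimuth_deg):
    """Approximate ITD in seconds using Woodworth's spherical-head formula.

    Valid for sources in front of the listener, with azimuth between
    -90 and +90 degrees (0 = straight ahead). The sign follows the azimuth.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between -90 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))

def interaural_level_difference_db(near_ear_gain, far_ear_gain):
    """Level difference in decibels between two linear amplitude gains."""
    return 20.0 * math.log10(near_ear_gain / far_ear_gain)

itd_side = interaural_time_difference(90.0)            # roughly two thirds of a millisecond
ild_half = interaural_level_difference_db(1.0, 0.5)    # about 6 dB
```

The time differences involved are well under a millisecond, which is part of why they have to be applied by a dedicated spatializer and cannot be faked with simple stereo panning.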
Based on these effects, Google has created HRTFs, or head-related transfer functions, that recreate the effects, and packaged them into the GVR audio plugins so we can spatialize sounds in virtual reality. The plugins also include distance attenuation curves and can link to the GVR audio room, which will simulate early and late reflections (or reverb) in real time.
Additional GVR Settings
The plugins also have settings for directivity patterns, occlusion and spread.
- The directivity is the way the sound propagates from a source. The patterns work like virtual microphones and mimic the common pickup patterns: omni, cardioid and figure eight. It also includes a sharpness factor, which is the width of the directivity pattern.
- The occlusion factor simulates an object blocking the direct sound from a source. It affects high and low frequencies differently, mimicking how occlusion works in reality.
- The spread describes the width of the source the sound comes from. Sounds like gunshots and ricochets can come from small point sources and have no spread, but water sounds will come from wider sources and have higher spread.
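The directivity patterns in the list above can be illustrated with a common first-order pattern formula. This is a sketch of one widely used model, not a confirmed description of the plugin internals; the parameter names `alpha` and `sharpness` and the function name are assumptions made for this example.

```python
import math

def directivity_gain(angle_deg, alpha, sharpness=1.0):
    """Gain of a source toward a listener at angle_deg off the source's front.

    Assumed model: |(1 - alpha) + alpha * cos(angle)| ** sharpness
      alpha = 0.0 -> omni, 0.5 -> cardioid, 1.0 -> figure eight
      sharpness > 1 narrows the pattern.
    """
    theta = math.radians(angle_deg)
    base = abs((1.0 - alpha) + alpha * math.cos(theta))
    return base ** sharpness

omni_behind = directivity_gain(180.0, alpha=0.0)       # omni: same level in every direction
cardioid_behind = directivity_gain(180.0, alpha=0.5)   # cardioid: silent directly behind
figure8_side = directivity_gain(90.0, alpha=1.0)       # figure eight: (nearly) silent to the side
narrow_side = directivity_gain(90.0, alpha=0.5, sharpness=2.0)  # quieter than sharpness 1
```

In practice this means a character's voice can be made louder in front of them than behind them, simply by choosing a cardioid-like pattern and pointing the source the way the character faces.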
Final Thoughts – Key Concepts in VR Audio
Lastly, the videos above cover a few general rules for designing sound in a VR space to make your world seem more believable.
- Use mono source files, unless it is an ambisonic sound field or it is not going to be spatialized (e.g., non-diegetic music, internal player VO).
- Only use reverb in enclosed spaces
- Place audio sources as accurately as possible (voice sources should come from the mouth, not the center of the body)
- Use complex waveforms covering a wide spectrum – fundamental waveforms like sine waves do not exist naturally and do not spatialize well.
- If you want a listener to locate a source, play the sound multiple times, include higher frequencies (they are easier for our ears to pinpoint), and animate it. People have an easier time locating moving sources than stationary ones.
Stay Tuned…
We will be releasing the remaining parts of this tutorial series over the next few weeks. In the next installment, we go into the FMOD setup for different types of VR audio and how to use the GVR plugin. In the final two installments, we explain in detail how to integrate FMOD into Unity, including the projects themselves and the code required to get some really cool audio hooked up.