What is Vienna MIR Pro?
Vienna MIR Pro is an innovative, highly integrated software package for mixing, spatialization (distribution in space) and reverberation of virtual orchestral instruments or any other kind of audio signal. Vienna MIR Pro is meant to be used as a mixing front-end, much like you would use a mixing console in the analogue world. Vienna MIR Pro offers integrated sample players, advanced stage positioning possibilities, sample-based reverb (using so-called impulse responses), and individually assignable signal processing.
Its main goal is a fast and intuitive, yet highly realistic approach to the realization and mixdown of virtual orchestral music as well as any other kind of digital audio production.
Vienna MIR Pro's main components and core technologies are:
- a host and MIDI interface for Vienna Instruments (Vienna Ensemble Pro)
- unique instrument-conscious (signal dependent) directivity handling
- a mixing and spatialization tool for any kind of digital audio source on detailed virtualizations of real halls and stages
- a host for 3rd-party VSTi's (virtual instruments adhering to the VST and / or AU standards)
- a host for VST / AU effect and signal processing plug-ins with full latency compensation
- a convolution-based reverberation tool, derived from world class orchestral venues
- an advanced multi-format mixing engine with instrument-specific, hand-tuned presets
- an inviting, hands-on graphical user interface
- LAN capabilities and direct DAW integration
- a built-to-the-task algorithmic reverb add-on for hybrid reverberation and sound design
Philosophy and underlying concepts of Vienna MIR Pro
Vienna MIR Pro is designed to be self-explanatory. In spite of the complex underlying technology, you can simply look at a stage and interact with the musicians you invited to play on it, without the need to use any technical abstractions like faders, buttons, numbers or editors.
Of course, on certain occasions it may be helpful to have all the means for detailed manipulation of parameters. In these cases, a bit of background information may be helpful.
This little add-on to MIR Pro's main manual supplies you with the ideas and technical concepts of Vienna MIR Pro, for a more profound understanding of what is going on behind the curtain - and why.
You may skip reading this add-on manual if you are eager to start making music in beautiful virtual rooms right away. But make sure to come back later, because knowledge is power. :-)
Welcome to Reality
"The former New York Times film critic, Vincent Canby, wrote, 'All of us have different thresholds at which we suspend disbelief, and then gladly follow fictions to conclusions that we find logical.' Any recording is a 'fiction,' a falsity, even in its most pure form. It is the goal of the recording engineer and producer to create a universe so compelling and transparent that the listener isn't aware of any manipulation." (source)
The same is true for virtual orchestration, where the abstraction inherent in any recording is compounded by the virtual origin of its sounds.
In a nutshell: The beauty achieved by any recording - be it a "real" or a virtually created one - is in the eyes (in the ears!) of the beholder. Let's fool them.
The Idea of Vienna MIR Pro - Room is more than just "Reverb"
Music – and especially orchestral music – is more than just adding one single instrument to another. Of course, the proper mixture between them is crucial to making an arrangement work. But we still miss some decisive elements: The positions of single instruments on a stage, the size of the stage itself and its surrounding auditorium, and of course the spatial relationships between the instruments. Due to the nature of sound, all these components tend to interact with each other, and it's only after this interaction takes place that the human ear identifies a bunch of sound waves as "orchestral music".
An orchestra is all about space.
Unless we are locked in an anechoic measurement facility (a very unpleasant experience, by the way), any sound source will provoke reflections from any surface the soundwaves encounter while travelling through the air. Together with the direct signal, the human ear needs these so-called "early reflections" to gather information about the position of the sound source and its distance.
After a short time (a few hundred milliseconds), a myriad of those reflections join into a dense swoosh of sound – a phenomenon we perceive as reverb. Reverb enables us to get a distinctive auditory image from the enclosing ambience: its size, its form, and the kind of surfaces it encompasses.
The need to add an artificial sense of room to a recorded signal is at least as old as multitrack recording technology itself. Starting with the use of so-called "echo chambers" (real rooms equipped with a loudspeaker and a microphone), audio engineers soon invented "reverb out of the box". Mechanical devices like springs and plates paved the way for elaborate digital reverberation devices from the early 1980s onward. Those machines – still highly esteemed – finally came pretty close to generating believable acoustic equivalents to real rooms.
The algorithms those devices rely on are very advanced nowadays. Developers managed to get rid of dreaded phenomena like metallic resonances and echo-like loops. The results are perfect – actually, they're too perfect for our needs. While this suits "unreal" music production of the kind considered state of the art in pop music, for example, synthetic reverb is not the final solution for virtual orchestration. No real room, and especially no real orchestral hall, would sound so uniform and character-free.
Every room has a voice of its own, with its own characteristics, its imperfections – one could even say: Its magic. Even more, this holds true for every spot on the stage, for every corner of its auditorium.
Therefore, what we are striving for is the "perfect picture of the imperfect world".
Sampling and Multisampling
As an avid user of the Vienna Symphonic Library's software instruments, Vienna Instruments and Vienna Instruments Pro, you are familiar with the concept of sampling: Based on recordings of either single notes or well-defined phrases of real instruments, it takes just a clever piece of software to re-combine these bits and pieces into a truly playable musical instrument. The more single samples the software can rely on, the higher the perceived realism will be.
Similar to the sampling of instruments, it is possible to sample reverb and rooms (and actually any kind of linear acoustic system). A well-defined test signal (theoretically an impulse, practically a sine sweep) is sent into the room through a loudspeaker. The "response" of the room to this impulse is recorded. This is the so-called impulse response (in short: IR). Subsequently, any signal can be put into the position of the original impulse, using a mathematical process called "convolution".
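To make the idea of convolution a little more tangible, here is a minimal sketch in plain Python. It is not how MIR Pro computes anything: the function name `convolve` and the five-sample `toy_ir` are made up for illustration, and a real hall IR is several seconds (hundreds of thousands of samples) long.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            # Every input sample triggers a scaled, delayed copy of the IR.
            out[n + k] += s * h
    return out


# Hypothetical toy "room": direct sound followed by two decaying reflections.
toy_ir = [1.0, 0.0, 0.5, 0.0, 0.25]

# Sending a unit impulse through the room returns the IR itself.
assert convolve([1.0], toy_ir) == toy_ir

# Any other dry signal picks up the same pattern of reflections.
dry = [1.0, -1.0]
wet = convolve(dry, toy_ir)
print(wet)
```

The nested loop also hints at why convolution is so demanding: the cost grows with the product of signal length and IR length. Real-time engines generally rely on faster FFT-based schemes instead of this direct form.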
Ideally, this signal will now sound exactly as if it were recorded instead of the impulse from the very beginning! – In reality, there are a few problems, though:
First of all, convolution is an extremely demanding task even for the most recent computers – and even more so if it is done properly.
Secondly, "room" is all about three dimensions. Any signal source will sound different from any possible position within one and the same room; and it will sound different again from any of the possible listener's positions.
And thirdly, each signal source, i.e., every individual instrument, every sound source will act differently within a room.
How could one single IR cover all those aspects?
Multi Impulse Responses
Imagine the sound of a Bösendorfer Imperial Grand. Do you think it would be sufficient to record a single Middle C to make for a convincing virtual instrument? Of course not. As we already agreed upon, you have to go for individual samples from as many keys as possible, sampled in as many velocities as the human ear is able to distinguish. The same holds true for sampling a room.
What the industry has worked with up to now are single samples from an acoustic entity much larger than even the biggest instruments. There's no way you can "play" a room like that. But this is what great rooms are all about – they want to be played by musicians, conductors, and arrangers just like any other instrument.
This is the reason why the Vienna Symphonic Library started to record multi-samples from great musical venues: Multi Impulse Responses.
In fact, Vienna MIR Pro is much more than just "a multi-sample" of a hall – and this is where it surpasses any other convolution reverb available on the market by far. Vienna MIR Pro is multi-source, multi-directional, multi-positional, and multi-format.
Because, as we said before: every room has its own unique voice.
Technical Background – What works How and Why
Let's see (or should we say: let's hear!) what happens when you place a Vienna Instrument, e.g., the solo horn, on Vienna MIR Pro's virtual stage of a given concert hall. First of all, the stage position triggers the selection of one or more sets of 8 impulses (6 for horizontal directions, 2 for upward and downward directions). Equally important, the directivity characteristics of each instrument are applied before the convolution of impulses, so that the result depends on the frequency distribution and the volume an instrument is emitting in various directions. A horn, directed to the rear, obviously has a different spatial frequency profile than, e.g., the frontally blaring trumpet. The MIR Pro engine calculates all of this in real-time, and what you get is – a solo horn that sounds exactly as if it were playing on that very spot on the stage.
While all this should seem like a completely effortless procedure to the user, it wouldn't be possible without a unique combination of both audio and computing techniques. Some of those are entirely original developments implemented exclusively by the Vienna Symphonic Library.
These are Vienna MIR Pro's core technologies:
- Positional IR-Prerendering and Dynamic Processing
- Spatial Mixing
- Source-aware processing relying on Instrument Profiles, esp. Instrument Directivity Profiles
- MIDI controlled VSTi / AU Instruments hosting
- VST / AU plug-in hosting
- Audio input processing
- Innovative GUI solutions
As you are no doubt aware by now, we are talking about quite a complex equation here:
Vienna MIR Pro = Virtual Instruments Host + Audio Mixer + Spatial Positioning + Multi-IR Based Convolution Reverb
The following section will give you an overview of how MIR Pro handles these very diverse tasks.
Sampling and Convolution
We've covered the basic concepts of sampling and convolution in the preceding paragraphs of this chapter. In short, sound sampling is used to virtualize instruments based on their actual recordings. Convolution (in our context) is used to virtualize the real rooms in which these virtual instruments are playing by way of "samples" gathered from these rooms. These samples are called IRs – Impulse Responses (see also http://emusician.com/tutorials/emusic_acting_impulse/).
Convolution is the most CPU-intensive part of MIR Pro's engine. Only due to a proprietary method we call "Positional IR Prerendering" is it possible to hear the auditory effects of hundreds, sometimes thousands of convolutions in (close to) real time. We'll come back to this a few paragraphs below.
The sampled virtual instruments mostly tax the hard-disk's data throughput and of course the available RAM, which they have to share with the loaded impulse responses. IRs can't be streamed from disk like instrument samples; they have to reside completely in RAM.
Ambisonics is the third cornerstone upon which MIR Pro is founded. Ambisonics is crucial for MIR Pro to work at all, which is why a basic understanding of this advanced audio format will help a lot to grasp MIR Pro's vast possibilities.
Ambisonics relies on a meta-audio-format which is not meant to be listened to directly. It allows for decoding of an almost limitless number of actual audio formats, be it broad or narrow stereo, different surround formats, or any other multi-channel format. By defining "virtual" microphones, a dedicated sonic behaviour can be assigned to each channel: The polar patterns as well as the angles of those microphones with regard to the input signal can be controlled after the actual recording.
To achieve this, an Ambisonics signal has to be based on four channels (W, X, Y, Z). (To be more precise, we would have to distinguish between Ambisonics A, the "raw" recording format, and Ambisonics B, the decodable audio format we are talking about here, which is derived from the raw Format A. If you want to learn more, please refer to the large amount of online literature concerning this topic, e.g. http://en.wikipedia.org/wiki/Ambisonics.) The W channel is the signal's non-directional mono component, corresponding to the output of an omni-directional microphone. The X, Y and Z channels are the directional components in three dimensions. If you have an audio-engineering background, you might think of it as "three-dimensional M/S", or even better a "three-dimensional Blumlein" microphone array.
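The following sketch shows how the four channels and a "virtual" microphone fit together. It uses the textbook first-order formulas with the traditional convention of attenuating W by 1/√2. Whether MIR Pro uses this exact normalisation is not stated here, so treat the scaling, the function names and the angle convention (azimuth counted counter-clockwise from the front, so +90° is left) as assumptions.

```python
import math


def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample arriving from a given direction into W, X, Y, Z."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                  # omni component
    x = sample * math.cos(az) * math.cos(el)     # front/back
    y = sample * math.sin(az) * math.cos(el)     # left/right
    z = sample * math.sin(el)                    # up/down
    return w, x, y, z


def virtual_mic(b_format, azimuth_deg, elevation_deg=0.0, pattern=0.5):
    """Decode a virtual microphone: pattern 1.0 = omni, 0.5 = cardioid, 0.0 = figure-8."""
    w, x, y, z = b_format
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    directional = (x * math.cos(az) * math.cos(el)
                   + y * math.sin(az) * math.cos(el)
                   + z * math.sin(el))
    return pattern * math.sqrt(2.0) * w + (1.0 - pattern) * directional


# A source hard left, picked up by two cardioids facing left and right.
source = encode_b_format(1.0, azimuth_deg=90.0)
left_mic = virtual_mic(source, azimuth_deg=90.0, pattern=0.5)
right_mic = virtual_mic(source, azimuth_deg=-90.0, pattern=0.5)
print(round(left_mic, 6), round(right_mic, 6))
```

Note that both microphones are derived from the same four channels after the fact. This is what allows polar patterns and angles to be changed long after the recording was made.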
Ambisonics is a 360-degree, full-sphere sound recording, synthesis, and playback system. It is capable of accurately recording, processing, and playing back sounds from left/right, front/back, and up/down. Incorporating the vertical dimension makes Ambisonics a true periphonic, or surrounding, sound reproduction system instead of an artificial 2D representation spread out over multiple loudspeakers. Ambisonics creates the aural or sonic impression of a physical, three-dimensional space (quote).
To top it all, when using Vienna MIR Pro we are not limited to those spots that were used for impulse recording in the first place. Ambisonics allows for the seamless interpolation of each and every point within the covered area of a stage. Only the so-called off-stage "HotSpots" are limited to a single position.
Positional IR Prerendering and Dynamic Processing
Computers have become incredibly powerful over the years. But when it comes to intensely demanding tasks like convolution, it is still impossible to have more than a thousand of these processes computed in real time – which is a pity, because as we said before the whole MIR Pro concept relies on multi-sampled rooms.
Why that many convolutions, you might ask? Let's see: Assuming that we have decided to record 40 positions from a big stage, this number has to be multiplied by 8 for each direction we've sent the impulse to (0°, 60°, 120°, 180°, 240°, 300°, to the floor and to the ceiling), which adds up to 320 IR recordings. As we've discussed before, every IR is captured (and consequently processed) in 4-channel Ambisonics, which means that in our example the whole multi impulse response set from a single microphone position consists of 1,280 individual IRs.
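For those who like to see the numbers spelled out, the arithmetic of this example looks as follows (the variable names are ours, the figures are the ones used above):

```python
positions = 40            # IR source positions recorded on the stage
directions = 6 + 2        # six horizontal 60-degree sectors, plus floor and ceiling
ambisonics_channels = 4   # W, X, Y, Z

ir_recordings = positions * directions
individual_irs = ir_recordings * ambisonics_channels

print(ir_recordings, individual_irs)
```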
To overcome the aforementioned limitations, Vienna MIR Pro introduces a proprietary method to take much of this burden off the CPU: Positional IR Prerendering. Cutting down the number of necessary convolutions to a bare minimum, it still allows for the same acoustical results one would achieve by using all those 1,280 IRs.
Without going too far into detail, the principle behind it is easy to understand. Each time an audio source is put on the stage using the MIR Icon, the MIR Pro engine looks up all the IRs actually needed in the complete set and prepares a momentary set of individual IRs for this very signal. The signal's exact position, stereo width and direction, as well as all instrument-specific meta-data derived from the assigned Instrument Profile and the chosen Output Format are all taken into account. Prerendering takes just a few milliseconds, so every move or change applied to an audio signal is reflected almost instantly in a freshly rendered impulse response set.
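Positional IR Prerendering itself is proprietary and not documented here, so the sketch below does not claim to reproduce it. It only illustrates one general property that makes this kind of shortcut possible: convolution is linear, so a weighted mix of several convolutions gives the same result as a single convolution with the weighted mix of the IRs. The three short `directional_irs` and the scalar `gains` are invented for the demonstration.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


def prerender(irs, gains):
    """Fold several equal-length IRs into one by a weighted sum (hypothetical helper)."""
    return [sum(g * ir[i] for g, ir in zip(gains, irs))
            for i in range(len(irs[0]))]


# Three made-up directional IRs (a real set would have eight per position).
directional_irs = [[1.0, 0.5, 0.0], [0.0, 0.25, 0.5], [0.5, 0.0, 0.25]]
gains = [1.0, 0.5, 0.25]   # stand-in for direction-dependent weighting
dry = [1.0, 2.0, -1.0]

# Brute force: one convolution per direction, mixed afterwards.
brute = [0.0] * (len(dry) + len(directional_irs[0]) - 1)
for g, ir in zip(gains, directional_irs):
    for i, value in enumerate(convolve(dry, ir)):
        brute[i] += g * value

# Prerendered: mix the IRs first, then convolve only once.
fast = convolve(dry, prerender(directional_irs, gains))

print(brute)
print(fast)
```

In this toy case three convolutions collapse into one. The IR mix only has to be recomputed when the source is moved, not for every block of audio.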
Large orchestral arrangements will still have to rely on a few hundred IRs, but that is something the latest generation of CPUs is already able to handle. Combined with MIR Pro's Dynamic Processing feature, which switches off any unused convolution task as long as no signal has to be processed, we are able to use today's computers as if they were coming from ten years in the future.
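How Dynamic Processing decides when a convolution may be switched off is not described here. As a rough sketch of the idea, the hypothetical `GatedConvolver` below processes audio block by block and skips the work only when the incoming block is silent and the reverb tail of earlier blocks has fully rung out. The class name, the threshold and the block handling are all assumptions.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


class GatedConvolver:
    """Block-wise overlap-add convolver that idles while there is nothing to do."""

    def __init__(self, ir, threshold=1e-6):
        self.ir = ir
        self.threshold = threshold
        self.tail = [0.0] * (len(ir) - 1)   # reverb tail carried into the next block
        self.skipped_blocks = 0

    def _is_silent(self, samples):
        return all(abs(s) <= self.threshold for s in samples)

    def process(self, block):
        if self._is_silent(block) and self._is_silent(self.tail):
            self.skipped_blocks += 1        # no input, no tail: skip the convolution
            return [0.0] * len(block)
        full = convolve(block, self.ir)
        for i, t in enumerate(self.tail):   # add what is left over from earlier blocks
            full[i] += t
        self.tail = full[len(block):]
        return full[:len(block)]


convolver = GatedConvolver([1.0, 0.0, 0.5])
blocks = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
outputs = [convolver.process(b) for b in blocks]
print(outputs, convolver.skipped_blocks)
```

The second block is silent at the input but still has to be processed, because the tail of the first block is still sounding. Only the third block can be skipped.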
Instrument Directivity Profiles
Until now, most (if not all) approaches to virtual spatialization have taken it for granted that any signal could be regarded as an omni-directional source. This would mean that an instrument sounds the same irrespective of the direction it is heard from. In fact, quite the opposite is true.
MIR Pro covers directionality (i.e., "room") both from the listener's perspective (the microphone) as well as from the signal source's perspective (the instrument). We've already seen how Ambisonics is used for a completely flexible main microphone setup. The way to achieve the same flexibility regarding an instrument's position is partly covered by the Multi Impulse Responses themselves, of course – but the MIR Pro engine needs to know much more about the source signals to make full use of this huge amount of spatial information.
This is why we implemented detailed, individual Instrument Directivity Profiles for (almost) every Vienna Instrument. In addition we supply "General Purpose" Profiles for the use with any audio signal MIR Pro is able to process.
The underlying data was gathered over (literally) years of extensive research and development, and is now saved within so-called Instrument Profiles. They are not directly visible to the user, but selected from a list in accordance with the required Vienna Instrument's samples.
Why all the effort? Just look at the raw measurement results of two common instruments on the previous page – a flute and an alto trombone.
Each curve shows the instrument's unweighted frequency profiles as emitted by the instrument in 60° sectors in an unreflective measurement room (including ceiling and floor). It is clearly visible that the flute is anything but an "omni-directional" source. Now compare this profile with the trombone shown on the right. The differences are more than obvious – and of course this also holds true for any other instrument.
Measured with the aid of our newly developed method (based on sectorized microphone swarms), we gathered an enormous database of frequency profiles for all kinds of instruments (and other sources). Taking into account these directivity-dependent changes in sound, we can now supply MIR Pro with a direction-dependent acoustic fingerprint. This is made possible by the way we recorded the multi impulse sets of each room: As mentioned before, the impulses were sent into the room in the same sectorized way we used for measuring the instruments – in 60° steps, plus the room's ceiling and floor.
To enhance the achievable authenticity even more, the raw profiles were weighted regarding an instrument's typical playing styles and the average distribution of pitches we gathered statistically from analyzing dozens of exemplary orchestral music scores.
Apart from directivity information, Instrument Profiles contain data about other aspects, too:
- Natural Volume
- The Stereo Width inherent in the original recording
- Instrument and / or ensemble size
- Natural timbres and possible changes to them
- Typical playing techniques and ways of sound production
In combination with all these aspects saved in an Instrument Profile, the directivity patterns greatly enhance the possibilities of MIR Pro, and thus the achievable realism of a virtual orchestral performance.
Working with an orchestra in its natural environment may seem a self-evident task: a big stage in an acoustically suitable hall. Nevertheless, in our work with virtual orchestras we have unintentionally grown used to accepting all kinds of technical abstractions of this logical model. We're dealing with mixer channels, pan-pots and faders, equalizers, auxiliary sends and reverb engines – only to achieve an effect that "just happens" in the real world: The impression of depth and room, the perceptibility of the instruments' positions and their mutual relation.
Vienna MIR Pro reintroduces the "natural" environment to the virtual domain, so that we finally are in the lucky position to forget about most of the annoying (or at least tedious) detours. The MIR Pro engine offers a holistic approach to the spatialization of virtual orchestras, allowing the user to interact with the players more like a conductor than an engineer. MIR Pro is all about "room", and both the interface and the underlying processes serve this task alone.
How Vienna MIR Pro Venues are created
A MIR recording session needs serious planning and logistics to work out properly.
When we decide to "sample" a hall for use within MIR Pro, there are quite a few things to consider. It may take up to two 24-hour days to capture a big hall in its full depth and glory, which is why rental fees are a major point – the best orchestral venues on this planet certainly don't come for free. Other things that have to be clarified in advance are legal issues as well as the actual pure logistics stuff, like vehicle access to the site, the availability of in-house staff, and so on.
The actual time we have to schedule depends on a very diversified set of factors: Obvious ones like the size of the stage (and thus the number of necessary IR-source positions on and off stage), or the availability of the hall due to other bookings, as well as less obvious ones, like the isolation of the hall against environmental noise, weather conditions, or simply the availability of a good control room.
Once the equipment is set up, several experienced teams are needed to keep the recording process going around the clock. Usually, the whole MIR task force consists of about 10 high-profile recording engineers who are able and willing to work non-stop. (… please don't ask about the amount of coffee involved in sessions like that – it's about as many cups as the number of impulse responses recorded.)
The number of necessary IR positions differs from room to room. They are calculated in advance on the basis of floor plans, but it takes several hours to determine and align the real-world positions with the help of laser triangulation. Maybe it's hard to believe, but errors within a range of merely a centimetre may have devastating effects on the results in a complex system like Vienna MIR Pro. At the same time, every effort is made to keep the room as quiet as possible. Supported by the local staff, we try to get rid of any disturbing noise components, like air conditioning, dimmers, lights, natural air flow or environmental noise through leaky windows, and so on. – There was a case where we had to hunt down a catering refrigerator humming along three rooms away.
Schedule-wise, there's always some overtime taken into account, just in case we would have to react to unexpected problems. More often than not, this eventually means that we have spare time at hand. We use it for recording acoustically interesting but not necessarily "important" positions (most of the off-stage "HotSpots" are the result of these "spare" sessions).
Setting Up The Equipment
The audio system we employ for recording our multi impulse responses has become highly refined over the years. It combines perfect audio fidelity with the highest possible mobility.
Up to four dedicated Ambisonics microphone arrays (hand-crafted prototypes built to the task by the specialists from AKG Acoustics GmbH in Vienna) are put into carefully chosen positions. They are connected to remote-controlled microphone pre-amps to keep the analogue signal path as short as possible. The signals are converted to 96 kHz/24 bit digital audio streams directly after the pre-amp. The actual recording is done in 96 kHz/32 bit floating point, and all raw data are kept in this format throughout the entire post-processing until the final delivery in 44.1 kHz/24 bit.
The other decisive component is the loudspeaker which emits the impulse (actually a long sine sweep) into the hall. It has to fulfill special prerequisites since we're aiming for sectorized impulse responses, so that there are very few models on the market that comply with our demands. In addition to that, orchestral halls are huge, so we need considerable amplification power to guarantee distortion-free reproduction.
The loudspeaker is mounted on a unique moveable rack which can be rotated precisely by a stepper motor. This allows us to direct our impulses into the room in perfect 60° steps. Only the turns to the ceiling and the floor have to be done manually.
Depending on a hall's size and reverb length, the raw data we gather from one session runs into hundreds of gigabytes. Every single recording pass is carefully monitored by two engineers. This means listening to sine sweeps ranging from 40 Hz to 20 kHz for many hours – and staying attentive! An extensive log file is kept for every take, covering obvious parameters like position, angle and amplification factors, and less obvious ones like room temperature or the humidity of the air.
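If you are curious what such a test signal looks like, the sketch below generates a sine sweep from 40 Hz to 20 kHz at a sample rate of 96 kHz. The exact sweep shape and length used in our sessions are not specified here. The exponential (logarithmic) sweep shown is merely a common choice for impulse response measurements, and the half-second duration is chosen only to keep the example quick.

```python
import math


def exponential_sweep(f_start, f_end, duration_s, sample_rate):
    """Sine sweep whose frequency rises exponentially from f_start to f_end."""
    n = round(duration_s * sample_rate)
    k = math.log(f_end / f_start)
    scale = 2.0 * math.pi * f_start * duration_s / k
    # The phase is the integral of the instantaneous frequency over time.
    return [math.sin(scale * (math.exp(k * i / n) - 1.0)) for i in range(n)]


def instantaneous_frequency(f_start, f_end, duration_s, t):
    """Frequency (in Hz) the sweep has reached at time t (in seconds)."""
    return f_start * (f_end / f_start) ** (t / duration_s)


sweep = exponential_sweep(40.0, 20000.0, duration_s=0.5, sample_rate=96000)

print(len(sweep))
print(instantaneous_frequency(40.0, 20000.0, 0.5, 0.25))  # frequency at the halfway point
```

An exponential sweep spends the same amount of time in every octave, which is why the halfway frequency is the geometric mean of the start and end frequencies rather than their average.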
Post Processing and Implementation
Based on the aforementioned log file, the best takes are later chosen for further processing. This encompasses a multitude of sophisticated single steps which cannot be automated completely. The final check is always done by ear; it's the most exciting moment when one hears the results from weeks of highly demanding work in their full glory for the first time.
We hope that you will share this excitement with us.