Choosing the right encoding profiles is essential to ensure your users record at high quality while performance remains optimal on the device being used to record. The Azure Media Capture SDK offers a way to choose profiles for all media coming from the camera as well as for the media being sent to the live ingest service. You can encode more than one bitrate and/or resolution, but the higher the bitrate and resolution, or the more tracks you create, the greater the impact on the CPU. To help ensure that the user’s device can keep up with the captured media, the SDK offers an easy way to create default profiles at runtime based on the camera’s native capabilities, along with some tips should you choose to create your own profiles.

Tips & requirements for choosing profiles

  1. Audio must be captured using the PCM subtype and encoded using AAC.
  2. Video must be captured using an uncompressed format (NV12 is recommended) and encoded using H264.
  3. Always check the capabilities of your camera first and base your profiles on them. For example: If your camera can only produce 360p video or single-channel audio, there is no need to encode 480p video or two-channel audio. While the SDK will allow this, it is more efficient to treat your device’s native recording abilities as the lowest common denominator and create profiles that are at or below them.
  4. Similar to #3 above, always make sure your output profile is the same or less than your input profile. For example: If you capture at 360p, there’s no need to encode at 480p. You are better off letting the client scale the content during playback since there is no quality gain by artificially scaling during encode.
  5. Multi-resolution video output is more expensive than multi-bitrate output at a single resolution. Single resolution/bitrate output is the least expensive, and if the input and output profiles have matching properties, this is the most performant option.
  6. You cannot change any of the profiles while you are actively recording but you can while previewing.
  7. There are restrictions on what kinds of output profiles Media Foundation can use. See the Media Foundation documentation for the audio and video encoder restrictions.
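
Tip #3 above can be sketched with the standard Windows.Media.Capture APIs. The snippet below is a sketch only; it assumes mediaCapture is an already-initialized MediaCapture instance. It enumerates the formats the camera can natively record so that you can choose output profiles at or below the camera’s best capability:

```csharp
using System;
using Windows.Media.Capture;
using Windows.Media.MediaProperties;

// Assumes mediaCapture is an already-initialized MediaCapture instance.
// Find the tallest native record resolution the camera supports.
uint maxHeight = 0;
foreach (var props in mediaCapture.VideoDeviceController
    .GetAvailableMediaStreamProperties(MediaStreamType.VideoRecord))
{
    var videoProps = props as VideoEncodingProperties;
    if (videoProps != null)
    {
        maxHeight = Math.Max(maxHeight, videoProps.Height);
    }
}
// Per tips #3 and #4, only create output profiles at or below maxHeight
// (e.g. skip a 480p track entirely if maxHeight is less than 480).
```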

Using the SDK default profile helper

The SDK includes a helper class to generate default input and/or output profiles. The input profile is the profile used by the camera and the output profile is the one sent to the server (also the same one your viewers will get).

Microsoft.Media.CaptureClient.EncodingProfileHelper can be called to generate safe default profiles to use in your app. The profiles created by the class must then be supplied to the CaptureSession object during initialization.

For example: To get both input and output profiles by supplying the audio and video device being used by the MediaCapture class:

EncodingProfileHelper.InitializeCaptureSession(captureSession, mediaCapture.AudioDeviceController, mediaCapture.VideoDeviceController, VideoEncodingPreferences.SingleBitrate, null);

  • The 3rd parameter lets you specify multi-resolution & multi-bitrate, multi-bitrate only, or single-bitrate video profiles.
  • The last parameter lets you cap the vertical resolution, even if your camera supports a higher one.
  • EncodingProfileHelper also offers methods to create just input or output profiles.
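
For example, to cap the output at 720 vertical pixels, you could pass a value for the final parameter instead of null. This is a sketch based on the call shown above; the exact type accepted by the resolution-cap parameter is an assumption, so check the overload in your version of the SDK:

```csharp
// Sketch: same call as above, but capping vertical resolution at 720 pixels.
EncodingProfileHelper.InitializeCaptureSession(
    captureSession,
    mediaCapture.AudioDeviceController,
    mediaCapture.VideoDeviceController,
    VideoEncodingPreferences.SingleBitrate,
    720); // assumed: vertical resolution cap in pixels
```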

Creating your own profiles

You can also set your own profiles by populating VideoEncodingProperties & AudioEncodingProperties objects and passing them to the CaptureSession object.

For example, to create input (or capture) profiles:
// Input (capture) profile: uncompressed NV12 video plus PCM audio.
var inputProfile = new MediaEncodingProfile();
inputProfile.Container = null; // raw capture; no container needed
inputProfile.Audio = AudioEncodingProperties.CreatePcm(44100, 2, 16); // 44.1 kHz, stereo, 16-bit
inputProfile.Video = VideoEncodingProperties.CreateUncompressed(MediaEncodingSubtypes.Nv12, 640, 480);
inputProfile.Video.FrameRate.Numerator = 30;   // 30 fps
inputProfile.Video.FrameRate.Denominator = 1;
inputProfile.Video.PixelAspectRatio.Numerator = 1; // square pixels
inputProfile.Video.PixelAspectRatio.Denominator = 1;
captureSession.CaptureEncodingProfile = inputProfile;

For example, to create an output profile including two video tracks and one audio track:
// AAC audio track: 44.1 kHz, stereo, 16-bit
var outputAudioProfile = AudioEncodingProperties.CreateAac(44100, 2, 16);

// First H.264 video track: 480p at 1 Mbps, 30 fps, square pixels
var outputVideo1Profile = VideoEncodingProperties.CreateH264();
outputVideo1Profile.Width = 640;
outputVideo1Profile.Height = 480;
outputVideo1Profile.Bitrate = 1000000;
outputVideo1Profile.FrameRate.Numerator = 30;
outputVideo1Profile.FrameRate.Denominator = 1;
outputVideo1Profile.PixelAspectRatio.Numerator = 1;
outputVideo1Profile.PixelAspectRatio.Denominator = 1;

// Second H.264 video track: 240p at 800 kbps, 30 fps, square pixels
var outputVideo2Profile = VideoEncodingProperties.CreateH264();
outputVideo2Profile.Width = 320;
outputVideo2Profile.Height = 240;
outputVideo2Profile.Bitrate = 800000;
outputVideo2Profile.FrameRate.Numerator = 30;
outputVideo2Profile.FrameRate.Denominator = 1;
outputVideo2Profile.PixelAspectRatio.Numerator = 1;
outputVideo2Profile.PixelAspectRatio.Denominator = 1;

// Supply the audio and video track arrays to the capture session.
captureSession.AudioOutputEncodingProperties = new[] { outputAudioProfile };
captureSession.VideoOutputEncodingProperties = new[] { outputVideo1Profile, outputVideo2Profile };

Last edited Aug 20, 2014 at 9:57 PM by timgreenfield, version 3