Audio descriptions provide narration of visual elements in video content, making it accessible to blind and low vision users. In this post, we'll explore what audio descriptions are, how to detect user preferences for them, and how to implement audio description support in your SwiftUI apps.

What Are Audio Descriptions?

Audio descriptions (also called video descriptions or descriptive narration) are audio tracks that describe important visual information in videos during natural pauses in dialogue. They narrate:

  • Actions and gestures: What characters are doing

  • Scene changes: Where the action takes place

  • On-screen text: Titles, captions, or important text

  • Expressions and emotions: Non-verbal communication

  • Visual effects: Important visual elements that convey meaning

Audio descriptions enable blind and low vision users to understand visual content they cannot see. For example, in a cooking video, audio descriptions might say: "She pours flour into a large mixing bowl, then cracks two eggs into the center."

Who Benefits from Audio Descriptions?

  • Blind users who cannot see the video at all

  • Low vision users who may miss visual details

  • Users with cognitive disabilities who benefit from additional context

  • Users in situations where they can't watch the screen (driving, exercising, etc.)

Detecting User Preferences in SwiftUI

iOS exposes the user's audio description preference in Settings > Accessibility > Audio Descriptions. You can read it with the Media Accessibility framework: MAAudibleMediaCopyPreferredCharacteristics() returns the audible media characteristics the user prefers, and the list includes the describes-video characteristic when audio descriptions are enabled:

import SwiftUI
import AVFoundation
import MediaAccessibility

struct AudioDescriptionPreferenceView: View {
    @State private var userPrefersAudioDescriptions = false

    var body: some View {
        VStack(spacing: 20) {
            Text("Audio Description Settings")
                .font(.headline)

            if userPrefersAudioDescriptions {
                Text("✓ User prefers audio descriptions")
                    .foregroundColor(.green)
            } else {
                Text("Standard audio track will be used")
                    .foregroundColor(.secondary)
            }
        }
        .padding()
        .onAppear {
            checkAudioDescriptionPreference()
        }
    }

    func checkAudioDescriptionPreference() {
        // MAAudibleMediaCopyPreferredCharacteristics() reflects the
        // Audio Descriptions switch in Settings > Accessibility
        let characteristics = MAAudibleMediaCopyPreferredCharacteristics()
            as NSArray as? [String] ?? []
        userPrefersAudioDescriptions = characteristics.contains(
            AVMediaCharacteristic.describesVideoForAccessibility.rawValue
        )
    }
}
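The preference can also change while your app is running. MediaAccessibility posts kMAAudibleMediaSettingsChangedNotification when the user toggles the setting, so you can re-read the preference on that notification. A sketch (the observer class is illustrative; how you wire the callback into your UI is up to you):

```swift
import AVFoundation
import MediaAccessibility

// Re-checks the audio description preference whenever the user changes it.
// kMAAudibleMediaSettingsChangedNotification is posted by MediaAccessibility
// when the audible media settings change.
final class AudioDescriptionPreferenceObserver {
    private let onChange: (Bool) -> Void

    init(onChange: @escaping (Bool) -> Void) {
        self.onChange = onChange
        NotificationCenter.default.addObserver(
            forName: Notification.Name(kMAAudibleMediaSettingsChangedNotification as String),
            object: nil,
            queue: .main
        ) { [weak self] _ in
            self?.notify()
        }
        notify() // report the initial value
    }

    private func notify() {
        let characteristics = MAAudibleMediaCopyPreferredCharacteristics()
            as NSArray as? [String] ?? []
        onChange(characteristics.contains(
            AVMediaCharacteristic.describesVideoForAccessibility.rawValue
        ))
    }
}
```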

Implementing Audio Descriptions with AVFoundation

AVFoundation supports multiple audio tracks, allowing you to provide both standard audio and audio with descriptions:

Basic Video Player with Audio Description Support

import SwiftUI
import AVKit

struct VideoPlayerWithAudioDescriptions: View {
    @State private var player: AVPlayer?
    @State private var hasAudioDescriptions = false

    var body: some View {
        VStack {
            if let player = player {
                VideoPlayer(player: player)
                    .frame(height: 300)
            }

            VStack(alignment: .leading, spacing: 16) {
                Text("Audio Track Options")
                    .font(.headline)

                if hasAudioDescriptions {
                    HStack {
                        Image(systemName: "checkmark.circle.fill")
                            .foregroundColor(.green)
                        Text("Audio descriptions available")
                    }
                } else {
                    HStack {
                        Image(systemName: "info.circle")
                            .foregroundColor(.secondary)
                        Text("Standard audio only")
                    }
                }
            }
            .padding()
        }
        .onAppear {
            setupPlayer()
        }
    }

    func setupPlayer() {
        guard let videoURL = Bundle.main.url(
            forResource: "sample_with_descriptions",
            withExtension: "mp4"
        ) else { return }

        let asset = AVURLAsset(url: videoURL)
        let playerItem = AVPlayerItem(asset: asset)
        player = AVPlayer(playerItem: playerItem)

        // Check for audio description tracks
        checkForAudioDescriptions(in: asset)

        // Configure audio descriptions
        configureAudioDescriptions(for: playerItem)
    }

    func checkForAudioDescriptions(in asset: AVAsset) {
        let audibleGroup = asset.mediaSelectionGroup(
            forMediaCharacteristic: .audible
        )

        if let audibleGroup = audibleGroup {
            // Check if any audio track has descriptions
            hasAudioDescriptions = audibleGroup.options.contains { option in
                option.hasMediaCharacteristic(.describesVideoForAccessibility)
            }
        }
    }

    func configureAudioDescriptions(for playerItem: AVPlayerItem) {
        guard let audibleGroup = playerItem.asset.mediaSelectionGroup(
            forMediaCharacteristic: .audible
        ) else { return }

        // Find audio track with descriptions
        let audioWithDescriptions = audibleGroup.options.first { option in
            option.hasMediaCharacteristic(.describesVideoForAccessibility)
        }

        // Select the audio description track if available
        if let descriptiveTrack = audioWithDescriptions {
            playerItem.select(descriptiveTrack, in: audibleGroup)
        }
    }
}

Advanced Player with Audio Track Selection

struct AdvancedAudioDescriptionPlayer: View {
    @State private var player: AVPlayer?
    @State private var audioTracks: [AudioTrackInfo] = []
    @State private var selectedTrack: AudioTrackInfo?
    @State private var showTrackPicker = false

    struct AudioTrackInfo: Identifiable, Hashable {
        let id = UUID()
        let displayName: String
        let hasDescriptions: Bool
        let option: AVMediaSelectionOption
    }

    var body: some View {
        VStack(spacing: 0) {
            if let player = player {
                VideoPlayer(player: player)
                    .frame(height: 300)
            }

            VStack(alignment: .leading, spacing: 16) {
                Text("Audio Options")
                    .font(.headline)

                if !audioTracks.isEmpty {
                    Button(action: { showTrackPicker.toggle() }) {
                        HStack {
                            VStack(alignment: .leading) {
                                Text(selectedTrack?.displayName ?? "Default")
                                    .font(.body)

                                if selectedTrack?.hasDescriptions == true {
                                    Text("Includes audio descriptions")
                                        .font(.caption)
                                        .foregroundColor(.green)
                                }
                            }

                            Spacer()

                            Image(systemName: "chevron.down")
                        }
                        .padding()
                        .background(Color.gray.opacity(0.1))
                        .cornerRadius(8)
                    }
                    .sheet(isPresented: $showTrackPicker) {
                        AudioTrackPicker(
                            tracks: audioTracks,
                            selectedTrack: $selectedTrack,
                            onSelect: { track in
                                selectAudioTrack(track)
                                showTrackPicker = false
                            }
                        )
                    }
                }
            }
            .padding()
        }
        .onAppear {
            setupPlayer()
        }
    }

    func setupPlayer() {
        guard let videoURL = Bundle.main.url(
            forResource: "sample",
            withExtension: "mp4"
        ) else { return }

        let asset = AVURLAsset(url: videoURL)
        let playerItem = AVPlayerItem(asset: asset)
        player = AVPlayer(playerItem: playerItem)

        loadAudioTracks(from: asset)

        // Auto-select audio descriptions if available
        if let trackWithDescriptions = audioTracks.first(where: { $0.hasDescriptions }) {
            selectedTrack = trackWithDescriptions
            selectAudioTrack(trackWithDescriptions)
        } else if let firstTrack = audioTracks.first {
            selectedTrack = firstTrack
        }
    }

    func loadAudioTracks(from asset: AVAsset) {
        guard let audibleGroup = asset.mediaSelectionGroup(
            forMediaCharacteristic: .audible
        ) else { return }

        audioTracks = audibleGroup.options.map { option in
            AudioTrackInfo(
                displayName: option.displayName,
                hasDescriptions: option.hasMediaCharacteristic(
                    .describesVideoForAccessibility
                ),
                option: option
            )
        }
    }

    func selectAudioTrack(_ track: AudioTrackInfo) {
        guard let playerItem = player?.currentItem,
              let audibleGroup = playerItem.asset.mediaSelectionGroup(
                forMediaCharacteristic: .audible
              ) else { return }

        playerItem.select(track.option, in: audibleGroup)
        selectedTrack = track
    }
}

struct AudioTrackPicker: View {
    let tracks: [AdvancedAudioDescriptionPlayer.AudioTrackInfo]
    @Binding var selectedTrack: AdvancedAudioDescriptionPlayer.AudioTrackInfo?
    let onSelect: (AdvancedAudioDescriptionPlayer.AudioTrackInfo) -> Void

    var body: some View {
        NavigationView {
            List(tracks) { track in
                Button(action: {
                    onSelect(track)
                }) {
                    HStack {
                        VStack(alignment: .leading) {
                            Text(track.displayName)
                                .font(.body)

                            if track.hasDescriptions {
                                Label("Audio descriptions", 
                                      systemImage: "speaker.wave.3")
                                    .font(.caption)
                                    .foregroundColor(.green)
                            }
                        }

                        Spacer()

                        if selectedTrack?.id == track.id {
                            Image(systemName: "checkmark")
                                .foregroundColor(.blue)
                        }
                    }
                }
                .foregroundColor(.primary)
            }
            .navigationTitle("Audio Track")
            .navigationBarTitleDisplayMode(.inline)
        }
    }
}

Creating Audio Description Tracks

Audio descriptions are typically created as a separate audio track that mixes the original audio with recorded narration. For streaming delivery, the described track is marked with the public.accessibility.describes-video characteristic (in HLS, via the CHARACTERISTICS attribute on the alternate audio rendition) so that players and the APIs shown above can find it. The process involves:

  1. Script Writing: Watch the video and write descriptions for visual elements during dialogue pauses

  2. Voice Recording: Record the descriptions in a clear, professional voice

  3. Audio Mixing: Mix the descriptions with the original audio

  4. Track Encoding: Add the descriptive audio as an alternative track in your video file
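Step 1 can be modeled as data. As a purely illustrative sketch (the DescriptionCue type and cue(at:) helper are hypothetical, not AVFoundation API), a description script is a list of timed cues, each sized to fit a dialogue pause:

```swift
import Foundation

// Hypothetical model for a timed audio description script.
// Each cue holds narration text and the dialogue pause it must fit into.
struct DescriptionCue {
    let start: TimeInterval     // seconds into the video
    let duration: TimeInterval  // length of the pause
    let text: String
}

// Returns the cue that should be playing at a given time, if any.
func cue(at time: TimeInterval, in script: [DescriptionCue]) -> DescriptionCue? {
    script.first { time >= $0.start && time < $0.start + $0.duration }
}

let script = [
    DescriptionCue(start: 2.0, duration: 3.5,
                   text: "She pours flour into a large mixing bowl."),
    DescriptionCue(start: 12.0, duration: 2.5,
                   text: "She cracks two eggs into the center.")
]

// At 3 seconds into playback, the first cue is active.
print(cue(at: 3.0, in: script)?.text ?? "none")
// prints "She pours flour into a large mixing bowl."
```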

Adding Audio Description Track to Video

import AVFoundation

func createVideoWithAudioDescriptions(
    videoURL: URL,
    standardAudioURL: URL,
    descriptiveAudioURL: URL,
    outputURL: URL,
    completion: @escaping (Bool) -> Void
) {
    let composition = AVMutableComposition()
    let videoAsset = AVURLAsset(url: videoURL)

    // Add video track
    guard let videoTrack = videoAsset.tracks(withMediaType: .video).first,
          let compositionVideoTrack = composition.addMutableTrack(
            withMediaType: .video,
            preferredTrackID: kCMPersistentTrackID_Invalid
          ) else {
        completion(false)
        return
    }

    do {
        try compositionVideoTrack.insertTimeRange(
            CMTimeRange(start: .zero, duration: videoAsset.duration),
            of: videoTrack,
            at: .zero
        )

        // Add standard audio track
        let standardAudio = AVURLAsset(url: standardAudioURL)
        if let standardTrack = standardAudio.tracks(withMediaType: .audio).first,
           let compositionAudioTrack = composition.addMutableTrack(
            withMediaType: .audio,
            preferredTrackID: kCMPersistentTrackID_Invalid
           ) {
            try compositionAudioTrack.insertTimeRange(
                CMTimeRange(start: .zero, duration: videoAsset.duration),
                of: standardTrack,
                at: .zero
            )
        }

        // Add descriptive audio track
        let descriptiveAudio = AVURLAsset(url: descriptiveAudioURL)
        if let descriptiveTrack = descriptiveAudio.tracks(withMediaType: .audio).first,
           let compositionDescriptiveTrack = composition.addMutableTrack(
            withMediaType: .audio,
            preferredTrackID: kCMPersistentTrackID_Invalid
           ) {
            try compositionDescriptiveTrack.insertTimeRange(
                CMTimeRange(start: .zero, duration: videoAsset.duration),
                of: descriptiveTrack,
                at: .zero
            )
        }

        // Export the composition
        guard let exportSession = AVAssetExportSession(
            asset: composition,
            presetName: AVAssetExportPresetHighestQuality
        ) else {
            completion(false)
            return
        }

        exportSession.outputURL = outputURL
        exportSession.outputFileType = .mp4

        exportSession.exportAsynchronously {
            completion(exportSession.status == .completed)
        }

    } catch {
        print("Error creating video with audio descriptions: \(error)")
        completion(false)
    }
}

Best Practices for Audio Descriptions

  1. Describe What Matters: Focus on visual information that's essential to understanding the content.

  2. Use Natural Pauses: Insert descriptions during natural breaks in dialogue to avoid overlapping with important audio.

  3. Be Objective: Describe what you see, not what you interpret. Say "She frowns" not "She looks angry."

  4. Be Concise: Descriptions should be clear and brief, fitting naturally into available time.

  5. Identify Speakers: When new people appear on screen, describe them so users know who's speaking.

  6. Describe Text: Read important on-screen text like titles, signs, or messages.

  7. Maintain Tone: Match the style and mood of the content in your descriptions.

  8. Provide Context: Describe scene changes and settings so users understand where the action takes place.

  9. Test with Users: Have blind or low vision users review your descriptions for clarity and usefulness.
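The "Be Concise" practice can be checked mechanically during script writing. A rough sketch, assuming a narration pace of about 2.5 words per second (an assumed figure for illustration, not a standard):

```swift
import Foundation

// Rough heuristic: flag description text that is too long to fit a dialogue
// pause, at an assumed narration pace of ~2.5 words per second.
func fitsInPause(_ text: String,
                 pauseSeconds: Double,
                 wordsPerSecond: Double = 2.5) -> Bool {
    let wordCount = text.split(whereSeparator: { $0.isWhitespace }).count
    return Double(wordCount) / wordsPerSecond <= pauseSeconds
}

print(fitsInPause("She frowns.", pauseSeconds: 1.0))  // true — 2 words fit easily
print(fitsInPause("She slowly pours the flour into a large bowl while glancing at the recipe card",
                  pauseSeconds: 2.0))                  // false — needs trimming
```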

Automatic Selection of Audio Descriptions

import AVFoundation

extension AVPlayerItem {
    func selectPreferredMediaOptions() {
        // Select audio descriptions if available
        if let audibleGroup = asset.mediaSelectionGroup(
            forMediaCharacteristic: .audible
        ) {
            let preferredOptions = AVMediaSelectionGroup.mediaSelectionOptions(
                from: audibleGroup.options,
                with: Locale.current
            )

            // Prioritize tracks with descriptions
            let trackWithDescriptions = preferredOptions.first { option in
                option.hasMediaCharacteristic(.describesVideoForAccessibility)
            }

            if let descriptiveTrack = trackWithDescriptions {
                select(descriptiveTrack, in: audibleGroup)
            }
        }
    }
}
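Instead of walking the selection group yourself, you can also hand the decision to AVPlayer with AVPlayerMediaSelectionCriteria; the player then prefers described audio whenever an asset offers it. A minimal sketch, assuming you already have a configured player:

```swift
import AVFoundation

// Tell the player to automatically prefer audio tracks that describe video,
// in the user's preferred languages, for every item it plays.
func preferAudioDescriptions(on player: AVPlayer) {
    let criteria = AVPlayerMediaSelectionCriteria(
        preferredLanguages: Locale.preferredLanguages,
        preferredMediaCharacteristics: [.describesVideoForAccessibility]
    )
    player.setMediaSelectionCriteria(criteria, forMediaCharacteristic: .audible)
}
```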

Wrap up

If your app plays video whose visuals carry meaning, audio descriptions are a requirement — WCAG's Audio Description success criterion (1.2.5, Level AA) covers exactly this case. AVFoundation makes it manageable: ship a separate described audio track and select it automatically when the user has audio descriptions enabled.
