Advanced: Recording Voice and Video

This page shows how to use the Agora Recording SDK to enable voice and video recording and use the transcoding scripts.

  • During voice recording, if a user leaves the channel and rejoins it more than 15 seconds later, a new recording session is created and two separate recorded files are generated.
  • During video recording, the recorded video file may be split into multiple files if a user leaves the channel and rejoins (regardless of the time interval), changes the resolution, or mutes or unmutes the video.

Recording Files

The Agora Recording SDK supports recording in two modes:

  • Individual recording: Records the voice and video streams separately for each UID.
  • Composite recording: Mixes the voice and video streams of the users in the same channel. You can also set the video mixing layout.

Individual Recording

In individual recording, the SDK records the voice and video separately for each UID in the channel.

Set isMixingEnabled = 0 to start individual recording.

Choose the recording mode by setting the parameters according to the following table:

Recording Mode | Parameters | Files Before Transcoding
Voice-only | isMixingEnabled = 0 | A voice file for each UID.
Video-only | isMixingEnabled = 0 | An MPEG4 file for each UID (Native SDK), or a WebM file for each UID (Web).
Voice + video | isMixingEnabled = 0 | A voice file and an MPEG4/WebM file for each UID. Voice and video are separated.

To merge the recorded voice and video files of each UID, use the video_convert.py script for transcoding. See Using the Transcoding Script.

Composite Recording + Unsynchronized Transcoding

Set isMixingEnabled = 1 and mixedAudioVideo = 0 to start composite recording with unsynchronized transcoding.

This enables mixed-voice and mixed-video recordings, and the voice and video files are separated.

You can also call setVideoMixingLayout to set the video mixing layout.

Choose the recording mode by setting the parameters according to the following table:

Recording Mode | Parameters | Files Before Transcoding
Voice-only | isMixingEnabled = 1, mixedAudioVideo = 0 | A mixed AAC voice file
Video-only | isMixingEnabled = 1, mixedAudioVideo = 0 | A mixed MPEG4 video file
Voice + video | isMixingEnabled = 1, mixedAudioVideo = 0 | A mixed AAC voice file and a mixed MPEG4 video file (two separate files)

To merge the recorded voice and video files, use the video_convert.py script for transcoding. See Using the Transcoding Script.

Composite Recording + Synchronized Transcoding

Set isMixingEnabled = 1 and mixedAudioVideo = 1 to start composite recording with synchronized transcoding. This enables mixed voice and video recording without using the transcoding script.

Choose the recording mode by setting the parameters according to the following table:

Recording Mode | Parameters | File
Voice-only | isMixingEnabled = 1, mixedAudioVideo = 0 | A mixed AAC voice file
Video-only | isMixingEnabled = 1, mixedAudioVideo = 0 | A mixed MPEG4 video file
Voice + video | isMixingEnabled = 1, mixedAudioVideo = 1 | An MPEG4 voice + video file
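
The parameter combinations described in the three recording-mode tables above can be restated as a small lookup. The following Python sketch is illustrative only: describe_recording and needs_transcoding_script are hypothetical helpers, not part of the Agora Recording SDK, and they only summarize what this page states about isMixingEnabled and mixedAudioVideo.

```python
def describe_recording(is_mixing_enabled: int, mixed_audio_video: int = 0) -> str:
    """Return the recording mode implied by the two parameters (restates the tables)."""
    if is_mixing_enabled == 0:
        return "individual recording: separate voice and video files for each UID"
    if mixed_audio_video == 0:
        return "composite recording + unsynchronized transcoding: separate mixed voice and video files"
    return "composite recording + synchronized transcoding: one MPEG4 voice + video file"


def needs_transcoding_script(is_mixing_enabled: int, mixed_audio_video: int = 0) -> bool:
    """True when video_convert.py is needed to merge the recorded voice and video files."""
    # Only composite recording with synchronized transcoding skips the script.
    return not (is_mixing_enabled == 1 and mixed_audio_video == 1)
```

For example, needs_transcoding_script(1, 0) returns True because the mixed voice and video files are still separate, while needs_transcoding_script(1, 1) returns False.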

Note

The Agora Recording SDK does not support composite recording + synchronized transcoding in the web-only mode. See Joining a Channel (joinChannel) for the supported players.

Raw Data

The Agora Recording SDK supports raw data in individual recording, and mixed voice in composite recording.

Individual Recording

Choose the recording mode by setting the parameters according to the following table:

Recording Mode | File
Voice-only | AAC file (AUDIO_FORMAT_AAC_FRAME_TYPE = 1) or PCM file (AUDIO_FORMAT_PCM_FRAME_TYPE = 2)
Video-only | H264 file (VIDEO_FORMAT_H264_FRAME_TYPE = 1) or YUV file (VIDEO_FORMAT_YUV_FRAME_TYPE = 2)
Voice + video | H264 + AAC file, H264 + PCM file, YUV + AAC file, or YUV + PCM file

Note

The web supports raw data in YUV only, not in H264.

Composite Recording

The Agora Recording SDK supports raw data for mixed voice, and generates a PCM file.

Recording Mode | Parameters | File
Voice-only | isMixingEnabled = 1, isAudioOnly = 1 | PCM file: AUDIO_FORMAT_MIXED_PCM_FRAME_TYPE = 3

Screen Capture

The Agora Recording SDK supports screen capture in individual recording only, and no subsequent transcoding is needed.

Recording Mode | Parameters | File
Video-only | isVideoOnly = 1, decodeVideo = 3/4, captureInterval >= 1 [1] | JPEG file, JPEG buffer

Footnotes

[1] The captureInterval parameter sets the time interval between screen captures. The minimum value is 1 s and the default value is 5 s. Ensure that decodeVideo = 3/4. For more information, see Joining a Channel (joinChannel).
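
The constraints in footnote [1] can be checked before a recording starts. This is a minimal sketch, assuming you keep the settings in your own Python code: check_screen_capture is a hypothetical helper, not an SDK call, and it covers only screen capture without recording (decodeVideo = 3 or 4).

```python
def check_screen_capture(decode_video: int, capture_interval: int = 5) -> None:
    """Raise ValueError if the settings contradict footnote [1].

    capture_interval is in seconds; 5 is the default stated in the footnote.
    """
    if decode_video not in (3, 4):
        raise ValueError("screen capture requires decodeVideo = 3 or 4")
    if capture_interval < 1:
        raise ValueError("captureInterval must be at least 1 second")
```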

Screen Capture + Recording

The Agora Recording SDK supports screen capture and recording in individual recording only. Transcoding is not necessary for screen capture. For transcoding the recorded file, see Individual Recording.

Recording Mode | Parameters | File
Screen capture + recording | decodeVideo = 5, captureInterval >= 1 | An un-transcoded MPEG4 file for each UID (Native SDK) or an un-transcoded WebM file for each UID (Web SDK), plus a JPEG file

Managing the Recorded Files

The recordFileRootDir folder in the config.json [2] file specifies the top-level recording directory. Its structure is as follows:

  • yyyy_mm_dd (Date): A new directory named with the date is created on each day that recording occurs, and all the files and directories for that date are stored under it.
  • ChannelName_HHMMSS: The recorded files are stored in this directory, under the directory for the date of the recording. The directory name contains the channel name and a timestamp (hour, minute, and second) that indicates when the recording started.

File | Description
UID_HHMMSSMS.aac | If recording on the Native SDK, PC, or the web, each UID has an AAC file containing the voice content of its corresponding UID only.
UID_HHMMSSMS.mp4 | If recording on the Native SDK or PC, each UID has an MPEG4 file. If the user joined and left the channel multiple times, this UID can have multiple MPEG4 files containing the video content of this UID only.
UID_HHMMSSMS.webm | If recording on the web, each UID has a WebM file. If the user joined and left the channel multiple times, this UID can have multiple WebM files containing the video content of this UID only.
recording2-done.txt | Indicates that the recording in this channel has finished.
UID_HHMMSSMS.txt | Records the start and end time of each voice or video file, and related information such as the width and rotation.
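
To inspect a ChannelName_HHMMSS directory, you can group the recorded files by UID based on the naming scheme above. The sketch below is an illustration, not part of the SDK: group_names_by_uid is a hypothetical helper, and it assumes that UIDs are numeric and that the HHMMSSMS timestamp is a run of digits.

```python
import re
from collections import defaultdict

# Assumption: the UID and the HHMMSSMS timestamp are both runs of digits.
FILE_PATTERN = re.compile(r"^(?P<uid>\d+)_(?P<stamp>\d+)\.(?P<ext>aac|mp4|webm|txt)$")


def group_names_by_uid(names):
    """Map each UID to the sorted names of its recorded files.

    Names that do not follow UID_HHMMSSMS.<ext>, such as recording2-done.txt,
    are ignored.
    """
    groups = defaultdict(list)
    for name in names:
        match = FILE_PATTERN.match(name)
        if match:
            groups[match.group("uid")].append(name)
    return {uid: sorted(files) for uid, files in groups.items()}
```

You could call it with os.listdir(channel_dir), where channel_dir is the path of one ChannelName_HHMMSS directory.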

Footnotes

[2] The config.json file, created by the user, contains the saving path of the recorded files. During recording, the SDK uses the cfgFilePath parameter to obtain the saving path. For more information, see the description of cfgFilePath in Joining a Channel (joinChannel).

Using the Transcoding Script

Once recording is finished, use video_convert.py and ffmpeg to merge the recorded files. Decompress the transcoding tool first with the tar -xvf command. The script merges files as follows:

  • If multiple voice files are generated after the recording, transcoding merges them into one M4A file called UID_HHMMSSMS.m4a.

  • If multiple voice and video files are generated after the recording, transcoding merges them into one MPEG4 file called UID_HHMMSSMS_av.mp4. The voice and video files are merged by session:

    • If a UID leaves a channel and rejoins it within 15 seconds, the Recording SDK treats this as one session. No new voice file is generated, but a new video file is generated; it is merged with the voice and video of the session into a single UID_HHMMSSMS_av.mp4 file.
    • If a UID leaves a channel and rejoins it after 15 seconds, the Recording SDK treats this as two separate sessions. New voice and video files are generated, and transcoding creates a new UID_HHMMSSMS_av.mp4 file for the new session.
  • If multiple voice and video files are generated after the recording and you want to merge the MPEG4 files of different sessions by UID, transcoding merges them into MPEG4 files such as UID_0_merge_av.mp4, UID_1_merge_av.mp4, and UID_2_merge_av.mp4.

    • In automatic mode, the -m parameter merges all voice and video files of one UID and generates a single UID_0_merge_av.mp4 file.
    • In manual mode, the startService and stopService parameters manage and divide the recorded files. Each start/stop pair forms one service, and the -m parameter generates multiple UID_XX_merge_av.mp4 files.
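
The 15-second session rule described above can be sketched as follows. This is an illustration of the rule only, not the SDK's implementation: split_sessions is a hypothetical helper, times are in seconds, and treating a gap of exactly 15 seconds as the same session is an assumption, because this page does not specify the boundary case.

```python
def split_sessions(intervals, max_gap=15.0):
    """Group (join, leave) times for one UID into sessions.

    A rejoin within max_gap seconds of the previous leave continues the
    current session; a later rejoin starts a new session.
    """
    sessions = []
    for join, leave in sorted(intervals):
        if sessions and join - sessions[-1][-1][1] <= max_gap:
            sessions[-1].append((join, leave))
        else:
            sessions.append([(join, leave)])
    return sessions
```

For example, a user who is in the channel from 0 s to 60 s, from 70 s to 120 s, and from 200 s to 260 s produces two sessions: the first two stays are 10 seconds apart and share a session, and the third stay starts 80 seconds after the second one ends.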

The transcoding tool includes video_convert.py and ffmpeg. The Python script merges the separated voice and video recording files into one MPEG4 file and relies on FFmpeg, so ensure that the directory containing ffmpeg is included in PATH.

Execute python video_convert.py to see the following usage:

Usage: video_convert.py [options]

Options:
  -h, --help
  -f FOLDER, --folder=FOLDER
                        Convert folder
  -m MODE, --mode=MODE  Convert merge mode, [0: txt merge A/V(Default); 1: uid
                        merge A/V; 2: uid merge audio; 3: uid merge video]
  -p FPS, --fps=FPS     Convert fps, default 15
  -s --saving           Convert Do not time sync
  -r RESOLUTION, --resolution=RESOLUTION
                        Specific resolution to convert '-r width height'
                        Eg: '-r 640 360'

Option | Description
-f | Directory of the files to be transcoded.
-m | Transcoding mode, one of the values listed below.
-p | Frame rate for both composite and individual recording. The default value is 15 fps. [3]
-s | Saving mode. Controls whether transcoding is strictly synchronized with time, in other words, whether the time interval when the user is not in the channel remains in the recorded file. Use this parameter only together with -m set to 1, 2, or 3. The default value indicates “always recording”. [4]
-r | Resolution of the transcoded file in the “width height” format. For the supported resolutions, see Joining a Channel (joinChannel).

Transcoding modes for -m:

  • 0: Transcode by session. Merge the voice and video files of one session into one file.
  • 1: Merge the voice and video files of the same UID into one file chronologically.
  • 2: Merge the voice files of the same UID into one file chronologically.
  • 3: Merge the video files of the same UID into one file chronologically.
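
The options above can be combined into a command line programmatically. The following is a minimal sketch that uses only the flags shown in the usage text: build_convert_command is a hypothetical helper, and it assumes that video_convert.py is in the current working directory and that python resolves to an interpreter able to run the script.

```python
def build_convert_command(folder, mode=0, fps=None, saving=False, resolution=None):
    """Build the argument list for video_convert.py from the documented options."""
    if mode not in (0, 1, 2, 3):
        raise ValueError("-m accepts 0, 1, 2, or 3")
    if saving and mode == 0:
        # The option description says to use -s together with -m 1, 2, or 3.
        raise ValueError("-s must be combined with -m 1, 2, or 3")
    cmd = ["python", "video_convert.py", "-f", str(folder), "-m", str(mode)]
    if fps is not None:
        cmd += ["-p", str(fps)]
    if saving:
        cmd.append("-s")
    if resolution is not None:
        width, height = resolution
        cmd += ["-r", str(width), str(height)]  # '-r width height', e.g. '-r 640 360'
    return cmd
```

To run the tool, pass the returned list to subprocess.run(cmd, check=True) with the directory of ffmpeg included in PATH.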

Footnotes

[3] The -p parameter sets the frame rate from 5 fps to 120 fps. If it is set to less than 5 fps, the SDK treats it as 5 fps.
[4] For “always recording”, if a user leaves and rejoins a channel, the idle time is shown as a black screen in the recorded file. For example, if a user is in a channel for 2 minutes, leaves the channel for 30 minutes, and then rejoins and spends another 2 minutes in it, the total recorded time is 34 minutes, including 30 minutes of black screen.
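
The arithmetic in footnotes [3] and [4] can be written out as a short sketch. Both functions are hypothetical illustrations, not part of the transcoding tool. The saving=True branch assumes that the -s option drops the idle time from the output, which this page implies but does not state explicitly.

```python
def effective_fps(requested: int) -> int:
    """Footnote [3]: a frame rate below 5 fps is treated as 5 fps."""
    return max(5, requested)


def recorded_minutes(stays, gaps, saving=False):
    """Length of the merged file in minutes.

    stays: minutes spent in the channel per visit.
    gaps: minutes spent outside the channel between visits.
    With "always recording" (the default), the gaps remain as black screen.
    """
    # Assumption: with -s, the idle gaps are not kept in the output.
    return sum(stays) + (0 if saving else sum(gaps))
```

With the example from footnote [4], recorded_minutes([2, 2], [30]) gives 34 minutes.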

See the following figure for the different transcoding script options:

../_images/recording_demo.png

The transcoded MPEG4 file supports the following players:

Operating System | Players
Windows | Windows Media Player, KMPlayer, VLC Media Player
Mac | QuickTime, Movist, MPlayerX, KMPlayer
iOS | iOS default player, VLC Media Player, KMPlayer
Android | Android default player, MX Player, VLC Media Player, KMPlayer

Note

You can start transcoding only if a recording2-done.txt file exists in the recording folder. When the transcoding is complete, a convert-done.txt file is generated, and a convert.log file is generated in the same directory as the voice and video files. Contact sales@agora.io if the transcoding fails.
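
The marker files described in this note can be checked before and after you run the script. This is a minimal sketch with hypothetical helper names. It assumes that both marker files are located directly in the recording folder that you pass in.

```python
from pathlib import Path


def ready_to_transcode(recording_dir) -> bool:
    """True if recording2-done.txt exists, so transcoding may start."""
    return (Path(recording_dir) / "recording2-done.txt").is_file()


def transcoding_finished(recording_dir) -> bool:
    """True if convert-done.txt exists, so transcoding has completed."""
    return (Path(recording_dir) / "convert-done.txt").is_file()
```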

Protecting Recorded Files

The recorded files are stored on your server only, and Agora has no access to them. You are responsible for the protection and security of the recorded files. Consult a security expert if necessary.

Note

The recording.log and recording_sys.log files under the ChannelName_HHMMSS directory list any exceptions or problems that occurred during recording.
