Syncing audio to video of a grabbed video
Why audio and video can get out of sync
Symptom
When playing the movie you notice a delay between the audio and the video images. In most cases the offset grows with the length of the video. Sometimes the properties window of your video file also shows different total run times for the audio and video tracks.
Causes of audio delay
- The audio delay occurs as a result of the automatic video synchronization of the VHS player.
VHS players compensate for geometric variations in the mechanical tape transport by slightly regulating the playback speed until the image is properly synchronized. This is done automatically or manually via a dial. As a result, the actual frame rate during VHS playback is never exactly 25 fps. It is always slightly different.
If this recording is played on a computer at exactly 25 fps, the duration of the video track comes out wrong. The audio track, however, is played at almost exactly the sampling frequency at which it was recorded (48 kHz or 44.1 kHz). As a result, the audio and video tracks diverge.
A deviation of only 0.1% in video speed already leads to an audio delay of about 1 s after 20 minutes of running time.
This audio delay increases steadily with the running time of the video.
- Wrinkled VHS tape leads to destroyed frames. This may result in dropped frames in the video file, because the grabber sometimes cannot sync to destroyed frames. The audio, however, is recorded without interruption. The biggest source of this problem is E-300 cassettes, because they use the thinnest video tape.
The Debut video capture software displays the number of dropped frames at the end of the recording.
You should only start recording once a stable video signal from the VHS player is already being fed to the input of the grabber.
Audio offset caused by lost frames occurs stepwise: the sound shifts abruptly at the faulty points of the video and picks up a significant delay.
Frames can also be delayed when there are disturbances in the VHS tape transport. Delayed or doubled frames require the audio track to be delayed as well. Dropped frames shorten the running time of the video, so the audio track needs a shorter running time (or a negative delay at the given point).
- Dropped frames also arise as a result of high CPU load during capture. The computer does not manage to compress and store the frames in time. This error should not occur if you follow the advice given in part 1.
- Audio delay is also produced by differing sampling rates when separate audio and video converters are used.
Since all audio cards are clocked by highly accurate quartz crystals, this offset is very small and should only be measurable after a very long running time.
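The rule of thumb from the first cause (a 0.1% speed deviation gives roughly 1 s of delay after 20 minutes) is simple proportionality. A minimal sketch in Python, assuming the deviation stays constant over the whole tape; the function name is only illustrative:

```python
def drift_seconds(deviation_percent, running_time_minutes):
    """Audio/video offset in seconds after the given running time,
    assuming a constant playback speed deviation (in percent)."""
    return running_time_minutes * 60.0 * deviation_percent / 100.0

# 0.1 % deviation after 20 minutes of running time
print(round(drift_seconds(0.1, 20), 2), "s")

# 0.026 % deviation after 50 minutes of running time
print(round(drift_seconds(0.026, 50), 2), "s")
```

The same function can be used to estimate in advance how large the offset will be at the end of a tape once the deviation of your own player is known.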

Audio delay vs running time in a grabbed VHS video (frame rate not 25 fps and dropped frames starting from 0:50)
In the figure you can see the real audio delay of a fairy tale I digitized from VHS. Up to the 50th minute there is a steadily increasing offset, which arises solely because the tape is not running at exactly 25 frames per second. The frame rate differs by only 0.026%. But that is enough for you to notice, from about the 10th minute, that the audio track lags behind the picture: by then the shift is greater than 100 ms. At 50 minutes, the offset has increased to 800 ms. That is almost one second.
From the 50th minute on, the audio delay suddenly increases sharply. In that range I could see stepwise movements in the digitized video, so there were dropped frames or (in this case) delayed frames. In the audio track there were also some small clicks and crackling noises. I therefore conclude that the dropped frames result from disturbances in the VHS tape transport and were not caused by excessive computer load during digitizing.
The time corrections of the audio track were then performed in Wavelab. I corrected each section of the graph between two points separately. The correction value used as the offset is always the difference between the offset at the end of the section and the offset at its beginning. For example, between 1:05 and 1:07 there is a 200 ms delay. After these adjustments, sound and image were in sync.
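The section-by-section rule (correction = offset at the end of the section minus offset at its beginning) can be sketched as follows. The readings are hypothetical and are not the values from the figure; the function name is illustrative:

```python
def section_corrections(points):
    """points: list of (time_label, offset_ms) read from the delay graph,
    in playback order. Returns one (start, end, correction_ms) tuple per
    section between two neighbouring points."""
    return [
        (t0, t1, off1 - off0)
        for (t0, off0), (t1, off1) in zip(points, points[1:])
    ]

# Hypothetical readings (time in h:mm, measured offset in ms)
readings = [("0:00", 0), ("0:50", 800), ("1:05", 1500), ("1:07", 1700)]

for start, end, corr in section_corrections(readings):
    print(f"{start} to {end}: correct audio length by {corr} ms")
```

Each tuple corresponds to one separate time correction applied to that section of the audio track.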
Step 1: Find out the exact delay time between video and audio
Before you can correct the audio offset, you must first find out exactly how big it is.
a) By reading the file information of your video
Unfortunately this only works with MPG files.
In the AVI files that were recorded with the Debut Video Capture software and the PICVideo MJPEG encoder, the running times reported for image and sound were not exact.
Open the video file in Avidemux: File > File information

0.01381% delay between audio and video. This is about 0.5 s offset per hour of video run time.
You see the length of the video stream (based on 25.00 fps) and the length of the audio stream (which corresponds to the actual recording time). To express the delay in percent, convert both values into seconds, divide the audio length by the video length, subtract 1 and multiply by 100. In the example, you get an offset of 0.01381%.
You can enter this value directly as the audio length correction in Wavelab.
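The percentage can be worked out with a few lines of Python. Note that the ratio of audio length to video length is a number very close to 1, so 1 is subtracted before converting to percent. The durations below are hypothetical, not the ones from the screenshot, and the function names are only illustrative:

```python
def to_seconds(hours, minutes, seconds):
    """Convert a run time shown as h:mm:ss.sss into seconds."""
    return hours * 3600 + minutes * 60 + seconds

def offset_percent(audio_seconds, video_seconds):
    """Relative length difference of the audio track in percent.
    Positive: audio is longer than the video track."""
    return (audio_seconds / video_seconds - 1.0) * 100.0

# Hypothetical values read from File > File information
video = to_seconds(0, 30, 0.0)    # video stream, based on 25.00 fps
audio = to_seconds(0, 30, 0.25)   # audio stream, actual recording time

print(f"{offset_percent(audio, video):.5f} %")
```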
For grabbing, however, we do not use MPG files, because the image is compressed, which is bad when it comes to deinterlacing.
b) By calibrating your recording setup
Make a test recording of 10…30 minutes of your video as an MPG file (you may use the WinAVI Video Grabber software). Then you can apply method a) to your recording to get the exact percentage of audio delay.
If all of your videos are running with exactly the same playback speed, they all need the same correction value.
After that you can record the video with a different codec. The
percentage of offset between sound and image will remain the same.
The calibration method works only if your VHS player is playing the videos with the same constant synchronization setting. If your player uses an automatic synchronization adjustment, you cannot ensure that the speed has not changed. Most likely you have to apply method c) in this case.
c) The trial-and-error method
The trial-and-error method is less accurate. With some practice you will be able to estimate the audio delay to within about 100 ms.
Open the video with a video player (MPC, Media Player Classic, or VLC media player). If your video does not have too many image breakups, you can assume that the offset between audio and video increases steadily. Consequently, the offset at the beginning of the video is 0, and it reaches its maximum at the end.
You can determine this maximum as follows:
Go to the end of the video to a location where you can easily compare the
sound with the corresponding images. Images with clearly visible lip
movements are favorable.
In Media Player Classic you reach the audio settings via: Right-click > Audio > Options.
Activate [x] Audio time shift and put in any estimated value. You can input positive or negative values.
Pause the video and let it play again. Now check whether your setting gives better synchronisation. Gradually refine the numerical value for the audio time shift. Once the audio is lip-synced, write down the exact value.
For videos with strong image breakups you have to evaluate audio delay after each series of video dropouts because it may increase stepwise.
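If the offset really grows steadily from 0, the value found near the end of the video can be converted into a percentage for the time correction, and the expected offset at any other position follows by linear interpolation. A minimal sketch under that linearity assumption; the numbers are hypothetical and the function names illustrative:

```python
def correction_percent(offset_ms, position_minutes):
    """Percentage by which the audio length has to change, assuming the
    offset grows linearly from 0 at the start to offset_ms at the
    position where it was measured."""
    return offset_ms / 1000.0 / (position_minutes * 60.0) * 100.0

def expected_offset_ms(offset_ms, position_minutes, at_minutes):
    """Linearly interpolated offset at another position of the same video."""
    return offset_ms * at_minutes / position_minutes

# Hypothetical measurement: 1100 ms audio time shift needed at minute 90
print(f"{correction_percent(1100, 90):.5f} %")
print(f"{expected_offset_ms(1100, 90, 45):.0f} ms expected at minute 45")
```

Checking the interpolated value at one or two positions in the middle of the video is a quick way to see whether the linear assumption holds or whether there are steps caused by dropped frames.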

The example shows in Media Player Classic an audio time offset adjusted to +1100 ms.
Note: MPC will remember the last used value for audio time offset. Therefore, you should set the value to zero afterwards.
If you are correcting the audio track step by step, you can load the corrected audio track in MPC as a dub and select it instead of the original soundtrack (File > Open file > Dub: Browse, and then Play > Audio: Select audio track). This way you do not need to mux the audio track back into the video file after each correction step.
How do you sync audio and video (repair digital recordings from VHS tape)
Save audio
In order to process the audio track, you will need to extract it from the video file. To do this:
In Virtualdub:
Open video file
Virtualdub > File > Save WAV...
In Avidemux:
Open video file
Avidemux > Audio > Save Audio track...
If you have recorded the audio as an AC-3 stream (Dolby Digital), you can export it as a PCM stream in this way:
In Avidemux:
Open video
Video: Copy
Audio: PCM
File > Save as...
A new video file is written, where only the audio track is newly rendered.
Open the new video in Avidemux and save the audio track.
Alternatively, you can save the AC-3 audio data stream directly from Avidemux.
You then need the tool "BeSweet", which converts AC-3 data into wave files.
Correct audio run time with Wavelab
Open the audio file in Wavelab.
First correct audio volume.
If necessary, you can also filter the sound, declick, denoise or dehum.
If you grabbed the video with the USB Video Grabber, you should apply a 60 Hz high pass filter.
Store the result.
You may have to return to this point if you make mistakes during the following run-time correction.
Time correction with Wavelab:
Wavelab > Perform > Time correction
You can specify either the planned duration, or the percentage of the desired correction.
I recommend the latter, since the numerical value will be the same for most of your videos.
In the options select Quality: Good, [x] maintain pitch
You should use the Dirac processor only if your video contains music. A time correction with the Dirac processor takes up to several hours (!). Without it, it takes just a few minutes.
Attention: the Wavelab time correction dialog opens with the last used settings. You therefore first have to click the Ratio: Source button to reset the destination length to 100% of the input length. Then just add the necessary time offset to this value (in most cases a few hundred ms).
Save the finished file as a new file with a different name so you can take a step back if necessary.
Correct audio run time with other audio software
Please submit suggestions to me: sven@engon.de
Testing
Direct test in MPC:
File > Open file > Dub: Browse, select corrected audio file
Play > Audio: select the corrected audio track
Play the video. Audio time offset should be set to 0.
Direct Test in Virtualdub:
Virtualdub > Audio > Audio from other file...: choose your corrected audio file.
Play different positions of the video to check whether the audio is now in sync with the running video. When everything is okay, you can filter the video and do the final encoding.
Indirect Test with a software video player and Virtualdub:
Virtualdub > Audio > Audio from other file...: choose your corrected audio file.
Video: Direct stream copy
Save the new AVI video file with a different name (this is really fast, because the video is not rendered again)
Load this file into a video player and check if the corrected audio track is in sync with the video stream.
When everything is fine, you can filter the video and do the final encoding.
Version b) Sync video to audio by correcting the video data stream
In principle, one could also change the duration of the video stream by inserting or deleting intermediate frames. This may lead to jerky movements, so I do not recommend it.
We need VirtualDub and Avidemux.
1. Finding out the real frame rate (this works with MPEG video only)
Open the video with VirtualdubMPEG
Virtualdub > Video > Video frame rate control:
[o] Change so video and audio durations match (Value fps)
(This option changes the video refresh rate to a level at which the video track becomes as long as the audio track.)

Figure: Show the real frame rate in Virtualdub (based on the audio run time)
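The value shown by this option is presumably just the number of video frames divided by the audio run time. VirtualDub's exact internal calculation is not confirmed here, so treat the following as a sketch of the idea with hypothetical numbers and an illustrative function name:

```python
def matching_frame_rate(frame_count, audio_seconds):
    """Frame rate at which frame_count frames last exactly as long
    as the audio track."""
    return frame_count / audio_seconds

nominal_fps = 25.0
video_seconds_at_nominal = 3600.0   # hypothetical: video length at 25 fps
audio_seconds = 3600.5              # hypothetical: actual recording time

# Number of frames in the file, derived from the nominal run time
frames = round(video_seconds_at_nominal * nominal_fps)

print(f"{matching_frame_rate(frames, audio_seconds):.4f} fps")
```

If the audio track is longer than the video track, the result is slightly below 25 fps; if it is shorter, the result is slightly above.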
If the input video is in MPG format, Virtualdub cannot save it without re-encoding.
But re-encoding is better done in Avidemux, because we want to use x264 for the final encoding.
Therefore we perform the following steps in Avidemux.
2. Assign the real frame rate to the video track
Open video in Avidemux
Video > Framerate: input the value supplied by Virtualdub
File > properties: the video should now display the correct value for run time (length of audio and video should match)

Fig 2: Video and Audio with wrong fps
Fig 3: Video and Audio with corrected fps
3. Generate intermediate frames and render the new movie with 25 frames per second
Although the fps value is now set correctly, most graphics cards will not play this movie properly, because they only support standard frame rates of 25 (50) and 30 (60) fps. Therefore a video filter that inserts or removes frames to correct the run time must be added before rendering.
Video Filters: insert the filter "Frame rate" and enter the desired target frame rate (50.000)

Video filters in Avidemux for VHS (Resample fps)
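To illustrate what inserting or removing frames means, here is a generic sketch of frame rate resampling by repeating or skipping source frames. It is not the algorithm of the Avidemux filter, whose internals are not described here; it only shows the principle, and the function name is illustrative:

```python
def resample_indices(source_frame_count, source_fps, target_fps):
    """For every output frame, return the index of the source frame that
    is on screen at that moment. Repeated indices are inserted (doubled)
    frames, skipped indices are removed frames."""
    duration = source_frame_count / source_fps
    out_count = int(round(duration * target_fps))
    return [
        min(int(k * source_fps / target_fps), source_frame_count - 1)
        for k in range(out_count)
    ]

# Raising the rate: one source frame has to be shown twice
print(resample_indices(5, 4.0, 5.0))

# Lowering the rate: every second source frame is skipped
print(resample_indices(6, 6.0, 3.0))
```

With a real frame rate only slightly off the target, only an occasional frame is doubled or skipped, which is where the jerky movements mentioned above come from.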
Now render the movie. More hints on the specific filters chosen here are given in VHS video filter settings in AVIDEMUX and in VHS video filter settings in Virtualdub.
