A Quick Guide to Digital Video Resolution and Aspect Ratio Conversions

Contents

  1. Introduction
  2. The Connection Between the Analog and the Digital
  3. A Conversion Table for Digital Video Formats
  4. Frequently Argued Questions
  5. Related Links

Recent updates

15-Jan-2008

25-Feb-2006

28-Feb-2004

22-Feb-2004

20-Feb-2004

14-Feb-2004

26-Jun-2003

11-Feb-2003

26-Jun-2002

13-Apr-2002

11-Apr-2002

10-Apr-2002

9-Apr-2002

6-Apr-2002

Abstract

Despite the ever-growing number of people working with digital video formats daily, there is still a great deal of confusion about how their image geometry and aspect ratios actually work. This document tries to shed some light on these issues.

Feel free to e-mail me any comments, corrections, suggestions, additions or opinions. Should you come across a broken link, please let me know so I can fix it.

Acknowledgements

My warm thanks go to Colin Browell, Andy Furniss, Ole Hansen, Paul Keinanen, and Olafs, who have provided valuable comments and feedback concerning this page.

Linking to this document

You are free to link to this document. If you do so, please use the URL <http://www.iki.fi/znark/video/conversion/>. This ensures that the link will always work, regardless of the actual physical location of this site.


1. Introduction

There is a fair number of mind-blowing, scary oddities and secrets in the world of digital video.

One of the very first a beginner will usually encounter is the fact that in digitized video data, pixels are often not considered "square" in their form. In most real-world digital video applications pixels have a width/height ratio – or aspect ratio, as it is more conveniently called – that can be something completely different from 1/1!

The second great revelation usually comes when one runs into the concept of anamorphic 16:9 video for the very first time. If it was initially hard to grasp the idea of pixels changing their shape when displayed in different environments, this one is even more baffling: the very same pixel resolution you have only just learned to associate with 4:3 displays can now suddenly represent another, totally different image geometry. In other words, the pixels have changed their shape again!

Unfortunately, these two are often the only things most ordinary people will ever learn about digital video and aspect ratios.

1.1 The dirty little secret revealed

Tutorials and manuals usually tend to keep very quiet and secretive about the finer technical details of digital video, particularly when it comes to the topic of (pixel) aspect ratios and image geometry.

Even if converting (resampling) video clips to other resolutions is discussed, the accompanying explanation is usually troublingly simplistic and vague – often inaccurate and misleading – and sometimes the suggested methods are just plain wrong. It is not uncommon that the examples only deal with arbitrarily chosen ("x pixels by y pixels") frame dimensions and use ideal frame aspect ratios such as 16:9 or 4:3 as the basis for calculations – not the actual pixel aspect ratios – which is usually a good indicator that the writer may not actually take the real image geometry into account at all.

It is almost as if the whole aspect ratio issue was considered some sort of dirty little secret of the video industry; black magic you could not even begin to explain to mere mortals in reasonable terms. This is a shame. In this case, there is really more to it than meets the eye. Confusing people with incomplete and watered-down explanations does not do any good to the industry. 

Now that you have read this far, it is time to reward your effort with The Third Big Revelation about aspect ratios and frame sizes - the one that is usually left unsaid:

Not a single one of the commonly used digital video resolutions exactly represents the actual 4:3 or 16:9 image frame.

Shocking, isn't it? 768x576, 720x576, 704x576, 720x480, 704x480, 640x480... none of them is exactly 4:3 or 16:9; not even the ones you may conventionally think of as "square-pixel" resolutions.

So there. Now you finally know the truth. Let's find out what it actually means.

2. The Connection Between the Analog and the Digital

Digital video standards do not live outside the realm of the analog world. On the contrary, all commonly used modern (SDTV) digital video formats have a well-defined relationship with their counterparts in analog video standards. You could really say they have their roots in analog soil.

And now, my friend, we are rapidly closing in on The Fourth Big Revelation:

It is really the analog video standards that define the image geometry and pixel aspect ratio in digital formats.

Even if you did all of your video work solely in the digital domain, those pesky old analog video standards would still define the shape of your images and pixels.

How come?

From the video industry's point of view, the current (SDTV, as opposed to HDTV, which is another kettle of fish) digital video formats - those that actually get used in practical real-life applications such as DVD, DV, VCD, SVCD, digital television etc. - are all about interoperability. At the advent of digital video - the late 1970s, when committee work was started on CCIR 601 (later to become ITU-R BT.601) - there was already a vast catalog of analog video material in formats defined solely by analog standards. What is more, enormous amounts of money had been poured into analog studio equipment such as cameras, video switchers, proc amps, tape decks and other tools of the trade. What a waste it would have been if the "next generation" digital video formats had been designed in such a way that they had absolutely nothing in common with the old analog formats, and required ditching all the analog equipment!

It was clear from the beginning that the industry wanted a smooth, well-defined transition path between the current analog systems and the brave new digital world without running into too many compatibility issues. It was also considered necessary to be able to freely mix and match digital and analog equipment. The result was that the digital (SDTV) video formats we now use are based on the concept of digitizing old, analog video signals, thus interlocking with the analog video standards.

This connection between the digital and analog domains is permanent. Some of the fundamental features of digital video, such as image geometry, are actually defined in the analog standards. Even if we go all-digital, the relationship is still there, as long as we use either ITU-R BT.601 pixels or "industry standard" square pixels.

2.1 What does it mean?

There are three basic sampling rates from which almost all modern digital video formats are derived:

13.5 MHz ITU-R BT.601 (aka CCIR 601 aka Rec. 601) non-square pixels for both 625/50 and 525/59.94 systems. This sampling rate was originally designed for digitizing component video signals. Now used extensively in almost all modern digital video gear.
14.75 MHz "Industry standard" square pixels for 625/50 systems. Originally designed for digitizing composite video signals.
12 + 3/11 MHz SMPTE 244M "industry standard" square pixels for 525/59.94 systems. Originally designed for digitizing composite video signals.

Let's see how this works out with 13.5 MHz and both 525/59.94 and 625/50 systems:

If you have the B/W (luminance) part of a component video signal in a coaxial cable, you can plug in an A/D converter and start metering (sampling) the voltage level in the cable at regular intervals.

It also works the same way for square-pixel sampling rates. You will just get a different number of horizontal samples. The calculations are left as an exercise to the reader.
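As a rough sketch of that exercise: the number of samples that fall inside the active picture is the sampling rate multiplied by the active line duration. The durations used below (52 µs for 625/50 and 4739/90 µs, about 52.656 µs, for 525/59.94) are not stated in this section; they are inferred from the conversion table in section 3, and the function name is made up for illustration.

```python
from fractions import Fraction

# Active line durations in microseconds. Assumption: inferred from the
# table in section 3 (702 samples and 710.85 samples at 13.5 MHz).
ACTIVE_LINE_US = {
    "625/50": Fraction(52),
    "525/59.94": Fraction(4739, 90),
}

def active_samples(sampling_rate_mhz, system):
    """Samples per active line = sampling rate (MHz) * active duration (us)."""
    return Fraction(sampling_rate_mhz) * ACTIVE_LINE_US[system]

bt601 = Fraction(27, 2)             # 13.5 MHz
square_625 = Fraction(59, 4)        # 14.75 MHz
square_525 = 12 + Fraction(3, 11)   # 12 + 3/11 MHz

print(active_samples(bt601, "625/50"))             # 702
print(float(active_samples(bt601, "525/59.94")))   # 710.85
print(active_samples(square_625, "625/50"))        # 767
print(active_samples(square_525, "525/59.94"))     # 14217/22, i.e. 646 + 5/22
```

Exact fractions are used instead of floats so that the results can be compared directly with the figures in the table.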

2.3 I am already lost!

If you did not understand a word of the above, you might want to take a look at the following introductory links:

Also see the Related Links section.

3. A Conversion Table for Digital Video Formats

The following is a frame size and aspect ratio conversion table, representing many commonly used digital video formats:

The formats related to 625-line systems with a 50 Hz field rate

Sampling matrix  Sampling       Pixel aspect  Matrix width  Active picture   Inter-
width   height   rate (MHz)     ratio (x/y)   (µs)          width    height  lacing  Notes
------  ------   -------------  ------------  ------------  -------  ------  ------  -----
768     576      14.75          768/767       52.06780      767      576     Y       "Industry standard" 625/50 square-pixel video
768     576      14 + 10/13     1/1           52.00000      768      576     Y       "True" computer square-pixel resolution
768     560      14.75          768/767       52.06780      767      576     Y       CD-i
720     576      13.5           128/117       53.33333      702      576     Y       D1, DV, DVB, DVD, SVCD
720     540      ambiguous      1/1           ambiguous     720      540     N       Oddball compromise format. Better to avoid unless you really know what you are doing.
704     576      13.5           128/117       52.14815      702      576     Y       DVD, H.263 (4CIF), VCD
702     576      13.5           128/117       52.00000      702      576     Y       Active picture frame for 625/50 systems in ITU-R BT.601-4 pixels.
544     576      10.125         512/351       53.72840      526+1/2  576     Y       DVB (3/4 of BT.601 sampling rate)
480     576      9              128/78        53.33333      468      576     Y       SVCD (2/3 of BT.601 sampling rate)
384     288      7.375          768/767       52.06780      383+1/2  288     N       1/4 of "industry standard" 768x576
384     280      7.375          768/767       52.06780      383+1/2  288     N       CD-i
352     576      6.75           256/117       52.14815      351      576     Y       DVD
352     288      6.75           128/117       52.14815      351      288     N       VCD, DVD, H.261 + H.263 (CIF)
176     144      3.375          128/117       52.14815      175+1/2  144     N       H.261 + H.263 (QCIF)

The formats related to 525-line systems with a 59.94 Hz field rate

Sampling matrix  Sampling       Pixel aspect  Matrix width  Active picture   Inter-
width   height   rate (MHz)     ratio (x/y)   (µs)          width    height  lacing  Notes
------  ------   -------------  ------------  ------------  -------  ------  ------  -----
720     540      ambiguous      1/1           ambiguous     720      540     N       Oddball compromise format. Better to avoid unless you really know what you are doing.
720     486      13.5           4320/4739     53.33333      710.85   486     Y       D1
720     480      13.5           4320/4739     53.33333      710.85   486     Y       DV, DVB, DVD, SVCD
711     486      13.5           4320/4739     52.66667      710.85   486     Y       Active picture frame for 525/59.94 systems in ITU-R BT.601-4 pixels.
704     486      13.5           4320/4739     52.14815      710.85   486     Y
704     480      13.5           4320/4739     52.14815      710.85   486     Y       ATSC, DVD, VCD
648     486      12 + 1452/4739 1/1           52.65556      648      486     Y       "True" computer square-pixel resolution (all 486 active scanlines)
640     480      12 + 3/11      4752/4739     52.14815      646+5/22 486     Y       D2: "industry standard" 525/59.94 square-pixel video
640     480      12 + 1452/4739 1/1           52.00549      648      486     Y       "True" computer square-pixel format (cropped)
480     480      9              6480/4739     53.33333      473.9    486     Y       SVCD (2/3 of BT.601 sampling rate)
352     480      6.75           8640/4739     52.14815      355.425  486     Y       DVD
352     240      6.75           4320/4739     52.14815      355.425  243     N       VCD, DVD
320     240      6 + 3/22       4572/4739     52.14815      324      243     N       1/4 of 640x480

Table footnotes:

59.94 Hz is only a conventional approximation; the mathematically exact field rate is 60 Hz * 1000/1001.
A calculated sampling rate, represented here only for completeness. Does not exist in actual 525/625 video equipment.
Only used for still images.
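The pixel aspect ratios in the tables appear to follow from the sampling rates alone: dividing the "true" square-pixel sampling rate by the actual sampling rate reproduces the listed values for full-height frames. The sketch below checks this against a few rows. The helper name and this way of framing the relationship are illustrative assumptions rather than something the tables state explicitly, and half-height formats such as 352x288 would need the result halved.

```python
from fractions import Fraction

# "True" square-pixel sampling rates (MHz), as listed in the tables.
SQUARE_RATE_MHZ = {
    "625/50": 14 + Fraction(10, 13),
    "525/59.94": 12 + Fraction(1452, 4739),
}

def pixel_aspect_ratio(sampling_rate_mhz, system):
    """Pixel width/height for a full-height frame sampled at the given rate."""
    return SQUARE_RATE_MHZ[system] / Fraction(sampling_rate_mhz)

print(pixel_aspect_ratio(Fraction(27, 2), "625/50"))           # 128/117
print(pixel_aspect_ratio(Fraction(27, 2), "525/59.94"))        # 4320/4739
print(pixel_aspect_ratio(Fraction(59, 4), "625/50"))           # 768/767
print(pixel_aspect_ratio(12 + Fraction(3, 11), "525/59.94"))   # 4752/4739
```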

3.1 How to use the table for conversions

Let's assume you have a video clip in one format and wish to convert it to another, so that it remains in correct aspect ratio throughout the process.

  1. Locate your source and target formats in the table.
  2. Calculate the vertical conversion factor by using the following formula: vertical_conversion_factor = target_active_picture_height / source_active_picture_height. (Be sure to use the active picture values from the table, not the sampling matrix size values.)
  3. Calculate the horizontal conversion factor from the pixel aspect ratios given in the table: horizontal_conversion_factor = (source_pixel_aspect_ratio) / (target_pixel_aspect_ratio) * (vertical_conversion_factor)
  4. Calculate the new horizontal size: target_sampling_matrix_width = horizontal_conversion_factor * source_sampling_matrix_width
  5. Calculate the new vertical size: target_sampling_matrix_height = vertical_conversion_factor * source_sampling_matrix_height
  6. Resample the image to the new size
  7. Check if the new size matches the target resolution's sampling matrix dimensions. If not, crop (i.e. cut at the edges) and pad (i.e., add black borders) accordingly so that it will.
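Steps 2 to 5 above can be sketched as a small function. This is only an illustration under the assumption that the caller looks up the pixel aspect ratios and active picture heights in the table by hand; the function and parameter names are invented here, and the resampling, cropping and padding of steps 6 and 7 are left to whatever tool you use.

```python
from fractions import Fraction

def conversion_size(src_matrix, src_active_height, src_par,
                    dst_active_height, dst_par):
    """Return the (width, height) the source matrix should be resampled to."""
    src_w, src_h = src_matrix
    v = Fraction(dst_active_height, src_active_height)   # step 2
    h = Fraction(src_par) / Fraction(dst_par) * v        # step 3
    return h * src_w, v * src_h                          # steps 4 and 5

# 720x576 BT.601 (625/50) to BT.601 pixels for 525/59.94, values from the table.
w, h = conversion_size((720, 576), 576, Fraction(128, 117),
                       486, Fraction(4320, 4739))
print(w, h)        # 9478/13 486
print(round(w))    # 729
```

Keeping the factors as exact fractions postpones rounding until the very end, which is where the table's fractional widths come from.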

3.2 Some practical examples of the above

3.2.1 640x480 "industry standard" square pixels to 720x480 ITU-R BT.601 pixels

Let's say I have captured a video clip from a 525/59.94 source using an old M-JPEG card that only allows sampling in the "industry standard" (12+3/11 MHz) square-pixel format. The resolution of the clip is 640x480. Now I would like to incorporate this into a DV project that uses ITU-R BT.601 pixels and a resolution of 720x480.

  1. The first step is to look up the correct source and target formats from the table.
  2. The second step is to calculate the vertical conversion factor. In our case, it is 486/486 = 1
  3. Now we need a horizontal rescaling factor, which in this case is (4752/4739) / (4320/4739) * 1, which equals 11/10.
  4. Then we can calculate the new image width from the old one: 11/10 * 640 = 704 pixels
  5. The image height will stay unchanged, since 1 * 480 is still 480.
  6. Thus, we need to resample the 640x480 image to 704x480.
  7. However, our original target resolution was 720x480. Now we need to pad the image (with black vertical bars on the sides) so that the frame width will become 720 pixels. Since 720 - 704 = 16, we need to add 8 pixels of black to each side edge.

3.2.2 720x576 ITU-R BT.601 pixels to 720x480 ITU-R BT.601 pixels

In other words, a "PAL" to "NTSC" conversion:

  1. Again, the first step is to look up the correct source and target formats from the table.
  2. We need to calculate the vertical conversion factor. In our case, it is 486/576 = 27/32
  3. Now we need a horizontal rescaling factor, which in our case is (128/117) / (4320/4739) * (27/32), which equals 4739/4680.
  4. Then we can calculate the new image width from the old one: 4739/4680 * 720 = 729+1/13 pixels
  5. The new image height will be 27/32 * 576 = 486 pixels.
  6. Thus, we need to resample the 720x576 image to (729+1/13)x486. As we normally cannot use subpixel sampling, we must round the figure 729+1/13 to some reasonable number - in this case probably 729.
  7. However, our original target resolution was 720x480. Now we need to crop the 729x486 image sufficiently from the edges so that the frame width will become 720 pixels and frame height 480 pixels.
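The final crop-or-pad step of both examples can be worked out per axis. The helper below is a hypothetical sketch, not part of any particular tool; it assumes the picture should stay centered and, when the difference is odd, sends the extra pixel to the right or bottom edge.

```python
def fit_axis(resampled, target):
    """Return (crop_start, crop_end, pad_start, pad_end) for one axis.

    Assumption: the picture is kept centered, with any odd pixel going
    to the far (right or bottom) edge.
    """
    excess = resampled - target
    if excess >= 0:
        return excess // 2, excess - excess // 2, 0, 0
    missing = -excess
    return 0, 0, missing // 2, missing - missing // 2

print(fit_axis(704, 720))   # (0, 0, 8, 8)  pad 8 pixels on each side
print(fit_axis(729, 720))   # (4, 5, 0, 0)  crop 9 columns in total
print(fit_axis(486, 480))   # (3, 3, 0, 0)  crop 6 scanlines in total
```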

4. Frequently Argued Questions

4.1 Isn't 720 the real width of a 4:3 image? If not, then why are 720 pixels sampled instead of 711 or 702 (or whatever)?

720 pixels are sampled to allow for a little deviation from the ideal timing values for blanking and active line length in the analog signal. In practice, an analog video signal - especially if coming from a wobbly home video tape recorder - can never be that precise in timing. It is useful to have a little headroom for digitizing all of the signal even if it is of a bit shoddy quality or otherwise non-standard.

720 pixels are also sampled to make sure that the signal-to-be-digitized has had the time to slope back to blanking level at both ends. (This is to avoid nasty overshooting or ringing effects, comparable to the clicks and pops you can hear at the start and end of an audio sample.)

Last but not least, 720 pixels are sampled because a common sampling rate (13.5 MHz) and number of samples per line (720) make it easier for hardware manufacturers to design multi-standard digital video equipment.

4.2 What does this mean, considering ITU-R BT.601 compliant equipment?

It means that the sampled horizontal range of the signal is a bit wider than the actual active image frame:

Yes, you understood correctly. 720x576 is not exactly 4:3, and neither is 720x480. The real 4:3 frame (as defined in the analog video standards) is a bit narrower than the horizontal range of signal that actually gets digitized.

Yes, it is the same for all generally available digitizing equipment; tv tuner cards, digital video cameras and such. It is true even for all-digital systems; otherwise they would not be compatible with ITU-R BT.601.
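To see the size of the discrepancy, the frame aspect ratio can be computed as width times pixel aspect ratio divided by height. The sketch below uses the pixel aspect ratios and active picture sizes from the table in section 3; the function name is illustrative only.

```python
from fractions import Fraction

def frame_aspect_ratio(width, height, par):
    """Displayed width/height of a frame with the given pixel aspect ratio."""
    return Fraction(width) * Fraction(par) / height

PAR_625 = Fraction(128, 117)     # BT.601 pixels, 625/50
PAR_525 = Fraction(4320, 4739)   # BT.601 pixels, 525/59.94

print(frame_aspect_ratio(720, 576, PAR_625))   # 160/117, roughly 1.3675
print(frame_aspect_ratio(702, 576, PAR_625))   # 4/3
print(frame_aspect_ratio(720, 480, PAR_525))   # 6480/4739, roughly 1.3674
print(frame_aspect_ratio(Fraction(14217, 20), 486, PAR_525))   # 4/3 (710.85 x 486)
```

Only the active picture sizes come out at exactly 4:3; the full 720-pixel sampling matrices are slightly wider.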

4.3 You must be kidding! I am pretty sure there is a mistake in your calculations. It says everywhere that 720x576 or 720x480 really is 4:3. Please stop propagating this misinformation!

I admit that the figures presented on this web site are not very well-known facts even amongst professional videographers, not to mention hobbyists. Aspect ratio is one of the most misunderstood "black magic" issues in digital video. That is precisely why I constructed the web site in the first place - to share the knowledge.

As for my calculations; feel free to prove them wrong. For starters, you might want to read the documents in the Related Links section.

4.4 I have been doing digital video projects for the last 50 years. I know my stuff! If you were correct, everything I have done to process my precious video has always been wrong, aspect-ratio wise!

That may very well be the sad truth. Fortunately, even if you had used wrong methods for scaling/resampling the image, the difference between the correct aspect ratio and a wrong aspect ratio is often small enough to go unnoticed unless you really start looking for it.

4.5 It still does not make any sense. For starters, all the 525/59.94 equipment I have only works in 720x480, not in 720x486 (and definitely not in 711x486)! How do you explain that?

A 525/59.94 video signal has 486 active (image-carrying) scanlines, but modern digital video equipment usually crops 6 of them off. Why? To get the height of the image down to 480 pixels, which is neatly divisible by 16. See for yourself:

486 / 16 = 30.375
480 / 16 = 30

Also note that 720 / 16 equals exactly 45, so the width of the image is divisible by 16 as well!

4.5.1 Why is it important to have the height and width of the raster image divisible by 16?

Modern digital video applications such as DV, DVD and digital television (DVB, ATSC) often use MPEG-1 or MPEG-2 formats (or their derivatives) which are all based on 16x16 pixel macroblocks. Having the height and width of the image readily divisible by 16 makes it easier and more efficient for an MPEG encoder to compress video.
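A quick way to check whether a frame size tiles cleanly into macroblocks is shown below. This is a hypothetical helper for illustration; it says nothing about how a real encoder handles frame sizes that do not divide evenly.

```python
def macroblocks(width, height, size=16):
    """Return (columns, rows) of whole macroblocks, or None if the frame
    does not divide evenly into size x size blocks."""
    if width % size or height % size:
        return None
    return width // size, height // size

print(macroblocks(720, 480))   # (45, 30)
print(macroblocks(720, 576))   # (45, 36)
print(macroblocks(720, 486))   # None
```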

4.5.2 Doesn't this mean that when capturing in 720x480, I will lose six scanlines worth of valuable information that was once present in the original video signal?

Correct, but the information might not have been that valuable in the first place. Most 525/59.94 video work is already done solely in the digital domain and in the 720x480 format, so there is usually nothing to digitize on those scanlines anymore. Moreover, in the good old days (when all of those 486 scanlines were still in active use) most of the time the edges only carried flickering VCR head noise.

The video image is masked by the overscan edges of a CRT based television, so you would not normally see the "missing" scanlines, anyway.

4.5.3 You keep saying the "real" 4:3 resolution is at about 711x486 for 525/59.94 systems. OK, maybe there really are 9 extra pixels on the sides, but how do I cope with the fact my equipment only records 480 active scanlines, not 486?

Think of it this way:

There is also another way of thinking about it:

The latter way of thinking will also lead to cropping off the side edges of the image to get it inside a 4:3 rectangle (albeit a bit smaller than the "real" one), but then again, if you are restricted to using 704x480, that decision has already pretty much been made for you.

4.6 What about standards conversion? Doesn't PAL 720x576 exactly equal NTSC 720x480?

As can be seen from the example in section 3.2.2, the answer is no. If you simply resample from 720x576 to 720x480, the analog active areas of the source and target formats will not match. Fortunately, there is a bit of fool-proofness built into the relationship between these two frame sizes. What you will actually get from the process is an image in which the original analog active area (the centermost 702x576 pixels of 720x576) has become 702x480 in the target format's pixels. This, in turn, almost represents a 4:3 area, albeit a bit smaller than what would be needed for a perfect conversion.

The area that 702x480 covers is not the same as the actual analog active image frame (which would be 710.85x486, or, in practical terms, 711x486). It is more like a smaller 4:3 frame inside it.

In other words, the result is that the active 4:3 image frame in the source format has shrunk a bit in the conversion: it has lost six (target) scanlines in the vertical direction and nearly the same relative amount of width. However, for all practical purposes, it has still retained its original aspect ratio. The easiest way to see this is converting 702x480 (in 13.5 MHz 525-line ITU-R BT.601 format) to "true" square pixels: 639 + 4419/4739 square pixels by 480 scanlines is a close enough match to 640x480, which is 4:3. Wonderful coincidence, isn't it? :)
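The "same relative amount" claim can be checked numerically by comparing how much the active frame shrinks in each direction. The figures below come from the table in section 3 (a 710.85x486 active frame for 525/59.94); the variable names are just for illustration.

```python
from fractions import Fraction

# Active 525/59.94 frame, and the area left after directly resampling
# 720x576 to 720x480 (both in 13.5 MHz BT.601 pixels).
active_w, active_h = Fraction(14217, 20), 486   # 710.85 x 486
result_w, result_h = 702, 480

shrink_h = result_w / active_w            # horizontal shrinkage
shrink_v = Fraction(result_h, active_h)   # vertical shrinkage

print(shrink_h, float(shrink_h))   # 4680/4739, about 0.98755
print(shrink_v, float(shrink_v))   # 80/81, about 0.98765

# Equivalent width in "true" square pixels, as quoted in the text.
square_w = result_w * Fraction(4320, 4739)
print(square_w - 639)              # 4419/4739
```

The two shrink factors differ by roughly one part in ten thousand, which is why the aspect ratio survives for all practical purposes.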

The same peculiar relationship applies to all 525/625 "sister resolutions" derived from 13.5 MHz:

This holds true on two conditions:

  1. The source sampling matrix width (in microseconds) must be exactly the same as the target's.
  2. You can only convert between a full-height 625-line resolution and a cropped-height 525-line resolution (i.e. use only those formats that represent exactly 480 scanlines worth of 525/60 data, instead of full 486.)

As direct resampling involves shrinkage (or when going in another direction, enlargement), I cannot really recommend this method for any real standards conversion work. It is more like a quick hack, suitable for use e.g. if the software does not allow proper resizing and cropping.

Note: Many people use direct resampling for all the wrong reasons: 1) They think that a 720x480 frame is directly equal to a 720x576 frame. 2) They also think that both aforementioned frame sizes represent exactly the active 4:3 (or 16:9) picture area, edge to edge. As you already know from Section 2.1, both of these assumptions are wrong. The fact that direct resampling works at all is mostly a quirky coincidence.

4.7 What do you mean by saying it is better to avoid 720x540?

The problem with this resolution is that while you think you are editing in a format that is both 1) 4:3 square pixels and 2) easily convertible to a standard video resolution (either 720x576 or 720x480) just by vertical resampling, you are not. See the table. There is no real-world video format that would use the full 720 pixel horizontal range as the width of the active 4:3 frame.

In order to get to a standard video format from this one, you need to take into account the actual form of the sampling matrices. The 4:3 area in 625-line formats is 702x576, not 720x576. In 525-line formats it is 711x486, not 720x480. Resizing a 720 pixels wide 4:3 format directly to 720x576 or 720x480 simply won't work. You will either have to resample in both directions (contrary to what you originally thought, you do not get to keep the image width neatly as 720 pixels at all times), or crop some top and bottom lines off.

If you need to construct an intermediary square-pixel resolution that is a) exactly 720 pixels wide and b) covers exactly the same area as 720x576 or 720x480 (thus only having to resample in vertical direction for conversions), you will end up with two separate resolutions, one for each video standard:

Fortunately, the numbers will nicely round up to 720x527 for both standards.
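The two intermediate heights can be reconstructed from the pixel aspect ratios in the table: a square-pixel frame of the same width that covers the same area has the original height divided by the pixel aspect ratio. This is a sketch under that reasoning, with an invented function name.

```python
import math
from fractions import Fraction

def square_pixel_height(height, par):
    """Height of a square-pixel frame with the same width and the same
    picture area as a frame of the given height and pixel aspect ratio."""
    return Fraction(height) / Fraction(par)

h_625 = square_pixel_height(576, Fraction(128, 117))     # from 720x576
h_525 = square_pixel_height(480, Fraction(4320, 4739))   # from 720x480

print(h_625, float(h_625))   # 1053/2, i.e. 526.5
print(h_525, float(h_525))   # 4739/9, about 526.56
print(math.ceil(h_625), math.ceil(h_525))   # 527 527
```

`math.ceil` is used rather than `round` because Python rounds exact halves to the nearest even integer, which would turn 526.5 into 526.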

Note that the original interlaced field structure (if any) will go haywire as you mess around scaling in the vertical direction.

4.8 Why does your table list two slightly different definitions for square pixels?

"Square pixels", as digitized by a TV tuner or an M-JPEG card, are not exactly square. The "industry standard" sampling rates used in square-pixel video equipment actually give out pixels that are almost square, but not exactly. As you can see for yourself in the table, the difference is very small - for all practical purposes meaningless - but it is still useful to know that sampled "video" square-pixels differ a bit from ideal "computer" square pixels.

Converting "computer" square pixels to "video" square pixels is usually a futile effort. You will not see the difference, anyway, and probably only lose some quality in the interpolation process.

4.9 This is really scary and nasty stuff. I thought digital video was simple! Now my head hurts!

But that's just the way video is. Fortunately, the conversions are not really that complicated once you practice them a little.

4.10 I think you're just nit-picking. No-one will ever notice if I consider all "4:3" video formats just 4:3, without doing any complicated aspect ratio or "active image area" calculations.

Feel free to process your video just the way you like it. But there are still many people who would like to get as close to the ideal aspect ratio correctness as possible, instead of only using rough "ballpark figures" in their video work.

4.11 Help! My capture card does not seem to do it this way!

You may be correct. Professional video gear is very strict about conforming to the ITU-R BT.601 standard, and you can also generally trust DV camcorders and DVD players/recorders to use the correct sampling rates and pixel clocks. However, the PC hardware market is different: cheap mass-marketed tv tuner cards and "tv out" cards often seem to have design flaws and inaccuracies in their drivers. Sometimes they use the common, industry-standard frame formats (such as 720x480) with sampling rates that are just plain wrong or sufficiently off the mark to create problems.

It is usually not the hardware that is the culprit here – the chips on the card may be perfectly capable of producing images (or digitizing them) using exactly the correct sampling rates and pixel clocks, but the programmer who designed the driver that controls the hardware may have taken some special liberties and shortcuts, leading to inaccuracies. (Possibly the drivers for these problematic devices were designed by someone who has not studied the relevant video standards.)

Fortunately, you can check out your devices and, if necessary, calibrate your capture workflow by following these instructions. (The only way you can find these flaws for sure is by comparing test images as detailed in the above link, or by using a test card generator and an oscilloscope.)

5. Related Links


This page is maintained by Jukka Aho. Last updated: 15-Jan-2008