The picture formats
and resolutions supported in H.263 video are SQCIF at 128
* 96 pixels and QCIF at 176 * 144 pixels. As options, it also
supports CIF, 4CIF and 16CIF, which have 2, 4 and 8 times the
resolution of QCIF in each dimension respectively. The compression ratio of H.263
ranges from 1:1 up to 133:1, depending on the quality required.
[87]
H.263 is an ITU
standard that supports video compression (coding) and decompression
(decoding) for video-conferencing and video-telephony applications.
H.263 is designed
for video coding at bit rates of around 20-30 kbps and above.
It specifies the
requirements, data contents and format for a video encoder
and decoder [16].
Originally, H.263
was an ITU recommendation for very low bit rate encoding, such
as video telephony over normal analogue telephone lines (PSTN video
telephony) at bit rates below 28.8 kbps. It has since
been used across a wide range of bit rates, not only for low bit rate
applications but also as a replacement for H.261.
Several
data compression and decompression strategies are used in H.263. The
following gives a quick introduction to them.
The strategies implemented
in the encoding process are motion estimation and compensation,
Discrete Cosine Transform (DCT),
quantisation, entropy
encoding and the frame store.

Figure 2-9 H.263
video encoder block diagram [16]
Motion estimation
and compensation: To reduce the bandwidth, the current frame
is compared with the previous frame. Parts that are unchanged are not re-encoded;
only the differences are encoded. The current frame is divided
into 16 * 16 pixel blocks. Each block is compared with the surrounding area
in the previous frame to determine where it came from.
If a match is found, the motion is recorded as a motion vector and the matching
block is subtracted from the current block. After motion estimation and
compensation, the current frame retains only "residual" information.
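The block-matching step described above can be sketched as follows. This is a minimal illustration, not the search prescribed by the standard: the full-search strategy, the sum of absolute differences (SAD) cost, the +/- 7 pixel window and all function names are assumptions made for the example. Frames are plain lists of rows of luminance values.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def find_motion_vector(cur, ref, bx, by, n=16, search=7):
    """Full search over a +/- `search` pixel window (window size is an
    assumption). Returns (dx, dy, cost) for the best match found."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def residual_block(cur, ref, bx, by, dx, dy, n=16):
    """Difference between the current block and its matched reference block."""
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(n)] for y in range(n)]
```

When the match is perfect the residual block is all zeros, which is what makes the later transform and entropy-coding stages so effective on static or smoothly moving content.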
Discrete Cosine
Transform (DCT): These "residual" blocks are then compressed
using a two-dimensional Discrete Cosine Transform (DCT), which converts
the 2-dimensional data in each block into a set of coefficients.
H.263 uses an 8 * 8 DCT to decorrelate the 8 * 8 blocks of original
pixels or motion-compensated difference pixels and to compact
their energy into as few coefficients as possible [16] [17].
The details are described in the MPEG video part of section 2.1.1
Video codecs.
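As an illustration of the transform, the sketch below computes a direct 8 * 8 DCT and its inverse. It assumes the orthonormal DCT-II definition and uses the naive quadruple loop for clarity; real codecs use fast factorisations, and the function names are hypothetical.

```python
import math

N = 8  # block size used by the transform


def _scale(k):
    """Orthonormal scaling factor for frequency index k."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(pos, freq):
    """One-dimensional DCT-II basis function."""
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * N))


def dct_2d(block):
    """Naive 2-D DCT of an N x N block; coeffs[v][u] has vertical
    frequency v and horizontal frequency u."""
    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += block[y][x] * _basis(x, u) * _basis(y, v)
            coeffs[v][u] = _scale(u) * _scale(v) * s
    return coeffs


def idct_2d(coeffs):
    """Inverse of dct_2d: rebuilds the N x N block of samples."""
    block = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            s = 0.0
            for v in range(N):
                for u in range(N):
                    s += (_scale(u) * _scale(v) * coeffs[v][u]
                          * _basis(x, u) * _basis(y, v))
            block[y][x] = s
    return block
```

For a perfectly flat block all the energy ends up in the single coefficient `coeffs[0][0]`, which is the "energy compaction" property the text refers to.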
Quantisation:
This step discards data representing high-frequency
changes, which are not perceived by the human eye [17].
The transformed coefficients are divided by a scale factor.
Only the coefficients with significant values are kept; the other
coefficients have their values set to "0". This process
causes information loss.
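A simplified sketch of this step is shown below, together with the matching re-scaling used on the decoding side. It assumes a single uniform scale factor and truncation towards zero; the exact quantiser and reconstruction rules of the standard are not reproduced here, and the function names are hypothetical.

```python
def quantise(coeffs, scale):
    """Divide each coefficient by `scale` and truncate towards zero,
    so small coefficients become 0."""
    return [[int(c / scale) for c in row] for row in coeffs]


def rescale(levels, scale):
    """Approximate inverse of quantise: multiply each level by `scale`."""
    return [[level * scale for level in row] for row in levels]
```

With a scale factor of 10, for example, the coefficients 7.0 and -3.0 both quantise to 0 and cannot be recovered, while -25.0 comes back as -20. This is where the information loss mentioned above occurs.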
Entropy
encoding: A form of variable-length (Huffman) coding, this step ensures that, after quantisation,
frequently-occurring coefficient values are replaced
with short binary codes and infrequently-occurring values are
replaced with longer binary codes. The result is a sequence of variable-length
binary codes.
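The principle can be illustrated by building a Huffman code from the symbol frequencies of a sequence, as sketched below. This is only a demonstration of the idea: an H.263 codec uses variable-length code tables defined in the standard rather than deriving a code from each sequence, and the function name is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a dict mapping each distinct symbol to a binary code string,
    with shorter codes for more frequent symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)
        w2, _, codes2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing a distinguishing bit.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]
```

Because quantisation leaves many coefficients at zero, the zero value typically receives the shortest code, which is what makes the combination of the two steps effective.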
Frame
store: The variable-length binary codes are
combined with synchronisation and control information (motion
"vectors" for instance) to form the encoded H.263 bitstream. In parallel,
the quantised coefficients are re-scaled, inverse
transformed using an Inverse Discrete Cosine Transform and added to the
motion-compensated prediction, which reconstructs the frame as the decoder will see it. The
contents of this reconstructed frame are stored and are used by the motion estimator to determine
the best matching area for motion compensation
when the next frame is encoded.
Strategies implemented
in the decoding process are entropy decoding, re-scaling, inverse
discrete cosine transform and motion compensation.

Figure 2-10 H.263
video decoder block diagram [16]
Entropy decoding:
This process extracts the coefficient values and the
motion vector information from
the variable-length codes in the H.263 bitstream.
Rescaling: The
"reverse" of quantisation. The coefficients are multiplied by the scale factor to bring them back towards
their original values. As the insignificant coefficients were
set to zero during quantisation, some information from the block being
encoded can no longer be rebuilt.
Inverse Discrete
Cosine Transform (IDCT): Each block of samples
is recreated by reversing the DCT operation. These blocks hold
the differences between the current frame and the motion-compensated previous frame.
Motion compensation:
Starting from the previous frame, the difference
values in each block and their motion vector information are used
to reconstruct the values of the current frame. This
frame is not exactly the same as the original one because
data was lost during encoding. The reconstructed frame is
placed in a frame store and is used to motion-compensate the
next received frame.
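The reconstruction of one block in the decoder can be sketched as follows. The function name, the motion vector convention (the displacement from the block position to its match in the previous frame) and the clamping to an 8-bit sample range are assumptions made for the example.

```python
def reconstruct_block(prev, residual, bx, by, dx, dy):
    """Rebuild one block of the current frame: fetch the block of `prev`
    that the motion vector (dx, dy) points at and add the decoded residual.
    Results are clamped to the 0..255 range of 8-bit samples."""
    n = len(residual)
    return [[max(0, min(255, prev[by + dy + y][bx + dx + x] + residual[y][x]))
             for x in range(n)] for y in range(n)]
```

A decoder would repeat this for every block and write the results into its frame store, so that the finished frame can serve as the reference for the next one.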
Real-time video
communications: This is a practical issue that varies between
environments. To realise real-time communication,
some extra controls should be considered: bit rate control,
synchronisation and audio multiplexing.
Firstly, the bandwidth
of the network used for the communication is normally fixed. As
the contents of the video stream vary over time, the amount of encoded
data produced each second is not fixed. To ensure that
the encoded data per second does not exceed the available bandwidth,
and to keep the encoded data at a constant bit rate for
convenience, bit rate control is added to the quantisation
stage of the encoding process. If the output bit rate of the
encoder is too high, the compression ratio is increased by
increasing the quantiser scale factor, which gives
a worse quality image at the decoder. On the other hand, if the
bit rate is too low, the compression ratio is decreased by
decreasing the quantiser scale factor. The quality of the
image then increases, as does its bit rate.
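A very simple feedback rule of this kind is sketched below. The step size of 1, the 10% tolerance band, the 1 to 31 range for the scale factor and the function name are illustrative assumptions; practical rate controllers are considerably more elaborate.

```python
def adjust_quantiser(scale, produced_bits, target_bits,
                     min_scale=1, max_scale=31, tolerance=0.1):
    """Return the quantiser scale factor to use for the next frame.

    If the last frame used too many bits, quantise more coarsely;
    if it used too few, quantise more finely; otherwise keep the scale.
    """
    if produced_bits > target_bits * (1 + tolerance):
        scale += 1   # higher compression, lower quality
    elif produced_bits < target_bits * (1 - tolerance):
        scale -= 1   # lower compression, higher quality
    # Keep the scale factor inside its permitted range.
    return max(min_scale, min(max_scale, scale))
```

Called once per encoded frame with the number of bits just produced, the rule nudges the output towards the target bit rate while trading image quality against bandwidth.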
Secondly, to guarantee
the synchronisation of the video stream, the encoded stream contains many headers
or markers that record the position of
the current data in the frame and the time code of the frame.
The decoder checks each header to make sure that the video
stream is in the proper order and makes adjustments if necessary.
Audio multiplexing:
H.263 is only a video coding description. For the audio codec,
please see the next section on G.723 audio. For synchronisation,
multiplexing and protocol issues with the audio stream, refer to the "umbrella"
standards such as H.320 (ISDN-based videoconferencing), H.324
(POTS-based video telephony) and H.323 (LAN or IP-based videoconferencing).
There are five
picture formats supported by H.263: SQCIF (sub-QCIF)
at 128 * 96 pixels, QCIF at 176 * 144 pixels, CIF, 4CIF and 16CIF.
SQCIF has approximately half the resolution of QCIF, while 4CIF
and 16CIF have 4 and 16 times the resolution of CIF respectively.
The larger formats allow H.263 to compete with other higher bit-rate video
coding standards such as the MPEG standard. The table below shows
the picture formats supported by H.263 [18].
Table 2-2 Picture Formats Supported [18]
(The last four columns give the uncompressed bitrate in Mbit/s.)

| Picture format | Luminance pixels | Luminance lines | H.261 support | H.263 support | 10 frames/s, grey | 10 frames/s, colour | 30 frames/s, grey | 30 frames/s, colour |
|----------------|------------------|-----------------|---------------|---------------|-------------------|---------------------|-------------------|---------------------|
| SQCIF          | 128              | 96              |               | Yes           | 1.0               | 1.5                 | 3.0               | 4.4                 |
| QCIF           | 176              | 144             | Yes           | Yes           | 2.0               | 3.0                 | 6.1               | 9.1                 |
| CIF            | 352              | 288             | Optional      | Optional      | 8.1               | 12.2                | 24.3              | 36.5                |
| 4CIF           | 704              | 576             |               | Optional      | 32.4              | 48.7                | 97.3              | 146.0               |
| 16CIF          | 1408             | 1152            |               | Optional      | 129.8             | 194.6               | 389.3             | 583.9               |
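The uncompressed bitrates in the table appear consistent with 8 bits per luminance sample, with colour adding two chrominance planes at a quarter of the luminance resolution each (4:2:0 sampling), so that colour costs 1.5 times the grey figure. Those sampling assumptions are not stated in the source, and the function name is hypothetical; under them, the sketch below approximately reproduces the table entries.

```python
def uncompressed_mbit_per_s(width, height, fps, colour=False, bits_per_sample=8):
    """Uncompressed bitrate in Mbit/s (1 Mbit = 10**6 bits).

    Assumes 8-bit samples and, for colour, 4:2:0 chrominance subsampling,
    i.e. 1.5 samples per pixel in total."""
    samples = width * height
    if colour:
        samples = samples * 3 // 2
    return samples * bits_per_sample * fps / 1e6
```

For example, `uncompressed_mbit_per_s(176, 144, 30, colour=True)` gives about 9.1 Mbit/s for QCIF, which shows why compression is essential on links of a few tens of kbps.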
With the current
level of development of computer hardware and network technology,
all aspects of H.263 described above can be implemented in software
on Pentium-family computers to achieve
a "reasonable" video quality in real-time communication.