Never Suffer From WHAT IS CODECS? Again
Coder-Decoder has a very particular job
Now when we think about a codec, it is software and hardware that works toward this encoding of the audio or the video signal, and the codec can be located in the phone. It can also be located in the PBX or the switch itself. Generally speaking, when we talk about codecs, there are two flavors: audio codecs and video codecs. The audio codecs, or at least the most popular ones, are covered by the ITU-T G series, and so I’ve listed some of those here. There are lots and lots of techniques for encoding audio. We’re going to talk about one called the waveform technique, and G.711. But more on that a little bit later.
Encoding of video
There have been a couple of efforts in the encoding of video, and these are covered by the ITU-T H series and by the MPEG standards, with the H series going H.261, H.263, and H.264, and on the MPEG side we’ve seen a series of versions up through MPEG-4. I think for H.264 the two standards efforts sort of came together a little bit.
Whether you’re doing audio or video, we start off with a process called pulse code modulation and then we modify it a little bit. So it doesn’t really matter what type of technique you’re applying to encoding the audio; you’re going to start off with something that looks a lot like pulse code modulation. And so that’s the one that we always talk about to explain how we encode and then decode audio signals, and this is standardised in ITU-T G.711.
Sample of the voice signal
There are a couple of parts to the process. First, we have to sample the voice signal, and then we have to quantize it, or apply values to the samples based on their amplitude. Now a voice channel in a telco system is based on the frequencies that can be generated by a human. I talk a little about these in the chapter in the book. So when we talk about human communication, there are two sets of frequencies that we think about.
One set is the frequencies that we can hear, and we can hear up to about twenty thousand hertz, maybe a little higher, maybe a little lower, depending on how good your hearing is. And then there are all the sounds that you can generate, and those are a subset of the ones that you can hear. The frequencies that we can generate are about zero to four thousand hertz. The channels for communication don’t have to be based on what we can hear; they’re based on what we can say.
You take this voice signal that you’re speaking and you now have to sample it. So the question is how do we take samples. We’ll
Then you’ve got a lot of extra data that you’re never going to be able to use anyway. The right amount of sampling comes from Nyquist and Claude Shannon, and it turns out that the right value is twice the bandwidth of the incoming signal. So if the channel is four thousand hertz, we sample eight thousand times per second. Now here is a sample audio snippet. It turns out that this is just me saying the word hello three times in a row. But if I take a portion of one of these words and expand it, that’s what the actual audio signal looks like, there on the bottom. So now what we have to do is, so many times a second, figure out what each one of those values is going to be.
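The sampling rule described here can be sketched in a few lines of Python. This is only an illustration: the function names `nyquist_rate` and `sample_tone` are made up for this example, and the 1 kHz test tone stands in for a real voice signal.

```python
import math

def nyquist_rate(bandwidth_hz: float) -> float:
    """Sampling rate from the rule in the talk: twice the signal bandwidth."""
    return 2 * bandwidth_hz

def sample_tone(freq_hz: float, rate_hz: float, duration_s: float) -> list[float]:
    """Measure a sine tone at evenly spaced instants, one value per sample."""
    count = round(rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

rate = nyquist_rate(4000)                # 4,000 Hz voice channel
samples = sample_tone(1000, rate, 0.01)  # 10 ms of a 1 kHz test tone (assumed input)
print(rate, len(samples))
```

At eight thousand samples per second, ten milliseconds of audio is only eighty measurements, which is why the expanded portion of the waveform looks like a smooth curve with a sample every 125 microseconds.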
So you can imagine a bunch of samples all trying to measure what the analog signal is doing at that particular point in time. Here is a grid that I’m going to overlay on top of this audio signal, and the red arrows indicate some of the sample values. All the vertical lines are where I’m going to take sample values, and these red arrows happen to point out problem areas. So the vertical lines are where we’re going to take our samples, and we do this a whole bunch of times a second. The horizontal lines are the actual values that we assign to the samples.
So here is the problem. If we have a certain number of horizontal lines and the values don’t fall right on one of those horizontal lines, we have to figure out which one of the horizontal lines we want to use to represent the sample. So any value that doesn’t fall right on the line, like those indicated by the red arrows here, is going to result in what we call quantization errors. So we have a couple of questions when we’re trying to figure out what to do with this analog signal: how often do we sample it, and how many bits do we use for each sample?
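To make the quantization error concrete, here is a minimal sketch that snaps a few sample values to the nearest horizontal line. The sample amplitudes and the choice of nine lines are invented for illustration, and `quantize` is a hypothetical helper, not part of any codec library.

```python
def quantize(value: float, levels: int) -> float:
    """Snap a value in [-1.0, 1.0] to the nearest of `levels` evenly spaced lines."""
    step = 2.0 / (levels - 1)                # distance between horizontal lines
    index = round((value + 1.0) / step)      # nearest line, counted from the bottom
    index = max(0, min(levels - 1, index))   # clamp values outside the grid
    return -1.0 + index * step

samples = [0.10, 0.37, -0.52, 0.98]          # made-up sample amplitudes
for s in samples:
    q = quantize(s, levels=9)                # 9 lines, spaced 0.25 apart
    print(f"sample {s:+.2f} -> line {q:+.2f}, error {s - q:+.2f}")
```

With evenly spaced lines the error is never more than half the spacing between them, so adding more lines (more bits per sample) shrinks the error.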
To accurately represent the magnitude of each sample, it turns out that with human speech we want to use about 8 bits per sample. This gives us 256 possible values that we could assign to an individual sample. So 8 bits per sample, eight thousand samples per second: that’s where the sixty-four thousand bits per second data rate comes from. Now that we have a basis for understanding how we do our encoding of the audio, we have to sit back and think about how this really works. All right. So you’re sitting there, you’re talking on the phone, and your analog voice is sampled and then quantized, and then the digital data is sent toward the other end. And at the other end, the digital data is used to recreate the analog signal.
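The arithmetic and the encode/decode round trip can be sketched as follows. Note the assumption: this uses evenly spaced (linear) levels to keep the example short, whereas G.711 itself uses logarithmic companding (mu-law or A-law), so treat `encode` and `decode` as illustrative helpers rather than a G.711 implementation.

```python
SAMPLE_RATE_HZ = 8000     # samples per second, from the sampling step
BITS_PER_SAMPLE = 8       # bits used for each sample

levels = 2 ** BITS_PER_SAMPLE                  # 256 possible values
bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE    # 64,000 bits per second

def encode(value: float) -> int:
    """Map an amplitude in [-1.0, 1.0] to an 8-bit code (linear, simplified)."""
    code = round((value + 1.0) / 2.0 * (levels - 1))
    return max(0, min(levels - 1, code))

def decode(code: int) -> float:
    """Turn an 8-bit code back into an approximate amplitude."""
    return code / (levels - 1) * 2.0 - 1.0

code = encode(0.37)        # the sending side: analog value to digital code
restored = decode(code)    # the receiving side: digital code back to a value
print(levels, bit_rate, code, round(restored, 3))
```

The restored value is close to the original but not identical; the small difference is the quantization error carried through to the far end.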
Now I mentioned that there are a lot of different codecs that we could use for this, and G.711 represents this thing called pulse code modulation. But pulse code modulation takes that sixty-four thousand bits per second; most other codecs take a lower data rate than that. So which one do you use? When you’re making a codec selection, you have a couple of questions that you have to ask yourself. What are my performance expectations? It turns out that it’s very tough to beat G.711 in terms of clarity and performance. But as I mentioned, it does take sixty-four thousand bits per second of bandwidth. So if you’re somewhat limited, and sometimes connections are, we pick a codec that takes less bandwidth, G.729 for example. Now one of the other things that you have to consider is the collection of codecs that might be in the system.
So what is the source codec and what is the terminating codec? Some codecs work better with others. So if I encode with G.711 and you try to decode with G.729, what does that mean? Sometimes there are losses between codecs, and we call those transcoding errors. Another problem for some codecs is packet loss. Packet loss makes it very, very tough to recreate the analog signal at the other end. So packet loss concealment is a technique that a codec uses to determine, or guess, what the data was going to be if the packet ended up lost. But packet loss concealment actually takes a certain amount of time to accomplish, so you’re trying to fix up a packet loss problem but you’re creating latency in the codec. It’s sort of a trade-off to decide what you want to do.
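As a rough illustration of the idea, here is one very simple concealment strategy: when a frame is missing, repeat the last good one. This is an assumption chosen for brevity, not how any particular codec does it; real codecs use more elaborate estimates, and `conceal` is a hypothetical helper.

```python
from typing import Optional

Frame = list[int]   # one packet's worth of sample codes

def conceal(frames: list[Optional[Frame]], frame_size: int) -> list[Frame]:
    """Fill lost frames (None) by repeating the last good frame, or silence."""
    output: list[Frame] = []
    last_good: Frame = [0] * frame_size    # nothing received yet: use silence
    for frame in frames:
        if frame is None:
            output.append(list(last_good))  # guess: the sound did not change
        else:
            output.append(frame)
            last_good = frame
    return output

received = [[1, 2], [3, 4], None, [7, 8]]   # the third packet was lost
print(conceal(received, frame_size=2))
```

Repeating the previous frame needs no waiting. A better guess, such as blending the frames on either side of the gap, means holding audio back until the next packet arrives, which is one way concealment ends up adding latency.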
Most codecs also include some form of compression, and so we have to ask ourselves how much compression we actually need. The last thing we’ll talk about today is video. Usually in video we’re sending a series of pictures, and when we start talking about lots and lots of resolution and lots and lots of color depth, we start to get into big bandwidth. Video is also analog data, and so we start with approaches similar to PCM, but then we have variations on a theme.
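To get a feel for why video means big bandwidth, here is a back-of-the-envelope calculation for uncompressed pictures. The resolution, color depth, and frame rate below are assumed values picked for illustration, and `raw_video_bit_rate` is a made-up helper.

```python
def raw_video_bit_rate(width: int, height: int, bits_per_pixel: int, fps: int) -> int:
    """Bits per second for a series of uncompressed pictures."""
    return width * height * bits_per_pixel * fps

# Assumed example: 640x480 pictures, 24-bit color, 30 pictures per second.
rate = raw_video_bit_rate(640, 480, 24, 30)
print(f"{rate / 1_000_000:.1f} Mbit/s uncompressed, versus 0.064 Mbit/s for G.711 audio")
```

Even this modest picture size comes out thousands of times larger than a voice channel, which is why video codecs lean so heavily on compression.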