|2 channels of LPCM audio, each signed 16-bit values sampled at 44100 Hz|
|up to 74–80 minutes (up to 24 minutes for mini 8 cm CD)|
|Semiconductor laser (780 nm wavelength)|
|Sony & Philips|
Compact Disc Digital Audio (CDDA or CD-DA) is the standard format for audio compact discs. The standard is defined in the Red Book, one of a series of "Rainbow Books" (named for their binding colors) that contain the technical specifications for all Compact Disc formats.
The Red Book specifies the physical parameters and properties of the CD, the optical "stylus" parameters, deviations and error rate, modulation system (eight-to-fourteen modulation, EFM) and error correction facility (cross-interleaved Reed–Solomon coding, CIRC), and the eight subcode channels. These parameters are common to all compact discs and used by all logical formats, such as CD-ROM. The standard also specifies the form of digital audio encoding: 2-channel signed 16-bit Linear PCM sampled at 44,100 Hz. Although rarely used, the specification allows for discs to be mastered with a form of emphasis.
The first edition of the Red Book was released in 1980 by Philips and Sony; it was adopted by the Digital Audio Disc Committee and ratified by the International Electrotechnical Commission Technical Committee 100 as an International Standard in 1987, with the reference IEC 60908. The second edition of IEC 60908 was published in 1999; it cancels and replaces the first edition, amendment 1 (1992) and the corrigendum to amendment 1. IEC 60908, however, does not contain all the information on extensions that is available in the Red Book, such as the details of CD-Text, CD+G and CD+EG.
The standard is not freely available and must be licensed. It is available from Philips and the IEC. As of 2013, Philips outsources licensing of the standard to Adminius, which charges US$100 for the Red Book, plus US$50 each for the Subcode Channels R-W and CD Text Mode annexes.
The sampling rate is adapted from that attained when recording digital audio on a PAL (or NTSC) videotape with a PCM adaptor, an earlier way of storing digital audio. An audio CD can represent frequencies up to 22.05 kHz, the Nyquist frequency of the 44.1 kHz sample rate.
The selection of the sample rate was based primarily on the need to reproduce the audible frequency range of 20–20,000 Hz (20 kHz). The Nyquist–Shannon sampling theorem states that a sampling rate of more than twice the maximum frequency of the signal to be recorded is needed, resulting in a required rate of at least 40 kHz. The exact sampling rate of 44.1 kHz was inherited from a method of converting digital audio into an analog video signal for storage on U-matic video tape, which was the most affordable way to transfer data from the recording studio to the CD manufacturer at the time the CD specification was being developed. The device that converts an analog audio signal into PCM audio, which in turn is changed into an analog video signal, is called a PCM adaptor. This technology could store six samples (three samples per stereo channel) in a single horizontal line. 60 field/s black-and-white video (not 59.94 field/s color) was required, and in NTSC countries (USA/Japan) that video signal has 245 usable lines per field, which works out to 245 × 60 × 3 = 44,100 samples/s per stereo channel. Similarly, PAL has 294 usable lines per field and 50 fields/s, which gives 294 × 50 × 3 = 44,100 samples/s per stereo channel. This system could store 14-bit samples with some error correction, or 16-bit samples with almost no error correction.
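The line-count arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation; the function name `samples_per_second` is illustrative and does not come from any standard.

```python
def samples_per_second(usable_lines_per_field: int,
                       fields_per_second: int,
                       samples_per_line: int = 3) -> int:
    """Samples per second, per stereo channel, stored by a PCM adaptor."""
    return usable_lines_per_field * fields_per_second * samples_per_line


# Figures taken from the text: 245 lines at 60 field/s, 294 lines at 50 field/s.
ntsc_rate = samples_per_second(245, 60)
pal_rate = samples_per_second(294, 50)

# Highest frequency representable at this rate (the Nyquist frequency).
nyquist_hz = ntsc_rate / 2

print(ntsc_rate, pal_rate, nyquist_hz)
```

Both video systems land on the same rate, which is why a single figure could serve studios in either region.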
There was a long debate over the use of 14-bit (Philips) or 16-bit (Sony) quantization, and 44,056 or 44,100 samples/s (Sony) or approximately 44,000 samples/s (Philips). When the Sony/Philips task force designed the Compact Disc, Philips had already developed a 14-bit D/A converter (DAC), but Sony insisted on 16-bit. In the end, 16 bits and 44.1 kilosamples per second prevailed. Philips found a way to produce 16-bit quality using its 14-bit DAC by using four times oversampling.
Some CDs are mastered with pre-emphasis, an artificial boost of high audio frequencies. The pre-emphasis improves the apparent signal-to-noise ratio by making better use of the channel's dynamic range. On playback, the player applies a de-emphasis filter to restore the frequency response curve to an overall flat one. Pre-emphasis time constants are 50 µs and 15 µs (9.49 dB boost at 20 kHz), and a binary flag in the disc subcode instructs the player to apply de-emphasis filtering if appropriate. Playback of such discs in a computer, or "ripping" them to WAV files, typically does not take the pre-emphasis into account, so such files play back with a distorted frequency response.
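The quoted boost can be reproduced by modelling the pre-emphasis as a first-order shelving filter with a zero at the 50 µs time constant and a pole at the 15 µs time constant. That transfer function is an assumption made for this sketch (the text gives only the time constants and the resulting boost), and the function names are illustrative.

```python
import math

T_ZERO = 50e-6  # seconds, first time constant from the text
T_POLE = 15e-6  # seconds, second time constant from the text


def pre_emphasis_gain_db(freq_hz: float) -> float:
    """Gain in dB of the assumed response H = (1 + jwT_ZERO) / (1 + jwT_POLE)."""
    w = 2 * math.pi * freq_hz
    h = complex(1, w * T_ZERO) / complex(1, w * T_POLE)
    return 20 * math.log10(abs(h))


def de_emphasis_gain_db(freq_hz: float) -> float:
    """The player's de-emphasis is the inverse curve, so the sum is flat."""
    return -pre_emphasis_gain_db(freq_hz)


print(round(pre_emphasis_gain_db(20_000), 2))
```

Under this model the gain is 0 dB at DC and rises to roughly 9.5 dB at 20 kHz, which is the error a ripped file carries if the de-emphasis flag is ignored.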
Storage capacity and playing time
The creators of the CD originally aimed at a playing time of 60 minutes with a disc diameter of 100 mm (Sony) or 115 mm (Philips). Sony vice-president Norio Ohga suggested extending the capacity to 74 minutes to accommodate the recording of Wilhelm Furtwängler conducting Ludwig van Beethoven's Ninth Symphony at the 1951 Bayreuth Festival. The additional 14-minute playing time subsequently required changing to a 120 mm disc. Kees Schouhamer Immink, Philips' chief engineer, however, denies this, claiming that the increase was motivated by technical considerations, and that even after the increase in size, the Furtwängler recording would not have fit on one of the earliest CDs.
According to a Sunday Tribune interview, the story is slightly more involved. In 1979, Philips owned PolyGram, one of the world's largest distributors of music. PolyGram had set up a large experimental CD plant in Hannover, Germany, which could produce huge numbers of CDs having a diameter of 115 mm. Sony did not yet have such a facility. If Sony had agreed on the 115-mm disc, Philips would have had a significant competitive edge in the market. The long playing time of Beethoven's Ninth Symphony imposed by Ohga was used to push Philips to accept 120 mm, so that Philips' PolyGram lost its edge on disc fabrication.
The 74-minute playing time of a CD, which is longer than the 22 minutes per side typical of long-playing (LP) vinyl albums, was often used to the CD's advantage during the early years when CDs and LPs vied for commercial sales. CDs would often be released with one or more bonus tracks, enticing consumers to buy the CD for the extra material. However, attempts to combine double LPs onto one CD occasionally resulted in the opposite situation in which the CD would instead offer fewer tracks than the LP.
Playing times beyond 74 minutes are achieved by decreasing track pitch (the distance between adjacent turns of the spiral track), in violation of the strict Red Book standard. However, most players can still accommodate the more closely spaced data if it is still within Red Book tolerances. Current manufacturing processes allow an audio CD to contain up to 80 minutes (variable from one replication plant to another) without requiring the content creator to sign a waiver releasing the plant owner from responsibility if the CD produced is marginally or entirely unreadable by some playback equipment. In current practice, maximum CD playing time has crept higher by reducing minimum engineering tolerances.
This table shows the progression in the maximum duration of released audio CDs:
|Title||Artist||Label||Year||Duration (min:s)|
|Mission of Burma (compilation)||Mission of Burma||Rykodisc||1988||80:08|
|Proclamation (bass trombone recital)||Douglas Yeo with Black Dyke Band||Doyen DOY CD 055||1996||80:17|
|Tchaikovsky's The Nutcracker||Kirov Orchestra cond. Valery Gergiev||Philips/Polygram 462 114||1998||81:14|
|Bruckner's Fifth (live)||Munich Philharmonic cond. Christian Thielemann||Deutsche Grammophon||2004||82:34|
|Chopin & Schumann Etudes||Valentina Lisitsa||Decca||2014||85:16|
|Mozart violin concertos||Various artists (Mozart 225 box set)||Decca / Deutsche Grammophon||2016||86:30|
Each audio sample is a signed 16-bit two's complement integer, with sample values ranging from −32768 to +32767. The source audio data is divided into frames, containing twelve samples each (six left and right samples, alternating), for a total of 192 bits (24 bytes) of audio data per frame.
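As a rough illustration of the frame layout just described, the sketch below packs six left/right sample pairs into one 24-byte block using Python's `struct` module. The little-endian byte order and the helper name `pack_audio_frame` are assumptions made for the example; the text does not state the byte order.

```python
import struct

PAIRS_PER_FRAME = 6  # six left/right pairs = twelve 16-bit samples


def pack_audio_frame(pairs):
    """Pack six (left, right) tuples of signed 16-bit ints into 24 bytes.

    Byte order is assumed little-endian for this sketch.
    struct.pack raises struct.error if a value is outside -32768..32767.
    """
    if len(pairs) != PAIRS_PER_FRAME:
        raise ValueError("a frame holds exactly six stereo pairs")
    flat = [sample for pair in pairs for sample in pair]  # L, R, L, R, ...
    return struct.pack("<12h", *flat)


frame = pack_audio_frame([(-32768, 32767)] * 6)
print(len(frame), len(frame) * 8)  # bytes and bits per frame
```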
This stream of audio frames, as a whole, is then subjected to CIRC encoding, which segments and rearranges the data and expands it with parity bits in a way that allows occasional read errors to be detected and corrected. CIRC encoding also interleaves the audio frames throughout the disc over several consecutive frames so that the information will be more resistant to burst errors. Therefore, a physical frame on the disc will actually contain information from multiple logical audio frames. This process adds 64 bits of error correction data to each frame. After this, 8 bits of subcode or subchannel data are added to each of these encoded frames, which is used for control and addressing when playing the CD.
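Why interleaving helps against burst errors can be seen with a toy block interleaver. This is not the CIRC scheme itself, whose interleaving is more elaborate and is not reproduced here; it is only a minimal sketch, with illustrative names and sizes, of how spreading symbols out turns one long burst into isolated single errors.

```python
def interleave(symbols, rows, cols):
    """Write the symbols row by row, read them out column by column."""
    if len(symbols) != rows * cols:
        raise ValueError("length must equal rows * cols")
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(stream, rows, cols):
    """Inverse of interleave()."""
    if len(stream) != rows * cols:
        raise ValueError("length must equal rows * cols")
    out = [None] * (rows * cols)
    for i, symbol in enumerate(stream):
        c, r = divmod(i, rows)
        out[r * cols + c] = symbol
    return out


ROWS, COLS = 4, 6                      # four hypothetical codewords of six symbols
original = list(range(ROWS * COLS))
sent = interleave(original, ROWS, COLS)

damaged = list(sent)
for i in range(8, 12):                 # a burst wipes out four consecutive symbols
    damaged[i] = None

received = deinterleave(damaged, ROWS, COLS)
errors_per_row = [received[r * COLS:(r + 1) * COLS].count(None) for r in range(ROWS)]
print(errors_per_row)
```

After de-interleaving, each row has lost only one symbol, which is a far easier job for a per-row error-correcting code than four losses in a single row.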
CIRC encoding plus the subcode byte produce 33-byte frames, called "channel-data" frames. These frames are then modulated through eight-to-fourteen modulation (EFM), where each 8-bit word is replaced with a corresponding 14-bit word designed to reduce the number of transitions between 0 and 1. This reduces the density of physical pits on the disc and provides an additional degree of error tolerance. Three "merging" bits are added before each 14-bit word for disambiguation and synchronization. In total there are 33 × (14 + 3) = 561 bits. A 27-bit word (a 24-bit pattern plus 3 merging bits) is added to the beginning of each frame to assist with synchronization, so the reading device can locate frames easily. With this, a frame ends up containing 588 bits of "channel data" (which are decoded to only 192 bits of music).
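The frame arithmetic above can be tallied step by step. The constant names below are illustrative; the numbers are those given in the text, and the final channel bit rate simply combines them with the 98 frames per sector and 75 sectors per second stated elsewhere in the article.

```python
AUDIO_BYTES = 24        # twelve 16-bit samples
PARITY_BYTES = 8        # 64 bits of CIRC error correction
SUBCODE_BYTES = 1
EFM_BITS_PER_BYTE = 14
MERGING_BITS = 3
SYNC_BITS = 24 + MERGING_BITS

frame_bytes = AUDIO_BYTES + PARITY_BYTES + SUBCODE_BYTES
modulated_bits = frame_bytes * (EFM_BITS_PER_BYTE + MERGING_BITS)
channel_bits_per_frame = modulated_bits + SYNC_BITS

# Share of the channel bits that ends up as audio after decoding.
audio_fraction = (AUDIO_BYTES * 8) / channel_bits_per_frame

# 98 channel-data frames per sector, 75 sectors per second.
channel_bits_per_second = channel_bits_per_frame * 98 * 75

print(frame_bytes, modulated_bits, channel_bits_per_frame)
print(f"{audio_fraction:.1%} of channel bits carry audio")
print(channel_bits_per_second)
```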
The frames of channel data are finally written to disc physically in the form of pits and lands, with each pit or land representing a series of zeroes, and with the transition points—the edge of each pit—representing 1. A Red Book-compatible CD-R has pit-and-land-shaped spots on a layer of organic dye instead of actual pits and lands; a laser creates the spots by altering the reflective properties of the dye.
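The rule that an edge means 1 and a flat stretch means 0 can be sketched as follows. The function names are illustrative, and which level counts as "pit" and which as "land" is an arbitrary choice here: only the transitions carry information.

```python
def channel_bits_to_surface(bits, start_level=0):
    """Map channel bits to a surface level per bit cell: a 1 toggles the level."""
    level = start_level
    levels = []
    for bit in bits:
        if bit == 1:
            level ^= 1          # an edge between pit and land
        levels.append(level)
    return levels


def surface_to_channel_bits(levels, start_level=0):
    """Recover channel bits by detecting level changes between cells."""
    previous = start_level
    bits = []
    for level in levels:
        bits.append(1 if level != previous else 0)
        previous = level
    return bits


bits = [1, 0, 0, 1, 0, 0, 0, 1]
levels = channel_bits_to_surface(bits)
print(levels)
```

Because the reader looks only for edges, decoding works the same way whichever level it starts on, provided encoder and decoder agree on the starting level.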
The audio data stream in an audio CD is continuous, but has three parts. The main portion, which is further divided into playable audio tracks, is the program area. This section is preceded by a lead-in track and followed by a lead-out track. The lead-in and lead-out tracks encode only silent audio, but all three sections contain subcode data streams.
The lead-in's subcode contains repeated copies of the disc's Table Of Contents (TOC), which provides an index of the start positions of the tracks in the program area and lead-out. The track positions are referenced by absolute timecode, relative to the start of the program area, in MSF format: minutes, seconds, and fractional seconds called frames. Each timecode frame is one seventy-fifth of a second, and corresponds to a block of 98 channel-data frames—ultimately, a block of 588 pairs of left and right audio samples. Timecode contained in the subchannel data allows the reading device to locate the region of the disc that corresponds to the timecode in the TOC. The TOC on discs is analogous to the partition table on hard drives. Nonstandard or corrupted TOC records are abused as a form of CD/DVD copy protection, in e.g. the key2Audio scheme.
The largest entity on a CD is called a track. A CD can contain up to 99 tracks (including a data track for mixed mode discs). Each track can in turn have up to 100 indexes, though players which handle this feature are rarely found outside of pro audio, particularly radio broadcasting. The vast majority of songs are recorded under index 1, with the pre-gap being index 0. Sometimes hidden tracks are placed at the end of the last track of the disc, often using index 2 or 3. This is also the case with some discs offering "101 sound effects", with 100 and 101 being indexed as two and three on track 99. The index, if used, is occasionally put on the track listing as a decimal part of the track number, such as 99.2 or 99.3. (Information Society's Hack was one of very few CD releases to do this, following a release with an equally obscure CD+G feature.) The track and index structure of the CD were carried forward to the DVD format as title and chapter, respectively.
Tracks, in turn, are divided into timecode frames (or sectors), which are further subdivided into channel-data frames.
Frames and timecode frames
The smallest entity in a CD is a channel-data frame, which consists of 33 bytes and contains six complete 16-bit stereo samples: 24 bytes for the audio (two bytes × two channels × six samples = 24 bytes), eight CIRC error-correction bytes, and one subcode byte. As described in the "Data encoding" section, after the EFM modulation the number of bits in a frame totals 588.
On a Red Book audio CD, data is addressed using the MSF scheme, with timecodes expressed in minutes, seconds and another type of frames (mm:ss:ff), where one frame corresponds to 1/75th of a second of audio: 588 pairs of left and right samples. This timecode frame is distinct from the 33-byte channel-data frame described above, and is used for time display and positioning the reading laser. When editing and extracting CD audio, this timecode frame is the smallest addressable time interval for an audio CD; thus, track boundaries only occur on these frame boundaries. Each of these structures contains 98 channel-data frames, totaling 98 × 24 = 2,352 bytes of music. The CD is played at a speed of 75 frames (or sectors) per second, thus 44,100 samples or 176,400 bytes per second.
In the 1990s, CD-ROM and related Digital Audio Extraction (DAE) technology introduced the term sector to refer to each timecode frame, with each sector being identified by a sequential integer number starting at zero, and with tracks aligned on sector boundaries. An audio CD sector corresponds to 2,352 bytes of decoded data. The Red Book does not refer to sectors, nor does it distinguish the corresponding sections of the disc's data stream except as "frames" in the MSF addressing scheme.
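Converting between an MSF timecode and a running count of timecode frames is simple base-60/base-75 arithmetic. The sketch below uses illustrative function names and counts frames from 00:00:00; real drive interfaces may apply a fixed offset between MSF addresses and logical sector numbers, which is ignored here.

```python
FRAMES_PER_SECOND = 75
SECONDS_PER_MINUTE = 60


def msf_to_frames(minutes: int, seconds: int, frames: int) -> int:
    """Number of timecode frames (sectors) from 00:00:00 to mm:ss:ff."""
    if not (0 <= seconds < SECONDS_PER_MINUTE and 0 <= frames < FRAMES_PER_SECOND):
        raise ValueError("seconds must be 0-59 and frames 0-74")
    return (minutes * SECONDS_PER_MINUTE + seconds) * FRAMES_PER_SECOND + frames


def frames_to_msf(count: int):
    """Inverse of msf_to_frames()."""
    minutes, rest = divmod(count, SECONDS_PER_MINUTE * FRAMES_PER_SECOND)
    seconds, frames = divmod(rest, FRAMES_PER_SECOND)
    return minutes, seconds, frames


total = msf_to_frames(74, 0, 0)
print(total, total * 2352)  # sectors and bytes of audio in 74 minutes
```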
The following table shows the relation between tracks, timecode frames (sectors) and channel-data frames:
|Timecode frame or sector 1 (2,352 bytes of data)||Timecode frame or sector 2 (2,352 bytes of data)||...|
|Channel-data frame 1 (24 bytes of data)||...||Channel-data frame 98 (24 bytes of data)||...||...|
The audio bit rate for a Red Book audio CD is 1,411,200 bits per second, or 176,400 bytes per second: 2 channels × 44,100 samples per second per channel × 16 bits per sample. Audio data coming in from a CD is contained in sectors, each sector being 2,352 bytes, with 75 sectors containing 1 second of audio. For comparison, the data rate of a "1×" CD-ROM is defined as 2,048 bytes per sector × 75 sectors per second = 153,600 bytes per second. The remaining 304 bytes in each CD-ROM sector are used for synchronization, header information, and additional error detection and correction.
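These rates, and the amount of audio data a disc of a given length holds, follow directly from the figures above. The sketch below is a plain calculation with illustrative names.

```python
CHANNELS = 2
SAMPLE_RATE = 44_100        # samples per second per channel
BITS_PER_SAMPLE = 16
SECTOR_BYTES = 2_352
SECTORS_PER_SECOND = 75
CDROM_USER_BYTES = 2_048    # user data per CD-ROM sector

audio_bits_per_second = CHANNELS * SAMPLE_RATE * BITS_PER_SAMPLE
audio_bytes_per_second = audio_bits_per_second // 8
cdrom_bytes_per_second = CDROM_USER_BYTES * SECTORS_PER_SECOND


def audio_bytes(minutes: int) -> int:
    """Bytes of decoded LPCM audio in the given playing time."""
    return minutes * 60 * audio_bytes_per_second


print(audio_bits_per_second, audio_bytes_per_second, cdrom_bytes_per_second)
print(audio_bytes(74), audio_bytes(80))
```

The sector view gives the same answer: 2,352 bytes × 75 sectors per second is also 176,400 bytes per second.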
Data access from computers
Unlike on a DVD or CD-ROM, there are no "files" on a Red Book audio CD; there is only one continuous stream of LPCM audio data, and a parallel, smaller set of 8 subcode data streams. Computer operating systems, however, may provide access to an audio CD as if it contains files. For example, Windows represents the CD's Table of Contents as a set of Compact Disc Audio track (CDA) files, each file containing indexing information, not audio data.
In a process called ripping, digital audio extraction software can be used to read CD-DA audio data and store it in files. Common audio file formats for this purpose include WAV and AIFF, which simply preface the LPCM data with a short header; FLAC, ALAC, and Windows Media Audio Lossless, which compress the LPCM data in ways that conserve space yet allow it to be restored without any changes; and various lossy, perceptual coding formats like MP3 and AAC, which modify and compress the audio data in ways that irreversibly change the audio, but that exploit features of human hearing to make the changes difficult to discern.
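To illustrate how little a WAV file adds to the raw audio, the sketch below wraps a block of LPCM bytes in a WAV container using Python's standard `wave` module. It assumes the extracted samples are already little-endian 16-bit stereo, as WAV expects, and does not perform any extraction itself; `lpcm_to_wav_bytes` is an illustrative helper name.

```python
import io
import wave


def lpcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw 16-bit stereo 44.1 kHz LPCM in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)       # stereo
        wav.setsampwidth(2)       # 16 bits = 2 bytes per sample
        wav.setframerate(44_100)  # sample frames per second
        wav.writeframes(pcm)
    return buffer.getvalue()


one_sector = bytes(2_352)         # one sector of digital silence
wav_data = lpcm_to_wav_bytes(one_sector)
header_size = len(wav_data) - len(one_sector)
print(header_size, "header bytes for", len(one_sector), "audio bytes")
```

Lossless formats such as FLAC go further by compressing the same samples, but decoding them yields these bytes again unchanged.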
Recording publishers have created CDs that violate the Red Book standard. Some do so for the purpose of copy prevention, using systems like Copy Control. Some do so for extra features such as DualDisc, which includes both a CD layer and a DVD layer whereby the CD layer is much thinner, 0.9 mm, than required by the Red Book, which stipulates a nominal 1.2 mm, but at least 1.1 mm. Philips and many other companies have stated that including the Compact Disc Digital Audio logo on such non-conforming discs may constitute trademark infringement. Either in anticipation or in response, recent copy-protected CDs bear stickers and warnings that the CD is not standard and may not play in all CD players, and no longer display the long-familiar logo.
Super Audio CD was a standard published in 1999 that aimed to provide better audio quality than CDs, but it never became very popular. DVD Audio, an advanced version of the audio CD, emerged in 1999. The format was designed to feature audio of higher fidelity. It uses a higher sampling rate and is read with 650 nm lasers.
There have been moves by the recording industry to make audio CDs (Compact Disc Digital Audio) unplayable on computer CD-ROM drives, to prevent the copying of music. This is done by intentionally introducing errors onto the disc that the embedded circuits on most stand-alone audio players can automatically compensate for, but which may confuse CD-ROM drives. As of October 2001, consumer rights advocates were pushing to require warning labels on compact discs that do not conform to the official Compact Disc Digital Audio standard (often called the Red Book), to inform consumers which discs do not permit full fair use of their content.
In 2005, Sony BMG Music Entertainment was criticised when a copy protection mechanism known as Extended Copy Protection (XCP) used on some of their audio CDs automatically and surreptitiously installed copy-prevention software on computers (see 2005 Sony BMG CD copy protection scandal). Such discs are not legally allowed to be called CDs or Compact Discs because they break the Red Book standard governing CDs, and Amazon.com for example describes them as "copy protected discs" rather than "compact discs" or "CDs".