WAV is one of the simplest formats for storing audio. A WAV file can carry several encodings, of which PCM is by far the most common; others include A-law and Mu-law. PCM stores raw, uncompressed audio samples, which leads to much larger files than compressed formats such as MP3 or OGG.
While there are existing libraries in several languages for working with WAV files, this post is an attempt to understand how to read the WAV file format without any external library. The language used here is C, compiled with GCC under Linux, but the code should run under Windows with minimal modifications. Most likely for VC++ you will have to replace #include <unistd.h> with #include <direct.h> and call _getcwd instead of getcwd.
WAV HEADER STRUCTURE
The canonical PCM header is 44 bytes long and is laid out as follows (positions are 1-based byte offsets):
Positions | Sample Value | Description |
1 – 4 | “RIFF” | Marks the file as a RIFF file. Characters are each 1 byte long. |
5 – 8 | File size (integer) | Size of the overall file minus 8 bytes, in bytes (32-bit integer). Typically, you’d fill this in after creation. |
9 – 12 | “WAVE” | File type header. For our purposes, it always equals “WAVE”. |
13 – 16 | “fmt ” | Format chunk marker. Includes a trailing space. |
17 – 20 | 16 | Length of the format data that follows (positions 21 – 36), in bytes – 32-bit integer |
21 – 22 | 1 | Type of format (1 is PCM) – 2-byte integer |
23 – 24 | 2 | Number of channels – 2-byte integer |
25 – 28 | 44100 | Sample rate – 32-bit integer. Common values are 44100 (CD), 48000 (DAT). Sample rate = number of samples per second, or Hertz. |
29 – 32 | 176400 | Byte rate: (Sample Rate * BitsPerSample * Channels) / 8 – 32-bit integer |
33 – 34 | 4 | Block align: (BitsPerSample * Channels) / 8 – 2-byte integer. 1 = 8-bit mono; 2 = 8-bit stereo or 16-bit mono; 4 = 16-bit stereo. |
35 – 36 | 16 | Bits per sample – 2-byte integer |
37 – 40 | “data” | “data” chunk header. Marks the beginning of the data section. |
41 – 44 | Data size (integer) | Size of the data section, in bytes (32-bit integer). |
Sample values are given above for a 16-bit stereo source.
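It is important to note that the WAV format stores multi-byte integers in little-endian order (least significant byte first). So that the result does not depend on the byte order of the machine running the program, the code reads each field byte by byte and assembles the value with shifts.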
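The byte assembly described above can be isolated into two small helpers. This is only a sketch: the names le16 and le32 are made up for illustration and do not appear in the original code.

```c
#include <stdint.h>

/* Assemble a 16-bit little-endian value from two raw bytes. */
static uint16_t le16(const unsigned char *b) {
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Assemble a 32-bit little-endian value from four raw bytes. */
static uint32_t le32(const unsigned char *b) {
    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}
```

For example, the four bytes 0x44 0xAC 0x00 0x00 found at positions 25 – 28 of a CD-quality file assemble to 44100.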
EDIT: AUG 2020
Thanks to a bug pointed out by Kalpathi Subramanian, the code has been updated to rectify the bug. He has adapted this code into a C++ implementation which is available on https://github.com/BridgesUNCC/bridges-cxx/blob/master/src/AudioClip.h
CODE
The code consists of a header file wave.h which is included in wave.c. Once you compile and run it, it accepts the path of a WAV file (relative to the current directory) on the command line and dumps the header information, including the size of each sample and the total duration of the audio.
wave.h
// WAVE file header format
struct HEADER {
    unsigned char riff[4];               // RIFF string
    unsigned int overall_size;           // overall size of file in bytes
    unsigned char wave[4];               // WAVE string
    unsigned char fmt_chunk_marker[4];   // fmt string with trailing space
    unsigned int length_of_fmt;          // length of the format data
    unsigned int format_type;            // format type. 1-PCM, 3- IEEE float, 6 - 8bit A law, 7 - 8bit mu law
    unsigned int channels;               // no.of channels
    unsigned int sample_rate;            // sampling rate (blocks per second)
    unsigned int byterate;               // SampleRate * NumChannels * BitsPerSample/8
    unsigned int block_align;            // NumChannels * BitsPerSample/8
    unsigned int bits_per_sample;        // bits per sample, 8- 8bits, 16- 16 bits etc
    unsigned char data_chunk_header[4];  // DATA string or FLLR string
    unsigned int data_size;              // NumSamples * NumChannels * BitsPerSample/8 - size of the next chunk that will be read
};
wave.c
/**
 * Read and parse a wave file
 *
 **/
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "wave.h"
#define TRUE 1
#define FALSE 0

// WAVE header structure

unsigned char buffer4[4];
unsigned char buffer2[2];

char* seconds_to_time(float seconds);

FILE *ptr;
char *filename;
struct HEADER header;

int main(int argc, char **argv) {

    filename = (char*) malloc(sizeof(char) * 1024);
    if (filename == NULL) {
        printf("Error in malloc\n");
        exit(1);
    }

    // get file path
    char cwd[1024];
    if (getcwd(cwd, sizeof(cwd)) != NULL) {

        strcpy(filename, cwd);

        // get filename from command line
        if (argc < 2) {
            printf("No wave file specified\n");
            return 1;
        }

        // the argument is appended to the current working directory
        strcat(filename, "/");
        strcat(filename, argv[1]);
        printf("%s\n", filename);
    }

    // open file
    printf("Opening file..\n");
    ptr = fopen(filename, "rb");
    if (ptr == NULL) {
        printf("Error opening file\n");
        exit(1);
    }

    int read = 0;

    // read header parts
    // the 4-character markers are not null-terminated, so print them with %.4s

    read = fread(header.riff, sizeof(header.riff), 1, ptr);
    printf("(1-4): %.4s \n", header.riff);

    read = fread(buffer4, sizeof(buffer4), 1, ptr);
    printf("%u %u %u %u\n", buffer4[0], buffer4[1], buffer4[2], buffer4[3]);

    // convert little endian to big endian 4 byte int
    header.overall_size = buffer4[0] |
                          (buffer4[1] << 8) |
                          (buffer4[2] << 16) |
                          (buffer4[3] << 24);

    printf("(5-8) Overall size: bytes:%u, Kb:%u \n", header.overall_size, header.overall_size/1024);

    read = fread(header.wave, sizeof(header.wave), 1, ptr);
    printf("(9-12) Wave marker: %.4s\n", header.wave);

    read = fread(header.fmt_chunk_marker, sizeof(header.fmt_chunk_marker), 1, ptr);
    printf("(13-16) Fmt marker: %.4s\n", header.fmt_chunk_marker);

    read = fread(buffer4, sizeof(buffer4), 1, ptr);
    printf("%u %u %u %u\n", buffer4[0], buffer4[1], buffer4[2], buffer4[3]);

    // convert little endian to big endian 4 byte integer
    header.length_of_fmt = buffer4[0] |
                           (buffer4[1] << 8) |
                           (buffer4[2] << 16) |
                           (buffer4[3] << 24);
    printf("(17-20) Length of Fmt header: %u \n", header.length_of_fmt);

    read = fread(buffer2, sizeof(buffer2), 1, ptr);
    printf("%u %u \n", buffer2[0], buffer2[1]);

    header.format_type = buffer2[0] | (buffer2[1] << 8);
    char format_name[10] = "";
    if (header.format_type == 1)
        strcpy(format_name, "PCM");
    else if (header.format_type == 6)
        strcpy(format_name, "A-law");
    else if (header.format_type == 7)
        strcpy(format_name, "Mu-law");

    printf("(21-22) Format type: %u %s \n", header.format_type, format_name);

    read = fread(buffer2, sizeof(buffer2), 1, ptr);
    printf("%u %u \n", buffer2[0], buffer2[1]);

    header.channels = buffer2[0] | (buffer2[1] << 8);
    printf("(23-24) Channels: %u \n", header.channels);

    read = fread(buffer4, sizeof(buffer4), 1, ptr);
    printf("%u %u %u %u\n", buffer4[0], buffer4[1], buffer4[2], buffer4[3]);

    header.sample_rate = buffer4[0] |
                         (buffer4[1] << 8) |
                         (buffer4[2] << 16) |
                         (buffer4[3] << 24);
    printf("(25-28) Sample rate: %u\n", header.sample_rate);

    read = fread(buffer4, sizeof(buffer4), 1, ptr);
    printf("%u %u %u %u\n", buffer4[0], buffer4[1], buffer4[2], buffer4[3]);

    header.byterate = buffer4[0] |
                      (buffer4[1] << 8) |
                      (buffer4[2] << 16) |
                      (buffer4[3] << 24);
    printf("(29-32) Byte Rate: %u , Bit Rate:%u\n", header.byterate, header.byterate*8);

    read = fread(buffer2, sizeof(buffer2), 1, ptr);
    printf("%u %u \n", buffer2[0], buffer2[1]);

    header.block_align = buffer2[0] | (buffer2[1] << 8);
    printf("(33-34) Block Alignment: %u \n", header.block_align);

    read = fread(buffer2, sizeof(buffer2), 1, ptr);
    printf("%u %u \n", buffer2[0], buffer2[1]);

    header.bits_per_sample = buffer2[0] | (buffer2[1] << 8);
    printf("(35-36) Bits per sample: %u \n", header.bits_per_sample);

    read = fread(header.data_chunk_header, sizeof(header.data_chunk_header), 1, ptr);
    printf("(37-40) Data Marker: %.4s \n", header.data_chunk_header);

    read = fread(buffer4, sizeof(buffer4), 1, ptr);
    printf("%u %u %u %u\n", buffer4[0], buffer4[1], buffer4[2], buffer4[3]);

    header.data_size = buffer4[0] |
                       (buffer4[1] << 8) |
                       (buffer4[2] << 16) |
                       (buffer4[3] << 24);
    printf("(41-44) Size of data chunk: %u \n", header.data_size);

    // calculate no.of samples
    long num_samples = (8 * header.data_size) / (header.channels * header.bits_per_sample);
    printf("Number of samples:%ld \n", num_samples);

    long size_of_each_sample = (header.channels * header.bits_per_sample) / 8;
    printf("Size of each sample:%ld bytes\n", size_of_each_sample);

    // calculate duration of file
    float duration_in_seconds = (float) header.overall_size / header.byterate;
    printf("Approx.Duration in seconds=%f\n", duration_in_seconds);
    printf("Approx.Duration in h:m:s=%s\n", seconds_to_time(duration_in_seconds));

    // read each sample from data chunk if PCM
    if (header.format_type == 1) { // PCM
        printf("Dump sample data? Y/N?");
        char c = 'n';
        scanf("%c", &c);
        if (c == 'Y' || c == 'y') {
            long i = 0;
            char data_buffer[size_of_each_sample];
            int size_is_correct = TRUE;

            // make sure that the bytes-per-sample is completely divisible by num.of channels
            long bytes_in_each_channel = (size_of_each_sample / header.channels);
            if ((bytes_in_each_channel * header.channels) != size_of_each_sample) {
                printf("Error: %ld x %u <> %ld\n", bytes_in_each_channel, header.channels, size_of_each_sample);
                size_is_correct = FALSE;
            }

            if (size_is_correct) {
                // the valid amplitude range for values based on the bits per sample
                long low_limit = 0l;
                long high_limit = 0l;

                switch (header.bits_per_sample) {
                    case 8:
                        low_limit = -128;
                        high_limit = 127;
                        break;
                    case 16:
                        low_limit = -32768;
                        high_limit = 32767;
                        break;
                    case 32:
                        low_limit = -2147483647L - 1; // avoids the unsigned-constant warning
                        high_limit = 2147483647;
                        break;
                }

                printf("\n\n.Valid range for data values : %ld to %ld \n", low_limit, high_limit);
                for (i = 1; i <= num_samples; i++) {
                    printf("==========Sample %ld / %ld=============\n", i, num_samples);
                    read = fread(data_buffer, sizeof(data_buffer), 1, ptr);
                    if (read == 1) {

                        // dump the data read
                        unsigned int xchannels = 0;
                        int data_in_channel = 0;
                        int offset = 0; // move the offset for every iteration in the loop below
                        for (xchannels = 0; xchannels < header.channels; xchannels++) {
                            printf("Channel#%u : ", (xchannels+1));
                            // convert data from little endian to big endian based on bytes in each channel sample
                            if (bytes_in_each_channel == 4) {
                                data_in_channel = (data_buffer[offset] & 0x00ff) |
                                                  ((data_buffer[offset + 1] & 0x00ff) << 8) |
                                                  ((data_buffer[offset + 2] & 0x00ff) << 16) |
                                                  (data_buffer[offset + 3] << 24);
                            }
                            else if (bytes_in_each_channel == 2) {
                                data_in_channel = (data_buffer[offset] & 0x00ff) |
                                                  (data_buffer[offset + 1] << 8);
                            }
                            else if (bytes_in_each_channel == 1) {
                                data_in_channel = data_buffer[offset] & 0x00ff;
                                data_in_channel -= 128; // in wave, 8-bit are unsigned, so shifting to signed
                            }

                            offset += bytes_in_each_channel;
                            printf("%d ", data_in_channel);

                            // check if value was in range
                            if (data_in_channel < low_limit || data_in_channel > high_limit)
                                printf("**value out of range\n");

                            printf(" | ");
                        }

                        printf("\n");
                    }
                    else {
                        printf("Error reading file. %d bytes\n", read);
                        break;
                    }

                } // for (i =1; i <= num_samples; i++) {

            } // if (size_is_correct) {

        } // if (c == 'Y' || c == 'y') {
    } // if (header.format_type == 1) {

    printf("Closing file..\n");
    fclose(ptr);

    // cleanup before quitting
    free(filename);
    return 0;
}

/**
 * Convert seconds into hh:mm:ss format
 * Params:
 *  seconds - seconds value
 * Returns: hms - formatted string
 **/
char* seconds_to_time(float raw_seconds) {
    char *hms;
    int hours, hours_residue, minutes, seconds, milliseconds;
    hms = (char*) malloc(100);

    sprintf(hms, "%f", raw_seconds);

    hours = (int) raw_seconds/3600;
    hours_residue = (int) raw_seconds % 3600;
    minutes = hours_residue/60;
    seconds = hours_residue % 60;
    milliseconds = 0;

    // get the decimal part of raw_seconds to get milliseconds
    char *pos;
    pos = strchr(hms, '.');
    int ipos = (int) (pos - hms);
    char decimalpart[15];
    memset(decimalpart, 0, sizeof(decimalpart)); // zero-fill so the copied digits stay terminated
    strncpy(decimalpart, &hms[ipos+1], 3);
    milliseconds = atoi(decimalpart);

    sprintf(hms, "%d:%d:%d.%d", hours, minutes, seconds, milliseconds);
    return hms;
}
Hii Amit,
Thanks a lot for such a nice tutorial. In fact I used your code and it was working properly. I have a question: suppose the wave header is not 44 bytes but maybe 48 bytes or some other length. How would the code structure change?
Hi Suraj,
There are non-standard WAV files in use where the header is longer than 44 bytes. Some commercial audio programs and music software add their own undocumented data to the header. The only way to know whether a file has a 44-byte header is to check the location of the “data” chunk marker. This means that a really robust program would keep parsing the file until it finds the “data” chunk instead of assuming that it always starts at the same position. This is much like parsing an MP3 file, where the header can be anywhere in the file.
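One way to avoid the 44-byte assumption is to walk the file chunk by chunk. A RIFF file is a sequence of chunks, each starting with a 4-character id and a 32-bit little-endian size, so unknown chunks can simply be skipped. The following is a minimal sketch of that idea, not part of the original program; find_chunk is a hypothetical helper name.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Walk the RIFF chunk list until the chunk named `id` is found.
 * On success the stream is positioned at the start of the chunk's
 * payload, *size holds the payload length and 0 is returned.
 * Returns -1 if the chunk is not present. */
static int find_chunk(FILE *fp, const char *id, uint32_t *size) {
    unsigned char h[8];

    /* Skip "RIFF", the overall size and "WAVE" (12 bytes). */
    if (fseek(fp, 12, SEEK_SET) != 0)
        return -1;

    while (fread(h, 1, sizeof h, fp) == sizeof h) {
        uint32_t n = (uint32_t)h[4] | ((uint32_t)h[5] << 8) |
                     ((uint32_t)h[6] << 16) | ((uint32_t)h[7] << 24);
        if (memcmp(h, id, 4) == 0) {
            *size = n;
            return 0;
        }
        /* Chunk payloads are padded to an even number of bytes. */
        if (fseek(fp, (long)(n + (n & 1u)), SEEK_CUR) != 0)
            return -1;
    }
    return -1;
}
```

A caller would first look up "fmt " to read the format fields and then "data" to locate the samples, whatever else the writing program has placed in between.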
header.block_align and size_of_each_sample seem to be equivalent.
also using header.block_align for calculating num_samples also makes for easier to understand math i.e. num_samples = header.data_size / header.block_align
@mel Your statement is correct. The definition of “block align” is “The number of bytes for one sample including all channels”, which makes you wonder why it can't be called something more relevant.
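The simpler arithmetic suggested above can be written down as two small functions. This is a sketch with hypothetical names (frame_count, duration_seconds); using the data chunk size for the duration, rather than the overall file size that the program uses, is an assumption made here so that header bytes are not counted as audio.

```c
#include <stdint.h>

/* Number of sample frames (one sample for every channel) in the data chunk. */
static uint32_t frame_count(uint32_t data_size, uint16_t block_align) {
    return block_align ? data_size / block_align : 0;
}

/* Playing time in seconds, from the data chunk size and the byte rate. */
static double duration_seconds(uint32_t data_size, uint32_t byte_rate) {
    return byte_rate ? (double)data_size / byte_rate : 0.0;
}
```

For the 16-bit stereo values in the header table (block align 4, byte rate 176400), a data chunk of 176400 bytes holds 44100 frames and lasts one second.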
To get actual amplitude on line 222, 228 or 232 wouldn’t you need take two’s complement?
Thanks for posting the code
@Jay The code uses byte shifting to convert from little endian to big endian. I am not sure if two’s complement is required here because there are no negative values involved so we dont need to worry about the sign bit. Correct me if I am wrong.
can u please help me how to run this code on windows through code:block IDE???
please reply me it’s very urgent.
@sudha I have never used Code Block IDE so I cant help you in that. But this is a very simple file – one c file and which includes a header file. It should run in any IDE .
Hi Amit, I am getting an error ‘multiple definition of main’. Any idea why it is so? Thanks.
Hi Amit, actually this is solved but the program is giving ‘No wave file specified’ without giving me the chance to input it at the command line.
Also in wave.c, in line no. 42, it (eclipse IDE) is giving me an error :
“Return without value, in function returning non-void”
Finally I am getting the output but could you tell me how to get the samples as an array?
@Anamay , there are no built in dynamic array primitives in pure C++ . Perhaps you want to use an external library or create a class for array handling. One alternative is to use the vector class from STL.
Actually I mean that which part of the program are we dealing with the actual samples of the audio file? Which variable are they stored, if so?
@Anamay . The actual data is found in data_in_channel
@Amit, The actual samples I obtained from matlab are all in the range of -1 to 1 whereas the values obtained from this program are in the ranges of 10000 also. What could be the reason for that?
The thing is, some of the values agree with the matlab ones, while some don’t. Is that because of endianness?
@anamay I really cant say since I am not familiar with Matlab.
Okay, no worries.. Thanks for your help.
Some feedback:
In the ‘get filename from command line’ section, for my program, the variable ‘argc’ was always 1. So I manually set it to 2 to avoid getting ‘No wave file specified’. Though my wave file was in the same working directory. Also argv[1] was null. Also, I added a ‘scanf’ function there to input the file name from the keyboard.
Hello Amit, finally I got my output as expected. Thanks so much for this code. What solved the problem was in the part where conversion from little endian to big endian occurs, i.e. lines 220 – 232, my case being bytes_in_each_channel == 2, lines 227 to 230,
else if (bytes_in_each_channel == 2)
{
data_in_channel = data_buffer[0] & 255 | (data_buffer[1] << 8);
}
Just added '& 255' to it. i.e. ANDing data_buffer[0] with 255 (all ones). The value of data_buffer[0], remains the same but the conversion of little endian to big endian occurs correctly even when data_buffer[0] is negative,
something which was not happening before.
@anamay,
Thanks a lot for the solution. Hope it helps others who come across this problem.
how do i input a wav file in windows??
@shri Agnish Sorry I didnt understand. In the code you just put the location of the wave file in your hard disk.
Hai…
In the command line I am giving “./a.out “, in the terminal I have seen the sample number and channel number but unable to listen the sound. Could you please suggest me what’s wrong?
compiled the same code but couldn’t get any audio. Could you suggest me any solution?
@haneesh . This code is just for parsing the contents of a wav file. It will not play the wav file or generate any sound.
hello! I want help for a function that must chop a track .
The -chop argument clears
one audio file from one time to another.
The result is saved in a new file named chopped-sound1.wav, where sound1.wav is the original audio file name. Below is an example of splitting a sound file from 2nd to 4th:
$ ./wavengine -chop sound1.wav 2 4
@mada do you mean that you want to extract the wave file between seconds 2 to 4?
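If the goal is to extract the audio between two time offsets, the byte range inside the data chunk follows from the byte rate and the block align. The sketch below only computes that range; chop_range is a hypothetical name, and copying the bytes and writing a new header with updated size fields is left out.

```c
#include <stdint.h>

/* Byte range [*begin, *end) inside the data chunk that covers the audio
 * between start_sec and end_sec. Both ends are clamped to data_size and
 * rounded down to a whole frame (block_align bytes). */
static void chop_range(uint32_t byte_rate, uint16_t block_align,
                       uint32_t data_size,
                       uint32_t start_sec, uint32_t end_sec,
                       uint32_t *begin, uint32_t *end) {
    uint64_t b = (uint64_t)start_sec * byte_rate;
    uint64_t e = (uint64_t)end_sec * byte_rate;

    if (e > data_size)
        e = data_size;
    if (b > e)
        b = e;
    if (block_align) {
        b -= b % block_align;
        e -= e % block_align;
    }
    *begin = (uint32_t)b;
    *end = (uint32_t)e;
}
```

For a 16-bit stereo 44100 Hz file, seconds 2 to 4 correspond to bytes 352800 to 705600 of the data chunk. The new file's data size would be end minus begin, and its overall size field would be 36 plus that value for a plain 44-byte header.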
Hello and thank you for the code!
i try to run your code but
1) I get an error form line 42 (return;)
2) if i comment line 42 or change in to return 0; it compiles but i get
“Opening file..
Error opening file”
Any ideas why that is happening?
I run:
gcc wave.c
./a.out /home/path/Test1.wav
Thanks in advance
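@outatime It looks like a standard file access error. Either the file is not present in the path provided or perhaps the folder does not have permissions to allow the file to be read. Note that the program prepends the current working directory to the argument, so an absolute path such as /home/path/Test1.wav will not be found; pass a path relative to the directory you run it from. A good idea would be to use perror() or strerror(errno) to print out the reason for the failure.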
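To see why fopen failed, the error code it leaves in errno can be turned into a message. A minimal sketch, where open_wav is a hypothetical wrapper name and not a function from the original program:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Open a file for binary reading; on failure, report the reason on stderr. */
static FILE *open_wav(const char *path) {
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        /* strerror() maps errno to text such as "No such file or directory". */
        fprintf(stderr, "Error opening %s: %s\n", path, strerror(errno));
    }
    return fp;
}
```

Printing the full path together with the message also makes it obvious when the path that was built is not the one that was intended.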
Hello, thanks a lot for the code.
I’m really starting out at sound programming and I’ve got the struct (mostly) correctly filled. But you left me wondering how i would actually play this file. Any tips?
I’m using windows.
@Marcio You are welcome. Playing a wav file is a completely different thing than parsing it. You can always use any existing player app for playing a wav file. Trying to make a player of your own can be more challenging, especially if you want to do everything from scratch.
Thank you. Actually I found I can use win32 functions like waveOutOpen to play my struct, if someone has same problem.
In your format description it says sample rate is 32 byte integer. Shouldn`t it be 32 bit integer, or 4 byte instead?
Hey there,
I have a simple but important question:
Where is the audio data stored (in which array)? I am hesitating between data_buffer and data_in_channel.
Could someone tell me which one it is?
Thanks in advance!
@Rpy Circuits the actual data is in data_in_channel
Although I know that this program is not designed to actually take the audio data and play it, I would like to do just that. How would I use data_in_channel to actually play the audio? Let’s suppose that I have a function to store the audio data and play it that takes these parameters:
(buffer to store data in, format of audio data, actual audio data, size of the audio data, frequency of the audio data)
Could I simply take all of the information from the WAV header and input it in the function? Can I actually use the buffer data_in_channel to play the file?
Thanks!
The actual documentation for the function is here: https://www.openal.org/documentation/OpenAL_Programmers_Guide.pdf
go to function alBufferData!
@rpy Circuits, I had the same purpose when I wrote this code. But I got busy with other projects so never got around to actually being able to play the wav data. It is definitely non-trivial and requires working closely with the soundcard. From what I can make out, playing an audio sound is a function of playing a certain number of samples for a certain period of time on a given frequency.
There is a great article here: http://digitalsoundandmusic.com/2-3-12-modeling-sound-in-c-under-linux/ which explains how sound is modeled. Of particular interest is the paragraph:
******
The sound wave is created by taking the sine of the appropriate frequency (262 Hz, for example) at 44,100 evenly-spaced intervals for one second of audio data. The value returned from the sine function is between -1 and 1. However, the sound card expects a value that is stored in one byte (i.e., 8 bits), ranging from -128 to 127. To put the value into this range, we multiply by 127 and, with the floor function, round down.
******
Only sound card manufacturers deal with the actual process of creating sound, so the info about the software logic is a little hard to get and a lot of device manufacturers keep their code a proprietary secret. But for people like us who want to learn the actual process of making sound, this is the holy grail.
The fields format_type, channels, block_align and bits_per_sample should be unsigned short (2 bytes each).
So negative values in the signal can mess this up? I was able to run the code, and tried on a piano sequence, but it comes out noisy.. My data range is from
-14064, 12903
Not really. 8-bit samples are stored as unsigned bytes, ranging from 0 to 255. 16-bit samples are stored as 2’s-complement signed integers, ranging from -32768 to 32767
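The two storage conventions described above can be decoded without relying on whether char is signed on a given compiler. A small sketch with hypothetical helper names (decode_u8, decode_s16):

```c
#include <stdint.h>

/* 8-bit WAV samples are unsigned (0..255); recentre them on zero. */
static int decode_u8(const unsigned char *b) {
    return (int)b[0] - 128;
}

/* 16-bit WAV samples are little-endian two's complement. */
static int decode_s16(const unsigned char *b) {
    uint16_t u = (uint16_t)(b[0] | (b[1] << 8));
    /* Values with the top bit set represent negative amplitudes. */
    return u >= 0x8000u ? (int)u - 0x10000 : (int)u;
}
```

For instance, the byte pair 0x10 0xC9 decodes to -14064, so negative values in this range are perfectly valid 16-bit samples.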
In line 211, you are reading into the data buffer the audio data for all channels (I was using a 2 channel example and it was 4 bytes); however inside the for loop [ line 228 ] (looping over each channel) you are always using the data for the first channel. I had to modify this so that each channel data was recorded correctly. Can you clarify? I still get a noisy data from my player, but thats a different issue.
Its been a few years since I wrote the code so I will have to go over it once to see if there is any issue. At first glance, what you say could be correct, because the channel loop is using the same data extract in every iteration. I believe the pointer within the data array should be incremented within the channel loop. If you have already made the change and its working fine , you could probably post the fix. I will go over the code in the next couple of days and apply your fix
No, I don't have it working satisfactorily. I have a Java version that reads the wave file correctly (uses the Java WAV related classes) and plays fine, but the C++ version (I converted the above from C to C++) is not reading the data correctly so far (at least it doesn't match the Java version and hence the sound comes out noisy). Once I have it working, I will be happy to share; in fact we are hoping to use this for an open source educational toolkit.
So in short, still fighting with it!
Amit:
I got the parser working yesterday. There are two issues with your original code. One I mentioned above, where for multiple channels, you need to update the pointer. Secondly, this code wont work correctly for negative signal values. For this you need to do the following where you do the endian conversion:
data_in_channel = data_buffer[0] & 0x00ff | (data_buffer[1] << 8)
and similarly for the 24 and 32 bit cases. Otherwise the values are changed to incorrect values.
I have a version for C I can send you with these changes. I also did a C++ version to work with our system. I will send a link to that with acknowledgements for your approval. . We would like to use that in an open source educational toolkit we are building.
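For the wider sample sizes mentioned above, the same principle applies: mask every byte to 8 bits, assemble an unsigned value, then interpret the top bit as the sign. The following is a sketch with hypothetical names (decode_s24, decode_s32), not the version referred to in the comment.

```c
#include <stdint.h>

/* 24-bit little-endian two's complement sample. */
static int32_t decode_s24(const unsigned char *b) {
    uint32_t u = (uint32_t)b[0] |
                 ((uint32_t)b[1] << 8) |
                 ((uint32_t)b[2] << 16);
    /* Bit 23 is the sign bit of a 24-bit sample. */
    return (u & 0x800000u) ? (int32_t)u - 0x1000000 : (int32_t)u;
}

/* 32-bit little-endian two's complement sample. */
static int32_t decode_s32(const unsigned char *b) {
    uint32_t u = (uint32_t)b[0] |
                 ((uint32_t)b[1] << 8) |
                 ((uint32_t)b[2] << 16) |
                 ((uint32_t)b[3] << 24);
    /* Negative values are rebuilt as -(~u) - 1 to stay within int32_t. */
    return (u & 0x80000000u) ? -(int32_t)(~u) - 1 : (int32_t)u;
}
```

Working on unsigned char bytes means no byte is ever sign-extended by accident, which is the failure the '& 0x00ff' mask guards against in the original code.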
Hello, firstly excellent code and thanks for sharing but I’m having a problem. To check the correct results from matlab, each sample in matlab is stored as a double instead of an int in this example. I tried to store it as int by changing data_in_channel to double but it just adds some zeros. Also, the values still differ from the ones parsed by matlab. Thanks for your time!
Hello @Dimitris, I am not sure how the results will work if the channel data is converted into double because then the bit shift operations will not work correctly. The wav data is actually supposed to be only integers, so converting them into floating point will have side-effects.
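One way to get floating-point values without disturbing the bit shifts is to assemble the integer sample first and convert it afterwards. Dividing a 16-bit sample by 32768 is a common convention that yields values between -1 and 1; whether MATLAB uses exactly this scaling is an assumption here, and normalize_s16 is a hypothetical name.

```c
#include <stdint.h>

/* Map a signed 16-bit sample to a double in the range [-1.0, 1.0). */
static double normalize_s16(int16_t sample) {
    return sample / 32768.0;
}
```

With this scaling a raw value of 16384 becomes 0.5 and -32768 becomes -1.0, which would explain integer output in the tens of thousands next to normalized output between -1 and 1 for the same file.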
Hi
How to read the wave file main data after byte 44?
how to read samples of the two channels of a wave file using 2 dimensional array?….pls help with the code……..
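Samples in the data chunk are interleaved: one value for each channel, frame after frame. A possible way to get one array per channel is sketched below for 16-bit PCM. The function names are hypothetical, and the input is assumed to be the raw bytes of the data chunk already read into memory.

```c
#include <stdint.h>
#include <stdlib.h>

/* Free the arrays returned by deinterleave_s16. */
static void free_channels(int16_t **ch, unsigned channels) {
    if (ch == NULL)
        return;
    for (unsigned c = 0; c < channels; c++)
        free(ch[c]);
    free(ch);
}

/* Split interleaved 16-bit little-endian PCM into out[channel][frame].
 * Returns NULL if memory cannot be allocated. */
static int16_t **deinterleave_s16(const unsigned char *data,
                                  size_t frames, unsigned channels) {
    int16_t **out = calloc(channels ? channels : 1, sizeof *out);
    if (out == NULL)
        return NULL;

    for (unsigned c = 0; c < channels; c++) {
        out[c] = malloc((frames ? frames : 1) * sizeof **out);
        if (out[c] == NULL) {
            free_channels(out, channels);
            return NULL;
        }
    }

    for (size_t f = 0; f < frames; f++) {
        for (unsigned c = 0; c < channels; c++) {
            const unsigned char *p = data + (f * channels + c) * 2;
            uint16_t u = (uint16_t)(p[0] | (p[1] << 8));
            out[c][f] = (int16_t)(u >= 0x8000u ? (int)u - 0x10000 : (int)u);
        }
    }
    return out;
}
```

For a stereo file, out[0] then holds the left channel and out[1] the right channel, each with one entry per frame.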
When compiling the code, I received the following message:
[Warning] this decimal constant is unsigned only in ISO C90
So, i fix it this way:
203 low_limit = -2147483648u;
So, this compile fine in Windows 10 with Dev-Cpp Portable compiler.
Thank you.
Thanks for input @Gilberto, I have never tried to run the code on Windows, so this is helpful for people who want to try on Windows.
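A caveat on the fix above: -2147483648u is an unsigned constant, so on platforms where long is 64 bits wide the assignment produces +2147483648 rather than the intended minimum. The limits from <stdint.h> avoid both the warning and that pitfall. A sketch, with sample_range as a hypothetical helper name:

```c
#include <stdint.h>

/* Valid range for signed PCM samples of the given width.
 * Returns 0 on success, -1 if the width is not handled. */
static int sample_range(unsigned bits, long long *lo, long long *hi) {
    switch (bits) {
    case 8:  *lo = INT8_MIN;  *hi = INT8_MAX;  return 0;
    case 16: *lo = INT16_MIN; *hi = INT16_MAX; return 0;
    case 32: *lo = INT32_MIN; *hi = INT32_MAX; return 0;
    default: return -1;
    }
}
```

Using long long for the limits keeps the 32-bit range representable regardless of how wide long is on the target compiler.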
Hi, I am a student studying in Korea. I wonder about the license of this code.
Is it GNU GPL license or MIT license? Thank you for reading my comment.
Hello, There is no license associated with the code. You can use it and modify as you want.