Overview

The Fast Fourier Transform (FFT) is an efficient algorithm for computing the discrete Fourier transform and is a major component of the spectrogram. Audio data is typically recorded in the time domain, meaning that the signal is represented as a function of time. The FFT transforms the signal from the time domain to the frequency domain, where it is represented by its frequency components instead. This representation shows which frequencies are present in the signal and makes it easier to alter specific frequencies if needed. From the FFT output, we can calculate the magnitude of each frequency component and build a visual representation of the frequencies.


Background

System Requirements

What are the requirements for your system? What functionality must it have? Are there any design space or size requirements? Use the table below to document them:

Requirement Designator | Requirement Explanation


Interface

The main function takes a .wav file as input, processes it, and prints the number of PCM samples along with a message telling the client which file the PCM data was written to.
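As an illustration of how a sample count can be derived from a .wav file, here is a minimal sketch that is not taken from the repository. It assumes the canonical 44-byte PCM header, with the "data" chunk directly after the "fmt " chunk. Real files may carry extra chunks that this sketch rejects. The names WavInfo, parse_wav_header, and pcm_sample_count are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the header fields this sketch reads. */
typedef struct {
    uint16_t channels;
    uint32_t sample_rate;
    uint16_t bits_per_sample;
    uint32_t data_bytes;
} WavInfo;

/* WAV header fields are stored little-endian. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse a canonical 44-byte PCM header (assumption: no extra chunks).
   Returns 0 on success and -1 if the header does not match that layout. */
int parse_wav_header(const uint8_t *hdr, size_t len, WavInfo *out)
{
    assert(hdr != NULL && out != NULL);
    if (len < 44) return -1;
    if (memcmp(hdr, "RIFF", 4) != 0 || memcmp(hdr + 8, "WAVE", 4) != 0) return -1;
    if (memcmp(hdr + 12, "fmt ", 4) != 0 || memcmp(hdr + 36, "data", 4) != 0) return -1;
    if (read_le16(hdr + 20) != 1) return -1;  /* format tag 1 = uncompressed PCM */

    out->channels        = read_le16(hdr + 22);
    out->sample_rate     = read_le32(hdr + 24);
    out->bits_per_sample = read_le16(hdr + 34);
    out->data_bytes      = read_le32(hdr + 40);

    if (out->channels == 0 || out->bits_per_sample == 0 ||
        out->bits_per_sample % 8 != 0) return -1;
    return 0;
}

/* Total number of PCM samples across all channels. */
size_t pcm_sample_count(const WavInfo *info)
{
    assert(info != NULL && info->bits_per_sample >= 8);
    return info->data_bytes / (info->bits_per_sample / 8u);
}
```

For a mono 16-bit file with 1000 bytes of audio data, pcm_sample_count reports 500 samples.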


Implementation

Timeline

  • Finish the FFT by the end of the semester
  • Elias Castro: Revise and organize the FFT code and begin talking to the Digital Subteam about their FFT

Files

The whole process is stored within three files: main.c, src.c, and src.h. 

main.c: Contains the main function, which takes in a .wav file, preprocesses it, and applies the FFT to the PCM samples.

src.c: Contains the source code for the functions used.

src.h: Contains the declaration and documentation of the functions used.

GitHub repo: cornell-c2s2/spectrogram_software (github.com)

Theory of Design

In a perfect world, audio data would be ready to process as soon as it is recorded. However, many things can go wrong when capturing and recording data: the recording can contain noise such as sudden spikes and unwanted frequencies. This is why the data must first be preprocessed. We normalize the audio so that sample amplitudes fall within a consistent range, which limits the influence of sudden spikes. We then apply a Hamming window, which tapers each block of samples toward its edges to reduce spectral leakage in the FFT output.

After preprocessing the audio, we want to extract data from it. This is where the FFT comes in: it transforms the PCM samples to the frequency domain, from which a visualization of the data can be made. From the FFT output we can also calculate the magnitude of specific frequencies, which could help identify birds through their calls.


Testing

Testing Strategy

The main testing strategy was to run the program on a PCM sample input and check that a normal test case works. To determine whether a test case succeeded, we wrote the processed samples to an output file called processed_output.pcm so that the PCM samples could be compared before and after processing. We also printed the number of PCM samples to confirm that the input was read properly and that every sample was counted.
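A before-and-after comparison like the one described can be automated with a small helper. The sketch below is hypothetical and not part of the repository. It assumes both buffers hold signed 16-bit PCM samples and have the same length. The names PcmDiff and compare_pcm are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Summary of how two equal-length PCM buffers differ. */
typedef struct {
    size_t changed;    /* number of positions where the samples differ */
    int max_abs_diff;  /* largest absolute difference at any position */
} PcmDiff;

/* Compare n samples of 16-bit PCM (assumed format) position by position. */
PcmDiff compare_pcm(const int16_t *before, const int16_t *after, size_t n)
{
    assert(before != NULL && after != NULL);
    PcmDiff d = {0, 0};
    for (size_t i = 0; i < n; i++) {
        /* Widen to int first so the subtraction cannot overflow int16_t. */
        int diff = abs((int)before[i] - (int)after[i]);
        if (diff != 0) d.changed++;
        if (diff > d.max_abs_diff) d.max_abs_diff = diff;
    }
    return d;
}
```

A result with changed equal to 0 means processing left the samples untouched, while a large max_abs_diff points to where the preprocessing had the strongest effect.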

Running Tests

Running the test is straightforward. Build the program with the command gcc -o main main.c src.c -lm, then run the resulting main executable to see the overall output.


Appendix

Resources

Hamming and Hanning:

https://towardsdatascience.com/brief-introduction-of-hamming-and-hanning-function-as-the-preprocessing-of-discrete-fourier-8b87fe538bb7

https://stackoverflow.com/questions/28215536/how-do-i-apply-the-hanning-function-to-my-audio-sample

https://en.wikipedia.org/wiki/Window_function (more of an introduction to what window functions are supposed to do in general) 

Lessons Learned

Collaboration: This project would not have been possible without the software team. Through this project, we practiced collaboration as both a soft skill and a technical skill. We communicated with each other whenever there was an update in development and asked each other questions for clarification. As a result, we shared the same vision of the project and navigated challenges effectively. On the technical side, we used GitHub to create and merge branches during development. This taught the team the value of source control and how it can be used effectively.

Audio Processing: No one on the software team had any experience processing audio before, so the team had to research audio processing and the techniques it uses. To write effective algorithms, we had to understand each process, because we used black-box testing to verify our functions. We also needed that understanding to properly document the functions for future clients and maintainers.
