Discrete Signal Processing (DSP):

Fourier Transform Implementation and STFT

Table of Contents

1. FFT with Sampling Frequency¶

In practical applications, signals are sampled from the continuous-time domain. To properly analyze such signals using the FFT, we must take the sampling frequency into account, converting the DFT index $k$ into a frequency variable $f$.

This section follows the computational approach outlined in Computational Methods for Data Analysis by Nathan Kutz (University of Washington).

1.1. Implementing an FFT Routine¶

Consider the discrete-time signal sampled from a continuous sinusoid:

$$ x[n] = 2\cos (2\pi f t) = 2 \cos(2\pi \cdot 60 \cdot t), \quad \text{with sampling rate } f_s $$

Here, the signal has a frequency component at 60 Hz. You can think of this as a vibration signal from a rotating machine operating at 60 Hz (or 3600 RPM).

To analyze this using FFT:

Choose a sampling frequency $f_s$ (e.g., $f_s = 1000$ Hz)
Choose a sampling interval $\Delta t = 1/f_s$
Collect $N$ samples: $x[n] = 2\cos(2\pi \cdot 60 \cdot n\Delta t)$

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

from scipy.fftpack import fft, fftshift
from scipy.signal import spectrogram

Fs = 2**10              # Sampling frequency
T = 1/Fs                # Sampling period (or sampling interval)

N = 5000                # Total data points (signal length)

t = np.arange(0, N)*T   # Time vector (time range)

k = np.arange(0, N)     # vector from 0 to N-1
f = (Fs/N)*k            # frequency range

x = 2*np.cos(2*np.pi*60*t) + np.random.randn(N)

plt.figure(figsize = (10, 4))
plt.plot(t, x)
plt.xlim([0, 0.5])
plt.xlabel('t (sec)')
plt.show()

The Discrete Fourier Transform is computed and normalized as:

$$ X[k] = \frac{1}{N} \sum_{n=0}^{N-1} x[n] e^{-j \frac{2\pi}{N}kn}, \quad k = 0, 1, \dots, N-1 $$

This corresponds to the code:

xt = fft(x)/N

This normalization ensures that the amplitude of the DFT matches the scale of the original signal.

The corresponding frequency bins (in Hz) are:

$$ f[k] = \frac{F_s}{N}k, \quad k = 0, 1, \cdots, N-1 $$

This gives a frequency range from 0 up to just below the sampling frequency $F_s$, but for spectral analysis, it is often more informative to center the zero-frequency component.

To construct a centered frequency axis, we define:

$$ k_r = \begin{bmatrix} 0, 1, \cdots, \frac{N}{2} - 1, -\frac{N}{2}, \cdots, -1 \end{bmatrix} $$

which is implemented in code as:

kr = np.hstack([np.arange(0, N/2), np.arange(-N/2, 0)])

The corresponding physical frequencies (in Hz) are:

$$ f_r[k] = \frac{F_s}{N}k_r $$

This mapping is computed by:

fr = (Fs/N) * kr

To align the DFT result to this symmetric frequency axis, we apply the fftshift operation:

fs = fftshift(fr)
xtshift = fftshift(xt)

This reorders the FFT output so that the zero frequency is at the center and negative frequencies appear on the left side of the plot, yielding a more natural and interpretable frequency-domain representation.

# ft using fft

xt = fft(x)/N
xtshift = fftshift(xt)

kr = np.hstack([np.arange(0, N/2), np.arange(-N/2, 0)])
fr = (Fs/N)*kr
fs = fftshift(fr)

plt.figure(figsize = (10,8))

plt.subplot(2,1,1)
plt.plot(f, np.abs(xt))
plt.ylabel('|X(f)|', fontsize = 15)
plt.ylim([0, 2.1])
plt.xlim([0, 1000])
plt.title('FFT')

plt.subplot(2,1,2)
plt.plot(fs, np.abs(xtshift))
plt.ylim([0, 2.1])
plt.xlim([-500, 500])
plt.xlabel('f')
plt.ylabel('|X(f)|', fontsize = 15)
plt.title('Shifted FFT')
plt.show()

1.2. Single-Sided FFT (Positive Frequencies Only)¶

When applying the FFT to a real-valued time-domain signal, the resulting DFT exhibits Hermitian symmetry:

$$ X[N - k] = X[k]^*, \quad \text{for } x[n] \in \mathbb{R} $$

This implies that the second half of the FFT (frequencies above the Nyquist frequency) contains no new information. Therefore, we often compute the single-sided amplitude spectrum, which includes only the non-redundant frequency components:

Construction:

Retain only the first $N/2$ frequency bins (from $k = 0$ to $k = N/2 - 1$)
Multiply the amplitude values by 2 (except for the DC component $k = 0$ and the Nyquist component if $N$ is even) to conserve total signal energy

Frequency Axis (in Hz):

$$ f_k = \frac{k f_s}{N}, \quad 0 \leq k < \frac{N}{2} $$

This form of the FFT is commonly used for plotting and interpreting the spectrum of real signals in practical engineering contexts.

# single-sides fft

xt = fft(x)/N
xtss = xt[0:int(N/2)+1]      # 0:N/2
xtss[1:-1] = 2*xtss[1:-1]

fss = f[0:int(N/2)+1]

plt.figure(figsize = (10,8))
plt.subplot(2,1,1)
plt.plot(fs, np.abs(xtshift))
plt.ylabel('|X(f)|', fontsize = 15)
plt.ylim([0, 2.1])
plt.xlim([np.min(fs), np.max(fs)])
plt.title('Shifted FFT')

plt.subplot(2,1,2)
plt.plot(fss, np.abs(xtss))
plt.xlim([np.min(fss), np.max(fss)])
plt.xlabel('f')
plt.ylabel('|X(f)|', fontsize = 15)
plt.ylim([0, 2.1])
plt.title('Single-sided FFT')
plt.show()

2. Examples of Fourier Transform¶

(1) Gaussian Function

One of the most fundamental examples in Fourier analysis is the Gaussian function. Remarkably, the Fourier Transform of a Gaussian is also a Gaussian, a property that underpins its central role in signal processing and quantum mechanics.

Consider the function

$$ f(x) = e^{-\alpha x^2}, \quad \alpha > 0 $$

The Fourier Transform of $f(x)$ is given by

$$ F(k) = \int_{-\infty}^{\infty} e^{-\alpha x^2} \, e^{-j 2\pi k x} \, dx = \sqrt{\frac{\pi}{\alpha}} \, e^{-\frac{(\pi k)^2}{\alpha}} $$

Alternatively, in a normalized form commonly seen in physics and engineering (up to scaling constants), the transform pair is often written as:

$$ f(x) = e^{-\alpha x^2} \quad \longleftrightarrow \quad F(k) = \frac{1}{\sqrt{2\alpha}} \, e^{-\frac{k^2}{4\alpha}} $$

This result illustrates that the Gaussian function is invariant under the Fourier Transform, up to scaling of its width and amplitude. Moreover, the product of variances in time and frequency domains remains constant, achieving the minimum bound in the uncertainty principle.

%plot -s 800,600

%% un-scaled

L = 20;
n = 2^7;

x2 = linspace(-L/2,L/2,n+1);
x = x2(1:n);

u = exp(-x.*x);
ut = fft(u);
utshift = fftshift(ut);

subplot(2,2,1), plot(x,u), title('original signal'), xlabel('t'),
subplot(2,2,2), plot(real(ut)), title('Re(fft)'), xlabel('k'), axis tight
subplot(2,2,3), plot(real(utshift)), title('Re(fftshift)'), xlabel('k'), axis tight
subplot(2,2,4), plot(abs(utshift)), title('abs(fftshift)'), xlabel('k'), axis tight

%plot -s 800,600

%% scaled

L = 20;
n = 2^7;

x2 = linspace(-L/2,L/2,n+1);
x = x2(1:n);

k = (2*pi/L)*[0:n-1];
kr = (2*pi/L)*[0:n/2-1 -n/2:-1];
ks = fftshift(kr);

u = exp(-x.*x);
ut = fft(u)/n;
utshift = fftshift(ut);

subplot(2,2,1), plot(x,u), title('original signal'), xlabel('time')
subplot(2,2,2), plot(k,real(ut)), title('Re(fft)'), xlabel('\omega'), axis tight
subplot(2,2,3), plot(ks,real(utshift)), title('Re(fftshift)'), xlabel('\omega'), axis tight
subplot(2,2,4), plot(ks,abs(utshift)), title('abs(fftshift)'), xlabel('\omega'), axis tight

(2) Sinusoidal Signals

A cosine function can be expressed in terms of complex exponentials using Euler's formula:

$$ \cos(\omega t) = \frac{e^{j \omega t} + e^{-j \omega t}}{2} $$

This identity is useful because the Fourier Transform of a complex exponential is a delta function. Specifically, the Fourier Transform of $\cos(\omega t)$ is given by:

$$ \mathcal{F}\{ \cos(\omega t) \} = \pi \left[ \delta(\Omega - \omega) + \delta(\Omega + \omega) \right] $$

where:

$\delta(\cdot)$ denotes the Dirac delta function

This result shows that a pure cosine wave corresponds to two impulses in the frequency domain, located symmetrically at $\pm \omega$ (or $\pm f$). This is consistent with the idea that a cosine wave consists of two complex exponentials at positive and negative frequencies.

%plot -s 800,600

T = 20;
N = 2^7;

t2 = linspace(-T/2,T/2,N+1);
t = t2(1:N);

k = (2*pi/T)*[0:N-1];
kr = (2*pi/T)*[0:N/2-1 -N/2:-1];
ks = fftshift(kr);

u = cos(5*t);
ut = fft(u)/N;
utshift = fftshift(ut);

subplot(2,2,1), plot(t,u), title('original signal'), xlabel('time')
subplot(2,2,2), plot(k,real(ut)), title('Re(fft)'), xlabel('k'), axis tight
subplot(2,2,3), plot(ks,real(utshift)), title('Re(fftshift)'), xlabel('\omega'), axis tight
subplot(2,2,4), plot(ks,abs(utshift)), title('abs(fftshift)'), xlabel('\omega'), axis tight

%plot -s 800,600

T = 20;
N = 2^10;

t2 = linspace(-T/2,T/2,N+1);
t = t2(1:N);

k = (2*pi/T)*[0:N-1];
kr = (2*pi/T)*[0:N/2-1 -N/2:-1];
ks = fftshift(kr);

u1 = cos(5*t);
ut1 = fft(u1)/N;
utshift1 = fftshift(ut1);

u2 = cos(50*t);
ut2 = fft(u2)/N;
utshift2 = fftshift(ut2);

u3 = cos(100*t);
ut3 = fft(u3)/N;
utshift3 = fftshift(ut3);

subplot(3,1,1), plot(ks,abs(utshift1)), title('cos(5t)'), xlabel('\omega'), axis tight
subplot(3,1,2), plot(ks,abs(utshift2)), title('cos(50t)'), xlabel('\omega'), axis tight
subplot(3,1,3), plot(ks,abs(utshift3)), title('cos(100t)'), xlabel('\omega'), axis tight

3. STFT (Short-Time Fourier Transform)¶

3.1. Motivation¶

The Fourier Transform provides global frequency information but loses time localization. When analyzing signals whose frequency content changes over time (e.g., speech, machinery faults, or musical tones), a time-localized frequency analysis is needed.

Let us examine two signals, $x(t)$ and $z(t)$:

$$ x(t) = \begin{cases} \cos(2\pi f_1 t), & 0 \leq t < 1 \\ \cos(2\pi f_2 t), & 1 \leq t \leq 2 \end{cases} $$

This signal clearly exhibits nonstationarity: a low frequency in the first second and a higher frequency in the second.

$$ z(t) = \frac{1}{2} \left( \cos(2\pi f_1 t) + \cos(2\pi f_2 t) \right) $$

This is a stationary signal, where both frequencies are present throughout the entire duration.

Fs = 2**10                # Sampling frequency
T = 1/Fs                  # Sampling period (or sampling interval)

N = 2*Fs                  # Total data points (signal length)

t = np.arange(0, N)*T     # Time vector (time range)

k = np.arange(0, N)       # vector from 0 to N-1
f = (Fs/N)*k              # frequency range

x1 = np.cos(2*np.pi*50*t)
x2 = np.cos(2*np.pi*100*t)

x = np.zeros(t.shape)
x[0:int(N/2)] = x1[0:int(N/2)]
x[int(N/2):-1] = x2[int(N/2):-1]

plt.figure(figsize = (10,8))

plt.subplot(2,1,1)
plt.plot(t, x)
plt.xlabel('t')
plt.ylabel('x(t)', fontsize = 15)
plt.xlim([np.min(t), np.max(t)])

plt.subplot(2,1,2)
plt.plot(t, x)
plt.xlim([0.9, 1.1])
plt.xlabel('t')
plt.ylabel('x(t)', fontsize = 15)
plt.show()

z = 1/2*(x1 + x2)

plt.figure(figsize=(10,8))

plt.subplot(2,1,1)
plt.plot(t, z)
plt.xlabel('t')
plt.ylabel('z(t)', fontsize = 15)
plt.xlim([np.min(t), np.max(t)])

plt.subplot(2,1,2)
plt.plot(t, z)
plt.xlabel('t')
plt.ylabel('z(t)', fontsize = 15)
plt.xlim([0.9,1.1])
plt.ylim([-1,1])
plt.show()

Now, let's apply the Fourier Transform to both signals.

xt = fft(x)/N

xtss = xt[0:int(N/2)+1]
xtss[1:-1] = 2*xtss[1:-1]

fss = f[0:int(N/2)+1]

plt.figure(figsize=(10,4))

plt.plot(fss, np.abs(xtss))
plt.xlim([0, 500])
plt.ylim([-0.05,1])
plt.xlabel('f')
plt.ylabel('|X(f)|', fontsize = 15)
plt.show()

zt = fft(z)/N

ztss = zt[0:int(N/2)+1]
ztss[1:-1] = 2*ztss[1:-1]

fss = f[0:int(N/2)+1]

plt.figure(figsize=(10,4))

plt.plot(fss, np.abs(ztss))
plt.xlim([0, 500])
plt.ylim([-0.05,1])
plt.xlabel('f')
plt.ylabel('|X(f)|', fontsize = 15)
plt.show()

The Fourier Transforms of $x(t)$ and $z(t)$ appear almost identical, although there are minor differences (which we will not discuss here, as they are beyond the scope of this section).

A standard Fourier Transform identifies the presence of both $f_1$ and $f_2$ in each signal, but it does not indicate when each frequency occurs.

This limitation motivates the need for a new method that captures both frequency and time information simultaneously.

3.2. Short-Time Fourier Transform (STFT)¶

Many real-world signals, such as speech, exhibit nonstationarity, meaning their frequency content changes over time. For instance, in spoken language, the positions of spectral peaks (formants) vary as different phonemes are articulated. Applying the Discrete Fourier Transform (DFT) over a long time window in such cases obscures these spectral transitions, as it captures only the global frequency content.

To resolve this issue, one approach is to compute the DFT over short, localized segments of the signal. When the window is sufficiently short, the signal within each segment can be approximated as stationary. However, this introduces an inherent tradeoff between time and frequency resolution, which must be carefully managed.

To formalize this approach, consider the Short-Time Fourier Transform (STFT), which performs a localized Fourier analysis by applying a sliding window across the signal. For a given signal $x(t)$, the STFT is defined as

$$ \text{STFT}_x(\tau, f) = \int_{-\infty}^{\infty} x(t) \, w(t - \tau) \, e^{-j 2\pi f t} \, dt $$

Here, $w(t)$ is a window function centered at time $\tau$ (e.g., Hamming, Hann, or Gaussian), which restricts the analysis to a local region of the signal.

$\tau$ determines the time location of the windowed segment
$f$ specifies the frequency variable
$w(t - \tau)$ localizes the analysis in time

By evaluating the transform at multiple values of $\tau$ and $f$, the STFT provides a time-frequency representation of the signal.

3.3. Example: Time-Frequency Analysis of $x(t)$ and $z(t)$¶

Consider again the two signals introduced earlier:

$$ \begin{align*} x(t) &= \begin{cases} \cos(2\pi f_1 t), & 0 \leq t < 1 \\ \cos(2\pi f_2 t), & 1 \leq t \leq 2 \end{cases}\\\\ z(t) &= \frac{1}{2} \left( \cos(2\pi f_1 t) + \cos(2\pi f_2 t) \right) \end{align*} $$

The signal $x(t)$ is nonstationary, as its frequency content changes from $f_1$ to $f_2$ at $t = 1$. In contrast, $z(t)$ is stationary, containing both frequencies throughout its duration. When the Fourier Transform is applied to either signal, both $f_1$ and $f_2$ are detected, but the timing of their presence is not revealed.

However, using the STFT:

The STFT of $x(t)$ exhibits energy concentrated at $f_1$ for times $0 \leq t < 1$ and at $f_2$ for $1 \leq t \leq 2$
The STFT of $z(t)$ displays both frequency components at all time points

This example highlights the fundamental capability of STFT to resolve time-varying spectral content.

windowsize = 2**7
window = np.hanning(windowsize)
nfft = windowsize
noverlap = windowsize/2

f, t, Sxx = spectrogram(x,
                        Fs,
                        window = window,
                        noverlap = noverlap,
                        nfft = nfft,
                        scaling = 'spectrum',
                        mode = 'psd')

plt.figure(figsize = (6, 6))

plt.pcolormesh(t, f, Sxx)
plt.ylabel('Frequency [Hz]', fontsize = 15)
plt.xlabel('Time [sec]', fontsize = 15)
plt.show()

f, t, Szz = spectrogram(z, Fs, window=window, noverlap=noverlap, nfft=nfft, scaling='spectrum', mode='psd')

plt.figure(figsize = (6, 6))

plt.pcolormesh(t, f, Szz)
plt.ylabel('Frequency [Hz]', fontsize = 15)
plt.xlabel('Time [sec]', fontsize = 15)
plt.show()

3.4. Time-Frequency Resolution Tradeoff¶

The application of a window function introduces a tradeoff between time and frequency resolution, governed by the uncertainty principle:

A narrow window yields good time resolution but poor frequency resolution
A wide window improves frequency resolution but degrades time localization

This tradeoff is intrinsic to all time-frequency analysis methods and must be considered when selecting a window length.

windowsize = 2**8
window = np.hanning(windowsize)
nfft = windowsize
noverlap = windowsize/2

f, t, Sxx = spectrogram(x, Fs, window=window, noverlap=noverlap, nfft=nfft, scaling='spectrum', mode='psd')

plt.figure(figsize = (6, 6))

plt.pcolormesh(t, f, Sxx)
plt.ylabel('Frequency [Hz]', fontsize = 15)
plt.xlabel('Time [sec]', fontsize = 15)
plt.show()

windowsize = 2**6
window = np.hanning(windowsize)
nfft = windowsize
noverlap = windowsize/2

f, t, Sxx = spectrogram(x, Fs, window=window, noverlap=noverlap, nfft=nfft, scaling='spectrum', mode='psd')

plt.figure(figsize = (6, 6))

plt.pcolormesh(t, f, Sxx)
plt.ylabel('Frequency [Hz]', fontsize = 15)
plt.xlabel('Time [sec]', fontsize = 15)
plt.show()

%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')