Conceptual background of higher-order Fourier analysis#

Note

This page provides an intuitive mathematical explanation of central concepts in higher-order Fourier analysis. The goal is not to be exhaustive or fully rigorous, but to provide a coherent mental model of what is going on deep below the grounds of the functionality of the software package. For precise mathematical results and references, please see Theoretical foundations and extensions.

Classical Fourier analysis decomposes signals into linear harmonics (fixed-frequency sine and cosine waves). However, this framework struggles with signals whose frequency content itself varies over time, such as chirps. In a chirp, the instantaneous frequency changes (e.g., linearly or quadratically), so its energy is smeared across many Fourier modes rather than concentrated in a few. This makes it difficult to recover or interpret the signal using traditional Fourier methods: no single mode (or small set of modes) captures the essential structure of the chirp.

To address this, we need a way to detect whether a signal’s structure can be described by a small number of simple components, or if it requires a more complex higher-order description. This motivates the introduction of the Gowers norms, which act as “detectors” for such higher-order structure, generalizing the idea of Fourier coefficients in order to capture polynomial phase relationships. Before introducing these norms, let us look at the case of classical Fourier analysis.

The case of classical Fourier analysis #

In the classical setting, we are often interested in knowing whether a signal can be represented using a few Fourier characters. To quantify this, we can ask how concentrated is the signal’s energy in the frequency domain. A natural measure of such a concentration is the maximum magnitude of the signal’s Fourier coefficients:

\[\|\hat{f}\|_\infty = \max_{k} |\hat{f}(k)|,\]

where \(\hat{f}(k)\) is the Fourier coefficient corresponding to frequency \(k\). If this maximum is large, the signal has a strong linear harmonic, capturing a significant part of its structure. Conversely, if \(\|\hat{f}\|_\infty\) is small, the signal’s energy is spread across many frequencies, and Fourier methods may struggle to capture its structure efficiently.

Classical Fourier analysis through a different lens: the Gowers \(U^2\)-norm #

To introduce this concept, let us consider a typical case of a periodic 1-dimensional signal where we have a sample of \(n\) points. We represent such a signal with a function \(f:\mathbb{Z}/n\mathbb{Z}\to \mathbb{C}\). Recall that \(\mathbb{Z}/n\mathbb{Z}\) is simply the set of integers \(\{0,\ldots,n-1\}\) with addition modulo \(n\) (meaning that if we sum two elements of \(\{0,\ldots,n-1\}\) and the sum is greater than \(n\), then we substract \(n\) from the sum). Then, the Gowers \(U^2\)-norm is defined as follows.

The Gowers \(U^2\)-norm: Let \(f:\mathbb{Z}/n\mathbb{Z}\to \mathbb{C}\) be a function. The Gowers \(U^2\)-norm of \(f\) is given by the formula

\[\|f\|_{U^2}^{4} = \mathbb{E}_{x,t_1,t_2\in \mathbb{Z}/n\mathbb{Z}} f(x)\overline{f(x+t_1)}\overline{f(x+t_2)}f(x+t_1+t_2)\]

where \(\mathbb{E}_{x,t_1,t_2\in \mathbb{Z}/n\mathbb{Z}}\) stands for \(\tfrac{1}{n^{3}}\sum_{x=0}^{n-1}\sum_{t_1=0}^{n-1} \sum_{t_2=0}^{n-1}\), i.e. it is simply the average over the set \((\mathbb{Z}/n\mathbb{Z})^{3}\).

We claim that this norm plays an analogous role to that of the maximum magnitude of the Fourier coefficients.

A more precise relationship between the Gowers \(U^2\)-norm and the Fourier \(\ell^\infty\)-norm.#

More generally, for a function \(f:\mathbb{Z}/n\mathbb{Z}\to \mathbb{C}\) whose absolute value is at most 1, i.e. \(|f(x)|\le 1\) for all \(x\in \mathbb{Z}/n\mathbb{Z}\), we have that

\[\|\widehat{f}\|_\infty \le \|f\|_{U^2} \le \|\widehat{f}\|_\infty^{1/2}.\]

Recall that the size of \(\|\widehat{f}\|_\infty\) indicates whether a function can or cannot be approximated by a few Fourier characters. The previous inequalities tell us that the \(U^2\)-norm is essentially equivalent to the norm \(\|\widehat{f}\|_\infty\), and can therefore be used to similar effects.

Optional Exercise

Try to prove the previous inequalities. Hint: write \(f\) as the inverse of its Fourier transform. Plug that expression into the definition of the \(U^2\)-norm and simplify. This should yield that the \(U^2\)-norm is equal to the \(\ell^4\)-norm of the Fourier transform. To conclude, apply Parseval’s identity.

Generalizing to higher-order structure #

For higher-order structures (e.g., polynomial phases), there is no direct analog of \(\|\hat{f}\|_\infty\) in the frequency domain. However, a natural generalization emerges when we shift our perspective from frequency space (mathematically, the dual of the original abelian group) to physical space (the original abelian group). Instead of measuring concentration in the Fourier domain, we can work directly in the original group to define a hierarchy of detectors that measure how much a function correlates with polynomial phases of increasing complexity. This is precisely the role of the Gowers norms: they quantify, directly in the physical domain, the extent to which a function exhibits higher-order structure.

To introduce these norms, let us stay focused on the case of periodic signals defined on \(n\) points, that is, on functions \(f:\mathbb{Z}/n\mathbb{Z}\to \mathbb{C}\):

Gowers norms: Let \(k\ge 2\) be an integer and let \(f:\mathbb{Z}/n\mathbb{Z}\to \mathbb{C}\) be any function. Then, the Gowers \(U^k\)-norm of \(f\) is given by the formula

\[\|f\|_{U^k}^{2^k} = \mathbb{E}_{x,t_1,\ldots,t_k\in \mathbb{Z}/n\mathbb{Z}} \prod_{w\in \{0,1\}^{k}} \mathcal{C}^{w_1+\cdots+w_k}f(x+w_1t_1+\cdots+w_kt_k)\]

where \(\mathbb{E}_{x,t_1,\ldots,t_k\in \mathbb{Z}/n\mathbb{Z}}\) stands for \(\tfrac{1}{n^{k+1}}\sum_{x=0}^{n-1}\sum_{t_1=0}^{n-1}\cdots \sum_{t_k=0}^{n-1}\) and \(\mathcal{C}^{n}\) is the conjugation operator applied \(n\) times.

Note

The Gowers norms are defined analogously for functions on any finite abelian group. Here we presented the notion in the simple setting of the groups \(\mathbb{Z}/n\mathbb{Z}\) for expository reasons, but the norms may very well be defined for 2-dimensional signals (on \(\mathbb{Z}/n\mathbb{Z}\times \mathbb{Z}/m\mathbb{Z}\)) or even for high-dimensional vector spaces, e.g. on \((\mathbb{Z}/2\mathbb{Z})^d\) for any \(d\), a case of particular interest for coding theory.

While this may at first look like a generalization for the sake of it, we soon find that it is highly relevant even with simple examples, in particular with chirps.

Higher-order structure measured by the \(U^k\)-norms #

Note

There are several ways of describing what kind of structure is detected by the Gowers norms. Here, we are going to explain the one that is related to the HoFa package. For a more complete picture see Theoretical foundations and extensions.

The case of the \(U^2\)-norm #

The best way to interpret how the HoFa package decomposes functions is first to recall how classical Fourier analysis decomposes functions depending on the magnitudes of their Fourier coefficients. Continuing with our example, let \(f:\mathbb{Z}/n\mathbb{Z}\to \mathbb{C}\) be a function of absolute value at most 1. Then we have the decomposition

\[f = f_s+f_r\]

where

\[f_s = \sum_{|\widehat{f}(\chi)|\text{ large}} \widehat{f}(\chi)\chi \quad \text{and} \quad f_r = \sum_{|\widehat{f}(\chi)|\text{ small}} \widehat{f}(\chi)\chi.\]

The term \(f_r\) is the \(U^2\) noise term, i.e. a term whose Fourier amplitudes are all small. The other term, \(f_s\), is the structured term, which has a sparse Fourier description, typically consisting in a linear combination of a few Fourier characters.

The case of the \(U^3\)-norm #

When we go one step higher, to the \(U^3\)-norm, we could expect that a similar decomposition holds but with quadratic phase polynomials, e.g. \(\exp(2\pi i x^2)\), instead of the usual Fourier characters. It turns out that this is (almost) the case. Indeed, with respect to the \(U^3\)-norm, we also have (approximately) a decomposition of the form

\[f = f_s+f_r\]

where (as expected) \(f_r\) is the noise term with respect to the \(U^3\)-norm and

\[f_s = \sum_{\ell=0}^{M-1} g_\ell\]

for some small (i.e. bounded) integer \(M\), where the functions \(g_\ell\) (\(\ell\in \{0,\ldots,M-1\}\)) are what we call dominant quadratic characters of \(f\).

But what are these quadratic characters? Their defining feature can be explained starting again from classical Fourier characters. Note that any Fourier character \(\chi(x)=\exp(2\pi i r x/n)\) has the property that its multiplicative derivatives are constant functions, that is, for any element \(h\in \mathbb{Z}/n\mathbb{Z}\), the multiplicative derivative \(\Delta_h(\chi)(x):=\chi(x+h)\overline{\chi(x)}\) equals the constant function \(\exp(2\pi i r h/n)\). It can be proved that, actually, this is essentially a defining property of Fourier characters. Taking this idea one step higher, it turns out that quadratic characters \(g\) can also be defined by a property concerning their multiplicative derivatives \(\Delta_h(g)\). But what property?

The property in question stands in relation to quadratic Fourier analysis analogously to how the property of being a constant function stands relative to linear (classical) Fourier analysis. A good such property turns out to be that each multiplicative derivative of \(g_\ell\) should be expressible (approximately) with only a few Fourier characters.

Note that this property is very much satisfied by quadratic phase polynomials (or linear chirps): if \(g(x)=\exp(2\pi i r x^2/n)\), then \(\Delta_h(g)(x)\) equals the single Fourier character \(\exp(2\pi i r 2 hx/n)\) multiplied by a constant (depending on \(h\)).

Important observation

In the case of classical Fourier analysis, if we multiply a (classical) Fourier character by a constant we obtain a function whose multiplicative derivatives are also constant. Thus, a linear character (using this more general definition of a function whose multiplicative derivatives are constant) multiplied by a \(U^1\)-structured function (i.e. a constant) is again a linear character (in the sense that its multiplicative derivatives are constant).

Note now that in the quadratic case we have a similar phenomenon. If we let \(g(x)\) be a quadratic character and \(p(x)\) be a \(U^2\)-structured function, i.e. the sum of few (classical) Fourier characters mutiplied by constants, then the product \(p(x)g(x)\) is also a quadratic character. The proof is simply noting that \(p(x+h)g(x+h)\overline{p(x)g(x)} = \Delta_h(g(x))\Delta_h(p(x))\) and on the one hand \(\Delta_h(g(x))\) is the sum of few Fourier characters, by definition of quadratic character, and on the other \(\Delta_h(p(x))\) is as well, simply by unfolding the fact that \(p\) is itself the sum of few Fourier characters.

This observation has the important consequence of implying that, if we consider a sum of chirps like

\[\exp(2\pi i x^2/n)+\exp(2\pi i (x^2+x+4)/n)+\exp(2\pi i (x^2+10x)/n),\]

with respect to higher-order Fourier analysis this is a single quadratic character, and will be treated as such by the HoFa package. Indeed, it equals \(\exp(2\pi i x^2/n)(1+\exp(2\pi i (x+4)/n)+\exp(20\pi i x/n))\), which falls into the class of quadratic characters described above (a quadratic phase function times the sum of few Fourier terms).

Important

More general quadratic characters may involve objects more complicated than periodic quadratic phase polynomials. A more detailed discussion can be found in Theoretical foundations and extensions. However, quadratic phase polynomials are particularly clean examples to keep in mind, in addition to being highly relevant for applications (e.g. as chirps).

The general case #

If we keep climbing to higher orders, we will find that a similar picture emerges, and in general we will have decompositions of the form

\[f = f_s+f_r\]

where \(f_r\) is the \(U^k\) noise term and

\[f_s = \sum_{\ell=0}^{M-1} g_\ell,\]

where the components \(g_\ell\) are now the dominant components of \(f\) of order \(k-1\). A notion of a function \(g\) being a character of order \(k-1\) can be defined similarly, in terms of its multiplicative derivatives being all so-called structured functions of order \(k-2\). More details on these ideas can be found in Theoretical foundations and extensions.

The generalizations of orthogonality #

It turns out that such general decompositions enjoy some (approximate) notion of uniqueness and orthogonality, similar to the properties enjoyed by classical Fourier characters. Indeed, if \(f = \sum_{\ell=0}^{M-1} g_\ell+f_r\) where the functions \(g_\ell\) are the dominant higher-order characters of \(f\), then we have that they satisfy the following property:

Approximate orthogonality: For any two distinct \(g_\ell, g_{\ell'}\) we have that \(|\langle g_\ell,g_{\ell'}\rangle|\) is small.

And, possibly more interestingly, we have the following:

Approximate higher-order orthogonality: For any two distinct \(g_\ell, g_{\ell'}\), if we define the “higher-order product” \(\langle g_\ell, g_{\ell'}\rangle_{U^k}\) as

\[\mathbb{E}_{x,t_1,\ldots,t_k\in \mathbb{Z}/n\mathbb{Z}} \prod_{w\in \{0,1\}^{k-1}} \mathcal{C}^{w_1+\cdots+w_{k-1}}g_\ell(x+w_1t_1+\cdots+w_{k-1}t_{k-1}) \overline{\prod_{w\in \{0,1\}^{k-1}} \mathcal{C}^{w_1+\cdots+w_{k-1}}g_{\ell'}(x+w_1t_1+\cdots+w_{k-1}t_{k-1}+t_k)}\]

then we have that \(|\langle g_\ell, g_{\ell'}\rangle_{U^k}|\) is also small. Note that this formula is very similar to the one defining the \(U^k\)-norm, and this is not a coincidence. Indeed, the formula for the product \(\langle g_\ell, g_{\ell'}\rangle_{U^k}\) can be obtained from that of (the \(2^k\) power of) the \(U^k\)-norm replacing one of the functions in half of the product and the other function in the opposite half of the product. In particular, note that \(\langle f, f\rangle_{U^k} = \|f\|_{U^k}^{2^k}\).

What the HoFa package can do #

Once we understand how higher-order Fourier analysis handles quadratic (and higher-order) polynomial phase functions, the natural question arises:

What does the HoFa package do?

The straightforward answer is that HoFa provides the following core functionalities:

Compute Gowers norms of a function.
Remove the \(U^k\) noise term from a function.
Identify the dominant higher-order characters of a function.

We now invite the reader to revisit (or try for the first time) the Tutorial - A first encounter with the HoFa package (Checked: D), equipped with the insights gained from this section.

Computational complexity #

The current implementation of HoFa uses a recursive approach with NumPy, which prioritizes clarity and modularity in the codebase. While this design choice improves readability and maintainability, it does not fully exploit the potential for computational efficiency. Nevertheless, the implementation achieves good time complexity for typical use cases.

A parallelized implementation will significantly improve execution speed, particularly for large-scale computations. This parallel version will leverage multi-core processing to distribute the workload, though it will come at the cost of increased memory usage due to the overhead of parallel execution.

The bottleneck in both approaches is the computation of the eigenvalues and eigenvectors of a self-adjoint matrix, see Tutorial - Denoising (Checked: D) for the actual implementation. While a naive approach for eigendecomposition usually takes \(O(n^3)\) operations, if we are interested only in the few largest eigenvalues and corresponding eigenvectors (as in the case of the HoFa package) the computational complexity is reduced by using the Implicitly Restarted Lanczos Method. This is done in HoFa by using the method scipy.sparse.linalg.eigsh.

Using this method, if we want to compute the \(m\ll n\) largest eigenvalues and corresponding eigenvectors of an \(n\times n\) matrix, the computational complexity is \(O(mn^2)\). This improvement is already implemented in HoFa, which allows tunable ways of computing the eigendecomposition of a self-adjoint matrix.

Assuming that for the eigendecompositions present in the algorithm we only require the largest \(m\) eigenvalues (which typically can be fixed, or allowed to grow up to \(O(\log n)\), say), the complexity of the main algorithms in HoFa (namely hofa.rgz.regularize() to remove the \(U^k\) noise term, and hofa.char.spechoft() to identify the dominant higher-order characters), is as follows (for an imput of size \(n\)):

	Quadratic (\(U^3\)) case	General (\(U^k\)) case
Recursive (Current)	Time: \(O(mn^2+n^2\log(n))\) Memory: \(O(n^2)\)	Time: \(O(mn^{k-1}+n^{k-1}\log(n))\) Memory: \(O(kn^2)\)
Parallelized (Theoretical)	Time: \(O(mn^2)\) Memory: \(O(n^3)\)	Time: \(O(kmn^2)\) Memory: \(O(n^k)\)

Important

To implement the parallelized version, a parallelized Implicitly Restarted Lanczos Method should be implemented. However, this is only necessary for the general case (cubic or higher order), but for the quadratic case (for the \(U^3\)-norm), it suffices to have a non-parallelized version. We are currently working to implement this functionality in HoFa.

Conceptual background of higher-order Fourier analysis#

The case of classical Fourier analysis #

Classical Fourier analysis through a different lens: the Gowers \(U^2\)-norm #

A more precise relationship between the Gowers \(U^2\)-norm and the Fourier \(\ell^\infty\)-norm.#

Generalizing to higher-order structure #

Higher-order structure measured by the \(U^k\)-norms #

The case of the \(U^2\)-norm #

The case of the \(U^3\)-norm #

The general case #

The generalizations of orthogonality #

What the HoFa package can do #

Computational complexity #

Table of Contents

Previous topic

Next topic