It's true that the Torah -- the visible Torah, that is -- is only one
of the possible permutations of the letters of the eternal Torah,
as God crated it and delivered it to the angels. By rearranging the
letters of the book over the centuries, we may someday arrive again at
the original Torah. But the important thing is not the finding, it is
the seeking, it is the devotion with which one spins the wheel of the
prayer and scripture, discovering the truth little by little.
Diotallevi in Foucault's Pendulum, by Umberto Eco
Using the quote above is, perhaps, ironic, since the character, Diotallevi, in Umberto Eco's Foucault's Pendulum, goes on in the next sentence to denounce the use of computers as tools for seeking truth.
Signal processing and filtering is, in its modest way, an attempt to find a better form for a set of information, either by reshaping it or filtering out selected parts (parts that are sometimes labeled as noise). Put another way, signal processing allows us to uncover a form of the signal that is closer to the truth (or a truth).
Although we have powerful computing and mathematical tools, perhaps there is some value in taking Diotellevi warning to heart: your machine may bring you delirium instead of ecstasy. Inexpensive high speed computing, modern statistics and mathematics sometimes seem like the modern version of rune marked bones, thrown by a shaman in cave lit by a smoky tallow lamp. It remains to be seen whether computing and mathematics are more reliable than the bones at unmasking truth and predicting the future. In theory statistics give us tools that tell us how successful our techniques are. But statistical tests rarely give us a final answer.
The grains of understanding that I have gathered regarding wavelets has come from reading several books and a number of journal articles. The wavelet web pages published here cannot replace this literature. In fact, what is written here is largely incomplete without this literature. My point of view differs from most authors who write on wavelets. I am not a mathematician. I am a software engineer and my interests in wavelets comes from an applied point of view. A software engineer wants to know what the algorithms is. Most software engineers are only interested in proofs only when computer simulation cannot show that the algorithm is robust and reliable (the algoritm cannot be tested through a complete range of values). As I understand the mathematicians view, they are interested in proof, because that is the nature of mathematics.
In most cases when a topic is discussed, I have published the Java or C++ code. This code is documented with either Javadoc or doxygen formatted comments and I have tried to include extensive documentation. Working software is missing from much of the wavelet literature, so I hope that this is a useful contribution. I hope that the material published here will make your journey easier and faster than mine has been.
These web pages evolved over a couple of years. This was entirely unplanned. I never expected that I would spend so much time on a topic which previously seemed alien to my background. As I've noted in the discussion of Haar wavelets, I stumbled into wavelets by accident. Wavelets lead me to classical signal processing (e.g., signal processing based on the Fourier transform), since this provides the foundation for signal processing using the wavelet transform. As I read more of the literature on wavelets, I found a wide breadth of applications for wavelets. I also found interconnections between wavelets other areas like fractal mathematics. As I found more interconnections between wavelets and other areas, I found it difficult to disentangle myself from this fascinating web of ideas.
The nature of my discovery and fascination has followed a wandering course, which is reflected in the structure of these web pages. As I read the literature on wavelets, finance, statistics, fractals and other areas, I wrote the web pages that are cataloged here, as a reference for myself and for others.
I originally became interested in wavelets because I had some fuzzy idea that the wavelet transform might have some application in quantitative finance. My early web pages explore simple Haar wavelets and an even more simplistic (and in some cases misguided) application of Haar wavelet based filters to financial time series. The later web pages were written as I came to understand other wavelet functions through the lens of the "lifting scheme". As time has gone on I have concentrated more on applications, since I now have a number of wavelet tool hanging on my "peg board". The web page on time series forecasting is built on a broad foundation which includes a collection of wavelet functions, wavelet packets, lossless wavelet compression, histograms, statistical functions, Gaussian distributed random numbers and some fractal mathematics (the Hurst exponent). The background for this web page also include my modest sampling of the literature on quantitative finance and economics.
One of the things that is missing so far is a centralized annotated bibliography. This is a side effect of the evolutionary nature of these web pages. Rather than having a single bibliography, the references to literature (books, articles and web pages) are scattered among the web pages. Many of the basic references can be found on the web page that discusses the applicaton of the the Haar transform to time series information.
Fourier analysis, using the Fourier transform, is a powerful tool for analyzing the components of a stationary signal (a stationary signal is a signal that repeats). For example, the Fourier transform is a powerful tool for processing signals that are composed of some combination of sine and cosine signals.
The Fourier transform is less useful in analyzing non-stationary data, where there is no repetition within the region sampled. Wavelet transforms (of which there are, at least formally, an infinite number) allow the components of a non-stationary signal to be analyzed. Wavelets also allow filters to be constructed for stationary and non-stationary signals.
Although Haar wavelets date back to the beginning of the twentieth century, wavelets as they are thought of today are new. Wavelet mathematics is less than a quarter of a century old. Some techniques, like the wavelet packet transform are barely ten years old. This makes wavelet mathematics a new tool which is slowly moving from the realm of mathematics into engineering. For example, the JPEG 2000 standard is based on the wavelet lifting scheme.
The Fourier transform shows up in a remarkable number of areas outside of classic signal processing. Even taking this into account, I think that it is safe to say that the mathematics of wavelets is much larger than that of the Fourier transform. In fact, the mathematics of wavelets encompasses the Fourier transform. The size of wavelet theory is matched by the size of the application area. Initial wavelet applications involved signal processing and filtering. However, wavelets have been applied in many other areas including non-linear regression and compression. An offshoot of wavelet compression allows the amount of determinism in a time series to be estimated.
All of the wavelet algorithms that I am aware of must be applied to a data set (a time series or a signal) that with a power of two number of elements (e.g., 256 elements = 28).
All of the algorithms discussed on these web pages are ordered wavelet transforms. This means that the result consists of a wavelet scaling function value (also known as a smooth value or a low pass filter value), followed by bands of wavelet function values (sometimes called wavelet coefficients), in increasing frequency. The sizes of these wavelet coefficient bands are ordered in increasing powers of two. If there are N elements in the data set (where N is a power of two) The coefficient bands following the scaling value will have sizes 20, 21, 22 ... N/2.
The wavelet coefficient bands represent change in the signal at a paticular resolution.
This is the earliest web page I wrote on wavelets. It provides some background. The ordered Haar wavelet transform code that is linked to on this web page is not as simple and elegant as the ordered Haar transform implemented via the lifting scheme.
Each of the wavelet web pages has links to Java (or in some cases Java and C++ source code). This page collects these links.
This web page discusses how the forward and inverse wavelet transform can be implemented using linear algebra operations (matrix/vector multiplication). This is useful in understanding the Daubechies wavelet transform.
This web page discusses the Daubechies D4 wavelet transform. This web page publishes two versions of the Daubechies transform, one implemented via the Lifting Scheme.
The wavelet Lifting Scheme has been developed by Wim Sweldens and others relatively recently (e.g., in the last decade or so). When wavelet algorithms are expressed in a lifting scheme structure they are more efficient (no temporaries are required) and they are simpler and easier to understand.
This web page discusses wavelet based spectral analysis and filtering. Wavelet spectral analysis plots provide a graphical representation of the filtering behavior of a particular wavelet transform. This allows a wavelet filter to be constructed that selects particular wavelet coefficient bands for a given filter.
This is a companion web page for Spectral Analysis and Filtering with the Wavelet Transform. It discusses some serious problems with the polynomial interpolation wavelet.
There are many processes which have a random (stochastic) component, but also exhibit some predictability between one element and the next. In statistics this is sometimes described as autocorrelation (the correlation of a data set with with shifted versions of the data set). Autocorrelation is one measure of whether a past value can be used to predict a future value.
A random process that has some degree of autocorrelation is referred to as a long memory process (or long range dependence). River flow shows this kind of long term dependence. The height of the river waters follows a random pattern, which is seasonally influenced. Within a season the trends are influenced by rainfall. A hydrologist, named Hurst, studied Nile river flows and reservoir modeling. The Hurst exponent discussed here takes its name from its discoverer. The Hurst exponent is related to the fractal dimension and appears in the analysis of self-similar 1-D data sets.
A number of people have applied the Hurst exponent to financial data sets, which may show some amount of long range dependence. I originally hoped to use the Hurst exponent to verify the results I got from using wavelet compression as an estimator for predictability in financial time series.
The Hurst exponent is intellectually interesting because it occurs in a surprising number of areas in applied mathematics. Also, the Hurst exponent can be calculated using the Wavelet transform. As you might guess from these web pages, I find applications of the wavelet transform difficult to resist. The Hurst exponent has found practical application in computer network traffic simulation and analysis. I will warn you up front that it seems to be a dead end for financial modeling and prediction.
To develop and verify the wavelet Hurst exponent calculation code, a fair amount of infra-structure had to be built. This support code is discussed on the web pages that are listed below:
The basic statistics functions described and implemented in C++ are linear regression, which is used by many methods for calculating the Hurst exponent (R/S, Fourier transform scalogram, wavelet scalogram). The autocorrelation function, standard deviation and probability density function calculation.
Generating a random walk data set or fractional brownian motion (which is scaled for a particular Hurst exponent) requires a source of high quality Gaussian pseudo-random numbers. This web page discusses some techniques for calculating Gaussian distributed pseudo-random numbers. I have used the functions provided by GNU Scientific Library (GSL) for Gaussian pseudo-random numbers.
In the book Essential Wavelets for Statistical Applications and Data Analysis (by R. Todd Ogden) and in a set of papers on database query optimization, wavelet techniques are applied to histogram creation. I have found that the coverage of wavelets applied to histograms is obscure. This web page explores wavelet techniques applied to histograms and finds that wavelets seem to be more computationally expensive than other techniques, without providing any obvious advantages.
A more promising application of wavelets is density estimation, which is related on a basic level to histogram creation. In many cases a data sample does not include enough elements to display the smooth probability distribution that would be apparent in a larger sample. A data sample may also contain noise, which obscures the distribution. Wavelets are one technique where a non-parametric function can be fit to the data distribution. This fit will remove (or partially remove) noise and can approximate the underlying distribution of the data.
The Wavelet Packet Transform
This web page discusses the Wavelet Packet Transform and the associated best basis algorithm. The inverse wavelet packet transform, calculated from the best basis set, is also described. An implementation in C++ can be downloaded.
A slightly modified version of the wavelet packet transform allows the time varying nature of a signal to be analyzed.
This web page discusses integer to integer wavelet transforms and an integer version of the wavelet packet transform. Integer algorithms are useful in lossless wavelet data compression algorithms.
The algorithms discussed here are 1-D compression algorithms (e.g., for signals or time series). These can be directly generalized to 2-D image compression algorithms.
Data compression is interesting topic and has a variety of applications. Along with the practical applications in storage and data transmission, predictive compression (of which wavelet compression is one variety) can be used to measure of the amount of determinism in a time series. This may be useful in short term time series forecasting.
Doxygen generated documentation for the C++ wavelet packet source code.
This web page is under construction
A time series is a data set that varies through time (e.g., each data element is associated with a time element). Examples of time series include the water level of the Nile river in Egypt, sun spots, a digitally sampled Jazz performance and the neutron emission of a radio active element over time as the element decays.
Wavelet compression can be used to estimate the amount of determinism in a particular region of a time series (or, looked at another way, the amount of noise). By estimating the amount of determinism that is present in a region of a time series, the success of a forecasting technique in that region of the time series can also be estimated.
A Notebook compiled while reading Understanding Digital Signal Processing by Richard G. Lyons, Prentice Hall, 2001
Some of the thought, mathematics and terminology underlying wavelets is derived from digital signal processing (the Fourier transform and digital filters). Richard G. Lyons books is the best introduction I've seen on digital signal processing. I only wish that a similar book had been written on wavelets.
I strongly believe that it is important to acknowledge and thank those who help an author in their path between an idea and a published work. In working on wavelets and, to a lesser extent, Fourier analysis, I have had many helpful discussions with colleagues.
I would like to thank Chris Strom for his encouragement and his generosity with his time.
Over time I hope to have the opportunity to add to the list above. For right now I don't feel free to thank all of my colleagues by name. I still hope that if any of these individuals make their way here, they will accept my warmest thanks. It was a great pleasure working with you and I appreciate your patience with my questions and half baked ideas.
I am the sole author of these web pages and they have not been written in direct consultation with anyone. Any errors are mine.
Ian Kaplan, 2001, 2002, 2003