ASTRON & IBM Center for Exascale Technology

Compressive Sampling

This project consists of fundamental research into tailored signal processing and machine learning algorithms for the capture, processing, and analysis of radio astronomy data for the next generation of radio telescopes (LOFAR, SKA). The primary focus is on methods such as compressive sampling (also known as compressive sensing), signal processing using algebraic systems, machine learning, and pattern recognition. The project will advance the techniques used for radio-astronomical data analysis, and will pose new challenges that drive the development of machine learning, compressive sampling, and other signal processing algorithms. The results of the jointly undertaken research are of direct relevance, for instance, to VLSI analysis, which relies heavily on Fourier analysis (due to optical lithography), and to pattern recognition and data analysis of enormous datasets (e.g., applied to problems of printability prediction).

Traditional capture and processing of analogue signals consists of two steps: sampling followed by compression. If the signal is band-limited, it is sampled at a rate exceeding the Nyquist rate, i.e., twice its bandwidth. Lossy compression techniques (e.g., JPEG for an image) then discard much of the acquired information. Compressive sampling (CS) shows that it is not necessary to gather all of this information only to throw most of it away: if the signal is sparse in some basis, it can be recovered from far fewer samples, a result that runs counter to conventional wisdom. Previously infeasible tasks become possible, such as sampling a very high-bandwidth signal or achieving the required accuracy with far fewer sensors. This can yield power savings and reduce the amount of circuitry needed.
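The core CS claim, that a sparse signal can be recovered from far fewer measurements than its ambient dimension, can be demonstrated with a minimal sketch. The example below is illustrative only (the signal sizes, the Gaussian measurement matrix, and the use of orthogonal matching pursuit as the recovery algorithm are our assumptions, not the project's method):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse signal: n samples, only k nonzero entries.
n, m, k = 256, 64, 5
x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.normal(size=k)

# Random Gaussian measurement matrix: m << n linear measurements.
A = rng.normal(size=(m, n)) / np.sqrt(m)
y = A @ x

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily pick the column most
    correlated with the residual, then re-fit by least squares."""
    residual = y.copy()
    idx = []
    for _ in range(k):
        idx.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
        residual = y - A[:, idx] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[idx] = coef
    return x_hat

x_hat = omp(A, y, k)
```

Despite taking only 64 measurements of a 256-sample signal, the 5-sparse vector is recovered essentially exactly, because the random measurements preserve the geometry of all sparse vectors with high probability.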

In this project we focus on the application of CS to radio astronomy, a promising target for its exploitation. Indeed, work combining CS and radio astronomy has already emerged, e.g., by Wiaux et al., who exploit CS recovery using prior signal information and report potentially significant performance improvements relative to CLEAN, the standard local matching pursuit algorithm. Our goal in this project is to develop CS algorithms that can be used in data capture and calibration.
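To make the comparison concrete, CLEAN itself is a matching pursuit: it repeatedly finds the brightest peak in the residual ("dirty") image and subtracts a scaled copy of the instrument's point spread function there. A toy 1-D Högbom-style sketch (all sizes, the sinc-shaped beam, and the gain/threshold values are illustrative assumptions):

```python
import numpy as np

def clean_1d(dirty, psf, gain=0.2, n_iter=2000, threshold=1e-4):
    """Hogbom-style CLEAN on a 1-D signal (illustrative sketch).

    dirty : observed signal, i.e. the true sky convolved with the psf
    psf   : peak-normalized point spread function, same length as dirty
    """
    residual = dirty.astype(float).copy()
    model = np.zeros_like(residual)
    center = int(np.argmax(psf))              # index of the PSF peak
    for _ in range(n_iter):
        p = int(np.argmax(np.abs(residual)))  # brightest residual peak
        if np.abs(residual[p]) < threshold:
            break
        amp = gain * residual[p]
        model[p] += amp
        # Subtract a shifted, scaled copy of the PSF from the residual.
        residual -= amp * np.roll(psf, p - center)
    return model, residual

# Toy example: a sinc-shaped beam and two point sources.
n = 128
psf = np.sinc((np.arange(n) - 64) / 4.0)
dirty = 1.0 * np.roll(psf, 40 - 64) + 0.5 * np.roll(psf, 80 - 64)
model, residual = clean_1d(dirty, psf)
```

The model converges to the two point sources with their correct amplitudes; CS-based recovery replaces this greedy peak-subtraction loop with sparse reconstruction that can incorporate prior signal information.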

Machine learning techniques owe their growing popularity and prominence to their capacity to discover patterns embedded in large, multi-dimensional data sets.

Detection of large numbers of clusters

Cluster analysis and cluster detection is one of the most important aspects of unsupervised learning. Traditional techniques have been developed to detect a few clusters in a data set and often fail when the number of clusters is very large. In such situations, the legacy approach of re-running a clustering algorithm with different hypothesized numbers of clusters also fails, due to the immense computational burden. We are currently developing new methods for detecting high numbers of clusters.
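The legacy approach mentioned above can be sketched as follows: run k-means for each hypothesized cluster count and inspect how the within-cluster scatter drops. This is a toy numpy illustration (the data, the k-means++ seeding, and the restart counts are our assumptions), and it makes the cost problem visible: every candidate k requires a full clustering run, which is untenable when k may be in the thousands.

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    # k-means++ seeding: spread initial centers out, choosing each new
    # center with probability proportional to its squared distance
    # from the nearest center picked so far.
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min(((X[:, None, :] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)

def kmeans(X, k, n_iter=50, n_restarts=3, rng=None):
    """Lloyd's algorithm with restarts; returns the best within-cluster
    sum of squares (WCSS) found."""
    if rng is None:
        rng = np.random.default_rng(0)
    best = np.inf
    for _ in range(n_restarts):
        centers = kmeans_pp_init(X, k, rng)
        for _ in range(n_iter):
            d2 = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
            labels = d2.argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = X[labels == j].mean(axis=0)
        best = min(best, ((X - centers[labels]) ** 2).sum())
    return best

# Toy data: three well-separated Gaussian blobs (true k = 3).
rng = np.random.default_rng(1)
means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([m + rng.normal(scale=0.5, size=(100, 2)) for m in means])

# Sweep hypothesized k: one full clustering run per candidate.
wcss = np.array([kmeans(X, k, rng=rng) for k in range(1, 7)])
```

The WCSS curve drops steeply up to the true k and flattens afterwards (the "elbow"), but locating that elbow requires clustering the data once per candidate k, which is exactly the computational burden the paragraph describes.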

Degraded data quality is one of the main problems that pattern recognition algorithms must contend with. However, by defining appropriate quality measures for signals, one can improve classification accuracy. We are currently pursuing the problem of defining multiple quality measures using statistical dependence criteria and of reducing the dimensionality of the resulting feature set. We would like to define the quality measures directly on the measured frequency data, before moving to the image domain. In this way, such quality measures could be incorporated into a system used for purposes such as outlier detection and object detection and classification.
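As a minimal sketch of what a frequency-domain quality measure could look like (the peak-to-median statistic, the data sizes, and the threshold are illustrative assumptions, not the measures under development in the project): a noise-like spectrum scores low, while a spectrum corrupted by narrowband interference scores high, so thresholding the score flags outliers before any imaging step.

```python
import numpy as np

rng = np.random.default_rng(2)

def quality(spectrum):
    """Peak-to-median power ratio: a simple frequency-domain quality
    score. Clean noise-like spectra score low; a narrowband spike
    (e.g., radio-frequency interference) scores high."""
    power = np.abs(spectrum) ** 2
    return power.max() / np.median(power)

# Toy data: 50 noise-only spectra plus 5 with an injected narrowband spike.
n_chan = 512
spectra = rng.normal(size=(55, n_chan)) + 1j * rng.normal(size=(55, n_chan))
spectra[50:, 100] += 40          # strong interference in one channel

scores = np.array([quality(s) for s in spectra])
# Flag spectra whose score is far above the typical (median) score.
outliers = np.flatnonzero(scores > 10 * np.median(scores))
```

The five corrupted spectra separate cleanly from the rest; in a real system several such measures would be combined, their statistical dependence analysed, and the resulting feature set reduced in dimension as described above.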