Functional Principal Component Analysis

Introduction

Functional Principal Component Analysis (FPCA) is a generalization of PCA in which entire functions act as samples ($X \in L^2(\mathcal{T})$ over a time interval $\mathcal{T}$) instead of vectors of scalar values ($X \in \mathbb{R}^p$). FPCA can be used to find the dominant modes of variation in a set of functions. One of the central ideas is to replace the scalar product $\beta^T x = \left\langle \beta, x \right\rangle = \sum_j \beta_j x_j$ with its functional equivalent $\left\langle \beta, x \right\rangle = \int_{\mathcal{T}} \beta(s) x(s) \, ds$.
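To make the functional scalar product concrete, here is a minimal sketch that approximates the integral with the trapezoidal rule on a sampled grid. The function name `inner_product` and the sine test functions are illustrative choices, not part of the original analysis.

```python
import numpy as np


def inner_product(beta, x, s):
    """Approximate <beta, x> = integral of beta(s) * x(s) ds (trapezoidal rule)."""
    f = np.asarray(beta) * np.asarray(x)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(s)))


# Illustrative functions sampled on a grid over T = [0, 1].
s = np.linspace(0.0, 1.0, 1001)
sine = np.sin(2 * np.pi * s)
cosine = np.cos(2 * np.pi * s)

norm_sq = inner_product(sine, sine, s)    # integral of sin^2 over one period is 1/2
ortho = inner_product(sine, cosine, s)    # sine and cosine are orthogonal on [0, 1]
```

With the grid replaced by a vector index and the integral by a sum, this reduces to the ordinary dot product used in PCA.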

Temperature in Gothenburg

Using data from SMHI, we are going to look at variations of temperature over the year in Gothenburg.

temperature_per_month_gothenburg

The data spans from 1961 to today. All measurements have been averaged per month and grouped by year, so each year is one curve of twelve monthly values. Before we can do an FPCA, we need to remove the mean from the data.

temperature_per_month_gothenburg_no_mean
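The centering step could look like the sketch below. It assumes the data is arranged as a matrix with one row per year and one column per month, and that "the mean" is the average curve across years. The values are synthetic stand-ins, not the SMHI measurements.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: rows are years, columns are months.
n_years, n_months = 60, 12          # hypothetical sizes
month = np.arange(n_months)
seasonal = 8.0 - 9.0 * np.cos(2 * np.pi * month / n_months)   # made-up seasonal shape
temps = seasonal + rng.normal(scale=1.5, size=(n_years, n_months))

# Assumption: the mean removed is the average curve over all years.
mean_curve = temps.mean(axis=0)     # one value per month
centered = temps - mean_curve       # broadcasting subtracts it from every year
```

After this step every month averages to zero across years, so what remains are the year-to-year deviations.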

The first principal component of the data, which explains 94% of the total variation, is unsurprisingly the variation over seasons. The second and third principal components follow at 1.8% and 0.95%, respectively.
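As a rough sketch of where such percentages come from: on an equally spaced grid, a discretized FPCA reduces to an SVD of the centered data matrix, with a constant quadrature weight `ds` rescaling the components. The data below is synthetic, months are treated as equally long, and the numbers it produces are unrelated to the 94%, 1.8% and 0.95% reported above.

```python
import numpy as np

rng = np.random.default_rng(1)

n_years, n_months = 60, 12
s = np.arange(n_months) / n_months      # month grid rescaled to [0, 1)
ds = 1.0 / n_months                     # uniform quadrature weight (assumption)

# Made-up curves: a seasonal shape with a random amplitude per year, plus noise.
amp = rng.normal(loc=9.0, scale=1.0, size=(n_years, 1))
X = 8.0 - amp * np.cos(2 * np.pi * s) + rng.normal(scale=0.5, size=(n_years, n_months))

Xc = X - X.mean(axis=0)                 # remove the mean curve
U, sing, Vt = np.linalg.svd(Xc, full_matrices=False)

# Share of the total variation carried by each component, largest first.
explained = sing**2 / np.sum(sing**2)

# Rows of phi are discretized eigenfunctions, scaled so sum(phi_k^2) * ds = 1.
phi = Vt / np.sqrt(ds)

# Scores are the functional inner products <x_i - mean, phi_k>.
scores = Xc @ phi.T * ds
```

For unevenly spaced observations, `ds` would become a vector of quadrature weights and the decomposition would be applied to the weighted data instead.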

 

Looking at the scores for the first two components gives us an idea of which years differ the most from each other: they are the points that lie farthest apart.

temperature_per_month_gothenburg_scores
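Reading the most and least similar years off the score plot can also be done numerically. The helper below is a sketch: `extreme_pairs` is a hypothetical name, and the scores and labels are toy values rather than results from the temperature data.

```python
import numpy as np


def extreme_pairs(scores, labels):
    """Return the farthest and the closest pair of labels in score space."""
    scores = np.asarray(scores, dtype=float)
    diff = scores[:, None, :] - scores[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))      # pairwise Euclidean distances
    i, j = np.triu_indices(len(scores), k=1)    # each unordered pair once
    d = dist[i, j]
    far, near = d.argmax(), d.argmin()
    return (labels[i[far]], labels[j[far]]), (labels[i[near]], labels[j[near]])


# Toy scores on the first two components (made-up values).
toy_labels = ["year A", "year B", "year C", "year D"]
toy_scores = [[-3.0, 0.2], [0.1, 0.0], [0.2, 0.1], [3.5, -0.1]]

farthest, closest = extreme_pairs(toy_scores, toy_labels)
```

Here the farthest pair is year A and year D, and the closest pair is year B and year C. Distances computed on the first two components ignore the remaining ones, which is reasonable only when those carry little variation.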

Horizontally, that is, along the first principal component, the years 1989 and 2010 lie far apart. Apparently, the winter of 1989 was much warmer than that of 2010.

temperature_per_month_gothenburg_compare_1989_vs_2010

The years 1971 and 2004 are very close to each other, which suggests that they should be very similar, and they are.

temperature_per_month_gothenburg_show_1971_vs_2004

The second principal component represents a mode in which the late winter at the start of the year differs from the autumn and early winter at its end. The year 2006 started cold and ended warm, while 2002 started warm and ended cold.

temperature_per_month_gothenburg_show_2002_vs_2006

Conclusion

FPCA is a powerful tool for analyzing variations in functional data, and it applies to multidimensional functional data as well. More generally, functional data analysis can also be used for categorization, where different clusters of, e.g., motion trajectories need to be found.