Kolmogorov-Smirnov for comparing samples (plus, sample code!)

The Kolmogorov-Smirnov test (KS test) lets you compare two univariate, continuous distributions by looking at their CDFs. Both CDFs can be empirical (two-sample KS), or one can be empirical and the other built parametrically (one-sample KS). Client: Good evening. Bartender: Good evening. Rough day? Client: I should …
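Since the teaser cuts off before the sample code, here is a minimal sketch of both flavors of the test using `scipy.stats` (the synthetic normal samples are illustrative, not from the original post):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two empirical samples from the same law, and one from a shifted law.
same = rng.normal(loc=0.0, scale=1.0, size=500)
other = rng.normal(loc=0.0, scale=1.0, size=500)
shifted = rng.normal(loc=1.0, scale=1.0, size=500)

# Two-sample KS: compares the two empirical CDFs directly.
stat_same, p_same = stats.ks_2samp(same, other)
stat_diff, p_diff = stats.ks_2samp(same, shifted)

# One-sample KS: compares an empirical CDF against a parametric CDF.
stat_one, p_one = stats.kstest(same, stats.norm(loc=0.0, scale=1.0).cdf)
```

A small p-value (e.g. `p_diff`) rejects the hypothesis that the two samples come from the same distribution; a large KS statistic means the two CDFs pull far apart somewhere.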

Trying out Copula packages in Python – I

You may ask: why copulas? We do not mean those other copulas; we mean the mathematical concept. Simply put, copulas are joint distribution functions with uniform marginals. The kicker is that they let you study dependencies separately from the marginals. Sometimes you have more information on the marginals than on the joint distribution of a dataset, …
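Before reaching for a dedicated copula package, the core idea can be sketched with `scipy` alone: sample a Gaussian copula by hand, then attach whatever marginals you like (the exponential and gamma marginals below are arbitrary choices for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Gaussian copula by hand:
# 1. draw correlated standard normals;
# 2. push each margin through the normal CDF to get uniforms;
# 3. feed the uniforms into any inverse marginal CDFs you want.
rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=10_000)

u = stats.norm.cdf(z)                 # uniform marginals, dependence kept
x = stats.expon.ppf(u[:, 0])          # exponential marginal for dim 0
y = stats.gamma.ppf(u[:, 1], a=2.0)   # gamma marginal for dim 1

# The marginals differ, but the rank dependence survives the transforms.
spearman = stats.spearmanr(x, y)[0]
```

This is exactly the "dependence separate from marginals" point: `x` and `y` have completely different marginal shapes, yet their Spearman correlation is inherited from the copula, not the marginals.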

Sympathy for the Extreme

Every now and then, a data science practitioner is tasked with making sense of rare, extreme situations. The good news is that there are mathematical tools that can help you make sense of extreme events, and some of them are collected under a branch of probability theory that has (conveniently) been named Extreme Value Theory (EVT).
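As a hedged taste of what EVT tooling looks like in practice, here is a block-maxima sketch: take the maximum of each block of i.i.d. observations and fit a Generalized Extreme Value distribution to those maxima with `scipy.stats.genextreme` (the "daily readings grouped into years" framing is an assumption for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Pretend: 200 "years" of 365 daily standard-normal readings each.
daily = rng.normal(size=(200, 365))

# Block maxima: one extreme value per year. By the Fisher-Tippett-
# Gnedenko theorem these tend to a Generalized Extreme Value law.
maxima = daily.max(axis=1)

# Fit the GEV; scipy's shape parameter c = 0 is the Gumbel case.
c, loc, scale = stats.genextreme.fit(maxima)

# 100-block return level: the value exceeded, on average,
# once every 100 blocks.
level_100 = stats.genextreme.ppf(0.99, c, loc=loc, scale=scale)
```

The return level is the kind of quantity EVT is built for: a statement about events rarer than almost anything in your sample.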

Using higher moments to your advantage: Kurtosis

Or "can daedalean words actually help make more accurate descriptions of your random variable?" Part 1: Kurtosis. It is a common belief that Gaussians and uniform distributions will take you a long way, which is understandable if one considers the law of large numbers: with a large enough number of trials, the sample mean converges to the expectation. …
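To make the "higher moments" point concrete, here is a small sketch comparing the excess kurtosis of three laws with `scipy.stats.kurtosis` (Fisher convention, so a Gaussian sits at 0, a uniform at -1.2, and heavy-tailed laws come out positive):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

normal = rng.normal(size=100_000)
uniform = rng.uniform(-1.0, 1.0, size=100_000)
heavy = rng.standard_t(df=5, size=100_000)  # Student's t, fat tails

# Excess (Fisher) kurtosis: fourth standardized moment minus 3.
k_normal = stats.kurtosis(normal)
k_uniform = stats.kurtosis(uniform)
k_heavy = stats.kurtosis(heavy)
```

Mean and variance alone cannot tell these three apart from a suitably scaled Gaussian; the fourth moment can, which is the whole pitch of this post.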