Looking into Maximum Spacing Estimation (MSP) & ML.

Maximum spacing estimation (MSE or MSP) is one of those lesser-known statistical tools that are good to have in your toolbox if you ever bump into a misbehaving maximum likelihood (ML) estimation. Finding information about it is a bit tricky, because if you search for MSE, you will find “Mean Squared Error” as one of the […]

Kolmogorov-Smirnov for comparing samples (plus, sample code!)

The Kolmogorov-Smirnov test (KS test) lets you compare two univariate, continuous distributions by looking at their CDFs. Both CDFs can be empirical (two-sample KS), or one can be empirical and the other built parametrically (one-sample KS). Client: Good evening. Bartender: Good evening. Rough day? Client: I should […]
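To make the two flavors of the test concrete, here is a minimal sketch, assuming NumPy and SciPy are installed. The sample sizes, the seed, the 0.5 shift and the helper name `ks_statistic` are illustrative choices, not taken from the post. The helper computes the largest vertical gap between two empirical CDFs by hand, and the result is compared with `scipy.stats.ks_2samp`. The one-sample case uses `scipy.stats.kstest` against a standard normal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # fixed seed, illustrative only
sample_a = rng.normal(loc=0.0, scale=1.0, size=500)
sample_b = rng.normal(loc=0.5, scale=1.0, size=500)  # shifted on purpose


def ks_statistic(x, y):
    """Largest vertical gap between the empirical CDFs of x and y."""
    x_sorted, y_sorted = np.sort(x), np.sort(y)
    # Evaluate both empirical CDFs at every observed point.
    grid = np.concatenate([x_sorted, y_sorted])
    ecdf_x = np.searchsorted(x_sorted, grid, side="right") / len(x)
    ecdf_y = np.searchsorted(y_sorted, grid, side="right") / len(y)
    return np.max(np.abs(ecdf_x - ecdf_y))


# Two-sample KS: both CDFs are empirical.
two_sample = stats.ks_2samp(sample_a, sample_b)

# One-sample KS: empirical CDF against a parametric one (standard normal).
one_sample = stats.kstest(sample_a, "norm")

manual_d = ks_statistic(sample_a, sample_b)
print("two-sample D:", two_sample.statistic, "p-value:", two_sample.pvalue)
print("manual D:    ", manual_d)
print("one-sample D:", one_sample.statistic, "p-value:", one_sample.pvalue)
```

The hand-rolled statistic should agree with the one SciPy reports. The p-value is the part worth leaving to the library.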

Trying out Copula packages in Python – II

And here we go with the copula package in (the sandbox of) statsmodels! You can look at the code first here. I am in love with this package. I was in love with statsmodels already, but this tiny little copula package has everything one can hope for! First Impressions First I was not sure about […]

Trying out Copula packages in Python – I

You may ask, why copulas? We do not mean these copulas. We mean the mathematical concept. Simply put, copulas are joint distribution functions with uniform marginals. The kicker is that they allow you to study dependencies separately from marginals. Sometimes you have more information on the marginals than on the joint function of a dataset, […]
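As a rough illustration of "uniform marginals, dependence kept separate", here is a minimal sketch that samples from a Gaussian copula using only NumPy and SciPy. It does not use either of the copula packages reviewed in this series. The correlation of 0.7 and the exponential and normal marginals are arbitrary assumptions made for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, illustrative only
rho = 0.7  # assumed dependence strength
cov = np.array([[1.0, rho], [rho, 1.0]])

# Step 1: draw correlated standard normals.
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=5000)

# Step 2: push each column through the normal CDF.
# Each column of u is now uniform on [0, 1]; the dependence is preserved.
u = stats.norm.cdf(z)

# Step 3: plug in any marginals you like through their inverse CDFs.
x = stats.expon.ppf(u[:, 0], scale=2.0)
y = stats.norm.ppf(u[:, 1], loc=10.0, scale=3.0)

# Rank correlation depends only on the copula, not on the marginals.
rho_u = stats.spearmanr(u[:, 0], u[:, 1])[0]
rho_xy = stats.spearmanr(x, y)[0]
print("Spearman on uniforms:     ", rho_u)
print("Spearman after marginals: ", rho_xy)
```

Swapping the marginals in step 3 changes what `x` and `y` look like, but the rank correlation stays the same, which is the separation the excerpt describes.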

Sympathy for the Extreme

Every now and then, a data science practitioner is tasked with making sense of rare, extreme situations. The good news is that there are mathematical tools that can help, and some of them are organized under a branch of probability that has (conveniently) been named Extreme Value Theory (EVT).