The maximum spacing estimation (MSE or MSP) is one of those not-so-well-known statistical tools that are good to have in your toolbox if you ever bump into a misbehaving ML estimation. Finding material on it is a bit tricky, because if you search for MSE, you will find "Mean Squared Error" as one of the … Continue reading Looking into Maximum Spacing Estimation (MSP) & ML.
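As a taster before the full post: MSP picks the parameters that make the CDF spacings between consecutive order statistics as even as possible, i.e. it maximizes the mean log spacing. A minimal sketch with scipy — the exponential model, sample size, and optimizer bounds are my own illustrative choices, not taken from the post:

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(0)
data = np.sort(rng.exponential(scale=2.0, size=200))

def neg_mean_log_spacing(scale):
    # CDF values at the sorted sample, padded with 0 and 1
    u = stats.expon.cdf(data, scale=scale)
    u = np.concatenate(([0.0], u, [1.0]))
    # MSP maximizes the mean log of these spacings
    return -np.mean(np.log(np.diff(u)))

res = optimize.minimize_scalar(neg_mean_log_spacing,
                               bounds=(0.1, 10.0), method="bounded")
# res.x should land near the true scale of 2.0
```

Unlike plain ML, the objective stays well behaved even for models where the likelihood is unbounded, which is exactly the "misbehaving ML" case the post is about.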
Combine AB is hiring in Data Science!
The Kolmogorov-Smirnov test (KS test) is a test which allows you to compare two univariate, continuous distributions by looking at their CDFs. Both CDFs can be empirical (two-sample KS), or one can be empirical and the other built parametrically (one-sample). Client: Good evening. Bartender: Good evening. Rough day? Client: I should … Continue reading Kolmogorov-Smirnov for comparing samples (plus, sample code!)
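For the impatient, both flavors of the test are one-liners in scipy; the sample sizes and distributions below are just for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
a = rng.normal(loc=0.0, scale=1.0, size=500)
b = rng.normal(loc=0.5, scale=1.0, size=500)

# Two-sample KS: compares the two empirical CDFs directly
stat2, p2 = stats.ks_2samp(a, b)

# One-sample KS: compares a's empirical CDF against a parametric CDF
stat1, p1 = stats.kstest(a, stats.norm(loc=0.0, scale=1.0).cdf)
```

With a 0.5 shift between the samples, the two-sample p-value comes out tiny, while the one-sample test (against the true generating distribution) has no reason to reject.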
And here we go with the copula package in (the sandbox of) statsmodels! You can look at the code first here. I am in love with this package. I was in love with statsmodels already, but this tiny little copula package has everything one can hope for! First Impressions: first, I was not sure about … Continue reading Trying out Copula packages in Python – II
You may ask, why copulas? We do not mean those copulas. We mean the mathematical concept. Simply put, copulas are joint distribution functions with uniform marginals. The kicker is that they allow you to study dependencies separately from marginals. Sometimes you have more information on the marginals than on the joint distribution of a dataset, … Continue reading Trying out Copula packages in Python – I
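The core trick, sketched with plain scipy rather than any dedicated copula package: generate dependence in Gaussian space, push it through the normal CDF to get uniform marginals (that is the copula), then attach whatever marginals you like via their inverse CDFs. The correlation and the two marginals below are arbitrary choices for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
rho = 0.8
cov = [[1.0, rho], [rho, 1.0]]

# Dependence lives here: correlated standard normals
z = rng.multivariate_normal([0.0, 0.0], cov, size=2000)

# Gaussian copula sample: uniform marginals, dependent jointly
u = stats.norm.cdf(z)

# Attach arbitrary marginals through their inverse CDFs
x = stats.expon.ppf(u[:, 0], scale=2.0)
y = stats.lognorm.ppf(u[:, 1], s=0.5)
```

Rank correlation (e.g. Spearman's rho) between x and y is inherited from the copula, no matter which marginals get plugged in — which is exactly the "dependence separately from marginals" point.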
Every now and then, a data science practitioner will be tasked with making sense of rare, extreme situations. The good news is that there exist mathematical tools for exactly this, and some of them are structured under a branch of probability which has (conveniently) been named Extreme Value Theory (EVT).
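One cornerstone of EVT: block maxima of many common distributions converge to the generalized extreme value (GEV) family. A sketch of the block-maxima recipe with scipy — the Gaussian source and the block size are illustrative assumptions on my part:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# 300 "years" of 365 daily Gaussian observations each
daily = rng.normal(size=(300, 365))
annual_max = daily.max(axis=1)          # block maxima

# Fit a GEV to the block maxima; for a Gaussian source the
# shape parameter should come out near 0 (the Gumbel case)
shape, loc, scale = stats.genextreme.fit(annual_max)
```

From the fitted GEV you can then read off return levels for events more extreme than anything in the sample, which is the practical payoff of EVT.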
Embracing the turbulence
Or "can daedalean words actually help make more accurate descriptions of your random variable?" Part 1: Kurtosis. It is a common belief that Gaussians and uniform distributions will take you a long way. Which is understandable if one considers the law of large numbers: with a large enough number of trials, the sample mean converges to the expectation. … Continue reading Using higher moments to your advantage: Kurtosis
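A quick illustration of what excess kurtosis measures: tail weight relative to a Gaussian. scipy's `kurtosis` reports excess kurtosis, i.e. 0 for a Gaussian; the Student-t degrees of freedom below are an arbitrary choice of mine:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
gauss = rng.normal(size=100_000)
heavy = rng.standard_t(df=6, size=100_000)   # heavier tails, same mean

k_gauss = stats.kurtosis(gauss)   # excess kurtosis, close to 0
k_heavy = stats.kurtosis(heavy)   # theoretical value 6/(df - 4) = 3
```

Two samples can share mean and variance and still behave very differently in the tails; the fourth moment is where that difference shows up.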
There is more than Monte Carlo when talking about randomized algorithms. It is not uncommon to see the expressions "Monte Carlo approach" and "randomized approach" used interchangeably. More than once, you will start reading a paper or listening to a presentation in which the words "Monte Carlo" appear among the keywords and even in the title, … Continue reading On types of randomized algorithms
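The distinction the post builds toward, in toy form: a Las Vegas algorithm is always correct but takes a random amount of time, while a Monte Carlo algorithm runs within a fixed budget but may return a wrong answer. The two functions below are contrived illustrations of mine, not from the post:

```python
import random

def las_vegas_find(items, target):
    """Las Vegas: probe positions in random order. The answer,
    once returned, is always correct; only the runtime is random."""
    idx = list(range(len(items)))
    random.shuffle(idx)
    for i in idx:
        if items[i] == target:
            return i
    return None

def monte_carlo_contains(items, target, budget=20):
    """Monte Carlo: a fixed number of random probes. May return a
    false negative with probability (1 - k/n)**budget, where k is
    the number of matching items out of n."""
    return any(random.choice(items) == target for _ in range(budget))
```

Trading correctness guarantees for a bounded runtime (or vice versa) is the real axis along which randomized algorithms differ.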
Remember your friend from our very first post? Well, I am sorry to say that he never really reached French Guyana. He ended up in Carcass, one of the Malvinas/Falkland Islands, and his boat was (peacefully) captured by overly friendly pirate penguins. Now he spends his days counting penguins and sheep. He did keep a coin and … Continue reading Basic Statistics with Sympathy – Part 4: Building arbitrary RNGs in Sympathy
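For readers who want the punchline early: the standard way to turn a uniform source into a sampler for an arbitrary distribution is inverse-transform sampling. A sketch for the exponential case — the target distribution here is my choice, not necessarily the post's:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_exponential(n, rate=1.0):
    """Inverse-transform sampling: if U ~ Uniform(0, 1), then
    F_inv(U) ~ F.  For Exp(rate), F_inv(u) = -ln(1 - u) / rate."""
    u = rng.uniform(size=n)
    return -np.log1p(-u) / rate   # log1p keeps precision near u = 0

x = sample_exponential(50_000, rate=2.0)
# the sample mean should sit near 1 / rate = 0.5
```

The same recipe works for any distribution whose inverse CDF you can evaluate, which is what makes it the workhorse for building arbitrary RNGs.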