by Jonathan Widarsa

  • Time Series Talks: Consistency is King

    One of the most important assumptions for statistical models to work is the notion of consistency. This means that statisticians often drool with excitement when they find out that their data has approximately stable statistical properties, because they can finally unlock the cabinet of unused dusty models. In time series analysis (and several other disciplines), […]

  • Everything is Significant

    We’ve briefly talked about how p-values should be interpreted. It’s crucial to understand that a p-value of 0.01 doesn’t mean that there is a 1% chance of some null hypothesis being true. Instead, it implies a 1% chance of observing data as extreme or more extreme than the current data under the condition that the […]

  • Everything is Normal

    The normal distribution is one of statistics’ most precious models of reality. It’s analytically tractable, computationally simple, and provides a universal language for uncertainty. As such, it definitely deserves an in-depth exploration of its characteristics, properties, and significance. And then, we’ll explode in, Game of Thrones style, to ruin the perfect rainbow world of normality […]

  • Regression Crumbs on a Silver Platter

    There was a time when I used to apply linear regression to some data and if the resulting metrics (R^2, RMSE, MAE, etc.) were unsatisfactory, I simply concluded that the regression wasn’t a good fit and I should probably instead look at other models like gradient boosting or neural networks. If you don’t think this […]

  • To Squish Data and Not Break It

    I always knew Principal Component Analysis (PCA) as a dimensionality reduction technique. Way too many features? PCA. Need to visualize clustering? PCA. Exploratory data analysis? PCA. It was definitely one of my go-to analyses back then, but not because of its usefulness: it was one of the few tools I knew existed, so might as well, I […]

  • A Tale of Gender Bias from Berkeley

    During my early days of learning statistics, I encountered a pretty interesting phenomenon while reading a (relatively) ancient article. The story goes like this: In the fall of 1973, a study on gender bias in graduate school admissions to the University of California, Berkeley made headlines. The reason for this was that the admission figures showed […]

  • The Two Faces of Chi-Square

    Back in university, my genetics professor introduced the concept of chi-square tests like it was a magical instrument. Before I was ever interested in statistics, I always had a script from the lecture notes that ran some code in R, and all I had to do was reject either the hypothesis that my two […]
