Appreciate this post on helping to choose between a few available tests in determining if there are meaningful relationships between feature data. In particular,

  • ANOVA compares two variables, where one is categorical (binning is helpful here) and one is continuous.
  • Chi-square is useful for two categorical comparing two cateorical varables, on the other hand.
  • And Pearson Correlation can be used between two continiuous variables
    • But the caveat is that this test assumes both variables are normally distributed
    • And outliers should be chopped off with some preprocessing.