"Mish" is a new activation function that seems to be beating most state-of-the-art activation functions on standard benchmarks!

In this kernel, I used Mish for the two dense layers of my previous kernel on using categorical embeddings: https://lnkd.in/d8p-NUB
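
For readers who have not seen it, Mish is defined as mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). Below is a minimal sketch in plain Python to show the shape of the function. It is not the code from the kernel, and the helper names `softplus` and `mish` are only illustrative.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable ln(1 + e^x)."""
    # max(x, 0) + log1p(exp(-|x|)) avoids overflow for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


# Illustrative pre-activations of a dense layer (made-up values).
pre_activations = [-3.0, -1.0, 0.0, 1.0, 3.0]
activations = [mish(v) for v in pre_activations]
```

Unlike ReLU, Mish is smooth and lets small negative values through instead of clipping them to zero, while behaving almost like the identity for large positive inputs.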

Compared to my previous kernel, which used ReLU, Mish shows a considerable improvement, with a 5-fold score of 0.80544 (AUC) and a 20-fold score of 0.80743 (AUC).

The best part about Mish is that it was created by a budding researcher who is currently an undergraduate student! Kudos to Diganta Misra!

