Kaggler 0.5.0 Released

by Jeong-Yoon Lee

I am glad to announce the release of Kaggler 0.5.0. Kaggler 0.5.0 has a significant improvement in the performance of the FTRL algorithm thanks to Po-Hsien Chu (github, kaggle, linkedin).

Results

We increase the train speed by up to 100 times compare to 0.4.x. Our benchmark shows that one epoch with 1MM records with 8 features takes 1.2 seconds with 0.5.0 compared to 98 seconds with 0.4.x on an i7 CPU.

Motivation

The FTRL algorithm has been a popular algorithm since its first appearance on a paper published by Google. It is suitable for highly sparse data, so it has been widely used for click-through-rate (CTR) prediction in online advertisement. Many Kagglers use FTRL as one of their base algorithms in CTR prediction competitions. Therefore, we want to improve our FTRL implementation and benefit Kagglers who use our package.

Methods

We profile the code with cProfile and resolve the overheads one by one:

Contributor

Po-Hsien Chu (github, kaggle, linkedin)