Hi, I’d like to share the solution and framework from my team, ensemble.
The code is available on GitLab:
and our team’s internal leaderboard (LB) is available here:
We joined the competition late, and had just enough time to build and run the end-to-end framework without much feature engineering. So feature-wise, there is nothing fancy, but I hope that you can find the framework itself helpful. 🙂
As you can see, it uses Makefiles to pipeline feature generation, single model training, and ensemble training. The main benefits of our framework based on Makefiles are:
- It’s language agnostic – You can use any language for any part of the pipeline. Although this specific version uses Python throughout, in the past I have mixed R, Python, and other executables in a single pipeline.
- It checks dependencies automatically – Make checks whether the outputs of previous steps exist and are up to date, and if not, it runs those steps first.
- It’s modular – When working with others, it’s easy to split tasks across team members so that each one can focus on a different part of the pipeline.
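To make the three points above concrete, here is a minimal sketch of what such a pipeline can look like. It is not copied from our repository: every directory, file name, script name, and command-line flag below is a hypothetical placeholder, and it assumes GNU Make (for the order-only prerequisites after `|`). Recipes are written on the rule line after `;` so the example does not depend on tab characters surviving copy and paste.

```makefile
# Hypothetical layout: all paths, script names, and flags are placeholders
# for illustration, not the ones used in the actual repository.
DIR_DATA    := data
DIR_FEATURE := build/feature
DIR_MODEL   := build/model

TRAIN_RAW := $(DIR_DATA)/train.csv
FEATURE   := $(DIR_FEATURE)/feature1.csv
MODEL_A   := $(DIR_MODEL)/model_a.pred.csv
MODEL_B   := $(DIR_MODEL)/model_b.pred.csv
ENSEMBLE  := $(DIR_MODEL)/ensemble.pred.csv

# Default goal: the final ensemble prediction.
.PHONY: all
all: $(ENSEMBLE)

# Create output directories on demand.
$(DIR_FEATURE) $(DIR_MODEL): ; mkdir -p $@

# Step 1: feature generation. Reruns only if the script or raw data is newer
# than the feature file, or if the feature file does not exist yet.
$(FEATURE): src/generate_feature1.py $(TRAIN_RAW) | $(DIR_FEATURE) ; python $< --input $(TRAIN_RAW) --output $@

# Step 2: single models. Each depends only on the feature file, so different
# team members can own different rules (modular).
$(MODEL_A): src/train_model_a.py $(FEATURE) | $(DIR_MODEL) ; python $< --feature $(FEATURE) --output $@

# A step can be written in another language (language agnostic).
$(MODEL_B): src/train_model_b.R $(FEATURE) | $(DIR_MODEL) ; Rscript $< $(FEATURE) $@

# Step 3: ensemble training on the single-model predictions.
$(ENSEMBLE): src/train_ensemble.py $(MODEL_A) $(MODEL_B) ; python $< --output $@ $(MODEL_A) $(MODEL_B)
```

With a file like this, running `make` builds whatever is missing or out of date, in dependency order; running it again right away does nothing. `make -n` prints the commands that would run without executing them, and `make -j2` lets the two single models train in parallel.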
If you are new to Makefiles, here are some references:
- http://kaggler.com/kagglers-toolbox-setup/ – see the Makefile section
Kaggler. Data Scientist. Father of Five.