[mlpack] Cross-validation and hyper-parameter tuning infrastructure

Ryan Curtin ryan at ratml.org
Wed Feb 22 11:10:57 EST 2017


On Wed, Feb 22, 2017 at 05:07:39PM +0500, Kirill Mishchenko wrote:
> Hi,
> 
> my name is Kirill. I’m interested in contributing to the project
> “Cross-validation and hyper-parameter tuning infrastructure”. I have
> already gone through some starting steps, like building the code and
> running a few ML algorithms (more precisely, I have done it for Linear
> Regression and Logistic Regression). Now I’m going to read the wiki
> page “Design Guidelines” rigorously and go through the interfaces in
> the code base. Are there any other suggestions for how I can start to
> work on the project? Is there some way to make a related small
> contribution to the code base?

Hi Kirill,

The cross-validation and hyper-parameter tuning project is pretty new,
and there is not much in the way of existing bugs that will help you
understand it, since the project involves writing a completely new
piece of code for mlpack.

I just opened some issues for the decision tree code today; maybe you
can find one of those interesting?

https://github.com/mlpack/mlpack/issues
(the top 5 are related to decision trees, at least when I wrote this
email)

I think one approach would be to use the various classifiers and other
functionality inside of mlpack, and then write some simple C++
programs to do cross-validation or hyper-parameter tuning by hand.
This could help make it clearer what the hyper-parameter tuning module
and cross-validation module would need to provide.
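
As a rough illustration of what "by hand" could look like, here is a
minimal sketch of k-fold cross-validation plus a grid search over one
hyper-parameter. It deliberately uses only the C++ standard library:
RidgeThroughOrigin, KFoldMSE, and TuneLambda are hypothetical names
invented for this example and are not part of mlpack. In a real
experiment the toy model would be swapped for an mlpack classifier or
regressor.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a real learner (NOT an mlpack class):
// 1-D ridge regression through the origin, y ~ w * x, with L2 penalty
// lambda. Assumes sum(x^2) + lambda is nonzero.
struct RidgeThroughOrigin
{
  double w = 0.0;

  void Train(const std::vector<double>& x,
             const std::vector<double>& y,
             const double lambda)
  {
    double sxy = 0.0, sxx = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
    {
      sxy += x[i] * y[i];
      sxx += x[i] * x[i];
    }
    w = sxy / (sxx + lambda);
  }

  double Predict(const double x) const { return w * x; }
};

// k-fold cross-validation: hold out each contiguous fold in turn, train
// on the remaining points, and return the mean squared error over all
// held-out points.
double KFoldMSE(const std::vector<double>& x,
                const std::vector<double>& y,
                const std::size_t k,
                const double lambda)
{
  const std::size_t n = x.size();
  assert(y.size() == n && k >= 2 && k <= n);

  double totalSqErr = 0.0;
  for (std::size_t fold = 0; fold < k; ++fold)
  {
    // Held-out indices for this fold are [begin, end).
    const std::size_t begin = fold * n / k;
    const std::size_t end = (fold + 1) * n / k;

    std::vector<double> trainX, trainY;
    for (std::size_t i = 0; i < n; ++i)
    {
      if (i < begin || i >= end)
      {
        trainX.push_back(x[i]);
        trainY.push_back(y[i]);
      }
    }

    RidgeThroughOrigin model;
    model.Train(trainX, trainY, lambda);

    for (std::size_t i = begin; i < end; ++i)
    {
      const double err = model.Predict(x[i]) - y[i];
      totalSqErr += err * err;
    }
  }
  return totalSqErr / static_cast<double>(n);
}

// Grid search: return the candidate lambda with the lowest
// cross-validated error (the first one wins on ties).
double TuneLambda(const std::vector<double>& x,
                  const std::vector<double>& y,
                  const std::size_t k,
                  const std::vector<double>& candidates)
{
  assert(!candidates.empty());
  double bestLambda = candidates.front();
  double bestScore = std::numeric_limits<double>::infinity();
  for (const double lambda : candidates)
  {
    const double score = KFoldMSE(x, y, k, lambda);
    if (score < bestScore)
    {
      bestScore = score;
      bestLambda = lambda;
    }
  }
  return bestLambda;
}
```

Even in a toy like this, the questions a general module would have to
answer show up quickly: how the data is split, which metric is
reported, and how hyper-parameters are passed to the model's training
routine.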

Maybe these pages are also helpful:

http://www.mlpack.org/involved.html
http://www.mlpack.org/gsoc.html

There are also other issues open in the Github issue tracker, and any
contributions of new techniques or efficiency improvements for existing
implementations are always welcome.

> Briefly about myself. I am a PhD student working on Computational
> Humor. More precisely I’m working on the problem of finding/generating
> a humorous response given a textual input. My programming experience
> includes two summer internships in big Russian IT companies: in one I
> was programming in C# (SKB Kontur), in another I was a C++ developer
> (Yandex search). In daily life I use Python. I have taken the online
> course Machine Learning by Stanford (Coursera), as well as some other
> courses related to ML (AI by Berkeley (EdX), Deep Learning by Google
> (Udacity), and others).

Wow, computational humor, that is very cool!  There was a group that I
worked with briefly at Georgia Tech on computational humor:

http://www.vip.gatech.edu/teams/humor-genome

I gave a talk to that group on the mlpack collaborative filtering code,
and I think that at one point they were using mlpack_cf as a
recommender system for jokes, but I am not sure what came of it.  I
will have to ask...

I always thought it would be interesting to use generative deep neural
networks to try and generate jokes.  I don't think they would be good
jokes, but I think they would be funny for the same reason my favorite
comic Garkov is funny:

http://joshmillard.com/garkov/

I'd be interested to hear more about what you are doing there, if you'd
like to elaborate.  I think that is a very neat field.

Thanks,

Ryan

-- 
Ryan Curtin    | "If it's something that can be stopped, then just try to stop it!"
ryan at ratml.org |   - Skull Kid

