Saturday, June 9, 2012

Rcpp vs. R implementation of cosine similarity

While speeding up some code for a project with a colleague the other day, I ended up trying Rcpp for the first time. Using bits and pieces of code I found scattered around the web, I re-implemented the cosine distance function with RcppArmadillo relatively easily. However, the speedup of the Rcpp code over pure R was smaller than I expected.
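For reference, the computation itself is small. What follows is only a rough sketch in plain standard C++, not the RcppArmadillo code discussed in this post. The function name `cosine_similarity`, the column-wise convention, and the flat column-major storage (the layout R uses for matrices) are assumptions made for illustration. It returns similarities; the corresponding cosine distance would be one minus each entry.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): pairwise cosine
// similarity between the columns of an n_rows x n_cols matrix stored in
// column-major order. Returns an n_cols x n_cols matrix, also column-major.
// A column of all zeros has norm 0, so its entries come out as NaN.
std::vector<double> cosine_similarity(const std::vector<double>& x,
                                      std::size_t n_rows,
                                      std::size_t n_cols) {
    // Euclidean norm of every column.
    std::vector<double> norms(n_cols, 0.0);
    for (std::size_t j = 0; j < n_cols; ++j) {
        double sum_sq = 0.0;
        for (std::size_t i = 0; i < n_rows; ++i) {
            const double v = x[j * n_rows + i];
            sum_sq += v * v;
        }
        norms[j] = std::sqrt(sum_sq);
    }

    // Dot product of each pair of columns, divided by the product of norms.
    // The result is symmetric, so only the upper triangle is computed.
    std::vector<double> sim(n_cols * n_cols, 0.0);
    for (std::size_t a = 0; a < n_cols; ++a) {
        for (std::size_t b = a; b < n_cols; ++b) {
            double dot = 0.0;
            for (std::size_t i = 0; i < n_rows; ++i) {
                dot += x[a * n_rows + i] * x[b * n_rows + i];
            }
            const double s = dot / (norms[a] * norms[b]);
            sim[b * n_cols + a] = s;  // element (a, b)
            sim[a * n_cols + b] = s;  // element (b, a)
        }
    }
    return sim;
}
```

The same result can be written as a matrix cross-product followed by a scaling by the column norms, which is the kind of operation that optimized linear algebra routines handle well. That may be part of why a vectorized pure R version is hard to beat by a large margin, though I have not verified this.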
And here is the speed comparison...

I don't really know whether my implementation can be improved. For example, there is a step at the beginning where the R matrix is converted to an Rcpp::NumericMatrix and then to an arma::mat matrix. I could not run the code without this step. I don't think it contributes much to the run time anyway, since it should all be in-memory operations, but I would be curious to know if there is another way.