
Add benchmarks for the core algorithm #4

Open
jlperla opened this issue Oct 21, 2020 · 5 comments

@jlperla
Member

jlperla commented Oct 21, 2020

For the most part, you will want to benchmark the time to do XXX rademachers for the larger problems; otherwise it might take too long to run multiple benchmarks.

Let's then do a sanity check on how our speed for the large and huge problems compares to the existing Matlab code. Maybe put the Matlab timing as a comment in the benchmarks so that we never need to rerun it.
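The timing pattern could be sketched as follows, assuming a Hutchinson-style use of the Rademacher draws (the function names and problem setup here are illustrative placeholders, not the repository's actual API):

```julia
using LinearAlgebra, Random

# One Rademacher pass: draw a ±1 vector x and form the quadratic form
# x' * A * x, whose expectation is tr(A) (Hutchinson-style trace estimation).
function rademacher_pass(A, rng)
    x = rand(rng, [-1.0, 1.0], size(A, 1))
    return dot(x, A * x)
end

# Average over a fixed number of passes. Benchmarking a fixed pass count,
# rather than the full algorithm, keeps timings feasible on large problems.
function estimate_trace(A, npasses; rng = MersenneTwister(1234))
    return sum(rademacher_pass(A, rng) for _ in 1:npasses) / npasses
end

# Placeholder problem: a symmetric random matrix standing in for the real system.
A = Symmetric(randn(MersenneTwister(0), 200, 200))
t = @elapsed est = estimate_trace(A, 500)
println("500 Rademacher passes: $(round(t; digits = 3)) s, trace estimate ≈ $(round(est; digits = 2))")
```

For real benchmark files, BenchmarkTools.jl's `@btime`/`@benchmark` would give more reliable timings than a single `@elapsed` call.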

@paulcorcuera
Collaborator

There is a benchmark\compute_whole.jl file that records the speed of Julia versus Matlab for the large problem. Both are reasonably close, and the Julia code gets faster (relative to Matlab) as the number of rademacher simulations increases. I will have to use the cluster to benchmark Julia for the huge network, since the problem is infeasible on my computer.

@jlperla
Member Author

jlperla commented Oct 23, 2020

Great!

It would be good to be able to report, for the number of rademachers given by the heuristic Raffa gave, the performance of the Matlab code on the cluster for the large and huge datasets. We should keep track of the Matlab numbers so we don't have to rerun Matlab for a long time. I think it would also be useful for you to record the time for the large system on your computer as a comment. I would put the number of random projections in the comment as well, so that if we change the heuristic we will know how many projections it was run with.

I think having these in comments in the benchmark Julia source is good enough for now. Eventually we won't even need them once this becomes the de facto benchmark.
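One way to keep those numbers together, sketched as a hypothetical comment header for the benchmark source (every value below is a placeholder, not a real measurement):

```julia
# Hypothetical header for the benchmark file; fill in actual values when run.
#
# Dataset:            large
# Random projections: p = <number given by the heuristic>
# Matlab (cluster):   <seconds> s, recorded <date>
# Julia  (cluster):   <seconds> s
# Julia  (laptop):    <seconds> s
```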

@paulcorcuera
Collaborator

All timings are recorded in comments in my code (both for my computer and for the cluster). For the huge dataset I encountered a killed process, with both Julia and Matlab, even when running a small number of repetitions (50; I have been allocating 3-4 hrs and 256GB of memory to avoid long queue times). I can attempt an even smaller number, like 5 or 10 rademachers, if you think it's useful to have it there. However, if you think having the benchmarks for the large network is enough, we could close this issue now. @rsaggio87

@rsaggio87
Contributor

Are you saying that the code crashed when running on the huge dataset? I thought we had this under control...

@paulcorcuera
Collaborator

Sorry, no. What I am saying is that I receive a killed message, which is most likely due to reaching the memory allocation limit (I am using 256GB).
