Ticket #303 (closed Defect: fixed)

Opened 5 years ago

Last modified 5 years ago

Even with "chunks" option, Predict GLM and Predict GAM tools can still fail with RPy_RException: Error: cannot allocate vector of size 92.6 Mb

Reported by: jjr8
Owned by: jjr8
Priority: Medium
Milestone: 0.7
Component: Tools - Statistics
Version:
Keywords:
Cc:

Description

The problem is that the existing "chunking" code only comes into play for the calls to the R predict function. We are not reading the rasters in chunks. Instead, we read all of the rasters into memory and then chunk through the predictions. We did not consider that the rasters themselves could be too large to fit into memory. This is a legitimate case, particularly when you're working with many predictors, and it still causes the allocation error even when the "chunks" option is used.
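To illustrate the scale of the problem, here is a rough back-of-the-envelope sketch in Python. The grid size, predictor count, and 8-byte cell size are hypothetical values chosen for illustration, not figures from the reported case, and the function names are made up for this example.

```python
def full_read_bytes(n_predictors, rows, cols, bytes_per_cell=8):
    """Bytes needed to hold every predictor raster in memory at once."""
    return n_predictors * rows * cols * bytes_per_cell


def row_read_bytes(n_predictors, cols, bytes_per_cell=8):
    """Bytes needed to hold a single row from each predictor raster."""
    return n_predictors * cols * bytes_per_cell


# Hypothetical example: 10 predictors on a 5000 x 5000 grid of 8-byte floats.
full = full_read_bytes(10, 5000, 5000)
one_row = row_read_bytes(10, 5000)

print("all rasters in memory: %.1f MB" % (full / 1e6))
print("one row of each raster: %.1f MB" % (one_row / 1e6))
```

Under these assumptions, chunking only the predictions does nothing to reduce the first figure, because the full rasters have already been loaded by then.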

The solution is to modify the chunking code so it includes reading the rasters, as well as doing the predictions.
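A minimal sketch of that idea in plain Python follows. It is not the MGET code: read_rows is a hypothetical stand-in for a real raster reader, rasters are represented as lists of rows, and toy_predict stands in for the call to the R predict function.

```python
def read_rows(raster, start, stop):
    """Hypothetical reader: return rows [start, stop) of one raster.

    Here a raster is just a list of rows; a real reader would fetch
    only these rows from disk.
    """
    return raster[start:stop]


def predict_in_chunks(rasters, predict, rows_per_chunk=1):
    """Yield predicted rows, reading only rows_per_chunk rows of each
    predictor raster at a time instead of loading whole rasters."""
    names = list(rasters)
    n_rows = len(rasters[names[0]])
    for start in range(0, n_rows, rows_per_chunk):
        stop = min(start + rows_per_chunk, n_rows)
        # Read the same block of rows from every predictor.
        chunk = {name: read_rows(rasters[name], start, stop) for name in names}
        for i in range(stop - start):
            n_cols = len(chunk[names[0]][i])
            # Predict each cell from the predictor values at that cell.
            yield [
                predict({name: chunk[name][i][j] for name in names})
                for j in range(n_cols)
            ]


def toy_predict(cell):
    """Toy model standing in for the R predict call (an assumption)."""
    return cell["sst"] + 2 * cell["depth"]


rasters = {
    "sst": [[1, 2], [3, 4]],
    "depth": [[10, 20], [30, 40]],
}
result = list(predict_in_chunks(rasters, toy_predict, rows_per_chunk=1))
print(result)
```

The chunk size changes only how much is held in memory at once, not the predicted values, so the output is the same for any rows_per_chunk.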

Thanks to Pat Iampietro for reporting this problem.

Change History

Changed 5 years ago by jjr8

  • status changed from new to assigned
  • component changed from Unknown to Tools - Statistics
  • milestone set to 0.7

Changed 5 years ago by jjr8

  • status changed from assigned to closed
  • resolution set to fixed

As part of fixing this, I decided to remove the "chunks" parameter from the tool and make the chunking automatic. After doing some performance testing, I found that it was only about 1.5x slower to read and process the predictor rasters one row at a time, compared to reading them entirely into memory and then processing them one row at a time. Given that the bottleneck in the tool is not running the predictions, but actually doing the ArcGIS operations that occur beforehand and afterwards, I believe it is a good tradeoff to increase the prediction time by 50% in exchange for eliminating the chunks parameter entirely. This makes life much simpler for the user.

Note that this still needs to be done for the Bayesian Predict tool. I will open a separate ticket for that.

Fixed in [314], released in MGET 0.7a1.
