Module llmsearch.utils.mem_utils
Inspired by toma; memory-related utils for memory-friendly inference.
Functions
def batch_without_oom_error(func)-
Perform inference on a batch of samples, dividing batch_size by 2 each time an OOM error happens. The function should have a
batch_size and disable_batch_size_cache parameter.
Args
func:Callable- function with this signature of arguments:
*args, batch_size, disable_batch_size_cache, **kwargs
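A minimal sketch of the retry-and-halve behaviour described above, assuming OOM conditions surface as a RuntimeError whose message contains "out of memory"; the `fake_inference` function and its memory limit are purely illustrative.

```python
import functools


def batch_without_oom_error(func):
    """Sketch: call func, halving batch_size whenever an OOM-like error occurs."""
    @functools.wraps(func)
    def wrapper(*args, batch_size, disable_batch_size_cache=False, **kwargs):
        while True:
            try:
                return func(
                    *args,
                    batch_size=batch_size,
                    disable_batch_size_cache=disable_batch_size_cache,
                    **kwargs,
                )
            except RuntimeError as exc:
                # Halve and retry only for OOM-like errors, down to batch_size == 1.
                if "out of memory" in str(exc) and batch_size > 1:
                    batch_size //= 2
                else:
                    raise
    return wrapper


@batch_without_oom_error
def fake_inference(samples, batch_size, disable_batch_size_cache=False):
    # Hypothetical model: pretend any batch larger than 4 exhausts memory.
    if batch_size > 4:
        raise RuntimeError("CUDA out of memory")
    return batch_size
```

Starting from `batch_size=32`, the wrapper retries at 16, 8, and finally 4, where the call succeeds.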
def gc_cuda()-
Garbage collect RAM & Torch (CUDA) memory.
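A sketch of what such a helper typically does: run Python's garbage collector, then release PyTorch's cached CUDA allocations. The torch import is guarded so the sketch also runs where torch is absent.

```python
import gc


def gc_cuda():
    """Garbage collect Python objects, then free cached CUDA memory if torch is available."""
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass  # torch not installed; nothing GPU-side to free
```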
def get_gpu_information()-
Get CUDA GPU related info if a GPU exists
Returns
Union[None, Tuple[int, float, float]]- total number of available GPUs, total occupied GPU memory (GB), total available GPU memory (GB)
None if unable to get CUDA GPU related info
def get_total_available_ram()-
Get total available RAM in GB
Returns
float- available ram in GB
def get_traceback(ignore_first=0, stack_context=5)-
Get the traceback from the first to the latest call
Args
ignore_first:int, optional- ignore the first n traceback entries. Defaults to 0.
stack_context:int, optional- amount of context for the traceback. Defaults to 5.
Returns
Tuple[Tuple]- tuples of function call and code
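A sketch of how such a helper can be built on the standard library's traceback module, returning (function name, source line) pairs; the exact frame fields kept here are an assumption.

```python
import traceback


def get_traceback(ignore_first=0, stack_context=5):
    # Capture the current call stack, dropping this function's own frame.
    stack = traceback.extract_stack()[:-1]
    # Skip the first `ignore_first` entries and keep up to `stack_context` frames.
    stack = stack[ignore_first:ignore_first + stack_context]
    # Each entry pairs the calling function's name with the source line of the call.
    return tuple((frame.name, frame.line) for frame in stack)
```

A tuple of hashable pairs like this can serve as part of a cache key, which is how the Cache class below uses stacktraces.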
def is_cuda_out_of_memory(exception)-
Checks for CUDA OOM Error
def is_cudnn_snafu(exception)-
For/because of https://github.com/pytorch/pytorch/issues/4107
def is_out_of_cpu_memory(exception)-
Checks for CPU OOM Error
def should_reduce_batch_size(exception)-
Checks whether batch size can be reduced or not
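A sketch of how these checks are commonly implemented (toma takes a similar approach): inspect the exception type and message for known OOM signatures. The exact substrings matched here are assumptions based on typical PyTorch error messages.

```python
def is_cuda_out_of_memory(exception):
    # PyTorch raises RuntimeError with this phrase on GPU OOM.
    return isinstance(exception, RuntimeError) and "CUDA out of memory" in str(exception)


def is_cudnn_snafu(exception):
    # Workaround for https://github.com/pytorch/pytorch/issues/4107, where a
    # cuDNN failure can mask an underlying OOM condition.
    return isinstance(exception, RuntimeError) and "cudnn" in str(exception).lower()


def is_out_of_cpu_memory(exception):
    # PyTorch CPU allocation failures mention the default CPU allocator.
    return isinstance(exception, RuntimeError) and (
        "DefaultCPUAllocator: can't allocate memory" in str(exception)
    )


def should_reduce_batch_size(exception):
    # Reducing the batch size only helps for memory-related failures.
    return (
        is_cuda_out_of_memory(exception)
        or is_cudnn_snafu(exception)
        or is_out_of_cpu_memory(exception)
        or isinstance(exception, MemoryError)
    )
```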
Classes
class Cache-
Cache to store the optimal batch size for a specific configuration call, keyed on traceback and memory information
Initializes cache
Methods
def empty_cache(self)-
Empties cache
def get_value(self, current_value, stacktrace, total_available_gpu_memory, total_available_ram_memory)-
Tries to get a value for a particular set of hashes; returns
current_value if no cached value exists (this only happens during the initial call).
Args
current_value:Any- current value
stacktrace:Tuple- stacktrace of the method call
total_available_gpu_memory:float- total available gpu memory
total_available_ram_memory:float- total available ram memory
Returns
float- current_value if the hash key is not present, else the value stored under the hashed key
def is_empty(self)-
Checks if the cache is empty
Returns
bool- empty or not
def set_value(self, value, stacktrace, total_available_gpu_memory, total_available_ram_memory)-
Sets a value for a particular combination of hashes
Args
value:Any- value to store under the hash
stacktrace:Tuple- stacktrace of the method call
total_available_gpu_memory:float- total available gpu memory
total_available_ram_memory:float- total available ram memory
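The interface above can be sketched as a dictionary keyed on a hash of the stacktrace and the available GPU/RAM figures; this is a minimal illustration of the documented methods, not the library's actual implementation.

```python
class Cache:
    """Sketch: cache optimal batch sizes per (stacktrace, memory state) combination."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _hash_key(stacktrace, total_available_gpu_memory, total_available_ram_memory):
        # Combine the call site and memory state into a single hashable key.
        return hash((stacktrace, total_available_gpu_memory, total_available_ram_memory))

    def is_empty(self):
        return not self._store

    def empty_cache(self):
        self._store.clear()

    def get_value(self, current_value, stacktrace,
                  total_available_gpu_memory, total_available_ram_memory):
        # Fall back to current_value when the key has never been set (initial call).
        key = self._hash_key(stacktrace, total_available_gpu_memory, total_available_ram_memory)
        return self._store.get(key, current_value)

    def set_value(self, value, stacktrace,
                  total_available_gpu_memory, total_available_ram_memory):
        key = self._hash_key(stacktrace, total_available_gpu_memory, total_available_ram_memory)
        self._store[key] = value
```

With this layout, the same call site re-run under the same memory conditions retrieves the batch size it previously settled on, skipping the halving search.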