Module llmsearch.utils.mem_utils

Inspired by toma; memory-related utilities for memory-friendly inference.

Functions

def batch_without_oom_error(func)

Perform inference on a batch of samples, halving batch_size whenever an OOM error occurs. The wrapped function must accept batch_size and disable_batch_size_cache parameters.

Args

func : Callable
function with the argument signature (*args, batch_size, disable_batch_size_cache, **kwargs)
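The retry loop described above can be sketched as a decorator. This is a minimal, hypothetical sketch (not the library's actual implementation): it assumes OOM is detected by inspecting the exception message and simply halves batch_size until the call succeeds or the batch size reaches 1.

```python
import functools

def batch_without_oom_error(func):
    """Sketch: retry `func`, halving `batch_size` whenever an OOM error occurs."""
    @functools.wraps(func)
    def wrapper(*args, batch_size=32, disable_batch_size_cache=False, **kwargs):
        while batch_size >= 1:
            try:
                return func(
                    *args,
                    batch_size=batch_size,
                    disable_batch_size_cache=disable_batch_size_cache,
                    **kwargs,
                )
            except RuntimeError as exc:
                # Assumption: OOM errors surface as RuntimeError mentioning "out of memory".
                if "out of memory" not in str(exc) or batch_size == 1:
                    raise
                batch_size //= 2  # halve the batch size and retry
    return wrapper
```

The real implementation additionally consults the Cache class below to skip retries on repeated calls; that bookkeeping is omitted here.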
def gc_cuda()

Garbage collect RAM and Torch (CUDA) memory.
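A plausible sketch of such a helper, assuming the standard combination of Python's garbage collector and PyTorch's CUDA cache release:

```python
import gc

def gc_cuda():
    """Sketch: free unreachable Python objects, then release cached CUDA memory."""
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass  # torch not installed; nothing CUDA-side to free
```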

def get_gpu_information()

Get CUDA GPU related information if a GPU exists.

Returns

Union[None, Tuple[int, float, float]]
total available GPUs, total occupied GPU memory in GB, total available GPU memory in GB

None if unable to get CUDA GPU related info
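The return contract can be sketched as follows. This is an assumption-laden sketch using `torch.cuda.mem_get_info` (which returns free and total bytes per device); the actual implementation may query memory differently.

```python
def get_gpu_information():
    """Sketch: return (gpu_count, occupied_gb, available_gb) summed over CUDA devices,
    or None when CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    occupied_gb = available_gb = 0.0
    for device in range(torch.cuda.device_count()):
        free_bytes, total_bytes = torch.cuda.mem_get_info(device)
        occupied_gb += (total_bytes - free_bytes) / 1e9
        available_gb += free_bytes / 1e9
    return torch.cuda.device_count(), occupied_gb, available_gb
```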

def get_total_available_ram()

Get total available RAM in GB.

Returns

float
available RAM in GB
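A minimal sketch of this helper, assuming psutil is the preferred source with a POSIX sysconf fallback (the actual implementation may use only one of these):

```python
import os

def get_total_available_ram():
    """Sketch: available system RAM in GB."""
    try:
        import psutil
        return psutil.virtual_memory().available / 1e9
    except ImportError:
        # POSIX fallback: free physical pages * page size approximates "available" RAM
        return os.sysconf("SC_AVPHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 1e9
```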
def get_traceback(ignore_first=0, stack_context=5)

Get the traceback from the first to the most recent call.

Args

ignore_first : int, optional
ignore the first n traceback frames. Defaults to 0.
stack_context : int, optional
number of stack frames of context to include. Defaults to 5.

Returns

Tuple[Tuple]
tuple of (function call, code) pairs
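A sketch of how such a helper could be built on the standard traceback module. The interpretation of stack_context as a cap on the number of frames kept is an assumption:

```python
import traceback

def get_traceback(ignore_first=0, stack_context=5):
    """Sketch: ((function name, code line), ...) ordered oldest to newest frame."""
    # Collect the current stack, excluding this helper's own frame.
    frames = traceback.extract_stack()[:-1]
    # Drop the first `ignore_first` frames, keep at most `stack_context` (assumed meaning).
    frames = frames[ignore_first:ignore_first + stack_context]
    return tuple((frame.name, frame.line) for frame in frames)
```

Hashing such a tuple lets the Cache class below identify repeated calls from the same call site.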
def is_cuda_out_of_memory(exception)

Checks for CUDA OOM Error

def is_cudnn_snafu(exception)

Checks for a cuDNN failure that can mask an out-of-memory condition.
def is_out_of_cpu_memory(exception)

Checks for CPU OOM Error

def should_reduce_batch_size(exception)

Checks whether the batch size should be reduced, i.e. whether the exception indicates an out-of-memory condition.
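The three checks above compose naturally into should_reduce_batch_size. The sketch below assumes the PyTorch error-message substrings commonly used for this purpose (as in toma); the library's exact match strings may differ.

```python
def is_cuda_out_of_memory(exception):
    """Sketch: CUDA OOM surfaces as a RuntimeError mentioning 'CUDA out of memory'."""
    return isinstance(exception, RuntimeError) and "CUDA out of memory" in str(exception)

def is_out_of_cpu_memory(exception):
    """Sketch: PyTorch CPU allocator failures mention the DefaultCPUAllocator."""
    return (
        isinstance(exception, RuntimeError)
        and "DefaultCPUAllocator: can't allocate memory" in str(exception)
    )

def should_reduce_batch_size(exception):
    """Reduce the batch size only when the exception signals an OOM condition."""
    return is_cuda_out_of_memory(exception) or is_out_of_cpu_memory(exception)
```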

Classes

class Cache

Cache that stores the optimal batch size for a specific call site, keyed by traceback and memory information.

Initializes cache

Methods

def empty_cache(self)

Empties cache

def get_value(self, current_value, stacktrace, total_available_gpu_memory, total_available_ram_memory)

Tries to get a value for a particular set of hashes; returns current_value if the hash key is not present (this only happens during the initial call).

Args

current_value : Any
current value
stacktrace : Tuple
stacktrace of the method call
total_available_gpu_memory : float
total available gpu memory
total_available_ram_memory : float
total available ram memory

Returns

float
current_value if the hash key is not present, else the cached value
def is_empty(self)

Checks if the cache is empty

Returns

bool
empty or not
def set_value(self, value, stacktrace, total_available_gpu_memory, total_available_ram_memory)

Sets a value for a particular combination of hashes.

Args

value : Any
value to cache
stacktrace : Tuple
stacktrace of the method call
total_available_gpu_memory : float
total available gpu memory
total_available_ram_memory : float
total available ram memory
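Putting the pieces together, the Cache interface described above can be sketched as a dictionary keyed by a hash of the stacktrace and the memory conditions. The rounding of memory values so that near-identical conditions map to the same bucket is an assumption, not necessarily the library's behavior:

```python
class Cache:
    """Sketch: maps hash(stacktrace, available GPU/RAM memory) -> optimal batch size."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(stacktrace, total_available_gpu_memory, total_available_ram_memory):
        # Round memory figures so near-identical conditions share a bucket (assumption).
        return hash((
            stacktrace,
            round(total_available_gpu_memory, 1),
            round(total_available_ram_memory, 1),
        ))

    def get_value(self, current_value, stacktrace,
                  total_available_gpu_memory, total_available_ram_memory):
        key = self._key(stacktrace, total_available_gpu_memory, total_available_ram_memory)
        return self._store.get(key, current_value)

    def set_value(self, value, stacktrace,
                  total_available_gpu_memory, total_available_ram_memory):
        key = self._key(stacktrace, total_available_gpu_memory, total_available_ram_memory)
        self._store[key] = value

    def is_empty(self):
        return not self._store

    def empty_cache(self):
        self._store.clear()
```

Under this design, batch_without_oom_error would call get_value before the first attempt and set_value after a successful one, so subsequent calls from the same call site under similar memory conditions start at the known-good batch size.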