Module llmsearch.utils.mem_utils
Inspired by toma; memory-related utilities for memory-friendly inference.
Functions
def batch_without_oom_error(func)
-   Perform inference on a batch of samples, dividing batch_size by 2 each time an OOM error occurs. The wrapped function must accept batch_size and disable_batch_size_cache parameters.

    Args
        func : Callable
            function with the argument signature *args, batch_size, disable_batch_size_cache, **kwargs
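The halving-and-retry behavior can be sketched as a decorator. This is an illustrative reimplementation, not the library's code: the message-based OOM detection and the `run_inference` function are assumptions for the example.

```python
import functools

def batch_without_oom_error_sketch(func):
    """Sketch: retry with a halved batch size whenever an OOM error is raised."""
    @functools.wraps(func)
    def wrapper(*args, batch_size, disable_batch_size_cache=False, **kwargs):
        while batch_size >= 1:
            try:
                return func(*args, batch_size=batch_size,
                            disable_batch_size_cache=disable_batch_size_cache,
                            **kwargs)
            except RuntimeError as exc:
                # The real implementation uses dedicated OOM checks; this
                # sketch matches on the exception message instead.
                if "out of memory" not in str(exc) or batch_size == 1:
                    raise
                batch_size //= 2  # halve and retry
    return wrapper

@batch_without_oom_error_sketch
def run_inference(samples, *, batch_size, disable_batch_size_cache=False):
    # Hypothetical workload: pretend anything above batch size 4 cannot fit.
    if batch_size > 4:
        raise RuntimeError("CUDA out of memory")
    return batch_size

print(run_inference(list(range(16)), batch_size=32))  # prints 4 (32 -> 16 -> 8 -> 4)
```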
def gc_cuda()
-   Garbage collect RAM & Torch (CUDA) memory.
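A minimal sketch of such a combined collector, assuming the standard combination of Python's garbage collector with `torch.cuda.empty_cache()`; the torch import is guarded so the sketch also runs on CPU-only machines:

```python
import gc

def gc_cuda_sketch():
    """Sketch: run the Python GC, then release cached CUDA memory if available."""
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass  # torch not installed; nothing CUDA-related to collect
```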
def get_gpu_information()
-   Get CUDA GPU related information, if a GPU exists.

    Returns
        Union[None, Tuple[int, float, float]]
            total available GPUs, total occupied GPU memory in GB, total available GPU memory in GB
        None
            if unable to get CUDA GPU related info

def get_total_available_ram()
-   Get total available RAM in GB.

    Returns
        float
            available RAM in GB
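Both memory probes can be sketched as follows. The use of `torch.cuda.mem_get_info` for GPU memory and `psutil` (with a POSIX fallback) for RAM are assumptions about how such values are obtained, not the library's actual implementation:

```python
import os

def get_gpu_information_sketch():
    """Sketch: (gpu_count, occupied_gb, available_gb), or None without CUDA."""
    try:
        import torch
        if not torch.cuda.is_available():
            return None
        gib = 1024 ** 3
        occupied = available = 0.0
        for i in range(torch.cuda.device_count()):
            free, total = torch.cuda.mem_get_info(i)  # bytes (free, total)
            occupied += (total - free) / gib
            available += free / gib
        return torch.cuda.device_count(), occupied, available
    except ImportError:
        return None

def get_total_available_ram_sketch() -> float:
    """Sketch: available system RAM in GB."""
    try:
        import psutil
        return psutil.virtual_memory().available / 1024 ** 3
    except ImportError:
        # POSIX-only fallback: free pages times page size
        return os.sysconf("SC_AVPHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 1024 ** 3
```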
def get_traceback(ignore_first=0, stack_context=5)
-   Get the traceback from the first to the latest call.

    Args
        ignore_first : int, optional
            number of initial frames to ignore. Defaults to 0.
        stack_context : int, optional
            maximum number of frames to include. Defaults to 5.

    Returns
        Tuple[Tuple]
            tuples of function call and code
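A sketch of how such (function call, code) tuples can be collected with the standard `traceback` module; the exact frame selection here is an assumption about the described semantics:

```python
import traceback

def get_traceback_sketch(ignore_first=0, stack_context=5):
    """Sketch: (function name, source line) pairs from the current call stack,
    skipping the first `ignore_first` frames, keeping at most `stack_context`."""
    frames = traceback.extract_stack()[:-1]  # drop this helper's own frame
    frames = frames[ignore_first:ignore_first + stack_context]
    return tuple((f.name, f.line) for f in frames)
```

Hashing such a tuple gives a stable key for "the same call site under the same code", which is what the Cache below keys on.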
def is_cuda_out_of_memory(exception)
-   Checks for a CUDA OOM error.
def is_cudnn_snafu(exception)
-   Checks for the cuDNN failure described in https://github.com/pytorch/pytorch/issues/4107.
def is_out_of_cpu_memory(exception)
-   Checks for a CPU OOM error.
def should_reduce_batch_size(exception)
-   Checks whether the batch size can be reduced, i.e. whether the exception is a known OOM condition.
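toma-style OOM checks typically match on the exception type and message; the exact strings below are assumptions modelled on PyTorch's error messages, and the `_sketch` functions are illustrative, not the library's:

```python
def is_cuda_out_of_memory_sketch(exc):
    """Sketch: CUDA allocator OOM (message-based check, an assumption)."""
    return isinstance(exc, RuntimeError) and "CUDA out of memory" in str(exc)

def is_cudnn_snafu_sketch(exc):
    """Sketch of the check for https://github.com/pytorch/pytorch/issues/4107,
    where an OOM surfaces as a cuDNN initialization failure."""
    return (isinstance(exc, RuntimeError)
            and "cuDNN error: CUDNN_STATUS_NOT_INITIALIZED" in str(exc))

def is_out_of_cpu_memory_sketch(exc):
    """Sketch: CPU allocator OOM raised by torch as a RuntimeError."""
    return (isinstance(exc, RuntimeError)
            and "DefaultCPUAllocator: can't allocate memory" in str(exc))

def should_reduce_batch_size_sketch(exc):
    """The batch size can be reduced if any known OOM condition matched."""
    return (is_cuda_out_of_memory_sketch(exc)
            or is_cudnn_snafu_sketch(exc)
            or is_out_of_cpu_memory_sketch(exc))

print(should_reduce_batch_size_sketch(RuntimeError("CUDA out of memory.")))  # True
print(should_reduce_batch_size_sketch(ValueError("bad input")))              # False
```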
Classes
class Cache
-   Cache that stores the optimal batch size for a specific configuration call, keyed on traceback and memory information.

    Initializes an empty cache.
Methods
def empty_cache(self)
-   Empties the cache.
def get_value(self, current_value, stacktrace, total_available_gpu_memory, total_available_ram_memory)
-   Tries to get a value for a particular set of hashes; returns current_value on a cache miss (this only happens during the initial call).

    Args
        current_value : Any
            current value
        stacktrace : Tuple
            stacktrace of the method call
        total_available_gpu_memory : float
            total available GPU memory
        total_available_ram_memory : float
            total available RAM

    Returns
        float
            current_value if the hash key is not present, else the value stored under the hash key
def is_empty(self)
-   Checks whether the cache is empty.

    Returns
        bool
            True if empty
def set_value(self, value, stacktrace, total_available_gpu_memory, total_available_ram_memory)
-   Sets a value for a particular combination of hashes.

    Args
        value : Any
            value to store
        stacktrace : Tuple
            stacktrace of the method call
        total_available_gpu_memory : float
            total available GPU memory
        total_available_ram_memory : float
            total available RAM
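The Cache interface above can be sketched as a dictionary keyed on a hash of the call site and the machine's memory state. The rounding of memory values (so near-identical states share one entry) is an assumption for the example, not the library's documented behavior:

```python
class CacheSketch:
    """Sketch: batch-size cache keyed on (stacktrace, free GPU GB, free RAM GB)."""

    def __init__(self):
        self._store = {}

    def _key(self, stacktrace, gpu_gb, ram_gb):
        # Round memory so near-identical states map to the same entry
        # (an assumption about how the hash is formed).
        return hash((stacktrace, round(gpu_gb), round(ram_gb)))

    def get_value(self, current_value, stacktrace, gpu_gb, ram_gb):
        # Cache miss returns current_value (only happens on the initial call).
        return self._store.get(self._key(stacktrace, gpu_gb, ram_gb), current_value)

    def set_value(self, value, stacktrace, gpu_gb, ram_gb):
        self._store[self._key(stacktrace, gpu_gb, ram_gb)] = value

    def is_empty(self):
        return not self._store

    def empty_cache(self):
        self._store.clear()

cache = CacheSketch()
trace = (("run_inference", "model(batch)"),)       # hypothetical stacktrace tuple
print(cache.get_value(32, trace, 10.0, 64.0))      # prints 32 (miss: initial call)
cache.set_value(8, trace, 10.0, 64.0)
print(cache.get_value(32, trace, 10.0, 64.0))      # prints 8 (cached batch size)
```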