Module llmsearch.utils.model_utils

Common Utilities for Models
Functions
def batcher(iterable, batch_size)

    Batch an iterable into batches of batch_size.

    Args:
        iterable (Iterable): iterable to batch
        batch_size (int): batch size

    Yields:
        Iterator: iterator over batches
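A minimal usage sketch, assuming batcher yields successive chunks of at most batch_size items:

```python
from llmsearch.utils.model_utils import batcher

# Assumption: batcher yields lists of up to batch_size items.
for batch in batcher(range(10), batch_size=4):
    print(batch)  # e.g. [0, 1, 2, 3], then [4, 5, 6, 7], then [8, 9]
```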
def decoder_parser(outputs, formatted_prompts, prepoc)

    Removes the prompt from the text and calls prepoc on the completion.

    Args:
        outputs (List[str]): model outputs
        formatted_prompts (List[str]): formatted prompts
        prepoc (callable): prepoc function applied to each completion

    Returns:
        List: processed outputs
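An illustrative call, assuming each output begins with its formatted prompt and prepoc receives the remaining completion; the inputs below are made up:

```python
from llmsearch.utils.model_utils import decoder_parser

outputs = ["Q: capital of France? A: Paris "]
formatted_prompts = ["Q: capital of France? A:"]
completions = decoder_parser(
    outputs=outputs,
    formatted_prompts=formatted_prompts,
    prepoc=str.strip,  # any callable applied to each completion
)
print(completions)  # expected: ["Paris"]
```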
def encoder_decoder_parser(outputs, prepoc)

    Applies the prepoc function on the completion.

    Args:
        outputs (List[str]): model outputs
        prepoc (callable): prepoc function

    Returns:
        List: processed outputs
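A similar sketch for encoder-decoder outputs, which contain only the completion and so need no prompt stripping; the list-shaped input here is an assumption based on the List return type:

```python
from llmsearch.utils.model_utils import encoder_decoder_parser

completions = encoder_decoder_parser(outputs=[" bonjour \n"], prepoc=str.strip)
print(completions)  # expected: ["bonjour"]
```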
def get_device()

    Get device, one of "cpu", "cuda", "mps".

    Returns:
        str: device string
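Typical use, e.g. to pick a device before placing a model:

```python
from llmsearch.utils.model_utils import get_device

device = get_device()
print(device)  # "cuda", "mps", or "cpu" depending on available hardware
```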
def run_inference(model, tokenizer, is_encoder_decoder, batch_size, disable_batch_size_cache, device, model_inputs, tokenizer_encode_args, tokenizer_decode_args, generation_args=None, disable_generation_param_checks=False, return_optimal_batch_size=False, output_preproc=<function <lambda>>, callbacks=None)

    Infer on data with a specific batch size.

    Args:
        model (AutoModelForSeq2SeqLM): model with a .generate method
        tokenizer (AutoTokenizer): tokenizer to tokenize the input
        is_encoder_decoder (bool): whether the model is an encoder-decoder model, False if not
        batch_size (int): batch size to run inference with; this gets dynamically reduced if the inference function encounters OOM errors
        disable_batch_size_cache (bool): if True, the pre-defined batch_size is used for each cross-validation run; this could lead to wasted computation time if OOM is raised
        device (str): device to run the inference on
        model_inputs (List): model inputs to do inference on
        tokenizer_encode_args (Dict): encoding arguments for the tokenizer
        tokenizer_decode_args (Dict, optional): decoding arguments for the tokenizer. Defaults to {'skip_special_tokens': True}.
        generation_args (Dict, optional): generation kwargs to use while generating the output. Defaults to None.
        disable_generation_param_checks (bool, optional): disables the custom generation parameter checks, which sanity-check the parameters and produce warnings before generation. Not stable right now.
        return_optimal_batch_size (bool, optional): whether the function should return the optimal batch size found, useful for caching when performing cross-validation. Defaults to False.
        output_preproc (Callable, optional): preprocessing to run on the completion; by default strips the output. Note that this is applied on the completion.
        callbacks (List, optional): list of callbacks to run after each generation. Defaults to None.

    Returns:
        Union[Tuple[List, int], List]: outputs, and optionally the best batch size
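A hedged end-to-end sketch; the model name, encode arguments, and generation settings below are illustrative choices rather than library defaults:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

from llmsearch.utils.model_utils import get_device, run_inference

model_name = "google/flan-t5-small"  # hypothetical model choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
device = get_device()
model.to(device)

outputs, best_batch_size = run_inference(
    model=model,
    tokenizer=tokenizer,
    is_encoder_decoder=True,
    batch_size=8,  # reduced automatically if OOM is hit, per the docs
    disable_batch_size_cache=False,
    device=device,
    model_inputs=["Translate English to French: hello"],
    tokenizer_encode_args={"padding": True, "truncation": True},  # assumed encode args
    tokenizer_decode_args={"skip_special_tokens": True},
    generation_args={"max_new_tokens": 32},  # illustrative generation kwargs
    return_optimal_batch_size=True,  # also return the batch size that worked
)
print(outputs, best_batch_size)
```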
def seed_everything(seed)

    Seed for reproducibility.
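Typical use at the top of a script:

```python
from llmsearch.utils.model_utils import seed_everything

seed_everything(42)  # fix random seeds so runs are reproducible
```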