I just put together a short Jupyter notebook with tips and tricks for reducing memory usage when loading ever-larger models (like LLMs) in PyTorch. These approaches become increasingly important as we try to make these models work on limited hardware configurations.
Here's the link to the notebook: https://lnkd.in/gt-VMGfv
By the way, the examples aren't just for LLMs. These techniques apply to any model in PyTorch.