Should use `mmap` for model loading · Issue #91 · ggerganov/llama.cpp

That way loading doesn't create an extra copy in RAM, and the model lives happily in the kernel page cache, loading instantly on subsequent runs.
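
For illustration, here is a minimal POSIX sketch of what read-only mapping of a model file could look like. The helper names `map_model_file` and `unmap_model_file` are hypothetical, not existing llama.cpp functions, and the sketch assumes the file can be used in place without any conversion after mapping.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of llama.cpp): map a whole file read-only.
 * Returns a pointer to the mapped bytes and stores the length in *size_out,
 * or returns NULL on any failure. */
const void *map_model_file(const char *path, size_t *size_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return NULL;
    }

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    /* The pages are backed by the kernel page cache, so nothing is copied
     * into a private heap buffer; pages are faulted in on first access. */
    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);

    /* The mapping stays valid after the descriptor is closed. */
    close(fd);

    if (addr == MAP_FAILED) {
        return NULL;
    }

    *size_out = (size_t)st.st_size;
    return addr;
}

/* Release a mapping obtained from map_model_file. */
void unmap_model_file(const void *addr, size_t size) {
    munmap((void *)addr, size);
}
```

A caller would read tensors straight out of the returned pointer instead of `read()`-ing them into freshly allocated buffers. On a second run the pages are typically still in the page cache, so the mapping is usable without going back to disk. Windows would need a different API (`CreateFileMapping` / `MapViewOfFile`).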
