Should use `mmap` for model loading · Issue #91 · ggerganov/llama.cpp
That way the model doesn’t create an extra copy in RAM: it lives in the kernel page cache, so it loads almost instantly on subsequent runs.
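To illustrate the idea, here is a minimal sketch of mapping a file read-only with POSIX `mmap`, assuming a Unix-like system. The `mapped_file` struct and the `map_file_readonly` / `unmap_file` helpers are hypothetical names made up for this example; they are not llama.cpp's loader API.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper type (not part of llama.cpp): a read-only view of a file. */
struct mapped_file {
    const void *data; /* start of the mapping, NULL if nothing is mapped */
    size_t size;      /* length of the mapping in bytes */
};

/* Map the whole file at `path` read-only. Returns 0 on success, -1 on failure. */
static int map_file_readonly(const char *path, struct mapped_file *out) {
    out->data = NULL;
    out->size = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd); /* empty files cannot be mapped, treat them as an error */
        return -1;
    }

    /* The pages are backed by the kernel page cache, so no private copy
     * of the file contents is allocated by the process. */
    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (addr == MAP_FAILED) return -1;

    out->data = addr;
    out->size = (size_t)st.st_size;
    return 0;
}

/* Release a mapping created by map_file_readonly. Safe to call twice. */
static void unmap_file(struct mapped_file *mf) {
    if (mf->data != NULL) {
        munmap((void *)mf->data, mf->size);
        mf->data = NULL;
        mf->size = 0;
    }
}
```

With a mapping like this, the bytes are only read from disk when they are first touched, and a later run of the process can reuse pages that are still in the page cache. Whether a given model file format can be used directly from such a mapping, without a conversion pass over the data, is a separate question that this sketch does not address.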