We run Ollama models on GPU servers that have an NVMe disk attached. Will Ollama models work better from the NVMe disk or from the OS disk? We also use auto-shutdown and auto-start to save costs.

Jay Namdeo 0 Reputation points
2025-04-24T10:25:41.7633333+00:00

We're running Ollama models on GPU servers with an NVMe disk attached. To optimize performance, we're considering whether to store models on the NVMe or the OS disk. Since Ollama frequently accesses model files, using the high-speed NVMe disk will improve load times and inference performance compared to standard OS disks. We're also using auto-shutdown and auto-start to save costs, and storing models on the persistent NVMe ensures models are readily available at startup without re-downloading.

Developer technologies | XAML
A language based on Extensible Markup Language (XML) that enables developers to specify a hierarchy of objects with a set of properties and logic.

1 answer

  1. Harry Vo (WICLOUD CORPORATION) 3,910 Reputation points Microsoft External Staff Moderator
    2025-09-15T10:04:36.3333333+00:00

    Hi @Jay Namdeo, sorry for the late response!

    As I understand it, Ollama stores model weights on disk. When a model starts, those files are read from disk and copied into system RAM and then into GPU VRAM. After that, most operations happen in VRAM, so disk speed should not affect inference unless the model is unloaded and reloaded.
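To check how much the disk matters for load time on your own servers, you could time a sequential read of the same model file from each disk. The following is a minimal sketch, not a rigorous benchmark: the function name `measure_read_throughput` and the example paths are hypothetical, and the OS page cache can inflate results on repeated runs, so the first read after a fresh boot is the most representative.

```python
import time


def measure_read_throughput(path, chunk_size=8 * 1024 * 1024):
    """Read the file at `path` sequentially and return throughput in MiB/s."""
    total_bytes = 0
    start = time.perf_counter()
    # buffering=0 disables Python-level buffering; the OS page cache may
    # still serve repeated reads of the same file.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return (total_bytes / (1024 * 1024)) / max(elapsed, 1e-9)


# Example usage (hypothetical paths; copy the same model file to both disks):
# print(measure_read_throughput("/mnt/nvme/model.bin"))
# print(measure_read_throughput("/var/tmp/model.bin"))
```

A large gap between the two numbers suggests model start-up will be noticeably faster from the faster disk; it says nothing about inference speed once the weights are in VRAM.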

    So your choice really depends on how often the model is started. Since you mentioned using auto-shutdown and auto-start, it sounds like the model is reloaded quite often. In that case, I'd strongly recommend NVMe. Keeping the model files on a separate NVMe disk speeds up reads and, provided the disk is a persistent data disk rather than ephemeral local storage (which is typically wiped when the VM is deallocated), keeps them available across restarts independently of the OS disk. That makes things a lot easier to manage.
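One way to point Ollama at the NVMe disk is the `OLLAMA_MODELS` environment variable, which sets the directory where Ollama stores models. The sketch below builds an environment for launching `ollama serve` with that variable set. The helper name `build_ollama_env` and the mount path `/mnt/nvme/ollama-models` are assumptions for illustration; adjust them to your setup, and if Ollama runs as a service, set the variable in the service configuration instead.

```python
import os
from pathlib import Path


def build_ollama_env(models_dir, base_env=None):
    """Return a copy of the environment with OLLAMA_MODELS set to `models_dir`.

    Raises FileNotFoundError if the directory is missing, which helps catch
    an NVMe disk that was not mounted after an automatic start.
    """
    env = dict(os.environ if base_env is None else base_env)
    models_path = Path(models_dir)
    if not models_path.is_dir():
        raise FileNotFoundError(f"Model directory not found: {models_path}")
    env["OLLAMA_MODELS"] = str(models_path)
    return env


# Example usage (hypothetical mount path, not executed here):
# import subprocess
# env = build_ollama_env("/mnt/nvme/ollama-models")
# subprocess.run(["ollama", "serve"], env=env, check=True)
```

Failing early when the directory is absent avoids Ollama silently starting with an empty model store and re-downloading models after a restart.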

    I hope this helps. If my suggestions solve your issue, feel free to mark the answer as accepted.

    Thank you!

