Monitoring Nvidia GPUs using a REST API
In this article I will show how to get information about the state of an Nvidia GPU through a REST API.
We will use https://github.com/lampaa/nvidia-smi-rest. SmiRest works through nvidia-smi: it collects all available information about the GPU and outputs it in JSON format (more about the format here).
After launch, the REST API is available at http://localhost:8176
Get the driver and CUDA versions and a list of all GPUs (/v1): http://localhost:8176/v1
Get information about one GPU (/v1/GPU_UUID): http://localhost:8176/v1/GPU-a9685d8a-fbf2-7465-ee1f-307141ef06a8
Get information about a specific field of one GPU (/v1/GPU_UUID/field[/subfield][/subsubfield]): http://localhost:8176/v1/GPU-a9685d8a-fbf2-7465-ee1f-307141ef06a8/pci/pciGpuLinkInfo/pcieGen
The service also collects per-minute statistics and can display them as charts.
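The /v1 endpoints above can be queried from any HTTP client. The following is a minimal sketch using the standard java.net.http client (Java 11 or later). The class SmiRestClient and its methods v1 and get are hypothetical names introduced for illustration and are not part of nvidia-smi-rest. The sketch also assumes that a successful request returns HTTP status 200, and it prints the raw response body without assuming anything about the JSON layout.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper for illustration; not part of nvidia-smi-rest itself.
final class SmiRestClient {
    // Base URL taken from the article; adjust if the service runs elsewhere.
    static final String BASE = "http://localhost:8176";

    // Builds BASE/v1[/GPU_UUID][/field[/subfield]...] from the given path segments.
    static URI v1(String... segments) {
        StringBuilder sb = new StringBuilder(BASE).append("/v1");
        for (String segment : segments) {
            sb.append('/').append(segment);
        }
        return URI.create(sb.toString());
    }

    // Sends a GET request and returns the response body as text.
    static String get(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // Assumption: the service answers successful requests with status 200.
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status "
                    + response.statusCode() + " for " + uri);
        }
        return response.body();
    }

    public static void main(String[] args) throws InterruptedException {
        // Example UUID from the article; replace it with one of your own GPUs.
        String uuid = "GPU-a9685d8a-fbf2-7465-ee1f-307141ef06a8";
        try {
            // Driver and CUDA versions plus the list of all GPUs.
            System.out.println(get(v1()));
            // A single nested field of one GPU.
            System.out.println(get(v1(uuid, "pci", "pciGpuLinkInfo", "pcieGen")));
        } catch (IOException e) {
            System.err.println("Request failed (is the service running?): "
                    + e.getMessage());
        }
    }
}
```

Keeping the URL construction in a separate method makes it easy to request any field path without repeating string concatenation at each call site.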
Get statistics for all GPUs (/stats): http://localhost:8176/stats
Get statistics for one GPU (/stats/GPU_UUID): http://localhost:8176/stats/GPU-a9685d8a-fbf2-7465-ee1f-307141ef06a8
Display graphs for all GPUs (/stats/graphs): http://localhost:8176/stats/graphs
You can also add the library for working with nvidia-smi to your Maven project:
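Since the statistics are collected per minute, a monitoring script would typically poll the /stats endpoints at a similar interval. Below is a rough sketch of such a poller. StatsPoller, statsUri, poll and fetch are hypothetical names introduced for this example, not part of nvidia-smi-rest. The response format is not assumed, so the body is handled as raw text.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical poller for illustration; not part of nvidia-smi-rest itself.
final class StatsPoller {
    // Base URL taken from the article; adjust if the service runs elsewhere.
    static final String BASE = "http://localhost:8176";

    // Returns /stats for all GPUs, or /stats/GPU_UUID when a UUID is given.
    static URI statsUri(String gpuUuid) {
        String path = (gpuUuid == null || gpuUuid.isEmpty())
                ? "/stats"
                : "/stats/" + gpuUuid;
        return URI.create(BASE + path);
    }

    // Calls fetch `times` times, waiting intervalMillis between calls,
    // and hands every result to sink. Stops early if the thread is interrupted.
    static void poll(int times, long intervalMillis,
                     Supplier<String> fetch, Consumer<String> sink) {
        for (int i = 0; i < times; i++) {
            sink.accept(fetch.get());
            if (i < times - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    // Fetches the raw response body, or a short error message on failure.
    static String fetch(URI uri) {
        try {
            HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
            return HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString())
                    .body();
        } catch (IOException e) {
            return "request failed: " + e.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "request interrupted";
        }
    }
}
```

For example, StatsPoller.poll(5, 60_000, () -> StatsPoller.fetch(StatsPoller.statsUri(null)), System.out::println) would print the statistics for all GPUs five times, one minute apart. Passing the fetch step as a Supplier keeps the polling loop independent of the HTTP call, so it can be exercised without a running service.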
<dependency>
    <groupId>com.github.lampaa</groupId>
    <artifactId>smirest</artifactId>
    <version>0.0.2</version>
</dependency>
and then read the data:
package com.github.lampaa.smi;

import com.github.lampaa.smi.dtoV1.NvidiaSmiLogType;
import jakarta.xml.bind.JAXBException;
// Assumption: JUnit 5 is on the test classpath; with JUnit 4 import org.junit.Test instead.
import org.junit.jupiter.api.Test;

class Application {
    @Test
    public void readingTest() throws JAXBException {
        // Read the current GPU state from the system via nvidia-smi
        NvidiaSmiLogType smiLog = SmiReader.read();
    }
}