User Guide:

  • Overview
    • Terminology
    • Focus Areas
      • Provide robust, online health and diagnostics
      • Enable job-level statistics and continuous GPU telemetry
      • Manage GPUs as collections of related resources
      • Configure NVSwitches
      • Define and enforce GPU configuration state
      • Automate GPU management policies
    • Target Users
  • Getting Started
    • Supported Platforms
      • Supported Linux Distributions
    • Installation
      • System Requirements
      • Pre-Requisites
      • Installation
        • Ubuntu LTS and Debian
        • RHEL / CentOS / Rocky Linux
        • SUSE SLES / OpenSUSE
      • Post-Install
    • Basic Components
      • DCGM shared library
      • NVIDIA Host Engine
      • DCGM CLI Tool
      • Python Bindings
      • Software Development Kit
    • Modes of Operation
      • Embedded Mode
      • Standalone Mode
    • Static Library
  • Feature Overview
    • Groups
    • Configuration
    • Policy
      • Notifications
      • Actions
    • Job Statistics
    • Health and Diagnostics
      • Background Health Checks
      • Active Health Checks
    • Topology
    • NVLink Counters
    • Field Groups
    • Link Status
    • CPU Monitoring
      • Overview
      • Requirements
      • Introduction
      • CPU And Core Fields
      • Examples
    • Profiling Metrics
      • Metrics
      • Multiplexing of Profiling Counters
      • Concurrent Usage of NVIDIA Profiling Tools
      • CUDA Test Generator (dcgmproftester)
      • Metrics on Multi-Instance GPU
        • Example 1
        • Understanding Metrics
        • Platform Support
  • DCGM Diagnostics
    • Overview
      • DCGM Diagnostic Goals
      • Beyond the Scope of the DCGM Diagnostics
      • Run Levels and Tests
    • Getting Started with DCGM Diagnostics
      • Command Line Options
      • Configuration File
      • Exit Codes
      • Usage Examples
        • Custom Configuration File
        • Tests and Parameters
        • Iterations
    • Logging
    • Overview of Plugins
      • Deployment Plugin
        • Preconditions
        • Configuration Parameters
        • Stat Outputs
        • Failure
      • Diagnostic Plugin
        • Overview
        • Test Description
        • Supported Parameters
        • Sample Commands
        • Failure Conditions
      • PCIe - GPU Bandwidth Plugin
        • Overview
        • Preconditions
        • Sub tests
      • Targeted Power Plugin
        • Overview
        • Test Description
        • Supported Parameters
        • Sample Commands
        • Failure Conditions
      • Memtest Diagnostic
        • Overview
        • Test Descriptions
        • Supported Parameters
        • Sample Commands
      • Pulse Test Diagnostic
        • Overview
        • Test Description
        • Sample Commands
        • Failure Conditions
      • Extended Utility Diagnostics (EUD)
        • Supported Products
        • Included Tests
        • Getting Started with EUD
        • Supported Deployment Models
        • Running the EUD
        • Logging
      • CPU Extended Utility Diagnostics (CPU EUD)
        • Supported Products
        • Included Tests
        • Getting Started with CPU EUD
        • Running the CPU EUD
        • Logging
        • Command Usage
      • Automating Responses to DCGM Diagnostic Failures
        • Overview
  • DCGM Modularity
    • Module List
    • Disabling Modules
  • Error Injection
    • Overview
      • Error Injection Workflow
      • Field Identifiers
      • Examples with dcgmi
        • Thermal Violation
        • PCIe Replay Errors
        • ECC Errors
      • API Examples
  • Debugging and Troubleshooting
    • General Problem Reporting
    • Logging
      • Enable Logging Using Standalone Hostengine
      • Enable Logging Using Embedded Hostengine
      • Enable Diagnostic Logging
      • Enable NVML Logging

API Reference:

  • Modules
    • Administrative
      • Init and Shutdown
        • dcgmInit()
        • dcgmShutdown()
        • dcgmStartEmbedded()
        • dcgmStartEmbedded_v2()
        • dcgmStopEmbedded()
        • dcgmConnect()
        • dcgmConnect_v2()
        • dcgmDisconnect()
      • Auxiliary information about DCGM engine
        • dcgmVersionInfo()
        • dcgmHostengineVersionInfo()
        • dcgmHostengineSetLoggingSeverity()
        • dcgmHostengineIsHealthy()
        • errorString()
        • dcgmModuleIdToName()
    • System
      • dcgmGetAllDevices()
      • dcgmGetAllSupportedDevices()
      • dcgmGetDeviceAttributes()
      • dcgmGetDeviceWorkloadPowerProfileInfo()
      • dcgmGetEntityGroupEntities()
      • dcgmGetGpuChipArchitecture()
      • dcgmGetGpuInstanceHierarchy()
      • dcgmGetNvLinkLinkStatus()
      • dcgmGetCpuHierarchy()
      • dcgmGetCpuHierarchy_v2()
      • dcgmGroupCreate()
      • dcgmGroupDestroy()
      • dcgmGroupAddDevice()
      • dcgmGroupAddEntity()
      • dcgmGroupRemoveDevice()
      • dcgmGroupRemoveEntity()
      • dcgmGroupGetInfo()
      • dcgmGroupGetAllIds()
      • dcgmFieldGroupCreate()
      • dcgmFieldGroupDestroy()
      • dcgmFieldGroupGetInfo()
      • dcgmFieldGroupGetAll()
      • dcgmStatusCreate()
      • dcgmStatusDestroy()
      • dcgmStatusGetCount()
      • dcgmStatusPopError()
      • dcgmStatusClear()
      • Discovery
      • Grouping
      • Field Grouping
      • Status Handling
    • Configuration
      • dcgmConfigSet()
      • dcgmConfigGet()
      • dcgmConfigEnforce()
      • Setup and Management
      • Manual Invocation
    • Field APIs
      • dcgmWatchFields()
      • dcgmUnwatchFields()
      • dcgmGetValuesSince()
      • dcgmGetValuesSince_v2()
      • dcgmGetLatestValues()
      • dcgmGetLatestValues_v2()
      • dcgmGetLatestValuesForFields()
      • dcgmEntityGetLatestValues()
      • dcgmEntitiesGetLatestValues()
      • dcgmGetFieldSummary()
    • Process Statistics
      • dcgmWatchPidFields()
      • dcgmGetPidInfo()
    • Job Statistics
      • dcgmWatchJobFields()
      • dcgmJobStartStats()
      • dcgmJobStopStats()
      • dcgmJobGetStats()
      • dcgmJobRemove()
      • dcgmJobRemoveAll()
    • Health Monitor
      • dcgmHealthSet()
      • dcgmHealthSet_v2()
      • dcgmHealthGet()
      • dcgmHealthCheck()
    • Policies
      • dcgmPolicySet()
      • dcgmPolicyGet()
      • dcgmPolicyRegister_v2()
      • dcgmPolicyUnregister()
      • dcgmActionValidate()
      • dcgmActionValidate_v2()
      • dcgmRunDiagnostic()
      • Setup and Management
      • Manual Invocation
    • Topology
      • dcgmGetDeviceTopology()
      • dcgmGetGroupTopology()
    • Metadata
      • dcgmIntrospectGetHostengineMemoryUsage()
      • dcgmIntrospectGetHostengineCpuUtilization()
    • Topology
      • dcgmSelectGpusByTopology()
    • Modules
      • dcgmModuleDenylist()
      • dcgmModuleGetStatuses()
    • Profiling
      • dcgmProfGetSupportedMetricGroups()
      • dcgmProfPause()
      • dcgmProfResume()
    • Enums and Macros
      • MAKE_DCGM_VERSION
      • DCGM_BLANK_VALUES
      • DCGM_INT8_BLANK
      • DCGM_INT32_BLANK
      • DCGM_INT64_BLANK
      • DCGM_FP64_BLANK
      • DCGM_STR_BLANK
      • DCGM_INT32_NOT_FOUND
      • DCGM_INT64_NOT_FOUND
      • DCGM_FP64_NOT_FOUND
      • DCGM_STR_NOT_FOUND
      • DCGM_INT32_NOT_SUPPORTED
      • DCGM_INT64_NOT_SUPPORTED
      • DCGM_FP64_NOT_SUPPORTED
      • DCGM_STR_NOT_SUPPORTED
      • DCGM_INT32_NOT_PERMISSIONED
      • DCGM_INT64_NOT_PERMISSIONED
      • DCGM_FP64_NOT_PERMISSIONED
      • DCGM_STR_NOT_PERMISSIONED
      • DCGM_INT8_IS_BLANK
      • DCGM_INT32_IS_BLANK
      • DCGM_INT64_IS_BLANK
      • DCGM_FP64_IS_BLANK
      • DCGM_STR_IS_BLANK
      • DCGM_MAX_NUM_DEVICES
      • DCGM_NVLINK_MAX_LINKS_PER_GPU
      • DCGM_NVLINK_ERROR_COUNT
      • DCGM_HEALTH_WATCH_NVLINK_ERROR_NUM_FIELDS
      • DCGM_NVLINK_MAX_LINKS_PER_GPU_LEGACY1
      • DCGM_NVLINK_MAX_LINKS_PER_GPU_LEGACY2
      • DCGM_MAX_NUM_SWITCHES
      • DCGM_MAX_XID_INFO
      • DCGM_NVLINK_MAX_LINKS_PER_NVSWITCH
      • DCGM_LANE_MAX_LANES_PER_NVSWICH_LINK
      • DCGM_MAX_VGPU_INSTANCES_PER_PGPU
      • DCGM_MAX_NUM_CPUS
      • DCGM_MAX_NUM_CPU_CORES
      • DCGM_MAX_STR_LENGTH
      • DCGM_MAX_AGE_USEC_DEFAULT
      • DCGM_MAX_CLOCKS
      • DCGM_MAX_NUM_GROUPS
      • DCGM_MAX_FBC_SESSIONS
      • DCGM_VGPU_NAME_BUFFER_SIZE
      • DCGM_GRID_LICENSE_BUFFER_SIZE
      • DCGM_CONFIG_COMPUTEMODE_DEFAULT
      • DCGM_CONFIG_COMPUTEMODE_PROHIBITED
      • DCGM_CONFIG_COMPUTEMODE_EXCLUSIVE_PROCESS
      • DCGM_HE_PORT_NUMBER
      • DCGM_GROUP_ALL_GPUS
      • DCGM_GROUP_ALL_NVSWITCHES
      • DCGM_GROUP_ALL_INSTANCES
      • DCGM_GROUP_ALL_COMPUTE_INSTANCES
      • DCGM_GROUP_ALL_ENTITIES
      • DCGM_GROUP_NULL
      • DCGM_GROUP_MAX_ENTITIES_V1
      • DCGM_GROUP_MAX_ENTITIES_V2
      • dcgmOperationMode_t
      • dcgmOrder_t
      • dcgmReturn_t
      • dcgmGroupType_t
      • dcgmChipArchitecture_t
      • dcgmConfigType_t
      • dcgmConfigPowerLimitType_t
      • dcgmOperationMode_enum
        • DCGM_OPERATION_MODE_AUTO
        • DCGM_OPERATION_MODE_MANUAL
      • dcgmOrder_enum
        • DCGM_ORDER_ASCENDING
        • DCGM_ORDER_DESCENDING
      • dcgmReturn_enum
        • DCGM_ST_OK
        • DCGM_ST_BADPARAM
        • DCGM_ST_GENERIC_ERROR
        • DCGM_ST_MEMORY
        • DCGM_ST_NOT_CONFIGURED
        • DCGM_ST_NOT_SUPPORTED
        • DCGM_ST_INIT_ERROR
        • DCGM_ST_NVML_ERROR
        • DCGM_ST_PENDING
        • DCGM_ST_UNINITIALIZED
        • DCGM_ST_TIMEOUT
        • DCGM_ST_VER_MISMATCH
        • DCGM_ST_UNKNOWN_FIELD
        • DCGM_ST_NO_DATA
        • DCGM_ST_STALE_DATA
        • DCGM_ST_NOT_WATCHED
        • DCGM_ST_NO_PERMISSION
        • DCGM_ST_GPU_IS_LOST
        • DCGM_ST_RESET_REQUIRED
        • DCGM_ST_FUNCTION_NOT_FOUND
        • DCGM_ST_CONNECTION_NOT_VALID
        • DCGM_ST_GPU_NOT_SUPPORTED
        • DCGM_ST_GROUP_INCOMPATIBLE
        • DCGM_ST_MAX_LIMIT
        • DCGM_ST_LIBRARY_NOT_FOUND
        • DCGM_ST_DUPLICATE_KEY
        • DCGM_ST_GPU_IN_SYNC_BOOST_GROUP
        • DCGM_ST_GPU_NOT_IN_SYNC_BOOST_GROUP
        • DCGM_ST_REQUIRES_ROOT
        • DCGM_ST_NVVS_ERROR
        • DCGM_ST_INSUFFICIENT_SIZE
        • DCGM_ST_FIELD_UNSUPPORTED_BY_API
        • DCGM_ST_MODULE_NOT_LOADED
        • DCGM_ST_IN_USE
        • DCGM_ST_GROUP_IS_EMPTY
        • DCGM_ST_PROFILING_NOT_SUPPORTED
        • DCGM_ST_PROFILING_LIBRARY_ERROR
        • DCGM_ST_PROFILING_MULTI_PASS
        • DCGM_ST_DIAG_ALREADY_RUNNING
        • DCGM_ST_DIAG_BAD_JSON
        • DCGM_ST_DIAG_BAD_LAUNCH
        • DCGM_ST_DIAG_UNUSED
        • DCGM_ST_DIAG_THRESHOLD_EXCEEDED
        • DCGM_ST_INSUFFICIENT_DRIVER_VERSION
        • DCGM_ST_INSTANCE_NOT_FOUND
        • DCGM_ST_COMPUTE_INSTANCE_NOT_FOUND
        • DCGM_ST_CHILD_NOT_KILLED
        • DCGM_ST_3RD_PARTY_LIBRARY_ERROR
        • DCGM_ST_INSUFFICIENT_RESOURCES
        • DCGM_ST_PLUGIN_EXCEPTION
        • DCGM_ST_NVVS_ISOLATE_ERROR
        • DCGM_ST_NVVS_BINARY_NOT_FOUND
        • DCGM_ST_NVVS_KILLED
        • DCGM_ST_PAUSED
        • DCGM_ST_ALREADY_INITIALIZED
        • DCGM_ST_NVML_NOT_LOADED
        • DCGM_ST_NVML_DRIVER_TIMEOUT
        • DCGM_ST_NVVS_NO_AVAILABLE_TEST
      • dcgmGroupType_enum
        • DCGM_GROUP_DEFAULT
        • DCGM_GROUP_EMPTY
        • DCGM_GROUP_DEFAULT_NVSWITCHES
        • DCGM_GROUP_DEFAULT_INSTANCES
        • DCGM_GROUP_DEFAULT_COMPUTE_INSTANCES
        • DCGM_GROUP_DEFAULT_EVERYTHING
      • dcgmChipArchitecture_enum
        • DCGM_CHIP_ARCH_OLDER
        • DCGM_CHIP_ARCH_KEPLER
        • DCGM_CHIP_ARCH_MAXWELL
        • DCGM_CHIP_ARCH_PASCAL
        • DCGM_CHIP_ARCH_VOLTA
        • DCGM_CHIP_ARCH_TURING
        • DCGM_CHIP_ARCH_AMPERE
        • DCGM_CHIP_ARCH_ADA
        • DCGM_CHIP_ARCH_HOPPER
        • DCGM_CHIP_ARCH_BLACKWELL
        • DCGM_CHIP_ARCH_COUNT
        • DCGM_CHIP_ARCH_UNKNOWN
      • dcgmConfigType_enum
        • DCGM_CONFIG_TARGET_STATE
        • DCGM_CONFIG_CURRENT_STATE
      • dcgmConfigPowerLimitType_enum
        • DCGM_CONFIG_POWER_CAP_INDIVIDUAL
        • DCGM_CONFIG_POWER_BUDGET_GROUP
      • errorString()
    • Structure Definitions
      • DCGM_HOME_DIR_VAR_NAME
      • DCGM_RUN_FLAGS_VERBOSE
      • DCGM_RUN_FLAGS_STATSONFAIL
      • DCGM_RUN_FLAGS_TRAIN
      • DCGM_RUN_FLAGS_FORCE_TRAIN
      • DCGM_RUN_FLAGS_FAIL_EARLY
      • DCGM_TOPO_HINT_F_NONE
      • DCGM_TOPO_HINT_F_IGNOREHEALTH
      • dcgmConnectV2Params_version1
      • dcgmConnectV2Params_version2
      • dcgmConnectV2Params_version
      • dcgmHostengineHealth_version1
      • dcgmHostengineHealth_version
      • dcgmGroupInfo_version2
      • dcgmGroupInfo_version3
      • dcgmGroupInfo_version
      • DCGM_MAX_INSTANCES_PER_GPU
      • DCGM_MAX_COMPUTE_INSTANCES_PER_GPU
      • DCGM_MAX_TOTAL_INSTANCES_PER_GPU
      • DCGM_MAX_HIERARCHY_INFO
      • DCGM_MAX_INSTANCES
      • DCGM_MAX_COMPUTE_INSTANCES
      • dcgmMigHierarchy_version2
      • dcgmMigHierarchy_version
      • DCGM_CPU_CORE_BITMASK_COUNT_V1
      • dcgmCpuHierarchyOwnedCores_version1
      • dcgmCpuHierarchy_version1
      • dcgmCpuHierarchy_version2
      • dcgmCpuHierarchy_version
      • DCGM_MAX_NUM_FIELD_GROUPS
      • DCGM_MAX_FIELD_IDS_PER_FIELD_GROUP
      • dcgmFieldGroupInfo_version1
      • dcgmFieldGroupInfo_version
      • dcgmAllFieldGroup_version1
      • dcgmAllFieldGroup_version
      • dcgmClockSet_version1
      • dcgmClockSet_version
      • dcgmDeviceSupportedClockSets_version1
      • dcgmDeviceSupportedClockSets_version
      • dcgmDevicePidAccountingStats_version1
      • dcgmDevicePidAccountingStats_version
      • dcgmDeviceThermals_version1
      • dcgmDeviceThermals_version
      • dcgmDevicePowerLimits_version1
      • dcgmDevicePowerLimits_version
      • dcgmDeviceIdentifiers_version1
      • dcgmDeviceIdentifiers_version
      • dcgmDeviceMemoryUsage_version1
      • dcgmDeviceMemoryUsage_version
      • dcgmDeviceVgpuUtilInfo_version1
      • dcgmDeviceVgpuUtilInfo_version
      • dcgmDeviceEncStats_version1
      • dcgmDeviceEncStats_version
      • dcgmDeviceFbcStats_version1
      • dcgmDeviceFbcStats_version
      • dcgmDeviceFbcSessionInfo_version1
      • dcgmDeviceFbcSessionInfo_version
      • dcgmDeviceFbcSessions_version1
      • dcgmDeviceFbcSessions_version
      • dcgmDeviceVgpuEncSessions_version1
      • dcgmDeviceVgpuEncSessions_version
      • dcgmDeviceVgpuProcessUtilInfo_version1
      • dcgmDeviceVgpuProcessUtilInfo_version
      • dcgmDeviceVgpuTypeInfo_version1
      • dcgmDeviceVgpuTypeInfo_version2
      • dcgmDeviceVgpuTypeInfo_version
      • dcgmDeviceSupportedVgpuTypeInfo_version1
      • dcgmDeviceSupportedVgpuTypeInfo_version
      • dcgmDeviceSettings_version2
      • dcgmDeviceSettings_version
      • dcgmDeviceAttributes_version3
      • dcgmDeviceAttributes_version
      • dcgmDeviceMigAttributesInfo_version1
      • dcgmDeviceMigAttributesInfo_version
      • dcgmDeviceMigAttributes_version1
      • dcgmDeviceMigAttributes_version
      • dcgmGpuInstanceProfileInfo_version1
      • dcgmGpuInstanceProfileInfo_version
      • dcgmGpuInstanceProfiles_version1
      • dcgmGpuInstanceProfiles_version
      • dcgmComputeInstanceProfileInfo_version1
      • dcgmComputeInstanceProfileInfo_version
      • dcgmComputeInstanceProfiles_version1
      • dcgmComputeInstanceProfiles_version
      • DCGM_MAX_VGPU_TYPES_PER_PGPU
      • DCGM_DEVICE_UUID_BUFFER_SIZE
      • DCGM_POWER_PROFILE_ARRAY_SIZE
      • DCGM_POWER_PROFILE_MASK_BITS_PER_ELEM
      • DCGM_POWER_PROFILE_MAX_NUM
      • dcgmWorkloadPowerProfileInfo_version1
      • dcgmWorkloadPowerProfileInfo_version
      • dcgmWorkloadPowerProfileProfilesInfo_version1
      • dcgmWorkloadPowerProfileProfilesInfo_version
      • dcgmDeviceWorkloadPowerProfilesStatus_version1
      • dcgmDeviceWorkloadPowerProfilesStatus_version
      • dcgmConfig_version1
      • dcgmConfig_version2
      • dcgmConfig_version
      • dcgmPolicyViolation_version1
      • dcgmPolicyViolation_version
      • DCGM_POLICY_COND_IDX_MAX
      • DCGM_POLICY_COND_MAX
      • dcgmPolicy_version1
      • dcgmPolicy_version
      • dcgmPolicyCallbackResponse_version2
      • dcgmPolicyCallbackResponse_version
      • DCGM_MAX_BLOB_LENGTH
      • dcgmFieldValue_version1
      • dcgmFieldValue_version2
      • DCGM_FV_FLAG_LIVE_DATA
      • DCGM_HEALTH_WATCH_COUNT_V1
      • DCGM_HEALTH_WATCH_COUNT_V2
      • DCGM_ERR_MSG_LENGTH
      • DCGM_DIAG_AUX_DATA_LEN
      • dcgmDiagTestAuxData_version1
      • dcgmDiagTestAuxData_version
      • DCGM_DIAG_TEST_RUN_ERROR_INDICES_MAX
      • DCGM_DIAG_TEST_RUN_INFO_INDICES_MAX
      • DCGM_DIAG_TEST_RUN_INFO_INDICES_MAX_V2
      • DCGM_DIAG_TEST_RUN_RESULTS_MAX
      • DCGM_DIAG_TEST_RUN_NAME_LEN
      • DCGM_DEVICE_ID_LEN
      • DCGM_VERSION_LEN
      • DCGM_HEALTH_WATCH_MAX_INCIDENTS_V2
      • dcgmHealthResponse_version5
      • dcgmHealthResponse_version
      • dcgmHealthSetParams_version2
      • DCGM_MAX_PID_INFO_NUM
      • dcgmPidInfo_version2
      • dcgmPidInfo_version
      • dcgmJobInfo_version3
      • dcgmJobInfo_version
      • dcgmRunningProcess_version1
      • dcgmRunningProcess_version
      • DCGM_MAX_ERRORS
      • DCGM_SM_PERF_INDEX
      • DCGM_TARGETED_PERF_INDEX
      • DCGM_PER_GPU_TEST_COUNT_V8
      • DCGM_PER_GPU_TEST_COUNT_V7
      • DCGM_SWTEST_COUNT
      • LEVEL_ONE_MAX_RESULTS
      • DCGM_DIAG_RESPONSE_TESTS_MAX
      • DCGM_DIAG_RESPONSE_SYSTEM_ERROR
      • DCGM_DIAG_RESPONSE_ERRORS_MAX
      • DCGM_DIAG_RESPONSE_INFO_MAX
      • DCGM_DIAG_RESPONSE_INFO_MAX_V2
      • DCGM_DIAG_RESPONSE_ENTITIES_MAX
      • DCGM_DIAG_RESPONSE_RESULTS_MAX
      • DCGM_DIAG_RESPONSE_CATEGORIES_MAX
      • DCGM_DIAG_RESPONSE_CATEGORY_LEN
      • DCGM_DIAG_RESPONSE_V11_UNUSED_LEN
      • DCGM_DIAG_RESPONSE_V12_UNUSED_LEN
      • dcgmDiagResponse_version12
      • dcgmDiagResponse_version11
      • dcgmDiagResponse_version10
      • dcgmDiagResponse_version9
      • dcgmDiagResponse_version8
      • dcgmDiagResponse_version7
      • dcgmDiagResponse_version
      • dcgmDiagStatus_version1
      • dcgmDiagStatus_version
      • DCGM_TOPOLOGY_PATH_PCI
      • DCGM_TOPOLOGY_PATH_NVLINK
      • DCGM_AFFINITY_BITMASK_ARRAY_SIZE
      • dcgmDeviceTopology_version1
      • dcgmDeviceTopology_version
      • dcgmGroupTopology_version1
      • dcgmGroupTopology_version
      • dcgmIntrospectMemory_version1
      • dcgmIntrospectMemory_version
      • dcgmIntrospectCpuUtil_version1
      • dcgmIntrospectCpuUtil_version
      • DCGM_MAX_CONFIG_FILE_LEN
      • DCGM_MAX_TEST_NAMES
      • DCGM_MAX_TEST_NAMES_LEN
      • DCGM_MAX_TEST_PARMS
      • DCGM_MAX_TEST_PARMS_LEN
      • DCGM_MAX_TEST_PARMS_LEN_V2
      • DCGM_GPU_LIST_LEN
      • DCGM_ENTITY_ID_LIST_LEN
      • DCGM_EXPECTED_ENTITIES_LEN
      • DCGM_FILE_LEN
      • DCGM_PATH_LEN
      • DCGM_CLOCKS_EVENT_MASK_LEN
      • DCGM_IGNORE_ERROR_MAX_LEN
      • DCGM_THROTTLE_MASK_LEN
      • dcgmRunDiag_version7
      • dcgmRunDiag_version8
      • dcgmRunDiag_version9
      • dcgmRunDiag_version10
      • DCGM_GEGE_FLAG_ONLY_SUPPORTED
      • dcgmTopoSchedHint_version1
      • dcgmNvLinkStatus_version4
      • DCGM_SUMMARY_MIN
      • DCGM_SUMMARY_MAX
      • DCGM_SUMMARY_AVG
      • DCGM_SUMMARY_SUM
      • DCGM_SUMMARY_COUNT
      • DCGM_SUMMARY_INTEGRAL
      • DCGM_SUMMARY_DIFF
      • DCGM_SUMMARY_SIZE
      • dcgmFieldSummaryRequest_version1
      • DCGM_MODULE_STATUSES_CAPACITY
      • dcgmModuleGetStatuses_version1
      • dcgmModuleGetStatuses_version
      • dcgmStartEmbeddedV2Params_version1
      • dcgmStartEmbeddedV2Params_version2
      • DCGM_PROF_MAX_NUM_GROUPS_V2
      • DCGM_PROF_MAX_FIELD_IDS_PER_GROUP_V2
      • dcgmProfGetMetricGroups_version3
      • dcgmProfGetMetricGroups_version
      • dcgmSettingsSetLoggingSeverity_version1
      • dcgmSettingsSetLoggingSeverity_version2
      • dcgmSettingsSetLoggingSeverity_version
      • dcgmVersionInfo_version2
      • dcgmVersionInfo_version
      • dcgmHandle_t
      • dcgmGpuGrp_t
      • dcgmFieldGrp_t
      • dcgmStatus_t
      • dcgm_link_t
      • dcgmConnectV2Params_t
      • dcgmHostengineHealth_t
      • dcgmGroupInfo_t
      • dcgmCpuHierarchyOwnedCores_t
      • dcgmCpuHierarchy_t
      • dcgmFieldGroupInfo_t
      • dcgmAllFieldGroup_t
      • dcgmClockSet_t
      • dcgmDeviceSupportedClockSets_t
      • dcgmDevicePidAccountingStats_t
      • dcgmDeviceThermals_t
      • dcgmDevicePowerLimits_t
      • dcgmDeviceIdentifiers_t
      • dcgmDeviceMemoryUsage_t
      • dcgmDeviceVgpuUtilInfo_t
      • dcgmDeviceEncStats_t
      • dcgmDeviceFbcStats_t
      • dcgmFBCSessionType_t
      • dcgmDeviceFbcSessionInfo_t
      • dcgmDeviceFbcSessions_t
      • dcgmEncoderType_t
      • dcgmDeviceVgpuEncSessions_t
      • dcgmDeviceVgpuProcessUtilInfo_t
      • dcgmDeviceVgpuTypeInfo_t
      • dcgmDeviceSupportedVgpuTypeInfo_t
      • dcgmDeviceSettings_t
      • dcgmDeviceAttributes_t
      • dcgmDeviceMigAttributesInfo_t
      • dcgmDeviceMigAttributes_t
      • dcgmGpuInstanceProfileInfo_t
      • dcgmGpuInstanceProfiles_t
      • dcgmComputeInstanceProfileInfo_t
      • dcgmComputeInstanceProfiles_t
      • dcgmWorkloadPowerProfileInfo_t
      • dcgmWorkloadPowerProfileProfilesInfo_t
      • dcgmDeviceWorkloadPowerProfilesStatus_t
      • dcgmConfig_t
      • dcgmPolicyViolation_t
      • dcgmPolicyConditionIdx_t
      • dcgmPolicyCondition_t
      • dcgmPolicyConditionParams_t
      • dcgmPolicyMode_t
      • dcgmPolicyIsolation_t
      • dcgmPolicyAction_t
      • dcgmPolicyValidation_t
      • dcgmPolicyFailureResp_t
      • dcgmPolicy_t
      • dcgmPolicyCallbackResponse_t
      • fpRecvUpdates
      • dcgmFieldValueEnumeration_f
      • dcgmFieldValueEntityEnumeration_f
      • dcgmHealthSystems_t
      • dcgmHealthWatchResults_t
      • dcgmDiagResult_t
      • dcgmHealthResponse_t
      • dcgmPidInfo_t
      • dcgmJobInfo_t
      • dcgmRunningProcess_t
      • dcgmPerGpuTestIndices_t
      • dcgmSoftwareTest_t
      • dcgmDiagResponse_t
      • dcgmDiagStatus_t
      • dcgmGpuTopologyLevel_t
      • dcgmDeviceTopology_t
      • dcgmGroupTopology_t
      • dcgmIntrospectMemory_t
      • dcgmIntrospectCpuUtil_t
      • dcgmGpuNVLinkErrorType_t
      • dcgmTopoSchedHint_t
      • dcgmNvLinkLinkState_t
      • dcgmNvLinkStatus_t
      • dcgmFieldSummaryRequest_t
      • dcgmModuleGetStatuses_t
      • dcgmProfGetMetricGroups_t
      • dcgmSettingsSetLoggingSeverity_t
      • dcgmVersionInfo_t
      • DcgmLoggingSeverity_t
        • DcgmLoggingSeverityUnspecified
        • DcgmLoggingSeverityNone
        • DcgmLoggingSeverityFatal
        • DcgmLoggingSeverityError
        • DcgmLoggingSeverityWarning
        • DcgmLoggingSeverityInfo
        • DcgmLoggingSeverityDebug
        • DcgmLoggingSeverityVerbose
      • dcgmMigProfile_t
        • DcgmMigProfileNone
        • DcgmMigProfileGpuInstanceSlice1
        • DcgmMigProfileGpuInstanceSlice2
        • DcgmMigProfileGpuInstanceSlice3
        • DcgmMigProfileGpuInstanceSlice4
        • DcgmMigProfileGpuInstanceSlice7
        • DcgmMigProfileGpuInstanceSlice8
        • DcgmMigProfileGpuInstanceSlice6
        • DcgmMigProfileGpuInstanceSlice1Rev1
        • DcgmMigProfileGpuInstanceSlice2Rev1
        • DcgmMigProfileGpuInstanceSlice1Rev2
        • DcgmMigProfileGpuInstanceSlice1GFX
        • DcgmMigProfileGpuInstanceSlice2GFX
        • DcgmMigProfileGpuInstanceSlice4GFX
        • DcgmMigProfileComputeInstanceSlice1
        • DcgmMigProfileComputeInstanceSlice2
        • DcgmMigProfileComputeInstanceSlice3
        • DcgmMigProfileComputeInstanceSlice4
        • DcgmMigProfileComputeInstanceSlice7
        • DcgmMigProfileComputeInstanceSlice8
        • DcgmMigProfileComputeInstanceSlice6
        • DcgmMigProfileComputeInstanceSlice1Rev1
      • dcgmFBCSessionType_enum
        • DCGM_FBC_SESSION_TYPE_UNKNOWN
        • DCGM_FBC_SESSION_TYPE_TOSYS
        • DCGM_FBC_SESSION_TYPE_CUDA
        • DCGM_FBC_SESSION_TYPE_VID
        • DCGM_FBC_SESSION_TYPE_HWENC
      • dcgmEncoderQueryType_enum
        • DCGM_ENCODER_QUERY_H264
        • DCGM_ENCODER_QUERY_HEVC
      • dcgmPowerProfileType_t
        • DCGM_POWER_PROFILE_MAX_P
        • DCGM_POWER_PROFILE_MAX_Q
        • DCGM_POWER_PROFILE_COMPUTE
        • DCGM_POWER_PROFILE_MEMORY_BOUND
        • DCGM_POWER_PROFILE_NETWORK
        • DCGM_POWER_PROFILE_BALANCED
        • DCGM_POWER_PROFILE_LLM_INFERENCE
        • DCGM_POWER_PROFILE_LLM_TRAINING
        • DCGM_POWER_PROFILE_RBM
        • DCGM_POWER_PROFILE_DCPCIE
        • DCGM_POWER_PROFILE_HMMA_SPARSE
        • DCGM_POWER_PROFILE_HMMA_DENSE
        • DCGM_POWER_PROFILE_SYNC_BALANCED
        • DCGM_POWER_PROFILE_HPC
        • DCGM_POWER_PROFILE_MIG
        • DCGM_POWER_PROFILE_MAX
      • dcgmPolicyConditionIdx_enum
        • DCGM_POLICY_COND_IDX_DBE
        • DCGM_POLICY_COND_IDX_PCI
        • DCGM_POLICY_COND_IDX_MAX_PAGES_RETIRED
        • DCGM_POLICY_COND_IDX_THERMAL
        • DCGM_POLICY_COND_IDX_POWER
        • DCGM_POLICY_COND_IDX_NVLINK
        • DCGM_POLICY_COND_IDX_XID
      • dcgmPolicyCondition_enum
        • DCGM_POLICY_COND_DBE
        • DCGM_POLICY_COND_PCI
        • DCGM_POLICY_COND_MAX_PAGES_RETIRED
        • DCGM_POLICY_COND_THERMAL
        • DCGM_POLICY_COND_POWER
        • DCGM_POLICY_COND_NVLINK
        • DCGM_POLICY_COND_XID
      • dcgmPolicyMode_enum
        • DCGM_POLICY_MODE_AUTOMATED
        • DCGM_POLICY_MODE_MANUAL
      • dcgmPolicyIsolation_enum
        • DCGM_POLICY_ISOLATION_NONE
      • dcgmPolicyAction_enum
        • DCGM_POLICY_ACTION_NONE
        • DCGM_POLICY_ACTION_GPURESET
      • dcgmPolicyValidation_enum
        • DCGM_POLICY_VALID_NONE
        • DCGM_POLICY_VALID_SV_SHORT
        • DCGM_POLICY_VALID_SV_MED
        • DCGM_POLICY_VALID_SV_LONG
        • DCGM_POLICY_VALID_SV_XLONG
      • dcgmPolicyFailureResp_enum
        • DCGM_POLICY_FAILURE_NONE
      • dcgmHealthSystems_enum
        • DCGM_HEALTH_WATCH_PCIE
        • DCGM_HEALTH_WATCH_NVLINK
        • DCGM_HEALTH_WATCH_PMU
        • DCGM_HEALTH_WATCH_MCU
        • DCGM_HEALTH_WATCH_MEM
        • DCGM_HEALTH_WATCH_SM
        • DCGM_HEALTH_WATCH_INFOROM
        • DCGM_HEALTH_WATCH_THERMAL
        • DCGM_HEALTH_WATCH_POWER
        • DCGM_HEALTH_WATCH_DRIVER
        • DCGM_HEALTH_WATCH_NVSWITCH_NONFATAL
        • DCGM_HEALTH_WATCH_NVSWITCH_FATAL
        • DCGM_HEALTH_WATCH_ALL
      • dcgmHealthWatchResult_enum
        • DCGM_HEALTH_RESULT_PASS
        • DCGM_HEALTH_RESULT_WARN
        • DCGM_HEALTH_RESULT_FAIL
      • dcgmDiagResult_enum
        • DCGM_DIAG_RESULT_PASS
        • DCGM_DIAG_RESULT_SKIP
        • DCGM_DIAG_RESULT_WARN
        • DCGM_DIAG_RESULT_FAIL
        • DCGM_DIAG_RESULT_NOT_RUN
      • dcgmDiagnosticLevel_t
        • DCGM_DIAG_LVL_INVALID
        • DCGM_DIAG_LVL_SHORT
        • DCGM_DIAG_LVL_MED
        • DCGM_DIAG_LVL_LONG
        • DCGM_DIAG_LVL_XLONG
      • dcgmPerGpuTestIndices_enum
        • DCGM_MEMORY_INDEX
        • DCGM_DIAGNOSTIC_INDEX
        • DCGM_PCI_INDEX
        • DCGM_SM_STRESS_INDEX
        • DCGM_TARGETED_STRESS_INDEX
        • DCGM_TARGETED_POWER_INDEX
        • DCGM_MEMORY_BANDWIDTH_INDEX
        • DCGM_MEMTEST_INDEX
        • DCGM_PULSE_TEST_INDEX
        • DCGM_EUD_TEST_INDEX
        • DCGM_NVBANDWIDTH_INDEX
        • DCGM_UNUSED2_TEST_INDEX
        • DCGM_UNUSED3_TEST_INDEX
        • DCGM_UNUSED4_TEST_INDEX
        • DCGM_UNUSED5_TEST_INDEX
        • DCGM_SOFTWARE_INDEX
        • DCGM_CONTEXT_CREATE_INDEX
        • DCGM_UNKNOWN_INDEX
      • dcgmSoftwareTest_enum
        • DCGM_SWTEST_DENYLIST
        • DCGM_SWTEST_NVML_LIBRARY
        • DCGM_SWTEST_CUDA_MAIN_LIBRARY
        • DCGM_SWTEST_CUDA_RUNTIME_LIBRARY
        • DCGM_SWTEST_PERMISSIONS
        • DCGM_SWTEST_PERSISTENCE_MODE
        • DCGM_SWTEST_ENVIRONMENT
        • DCGM_SWTEST_PAGE_RETIREMENT
        • DCGM_SWTEST_GRAPHICS_PROCESSES
        • DCGM_SWTEST_INFOROM
        • DCGM_SWTEST_FABRIC_MANAGER
      • dcgmGpuLevel_enum
        • DCGM_TOPOLOGY_UNINITIALIZED
        • DCGM_TOPOLOGY_BOARD
        • DCGM_TOPOLOGY_SINGLE
        • DCGM_TOPOLOGY_MULTIPLE
        • DCGM_TOPOLOGY_HOSTBRIDGE
        • DCGM_TOPOLOGY_CPU
        • DCGM_TOPOLOGY_SYSTEM
        • DCGM_TOPOLOGY_NVLINK1
        • DCGM_TOPOLOGY_NVLINK2
        • DCGM_TOPOLOGY_NVLINK3
        • DCGM_TOPOLOGY_NVLINK4
        • DCGM_TOPOLOGY_NVLINK5
        • DCGM_TOPOLOGY_NVLINK6
        • DCGM_TOPOLOGY_NVLINK7
        • DCGM_TOPOLOGY_NVLINK8
        • DCGM_TOPOLOGY_NVLINK9
        • DCGM_TOPOLOGY_NVLINK10
        • DCGM_TOPOLOGY_NVLINK11
        • DCGM_TOPOLOGY_NVLINK12
        • DCGM_TOPOLOGY_NVLINK13
        • DCGM_TOPOLOGY_NVLINK14
        • DCGM_TOPOLOGY_NVLINK15
        • DCGM_TOPOLOGY_NVLINK16
        • DCGM_TOPOLOGY_NVLINK17
        • DCGM_TOPOLOGY_NVLINK18
      • dcgmGpuNVLinkErrorType_enum
        • DCGM_GPU_NVLINK_ERROR_RECOVERY_REQUIRED
        • DCGM_GPU_NVLINK_ERROR_FATAL
      • dcgmNvLinkLinkState_enum
        • DcgmNvLinkLinkStateNotSupported
        • DcgmNvLinkLinkStateDisabled
        • DcgmNvLinkLinkStateDown
        • DcgmNvLinkLinkStateUp
      • dcgmModuleId_t
        • DcgmModuleIdCore
        • DcgmModuleIdNvSwitch
        • DcgmModuleIdVGPU
        • DcgmModuleIdIntrospect
        • DcgmModuleIdHealth
        • DcgmModuleIdPolicy
        • DcgmModuleIdConfig
        • DcgmModuleIdDiag
        • DcgmModuleIdProfiling
        • DcgmModuleIdSysmon
        • DcgmModuleIdCount
      • dcgmModuleStatus_t
        • DcgmModuleStatusNotLoaded
        • DcgmModuleStatusDenylisted
        • DcgmModuleStatusFailed
        • DcgmModuleStatusLoaded
        • DcgmModuleStatusUnloaded
        • DcgmModuleStatusPaused
      • dcgmFabricManagerStatus_t
        • DcgmFMStatusNotSupported
        • DcgmFMStatusNotStarted
        • DcgmFMStatusInProgress
        • DcgmFMStatusSuccess
        • DcgmFMStatusFailure
        • DcgmFMStatusUnrecognized
        • DcgmFMStatusNvmlTooOld
        • DcgmFMStatusCount
      • dcgm_link_s
        • type
        • index
        • gpuId
        • switchId
        • parsed
        • raw
      • dcgmConnectV2Params_v1
        • version
        • persistAfterDisconnect
      • dcgmConnectV2Params_v2
        • version
        • persistAfterDisconnect
        • timeoutMs
        • addressIsUnixSocket
      • dcgmHostengineHealth_v1
        • version
        • overallHealth
      • dcgmGroupEntityPair_t
        • entityGroupId
        • entityId
      • dcgmGroupInfo_v2
        • version
        • count
        • groupName
        • entityList
      • dcgmGroupInfo_v3
        • version
        • count
        • groupName
        • entityList
      • dcgmMigHierarchyInfo_t
        • entity
        • parent
        • sliceProfile
      • dcgmMigEntityInfo_t
        • gpuUuid
        • nvmlGpuIndex
        • nvmlInstanceId
        • nvmlComputeInstanceId
        • nvmlMigProfileId
        • nvmlProfileSlices
      • dcgmMigHierarchyInfo_v2
      • dcgmMigHierarchy_v2
      • dcgmCpuHierarchyOwnedCores_v1
      • dcgmCpuHierarchy_v1
        • dcgmCpuHierarchy_v1::dcgmCpuHierarchyCpu_v1
      • dcgmCpuHierarchy_v2
        • dcgmCpuHierarchy_v2::dcgmCpuHierarchyCpu_v2
      • dcgmFieldGroupInfo_v1
        • version
        • numFieldIds
        • fieldGroupId
        • fieldGroupName
        • fieldIds
      • dcgmAllFieldGroup_v1
        • version
        • numFieldGroups
        • fieldGroups
      • dcgmErrorInfo_t
        • gpuId
        • fieldId
        • status
      • dcgmClockSet_v1
        • version
        • memClock
        • smClock
      • dcgmDeviceSupportedClockSets_v1
        • version
        • count
        • clockSet
      • dcgmDevicePidAccountingStats_v1
        • version
        • pid
        • gpuUtilization
        • memoryUtilization
        • maxMemoryUsage
        • startTimestamp
        • activeTimeUsec
      • dcgmDeviceThermals_v1
        • version
        • slowdownTemp
        • shutdownTemp
      • dcgmDevicePowerLimits_v1
        • version
        • curPowerLimit
        • defaultPowerLimit
        • enforcedPowerLimit
        • minPowerLimit
        • maxPowerLimit
      • dcgmDeviceIdentifiers_v1
        • version
        • brandName
        • deviceName
        • pciBusId
        • serial
        • uuid
        • vbios
        • inforomImageVersion
        • pciDeviceId
        • pciSubSystemId
        • driverVersion
        • virtualizationMode
      • dcgmDeviceMemoryUsage_v1
        • version
        • bar1Total
        • fbTotal
        • fbUsed
        • fbFree
      • dcgmDeviceVgpuUtilInfo_v1
        • version
        • vgpuId
        • smUtil
        • memUtil
        • encUtil
        • decUtil
      • dcgmDeviceEncStats_v1
        • version
        • sessionCount
        • averageFps
        • averageLatency
      • dcgmDeviceFbcStats_v1
        • version
        • sessionCount
        • averageFps
        • averageLatency
      • dcgmDeviceFbcSessionInfo_v1
        • version
        • sessionId
        • pid
        • vgpuId
        • displayOrdinal
        • sessionType
        • sessionFlags
        • hMaxResolution
        • vMaxResolution
        • hResolution
        • vResolution
        • averageFps
        • averageLatency
      • dcgmDeviceFbcSessions_v1
        • version
        • sessionCount
        • sessionInfo
      • dcgmDeviceVgpuEncSessions_v1
        • version
        • vgpuId
        • sessionId
        • pid
        • codecType
        • hResolution
        • vResolution
        • averageFps
        • averageLatency
      • dcgmDeviceVgpuProcessUtilInfo_v1
        • version
        • vgpuId
        • vgpuProcessSamplesCount
        • pid
        • processName
        • smUtil
        • memUtil
        • encUtil
        • decUtil
      • dcgmDeviceVgpuTypeInfo_v1
        • version
        • vgpuTypeInfo
        • vgpuTypeName
        • vgpuTypeClass
        • vgpuTypeLicense
        • deviceId
        • subsystemId
        • numDisplayHeads
        • maxInstances
        • frameRateLimit
        • maxResolutionX
        • maxResolutionY
        • fbTotal
      • dcgmDeviceVgpuTypeInfo_v2
        • version
        • vgpuTypeInfo
        • vgpuTypeName
        • vgpuTypeClass
        • vgpuTypeLicense
        • deviceId
        • subsystemId
        • numDisplayHeads
        • maxInstances
        • frameRateLimit
        • maxResolutionX
        • maxResolutionY
        • fbTotal
        • gpuInstanceProfileId
      • dcgmDeviceSupportedVgpuTypeInfo_v1
        • version
        • deviceId
        • subsystemId
        • numDisplayHeads
        • maxInstances
        • frameRateLimit
        • maxResolutionX
        • maxResolutionY
        • fbTotal
        • gpuInstanceProfileId
      • dcgmDeviceSettings_v2
      • dcgmDeviceAttributes_v3
        • version
        • clockSets
        • thermalSettings
        • powerLimits
        • identifiers
        • memoryUsage
        • settings
      • dcgmDeviceMigAttributesInfo_v1
        • version
        • gpuInstanceId
        • computeInstanceId
        • multiprocessorCount
        • sharedCopyEngineCount
        • sharedDecoderCount
        • sharedEncoderCount
        • sharedJpegCount
        • sharedOfaCount
        • gpuInstanceSliceCount
        • computeInstanceSliceCount
        • memorySizeMB
      • dcgmDeviceMigAttributes_v1
        • version
        • migDevicesCount
        • migAttributesInfo
      • dcgmGpuInstanceProfileInfo_v1
        • version
        • id
        • isP2pSupported
        • sliceCount
        • instanceCount
        • multiprocessorCount
        • copyEngineCount
        • decoderCount
        • encoderCount
        • jpegCount
        • ofaCount
        • memorySizeMB
      • dcgmGpuInstanceProfiles_v1
        • version
        • profileCount
        • profileInfo
      • dcgmComputeInstanceProfileInfo_v1
        • version
        • gpuInstanceId
        • id
        • sliceCount
        • instanceCount
        • multiprocessorCount
        • sharedCopyEngineCount
        • sharedDecoderCount
        • sharedEncoderCount
        • sharedJpegCount
        • sharedOfaCount
      • dcgmComputeInstanceProfiles_v1
        • version
        • profileCount
        • profileInfo
      • dcgmWorkloadPowerProfileInfo_v1
        • version
        • profileId
        • priority
        • conflictingMask
      • dcgmWorkloadPowerProfileProfilesInfo_v1
        • version
        • workloadPowerProfile
        • profileCount
      • dcgmDeviceWorkloadPowerProfilesStatus_v1
        • version
        • profileMask
        • requestedProfileMask
        • enforcedProfileMask
      • dcgmConfigPerfStateSettings_t
        • syncBoost
        • targetClocks
      • dcgmConfigPowerLimit_t
        • type
        • val
      • dcgmConfig_v1
        • version
        • gpuId
        • eccMode
        • computeMode
        • perfState
        • powerLimit
      • dcgmConfig_v2
        • version
        • gpuId
        • eccMode
        • computeMode
        • perfState
        • powerLimit
        • workloadPowerProfiles
      • dcgmPolicyViolation_v1
        • version
        • notifyOnEccDbe
        • notifyOnPciEvent
        • notifyOnMaxRetiredPages
      • dcgmPolicyConditionParams_st
      • dcgmPolicyViolationNotify_t
        • gpuId
        • violationOccurred
      • dcgmPolicy_v1
        • version
        • condition
        • mode
        • isolation
        • action
        • validation
        • response
        • parms
      • dcgmPolicyConditionDbe_t
        • timestamp
        • location
        • numerrors
      • dcgmPolicyConditionPci_t
        • timestamp
        • counter
      • dcgmPolicyConditionMpr_t
        • timestamp
        • sbepages
        • dbepages
      • dcgmPolicyConditionThermal_t
        • timestamp
        • thermalViolation
      • dcgmPolicyConditionPower_t
        • timestamp
        • powerViolation
      • dcgmPolicyConditionNvlink_t
        • timestamp
        • fieldId
        • counter
      • dcgmPolicyConditionXID_t
        • timestamp
        • errnum
      • dcgmPolicyCallbackResponse_v2
        • version
        • condition
        • dbe
        • pci
        • mpr
        • thermal
        • power
        • nvlink
        • xid
        • gpuId
      • dcgmFieldValue_v1
        • version
        • fieldId
        • fieldType
        • status
        • ts
        • i64
        • dbl
        • str
        • blob
        • value
      • dcgmFieldValue_v2
        • version
        • entityGroupId
        • entityId
        • fieldId
        • fieldType
        • status
        • unused
        • ts
        • i64
        • dbl
        • str
        • blob
        • value
      • dcgmStatSummaryInt64_t
        • minValue
        • maxValue
        • average
      • dcgmStatSummaryInt32_t
        • minValue
        • maxValue
        • average
      • dcgmStatSummaryFp64_t
        • minValue
        • maxValue
        • average
      • dcgmDiagErrorDetail_t
      • dcgmDiagErrorDetail_v2
        • category
        • severity
      • dcgmDiagInfo_v1
        • entity
        • msg
        • testId
      • dcgmDiagError_v1
        • entity
        • code
        • category
        • severity
        • msg
        • testId
      • dcgmDiagEntityResult_v1
        • entity
        • result
        • testId
      • dcgmDiagTestAuxData_v1
        • version
      • dcgmDiagTestRun_v2
        • name
        • pluginName
        • result
        • numErrors
        • numInfo
        • categoryIndex
        • _unused
        • numResults
        • errorIndices
        • infoIndices
        • resultIndices
      • dcgmDiagTestRun_v1
        • name
        • pluginName
        • result
        • numErrors
        • numInfo
        • categoryIndex
        • _unused
        • numResults
        • errorIndices
        • infoIndices
        • resultIndices
      • dcgmDiagEntity_v1
        • entity
        • serialNum
        • skuDeviceId
      • dcgmIncidentInfo_t
        • system
        • health
        • error
        • entityInfo
      • dcgmHealthResponse_v5
        • version
        • overallHealth
        • incidentCount
        • incidents
      • dcgmHealthSetParams_v2
        • version
        • groupId
        • systems
        • updateInterval
        • maxKeepAge
      • dcgmProcessUtilInfo_t
      • dcgmProcessUtilSample_t
      • dcgmPidSingleInfo_t
        • gpuId
        • energyConsumed
        • pcieRxBandwidth
        • pcieTxBandwidth
        • pcieReplays
        • startTime
        • endTime
        • processUtilization
        • smUtilization
        • memoryUtilization
        • eccSingleBit
        • eccDoubleBit
        • memoryClock
        • smClock
        • numXidCriticalErrors
        • xidCriticalErrorsTs
        • numOtherComputePids
        • otherComputePids
        • numOtherGraphicsPids
        • otherGraphicsPids
        • maxGpuMemoryUsed
        • powerViolationTime
        • thermalViolationTime
        • reliabilityViolationTime
        • boardLimitViolationTime
        • lowUtilizationTime
        • syncBoostTime
        • overallHealth
        • system
        • health
      • dcgmPidInfo_v2
        • version
        • pid
        • numGpus
        • summary
        • gpus
      • dcgmGpuUsageInfo_t
        • gpuId
        • energyConsumed
        • powerUsage
        • pcieRxBandwidth
        • pcieTxBandwidth
        • pcieReplays
        • startTime
        • endTime
        • smUtilization
        • memoryUtilization
        • eccSingleBit
        • eccDoubleBit
        • memoryClock
        • smClock
        • numXidCriticalErrors
        • xidCriticalErrorsTs
        • numComputePids
        • computePidInfo
        • numGraphicsPids
        • graphicsPidInfo
        • maxGpuMemoryUsed
        • powerViolationTime
        • thermalViolationTime
        • reliabilityViolationTime
        • boardLimitViolationTime
        • lowUtilizationTime
        • syncBoostTime
        • overallHealth
        • system
        • health
      • dcgmJobInfo_v3
        • version
        • numGpus
        • summary
        • gpus
      • dcgmRunningProcess_v1
        • version
        • pid
        • memoryUsed
      • dcgmDiagTestResult_v2
        • status
        • error
        • info
      • dcgmDiagTestResult_v3
        • status
        • error
        • info
      • dcgmDiagResponsePerGpu_v4
        • gpuId
        • hwDiagnosticReturn
        • results
      • dcgmDiagResponsePerGpu_v5
        • gpuId
        • hwDiagnosticReturn
        • results
      • dcgmDiagResponsePerGpu_v3
        • gpuId
        • hwDiagnosticReturn
        • results
      • dcgmDiagResponse_v12
        • version
        • numTests
        • numErrors
        • numInfo
        • numCategories
        • numEntities
        • numResults
        • tests
        • entities
        • errors
        • info
        • results
        • categories
        • dcgmVersion
        • driverVersion
        • _unused
      • dcgmDiagResponse_v11
        • version
        • numTests
        • numErrors
        • numInfo
        • numCategories
        • numEntities
        • numResults
        • tests
        • entities
        • errors
        • info
        • results
        • categories
        • dcgmVersion
        • driverVersion
        • _unused
      • dcgmDiagResponse_v10
        • version
        • gpuCount
        • levelOneTestCount
        • levelOneResults
        • perGpuResponses
        • systemError
        • devIds
        • devSerials
        • dcgmVersion
        • driverVersion
        • auxDataPerTest
      • dcgmDiagResponse_v9
        • version
        • gpuCount
        • levelOneTestCount
        • levelOneResults
        • perGpuResponses
        • systemError
        • devIds
        • devSerials
        • dcgmVersion
        • driverVersion
        • _unused
      • dcgmDiagResponse_v8
        • version
        • gpuCount
        • levelOneTestCount
        • levelOneResults
        • perGpuResponses
        • systemError
        • devIds
        • dcgmVersion
        • driverVersion
        • _unused
      • dcgmDiagResponse_v7
        • version
        • gpuCount
        • levelOneTestCount
        • levelOneResults
        • perGpuResponses
        • systemError
        • _unused
      • dcgmDiagStatus_v1
        • version
        • totalTests
        • completedTests
        • testName
        • errorCode
      • dcgmDeviceTopology_v1
        • version
        • cpuAffinityMask
        • numGpus
        • gpuId
        • path
        • localNvLinkIds
      • dcgmGroupTopology_v1
        • version
        • groupCpuAffinityMask
        • numaOptimalFlag
        • slowestPath
      • dcgmIntrospectMemory_v1
        • version
        • bytesUsed
      • dcgmIntrospectCpuUtil_v1
        • version
        • total
        • kernel
        • user
      • dcgmRunDiag_v7
        • version
        • flags
        • debugLevel
        • groupId
        • validate
        • testNames
        • testParms
        • fakeGpuList
        • gpuList
        • debugLogFile
        • statsPath
        • configFileContents
        • clocksEventMask
        • pluginPath
        • currentIteration
        • totalIterations
        • timeoutSeconds
        • _unusedBuf
        • failCheckInterval
      • dcgmRunDiag_v8
        • version
        • flags
        • debugLevel
        • groupId
        • validate
        • testNames
        • testParms
        • fakeGpuList
        • gpuList
        • debugLogFile
        • statsPath
        • configFileContents
        • clocksEventMask
        • pluginPath
        • currentIteration
        • totalIterations
        • timeoutSeconds
        • _unusedBuf
        • failCheckInterval
        • expectedNumEntities
      • dcgmRunDiag_v9
        • version
        • flags
        • debugLevel
        • groupId
        • validate
        • testNames
        • testParms
        • fakeGpuList
        • debugLogFile
        • statsPath
        • configFileContents
        • clocksEventMask
        • pluginPath
        • currentIteration
        • totalIterations
        • timeoutSeconds
        • _unusedBuf
        • failCheckInterval
        • expectedNumEntities
        • entityIds
        • watchFrequency
      • dcgmRunDiag_v10
        • version
        • flags
        • debugLevel
        • groupId
        • validate
        • testNames
        • testParms
        • fakeGpuList
        • debugLogFile
        • statsPath
        • configFileContents
        • clocksEventMask
        • pluginPath
        • currentIteration
        • totalIterations
        • timeoutSeconds
        • _unusedBuf
        • failCheckInterval
        • expectedNumEntities
        • entityIds
        • watchFrequency
        • ignoreErrorCodes
      • dcgmTopoSchedHint_v1
        • version
        • inputGpuIds
        • numGpus
        • hintFlags
      • dcgmNvLinkGpuLinkStatus_v1
        • entityId
        • linkState
      • dcgmNvLinkGpuLinkStatus_v2
        • entityId
        • linkState
      • dcgmNvLinkGpuLinkStatus_v3
        • entityId
        • linkState
      • dcgmNvLinkNvSwitchLinkStatus_t
        • entityId
        • linkState
      • dcgmNvLinkStatus_v4
        • version
        • numGpus
        • gpus
        • numNvSwitches
        • nvSwitches
      • dcgmSummaryResponse_t
        • fieldType
        • summaryCount
        • values
      • dcgmFieldSummaryRequest_v1
        • version
        • fieldId
        • entityGroupId
        • entityId
        • summaryTypeMask
        • startTime
        • endTime
        • response
      • dcgmModuleGetStatusesModule_t
        • id
        • status
      • dcgmModuleGetStatuses_v1
        • version
        • numStatuses
        • statuses
      • dcgmStartEmbeddedV2Params_v1
        • version
        • opMode
        • dcgmHandle
        • logFile
        • severity
        • denyListCount
      • dcgmStartEmbeddedV2Params_v2
        • version
        • opMode
        • dcgmHandle
        • logFile
        • severity
        • denyListCount
        • serviceAccount
        • denyList
      • dcgmProfMetricGroupInfo_v2
        • majorId
        • minorId
        • numFieldIds
        • fieldIds
      • dcgmProfGetMetricGroups_v3
        • version
        • unused
        • gpuId
        • numMetricGroups
        • metricGroups
      • dcgmSettingsSetLoggingSeverity_v1
      • dcgmSettingsSetLoggingSeverity_v2
      • dcgmVersionInfo_v2
        • rawBuildInfoString
    • Field Types
      • DCGM_FT_BINARY
      • DCGM_FT_DOUBLE
      • DCGM_FT_INT64
      • DCGM_FT_STRING
      • DCGM_FT_TIMESTAMP
    • Field Scope
      • DCGM_FS_GLOBAL
      • DCGM_FS_ENTITY
      • DCGM_FS_DEVICE
    • Field Entity
      • dcgm_field_eid_t
      • dcgm_field_entity_group_t
        • DCGM_FE_NONE
        • DCGM_FE_GPU
        • DCGM_FE_VGPU
        • DCGM_FE_SWITCH
        • DCGM_FE_GPU_I
        • DCGM_FE_GPU_CI
        • DCGM_FE_LINK
        • DCGM_FE_CPU
        • DCGM_FE_CPU_CORE
        • DCGM_FE_CONNECTX
        • DCGM_FE_COUNT
    • Field Identifiers
      • DCGM_FI_UNKNOWN
      • DCGM_FI_DRIVER_VERSION
      • DCGM_FI_NVML_VERSION
      • DCGM_FI_PROCESS_NAME
      • DCGM_FI_DEV_COUNT
      • DCGM_FI_CUDA_DRIVER_VERSION
      • DCGM_FI_DEV_NAME
      • DCGM_FI_DEV_BRAND
      • DCGM_FI_DEV_NVML_INDEX
      • DCGM_FI_DEV_SERIAL
      • DCGM_FI_DEV_UUID
      • DCGM_FI_DEV_MINOR_NUMBER
      • DCGM_FI_DEV_OEM_INFOROM_VER
      • DCGM_FI_DEV_PCI_BUSID
      • DCGM_FI_DEV_PCI_COMBINED_ID
      • DCGM_FI_DEV_PCI_SUBSYS_ID
      • DCGM_FI_GPU_TOPOLOGY_PCI
      • DCGM_FI_GPU_TOPOLOGY_NVLINK
      • DCGM_FI_GPU_TOPOLOGY_AFFINITY
      • DCGM_FI_DEV_CUDA_COMPUTE_CAPABILITY
      • DCGM_FI_DEV_COMPUTE_MODE
      • DCGM_FI_DEV_PERSISTENCE_MODE
      • DCGM_FI_DEV_MIG_MODE
      • DCGM_FI_DEV_CUDA_VISIBLE_DEVICES_STR
      • DCGM_FI_DEV_MIG_MAX_SLICES
      • DCGM_FI_DEV_CPU_AFFINITY_0
      • DCGM_FI_DEV_CPU_AFFINITY_1
      • DCGM_FI_DEV_CPU_AFFINITY_2
      • DCGM_FI_DEV_CPU_AFFINITY_3
      • DCGM_FI_DEV_CC_MODE
      • DCGM_FI_DEV_MIG_ATTRIBUTES
      • DCGM_FI_DEV_MIG_GI_INFO
      • DCGM_FI_DEV_MIG_CI_INFO
      • DCGM_FI_DEV_ECC_INFOROM_VER
      • DCGM_FI_DEV_POWER_INFOROM_VER
      • DCGM_FI_DEV_INFOROM_IMAGE_VER
      • DCGM_FI_DEV_INFOROM_CONFIG_CHECK
      • DCGM_FI_DEV_INFOROM_CONFIG_VALID
      • DCGM_FI_DEV_VBIOS_VERSION
      • DCGM_FI_DEV_MEM_AFFINITY_0
      • DCGM_FI_DEV_MEM_AFFINITY_1
      • DCGM_FI_DEV_MEM_AFFINITY_2
      • DCGM_FI_DEV_MEM_AFFINITY_3
      • DCGM_FI_DEV_BAR1_TOTAL
      • DCGM_FI_SYNC_BOOST
      • DCGM_FI_DEV_BAR1_USED
      • DCGM_FI_DEV_BAR1_FREE
      • DCGM_FI_DEV_GPM_SUPPORT
      • DCGM_FI_DEV_SM_CLOCK
      • DCGM_FI_DEV_MEM_CLOCK
      • DCGM_FI_DEV_VIDEO_CLOCK
      • DCGM_FI_DEV_APP_SM_CLOCK
      • DCGM_FI_DEV_APP_MEM_CLOCK
      • DCGM_FI_DEV_CLOCKS_EVENT_REASONS
      • DCGM_FI_DEV_CLOCK_THROTTLE_REASONS
      • DCGM_FI_DEV_MAX_SM_CLOCK
      • DCGM_FI_DEV_MAX_MEM_CLOCK
      • DCGM_FI_DEV_MAX_VIDEO_CLOCK
      • DCGM_FI_DEV_AUTOBOOST
      • DCGM_FI_DEV_SUPPORTED_CLOCKS
      • DCGM_FI_DEV_MEMORY_TEMP
      • DCGM_FI_DEV_GPU_TEMP
      • DCGM_FI_DEV_MEM_MAX_OP_TEMP
      • DCGM_FI_DEV_GPU_MAX_OP_TEMP
      • DCGM_FI_DEV_GPU_TEMP_LIMIT
      • DCGM_FI_DEV_POWER_USAGE
      • DCGM_FI_DEV_TOTAL_ENERGY_CONSUMPTION
      • DCGM_FI_DEV_POWER_USAGE_INSTANT
      • DCGM_FI_DEV_SLOWDOWN_TEMP
      • DCGM_FI_DEV_SHUTDOWN_TEMP
      • DCGM_FI_DEV_POWER_MGMT_LIMIT
      • DCGM_FI_DEV_POWER_MGMT_LIMIT_MIN
      • DCGM_FI_DEV_POWER_MGMT_LIMIT_MAX
      • DCGM_FI_DEV_POWER_MGMT_LIMIT_DEF
      • DCGM_FI_DEV_ENFORCED_POWER_LIMIT
      • DCGM_FI_DEV_REQUESTED_POWER_PROFILE_MASK
      • DCGM_FI_DEV_ENFORCED_POWER_PROFILE_MASK
      • DCGM_FI_DEV_VALID_POWER_PROFILE_MASK
      • DCGM_FI_DEV_FABRIC_MANAGER_STATUS
      • DCGM_FI_DEV_FABRIC_MANAGER_ERROR_CODE
      • DCGM_FI_DEV_FABRIC_CLUSTER_UUID
      • DCGM_FI_DEV_FABRIC_CLIQUE_ID
      • DCGM_FI_DEV_PSTATE
      • DCGM_FI_DEV_FAN_SPEED
      • DCGM_FI_DEV_PCIE_TX_THROUGHPUT
      • DCGM_FI_DEV_PCIE_RX_THROUGHPUT
      • DCGM_FI_DEV_PCIE_REPLAY_COUNTER
      • DCGM_FI_DEV_GPU_UTIL
      • DCGM_FI_DEV_MEM_COPY_UTIL
      • DCGM_FI_DEV_ACCOUNTING_DATA
      • DCGM_FI_DEV_ENC_UTIL
      • DCGM_FI_DEV_DEC_UTIL
      • DCGM_FI_DEV_XID_ERRORS
      • DCGM_FI_DEV_PCIE_MAX_LINK_GEN
      • DCGM_FI_DEV_PCIE_MAX_LINK_WIDTH
      • DCGM_FI_DEV_PCIE_LINK_GEN
      • DCGM_FI_DEV_PCIE_LINK_WIDTH
      • DCGM_FI_DEV_POWER_VIOLATION
      • DCGM_FI_DEV_THERMAL_VIOLATION
      • DCGM_FI_DEV_SYNC_BOOST_VIOLATION
      • DCGM_FI_DEV_BOARD_LIMIT_VIOLATION
      • DCGM_FI_DEV_LOW_UTIL_VIOLATION
      • DCGM_FI_DEV_RELIABILITY_VIOLATION
      • DCGM_FI_DEV_TOTAL_APP_CLOCKS_VIOLATION
      • DCGM_FI_DEV_TOTAL_BASE_CLOCKS_VIOLATION
      • DCGM_FI_DEV_FB_TOTAL
      • DCGM_FI_DEV_FB_FREE
      • DCGM_FI_DEV_FB_USED
      • DCGM_FI_DEV_FB_RESERVED
      • DCGM_FI_DEV_FB_USED_PERCENT
      • DCGM_FI_DEV_C2C_LINK_COUNT
      • DCGM_FI_DEV_C2C_LINK_STATUS
      • DCGM_FI_DEV_C2C_MAX_BANDWIDTH
      • DCGM_FI_DEV_ECC_CURRENT
      • DCGM_FI_DEV_ECC_PENDING
      • DCGM_FI_DEV_ECC_SBE_VOL_TOTAL
      • DCGM_FI_DEV_ECC_DBE_VOL_TOTAL
      • DCGM_FI_DEV_ECC_SBE_AGG_TOTAL
      • DCGM_FI_DEV_ECC_DBE_AGG_TOTAL
      • DCGM_FI_DEV_ECC_SBE_VOL_L1
      • DCGM_FI_DEV_ECC_DBE_VOL_L1
      • DCGM_FI_DEV_ECC_SBE_VOL_L2
      • DCGM_FI_DEV_ECC_DBE_VOL_L2
      • DCGM_FI_DEV_ECC_SBE_VOL_DEV
      • DCGM_FI_DEV_ECC_DBE_VOL_DEV
      • DCGM_FI_DEV_ECC_SBE_VOL_REG
      • DCGM_FI_DEV_ECC_DBE_VOL_REG
      • DCGM_FI_DEV_ECC_SBE_VOL_TEX
      • DCGM_FI_DEV_ECC_DBE_VOL_TEX
      • DCGM_FI_DEV_ECC_SBE_AGG_L1
      • DCGM_FI_DEV_ECC_DBE_AGG_L1
      • DCGM_FI_DEV_ECC_SBE_AGG_L2
      • DCGM_FI_DEV_ECC_DBE_AGG_L2
      • DCGM_FI_DEV_ECC_SBE_AGG_DEV
      • DCGM_FI_DEV_ECC_DBE_AGG_DEV
      • DCGM_FI_DEV_ECC_SBE_AGG_REG
      • DCGM_FI_DEV_ECC_DBE_AGG_REG
      • DCGM_FI_DEV_ECC_SBE_AGG_TEX
      • DCGM_FI_DEV_ECC_DBE_AGG_TEX
      • DCGM_FI_DEV_ECC_SBE_VOL_SHM
      • DCGM_FI_DEV_ECC_DBE_VOL_SHM
      • DCGM_FI_DEV_ECC_SBE_VOL_CBU
      • DCGM_FI_DEV_ECC_DBE_VOL_CBU
      • DCGM_FI_DEV_ECC_SBE_AGG_SHM
      • DCGM_FI_DEV_ECC_DBE_AGG_SHM
      • DCGM_FI_DEV_ECC_SBE_AGG_CBU
      • DCGM_FI_DEV_ECC_DBE_AGG_CBU
      • DCGM_FI_DEV_ECC_SBE_VOL_SRM
      • DCGM_FI_DEV_ECC_DBE_VOL_SRM
      • DCGM_FI_DEV_ECC_SBE_AGG_SRM
      • DCGM_FI_DEV_ECC_DBE_AGG_SRM
      • DCGM_FI_DEV_THRESHOLD_SRM
      • DCGM_FI_DEV_DIAG_MEMORY_RESULT
      • DCGM_FI_DEV_DIAG_DIAGNOSTIC_RESULT
      • DCGM_FI_DEV_DIAG_PCIE_RESULT
      • DCGM_FI_DEV_DIAG_TARGETED_STRESS_RESULT
      • DCGM_FI_DEV_DIAG_TARGETED_POWER_RESULT
      • DCGM_FI_DEV_DIAG_MEMORY_BANDWIDTH_RESULT
      • DCGM_FI_DEV_DIAG_MEMTEST_RESULT
      • DCGM_FI_DEV_DIAG_PULSE_TEST_RESULT
      • DCGM_FI_DEV_DIAG_EUD_RESULT
      • DCGM_FI_DEV_DIAG_CPU_EUD_RESULT
      • DCGM_FI_DEV_DIAG_SOFTWARE_RESULT
      • DCGM_FI_DEV_DIAG_NVBANDWIDTH_RESULT
      • DCGM_FI_DEV_DIAG_STATUS
      • DCGM_FI_DEV_BANKS_REMAP_ROWS_AVAIL_MAX
      • DCGM_FI_DEV_BANKS_REMAP_ROWS_AVAIL_HIGH
      • DCGM_FI_DEV_BANKS_REMAP_ROWS_AVAIL_PARTIAL
      • DCGM_FI_DEV_BANKS_REMAP_ROWS_AVAIL_LOW
      • DCGM_FI_DEV_BANKS_REMAP_ROWS_AVAIL_NONE
      • DCGM_FI_DEV_RETIRED_SBE
      • DCGM_FI_DEV_RETIRED_DBE
      • DCGM_FI_DEV_RETIRED_PENDING
      • DCGM_FI_DEV_UNCORRECTABLE_REMAPPED_ROWS
      • DCGM_FI_DEV_CORRECTABLE_REMAPPED_ROWS
      • DCGM_FI_DEV_ROW_REMAP_FAILURE
      • DCGM_FI_DEV_ROW_REMAP_PENDING
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L0
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L1
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L2
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L3
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L4
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L5
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_TOTAL
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L0
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L1
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L2
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L3
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L4
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L5
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_TOTAL
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L0
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L1
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L2
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L3
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L4
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L5
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_TOTAL
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L0
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L1
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L2
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L3
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L4
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L5
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_TOTAL
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L0
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L1
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L2
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L3
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L4
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L5
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_TOTAL
      • DCGM_FI_DEV_GPU_NVLINK_ERRORS
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L6
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L7
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L8
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L9
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L10
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L11
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L6
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L7
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L8
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L9
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L10
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L11
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L6
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L7
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L8
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L9
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L10
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L11
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L6
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L7
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L8
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L9
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L10
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L11
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L6
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L7
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L8
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L9
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L10
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L11
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L12
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L13
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L14
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L15
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L16
      • DCGM_FI_DEV_NVLINK_CRC_FLIT_ERROR_COUNT_L17
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L12
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L13
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L14
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L15
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L16
      • DCGM_FI_DEV_NVLINK_CRC_DATA_ERROR_COUNT_L17
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L12
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L13
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L14
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L15
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L16
      • DCGM_FI_DEV_NVLINK_REPLAY_ERROR_COUNT_L17
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L12
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L13
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L14
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L15
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L16
      • DCGM_FI_DEV_NVLINK_RECOVERY_ERROR_COUNT_L17
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L12
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L13
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L14
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L15
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L16
      • DCGM_FI_DEV_NVLINK_BANDWIDTH_L17
      • DCGM_FI_DEV_NVLINK_ERROR_DL_CRC
      • DCGM_FI_DEV_NVLINK_ERROR_DL_RECOVERY
      • DCGM_FI_DEV_NVLINK_ERROR_DL_REPLAY
      • DCGM_FI_DEV_VIRTUAL_MODE
      • DCGM_FI_DEV_SUPPORTED_TYPE_INFO
      • DCGM_FI_DEV_CREATABLE_VGPU_TYPE_IDS
      • DCGM_FI_DEV_VGPU_INSTANCE_IDS
      • DCGM_FI_DEV_VGPU_UTILIZATIONS
      • DCGM_FI_DEV_VGPU_PER_PROCESS_UTILIZATION
      • DCGM_FI_DEV_ENC_STATS
      • DCGM_FI_DEV_FBC_STATS
      • DCGM_FI_DEV_FBC_SESSIONS_INFO
      • DCGM_FI_DEV_SUPPORTED_VGPU_TYPE_IDS
      • DCGM_FI_DEV_VGPU_TYPE_INFO
      • DCGM_FI_DEV_VGPU_TYPE_NAME
      • DCGM_FI_DEV_VGPU_TYPE_CLASS
      • DCGM_FI_DEV_VGPU_TYPE_LICENSE
      • DCGM_FI_DEV_VGPU_VM_ID
      • DCGM_FI_DEV_VGPU_VM_NAME
      • DCGM_FI_DEV_VGPU_TYPE
      • DCGM_FI_DEV_VGPU_UUID
      • DCGM_FI_DEV_VGPU_DRIVER_VERSION
      • DCGM_FI_DEV_VGPU_MEMORY_USAGE
      • DCGM_FI_DEV_VGPU_LICENSE_STATUS
      • DCGM_FI_DEV_VGPU_FRAME_RATE_LIMIT
      • DCGM_FI_DEV_VGPU_ENC_STATS
      • DCGM_FI_DEV_VGPU_ENC_SESSIONS_INFO
      • DCGM_FI_DEV_VGPU_FBC_STATS
      • DCGM_FI_DEV_VGPU_FBC_SESSIONS_INFO
      • DCGM_FI_DEV_VGPU_INSTANCE_LICENSE_STATE
      • DCGM_FI_DEV_VGPU_PCI_ID
      • DCGM_FI_DEV_VGPU_VM_GPU_INSTANCE_ID
      • DCGM_FI_FIRST_VGPU_FIELD_ID
      • DCGM_FI_LAST_VGPU_FIELD_ID
      • DCGM_FI_MAX_VGPU_FIELDS
      • DCGM_FI_DEV_PLATFORM_INFINIBAND_GUID
      • DCGM_FI_DEV_PLATFORM_CHASSIS_SERIAL_NUMBER
      • DCGM_FI_DEV_PLATFORM_CHASSIS_SLOT_NUMBER
      • DCGM_FI_DEV_PLATFORM_TRAY_INDEX
      • DCGM_FI_DEV_PLATFORM_HOST_ID
      • DCGM_FI_DEV_PLATFORM_PEER_TYPE
      • DCGM_FI_DEV_PLATFORM_MODULE_ID
      • DCGM_FI_INTERNAL_FIELDS_0_START
      • DCGM_FI_INTERNAL_FIELDS_0_END
      • DCGM_FI_FIRST_NVSWITCH_FIELD_ID
      • DCGM_FI_DEV_NVSWITCH_VOLTAGE_MVOLT
      • DCGM_FI_DEV_NVSWITCH_CURRENT_IDDQ
      • DCGM_FI_DEV_NVSWITCH_CURRENT_IDDQ_REV
      • DCGM_FI_DEV_NVSWITCH_CURRENT_IDDQ_DVDD
      • DCGM_FI_DEV_NVSWITCH_POWER_VDD
      • DCGM_FI_DEV_NVSWITCH_POWER_DVDD
      • DCGM_FI_DEV_NVSWITCH_POWER_HVDD
      • DCGM_FI_DEV_NVSWITCH_LINK_THROUGHPUT_TX
      • DCGM_FI_DEV_NVSWITCH_LINK_THROUGHPUT_RX
      • DCGM_FI_DEV_NVSWITCH_LINK_FATAL_ERRORS
      • DCGM_FI_DEV_NVSWITCH_LINK_NON_FATAL_ERRORS
      • DCGM_FI_DEV_NVSWITCH_LINK_REPLAY_ERRORS
      • DCGM_FI_DEV_NVSWITCH_LINK_RECOVERY_ERRORS
      • DCGM_FI_DEV_NVSWITCH_LINK_FLIT_ERRORS
      • DCGM_FI_DEV_NVSWITCH_LINK_CRC_ERRORS
      • DCGM_FI_DEV_NVSWITCH_LINK_ECC_ERRORS
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_LOW_VC0
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_LOW_VC1
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_LOW_VC2
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_LOW_VC3
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_MEDIUM_VC0
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_MEDIUM_VC1
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_MEDIUM_VC2
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_MEDIUM_VC3
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_HIGH_VC0
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_HIGH_VC1
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_HIGH_VC2
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_HIGH_VC3
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_PANIC_VC0
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_PANIC_VC1
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_PANIC_VC2
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_PANIC_VC3
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_COUNT_VC0
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_COUNT_VC1
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_COUNT_VC2
      • DCGM_FI_DEV_NVSWITCH_LINK_LATENCY_COUNT_VC3
      • DCGM_FI_DEV_NVSWITCH_LINK_CRC_ERRORS_LANE0
      • DCGM_FI_DEV_NVSWITCH_LINK_CRC_ERRORS_LANE1
      • DCGM_FI_DEV_NVSWITCH_LINK_CRC_ERRORS_LANE2
      • DCGM_FI_DEV_NVSWITCH_LINK_CRC_ERRORS_LANE3
      • DCGM_FI_DEV_NVSWITCH_LINK_ECC_ERRORS_LANE0
      • DCGM_FI_DEV_NVSWITCH_LINK_ECC_ERRORS_LANE1
      • DCGM_FI_DEV_NVSWITCH_LINK_ECC_ERRORS_LANE2
      • DCGM_FI_DEV_NVSWITCH_LINK_ECC_ERRORS_LANE3
      • DCGM_FI_DEV_NVSWITCH_LINK_CRC_ERRORS_LANE4
      • DCGM_FI_DEV_NVSWITCH_LINK_CRC_ERRORS_LANE5
      • DCGM_FI_DEV_NVSWITCH_LINK_CRC_ERRORS_LANE6
      • DCGM_FI_DEV_NVSWITCH_LINK_CRC_ERRORS_LANE7
      • DCGM_FI_DEV_NVSWITCH_LINK_ECC_ERRORS_LANE4
      • DCGM_FI_DEV_NVSWITCH_LINK_ECC_ERRORS_LANE5
      • DCGM_FI_DEV_NVSWITCH_LINK_ECC_ERRORS_LANE6
      • DCGM_FI_DEV_NVSWITCH_LINK_ECC_ERRORS_LANE7
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L0
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L1
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L2
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L3
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L4
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L5
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L6
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L7
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L8
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L9
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L10
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L11
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L12
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L13
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L14
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L15
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L16
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_L17
      • DCGM_FI_DEV_NVLINK_TX_BANDWIDTH_TOTAL
      • DCGM_FI_DEV_NVSWITCH_FATAL_ERRORS
      • DCGM_FI_DEV_NVSWITCH_NON_FATAL_ERRORS
      • DCGM_FI_DEV_NVSWITCH_TEMPERATURE_CURRENT
      • DCGM_FI_DEV_NVSWITCH_TEMPERATURE_LIMIT_SLOWDOWN
      • DCGM_FI_DEV_NVSWITCH_TEMPERATURE_LIMIT_SHUTDOWN
      • DCGM_FI_DEV_NVSWITCH_THROUGHPUT_TX
      • DCGM_FI_DEV_NVSWITCH_THROUGHPUT_RX
      • DCGM_FI_DEV_NVSWITCH_PHYS_ID
      • DCGM_FI_DEV_NVSWITCH_RESET_REQUIRED
      • DCGM_FI_DEV_NVSWITCH_LINK_ID
      • DCGM_FI_DEV_NVSWITCH_PCIE_DOMAIN
      • DCGM_FI_DEV_NVSWITCH_PCIE_BUS
      • DCGM_FI_DEV_NVSWITCH_PCIE_DEVICE
      • DCGM_FI_DEV_NVSWITCH_PCIE_FUNCTION
      • DCGM_FI_DEV_NVSWITCH_LINK_STATUS
      • DCGM_FI_DEV_NVSWITCH_LINK_TYPE
      • DCGM_FI_DEV_NVSWITCH_LINK_REMOTE_PCIE_DOMAIN
      • DCGM_FI_DEV_NVSWITCH_LINK_REMOTE_PCIE_BUS
      • DCGM_FI_DEV_NVSWITCH_LINK_REMOTE_PCIE_DEVICE
      • DCGM_FI_DEV_NVSWITCH_LINK_REMOTE_PCIE_FUNCTION
      • DCGM_FI_DEV_NVSWITCH_LINK_DEVICE_LINK_ID
      • DCGM_FI_DEV_NVSWITCH_LINK_DEVICE_LINK_SID
      • DCGM_FI_DEV_NVSWITCH_DEVICE_UUID
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L0
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L1
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L2
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L3
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L4
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L5
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L6
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L7
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L8
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L9
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L10
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L11
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L12
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L13
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L14
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L15
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L16
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_L17
      • DCGM_FI_DEV_NVLINK_RX_BANDWIDTH_TOTAL
      • DCGM_FI_LAST_NVSWITCH_FIELD_ID
      • DCGM_FI_MAX_NVSWITCH_FIELDS
      • DCGM_FI_PROF_GR_ENGINE_ACTIVE
      • DCGM_FI_PROF_SM_ACTIVE
      • DCGM_FI_PROF_SM_OCCUPANCY
      • DCGM_FI_PROF_PIPE_TENSOR_ACTIVE
      • DCGM_FI_PROF_DRAM_ACTIVE
      • DCGM_FI_PROF_PIPE_FP64_ACTIVE
      • DCGM_FI_PROF_PIPE_FP32_ACTIVE
      • DCGM_FI_PROF_PIPE_FP16_ACTIVE
      • DCGM_FI_PROF_PCIE_TX_BYTES
      • DCGM_FI_PROF_PCIE_RX_BYTES
      • DCGM_FI_PROF_NVLINK_TX_BYTES
      • DCGM_FI_PROF_NVLINK_RX_BYTES
      • DCGM_FI_PROF_PIPE_TENSOR_IMMA_ACTIVE
      • DCGM_FI_PROF_PIPE_TENSOR_HMMA_ACTIVE
      • DCGM_FI_PROF_PIPE_TENSOR_DFMA_ACTIVE
      • DCGM_FI_PROF_PIPE_INT_ACTIVE
      • DCGM_FI_PROF_NVDEC0_ACTIVE
      • DCGM_FI_PROF_NVDEC1_ACTIVE
      • DCGM_FI_PROF_NVDEC2_ACTIVE
      • DCGM_FI_PROF_NVDEC3_ACTIVE
      • DCGM_FI_PROF_NVDEC4_ACTIVE
      • DCGM_FI_PROF_NVDEC5_ACTIVE
      • DCGM_FI_PROF_NVDEC6_ACTIVE
      • DCGM_FI_PROF_NVDEC7_ACTIVE
      • DCGM_FI_PROF_NVJPG0_ACTIVE
      • DCGM_FI_PROF_NVJPG1_ACTIVE
      • DCGM_FI_PROF_NVJPG2_ACTIVE
      • DCGM_FI_PROF_NVJPG3_ACTIVE
      • DCGM_FI_PROF_NVJPG4_ACTIVE
      • DCGM_FI_PROF_NVJPG5_ACTIVE
      • DCGM_FI_PROF_NVJPG6_ACTIVE
      • DCGM_FI_PROF_NVJPG7_ACTIVE
      • DCGM_FI_PROF_NVOFA0_ACTIVE
      • DCGM_FI_PROF_NVOFA1_ACTIVE
      • DCGM_FI_PROF_NVLINK_L0_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L0_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L1_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L1_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L2_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L2_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L3_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L3_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L4_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L4_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L5_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L5_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L6_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L6_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L7_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L7_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L8_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L8_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L9_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L9_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L10_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L10_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L11_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L11_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L12_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L12_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L13_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L13_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L14_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L14_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L15_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L15_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L16_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L16_RX_BYTES
      • DCGM_FI_PROF_NVLINK_L17_TX_BYTES
      • DCGM_FI_PROF_NVLINK_L17_RX_BYTES
      • DCGM_FI_PROF_NVLINK_THROUGHPUT_FIRST
      • DCGM_FI_PROF_NVLINK_THROUGHPUT_LAST
      • DCGM_FI_PROF_C2C_TX_ALL_BYTES
      • DCGM_FI_PROF_C2C_TX_DATA_BYTES
      • DCGM_FI_PROF_C2C_RX_ALL_BYTES
      • DCGM_FI_PROF_C2C_RX_DATA_BYTES
      • DCGM_FI_DEV_CPU_UTIL_TOTAL
      • DCGM_FI_DEV_CPU_UTIL_USER
      • DCGM_FI_DEV_CPU_UTIL_NICE
      • DCGM_FI_DEV_CPU_UTIL_SYS
      • DCGM_FI_DEV_CPU_UTIL_IRQ
      • DCGM_FI_DEV_CPU_TEMP_CURRENT
      • DCGM_FI_DEV_CPU_TEMP_WARNING
      • DCGM_FI_DEV_CPU_TEMP_CRITICAL
      • DCGM_FI_DEV_CPU_CLOCK_CURRENT
      • DCGM_FI_DEV_CPU_POWER_UTIL_CURRENT
      • DCGM_FI_DEV_CPU_POWER_LIMIT
      • DCGM_FI_DEV_SYSIO_POWER_UTIL_CURRENT
      • DCGM_FI_DEV_MODULE_POWER_UTIL_CURRENT
      • DCGM_FI_DEV_CPU_VENDOR
      • DCGM_FI_DEV_CPU_MODEL
      • DCGM_FI_DEV_NVLINK_COUNT_TX_PACKETS
      • DCGM_FI_DEV_NVLINK_COUNT_TX_BYTES
      • DCGM_FI_DEV_NVLINK_COUNT_RX_PACKETS
      • DCGM_FI_DEV_NVLINK_COUNT_RX_BYTES
      • DCGM_FI_DEV_NVLINK_COUNT_RX_MALFORMED_PACKET_ERRORS
      • DCGM_FI_DEV_NVLINK_COUNT_RX_BUFFER_OVERRUN_ERRORS
      • DCGM_FI_DEV_NVLINK_COUNT_RX_ERRORS
      • DCGM_FI_DEV_NVLINK_COUNT_RX_REMOTE_ERRORS
      • DCGM_FI_DEV_NVLINK_COUNT_RX_GENERAL_ERRORS
      • DCGM_FI_DEV_NVLINK_COUNT_LOCAL_LINK_INTEGRITY_ERRORS
      • DCGM_FI_DEV_NVLINK_COUNT_TX_DISCARDS
      • DCGM_FI_DEV_NVLINK_COUNT_LINK_RECOVERY_SUCCESSFUL_EVENTS
      • DCGM_FI_DEV_NVLINK_COUNT_LINK_RECOVERY_FAILED_EVENTS
      • DCGM_FI_DEV_NVLINK_COUNT_LINK_RECOVERY_EVENTS
      • DCGM_FI_DEV_NVLINK_COUNT_RX_SYMBOL_ERRORS
      • DCGM_FI_DEV_NVLINK_COUNT_SYMBOL_BER
      • DCGM_FI_DEV_NVLINK_COUNT_SYMBOL_BER_FLOAT
      • DCGM_FI_DEV_NVLINK_COUNT_EFFECTIVE_BER
      • DCGM_FI_DEV_NVLINK_COUNT_EFFECTIVE_BER_FLOAT
      • DCGM_FI_DEV_NVLINK_COUNT_EFFECTIVE_ERRORS
      • DCGM_FI_DEV_FIRST_CONNECTX_FIELD_ID
      • DCGM_FI_DEV_CONNECTX_HEALTH
      • DCGM_FI_DEV_CONNECTX_ACTIVE_PCIE_LINK_WIDTH
      • DCGM_FI_DEV_CONNECTX_ACTIVE_PCIE_LINK_SPEED
      • DCGM_FI_DEV_CONNECTX_EXPECT_PCIE_LINK_WIDTH
      • DCGM_FI_DEV_CONNECTX_EXPECT_PCIE_LINK_SPEED
      • DCGM_FI_DEV_CONNECTX_CORRECTABLE_ERR_STATUS
      • DCGM_FI_DEV_CONNECTX_CORRECTABLE_ERR_MASK
      • DCGM_FI_DEV_CONNECTX_UNCORRECTABLE_ERR_STATUS
      • DCGM_FI_DEV_CONNECTX_UNCORRECTABLE_ERR_MASK
      • DCGM_FI_DEV_CONNECTX_UNCORRECTABLE_ERR_SEVERITY
      • DCGM_FI_DEV_CONNECTX_DEVICE_TEMPERATURE
      • DCGM_FI_DEV_LAST_CONNECTX_FIELD_ID
      • DCGM_FI_DEV_C2C_LINK_ERROR_INTR
      • DCGM_FI_DEV_C2C_LINK_ERROR_REPLAY
      • DCGM_FI_DEV_C2C_LINK_ERROR_REPLAY_B2B
      • DCGM_FI_DEV_C2C_LINK_POWER_STATE
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_0
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_1
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_2
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_3
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_4
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_5
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_6
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_7
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_8
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_9
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_10
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_11
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_12
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_13
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_14
      • DCGM_FI_DEV_NVLINK_COUNT_FEC_HISTORY_15
      • DCGM_FI_DEV_CLOCKS_EVENT_REASON_SW_POWER_CAP_NS
      • DCGM_FI_DEV_CLOCKS_EVENT_REASON_SYNC_BOOST_NS
      • DCGM_FI_DEV_CLOCKS_EVENT_REASON_SW_THERM_SLOWDOWN_NS
      • DCGM_FI_DEV_CLOCKS_EVENT_REASON_HW_THERM_SLOWDOWN_NS
      • DCGM_FI_DEV_CLOCKS_EVENT_REASON_HW_POWER_BRAKE_SLOWDOWN_NS
      • DCGM_FI_MAX_FIELDS
      • DcgmFieldGetById()
      • DcgmFieldGetByTag()
      • DcgmFieldsInit()
      • DcgmFieldsTerm()
      • DcgmFieldsGetEntityGroupString()
    • Execution Control
      • dcgmUpdateAllFields()
      • dcgmPolicyTrigger()
  • Data Structures
    • DCGM_HOME_DIR_VAR_NAME
    • DCGM_RUN_FLAGS_VERBOSE
    • DCGM_RUN_FLAGS_STATSONFAIL
    • DCGM_RUN_FLAGS_TRAIN
    • DCGM_RUN_FLAGS_FORCE_TRAIN
    • DCGM_RUN_FLAGS_FAIL_EARLY
    • DCGM_TOPO_HINT_F_NONE
    • DCGM_TOPO_HINT_F_IGNOREHEALTH
    • dcgmConnectV2Params_version1
    • dcgmConnectV2Params_version2
    • dcgmConnectV2Params_version
    • dcgmHostengineHealth_version1
    • dcgmHostengineHealth_version
    • dcgmGroupInfo_version2
    • dcgmGroupInfo_version3
    • dcgmGroupInfo_version
    • DCGM_MAX_INSTANCES_PER_GPU
    • DCGM_MAX_COMPUTE_INSTANCES_PER_GPU
    • DCGM_MAX_TOTAL_INSTANCES_PER_GPU
    • DCGM_MAX_HIERARCHY_INFO
    • DCGM_MAX_INSTANCES
    • DCGM_MAX_COMPUTE_INSTANCES
    • dcgmMigHierarchy_version2
    • dcgmMigHierarchy_version
    • DCGM_CPU_CORE_BITMASK_COUNT_V1
    • dcgmCpuHierarchyOwnedCores_version1
    • dcgmCpuHierarchy_version1
    • dcgmCpuHierarchy_version2
    • dcgmCpuHierarchy_version
    • DCGM_MAX_NUM_FIELD_GROUPS
    • DCGM_MAX_FIELD_IDS_PER_FIELD_GROUP
    • dcgmFieldGroupInfo_version1
    • dcgmFieldGroupInfo_version
    • dcgmAllFieldGroup_version1
    • dcgmAllFieldGroup_version
    • dcgmClockSet_version1
    • dcgmClockSet_version
    • dcgmDeviceSupportedClockSets_version1
    • dcgmDeviceSupportedClockSets_version
    • dcgmDevicePidAccountingStats_version1
    • dcgmDevicePidAccountingStats_version
    • dcgmDeviceThermals_version1
    • dcgmDeviceThermals_version
    • dcgmDevicePowerLimits_version1
    • dcgmDevicePowerLimits_version
    • dcgmDeviceIdentifiers_version1
    • dcgmDeviceIdentifiers_version
    • dcgmDeviceMemoryUsage_version1
    • dcgmDeviceMemoryUsage_version
    • dcgmDeviceVgpuUtilInfo_version1
    • dcgmDeviceVgpuUtilInfo_version
    • dcgmDeviceEncStats_version1
    • dcgmDeviceEncStats_version
    • dcgmDeviceFbcStats_version1
    • dcgmDeviceFbcStats_version
    • dcgmDeviceFbcSessionInfo_version1
    • dcgmDeviceFbcSessionInfo_version
    • dcgmDeviceFbcSessions_version1
    • dcgmDeviceFbcSessions_version
    • dcgmDeviceVgpuEncSessions_version1
    • dcgmDeviceVgpuEncSessions_version
    • dcgmDeviceVgpuProcessUtilInfo_version1
    • dcgmDeviceVgpuProcessUtilInfo_version
    • dcgmDeviceVgpuTypeInfo_version1
    • dcgmDeviceVgpuTypeInfo_version2
    • dcgmDeviceVgpuTypeInfo_version
    • dcgmDeviceSupportedVgpuTypeInfo_version1
    • dcgmDeviceSupportedVgpuTypeInfo_version
    • dcgmDeviceSettings_version2
    • dcgmDeviceSettings_version
    • dcgmDeviceAttributes_version3
    • dcgmDeviceAttributes_version
    • dcgmDeviceMigAttributesInfo_version1
    • dcgmDeviceMigAttributesInfo_version
    • dcgmDeviceMigAttributes_version1
    • dcgmDeviceMigAttributes_version
    • dcgmGpuInstanceProfileInfo_version1
    • dcgmGpuInstanceProfileInfo_version
    • dcgmGpuInstanceProfiles_version1
    • dcgmGpuInstanceProfiles_version
    • dcgmComputeInstanceProfileInfo_version1
    • dcgmComputeInstanceProfileInfo_version
    • dcgmComputeInstanceProfiles_version1
    • dcgmComputeInstanceProfiles_version
    • DCGM_MAX_VGPU_TYPES_PER_PGPU
    • DCGM_DEVICE_UUID_BUFFER_SIZE
    • DCGM_POWER_PROFILE_ARRAY_SIZE
    • DCGM_POWER_PROFILE_MASK_BITS_PER_ELEM
    • DCGM_POWER_PROFILE_MAX_NUM
    • dcgmWorkloadPowerProfileInfo_version1
    • dcgmWorkloadPowerProfileInfo_version
    • dcgmWorkloadPowerProfileProfilesInfo_version1
    • dcgmWorkloadPowerProfileProfilesInfo_version
    • dcgmDeviceWorkloadPowerProfilesStatus_version1
    • dcgmDeviceWorkloadPowerProfilesStatus_version
    • dcgmConfig_version1
    • dcgmConfig_version2
    • dcgmConfig_version
    • dcgmPolicyViolation_version1
    • dcgmPolicyViolation_version
    • DCGM_POLICY_COND_IDX_MAX
    • DCGM_POLICY_COND_MAX
    • dcgmPolicy_version1
    • dcgmPolicy_version
    • dcgmPolicyCallbackResponse_version2
    • dcgmPolicyCallbackResponse_version
    • DCGM_MAX_BLOB_LENGTH
    • dcgmFieldValue_version1
    • dcgmFieldValue_version2
    • DCGM_FV_FLAG_LIVE_DATA
    • DCGM_HEALTH_WATCH_COUNT_V1
    • DCGM_HEALTH_WATCH_COUNT_V2
    • DCGM_ERR_MSG_LENGTH
    • DCGM_DIAG_AUX_DATA_LEN
    • dcgmDiagTestAuxData_version1
    • dcgmDiagTestAuxData_version
    • DCGM_DIAG_TEST_RUN_ERROR_INDICES_MAX
    • DCGM_DIAG_TEST_RUN_INFO_INDICES_MAX
    • DCGM_DIAG_TEST_RUN_INFO_INDICES_MAX_V2
    • DCGM_DIAG_TEST_RUN_RESULTS_MAX
    • DCGM_DIAG_TEST_RUN_NAME_LEN
    • DCGM_DEVICE_ID_LEN
    • DCGM_VERSION_LEN
    • DCGM_HEALTH_WATCH_MAX_INCIDENTS_V2
    • dcgmHealthResponse_version5
    • dcgmHealthResponse_version
    • dcgmHealthSetParams_version2
    • DCGM_MAX_PID_INFO_NUM
    • dcgmPidInfo_version2
    • dcgmPidInfo_version
    • dcgmJobInfo_version3
    • dcgmJobInfo_version
    • dcgmRunningProcess_version1
    • dcgmRunningProcess_version
    • DCGM_MAX_ERRORS
    • DCGM_SM_PERF_INDEX
    • DCGM_TARGETED_PERF_INDEX
    • DCGM_PER_GPU_TEST_COUNT_V8
    • DCGM_PER_GPU_TEST_COUNT_V7
    • DCGM_SWTEST_COUNT
    • LEVEL_ONE_MAX_RESULTS
    • DCGM_DIAG_RESPONSE_TESTS_MAX
    • DCGM_DIAG_RESPONSE_SYSTEM_ERROR
    • DCGM_DIAG_RESPONSE_ERRORS_MAX
    • DCGM_DIAG_RESPONSE_INFO_MAX
    • DCGM_DIAG_RESPONSE_INFO_MAX_V2
    • DCGM_DIAG_RESPONSE_ENTITIES_MAX
    • DCGM_DIAG_RESPONSE_RESULTS_MAX
    • DCGM_DIAG_RESPONSE_CATEGORIES_MAX
    • DCGM_DIAG_RESPONSE_CATEGORY_LEN
    • DCGM_DIAG_RESPONSE_V11_UNUSED_LEN
    • DCGM_DIAG_RESPONSE_V12_UNUSED_LEN
    • dcgmDiagResponse_version12
    • dcgmDiagResponse_version11
    • dcgmDiagResponse_version10
    • dcgmDiagResponse_version9
    • dcgmDiagResponse_version8
    • dcgmDiagResponse_version7
    • dcgmDiagResponse_version
    • dcgmDiagStatus_version1
    • dcgmDiagStatus_version
    • DCGM_TOPOLOGY_PATH_PCI
    • DCGM_TOPOLOGY_PATH_NVLINK
    • DCGM_AFFINITY_BITMASK_ARRAY_SIZE
    • dcgmDeviceTopology_version1
    • dcgmDeviceTopology_version
    • dcgmGroupTopology_version1
    • dcgmGroupTopology_version
    • dcgmIntrospectMemory_version1
    • dcgmIntrospectMemory_version
    • dcgmIntrospectCpuUtil_version1
    • dcgmIntrospectCpuUtil_version
    • DCGM_MAX_CONFIG_FILE_LEN
    • DCGM_MAX_TEST_NAMES
    • DCGM_MAX_TEST_NAMES_LEN
    • DCGM_MAX_TEST_PARMS
    • DCGM_MAX_TEST_PARMS_LEN
    • DCGM_MAX_TEST_PARMS_LEN_V2
    • DCGM_GPU_LIST_LEN
    • DCGM_ENTITY_ID_LIST_LEN
    • DCGM_EXPECTED_ENTITIES_LEN
    • DCGM_FILE_LEN
    • DCGM_PATH_LEN
    • DCGM_CLOCKS_EVENT_MASK_LEN
    • DCGM_IGNORE_ERROR_MAX_LEN
    • DCGM_THROTTLE_MASK_LEN
    • dcgmRunDiag_version7
    • dcgmRunDiag_version8
    • dcgmRunDiag_version9
    • dcgmRunDiag_version10
    • DCGM_GEGE_FLAG_ONLY_SUPPORTED
    • dcgmTopoSchedHint_version1
    • dcgmNvLinkStatus_version4
    • DCGM_SUMMARY_MIN
    • DCGM_SUMMARY_MAX
    • DCGM_SUMMARY_AVG
    • DCGM_SUMMARY_SUM
    • DCGM_SUMMARY_COUNT
    • DCGM_SUMMARY_INTEGRAL
    • DCGM_SUMMARY_DIFF
    • DCGM_SUMMARY_SIZE
    • dcgmFieldSummaryRequest_version1
    • DCGM_MODULE_STATUSES_CAPACITY
    • dcgmModuleGetStatuses_version1
    • dcgmModuleGetStatuses_version
    • dcgmStartEmbeddedV2Params_version1
    • dcgmStartEmbeddedV2Params_version2
    • DCGM_PROF_MAX_NUM_GROUPS_V2
    • DCGM_PROF_MAX_FIELD_IDS_PER_GROUP_V2
    • dcgmProfGetMetricGroups_version3
    • dcgmProfGetMetricGroups_version
    • dcgmSettingsSetLoggingSeverity_version1
    • dcgmSettingsSetLoggingSeverity_version2
    • dcgmSettingsSetLoggingSeverity_version
    • dcgmVersionInfo_version2
    • dcgmVersionInfo_version
    • dcgmHandle_t
    • dcgmGpuGrp_t
    • dcgmFieldGrp_t
    • dcgmStatus_t
    • dcgm_link_t
    • dcgmConnectV2Params_t
    • dcgmHostengineHealth_t
    • dcgmGroupInfo_t
    • dcgmCpuHierarchyOwnedCores_t
    • dcgmCpuHierarchy_t
    • dcgmFieldGroupInfo_t
    • dcgmAllFieldGroup_t
    • dcgmClockSet_t
    • dcgmDeviceSupportedClockSets_t
    • dcgmDevicePidAccountingStats_t
    • dcgmDeviceThermals_t
    • dcgmDevicePowerLimits_t
    • dcgmDeviceIdentifiers_t
    • dcgmDeviceMemoryUsage_t
    • dcgmDeviceVgpuUtilInfo_t
    • dcgmDeviceEncStats_t
    • dcgmDeviceFbcStats_t
    • dcgmFBCSessionType_t
    • dcgmDeviceFbcSessionInfo_t
    • dcgmDeviceFbcSessions_t
    • dcgmEncoderType_t
    • dcgmDeviceVgpuEncSessions_t
    • dcgmDeviceVgpuProcessUtilInfo_t
    • dcgmDeviceVgpuTypeInfo_t
    • dcgmDeviceSupportedVgpuTypeInfo_t
    • dcgmDeviceSettings_t
    • dcgmDeviceAttributes_t
    • dcgmDeviceMigAttributesInfo_t
    • dcgmDeviceMigAttributes_t
    • dcgmGpuInstanceProfileInfo_t
    • dcgmGpuInstanceProfiles_t
    • dcgmComputeInstanceProfileInfo_t
    • dcgmComputeInstanceProfiles_t
    • dcgmWorkloadPowerProfileInfo_t
    • dcgmWorkloadPowerProfileProfilesInfo_t
    • dcgmDeviceWorkloadPowerProfilesStatus_t
    • dcgmConfig_t
    • dcgmPolicyViolation_t
    • dcgmPolicyConditionIdx_t
    • dcgmPolicyCondition_t
    • dcgmPolicyConditionParams_t
    • dcgmPolicyMode_t
    • dcgmPolicyIsolation_t
    • dcgmPolicyAction_t
    • dcgmPolicyValidation_t
    • dcgmPolicyFailureResp_t
    • dcgmPolicy_t
    • dcgmPolicyCallbackResponse_t
    • fpRecvUpdates
    • dcgmFieldValueEnumeration_f
    • dcgmFieldValueEntityEnumeration_f
    • dcgmHealthSystems_t
    • dcgmHealthWatchResults_t
    • dcgmDiagResult_t
    • dcgmHealthResponse_t
    • dcgmPidInfo_t
    • dcgmJobInfo_t
    • dcgmRunningProcess_t
    • dcgmPerGpuTestIndices_t
    • dcgmSoftwareTest_t
    • dcgmDiagResponse_t
    • dcgmDiagStatus_t
    • dcgmGpuTopologyLevel_t
    • dcgmDeviceTopology_t
    • dcgmGroupTopology_t
    • dcgmIntrospectMemory_t
    • dcgmIntrospectCpuUtil_t
    • dcgmGpuNVLinkErrorType_t
    • dcgmTopoSchedHint_t
    • dcgmNvLinkLinkState_t
    • dcgmNvLinkStatus_t
    • dcgmFieldSummaryRequest_t
    • dcgmModuleGetStatuses_t
    • dcgmProfGetMetricGroups_t
    • dcgmSettingsSetLoggingSeverity_t
    • dcgmVersionInfo_t
    • DcgmLoggingSeverity_t
      • DcgmLoggingSeverityUnspecified
      • DcgmLoggingSeverityNone
      • DcgmLoggingSeverityFatal
      • DcgmLoggingSeverityError
      • DcgmLoggingSeverityWarning
      • DcgmLoggingSeverityInfo
      • DcgmLoggingSeverityDebug
      • DcgmLoggingSeverityVerbose
    • dcgmMigProfile_t
      • DcgmMigProfileNone
      • DcgmMigProfileGpuInstanceSlice1
      • DcgmMigProfileGpuInstanceSlice2
      • DcgmMigProfileGpuInstanceSlice3
      • DcgmMigProfileGpuInstanceSlice4
      • DcgmMigProfileGpuInstanceSlice7
      • DcgmMigProfileGpuInstanceSlice8
      • DcgmMigProfileGpuInstanceSlice6
      • DcgmMigProfileGpuInstanceSlice1Rev1
      • DcgmMigProfileGpuInstanceSlice2Rev1
      • DcgmMigProfileGpuInstanceSlice1Rev2
      • DcgmMigProfileGpuInstanceSlice1GFX
      • DcgmMigProfileGpuInstanceSlice2GFX
      • DcgmMigProfileGpuInstanceSlice4GFX
      • DcgmMigProfileComputeInstanceSlice1
      • DcgmMigProfileComputeInstanceSlice2
      • DcgmMigProfileComputeInstanceSlice3
      • DcgmMigProfileComputeInstanceSlice4
      • DcgmMigProfileComputeInstanceSlice7
      • DcgmMigProfileComputeInstanceSlice8
      • DcgmMigProfileComputeInstanceSlice6
      • DcgmMigProfileComputeInstanceSlice1Rev1
    • dcgmFBCSessionType_enum
      • DCGM_FBC_SESSION_TYPE_UNKNOWN
      • DCGM_FBC_SESSION_TYPE_TOSYS
      • DCGM_FBC_SESSION_TYPE_CUDA
      • DCGM_FBC_SESSION_TYPE_VID
      • DCGM_FBC_SESSION_TYPE_HWENC
    • dcgmEncoderQueryType_enum
      • DCGM_ENCODER_QUERY_H264
      • DCGM_ENCODER_QUERY_HEVC
    • dcgmPowerProfileType_t
      • DCGM_POWER_PROFILE_MAX_P
      • DCGM_POWER_PROFILE_MAX_Q
      • DCGM_POWER_PROFILE_COMPUTE
      • DCGM_POWER_PROFILE_MEMORY_BOUND
      • DCGM_POWER_PROFILE_NETWORK
      • DCGM_POWER_PROFILE_BALANCED
      • DCGM_POWER_PROFILE_LLM_INFERENCE
      • DCGM_POWER_PROFILE_LLM_TRAINING
      • DCGM_POWER_PROFILE_RBM
      • DCGM_POWER_PROFILE_DCPCIE
      • DCGM_POWER_PROFILE_HMMA_SPARSE
      • DCGM_POWER_PROFILE_HMMA_DENSE
      • DCGM_POWER_PROFILE_SYNC_BALANCED
      • DCGM_POWER_PROFILE_HPC
      • DCGM_POWER_PROFILE_MIG
      • DCGM_POWER_PROFILE_MAX
    • dcgmPolicyConditionIdx_enum
      • DCGM_POLICY_COND_IDX_DBE
      • DCGM_POLICY_COND_IDX_PCI
      • DCGM_POLICY_COND_IDX_MAX_PAGES_RETIRED
      • DCGM_POLICY_COND_IDX_THERMAL
      • DCGM_POLICY_COND_IDX_POWER
      • DCGM_POLICY_COND_IDX_NVLINK
      • DCGM_POLICY_COND_IDX_XID
    • dcgmPolicyCondition_enum
      • DCGM_POLICY_COND_DBE
      • DCGM_POLICY_COND_PCI
      • DCGM_POLICY_COND_MAX_PAGES_RETIRED
      • DCGM_POLICY_COND_THERMAL
      • DCGM_POLICY_COND_POWER
      • DCGM_POLICY_COND_NVLINK
      • DCGM_POLICY_COND_XID
    • dcgmPolicyMode_enum
      • DCGM_POLICY_MODE_AUTOMATED
      • DCGM_POLICY_MODE_MANUAL
    • dcgmPolicyIsolation_enum
      • DCGM_POLICY_ISOLATION_NONE
    • dcgmPolicyAction_enum
      • DCGM_POLICY_ACTION_NONE
      • DCGM_POLICY_ACTION_GPURESET
    • dcgmPolicyValidation_enum
      • DCGM_POLICY_VALID_NONE
      • DCGM_POLICY_VALID_SV_SHORT
      • DCGM_POLICY_VALID_SV_MED
      • DCGM_POLICY_VALID_SV_LONG
      • DCGM_POLICY_VALID_SV_XLONG
    • dcgmPolicyFailureResp_enum
      • DCGM_POLICY_FAILURE_NONE
    • dcgmHealthSystems_enum
      • DCGM_HEALTH_WATCH_PCIE
      • DCGM_HEALTH_WATCH_NVLINK
      • DCGM_HEALTH_WATCH_PMU
      • DCGM_HEALTH_WATCH_MCU
      • DCGM_HEALTH_WATCH_MEM
      • DCGM_HEALTH_WATCH_SM
      • DCGM_HEALTH_WATCH_INFOROM
      • DCGM_HEALTH_WATCH_THERMAL
      • DCGM_HEALTH_WATCH_POWER
      • DCGM_HEALTH_WATCH_DRIVER
      • DCGM_HEALTH_WATCH_NVSWITCH_NONFATAL
      • DCGM_HEALTH_WATCH_NVSWITCH_FATAL
      • DCGM_HEALTH_WATCH_ALL
    • dcgmHealthWatchResult_enum
      • DCGM_HEALTH_RESULT_PASS
      • DCGM_HEALTH_RESULT_WARN
      • DCGM_HEALTH_RESULT_FAIL
    • dcgmDiagResult_enum
      • DCGM_DIAG_RESULT_PASS
      • DCGM_DIAG_RESULT_SKIP
      • DCGM_DIAG_RESULT_WARN
      • DCGM_DIAG_RESULT_FAIL
      • DCGM_DIAG_RESULT_NOT_RUN
    • dcgmDiagnosticLevel_t
      • DCGM_DIAG_LVL_INVALID
      • DCGM_DIAG_LVL_SHORT
      • DCGM_DIAG_LVL_MED
      • DCGM_DIAG_LVL_LONG
      • DCGM_DIAG_LVL_XLONG
    • dcgmPerGpuTestIndices_enum
      • DCGM_MEMORY_INDEX
      • DCGM_DIAGNOSTIC_INDEX
      • DCGM_PCI_INDEX
      • DCGM_SM_STRESS_INDEX
      • DCGM_TARGETED_STRESS_INDEX
      • DCGM_TARGETED_POWER_INDEX
      • DCGM_MEMORY_BANDWIDTH_INDEX
      • DCGM_MEMTEST_INDEX
      • DCGM_PULSE_TEST_INDEX
      • DCGM_EUD_TEST_INDEX
      • DCGM_NVBANDWIDTH_INDEX
      • DCGM_UNUSED2_TEST_INDEX
      • DCGM_UNUSED3_TEST_INDEX
      • DCGM_UNUSED4_TEST_INDEX
      • DCGM_UNUSED5_TEST_INDEX
      • DCGM_SOFTWARE_INDEX
      • DCGM_CONTEXT_CREATE_INDEX
      • DCGM_UNKNOWN_INDEX
    • dcgmSoftwareTest_enum
      • DCGM_SWTEST_DENYLIST
      • DCGM_SWTEST_NVML_LIBRARY
      • DCGM_SWTEST_CUDA_MAIN_LIBRARY
      • DCGM_SWTEST_CUDA_RUNTIME_LIBRARY
      • DCGM_SWTEST_PERMISSIONS
      • DCGM_SWTEST_PERSISTENCE_MODE
      • DCGM_SWTEST_ENVIRONMENT
      • DCGM_SWTEST_PAGE_RETIREMENT
      • DCGM_SWTEST_GRAPHICS_PROCESSES
      • DCGM_SWTEST_INFOROM
      • DCGM_SWTEST_FABRIC_MANAGER
    • dcgmGpuLevel_enum
      • DCGM_TOPOLOGY_UNINITIALIZED
      • DCGM_TOPOLOGY_BOARD
      • DCGM_TOPOLOGY_SINGLE
      • DCGM_TOPOLOGY_MULTIPLE
      • DCGM_TOPOLOGY_HOSTBRIDGE
      • DCGM_TOPOLOGY_CPU
      • DCGM_TOPOLOGY_SYSTEM
      • DCGM_TOPOLOGY_NVLINK1
      • DCGM_TOPOLOGY_NVLINK2
      • DCGM_TOPOLOGY_NVLINK3
      • DCGM_TOPOLOGY_NVLINK4
      • DCGM_TOPOLOGY_NVLINK5
      • DCGM_TOPOLOGY_NVLINK6
      • DCGM_TOPOLOGY_NVLINK7
      • DCGM_TOPOLOGY_NVLINK8
      • DCGM_TOPOLOGY_NVLINK9
      • DCGM_TOPOLOGY_NVLINK10
      • DCGM_TOPOLOGY_NVLINK11
      • DCGM_TOPOLOGY_NVLINK12
      • DCGM_TOPOLOGY_NVLINK13
      • DCGM_TOPOLOGY_NVLINK14
      • DCGM_TOPOLOGY_NVLINK15
      • DCGM_TOPOLOGY_NVLINK16
      • DCGM_TOPOLOGY_NVLINK17
      • DCGM_TOPOLOGY_NVLINK18
    • dcgmGpuNVLinkErrorType_enum
      • DCGM_GPU_NVLINK_ERROR_RECOVERY_REQUIRED
      • DCGM_GPU_NVLINK_ERROR_FATAL
    • dcgmNvLinkLinkState_enum
      • DcgmNvLinkLinkStateNotSupported
      • DcgmNvLinkLinkStateDisabled
      • DcgmNvLinkLinkStateDown
      • DcgmNvLinkLinkStateUp
    • dcgmModuleId_t
      • DcgmModuleIdCore
      • DcgmModuleIdNvSwitch
      • DcgmModuleIdVGPU
      • DcgmModuleIdIntrospect
      • DcgmModuleIdHealth
      • DcgmModuleIdPolicy
      • DcgmModuleIdConfig
      • DcgmModuleIdDiag
      • DcgmModuleIdProfiling
      • DcgmModuleIdSysmon
      • DcgmModuleIdCount
    • dcgmModuleStatus_t
      • DcgmModuleStatusNotLoaded
      • DcgmModuleStatusDenylisted
      • DcgmModuleStatusFailed
      • DcgmModuleStatusLoaded
      • DcgmModuleStatusUnloaded
      • DcgmModuleStatusPaused
    • dcgmFabricManagerStatus_t
      • DcgmFMStatusNotSupported
      • DcgmFMStatusNotStarted
      • DcgmFMStatusInProgress
      • DcgmFMStatusSuccess
      • DcgmFMStatusFailure
      • DcgmFMStatusUnrecognized
      • DcgmFMStatusNvmlTooOld
      • DcgmFMStatusCount
    • dcgm_link_s
      • type
      • index
      • gpuId
      • switchId
      • parsed
      • raw
    • dcgmConnectV2Params_v1
      • version
      • persistAfterDisconnect
    • dcgmConnectV2Params_v2
      • version
      • persistAfterDisconnect
      • timeoutMs
      • addressIsUnixSocket
    • dcgmHostengineHealth_v1
      • version
      • overallHealth
    • dcgmGroupEntityPair_t
      • entityGroupId
      • entityId
    • dcgmGroupInfo_v2
      • version
      • count
      • groupName
      • entityList
    • dcgmGroupInfo_v3
      • version
      • count
      • groupName
      • entityList
    • dcgmMigHierarchyInfo_t
      • entity
      • parent
      • sliceProfile
    • dcgmMigEntityInfo_t
      • gpuUuid
      • nvmlGpuIndex
      • nvmlInstanceId
      • nvmlComputeInstanceId
      • nvmlMigProfileId
      • nvmlProfileSlices
    • dcgmMigHierarchyInfo_v2
    • dcgmMigHierarchy_v2
    • dcgmCpuHierarchyOwnedCores_v1
    • dcgmCpuHierarchy_v1
      • dcgmCpuHierarchy_v1::dcgmCpuHierarchyCpu_v1
    • dcgmCpuHierarchy_v2
      • dcgmCpuHierarchy_v2::dcgmCpuHierarchyCpu_v2
    • dcgmFieldGroupInfo_v1
      • version
      • numFieldIds
      • fieldGroupId
      • fieldGroupName
      • fieldIds
    • dcgmAllFieldGroup_v1
      • version
      • numFieldGroups
      • fieldGroups
    • dcgmErrorInfo_t
      • gpuId
      • fieldId
      • status
    • dcgmClockSet_v1
      • version
      • memClock
      • smClock
    • dcgmDeviceSupportedClockSets_v1
      • version
      • count
      • clockSet
    • dcgmDevicePidAccountingStats_v1
      • version
      • pid
      • gpuUtilization
      • memoryUtilization
      • maxMemoryUsage
      • startTimestamp
      • activeTimeUsec
    • dcgmDeviceThermals_v1
      • version
      • slowdownTemp
      • shutdownTemp
    • dcgmDevicePowerLimits_v1
      • version
      • curPowerLimit
      • defaultPowerLimit
      • enforcedPowerLimit
      • minPowerLimit
      • maxPowerLimit
    • dcgmDeviceIdentifiers_v1
      • version
      • brandName
      • deviceName
      • pciBusId
      • serial
      • uuid
      • vbios
      • inforomImageVersion
      • pciDeviceId
      • pciSubSystemId
      • driverVersion
      • virtualizationMode
    • dcgmDeviceMemoryUsage_v1
      • version
      • bar1Total
      • fbTotal
      • fbUsed
      • fbFree
    • dcgmDeviceVgpuUtilInfo_v1
      • version
      • vgpuId
      • smUtil
      • memUtil
      • encUtil
      • decUtil
    • dcgmDeviceEncStats_v1
      • version
      • sessionCount
      • averageFps
      • averageLatency
    • dcgmDeviceFbcStats_v1
      • version
      • sessionCount
      • averageFps
      • averageLatency
    • dcgmDeviceFbcSessionInfo_v1
      • version
      • sessionId
      • pid
      • vgpuId
      • displayOrdinal
      • sessionType
      • sessionFlags
      • hMaxResolution
      • vMaxResolution
      • hResolution
      • vResolution
      • averageFps
      • averageLatency
    • dcgmDeviceFbcSessions_v1
      • version
      • sessionCount
      • sessionInfo
    • dcgmDeviceVgpuEncSessions_v1
      • version
      • vgpuId
      • sessionId
      • pid
      • codecType
      • hResolution
      • vResolution
      • averageFps
      • averageLatency
    • dcgmDeviceVgpuProcessUtilInfo_v1
      • version
      • vgpuId
      • vgpuProcessSamplesCount
      • pid
      • processName
      • smUtil
      • memUtil
      • encUtil
      • decUtil
    • dcgmDeviceVgpuTypeInfo_v1
      • version
      • vgpuTypeInfo
      • vgpuTypeName
      • vgpuTypeClass
      • vgpuTypeLicense
      • deviceId
      • subsystemId
      • numDisplayHeads
      • maxInstances
      • frameRateLimit
      • maxResolutionX
      • maxResolutionY
      • fbTotal
    • dcgmDeviceVgpuTypeInfo_v2
      • version
      • vgpuTypeInfo
      • vgpuTypeName
      • vgpuTypeClass
      • vgpuTypeLicense
      • deviceId
      • subsystemId
      • numDisplayHeads
      • maxInstances
      • frameRateLimit
      • maxResolutionX
      • maxResolutionY
      • fbTotal
      • gpuInstanceProfileId
    • dcgmDeviceSupportedVgpuTypeInfo_v1
      • version
      • deviceId
      • subsystemId
      • numDisplayHeads
      • maxInstances
      • frameRateLimit
      • maxResolutionX
      • maxResolutionY
      • fbTotal
      • gpuInstanceProfileId
    • dcgmDeviceSettings_v2
    • dcgmDeviceAttributes_v3
      • version
      • clockSets
      • thermalSettings
      • powerLimits
      • identifiers
      • memoryUsage
      • settings
    • dcgmDeviceMigAttributesInfo_v1
      • version
      • gpuInstanceId
      • computeInstanceId
      • multiprocessorCount
      • sharedCopyEngineCount
      • sharedDecoderCount
      • sharedEncoderCount
      • sharedJpegCount
      • sharedOfaCount
      • gpuInstanceSliceCount
      • computeInstanceSliceCount
      • memorySizeMB
    • dcgmDeviceMigAttributes_v1
      • version
      • migDevicesCount
      • migAttributesInfo
    • dcgmGpuInstanceProfileInfo_v1
      • version
      • id
      • isP2pSupported
      • sliceCount
      • instanceCount
      • multiprocessorCount
      • copyEngineCount
      • decoderCount
      • encoderCount
      • jpegCount
      • ofaCount
      • memorySizeMB
    • dcgmGpuInstanceProfiles_v1
      • version
      • profileCount
      • profileInfo
    • dcgmComputeInstanceProfileInfo_v1
      • version
      • gpuInstanceId
      • id
      • sliceCount
      • instanceCount
      • multiprocessorCount
      • sharedCopyEngineCount
      • sharedDecoderCount
      • sharedEncoderCount
      • sharedJpegCount
      • sharedOfaCount
    • dcgmComputeInstanceProfiles_v1
      • version
      • profileCount
      • profileInfo
    • dcgmWorkloadPowerProfileInfo_v1
      • version
      • profileId
      • priority
      • conflictingMask
    • dcgmWorkloadPowerProfileProfilesInfo_v1
      • version
      • workloadPowerProfile
      • profileCount
    • dcgmDeviceWorkloadPowerProfilesStatus_v1
      • version
      • profileMask
      • requestedProfileMask
      • enforcedProfileMask
    • dcgmConfigPerfStateSettings_t
      • syncBoost
      • targetClocks
    • dcgmConfigPowerLimit_t
      • type
      • val
    • dcgmConfig_v1
      • version
      • gpuId
      • eccMode
      • computeMode
      • perfState
      • powerLimit
    • dcgmConfig_v2
      • version
      • gpuId
      • eccMode
      • computeMode
      • perfState
      • powerLimit
      • workloadPowerProfiles
    • dcgmPolicyViolation_v1
      • version
      • notifyOnEccDbe
      • notifyOnPciEvent
      • notifyOnMaxRetiredPages
    • dcgmPolicyConditionParams_st
    • dcgmPolicyViolationNotify_t
      • gpuId
      • violationOccurred
    • dcgmPolicy_v1
      • version
      • condition
      • mode
      • isolation
      • action
      • validation
      • response
      • parms
    • dcgmPolicyConditionDbe_t
      • timestamp
      • location
      • numerrors
    • dcgmPolicyConditionPci_t
      • timestamp
      • counter
    • dcgmPolicyConditionMpr_t
      • timestamp
      • sbepages
      • dbepages
    • dcgmPolicyConditionThermal_t
      • timestamp
      • thermalViolation
    • dcgmPolicyConditionPower_t
      • timestamp
      • powerViolation
    • dcgmPolicyConditionNvlink_t
      • timestamp
      • fieldId
      • counter
    • dcgmPolicyConditionXID_t
      • timestamp
      • errnum
    • dcgmPolicyCallbackResponse_v2
      • version
      • condition
      • dbe
      • pci
      • mpr
      • thermal
      • power
      • nvlink
      • xid
      • gpuId
    • dcgmFieldValue_v1
      • version
      • fieldId
      • fieldType
      • status
      • ts
      • i64
      • dbl
      • str
      • blob
      • value
    • dcgmFieldValue_v2
      • version
      • entityGroupId
      • entityId
      • fieldId
      • fieldType
      • status
      • unused
      • ts
      • i64
      • dbl
      • str
      • blob
      • value
    • dcgmStatSummaryInt64_t
      • minValue
      • maxValue
      • average
    • dcgmStatSummaryInt32_t
      • minValue
      • maxValue
      • average
    • dcgmStatSummaryFp64_t
      • minValue
      • maxValue
      • average
    • dcgmDiagErrorDetail_t
    • dcgmDiagErrorDetail_v2
      • category
      • severity
    • dcgmDiagInfo_v1
      • entity
      • msg
      • testId
    • dcgmDiagError_v1
      • entity
      • code
      • category
      • severity
      • msg
      • testId
    • dcgmDiagEntityResult_v1
      • entity
      • result
      • testId
    • dcgmDiagTestAuxData_v1
      • version
    • dcgmDiagTestRun_v2
      • name
      • pluginName
      • result
      • numErrors
      • numInfo
      • categoryIndex
      • _unused
      • numResults
      • errorIndices
      • infoIndices
      • resultIndices
    • dcgmDiagTestRun_v1
      • name
      • pluginName
      • result
      • numErrors
      • numInfo
      • categoryIndex
      • _unused
      • numResults
      • errorIndices
      • infoIndices
      • resultIndices
    • dcgmDiagEntity_v1
      • entity
      • serialNum
      • skuDeviceId
    • dcgmIncidentInfo_t
      • system
      • health
      • error
      • entityInfo
    • dcgmHealthResponse_v5
      • version
      • overallHealth
      • incidentCount
      • incidents
    • dcgmHealthSetParams_v2
      • version
      • groupId
      • systems
      • updateInterval
      • maxKeepAge
    • dcgmProcessUtilInfo_t
    • dcgmProcessUtilSample_t
    • dcgmPidSingleInfo_t
      • gpuId
      • energyConsumed
      • pcieRxBandwidth
      • pcieTxBandwidth
      • pcieReplays
      • startTime
      • endTime
      • processUtilization
      • smUtilization
      • memoryUtilization
      • eccSingleBit
      • eccDoubleBit
      • memoryClock
      • smClock
      • numXidCriticalErrors
      • xidCriticalErrorsTs
      • numOtherComputePids
      • otherComputePids
      • numOtherGraphicsPids
      • otherGraphicsPids
      • maxGpuMemoryUsed
      • powerViolationTime
      • thermalViolationTime
      • reliabilityViolationTime
      • boardLimitViolationTime
      • lowUtilizationTime
      • syncBoostTime
      • overallHealth
      • system
      • health
    • dcgmPidInfo_v2
      • version
      • pid
      • numGpus
      • summary
      • gpus
    • dcgmGpuUsageInfo_t
      • gpuId
      • energyConsumed
      • powerUsage
      • pcieRxBandwidth
      • pcieTxBandwidth
      • pcieReplays
      • startTime
      • endTime
      • smUtilization
      • memoryUtilization
      • eccSingleBit
      • eccDoubleBit
      • memoryClock
      • smClock
      • numXidCriticalErrors
      • xidCriticalErrorsTs
      • numComputePids
      • computePidInfo
      • numGraphicsPids
      • graphicsPidInfo
      • maxGpuMemoryUsed
      • powerViolationTime
      • thermalViolationTime
      • reliabilityViolationTime
      • boardLimitViolationTime
      • lowUtilizationTime
      • syncBoostTime
      • overallHealth
      • system
      • health
    • dcgmJobInfo_v3
      • version
      • numGpus
      • summary
      • gpus
    • dcgmRunningProcess_v1
      • version
      • pid
      • memoryUsed
    • dcgmDiagTestResult_v2
      • status
      • error
      • info
    • dcgmDiagTestResult_v3
      • status
      • error
      • info
    • dcgmDiagResponsePerGpu_v4
      • gpuId
      • hwDiagnosticReturn
      • results
    • dcgmDiagResponsePerGpu_v5
      • gpuId
      • hwDiagnosticReturn
      • results
    • dcgmDiagResponsePerGpu_v3
      • gpuId
      • hwDiagnosticReturn
      • results
    • dcgmDiagResponse_v12
      • version
      • numTests
      • numErrors
      • numInfo
      • numCategories
      • numEntities
      • numResults
      • tests
      • entities
      • errors
      • info
      • results
      • categories
      • dcgmVersion
      • driverVersion
      • _unused
    • dcgmDiagResponse_v11
      • version
      • numTests
      • numErrors
      • numInfo
      • numCategories
      • numEntities
      • numResults
      • tests
      • entities
      • errors
      • info
      • results
      • categories
      • dcgmVersion
      • driverVersion
      • _unused
    • dcgmDiagResponse_v10
      • version
      • gpuCount
      • levelOneTestCount
      • levelOneResults
      • perGpuResponses
      • systemError
      • devIds
      • devSerials
      • dcgmVersion
      • driverVersion
      • auxDataPerTest
    • dcgmDiagResponse_v9
      • version
      • gpuCount
      • levelOneTestCount
      • levelOneResults
      • perGpuResponses
      • systemError
      • devIds
      • devSerials
      • dcgmVersion
      • driverVersion
      • _unused
    • dcgmDiagResponse_v8
      • version
      • gpuCount
      • levelOneTestCount
      • levelOneResults
      • perGpuResponses
      • systemError
      • devIds
      • dcgmVersion
      • driverVersion
      • _unused
    • dcgmDiagResponse_v7
      • version
      • gpuCount
      • levelOneTestCount
      • levelOneResults
      • perGpuResponses
      • systemError
      • _unused
    • dcgmDiagStatus_v1
      • version
      • totalTests
      • completedTests
      • testName
      • errorCode
    • dcgmDeviceTopology_v1
      • version
      • cpuAffinityMask
      • numGpus
      • gpuId
      • path
      • localNvLinkIds
    • dcgmGroupTopology_v1
      • version
      • groupCpuAffinityMask
      • numaOptimalFlag
      • slowestPath
    • dcgmIntrospectMemory_v1
      • version
      • bytesUsed
    • dcgmIntrospectCpuUtil_v1
      • version
      • total
      • kernel
      • user
    • dcgmRunDiag_v7
      • version
      • flags
      • debugLevel
      • groupId
      • validate
      • testNames
      • testParms
      • fakeGpuList
      • gpuList
      • debugLogFile
      • statsPath
      • configFileContents
      • clocksEventMask
      • pluginPath
      • currentIteration
      • totalIterations
      • timeoutSeconds
      • _unusedBuf
      • failCheckInterval
    • dcgmRunDiag_v8
      • version
      • flags
      • debugLevel
      • groupId
      • validate
      • testNames
      • testParms
      • fakeGpuList
      • gpuList
      • debugLogFile
      • statsPath
      • configFileContents
      • clocksEventMask
      • pluginPath
      • currentIteration
      • totalIterations
      • timeoutSeconds
      • _unusedBuf
      • failCheckInterval
      • expectedNumEntities
    • dcgmRunDiag_v9
      • version
      • flags
      • debugLevel
      • groupId
      • validate
      • testNames
      • testParms
      • fakeGpuList
      • debugLogFile
      • statsPath
      • configFileContents
      • clocksEventMask
      • pluginPath
      • currentIteration
      • totalIterations
      • timeoutSeconds
      • _unusedBuf
      • failCheckInterval
      • expectedNumEntities
      • entityIds
      • watchFrequency
    • dcgmRunDiag_v10
      • version
      • flags
      • debugLevel
      • groupId
      • validate
      • testNames
      • testParms
      • fakeGpuList
      • debugLogFile
      • statsPath
      • configFileContents
      • clocksEventMask
      • pluginPath
      • currentIteration
      • totalIterations
      • timeoutSeconds
      • _unusedBuf
      • failCheckInterval
      • expectedNumEntities
      • entityIds
      • watchFrequency
      • ignoreErrorCodes
    • dcgmTopoSchedHint_v1
      • version
      • inputGpuIds
      • numGpus
      • hintFlags
    • dcgmNvLinkGpuLinkStatus_v1
      • entityId
      • linkState
    • dcgmNvLinkGpuLinkStatus_v2
      • entityId
      • linkState
    • dcgmNvLinkGpuLinkStatus_v3
      • entityId
      • linkState
    • dcgmNvLinkNvSwitchLinkStatus_t
      • entityId
      • linkState
    • dcgmNvLinkStatus_v4
      • version
      • numGpus
      • gpus
      • numNvSwitches
      • nvSwitches
    • dcgmSummaryResponse_t
      • fieldType
      • summaryCount
      • values
    • dcgmFieldSummaryRequest_v1
      • version
      • fieldId
      • entityGroupId
      • entityId
      • summaryTypeMask
      • startTime
      • endTime
      • response
    • dcgmModuleGetStatusesModule_t
      • id
      • status
    • dcgmModuleGetStatuses_v1
      • version
      • numStatuses
      • statuses
    • dcgmStartEmbeddedV2Params_v1
      • version
      • opMode
      • dcgmHandle
      • logFile
      • severity
      • denyListCount
    • dcgmStartEmbeddedV2Params_v2
      • version
      • opMode
      • dcgmHandle
      • logFile
      • severity
      • denyListCount
      • serviceAccount
      • denyList
    • dcgmProfMetricGroupInfo_v2
      • majorId
      • minorId
      • numFieldIds
      • fieldIds
    • dcgmProfGetMetricGroups_v3
      • version
      • unused
      • gpuId
      • numMetricGroups
      • metricGroups
    • dcgmSettingsSetLoggingSeverity_v1
    • dcgmSettingsSetLoggingSeverity_v2
    • dcgmVersionInfo_v2
      • rawBuildInfoString

Release Notes:

  • DCGM Release Notes
    • 4.2.3
      • New Features
      • Improvements
      • Bug Fixes
    • 4.2.2
      • New Features
      • Improvements
      • Bug Fixes
    • 4.2.1
    • 4.2.0
      • New Features
    • 4.1.1
      • New Features
      • Bug Fixes
    • 4.1.0
      • New Features
      • Bug Fixes
    • 4.0.0
      • New Features
        • Entity Centric Messages
        • NVBandwidth
        • NVLink5 Monitoring
        • Miscellaneous
      • Improvements
      • Fixed Issues
      • Deprecations and Breaking Changes
      • Known Issues
    • 3.3.9
      • New Features
      • Improvements
      • Fixed Issues
    • 3.3.8
      • New Features
      • Improvements
      • Fixed Issues
    • 3.3.7
      • New Features
      • Improvements
      • Fixed Issues
    • 3.3.6
      • New Features
      • Fixed Issues
    • 3.3.5
      • New Features
      • Improvements
      • Fixed Issues
    • 3.3.3
      • New Features
      • Improvements
      • Fixed Issues
    • 3.3.2
      • New Features
      • Fixed Issues
    • 3.3.1
      • New Features
      • Improvements
      • Fixed Issues
    • 3.3.0
      • New Features
      • Improvements
      • Fixed Issues
    • 3.2.6
      • New Features
      • Improvements
      • Fixed Issues
    • 3.2.5
      • New Features
      • Improvements
      • Fixed Issues
    • 3.2.3
      • New Features
      • Improvements
      • Fixed Issues
      • Deprecations
    • 3.1.8
      • Improvements
      • Fixed Issues
    • 3.1.7
      • Improvements
      • Fixed Issues
      • Known Issues
    • 3.1.6
      • Improvements
      • Fixed Issues
    • 3.1.3
      • New Features
        • Major API changes and Deprecations
      • Fixed Issues
      • Known Issues
NVIDIA DCGM Documentation

© Copyright 2018-2025, NVIDIA Corporation. Last updated on 2025-05-02.