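By default, TensorFlow claims every visible GPU and nearly all of the memory on each one. When GPU resources have to be shared or partitioned, the allocation must be configured manually.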

Common snippets

List the available GPUs and CPUs

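import tensorflow as tf

# Each call returns a list of PhysicalDevice objects; len() gives the count.
gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
cpus = tf.config.experimental.list_physical_devices(device_type='CPU')
print(gpus, cpus)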

Use only the two GPUs at indices 0 and 1

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
tf.config.experimental.set_visible_devices(devices=gpus[0:2], device_type='GPU')
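
The slice gpus[0:2] silently yields fewer devices on a machine with fewer than two GPUs, and visibility can only be changed before the runtime has initialized the GPUs. The following is a minimal sketch that guards against both cases. It assumes a TensorFlow 2.x release that provides the non-experimental names tf.config.list_physical_devices, tf.config.set_visible_devices and tf.config.get_visible_devices; on older releases the tf.config.experimental variants above would be needed instead.

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        # Keep at most the first two GPUs visible to this process.
        tf.config.set_visible_devices(gpus[:2], 'GPU')
    except RuntimeError as err:
        # Raised when the GPUs were already initialized; visibility must be set first.
        print(err)

# Read the setting back to confirm what the process will actually see.
visible = tf.config.get_visible_devices('GPU')
print(len(gpus), 'physical GPU(s),', len(visible), 'visible')
```

On a CPU-only machine both lists are empty and the snippet simply reports zero devices.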

Three strategies for controlling GPU memory usage

1. Allocate on demand (memory growth)

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
for gpu in gpus:
    # Start with a small allocation and grow it as the program needs more.
    tf.config.experimental.set_memory_growth(device=gpu, enable=True)
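
Memory growth has to be configured before the GPUs are initialized; calling set_memory_growth afterwards raises a RuntimeError. A hedged sketch of a more defensive version follows. It assumes tf.config.list_physical_devices is available in your TensorFlow 2.x release, and it reads the setting back with tf.config.experimental.get_memory_growth to confirm it took effect.

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    try:
        tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as err:
        # The runtime was already initialized, so the setting can no longer change.
        print(err)

# One entry per physical GPU, reporting the memory growth setting for that device.
growth = [tf.config.experimental.get_memory_growth(gpu) for gpu in gpus]
print(growth)
```

Placing this at the very top of the program, before any tensors or models are created, is the simplest way to avoid the RuntimeError.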

2. Fixed allocation: reserve 4 GB of GPU memory

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
# memory_limit is given in MB, so 4096 corresponds to 4 GB on the first GPU.
tf.config.experimental.set_virtual_device_configuration(
    gpus[0],
    [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=4096)])

3. Fixed allocation: reserve 60% of the GPU memory (this uses the TensorFlow 1.x compatibility API)

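config = tf.compat.v1.ConfigProto()
# Cap this process at 60% of each visible GPU's memory.
config.gpu_options.per_process_gpu_memory_fraction = 0.6
# Keep a reference so the configured session can be used later.
session = tf.compat.v1.Session(config=config)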
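
Because per_process_gpu_memory_fraction belongs to the session-based API, TF 2.x code that does not use sessions might instead convert the fraction into a memory_limit in MB and apply strategy 2. The arithmetic is sketched below. The helper name fraction_to_memory_limit is hypothetical, and the total of 16384 MB is an assumed card size used only for illustration; the real total would have to be read from a tool such as nvidia-smi.

```python
def fraction_to_memory_limit(total_mb, fraction):
    """Return the memory_limit (in MB) for a given fraction of total GPU memory."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the range (0, 1]")
    # Round down so the limit never exceeds the requested share.
    return int(total_mb * fraction)


# Assumed (hypothetical) card with 16384 MB of memory; reserve 60% of it.
limit_mb = fraction_to_memory_limit(16384, 0.6)
print(limit_mb)
```

The resulting value can be passed as memory_limit in the fixed-allocation snippet from strategy 2.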

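Simulate multiple GPUs: create two virtual GPUs with 2 GB of memory each, so that code written for a multi-GPU environment can run on a single physical GPU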

gpus = tf.config.experimental.list_physical_devices('GPU')
# Split the first physical GPU into two virtual devices of 2048 MB each.
tf.config.experimental.set_virtual_device_configuration(
    gpus[0],
    [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048),
     tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048)])
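
To check that the split took effect, the logical devices can be listed after configuration. The sketch below uses the non-experimental equivalents tf.config.set_logical_device_configuration and tf.config.LogicalDeviceConfiguration, assuming a TensorFlow 2.x release that provides them. It also assumes the first GPU has at least 4 GB of free memory.

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        # Split the first physical GPU into two logical GPUs of 2048 MB each.
        tf.config.set_logical_device_configuration(
            gpus[0],
            [tf.config.LogicalDeviceConfiguration(memory_limit=2048),
             tf.config.LogicalDeviceConfiguration(memory_limit=2048)])
    except RuntimeError as err:
        # Logical devices must be configured before the GPUs are initialized.
        print(err)

# Listing logical devices initializes the runtime and applies the configuration.
logical_gpus = tf.config.list_logical_devices('GPU')
print(len(gpus), 'physical GPU(s),', len(logical_gpus), 'logical GPU(s)')
```

On a machine with one physical GPU this should report two logical GPUs, which a distribution strategy such as tf.distribute.MirroredStrategy can then treat as separate devices.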

References:

https://www.shouxicto.com/article/1371.html

https://tf.wiki/en/appendix/distributed.html#multi-gpu