tf.distribute.HierarchicalCopyAllReduce
Reduction using hierarchical copy all-reduce.
tf.distribute.HierarchicalCopyAllReduce(
    num_packs=1
)
It reduces tensors to one GPU along edges of a hardware hierarchy and broadcasts the result back to each GPU along the same path. Before the all-reduce, tensors are repacked or aggregated for more efficient cross-device transport.
This reduction was created for the Nvidia DGX-1 and assumes GPUs are connected the way they are on a DGX-1 machine. If your GPU interconnect is different, this is likely to be slower than tf.distribute.ReductionToOneDevice.
| Args | |
|---|---|
| num_packs | Values will be packed into this many splits. num_packs should be greater than or equal to 0. When it is zero, no packing is done. | 
| Raises | |
|---|---|
| ValueError | if num_packs is negative. | 
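Typical usage is to pass an instance as the cross_device_ops argument of tf.distribute.MirroredStrategy, rather than calling its methods directly. A minimal sketch:

```python
import tensorflow as tf

# Construct the cross-device ops object and hand it to MirroredStrategy,
# which will use it for gradient aggregation across replicas.
cross_ops = tf.distribute.HierarchicalCopyAllReduce(num_packs=1)
strategy = tf.distribute.MirroredStrategy(cross_device_ops=cross_ops)
```

Note that the hierarchical copy path only pays off on a DGX-1-like multi-GPU topology; on other machines the strategy still works but may be slower.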
Methods
batch_reduce
  
  batch_reduce(
    reduce_op, value_destination_pairs, experimental_hints=None
)
Reduce PerReplica objects in a batch.
Reduces each first element in value_destination_pairs to the corresponding second element, which indicates the destinations.
This can be faster than multiple individual reduces because we can fuse several tensors into one or multiple packs before reduction.
| Args | |
|---|---|
| reduce_op | An instance of tf.distribute.ReduceOp that indicates how the per_replica_value will be reduced. | 
| value_destination_pairs | A list or a tuple of PerReplica objects (or tensors with device set if there is one device) and destinations. | 
| experimental_hints | A tf.distribute.experimental.CollectiveHints. Hints to perform collective operations. | 
| Returns | |
|---|---|
| a list of Mirrored objects. | 
| Raises | |
|---|---|
| ValueError | if value_destination_pairs is not an iterable of tuples of PerReplica objects and destinations. | 
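In practice batch_reduce is usually reached through tf.distribute.StrategyExtended.batch_reduce_to rather than called directly. A hedged sketch, assuming a single illustrative CPU device (the hierarchical copy path itself needs a DGX-1-like multi-GPU topology, so here the reduction is effectively a pass-through):

```python
import tensorflow as tf

# Single-CPU device list is for illustration only.
strategy = tf.distribute.MirroredStrategy(
    devices=["/cpu:0"],
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce(num_packs=1),
)

# Each pair is (value to reduce, destination devices).
pairs = [(tf.constant(1.0), "/cpu:0"), (tf.constant(2.0), "/cpu:0")]
reduced = strategy.extended.batch_reduce_to(tf.distribute.ReduceOp.SUM, pairs)
```

Batching the two reductions lets the implementation fuse the tensors into packs before the cross-device transfer, which is the advantage over issuing two separate reduce calls.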
broadcast
  
  broadcast(
    tensor, destinations
)
Broadcast the tensor to destinations.
| Args | |
|---|---|
| tensor | the tensor to broadcast. | 
| destinations | the broadcast destinations. | 
| Returns | |
|---|---|
| a Mirrored object. | 
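A minimal sketch of calling broadcast directly; the CPU destination is illustrative only:

```python
import tensorflow as tf

# Broadcast a tensor to the given destination device(s); the result is a
# Mirrored object holding a copy per destination.
cross_ops = tf.distribute.HierarchicalCopyAllReduce()
mirrored = cross_ops.broadcast(tf.constant([1.0, 2.0]), destinations="/cpu:0")
```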
reduce
  
  reduce(
    reduce_op, per_replica_value, destinations, experimental_hints=None
)
Reduce per_replica_value to destinations.
It runs the reduction operation defined by reduce_op and puts the result on destinations.
| Args | |
|---|---|
| reduce_op | An instance of tf.distribute.ReduceOp that indicates how per_replica_value will be reduced. | 
| per_replica_value | A tf.distribute.DistributedValues object or a tensor with device set. | 
| destinations | the reduction destinations. | 
| experimental_hints | A tf.distribute.experimental.CollectiveHints. Hints to perform collective operations. | 
| Returns | |
|---|---|
| a Mirrored object. | 
| Raises | |
|---|---|
| ValueError | if per_replica_value can't be converted to a PerReplica object, or if destinations aren't strings, Variables or DistributedValues. | 
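reduce is normally driven through a tf.distribute.Strategy rather than invoked directly. A hedged sketch, again assuming a single illustrative CPU device (with one replica the cross-device reduction is effectively a pass-through):

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    devices=["/cpu:0"],
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce(num_packs=1),
)

# Run a step on each replica, then sum the per-replica results.
per_replica = strategy.run(lambda: tf.constant(3.0))
total = strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica, axis=None)
```

With a single replica the sum is just the replica's value; on a multi-GPU DGX-1 the same call would aggregate across all replicas via the hierarchical copy.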
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
 https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/distribute/HierarchicalCopyAllReduce