04-15 08:47 · 172 views

Broadcasting

Broadcasting
Key idea

How to understand?
Why broadcasting?
Broadcastable?
Broadcast VS Tile

Broadcasting

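expand the data
without copying data (when broadcasting happens implicitly inside an op such as +)
tf.broadcast_to (explicit form; the returned tensor is materialized at the full shape)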

Key idea

Insert dims of size 1 in front if needed
Expand dims of size 1 to the same size
Example:

[4,16,16,32] + [32]
[4,16,16,32] + [1,1,1,32]
[4,16,16,32] + [4,16,16,32]
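The two steps above can be written out by hand to see what broadcasting does implicitly. This is a minimal sketch, assuming TensorFlow 2.x in eager mode; the names `b_aligned`, `b_expanded`, `explicit` and `implicit` are illustrative only.

```python
import tensorflow as tf

x = tf.random.normal([4, 16, 16, 32])
b = tf.random.normal([32])

# Step 1: insert size-1 dims in front so both tensors have rank 4.
b_aligned = tf.reshape(b, [1, 1, 1, 32])
# Step 2: expand the size-1 dims to match x.
b_expanded = tf.broadcast_to(b_aligned, [4, 16, 16, 32])

explicit = x + b_expanded   # both steps done by hand
implicit = x + b            # broadcasting applies the same two steps

# Both routes should give element-wise identical results.
print(bool(tf.reduce_all(tf.equal(explicit, implicit))))
```

In practice only `x + b` is needed; the explicit version is here to make the intermediate shapes visible.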

(Figure: broadcast example)

How to understand?

When it has no axis

Create a new concept
[classes, students, scores] + [scores]

When it has dim of size 1

Treat it as shared by all
[classes,students,scores] + [students,1]

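Broadcasting can be understood as splitting the dimensions into large (leading) and small (trailing) ones: the small dimensions are concrete, the large ones more abstract. A tensor over the small dimensions describes one specific instance, and broadcasting makes that instance apply across all of the large dimensions.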
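The two cases, a missing axis and an axis of size 1, can be tried on a small example. The sketch below assumes TensorFlow 2.x and uses made-up data (4 classes, 32 students, 8 scores); `per_score` and `per_student` are hypothetical names.

```python
import tensorflow as tf

# Hypothetical data: 4 classes, 32 students per class, 8 scores per student.
scores = tf.zeros([4, 32, 8])

# No axis: a [scores]-shaped tensor gets new leading axes and therefore
# applies to every class and every student.
per_score = tf.range(8, dtype=tf.float32)                           # shape [8]
a = scores + per_score                                              # shape [4, 32, 8]

# Size-1 axis: a [students, 1]-shaped tensor is shared by all 8 scores
# (and, through the inserted leading axis, by all 4 classes).
per_student = tf.reshape(tf.range(32, dtype=tf.float32), [32, 1])   # shape [32, 1]
b = scores + per_student                                            # shape [4, 32, 8]

print(a.shape, b.shape)
```

Here `a[c, s, k]` depends only on the score index `k`, while `b[c, s, k]` depends only on the student index `s`.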

Why broadcasting?

For real-world needs

[classes, students, scores]
Add bias for every student: +5 score
[4,32,8] + [4,32,8]
[4,32,8] + [5.0]

memory consumption

[4,32,8] -> 1024
bias = [8]: [5.0,5.0,5.0,...] -> 8
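The element counts above can be checked with a few lines of plain Python. The byte figures assume 4-byte float32 values, which is an assumption of this sketch and not something the notes state.

```python
from math import prod

full_shape = [4, 32, 8]   # bias written out for every element
bias_shape = [8]          # bias stored once and broadcast

full_elems = prod(full_shape)   # 4 * 32 * 8
bias_elems = prod(bias_shape)

BYTES_PER_FLOAT32 = 4  # assumption: values are stored as float32
print(full_elems, bias_elems)
print(full_elems * BYTES_PER_FLOAT32, bias_elems * BYTES_PER_FLOAT32)
```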

Broadcastable?

Match from Last dim!

if both dims are equal, keep them as they are
if current dim=1, expand to same
if either has no dim, insert one dim of size 1 and expand to same
otherwise, Not Broadcastable

[4,32,14,14]
[1,32,1,1] -> [4,32,14,14] √
[14,14] -> [1,1,14,14] -> [4,32,14,14] √
[2,32,14,14] ×

[4,32,32,3]
[3] -> [1,1,1,3] -> [4,32,32,3] √
[32,32,1] -> [1,32,32,1] -> [4,32,32,3] √
[4,1,1,1] -> [4,32,32,3] √
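The matching rule can be sketched as a small shape checker in plain Python. `broadcast_shape` is a hypothetical helper written for illustration, not a TensorFlow function; it treats both operands symmetrically, as binary ops such as `+` do.

```python
from itertools import zip_longest

def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or None if incompatible."""
    result = []
    # Walk both shapes from the last dim; a missing dim counts as size 1.
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            return None  # sizes differ and neither is 1
    return result[::-1]

print(broadcast_shape([4, 32, 14, 14], [1, 32, 1, 1]))
print(broadcast_shape([4, 32, 14, 14], [14, 14]))
print(broadcast_shape([4, 32, 14, 14], [2, 32, 14, 14]))
```

The first two calls return `[4, 32, 14, 14]`; the third returns `None` because 4 and 2 differ and neither is 1.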

import tensorflow as tf

x = tf.random.normal([4,32,32,3])
x.shape

TensorShape([4, 32, 32, 3])

(x+tf.random.normal([3])).shape

TensorShape([4, 32, 32, 3])

(x+tf.random.normal([32,32,1])).shape

TensorShape([4, 32, 32, 3])

(x+tf.random.normal([4,1,1,1])).shape

TensorShape([4, 32, 32, 3])

try:
    (x+tf.random.normal([1,4,1,1])).shape
except Exception as e:
    print(e)

Incompatible shapes: [4,32,32,3] vs. [1,4,1,1] [Op:Add] name: add/

(x+tf.random.normal([4,1,1,1])).shape

TensorShape([4, 32, 32, 3])

b = tf.broadcast_to(tf.random.normal([4,1,1,1]),[4,32,32,3])

b.shape

TensorShape([4, 32, 32, 3])

Broadcast VS Tile

a = tf.ones([3,4])

a.shape

TensorShape([3, 4])

a1 = tf.broadcast_to(a,[2,3,4])

a1.shape

TensorShape([2, 3, 4])

a2 = tf.expand_dims(a,axis=0) # insert a new dim before axis 0

a2.shape

TensorShape([1, 3, 4])

a2 = tf.tile(a2,[2,1,1]) # repeat dim 0 twice; repeat dims 1 and 2 once (unchanged)

a2.shape

TensorShape([2, 3, 4])
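As a quick check that the two routes agree, the sketch below (assuming TensorFlow 2.x in eager mode) compares the broadcast result with the expand-then-tile result element by element; `same` is an illustrative name.

```python
import tensorflow as tf

a = tf.ones([3, 4])

# Route 1: broadcast directly to the target shape.
a1 = tf.broadcast_to(a, [2, 3, 4])

# Route 2: add the leading dim explicitly, then repeat it twice.
a2 = tf.tile(tf.expand_dims(a, axis=0), [2, 1, 1])

same = tf.reduce_all(tf.equal(a1, a2))
print(a1.shape, a2.shape, bool(same))
```

The values match; the practical difference is that `tf.tile` needs the rank to be aligned first with `tf.expand_dims`, while `tf.broadcast_to` inserts the leading dim itself.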