04-15 07:29 阅读 200

序列表示方法

Spatial signals
Temporal signals
Sequence
Sequence embedding
[b,100,1]
[b,28,28]
Batch

[words,word vec]

Word embedding

Embedding Layer

Spatial signals

41-序列表示方法-cnn.jpg

Temporal signals

41-序列表示方法-信号.jpg

Sequence

41-序列表示方法-序列.jpg

Sequence embedding

[b,seq_len,feature_len] # 1个句子10个单词每个单词4个意思，[1,10,4]
e.g. I like it.

import tensorflow as tf

tf.convert_to_tensor([[1,0,0],[0,1,0],[0,0,1]])

<tf.Tensor: id=4, shape=(3, 3), dtype=int32, numpy=
array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1]], dtype=int32)>

[b,100,1]

[price,scalar,1] # 1-->1个点表示一个价格

41-序列表示方法-序列图.jpg

[b,28,28]

41-序列表示方法-图片.jpg

对图片扫描的次数变成了时间的概念

41-序列表示方法-数字6.jpg

Batch

[b,word num,word vec]
[word num,b,word vec]

41-序列表示方法-batch波形.jpg

[words,word vec]

How to represent a word

[Rome, Italy, ...]

one hot

41-序列表示方法-词向量.jpg

sparse
high-dim

41-序列表示方法-词相似度.jpg

semantic similarity
trainable

Word embedding

41-序列表示方法-词误差.jpg

Word2Vec vs GloVe

41-序列表示方法-词训练.jpg

Embedding Layer

Random initialized embedding

from tensorflow.keras import layers

x = tf.range(5)
x = tf.random.shuffle(x)
x

<tf.Tensor: id=10, shape=(5,), dtype=int32, numpy=array([0, 4, 3, 2, 1], dtype=int32)>

net = layers.Embedding(10,4)
net(x)

<tf.Tensor: id=26, shape=(5, 4), dtype=float32, numpy=
array([[-0.04665176, -0.00618398, -0.02745042, -0.0418861 ],
       [-0.00495533, -0.02990632, -0.04187028, -0.03159492],
       [ 0.00022942, -0.01628833, -0.00680885, -0.03196504],
       [-0.0023623 ,  0.04522124,  0.02052191, -0.02518519],
       [ 0.0211277 , -0.03581526,  0.00149528,  0.04243053]],
      dtype=float32)>

net.trainable

True

net.trainable_variables

[<tf.Variable 'embedding/embeddings:0' shape=(10, 4) dtype=float32, numpy=
array([[-0.04665176, -0.00618398, -0.02745042, -0.0418861 ],
        [ 0.0211277 , -0.03581526,  0.00149528,  0.04243053],
        [-0.0023623 ,  0.04522124,  0.02052191, -0.02518519],
        [ 0.00022942, -0.01628833, -0.00680885, -0.03196504],
        [-0.00495533, -0.02990632, -0.04187028, -0.03159492],
        [ 0.04476041,  0.00983595,  0.01300793, -0.00486787],
        [ 0.00272337,  0.00402355, -0.04166143,  0.01867583],
        [-0.01088942,  0.02177001,  0.01363814, -0.04016535],
        [-0.04173249, -0.03866537, -0.0426992 ,  0.00479555],
        [ 0.02334514,  0.01809745, -0.03649411, -0.00876436]],
       dtype=float32)>]