python - tf.nn.depthwise_conv2d is too slow. Is it normal?


I am trying out a recent arXiv work called "Factorized CNN",

which argues that spatially separated convolution (depthwise convolution) followed by a channel-wise linear projection (1x1 convolution) can speed up the convolution operation.

[Figure: conv layer architecture]

I found out that I can implement this architecture with tf.nn.depthwise_conv2d plus a 1x1 convolution, or with tf.nn.separable_conv2d.

Below is my implementation:

#conv filter for depthwise convolution
depthwise_filter = tf.get_variable("depth_conv_w", [3, 3, 64, 1],
                                   initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0 / 9 / 32)))

#conv filter for linear channel projection
pointwise_filter = tf.get_variable("point_conv_w", [1, 1, 64, 64],
                                   initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0 / 1 / 64)))

conv_b = tf.get_variable("conv_b", [64], initializer=tf.constant_initializer(0))

#depthwise convolution, channel multiplier 1
conv_tensor = tf.nn.relu(tf.nn.depthwise_conv2d(tensor, depthwise_filter, [1, 1, 1, 1], padding='SAME'))

#linear channel projection with a 1x1 convolution
conv_tensor = tf.nn.bias_add(tf.nn.conv2d(conv_tensor, pointwise_filter, [1, 1, 1, 1], padding='VALID'), conv_b)

#residual connection
tensor = tf.add(tensor, conv_tensor)
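For reference, here is a plain NumPy sketch of what the two stages compute (my own loop-based helper, not the TensorFlow API), with 'VALID' padding and stride 1. It is meant only for checking the semantics of depthwise-separable convolution, not its speed:

```python
import numpy as np

def depthwise_separable_conv(x, dw, pw):
    """Reference depthwise-separable convolution, 'VALID' padding, stride 1.
    x: (H, W, C) input, dw: (k, k, C) depthwise filters, pw: (C, C_out)
    pointwise (1x1) projection. A minimal sketch, not an efficient kernel."""
    H, W, C = x.shape
    k = dw.shape[0]
    out_h, out_w = H - k + 1, W - k + 1
    # Stage 1 (depthwise): each channel is convolved with its own k x k filter,
    # with no mixing across channels.
    mid = np.zeros((out_h, out_w, C))
    for i in range(out_h):
        for j in range(out_w):
            mid[i, j] = np.sum(x[i:i + k, j:j + k] * dw, axis=(0, 1))
    # Stage 2 (pointwise): a 1x1 convolution mixes channels at every position.
    return mid @ pw
```

With an all-ones 5x5x3 input, all-ones 3x3 depthwise filters, and an identity pointwise projection, each output value is simply the 3x3 window sum, i.e. 9.0.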

This should be around 9 times faster than the original 3x3, 64 -> 64 channel convolution.
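As a sanity check on that factor, the per-pixel multiply-add counts for the two variants (using the 3x3, 64-channel shapes from the snippet above) work out as follows:

```python
# Back-of-envelope multiply-add count per output pixel for a 3x3 kernel
# with 64 input and 64 output channels.
k, c_in, c_out = 3, 64, 64

standard = k * k * c_in * c_out          # plain 3x3 convolution
separable = k * k * c_in + c_in * c_out  # depthwise 3x3 + 1x1 projection

ratio = standard / separable
print(standard, separable, round(ratio, 2))  # -> 36864 4672 7.89
```

So for 64 channels the theoretical saving is closer to 8x than 9x; the factor approaches 9 (the kernel area) only as the channel count grows large.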

However, I see no performance improvement in practice.

I have to assume that I am doing something wrong, or that there is a problem with TensorFlow's implementation.

Since there are few examples using depthwise_conv2d, I am leaving this question here.

Is the slow speed normal, or is there a mistake in my code?

