DeePhi Quantization Tool Description
This tool is a model quantization tool for convolutional neural networks (CNNs). It can quantize weights, biases, and activations from 32-bit floating point (FP32) to 8-bit integer (INT8) format, or to other bit depths. Quantization increases inference performance and efficiency while preserving accuracy. The tool supports the common layers in neural networks: convolution, pooling, and fully-connected. It also supports batch normalization. The tool does not require retraining the network or a labeled data set; only one batch of images is required. The process takes a few seconds to several hours, depending on the size and complexity of the neural network, which allows for rapid model updates. The tool is co-optimized with the DeePhi DPU and can generate the INT8 format model files required by DNNC.
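To make the idea of quantizing weights and activations from one unlabeled batch concrete, the following is a minimal sketch of symmetric post-training quantization in plain Python. It is an illustration only and is not the algorithm used by this tool, which the description does not specify. The function names (choose_scale, quantize, dequantize), the max-absolute-value calibration rule, and the sample values are all assumptions made for this example.

```python
def choose_scale(values, num_bits=8):
    """Pick a symmetric scale so the largest magnitude maps to the top integer level.

    Assumption: simple max-abs calibration; real tools may use other rules.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 when num_bits is 8
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Map floating-point values to signed integers, clamping to the valid range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map integers back to approximate floating-point values."""
    return [q * scale for q in q_values]


# Weights are known ahead of time, so their scale comes from the weights themselves.
weights = [0.5, -1.27, 0.03, 1.0]  # hypothetical FP32 weights
w_scale = choose_scale(weights)
w_q = quantize(weights, w_scale)
w_restored = dequantize(w_q, w_scale)

# Activations depend on the input, so their scale is estimated from one
# unlabeled calibration batch (hypothetical values shown here).
calibration_activations = [0.0, 2.54, 1.0, 0.7]
a_scale = choose_scale(calibration_activations)
a_q = quantize(calibration_activations, a_scale)
```

In this sketch no labels and no retraining are involved: the weight scale is computed directly from the stored weights, and the activation scale is estimated by observing a single batch of inputs. Changing num_bits shows how the same procedure extends to other bit depths.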