#
# Copyright (c) 2018-2021 Cadence Design Systems, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# "Software"), to use this Software with Cadence processor cores only and
# not with any other processors and platforms, subject to
# the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#

======================================================================
Cadence HiFi4 Neural Network (NN) Library
======================================================================

======================================================================
Revision History
======================================================================

Version 2.4.1 API 1.0: Jul 27, 2021

+ Patch Release of HiFi 4 NN Library
+ Tested using RI.5 tools, xt-xcc/xt-xc++ and xt-clang/xt-clang++ compiler
+ [J2938]: Fixed the issue related to special case handling.
    Moved the special cases handling for vec_length=8/32 to the start 
    of the function in xa_nn_dot_prod_16x16_asym8s.
+ [J2976]: Removed legacy condition checks for depthwise_conv.
+ [J2994]: Fixed the issue related to xa_nn_conv2d_std_per_chan_sym8sxasym8s.
    updated to select correct vec values in case of vec_align_val is 1 and 3.

Limitations
> The ANN test bench, TFLite Micro example application and the supporting libraries
  need support for C++11 standard. This is supported only with Xtensa C library. 
> For configurations without VFPU, the floating point  variants of the kernels 
  are not supported and should not be used. 

Known issues
> Compiler optimizations are disabled for C++ source files 
> 3 failures seen in ANN testcases because  of differences in reference 
  code implementations:
    DEPTHWISE_CONV2D_FLOAT_LARGE_2_WEIGHTS_AS_INPUTS
    LOCAL_RESPONSE_NORM_FLOAT_4
    LOCAL_RESPONSE_NORM_FLOAT_4_RELAXED

----------------------------------------------------------------------
Version 2.4.0 API 1.0: Feb 10, 2021

+ Intermediate Release of HiFi 4 NN Library
+ Tested using RI.5 tools, xt-xcc/xt-xc++ and xt-clang/xt-clang++ compiler
+ Added support for asymmetric quantized 8 bit (int8) variants of 
  following TFLM operators
    Quantize
    SVDF
+ Updated TFLM example testbench to newer version of TFLM framework
+ Improved optimization for reminder loop in xa_nn_vec_softmax_asym8s_16 
+ Reduced memory banking conflicts in FC kernel xa_nn_fully_connected_sym8sxasym8s_asym8s

Limitations
> The ANN test bench, TFLite Micro example application and the supporting libraries
  need support for C++11 standard. This is supported only with Xtensa C library. 
> For configurations without VFPU, the floating point  variants of the kernels 
  are not supported and should not be used. 

Known issues
> Compiler optimizations are disabled for C++ source files 
> 3 failures seen in ANN testcases because  of differences in reference 
  code implementations:
    DEPTHWISE_CONV2D_FLOAT_LARGE_2_WEIGHTS_AS_INPUTS
    LOCAL_RESPONSE_NORM_FLOAT_4
    LOCAL_RESPONSE_NORM_FLOAT_4_RELAXED
 
----------------------------------------------------------------------
Version 2.3.0 API 1.0: Dec 22, 2020

+ Intermediate Release of HiFi 4 NN Library
+ Tested using RI.5 tools, xt-xcc/xt-xc++ and xt-clang/xt-clang++ compiler
+ Added support for asymmetric quantized 8 bit (int8) variants of 
  following TFLM operators
    Conv2D (Standard convolution)
    Depthwise convolution
    Fully connected
    Softmax
    Average Pool (using existing 8x8 variant)
    Max Pool (using existing 8x8 variant)
+ Enabled cross compilation for HiFi 3. 
+ Removed kernel padding requirement for depthwise convolution kernels.
+ [TENA-2712] Convolution 1D: Fix incorrect zero bias values in the call 
  to function xa_nn_matXvec_asym8xasym8_asym8_circ_nb.

Limitations
> The ANN test bench, TFLite Micro example application and the supporting libraries
  need support for C++11 standard. This is supported only with Xtensa C library. 
> For configurations without VFPU, the floating point  variants of the kernels 
  are not supported and should not be used. 

Known issues
> Compiler optimizations are disabled for C++ source files 
> 3 failures seen in ANN testcases because  of differences in reference 
  code implementations:
    DEPTHWISE_CONV2D_FLOAT_LARGE_2_WEIGHTS_AS_INPUTS
    LOCAL_RESPONSE_NORM_FLOAT_4
    LOCAL_RESPONSE_NORM_FLOAT_4_RELAXED
 
----------------------------------------------------------------------

Version 2.2.0 API 1.0: April 22, 2020

+ GA Release
+ Tested using RI.2 tools and xt-xcc/xt-xc++, xt-clan/xt-clang++ compilers
+ Improved support for NHWC (depth-first) data format for convolution and pooling kernels
+ Improved implementation of pointwise convolution kernels
+ Improved Relu support
+ Added support for SVDF operator (floating point variant) in TFLiteMicro
+ Enabled compilation of the NN library for cores with newlib as well as xclib
+ Enabled cross compilation for HiFi3z, HiFi5 and Fusion-F1 AVS with 16-bit Quad MAC
+ Added mechanism for controlling code size of the library
+ Updated TFLiteMicro speech commands example application
+ Added two supporting libraries for the TFLiteMicro speech commands example
+ Added a supporting library containing code of the ANN API support 
+ Programmer's Guide updated

Limitations
> The ANN test bench, TFLite Micro example application and the supporting libraries
  need support for C++11 standard. This is supported only with Xtensa C library. 
> For configurations without VFPU, the floating point  variants of the kernels 
  are not supported and should not be used. 

Known issues
> Compiler optimizations are disabled for C++ source files 
> 3 failures seen in ANN testcases because  of differences in reference 
  code implementations:
    DEPTHWISE_CONV2D_FLOAT_LARGE_2_WEIGHTS_AS_INPUTS
    LOCAL_RESPONSE_NORM_FLOAT_4
    LOCAL_RESPONSE_NORM_FLOAT_4_RELAXED
 
----------------------------------------------------------------------

Version 2.1.0 API 1.0: September 6, 2019

+ GA Release
+ Tested using RI.1 tools and xt-xcc/xt-xc++ compiler
+ Needs HiFi4 core with Xtensa C Library 
+ Programmer's Guide updated

Known issues
> Building with xt-clang compiler is not supported currently
> 3 failures seen in ANN testcases because  of differences in reference 
  code implementations:
    DEPTHWISE_CONV2D_FLOAT_LARGE_2_WEIGHTS_AS_INPUTS
    LOCAL_RESPONSE_NORM_FLOAT_4
    LOCAL_RESPONSE_NORM_FLOAT_4_RELAXED
 
----------------------------------------------------------------------

Version 2.0.0 API 1.0: August 8, 2019

+ Beta Release
+ Added support for Android NN API v1.1 (Android P)
+ Asym8 datatype support added to low level kernels for ANN compatibility
+ Added 38 ANN operations from Android NN API v1.1 to the library
+ Added TF Micro Lite example testbench 
+ Tested using RI.1 tools
+ Needs HiFi4 core with Xtensa C Library 
+ [TENX-47056] Removed the ISS mode switching call from the testbenches 

Known issues
> Programmer's Guide is not finalized
> 3 failures seen in ANN tests:
    DEPTHWISE_CONV2D_FLOAT_LARGE_2_WEIGHTS_AS_INPUTS
    LOCAL_RESPONSE_NORM_FLOAT_4
    LOCAL_RESPONSE_NORM_FLOAT_4_RELAXED
 
----------------------------------------------------------------------

Version 1.0.1 API 1.0: February 11, 2019

+ GA Release
+ Also available in xws file format
+ Programmer's Guide updated
+ Adapted to 3 digit release version number mechanism
 
----------------------------------------------------------------------

Version 1.0 API 1.0: January 7, 2019

+ GA Release
+ Added convolution, pooling low level kernels
+ Added LSTM Layer (forward path), CNN Layer
 
----------------------------------------------------------------------

Version 0.1_Alpha API 0.5: September 21, 2018

+ Alpha Release - Hifi4 Source Package only
+ NN Library implements:
  Matrix-X-vector multiplication kernels
  Activation function kernels
  GRU layer implementation

+ Programmers Guide - partially complete, DRAFT version
+ Refer PG for GRU details
+ Refer xa_nnlib_kernels_api.h and PG: Appendix for kernels details
+ Refer PG - Appendix for kernels performance
  (Measured with RG.5 tools, AE_HiFi4_LE5 with VFPU core)

+ Use 'build/runperf.sh' script for performance testing.
  The scripts requires test vectors from 'nnlib_testvectors_v0.1_Alpha.tgz'
  This test vectors package is shared separately.
----------------------------------------------------------------------
