Deploy your Quantized Model

Before You Begin

This guide explains how to update the model running on your STM32N6 board and adapt the application accordingly. After completing the steps below, follow the programming and run instructions in the README to flash the model weights, build, and run the application.

Generate a Neural-ART Model

Scripts are provided to automatically compile and package a model for each supported board: STM32N6570-DK and NUCLEO-N657X0-Q.

Each script performs the following steps:

  1. Use the stedgeai CLI to compile the model for Neural-ART. Basic options (e.g., channel position, input/output data types) are passed on the command line, while advanced options come from the JSON file (user_neuralart_STM32N6570-DK.json or user_neuralart_NUCLEO-N657X0-Q.json). For computer vision use cases, the input buffer provided by the DCMIPP is uint8, so pass --input-data-type uint8 to force that input type during model compilation.
  2. Copy the files generated by stedgeai in st_ai_output/ into the application project. See Neural-ART: Description and Operation for details on the compilation and generated artifacts.
  3. Convert the binary weights to an Intel HEX file so they can be programmed into the board’s external flash.
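As a rough sketch, the three steps might look like the commands below. The exact stedgeai arguments, file names, destination paths, and the flash load address are assumptions made for illustration; the provided board scripts contain the authoritative command lines.

```sh
# 1. Compile the model for Neural-ART (arguments are illustrative;
#    see the board script for the real invocation).
stedgeai generate \
  --model my_model_quantized.tflite \
  --target stm32n6 \
  --st-neural-art "default@user_neuralart_STM32N6570-DK.json" \
  --input-data-type uint8

# 2. Copy the generated artifacts into the application project
#    (destination path is an assumption; match your project layout).
cp st_ai_output/network.c st_ai_output/network_*.h <application_project>/Model/

# 3. Convert the binary weights to Intel HEX for external-flash
#    programming (file name and 0x70000000 load address are examples;
#    use the values expected by your board's flash layout).
arm-none-eabi-objcopy -I binary -O ihex \
  --change-addresses 0x70000000 \
  st_ai_output/network_data.bin network_data.hex
```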

After generation, your new model is ready to run on-device. Next, update the application configuration.

Update Application Configuration

Image pre-processing is handled by DCMIPP and automatically adapts to the model using defines created by stedgeai. Post-processing, however, must be configured by the user. Edit Inc/app_config.h for your board: STM32N6570-DK or NUCLEO-N657X0-Q.

Classes for Detection/Classification

Update the class count and labels to match your model:

#define NB_CLASSES 2
static const char* classes_table[NB_CLASSES] = {
    "person",
    "not_person",
};

Configure Post-processing

Most computer vision models require post-processing to extract the inference result from the raw network outputs. Multiple post-processing implementations are supported. See the Post-processing Wrapper README for the available types and required parameters.

Choose the post-processing that matches your model and set the corresponding POSTPROCESS_TYPE and related defines in app_config.h (e.g., number of classes, anchors, grid sizes, thresholds). Examples for common models (Tiny YOLOv2, YOLOv8, ST YOLOX, etc.) are provided in the wrapper README.
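As an illustrative sketch only: apart from POSTPROCESS_TYPE and NB_CLASSES, the macro names and values below are hypothetical stand-ins for a YOLO-style detector. The Post-processing Wrapper README lists the exact names and required parameters for each supported type.

```c
/* app_config.h -- illustrative post-processing configuration.
 * The type value and parameter macros below are placeholders; consult the
 * Post-processing Wrapper README for the names your wrapper expects. */
#define POSTPROCESS_TYPE            POSTPROCESS_OD_YOLO_V2   /* hypothetical type name */

#define NB_CLASSES                  2
#define PP_NB_ANCHORS               5      /* anchor boxes per grid cell (assumed) */
#define PP_GRID_WIDTH               7      /* output grid width (assumed) */
#define PP_GRID_HEIGHT              7      /* output grid height (assumed) */
#define PP_CONF_THRESHOLD           0.6f   /* detection confidence threshold */
#define PP_IOU_THRESHOLD            0.3f   /* NMS IoU threshold */
```

The values must match the network you compiled (grid size and anchors come from the model architecture), so change them together with the model, not independently.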

Build and Run

Once configuration is complete, build and run the application as described in the README. This includes programming the converted HEX weights into external flash and running the firmware in either development mode or boot-from-flash mode.