1. Environment
GPU: NVIDIA RTX 3090 Ti
2. Steps
Following the README in funasr/runtime/triton_gpu, I built the Docker image, started a container, and ran export_onnx.py inside it to export the ONNX model. I also uncommented the last few lines of export_onnx.py that convert the model to fp16 and ran the conversion, producing model_fp16.onnx. I then changed the config file under /model_repo_sense_voice_small/encoder to the following:
name: "encoder"
backend: "onnxruntime"
default_model_filename: "model_fp16.onnx"
max_batch_size: 16
input [
{
name: "speech"
data_type: TYPE_FP16
dims: [-1, 560]
},
{
name: "speech_lengths"
data_type: TYPE_INT32
dims: [1]
reshape: { shape: [ ] }
},
{
name: "language"
data_type: TYPE_INT32
dims: [1]
reshape: { shape: [ ] }
},
{
name: "textnorm"
data_type: TYPE_INT32
dims: [1]
reshape: { shape: [ ] }
}
]
output [
{
name: "ctc_logits"
data_type: TYPE_FP16
dims: [-1, 25055]
},
{
name: "encoder_out_lens"
data_type: TYPE_INT32
dims: [1]
reshape: { shape: [ ] }
}
]
dynamic_batching {
max_queue_delay_microseconds: 1000
}
parameters { key: "cudnn_conv_algo_search" value: { string_value: "2" } }
instance_group [
{
count: 1
kind: KIND_GPU
}
]
Running the run.sh script in the container then fails with the error below.
How should I proceed if I want to serve the fp16 ONNX model? The fp32-to-fp16 conversion was done with the export_onnx.py code provided inside the Docker image. Is there a problem with that code? Here is export_onnx.py:
import torch

from model import SenseVoiceSmall
# NOTE: export_rebuild_model comes from the export utilities used in the
# referenced script; it is not defined in this snippet.

model_dir = "iic/SenseVoiceSmall"
# model_dir = "./SenseVoiceSmall"

model, kwargs = SenseVoiceSmall.from_pretrained(model=model_dir)
# model = model.to("cpu")
model = export_rebuild_model(model, max_seq_len=512, device="cuda")
# model.export()
print("Export Done.")

dummy_inputs = model.export_dummy_inputs()

# Export the model to fp32 ONNX
torch.onnx.export(
    model,
    dummy_inputs,
    "model.onnx",
    input_names=model.export_input_names(),
    output_names=model.export_output_names(),
    dynamic_axes=model.export_dynamic_axes(),
    opset_version=18,
)

# Convert the fp32 model to fp16
import onnxmltools
from onnxmltools.utils.float16_converter import convert_float_to_float16

onnx_model = onnxmltools.utils.load_model("model.onnx")
onnx_model = convert_float_to_float16(onnx_model)
onnxmltools.utils.save_model(onnx_model, "model_fp16.onnx")
print("Model has been successfully exported to model_fp16.onnx")

The full script is at https://huggingface.co/yuekai/model_repo_sense_voice_small/blob/main/export_onnx.py.
If I skip fp16 entirely and use the fp32 ONNX exported by https://huggingface.co/yuekai/model_repo_sense_voice_small/blob/main/export_onnx.py, running it also fails. However, the model_quant.onnx from https://modelscope.cn/models/iic/SenseVoiceSmall-onnx/files works fine. Is the failure caused by a problem in the export code above?
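To narrow this down, a standalone check like the following could help isolate whether the ONNX file itself is at fault (input names and shapes are taken from the Triton config above; the frame count and the language/textnorm IDs are placeholders):

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "model.onnx", providers=["CUDAExecutionProvider", "CPUExecutionProvider"]
)
T = 100  # arbitrary number of feature frames
inputs = {
    "speech": np.random.randn(1, T, 560).astype(np.float32),
    "speech_lengths": np.array([T], dtype=np.int32),
    "language": np.array([0], dtype=np.int32),   # placeholder language ID
    "textnorm": np.array([15], dtype=np.int32),  # placeholder textnorm ID
}
outputs = sess.run(None, inputs)
print([o.shape for o in outputs])  # expect ctc_logits and encoder_out_lens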
If I do want to use an fp16 or int8 ONNX model, what is the recommended way to produce one?
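For int8, my understanding is that files like model_quant.onnx are typically produced with ONNX Runtime's dynamic quantization; a minimal sketch (untested by me, using default-style parameters):

from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="model.onnx",        # the fp32 model exported above
    model_output="model_quant.onnx",
    weight_type=QuantType.QUInt8,    # int8 weights; activations quantized dynamically at runtime
)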
Thanks
@yuekaizhang