author    Navan Chauhan <navanchauhan@gmail.com>  2025-03-28 12:23:11 -0600
committer Navan Chauhan <navanchauhan@gmail.com>  2025-03-28 12:23:11 -0600
commit    2de89bc355ea478b7d9b0d3f9185876e7047931d (patch)
tree      ffbdbbe88ce01dd56c42727bc9c1890d58644c4f /README.md
parent    bb018b2d346134c3f399b2424a9f5908d7ecf455 (diff)
add instructions for quantization
Diffstat (limited to 'README.md')
-rw-r--r--  README.md  34
1 file changed, 34 insertions, 0 deletions
@@ -25,3 +25,37 @@ Works with handwritten formulae as well!
 - [ ] Image Export
 - [ ] UI Overhaul
 - [ ] Optimizations
+
+## Misc
+
+### Quantization
+
+#### Encoder Model
+
+```bash
+python -m onnxruntime.quantization.preprocess --input iTexSnip/models/encoder_model.onnx --output encoder-infer.onnx
+```
+
+```python
+from onnxruntime.quantization import quantize_dynamic
+
+og = "encoder-infer.onnx"
+quant = "encoder-quant.onnx"
+# Exclude the patch-embedding convolution, which is sensitive to quantization
+quantized_model = quantize_dynamic(
+    og,
+    quant,
+    nodes_to_exclude=["/embeddings/patch_embeddings/projection/Conv"],
+)
+```
+
+It might be better to quantize the encoder using static quantization.
+
+#### Decoder Model
+
+```bash
+python -m onnxruntime.quantization.preprocess --input iTexSnip/models/decoder_model.onnx --output decoder-infer.onnx
+```
+
+```python
+from onnxruntime.quantization import quantize_dynamic
+
+og = "decoder-infer.onnx"
+quant = "decoder-quant.onnx"
+quantized_model = quantize_dynamic(og, quant)
+```