author    Navan Chauhan <navanchauhan@gmail.com>  2025-03-28 12:23:11 -0600
committer Navan Chauhan <navanchauhan@gmail.com>  2025-03-28 12:23:11 -0600
commit    2de89bc355ea478b7d9b0d3f9185876e7047931d (patch)
tree      ffbdbbe88ce01dd56c42727bc9c1890d58644c4f /README.md
parent    bb018b2d346134c3f399b2424a9f5908d7ecf455 (diff)
add instructions for quantization
Diffstat (limited to 'README.md')
-rw-r--r--  README.md  34
1 file changed, 34 insertions, 0 deletions
@@ -25,3 +25,37 @@ Works with handwritten formulae as well!
 - [ ] Image Export
 - [ ] UI Overhaul
 - [ ] Optimizations
+
+## Misc
+
+### Quantization
+
+#### Encoder Model
+
+```bash
+python -m onnxruntime.quantization.preprocess --input iTexSnip/models/encoder_model.onnx --output encoder-infer.onnx
+```
+
+```python
+from onnxruntime.quantization import quantize_dynamic
+
+og = "encoder-infer.onnx"
+quant = "encoder-quant.onnx"
+# Exclude the patch-embedding convolution, which is sensitive to quantization
+quantized_model = quantize_dynamic(
+    og,
+    quant,
+    nodes_to_exclude=["/embeddings/patch_embeddings/projection/Conv"],
+)
+```
+
+It might be better to quantize the encoder using static quantization.
+
+#### Decoder Model
+
+```bash
+python -m onnxruntime.quantization.preprocess --input iTexSnip/models/decoder_model.onnx --output decoder-infer.onnx
+```
+
+```python
+from onnxruntime.quantization import quantize_dynamic
+
+og = "decoder-infer.onnx"
+quant = "decoder-quant.onnx"
+quantized_model = quantize_dynamic(og, quant)
+```