author Navan Chauhan <navanchauhan@gmail.com> 2025-03-28 12:23:11 -0600
committer Navan Chauhan <navanchauhan@gmail.com> 2025-03-28 12:23:11 -0600
commit 2de89bc355ea478b7d9b0d3f9185876e7047931d
tree ffbdbbe88ce01dd56c42727bc9c1890d58644c4f
parent bb018b2d346134c3f399b2424a9f5908d7ecf455
add instructions for quantization
-rw-r--r-- README.md | 34
1 files changed, 34 insertions, 0 deletions
diff --git a/README.md b/README.md
index 0e5b25f..bf94169 100644
--- a/README.md
+++ b/README.md
@@ -25,3 +25,37 @@ Works with handwritten formulae as well!
- [ ] Image Export
- [ ] UI Overhaul
- [ ] Optimizations
+
+## Misc
+
+### Quantization
+
+#### Encoder Model
+
+```bash
+python -m onnxruntime.quantization.preprocess --input iTexSnip/models/encoder_model.onnx --output encoder-infer.onnx
+```
+
+```python
+# Dynamic quantization; the patch-embedding Conv node is left unquantized.
+from onnxruntime.quantization import quantize_dynamic
+og = "encoder-infer.onnx"
+quant = "encoder-quant.onnx"
+quantize_dynamic(og, quant, nodes_to_exclude=['/embeddings/patch_embeddings/projection/Conv'])
+```
+
+It might be better if we quantize the encoder using static quantization.
+
+#### Decoder Model
+
+```bash
+python -m onnxruntime.quantization.preprocess --input iTexSnip/models/decoder_model.onnx --output decoder-infer.onnx
+```
+
+```python
+# Dynamic quantization of the whole decoder; the result is written to `quant`.
+from onnxruntime.quantization import quantize_dynamic
+og = "decoder-infer.onnx"
+quant = "decoder-quant.onnx"
+quantize_dynamic(og, quant)
+```
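
A quick way to sanity-check the output of the commands in this patch is to compare the on-disk size of each preprocessed model with its quantized counterpart. The sketch below is only an illustration: `size_report` is a hypothetical helper (not part of iTexSnip or ONNX Runtime), and the file names are assumed to match the ones used in the snippets above. It measures file size only and says nothing about accuracy.

```python
import os


def size_report(original_path, quantized_path):
    """Return both file sizes in MiB and the quantized/original size ratio."""
    original = os.path.getsize(original_path)
    quantized = os.path.getsize(quantized_path)
    mib = 1024 * 1024
    return {
        "original_mib": original / mib,
        "quantized_mib": quantized / mib,
        "ratio": quantized / original,
    }


if __name__ == "__main__":
    # Assumed file names, taken from the README snippets in this patch.
    for name in ("encoder", "decoder"):
        og, quant = f"{name}-infer.onnx", f"{name}-quant.onnx"
        # Skip pairs that have not been generated yet.
        if os.path.exists(og) and os.path.exists(quant):
            r = size_report(og, quant)
            print(
                f"{name}: {r['original_mib']:.1f} MiB -> "
                f"{r['quantized_mib']:.1f} MiB ({r['ratio']:.0%})"
            )
```

Run it from the directory that holds the generated `.onnx` files. A ratio that stays close to 100% may suggest that few nodes were actually quantized.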