TimesFM 2.5 200M LiteRT (TFLite) Variants

LiteRT/TFLite exports of google/timesfm-2.5-200m-pytorch for on-device inference.

Included files

  • timesfm-2p5-200m-litert_ctx512_h128_fp32.tflite
  • timesfm-2p5-200m-litert_ctx512_h128_fp16_w16a32.tflite
  • timesfm-2p5-200m-litert_ctx512_h128_drq_w8a32.tflite
  • timesfm-2p5-200m-litert_ctx512_h128_int4_dq_w4a4.tflite

Tensor spec

  • Input: series, float32, shape [1, 512]
  • Output: point forecast, float32, shape [1, 128]
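Because the input shape is fixed at [1, 512], a raw series usually needs to be truncated or padded before inference. A minimal sketch (the `prepare_input` helper and mean-padding strategy are illustrative assumptions, not part of the export):

```python
import numpy as np

CONTEXT = 512  # fixed model input length
HORIZON = 128  # fixed model output length

def prepare_input(series):
    """Pad or truncate a 1-D series to the fixed [1, CONTEXT] input shape.

    Hypothetical helper: keeps the most recent CONTEXT values; shorter
    series are left-padded with their mean so recent values stay at the
    end of the window.
    """
    series = np.asarray(series, dtype=np.float32)
    if len(series) >= CONTEXT:
        window = series[-CONTEXT:]
    else:
        pad = np.full(CONTEXT - len(series), series.mean(), dtype=np.float32)
        window = np.concatenate([pad, series])
    return window[np.newaxis, :]  # shape (1, CONTEXT)

x = prepare_input(np.sin(np.linspace(0.0, 20.0, 300)))
print(x.shape)  # (1, 512)
```

Other padding choices (zeros, last value) are equally valid; what matters is matching the [1, 512] float32 contract.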

Variant summary

  File                     Quantization                                    Notes
  *_fp32.tflite            none                                            Best fidelity, largest size
  *_fp16_w16a32.tflite     fp16 weights, fp32 activations                  Good speed/size balance on many devices
  *_drq_w8a32.tflite       dynamic range quant (int8 weights, fp32 act.)   Smaller model, usually small quality drop
  *_int4_dq_w4a4.tflite    dynamic int4 quant                              Smallest model; accuracy/speed trade-offs depend on device
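To make the dynamic-range-quant trade-off concrete, the sketch below applies symmetric per-tensor int8 quantization (the scheme TFLite dynamic range quantization uses for weights) to a random weight matrix and measures the reconstruction error. The helper names and the random matrix are illustrative, not taken from the exports:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization of a float32 weight tensor."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to float32 for fp32-activation compute."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
max_err = np.abs(w - dequantize(q, scale)).max()
# Rounding error is bounded by half a quantization step (scale / 2).
print(f"max abs weight error: {max_err:.5f} (step = {scale:.5f})")
```

The int4 variant follows the same idea with 16 levels instead of 255, which is why its accuracy is more sensitive to the weight distribution.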

Source and license

Converted from google/timesfm-2.5-200m-pytorch; see the upstream model card for license terms.

Minimal Python inference example

import numpy as np
import tensorflow as tf

# Load the fp32 variant; any of the files listed above can be swapped in.
interpreter = tf.lite.Interpreter(
    model_path="timesfm-2p5-200m-litert_ctx512_h128_fp32.tflite"
)
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Input: the last 512 values of the series, float32, shape [1, 512].
x = np.zeros((1, 512), dtype=np.float32)
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()

# Output: 128-step point forecast, float32, shape [1, 128].
y = interpreter.get_tensor(out["index"])
print(y.shape)  # (1, 128)

Notes

  • These exports are intended for inference only.
  • Validate numerics against your own ONNX/PyTorch baseline before production use.
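One way to do that validation is to run the same input through both the LiteRT export and the reference model, then compare the two [1, 128] forecasts. The `compare_forecasts` helper and tolerance below are illustrative assumptions; pick a tolerance appropriate for your data scale:

```python
import numpy as np

def compare_forecasts(y_lite, y_ref, atol=1e-3):
    """Summarize the numeric gap between a LiteRT forecast and a
    reference (e.g. PyTorch) forecast of the same shape, e.g. [1, 128]."""
    y_lite = np.asarray(y_lite, dtype=np.float32)
    y_ref = np.asarray(y_ref, dtype=np.float32)
    diff = np.abs(y_lite - y_ref)
    return {
        "max_abs_err": float(diff.max()),
        "mean_abs_err": float(diff.mean()),
        "within_tol": bool(diff.max() <= atol),
    }
```

Expect the fp32 export to match closely and the quantized variants to show progressively larger, but usually still small, deviations.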