Quantized Tiny Recursive Models for Edge Deployment

Quantized TRMs for Edge Deployment

Overview

Tiny Recursive Models (TRMs) are naturally compact and recur over their latent state, making them attractive for ultra-efficient deployment on microcontrollers. This thesis develops low-precision (INT8/INT4) TRMs, analyses their stability under recursion and deep supervision, and deploys them on real hardware (GAP9 or Cortex-M) to study latency, memory, and energy trade-offs together with adaptive halting.

Expected Outcomes

  • Reference PyTorch TRM implementation ready for quantisation (building on the time-series thesis or existing TRM code)
  • Post-training and quantisation-aware training (QAT) pipelines (INT8, INT4 / mixed precision) with detailed accuracy–precision curves
  • Investigation of how quantisation affects recursive dynamics and deep supervision signals
  • Deployment of a quantised TRM on GAP9 or Cortex-M, with measured latency, memory footprint, and energy per inference
  • Evaluation of learned halting on-device versus fixed-compute baselines
  • (Optional) Technical report or workshop paper summarising quantisation + deployment findings

Prerequisites

  • Strong programming skills in Python and C/C++
  • Solid ML/DL foundations (PyTorch or equivalent)
  • Interest in embedded systems / edge AI and willingness to work with hardware toolchains
  • Prior experience with quantisation or model compression is a plus but not required

Tools & Skills

  • Python, PyTorch, quantisation frameworks (PyTorch built-ins, Brevitas, etc.)
  • C/C++ for embedded deployment (GAP SDK / PMSIS, or ARM microcontroller toolchain)
  • Linux, Git, scripting for experiment automation
  • Access to lab tools for power/latency measurements (provided)

What You Will Learn

  • How TRMs behave under low-precision arithmetic and recursive refinement
  • Practical post-training and QAT techniques for sequence / recursive models
  • Mapping compact neural networks onto MCUs or PULP SoCs while respecting memory and timing constraints
  • Measuring and interpreting energy/latency/accuracy trade-offs for edge deployment
  • Integrating ML algorithms with embedded software stacks to deliver a full-stack thesis

Application

Please email thoriri@iis.ee.ethz.ch with the subject "[MSc Thesis] Quantized Tiny Recursive Models for Edge Deployment". Attach a short motivation, CV, transcripts, and note any experience with PyTorch, quantisation, or embedded platforms.


Related Links

Thorir Mar Ingolfsson
Thorir Mar Ingolfsson
Postdoctoral Researcher

I develop efficient machine learning systems for biomedical wearables that operate under extreme resource constraints. My work bridges foundation models, neural architecture design, and edge deployment to enable real-time biosignal analysis on microwatt-scale devices.