NeurIPS 2024 — Spotlight Poster Presentation

Abstract

We propose a parameter-efficient fine-tuning approach that cuts compute requirements by a factor of 4 while retaining 97% of full fine-tuning performance on domain-specific benchmarks. Our method combines adaptive rank selection with gradient-aware layer freezing.

Key Contributions

Adaptive LoRA rank selection — dynamically adjusts rank per layer based on gradient magnitude during training
Layer-wise freezing scheduler — progressively freezes converged layers to redirect compute to under-trained parameters
Domain benchmark suite — released evaluation suite covering legal, medical, and financial domains
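
The first two contributions can be illustrated with a minimal sketch. The poster does not state the exact selection or freezing rules, so everything below is an assumption made for illustration: the function names `select_ranks` and `layers_to_freeze`, the linear mapping from gradient norm to rank, the rank bounds, and the window and tolerance used to decide that a layer has converged are all hypothetical.

```python
from typing import Dict, List


def select_ranks(grad_norms: Dict[str, float],
                 min_rank: int = 2,
                 max_rank: int = 16) -> Dict[str, int]:
    """Assign a LoRA rank to each layer from its gradient magnitude.

    Hypothetical rule: scale linearly between min_rank and max_rank,
    so layers with larger gradients receive more adapter capacity.
    """
    lo, hi = min(grad_norms.values()), max(grad_norms.values())
    span = hi - lo
    ranks = {}
    for name, norm in grad_norms.items():
        # If all layers have equal gradient norm, fall back to min_rank.
        frac = 0.0 if span == 0 else (norm - lo) / span
        ranks[name] = min_rank + round(frac * (max_rank - min_rank))
    return ranks


def layers_to_freeze(history: Dict[str, List[float]],
                     window: int = 3,
                     tol: float = 1e-3) -> List[str]:
    """Return layers treated as converged.

    Hypothetical rule: a layer counts as converged when its last
    `window` recorded gradient norms are all below `tol`.
    """
    return [
        name
        for name, norms in history.items()
        if len(norms) >= window and all(n < tol for n in norms[-window:])
    ]


if __name__ == "__main__":
    # Made-up gradient norms, for demonstration only.
    print(select_ranks({"layer0": 0.0, "layer1": 0.5, "layer2": 1.0}))
    print(layers_to_freeze({
        "layer0": [1e-4, 1e-4, 1e-4],
        "layer1": [1e-4, 0.5, 1e-4],
    }))
```

In a PyTorch training loop, the layers returned by `layers_to_freeze` would typically have `requires_grad` set to `False` on their parameters, so that later steps spend compute only on the layers still training.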

Takeaways

The conference provided excellent networking with teams from DeepMind, Meta FAIR, and several university labs working on similar efficiency problems. These conversations led to two follow-up collaborations.