Pisets: A Robust Speech Recognition System for Lectures and Interviews
This repository contains a fine-tuned Whisper Large V3 model for Russian speech recognition. It serves as the core transcription component of the Pisets system, specifically optimized for long audio recordings such as lectures and interviews.
The model was presented in the paper Pisets: A Robust Speech Recognition System for Lectures and Interviews.
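As a rough illustration of how a fine-tuned Whisper checkpoint like this one is typically loaded, the sketch below uses the generic Hugging Face `transformers` speech recognition pipeline. It is not the full Pisets system, only plain Whisper inference. `MODEL_ID` is a hypothetical placeholder (this card's actual repository ID is not stated in the text above), and the helper names `build_asr_call_kwargs` and `transcribe` are illustrative assumptions.

```python
# Hypothetical placeholder: replace with the ID of this model repository.
MODEL_ID = "your-namespace/your-pisets-whisper-checkpoint"


def build_asr_call_kwargs(language="russian"):
    """Build call-time options for a Whisper ASR pipeline.

    Forces the decoding language and asks for segment timestamps,
    which are useful for long recordings such as lectures.
    """
    return {
        "return_timestamps": True,
        "generate_kwargs": {"language": language, "task": "transcribe"},
    }


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file with the generic transformers pipeline."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import pipeline

    # chunk_length_s=30 makes the pipeline split long audio into 30-second windows.
    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,
    )
    return asr(audio_path, **build_asr_call_kwargs())
```

Calling `transcribe("lecture.wav")` with a real model ID would return a dictionary containing the full `text` and a list of timestamped `chunks`. For the complete three-component workflow, the GitHub repository remains the reference.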
The Pisets system implements a three-component architecture designed to improve recognition accuracy while minimizing hallucinations.
The complete source code and instructions for using the system (including generation of SRT and DocX files) can be found in the GitHub repository:
GitHub: https://github.com/bond005/pisets
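To make the SRT output mentioned above more concrete, here is a minimal sketch of how timestamped segments could be turned into SRT text. It is not the code from the Pisets repository. It assumes segments shaped like the `chunks` list returned by the `transformers` ASR pipeline (a `timestamp` pair in seconds plus a `text` field), and the function names are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks) -> str:
    """Render a list of {"timestamp": (start, end), "text": str} as SRT."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        # The final chunk may lack an end time; fall back to its start time.
        if end is None:
            end = start
        text = chunk["text"].strip()
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    # Subtitle blocks are separated by a blank line.
    return "\n".join(blocks)
```

The result can be written to a `.srt` file with UTF-8 encoding, for example `open("lecture.srt", "w", encoding="utf-8").write(chunks_to_srt(chunks))`.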
If you use this model or the Pisets system in your research, please cite:
@article{bondarenko2026pisets,
  title={Pisets: A Robust Speech Recognition System for Lectures and Interviews},
  author={Ivan Bondarenko},
  journal={arXiv preprint arXiv:2601.18415},
  year={2026}
}