This is critical for server use cases, since it improves concurrency on the GPU. Faster-Whisper added this in SYSTRAN/faster-whisper#856. For live use cases this means multiple concurrent transcriptions can run on the same GPU with a single model, which would significantly lower deployment costs.
Is there a way to do this now, or is it on the roadmap?
Thanks!