Releases: FluidInference/FluidAudio
v0.15.4 japanese kokoro
What's Changed
- fix(tts): treat KokoroAne quote delimiters as punctuation by @LemonCANDY42 in #696
- fix(tts/kokoro-ane): synthesis correctness + doc accuracy in #697
- fix(tts/kokoro-ane): KokoroNoise v2 — atan2 phase fix (removes HF sharpness) in #700
- feat(asr): expose per-token timings from StreamingUnifiedAsrManager by @ComicBit in #701
- Make SenseVoice downloads precision-aware by @pHequals7 in #703
- feat(tts/kokoro-ane): Japanese variant + PyTorch-matched output level in #699
New Contributors
- @LemonCANDY42 made their first contribution in #696
- @ComicBit made their first contribution in #701
- @pHequals7 made their first contribution in #703
Full Changelog: v0.15.3...v0.15.4
v0.15.3
What's Changed
- feat(tts): M5 benchmark re-baseline + Kokoro ane-tail-gpu (M5 fix) + Supertonic int4 default + PocketTTS v2.1 ANE docs in #666
- fix(kokoro): make the M5-safe routing the default (Kokoro works on M5 out of the box) in #671
- chore(asr): remove experimental Parakeet CTC zh-CN Mandarin model in #675
- Remove experimental Magpie multilingual TTS backend in #674
- chore(asr): remove experimental Qwen3 ASR backend in #676
- fix(download): retry transient per-file failures in downloadRepo by @JulianPscheid in #681
- feat(tts/pocket): ANE placements — rank-4 split-KV models (.ane) + MLState pipeline (.aneState) by @Alex-Wengg in #679
- fix(kokoro): route the Noise stage to GPU in the M5-safe preset (+~10% synth) in #677
- feat(asr/eou): opt-in fused decoder+joint_decision path (+7-9% RTFx, WER neutral-or-better) in #680
- feat(asr): expose per-token timings from Nemotron streaming ASR (English + multilingual) by @JulianPscheid in #673
- Add TypeWhisper to the FluidAudio showcase by @SeoFood in #685
- fix(asr): make SlidingWindowAsrConfig.default fit the model's 240k-sample input in #689
- fix(asr): splice long-form chunk merges on SentencePiece word boundaries in #688
- fix(tts): Misaki-lexicon-first English frontend for KokoroAne in #692
- feat(asr): Parakeet Unified 0.6B backend (chunked-attention streaming + offline batch) in #693
New Contributors
- @JulianPscheid made their first contribution in #681
- @SeoFood made their first contribution in #685
- @rcourtman made their first contribution in #684
Full Changelog: v0.15.2...v0.15.3
v0.15.2
What's Changed
- feat(tts/supertonic3): quantized + ANE-bucketed VectorEstimator in #664
- PocketTTS v2.1: fused flow decoder (ANE) + cond prefill + fp16 flowlm (~1.8× RTFx) in #665
Full Changelog: v0.15.1...v0.15.2
v0.15.1
What's Changed
- perf(asr): opt-in GPU encoder placement for Parakeet v3 (+~8% RTFx, WER-neutral) in #659
- feat(asr): English Nemotron streaming 2240ms tier + B1 fusion (+50% RTFx, WER-neutral) in #660
Full Changelog: v0.15.0...v0.15.1
v0.15.0 — Multistreamer
What's Changed
- feat(asr): SenseVoiceSmall CoreML backend (multilingual, non-autoregressive) in #648
- feat(asr): Paraformer-large (zh) CoreML backend (non-autoregressive, CIF) in #651
- fix(download): fetch repo-root vocab.json for Cohere Transcribe (#649) in #650
- feat(download): add offline-only enforcement via DownloadUtils.enforceOffline by @leecrossley in #632
- feat(asr): Nemotron 3.5 ASR Streaming Multilingual 0.6B (40 locales, on-device CoreML/ANE) in #657
New Contributors
- @leecrossley made their first contribution in #632
Full Changelog: v0.14.8...v0.15.0
v0.14.8
What's Changed
- fix(asr): honor caller modelNames in DownloadUtils for non-baseline files (#524) in #625
- Fix/word boost improvements by @smdesai in #634
- fix(cli): expose missing Parakeet langs, add Greek script support by @rpkyle in #637
- Add timestamp support to EoU Streaming by @Kavi-Gupta in #629
- fix(asr): reduce English drift on French recordings via token blocklist by @Matth-93 in #630
- feat(diarizer): expose per-chunk embeddings on DiarizationResult by @adamsro in #633
- TTS: store downloaded models in Application Support, not Caches (iOS) in #642
- PocketTTS: stride-aware conditioning extraction for cloned voices in #643
- Supertonic-3: built-in voice enum + on-demand voice download in #644
New Contributors
- @rpkyle made their first contribution in #637
- @Kavi-Gupta made their first contribution in #629
- @jonyoder made their first contribution in #639
Full Changelog: v0.14.7...v0.14.8
v0.14.7
What's Changed
- feat(tts/pockettts): add 5 native-language voices + slim language-pack downloads ~40% in #624
- feat(cli): expose all unexposed config fields across 6 commands by @barzhomi in #616
- fix(asr): add opt-in v3 no-mel decode arbitration by @vdt4534 in #604
Full Changelog: v0.14.6...v0.14.7
v0.14.6
What's Changed
- feat(tts): Supertonic-3 multilingual CoreML TTS in #617
- refactor(tts): async StyleTTS2 predict + drop non-native Magpie synthesizeStream in #589
- Make SpeakerManager a struct and de-async DiarizerManager by @panv-kw in #591
- feat(tts/magpie): warmup API for cold-start mitigation (#60 Track 2) in #595
- Fixed LS-EEND Memory Leak + Updated Docs by @SGD2718 in #605
- fix(tts/pocket-tts): repair v1 voice cloning for pocket-tts 2.0.0 (#592) in #601
- Timestamping RTTN decoder by @SGD2718 in #608
- fix(asr/nemotron): seed cache_len=1 to avoid ios17.slice_by_index zero-shape warning (#607) in #609
- fix(tts/pockettts): normalize French text and preserve mid-sentence chunks (#584) in #606
- feat(asr): expose tdtConfig in SlidingWindowAsrConfig by @execsumo in #611
- deprecate: remove CosyVoice3 and mono Kokoro (#571) in #602
- Diarization progress by @nburns in #615
Full Changelog: v0.14.5...v0.14.6
v0.14.5
What's Changed
- feat(tts/magpie): nanocodec v1/v2/v3 + decoder_step ANE pin + dual-precision API in #580
- feat(tts/magpie): nanocodec v4 (fp32 + int8 palettize) precision in #581
- fix(tts): guard direct Float16 reads with #if arch(arm64) (CosyVoice3, StyleTTS2) by @Beingpax in #582
- Add Utter app to showcase in README.md by @joepetrakovich in #585
- Fix: Prevent Metal crash when targetTokens is 0 in Kokoro TTS by @gregyoung14 in #586
- feat(tts): StyleTTS2 LibriTTS (iteration_3) CoreML backend in #588
New Contributors
- @joepetrakovich made their first contribution in #585
- @gregyoung14 made their first contribution in #586
Full Changelog: v0.14.4...v0.14.5
v0.14.4
What's Changed
- Feat/pocket tts int8 precision swap by @Gaozhongpai in #558
- Optimized LS-EEND API by @SGD2718 in #526
- Added Back the Old LS-EEND Constructors by @SGD2718 in #563
- feat(asr/parakeet-v3): introduce int4 encoder in #560
- feat(tts/benchmark): tts-benchmark CLI covering all TTS backends in #557
- feat(asr/cohere): long-form transcribeLong + cold/warm docs in #564
- Fix DiarizerTimeline Short Segment Filter by @SGD2718 in #565
- feat(tts/kokoro-ane): add Mandarin (v1.1-zh) variant in #570
New Contributors
- @Gaozhongpai made their first contribution in #558
- @alamparelli made their first contribution in #561
Full Changelog: v0.14.3...v0.14.4