Releases: FluidInference/FluidAudio

v0.15.4 japanese kokoro

16 Jun 17:49
b9d4372

What's Changed

  • fix(tts): treat KokoroAne quote delimiters as punctuation by @LemonCANDY42 in #696
  • fix(tts/kokoro-ane): synthesis correctness + doc accuracy in #697
  • fix(tts/kokoro-ane): KokoroNoise v2 — atan2 phase fix (removes HF sharpness) in #700
  • feat(asr): expose per-token timings from StreamingUnifiedAsrManager by @ComicBit in #701
  • Make SenseVoice downloads precision-aware by @pHequals7 in #703
  • feat(tts/kokoro-ane): Japanese variant + PyTorch-matched output level in #699

Full Changelog: v0.15.3...v0.15.4

v0.15.3

13 Jun 03:58
3c6e79f

What's Changed

  • feat(tts): M5 benchmark re-baseline + Kokoro ane-tail-gpu (M5 fix) + Supertonic int4 default + PocketTTS v2.1 ANE docs in #666
  • fix(kokoro): make the M5-safe routing the default (Kokoro works on M5 out of the box) in #671
  • chore(asr): remove experimental Parakeet CTC zh-CN Mandarin model in #675
  • Remove experimental Magpie multilingual TTS backend in #674
  • chore(asr): remove experimental Qwen3 ASR backend in #676
  • fix(download): retry transient per-file failures in downloadRepo by @JulianPscheid in #681
  • feat(tts/pocket): ANE placements — rank-4 split-KV models (.ane) + MLState pipeline (.aneState) by @Alex-Wengg in #679
  • fix(kokoro): route the Noise stage to GPU in the M5-safe preset (+~10% synth) in #677
  • feat(asr/eou): opt-in fused decoder+joint_decision path (+7-9% RTFx, WER neutral-or-better) in #680
  • feat(asr): expose per-token timings from Nemotron streaming ASR (English + multilingual) by @JulianPscheid in #673
  • Add TypeWhisper to the FluidAudio showcase by @SeoFood in #685
  • fix(asr): make SlidingWindowAsrConfig.default fit the model's 240k-sample input in #689
  • fix(asr): splice long-form chunk merges on SentencePiece word boundaries in #688
  • fix(tts): Misaki-lexicon-first English frontend for KokoroAne in #692
  • feat(asr): Parakeet Unified 0.6B backend (chunked-attention streaming + offline batch) in #693
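
For readers curious what retrying transient per-file failures (#681) can look like, here is a minimal Swift sketch of bounded retry with exponential backoff. It is not the code from that PR: `FetchError`, `withRetry`, and the default attempt count and delay are hypothetical names and values chosen for illustration.

```swift
import Foundation

/// Hypothetical error type for this sketch; not part of FluidAudio.
enum FetchError: Error {
    case transient   // e.g. a timeout or a dropped connection
    case permanent   // e.g. the file does not exist
}

/// Runs `operation` up to `maxAttempts` times.
/// Only `FetchError.transient` is retried; any other error is rethrown at once.
/// The wait between attempts doubles each time (exponential backoff).
func withRetry<T>(
    maxAttempts: Int = 3,
    initialDelay: TimeInterval = 0.5,
    operation: () throws -> T
) throws -> T {
    precondition(maxAttempts >= 1, "need at least one attempt")
    var delay = initialDelay
    var attempt = 1
    while true {
        do {
            return try operation()
        } catch FetchError.transient where attempt < maxAttempts {
            // Transient failure with attempts left: wait, then try again.
            Thread.sleep(forTimeInterval: delay)
            delay *= 2
            attempt += 1
        }
        // Anything not matched above propagates to the caller.
    }
}
```

Applying the retry per file rather than per repository means one flaky file does not force the files that already succeeded to be fetched again.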

Full Changelog: v0.15.2...v0.15.3

v0.15.2

07 Jun 12:26
7f963cd

What's Changed

  • feat(tts/supertonic3): quantized + ANE-bucketed VectorEstimator in #664
  • PocketTTS v2.1: fused flow decoder (ANE) + cond prefill + fp16 flowlm (~1.8× RTFx) in #665

Full Changelog: v0.15.1...v0.15.2

v0.15.1

05 Jun 16:22
ed66535

What's Changed

  • perf(asr): opt-in GPU encoder placement for Parakeet v3 (+~8% RTFx, WER-neutral) in #659
  • feat(asr): English Nemotron streaming 2240ms tier + B1 fusion (+50% RTFx, WER-neutral) in #660
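
The RTFx percentages quoted in these notes can be sanity-checked with a small helper. The sketch below assumes RTFx means audio duration divided by processing time (the common inverse real-time factor); the function names are hypothetical and the numbers in the usage note are made up, not measurements from this project.

```swift
/// RTFx under the assumed definition: seconds of audio handled per second of compute.
/// Higher is faster; a value of 1.0 means exactly real time.
func rtfx(audioSeconds: Double, processingSeconds: Double) -> Double {
    precondition(processingSeconds > 0, "processing time must be positive")
    return audioSeconds / processingSeconds
}

/// Relative change from `baseline` to `candidate`, in percent.
func percentChange(from baseline: Double, to candidate: Double) -> Double {
    precondition(baseline != 0, "baseline must be non-zero")
    return (candidate - baseline) / baseline * 100
}
```

For example, 60 s of audio processed in 0.5 s gives `rtfx(audioSeconds: 60, processingSeconds: 0.5) == 120`, and moving from an RTFx of 100 to 150 is a change of +50%.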

Full Changelog: v0.15.0...v0.15.1

v0.15.0 — Multistreamer

04 Jun 04:27
b1c82cf

What's Changed

  • feat(asr): SenseVoiceSmall CoreML backend (multilingual, non-autoregressive) in #648
  • feat(asr): Paraformer-large (zh) CoreML backend (non-autoregressive, CIF) in #651
  • fix(download): fetch repo-root vocab.json for Cohere Transcribe (#649) in #650
  • feat(download): add offline-only enforcement via DownloadUtils.enforceOffline by @leecrossley in #632
  • feat(asr): Nemotron 3.5 ASR Streaming Multilingual 0.6B (40 locales, on-device CoreML/ANE) in #657

Full Changelog: v0.14.8...v0.15.0

v0.14.8

31 May 17:48
56607d9

What's Changed

  • fix(asr): honor caller modelNames in DownloadUtils for non-baseline files (#524) in #625
  • Fix/word boost improvements by @smdesai in #634
  • fix(cli): expose missing Parakeet langs, add Greek script support by @rpkyle in #637
  • Add timestamp support to EoU Streaming by @Kavi-Gupta in #629
  • fix(asr): reduce English drift on French recordings via token blocklist by @Matth-93 in #630
  • feat(diarizer): expose per-chunk embeddings on DiarizationResult by @adamsro in #633
  • TTS: store downloaded models in Application Support, not Caches (iOS) in #642
  • PocketTTS: stride-aware conditioning extraction for cloned voices in #643
  • Supertonic-3: built-in voice enum + on-demand voice download in #644
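
As a rough illustration of the token-blocklist idea mentioned in #630, the Swift sketch below picks the best-scoring token while skipping a set of blocked IDs. It is an assumption about the general technique, not the PR's implementation: `argmaxExcluding` is a hypothetical helper and the token IDs and scores in the usage note are invented.

```swift
/// Greedy token selection that ignores blocked token IDs.
/// Returns the index of the highest score among non-blocked entries,
/// or nil when `logits` is empty or every entry is blocked.
func argmaxExcluding(_ logits: [Float], blocked: Set<Int>) -> Int? {
    var bestID: Int? = nil
    var bestScore = -Float.infinity
    for (id, score) in logits.enumerated() where !blocked.contains(id) {
        // Take the first candidate unconditionally, then only strict improvements.
        if bestID == nil || score > bestScore {
            bestID = id
            bestScore = score
        }
    }
    return bestID
}
```

With scores `[0.1, 2.0, 1.5]`, an empty blocklist selects token 1, while blocking token 1 selects token 2 instead. Filtering at selection time leaves the model output untouched, so the blocklist can differ per language or per call.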

Full Changelog: v0.14.7...v0.14.8

v0.14.7

19 May 02:17
8048812

What's Changed

  • feat(tts/pockettts): add 5 native-language voices + slim language-pack downloads ~40% in #624
  • feat(cli): expose all unexposed config fields across 6 commands by @barzhomi in #616
  • fix(asr): add opt-in v3 no-mel decode arbitration by @vdt4534 in #604

Full Changelog: v0.14.6...v0.14.7

v0.14.6

17 May 06:29
364d839

What's Changed

  • feat(tts): Supertonic-3 multilingual CoreML TTS in #617
  • refactor(tts): async StyleTTS2 predict + drop non-native Magpie synthesizeStream in #589
  • Make SpeakerManager a struct and de-async DiarizerManager by @panv-kw in #591
  • feat(tts/magpie): warmup API for cold-start mitigation (#60 Track 2) in #595
  • Fixed LS-EEND Memory Leak + Updated Docs by @SGD2718 in #605
  • fix(tts/pocket-tts): repair v1 voice cloning for pocket-tts 2.0.0 (#592) in #601
  • Timestamping RTTN decoder by @SGD2718 in #608
  • fix(asr/nemotron): seed cache_len=1 to avoid ios17.slice_by_index zero-shape warning (#607) in #609
  • fix(tts/pockettts): normalize French text and preserve mid-sentence chunks (#584) in #606
  • feat(asr): expose tdtConfig in SlidingWindowAsrConfig by @execsumo in #611
  • deprecate: remove CosyVoice3 and mono Kokoro (#571) in #602
  • Diarization progress by @nburns in #615

Full Changelog: v0.14.5...v0.14.6

v0.14.5

09 May 05:17
ce59fb1

What's Changed

  • feat(tts/magpie): nanocodec v1/v2/v3 + decoder_step ANE pin + dual-precision API in #580
  • feat(tts/magpie): nanocodec v4 (fp32 + int8 palettize) precision in #581
  • fix(tts): guard direct Float16 reads with #if arch(arm64) (CosyVoice3, StyleTTS2) by @Beingpax in #582
  • Add Utter app to showcase in README.md by @joepetrakovich in #585
  • Fix: Prevent Metal crash when targetTokens is 0 in Kokoro TTS by @gregyoung14 in #586
  • feat(tts): StyleTTS2 LibriTTS (iteration_3) CoreML backend in #588
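
The `#if arch(arm64)` guard from #582 exists because Swift's `Float16` type is unavailable on Intel macOS. The sketch below shows one possible shape of such a guard, with a manual bit-level decode as the fallback. Both `halfBitsToFloat` and `readHalf` are hypothetical helpers written for this note; the fallback used in the actual PR may differ.

```swift
/// Decodes an IEEE 754 half-precision bit pattern into a Float without using Float16.
func halfBitsToFloat(_ h: UInt16) -> Float {
    let sign = UInt32(h & 0x8000) << 16
    let exponent = Int((h >> 10) & 0x1F)
    let fraction = UInt32(h & 0x03FF)
    if exponent == 0 {
        // Zero or subnormal: value = fraction * 2^-24.
        let magnitude = Float(fraction) * Float(sign: .plus, exponent: -24, significand: 1)
        return sign == 0 ? magnitude : -magnitude
    }
    if exponent == 0x1F {
        // Infinity or NaN: keep the payload bits.
        return Float(bitPattern: sign | 0x7F80_0000 | (fraction << 13))
    }
    // Normal number: rebias the exponent from 15 to 127 and widen the fraction.
    return Float(bitPattern: sign | (UInt32(exponent + 112) << 23) | (fraction << 13))
}

/// Reads the half-precision value at `index` from a raw buffer.
/// The buffer is assumed to be 2-byte aligned and little-endian.
func readHalf(_ raw: UnsafeRawBufferPointer, at index: Int) -> Float {
    let offset = index * MemoryLayout<UInt16>.size
    #if arch(arm64)
    // Native path. Older deployment targets may also need an @available check.
    return Float(raw.load(fromByteOffset: offset, as: Float16.self))
    #else
    // Fallback path for architectures where Float16 may not exist.
    return halfBitsToFloat(raw.load(fromByteOffset: offset, as: UInt16.self))
    #endif
}
```

For instance, the bit patterns `0x3C00`, `0xC000`, and `0x3800` decode to 1.0, -2.0, and 0.5 on either path.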

Full Changelog: v0.14.4...v0.14.5

v0.14.4

04 May 05:07
bdbff4d

What's Changed

  • Feat/pocket tts int8 precision swap by @Gaozhongpai in #558
  • Optimized LS-EEND API by @SGD2718 in #526
  • Added Back the Old LS-EEND Constructors by @SGD2718 in #563
  • feat(asr/parakeet-v3): introduce int4 encoder in #560
  • feat(tts/benchmark): tts-benchmark CLI covering all TTS backends in #557
  • feat(asr/cohere): long-form transcribeLong + cold/warm docs in #564
  • Fix DiarizerTimeline Short Segment Filter by @SGD2718 in #565
  • feat(tts/kokoro-ane): add Mandarin (v1.1-zh) variant in #570

Full Changelog: v0.14.3...v0.14.4