From 3c3b90a962c82509300a54980f990a3c7e07bce0 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]"
Date: Wed, 24 Jun 2026 12:31:59 +0300
Subject: [PATCH] Prepare release 0.5.0

---
 CHANGELOG.md | 27 +++++++++++++++++++++++++++
 CITATION.cff |  4 ++--
 README.md    |  4 ++--
 pom.xml      |  2 +-
 4 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index f157cf84..ff1cd811 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,33 @@
 
 All notable changes to GPULlama3.java will be documented in this file.
 
+## [0.5.0] - 2026-06-24
+
+### Features
+
+- Add prefill-decode and batch-prefill-decode for Qwen3 (FP16 and Q8_0) ([#122](https://github.com/beehive-lab/GPULlama3.java/pull/122))
+- Refactor GPU backend planner ([#117](https://github.com/beehive-lab/GPULlama3.java/pull/117))
+- Several fixes and improvements for CI ([#115](https://github.com/beehive-lab/GPULlama3.java/pull/115))
+- Ci/metrics history ([#114](https://github.com/beehive-lab/GPULlama3.java/pull/114))
+- Improve collection of performance/throughput metrics ([#113](https://github.com/beehive-lab/GPULlama3.java/pull/113))
+- Update TornadoVM dependency for jdk21 and fixed suffix regarding future releases ([#111](https://github.com/beehive-lab/GPULlama3.java/pull/111))
+- Add Prefill–Decode Separation with Batched Prompt Ingestion and Logits Skipping ([#102](https://github.com/beehive-lab/GPULlama3.java/pull/102))
+
+### Other Changes
+
+- Qwen3 decode: split-KV attention + backend-aware warp GEMV (FP16 & Q8_0) ([#123](https://github.com/beehive-lab/GPULlama3.java/pull/123))
+- Introduce tool calling support ([#116](https://github.com/beehive-lab/GPULlama3.java/pull/116))
+- Cleanup of presentation materials ([#121](https://github.com/beehive-lab/GPULlama3.java/pull/121))
+- Add Q4_K/Q5_K/Q6_K GPU support via Q8_0 dequantization ([#108](https://github.com/beehive-lab/GPULlama3.java/pull/108))
+- llama-tornado script curation ([#112](https://github.com/beehive-lab/GPULlama3.java/pull/112))
+- Add Apple Metal backend support ([#103](https://github.com/beehive-lab/GPULlama3.java/pull/103))
+- Add DevoxxGreece presentation material ([#109](https://github.com/beehive-lab/GPULlama3.java/pull/109))
+- Devstral 2 support (Mistral 3 architecture, Tekken tokenizer, YaRN … ([#107](https://github.com/beehive-lab/GPULlama3.java/pull/107))
+- Add llamaTornado Java 25 single-file launcher with Metal backend support ([#105](https://github.com/beehive-lab/GPULlama3.java/pull/105))
+- [refactor] Simplify and unify the TornadoVM layer planner infrastructure ([#101](https://github.com/beehive-lab/GPULlama3.java/pull/101))
+- AddCI Actions for Quarkus-LangChain4j integration ([#89](https://github.com/beehive-lab/GPULlama3.java/pull/89))
+- Simplify and generalize TornadoVM version across JDK profiles in pom.xml ([#99](https://github.com/beehive-lab/GPULlama3.java/pull/99))
+
 ## [0.4.0] - 2026-02-25
 
 ### Other Changes
diff --git a/CITATION.cff b/CITATION.cff
index da2cbe6b..6c24fa0e 100644
--- a/CITATION.cff
+++ b/CITATION.cff
@@ -15,6 +15,6 @@ authors:
     given-names: "Christos"
 title: "GPULlama3.java"
 license: MIT License
-version: 0.4.0
-date-released: 2026-02-25
+version: 0.5.0
+date-released: 2026-06-24
 url: "https://github.com/beehive-lab/GPULlama3.java"
diff --git a/README.md b/README.md
index 2e2db217..97d05ce6 100644
--- a/README.md
+++ b/README.md
@@ -133,7 +133,7 @@ You can add **GPULlama3.java** directly to your Maven project by including the f
     io.github.beehive-lab
     gpu-llama3
-    0.4.0
+    0.5.0
 ```
 
@@ -142,7 +142,7 @@ You can add **GPULlama3.java** directly to your Maven project by including the f
     io.github.beehive-lab
     gpu-llama3
-    0.4.0-jdk25
+    0.5.0-jdk25
 ```
 
diff --git a/pom.xml b/pom.xml
index 4651725c..07af3084 100644
--- a/pom.xml
+++ b/pom.xml
@@ -38,7 +38,7 @@
-        0.4.0
+        0.5.0
         4.0.2
         -jdk21
         ${tornadovm.base.version}${jdk.version.suffix}