Merged
27 changes: 27 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,33 @@

All notable changes to GPULlama3.java will be documented in this file.

## [0.5.0] - 2026-06-24

### Features

- Add prefill-decode and batch-prefill-decode for Qwen3 (FP16 and Q8_0) ([#122](https://github.com/beehive-lab/GPULlama3.java/pull/122))
- Refactor GPU backend planner ([#117](https://github.com/beehive-lab/GPULlama3.java/pull/117))
- Several fixes and improvements for CI ([#115](https://github.com/beehive-lab/GPULlama3.java/pull/115))
- CI metrics history ([#114](https://github.com/beehive-lab/GPULlama3.java/pull/114))
- Improve collection of performance/throughput metrics ([#113](https://github.com/beehive-lab/GPULlama3.java/pull/113))
- Update TornadoVM dependency for JDK 21 and fix the version suffix for future releases ([#111](https://github.com/beehive-lab/GPULlama3.java/pull/111))
- Add Prefill–Decode Separation with Batched Prompt Ingestion and Logits Skipping ([#102](https://github.com/beehive-lab/GPULlama3.java/pull/102))

### Other Changes

- Qwen3 decode: split-KV attention + backend-aware warp GEMV (FP16 & Q8_0) ([#123](https://github.com/beehive-lab/GPULlama3.java/pull/123))
- Introduce tool calling support ([#116](https://github.com/beehive-lab/GPULlama3.java/pull/116))
- Cleanup of presentation materials ([#121](https://github.com/beehive-lab/GPULlama3.java/pull/121))
- Add Q4_K/Q5_K/Q6_K GPU support via Q8_0 dequantization ([#108](https://github.com/beehive-lab/GPULlama3.java/pull/108))
- llama-tornado script curation ([#112](https://github.com/beehive-lab/GPULlama3.java/pull/112))
- Add Apple Metal backend support ([#103](https://github.com/beehive-lab/GPULlama3.java/pull/103))
- Add DevoxxGreece presentation material ([#109](https://github.com/beehive-lab/GPULlama3.java/pull/109))
- Devstral 2 support (Mistral 3 architecture, Tekken tokenizer, YaRN … ([#107](https://github.com/beehive-lab/GPULlama3.java/pull/107))
- Add llamaTornado Java 25 single-file launcher with Metal backend support ([#105](https://github.com/beehive-lab/GPULlama3.java/pull/105))
- [refactor] Simplify and unify the TornadoVM layer planner infrastructure ([#101](https://github.com/beehive-lab/GPULlama3.java/pull/101))
- Add CI Actions for Quarkus-LangChain4j integration ([#89](https://github.com/beehive-lab/GPULlama3.java/pull/89))
- Simplify and generalize TornadoVM version across JDK profiles in pom.xml ([#99](https://github.com/beehive-lab/GPULlama3.java/pull/99))

## [0.4.0] - 2026-02-25

### Other Changes
4 changes: 2 additions & 2 deletions CITATION.cff
@@ -15,6 +15,6 @@ authors:
given-names: "Christos"
title: "GPULlama3.java"
license: MIT License
-version: 0.4.0
-date-released: 2026-02-25
+version: 0.5.0
+date-released: 2026-06-24
url: "https://github.com/beehive-lab/GPULlama3.java"
4 changes: 2 additions & 2 deletions README.md
@@ -133,7 +133,7 @@ You can add **GPULlama3.java** directly to your Maven project by including the f
<dependency>
<groupId>io.github.beehive-lab</groupId>
<artifactId>gpu-llama3</artifactId>
-<version>0.4.0</version>
+<version>0.5.0</version>
</dependency>
```

@@ -142,7 +142,7 @@ You can add **GPULlama3.java** directly to your Maven project by including the f
<dependency>
<groupId>io.github.beehive-lab</groupId>
<artifactId>gpu-llama3</artifactId>
-<version>0.4.0-jdk25</version>
+<version>0.5.0-jdk25</version>
</dependency>
```

2 changes: 1 addition & 1 deletion pom.xml
@@ -38,7 +38,7 @@

<properties>
<!-- CI-friendly version: resolved by flatten-maven-plugin at build time -->
-<revision>0.4.0</revision>
+<revision>0.5.0</revision>
<tornadovm.base.version>4.0.2</tornadovm.base.version>
<jdk.version.suffix>-jdk21</jdk.version.suffix>
<tornadovm.version>${tornadovm.base.version}${jdk.version.suffix}</tornadovm.version>