feat: add L1 instruction cache to core fetcher#58
Open
mohamedx2 wants to merge 2 commits into
Open
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds an instruction-side caching mechanism to the core fetcher to reduce repeated program-memory read requests (especially beneficial for tight loops), and marks the corresponding roadmap item as complete.
Changes:
- Added an instruction cache path in
src/fetcher.svto return cached instructions without issuing external program-memory reads on hits. - Added a
skills-lock.jsonplus embedded-systems “skill” documentation under.agents/skills/.... - Updated
README.md“Next Steps” checklist item for instruction caching to completed.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/fetcher.sv | Adds instruction cache storage/valid tracking and hit/miss logic to bypass program memory on hits. |
| README.md | Marks “Add a simple cache for instructions” as completed in the roadmap checklist. |
| skills-lock.json | Adds a lockfile describing an embedded-systems skill dependency (but path currently doesn’t match repo layout). |
| .agents/skills/embedded-systems/SKILL.md | Adds embedded-systems skill definition doc. |
| .agents/skills/embedded-systems/references/rtos-patterns.md | Adds embedded-systems reference material. |
| .agents/skills/embedded-systems/references/power-optimization.md | Adds embedded-systems reference material. |
| .agents/skills/embedded-systems/references/microcontroller-programming.md | Adds embedded-systems reference material. |
| .agents/skills/embedded-systems/references/memory-optimization.md | Adds embedded-systems reference material. |
| .agents/skills/embedded-systems/references/communication-protocols.md | Adds embedded-systems reference material. |
Comments suppressed due to low confidence (3)
src/fetcher.sv:46
- Resetting every icache_data entry in a for-loop will synthesize to a large amount of reset logic (and may prevent inferring a memory/RAM). Since icache_valid is already cleared on reset, the data array doesn’t need to be reset; dropping the loop (or using an init file for simulation only) will reduce area and improve synth/FPGA mapping.
icache_valid <= 0;
for (i = 0; i < (1<<PROGRAM_MEM_ADDR_BITS); i = i + 1) begin
icache_data[i] <= 0;
end
src/fetcher.sv:56
- icache_valid/icache_data are indexed with current_pc (fixed 8-bit), but their depth is based on PROGRAM_MEM_ADDR_BITS. If PROGRAM_MEM_ADDR_BITS != 8, this can become an out-of-range access or unintended truncation. To keep the parameter meaningful, make current_pc width PROGRAM_MEM_ADDR_BITS (and propagate through core/scheduler) or explicitly slice/cast the index to PROGRAM_MEM_ADDR_BITS.
if (icache_valid[current_pc]) begin
// Cache Hit
fetcher_state <= FETCHED;
instruction <= icache_data[current_pc];
mem_read_valid <= 0;
src/fetcher.sv:62
- This PR’s main behavioral change is the cache-hit fast path (FETCHED without asserting mem_read_valid). There are existing cocotb integration tests, but they only assert final memory results and won’t catch regressions where the cache stops being used. Adding a test that asserts program memory reads drop on repeated PCs (e.g., a tight loop) or that cycle count decreases would make the cache behavior verifiable.
if (core_state == 3'b001) begin
if (icache_valid[current_pc]) begin
// Cache Hit
fetcher_state <= FETCHED;
instruction <= icache_data[current_pc];
mem_read_valid <= 0;
end else begin
// Cache Miss
fetcher_state <= FETCHING;
mem_read_valid <= 1;
mem_read_address <= current_pc;
end
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+32
to
+35
| // Simple Instruction Cache | ||
| reg [PROGRAM_MEM_DATA_BITS-1:0] icache_data [0:(1<<PROGRAM_MEM_ADDR_BITS)-1]; | ||
| reg [(1<<PROGRAM_MEM_ADDR_BITS)-1:0] icache_valid; | ||
| integer i; |
| "embedded-systems": { | ||
| "source": "jeffallan/claude-skills", | ||
| "sourceType": "github", | ||
| "skillPath": "skills/embedded-systems/SKILL.md", |
Comment on lines
+1
to
+11
| { | ||
| "version": 1, | ||
| "skills": { | ||
| "embedded-systems": { | ||
| "source": "jeffallan/claude-skills", | ||
| "sourceType": "github", | ||
| "skillPath": "skills/embedded-systems/SKILL.md", | ||
| "computedHash": "046558a8e6cd2c7e3e1da41360d64687fc72514329e04aad15e5bb7c4d19681b" | ||
| } | ||
| } | ||
| } |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This pull request introduces the first major architectural upgrade from the "Next Steps", which is adding an internal Instruction Cache!
Previously, the
fetcher.svinside each core blindly issued memory fetch requests to the external Program Memory Controller for every single instruction, resulting in significant delays, especially for loops (like thematmul.asmkernel).With this update:
fetcher.svlogic.FETCHEDin 1 cycle on a Cache Hit, bypassing the external Read operation.README.mdto show the first checklist ticket as complete.