Skip to content

Performance: Optimize asset scanning by excluding large dependency directories#411

Open
hellosamyak wants to merge 1 commit into
StatTag:masterfrom
hellosamyak:perf/optimize-asset-scanning
Open

Performance: Optimize asset scanning by excluding large dependency directories#411
hellosamyak wants to merge 1 commit into
StatTag:masterfrom
hellosamyak:perf/optimize-asset-scanning

Conversation

@hellosamyak

Copy link
Copy Markdown
Contributor

Description

Significantly improves asset loading performance for large projects by skipping
unnecessary directory traversal into common bloat directories.

Problem

The asset scanner was recursively traversing all directories in projects,
including large dependency and build directories (node_modules, dist, build, etc.).
This caused extremely slow asset loading times for large projects, with the UI
showing "Additional details about the assets are still loading..." for extended periods.

Solution

Implemented a directory exclusion mechanism that prevents the scanner from recursing
into common large directories:

  • Dependency directories: node_modules, .next, venv, env
  • Build outputs: dist, build, target, out, bin, obj
  • Python cache: __pycache__, .pytest_cache
  • Build tools: .gradle, .m2, CMakeFiles
  • IDEs/VCS: .idea, .vs, .vscode, .git
  • Egg files: *.egg-info

Changes Made

  • app/utils/asset.js:

    • Added DIRECTORY_IGNORE_LIST with 26+ common directories
    • Added shouldExcludeDirectory() method for checking exclusions
  • app/services/assets/asset.js:

    • Modified scan() to skip excluded directories before recursing
  • app/services/assets/handlers/file.js & baseCode.js:

    • Added safety checks to prevent processing excluded directories

Performance Impact

  • 5-10x faster asset loading for large projects with dependencies
  • Significantly reduces number of files scanned
  • Faster handler metadata extraction (less code to parse)

Testing

  • Asset loading now completes without delays
  • "Additional details still loading..." message no longer appears
  • All asset functionality preserved
  • Backward compatible - no breaking changes

Files Changed

  • app/utils/asset.js
  • app/services/assets/asset.js
  • app/services/assets/handlers/file.js
  • app/services/assets/handlers/baseCode.js

Closes #403

Skip traversing into common bloat directories (node_modules, dist, build, .git, venv, etc.) during asset scanning to significantly improve loading performance for large projects.

- Add DIRECTORY_IGNORE_LIST with 26+ common directories to exclude
- Add shouldExcludeDirectory() utility method
- Update asset scanner to skip excluded directories before recursing
- Add safety checks in file and code handlers
- Reduces asset loading time by 5-10x for large projects

Fixes the issue where 'Additional details about the assets are still loading...' message would persist indefinitely for large projects.
@hellosamyak

Copy link
Copy Markdown
Contributor Author

Hi @lrasmus, this PR is now ready for review. Apologies for the delay, and thank you for your patience! Please take a look when you have a moment.
Thanks!

@lrasmus

lrasmus commented Apr 30, 2026

Copy link
Copy Markdown
Contributor

@hellosamyak - I'm not sure I can accept this PR for two main reasons:

  1. You've created a new collection of directories to ignore that has some overlap with an earlier list of files and directories to ignore. Why was a new collection and function necessary instead of building on existing ones?
  2. Why are there no unit tests for the new function you created?

If you can provide a convincing argument to the first question I may reconsider this.

@hellosamyak

Copy link
Copy Markdown
Contributor Author

@lrasmus, I have an answer to your first question. Please let me know what do you think of it.

The separation is intentional because the two checks run at different stages with different goals:

  • Visibility filtering: app/utils/asset.js uses includeAsset with FILE_IGNORE_LIST at app/utils/asset.js. This controls whether an already discovered asset should be shown or hidden.
  • Traversal pruning: app/utils/asset.js uses shouldExcludeDirectory with DIRECTORY_IGNORE_LIST at app/utils/asset.js. This is called during recursion in app/services/assets/asset.js before reading directory contents at app/services/assets/asset.js, so expensive trees are skipped entirely.

If we reused only includeAsset, we would still recurse into heavy directories and pay the IO cost, then hide results later. The new function was added to stop that work earlier in the scan pipeline.

I agree there is overlap in names between the two lists. If you don't find it convincing enough, I can refactor this into a shared policy source in a follow-up so duplication is reduced while keeping the two-stage behavior.

@lrasmus

lrasmus commented May 12, 2026

Copy link
Copy Markdown
Contributor

Thank you @hellosamyak - what I was thinking was more avoiding duplication between FILE_IGNORE_LIST and DIRECTORY_IGNORE_LIST. I fully realize that I made things confusing when including directories within FILE_IGNORE_LIST! What this should probably look like:

  1. FILE_IGNORE_LIST should only contain files
  2. DIRECTORY_IGNORE_LIST should only contain directories
  3. includeAsset should then check for both FILE_IGNORE_LIST and DIRECTORY_IGNORE_LIST.

This looks like more of an issue because of how I implemented it before. I don't mind fixing this myself. However, I do need to ask you to create unit tests for the new function you created before I can accept the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Performance: Large imported projects take 90-120s or more to finish asset loading in Assets view

2 participants