Skip to content

Implement more handlers#711

Open
renkun-ken wants to merge 29 commits intomasterfrom
experimental
Open

Implement more handlers#711
renkun-ken wants to merge 29 commits intomasterfrom
experimental

Conversation

@renkun-ken
Copy link
Member

@renkun-ken renkun-ken commented Feb 6, 2026

This PR implements the following handlers:

and makes the following improvements:

  • activeParameters in completion is supported.
image
  • hierarchicalDocumentSymbolSupport is implemented so that class members are displayed in outline.
image
  • The range of top-level symbols are fixed so that symbols defined in global scopes (enclosed in {} with or without if while etc) have correct ranges.
  • Add parse cache and diagnostics cache to reuse existing results of parsed document to be more responsive.
  • Rewrite semantic token extraction and code point calculation in C to reduce R allocation for better performance.

P.S: The code written in this PR is mostly done by @copilot.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR expands the R language server’s LSP feature set (type hierarchy + semantic tokens) and improves responsiveness by adding caching and moving several hot-path operations to C.

Changes:

  • Implement textDocument/prepareTypeHierarchy, typeHierarchy/supertypes, and typeHierarchy/subtypes with supporting class/member extraction.
  • Add textDocument/semanticTokens/full and /range plus server capability wiring.
  • Improve performance via parse/diagnostics caching, task scheduling tweaks, and new C implementations for token scanning, UTF-16/code-point conversion, semantic token encoding, and matching.

Reviewed changes

Copilot reviewed 43 out of 45 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
tests/testthat/test-type-hierarchy.R Adds type hierarchy tests (R6).
tests/testthat/test-symbol.R Adapts symbol range assertions via helper.
tests/testthat/test-symbol-blocks.R Tests symbols in top-level blocks.
tests/testthat/test-signature.R Adds activeParameter test cases.
tests/testthat/test-semantic-tokens.R Adds semantic tokens tests (full/range).
tests/testthat/test-lintr.R Stabilizes lintr tests/config handling.
tests/testthat/test-hover.R Makes hover tests less brittle.
tests/testthat/test-highlight-blocks.R Tests highlights in top-level blocks.
tests/testthat/test-definition-blocks.R Tests definition in top-level blocks.
tests/testthat/helper-utils.R Adds request helpers + tweaks test server startup.
src/token.h Declares C token scanner.
src/token.c Implements fast token scanning in C.
src/semantic.h Declares semantic token encoder.
src/semantic.c Implements semantic token delta encoding in C.
src/match.h Declares C matching helpers.
src/match.c Implements ASCII case-insensitive/fuzzy matching.
src/encoding.h Declares UTF-16/code-point conversion helpers.
src/encoding.c Implements UTF-16/code-point conversion in C.
src/languageserver.c Registers new .Call entry points.
README.md Marks new LSP features as implemented.
R/workspace.R Adds parse/diagnostics caches + cache cleanup.
R/utils.R Adds ASCII detection, C-backed match/fuzzy, C token scan wrapper, and C-backed UTF-16 conversions.
R/type_hierarchy.R Implements type hierarchy logic and class member extraction.
R/task.R Adjusts task scheduling/utilization for parse tasks.
R/symbol.R Adds hierarchical DocumentSymbol support + class member children.
R/signature.R Adds activeParameter computation and parameter range parsing.
R/settings.R Introduces diagnostics cache TTL setting.
R/semantic.R Adds semantic tokens extraction + encoding + handlers.
R/selection.R Formatting/indentation adjustments only.
R/rename.R Formatting/indentation adjustments only.
R/references.R Refactors reference collection to avoid repeated c() growth.
R/languageserver.R Rescales session pool sizes and adds pending reply queues for new requests.
R/interfaces.R Adds document_symbol() constructor for hierarchical symbols.
R/hover.R Updates XPath roots to include more node types.
R/handlers-textsync.R Parses immediately on open (delay=0).
R/handlers-langfeatures.R Adds handlers for type hierarchy + semantic tokens requests.
R/formatting.R Avoids repeated list growth when building edits.
R/document.R Fixes srcref propagation for block parsing + adds parse caching by content hash.
R/diagnostics.R Adds diagnostics caching + explicit linters fallback logic.
R/definition.R Updates XPath roots similarly to hover/signature.
R/completion.R Avoids repeated list growth + updates XPath roots.
R/code_action.R Avoids repeated list growth when collecting actions.
R/capabilities.R Advertises type hierarchy + semantic tokens capabilities.
NAMESPACE Imports digest::digest.
DESCRIPTION Adds digest dependency + updates RoxygenNote.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +287 to +293
leading_space <- nchar(current_param) - nchar(sub("^\\\\s+", "", current_param))
trailing_space <- nchar(current_param) - nchar(sub("\\\\s+$", "", current_param))

# Calculate the label position as [start, end] in the full signature
# LSP uses 0-based positions
param_start <- base_offset + char_pos + leading_space
param_end <- base_offset + char_pos + nchar(current_param) - trailing_space
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The leading/trailing whitespace trimming for parameter label offsets uses patterns like "^\\s+" / "\\s+$" without perl=TRUE, so they won't match actual whitespace in base R regex. This will compute incorrect ParameterInformation label ranges. Use [[:space:]]+ (POSIX) or enable perl=TRUE with "^\s+" / "\s+$".

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,236 @@
#include <R.h>
#include <Rinternals.h>
#include <string.h>
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file uses malloc()/free()/calloc() but does not include <stdlib.h>. On many toolchains this is a compile error (implicit declaration) and can break package compilation. Add the proper standard header for these allocation functions.

Suggested change
#include <string.h>
#include <string.h>
#include <stdlib.h>

Copilot uses AI. Check for mistakes.
Comment on lines +3 to +6
#include <ctype.h>
#include <string.h>
#include <wctype.h>
#include <stdint.h>
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WINT_MAX is used later in this file but isn't guaranteed to be defined by <wctype.h> alone (it's typically from <wchar.h>). To avoid build failures on some platforms/toolchains, include the header that defines WINT_MAX (e.g. <wchar.h> or <limits.h>).

Copilot uses AI. Check for mistakes.
Comment on lines +159 to +165
idx <- idx + 1L
tokens[[idx]] <- list(
line = as.integer(line1 - 1), # Convert to 0-based, ensure integer
col = as.integer(col1 - 1), # Convert to 0-based, ensure integer
length = as.integer(nchar(token_text)), # Ensure integer
tokenType = as.integer(token_type), # Ensure integer
tokenModifiers = as.integer(modifiers) # Ensure integer
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantic token positions/lengths need to be in LSP 'character' units (UTF-16 code units). Here col and length are computed using parse-data columns (code points) and nchar() (code points), which will be incorrect for non-BMP characters. Consider converting (col1-1) and col2 to UTF-16 units via document$to_lsp_position()/code_point_to_unit and derive length from the converted start/end columns rather than nchar(token_text).

Suggested change
idx <- idx + 1L
tokens[[idx]] <- list(
line = as.integer(line1 - 1), # Convert to 0-based, ensure integer
col = as.integer(col1 - 1), # Convert to 0-based, ensure integer
length = as.integer(nchar(token_text)), # Ensure integer
tokenType = as.integer(token_type), # Ensure integer
tokenModifiers = as.integer(modifiers) # Ensure integer
# Convert token start/end from parse-data code points to LSP UTF-16 units
start_pos_lsp <- document$to_lsp_position(list(
line = as.integer(line1 - 1L),
character = as.integer(col1 - 1L)
))
end_pos_lsp <- document$to_lsp_position(list(
line = as.integer(line2 - 1L),
character = as.integer(col2)
))
idx <- idx + 1L
tokens[[idx]] <- list(
line = as.integer(start_pos_lsp$line), # Already 0-based
col = as.integer(start_pos_lsp$character), # UTF-16 code units
length = as.integer(end_pos_lsp$character - start_pos_lsp$character),
tokenType = as.integer(token_type), # Ensure integer
tokenModifiers = as.integer(modifiers) # Ensure integer

Copilot uses AI. Check for mistakes.
Comment on lines +74 to +75
diagnostics_pool_size <- max(floor(pool_size * 0.75), 2)
diagnostics_pool <- if (pool_size > 0) SessionPool$new(diagnostics_pool_size, "diagnostics") else NULL
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diagnostics_pool_size can exceed the user-requested pool_size. For example, if R_LANGSVR_POOL_SIZE=1, this computes diagnostics_pool_size=2 and starts more sessions than requested. Consider capping diagnostics_pool_size to pool_size (and allowing 1) to respect the environment setting.

Suggested change
diagnostics_pool_size <- max(floor(pool_size * 0.75), 2)
diagnostics_pool <- if (pool_size > 0) SessionPool$new(diagnostics_pool_size, "diagnostics") else NULL
# Ensure diagnostics_pool_size does not exceed the user-requested pool_size
diagnostics_pool_size <- if (pool_size > 0) {
min(max(floor(pool_size * 0.75), 1L), pool_size)
} else {
0L
}
diagnostics_pool <- if (diagnostics_pool_size > 0) SessionPool$new(diagnostics_pool_size, "diagnostics") else NULL

Copilot uses AI. Check for mistakes.
Comment on lines +359 to +360
"//SYMBOL_FUNCTION_CALL[text() = 'setMethod']/following-sibling::expr[STR_CONST[text() = '\"'{token_quote}'\"']]",
token_quote = token_text
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XPath built for the setMethod() pattern is not valid XPath because it tries to concatenate string literals and the class name without using concat()/contains(). This will likely cause xml_find_all() to error and can break prepareTypeHierarchy for tokens containing dots (e.g. S3 method names). Consider rewriting this XPath to use contains(text(), '""') (like other helpers in this file) or build a single quoted literal via xml_single_quote().

Suggested change
"//SYMBOL_FUNCTION_CALL[text() = 'setMethod']/following-sibling::expr[STR_CONST[text() = '\"'{token_quote}'\"']]",
token_quote = token_text
"//SYMBOL_FUNCTION_CALL[text() = 'setMethod']/following-sibling::expr[STR_CONST[text() = {class_literal}]]",
class_literal = xml_single_quote(glue('"{token_text}"'))

Copilot uses AI. Check for mistakes.

# Check if this is a named argument (pattern: name = ...)
# Match identifier followed by =, with optional whitespace
named_match <- regexec("^([a-zA-Z._][a-zA-Z0-9._]*)\\s*=\\s*", current_arg)
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This named-argument regex uses "\s" but regexec() defaults to POSIX ERE (perl = FALSE), where \s is not whitespace. As a result, patterns like "z =" won't match and activeParameter will fall back to comma counting (failing the new named-arg tests). Use a POSIX-safe whitespace class (e.g. [[:space:]]*) or set perl = TRUE.

Suggested change
named_match <- regexec("^([a-zA-Z._][a-zA-Z0-9._]*)\\s*=\\s*", current_arg)
named_match <- regexec("^([a-zA-Z._][a-zA-Z0-9._]*)[[:space:]]*=[[:space:]]*", current_arg)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments