2 changes: 1 addition & 1 deletion sqlparse/keywords.py
@@ -59,7 +59,7 @@
(r'(?![_A-ZÀ-Ü])-?(\d+(\.\d*)|\.\d+)(?![_A-ZÀ-Ü])',
tokens.Number.Float),
(r'(?![_A-ZÀ-Ü])-?\d+(?![_A-ZÀ-Ü])', tokens.Number.Integer),
- (r"'(''|\\'|[^'])*'", tokens.String.Single),
+ (r"'(''|\\\\|\\'|[^'])*'", tokens.String.Single),
# not a real string literal in ANSI SQL:
(r'"(""|\\"|[^"])*"', tokens.String.Symbol),

Copilot AI Feb 6, 2026

The double-quoted string pattern on line 64 has the same vulnerability as the single-quoted pattern that's being fixed. It should also include \\\\ to handle escaped backslashes correctly. The pattern should be r'"(""|\\\\|\\"|[^"])*"' to match the fix being applied to single-quoted strings.

Suggested change
- (r'"(""|\\"|[^"])*"', tokens.String.Symbol),
+ (r'"(""|\\\\|\\"|[^"])*"', tokens.String.Symbol),
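
The effect of the suggested pattern can be checked by running both regexes over a double-quoted counterpart of the issue's query. This is a minimal sketch using only the standard `re` module; `quoted_spans` and the sample query are hypothetical, and `finditer` only approximates how the sqlparse lexer applies its rules.

```python
import re

# Patterns copied from the suggested change above (double-quoted rule).
CURRENT_DOUBLE = re.compile(r'"(""|\\"|[^"])*"')
SUGGESTED_DOUBLE = re.compile(r'"(""|\\\\|\\"|[^"])*"')


def quoted_spans(pattern, sql):
    """Return every substring of sql that the pattern matches in full."""
    return [m.group(0) for m in pattern.finditer(sql)]


# Hypothetical query: two identifiers, each consisting of two backslashes.
sql = r'SELECT "\\", "\\"'

# Without the \\\\ alternative, the second backslash pairs with the closing
# quote, so the match runs on into the next quoted span.
current_spans = quoted_spans(CURRENT_DOUBLE, sql)

# With the \\\\ alternative, the backslash pair is consumed as a unit and
# each quoted span is matched separately.
suggested_spans = quoted_spans(SUGGESTED_DOUBLE, sql)
```

Under these assumptions the current pattern yields a single span that swallows the comma, while the suggested pattern yields two separate spans.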

(r'(""|".*?[^\\]")', tokens.String.Symbol),
30 changes: 30 additions & 0 deletions tests/test_tokenize.py
@@ -245,3 +245,33 @@ def test_cli_commands():
    p = sqlparse.parse('\\copy')[0]
    assert len(p.tokens) == 1
    assert p.tokens[0].ttype == T.Command


def test_escaped_backslash_in_string():
    # issue814 - Escaped backslashes in string literals
    sql = r"SELECT '\\\\', '\\\\'"
    tokens = list(lexer.tokenize(sql))
    # Should have: SELECT, space, string, comma, space, string
    assert len(tokens) == 6
    assert tokens[0] == (T.Keyword.DML, 'SELECT')
    assert tokens[1] == (T.Whitespace, ' ')
    # The string contains two backslashes in the SQL, which is represented
    # as 4 backslashes in the Python raw string
Comment on lines +258 to +259

Copilot AI Feb 6, 2026

The comment is confusing. It says "The string contains two backslashes in the SQL" but the SQL source code actually contains 4 backslashes per string literal ('\\\\'). The comment should clarify whether it's referring to the SQL source code (4 backslashes) or the interpreted string value in databases that support backslash escaping (2 backslashes).

Suggested change
- # The string contains two backslashes in the SQL, which is represented
- # as 4 backslashes in the Python raw string
+ # Each SQL string literal contains four backslashes in the source, which
+ # databases with backslash escaping interpret as two backslashes; this is
+ # written as four backslashes in the Python raw string
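
The backslash counts discussed here can be confirmed directly in Python. The sketch below compares one literal from the test's raw-string input with the non-raw string used in the assertion; the variable names are illustrative only.

```python
# One SQL string literal as written in the test input, using a raw string:
# a quote, four backslashes, a quote.
raw_literal = r"'\\\\'"

# The same characters written as a regular (non-raw) Python string, as in
# the assertion: each pair of backslashes in the source denotes one
# backslash in the value.
escaped_literal = "'\\\\\\\\'"

# Number of backslash characters actually present in the SQL source text.
backslash_count = raw_literal.count("\\")
```

Both spellings denote the same six-character string, so the SQL source seen by the lexer contains four backslashes per literal.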

    assert tokens[2] == (T.Literal.String.Single, "'\\\\\\\\'")
    assert tokens[3] == (T.Punctuation, ',')
    assert tokens[4] == (T.Whitespace, ' ')
    assert tokens[5] == (T.Literal.String.Single, "'\\\\\\\\'")
Comment on lines +250 to +263

Copilot AI Feb 6, 2026

Consider adding a test case that exactly matches the issue example (r"SELECT '\\', '\\'" with 2 backslashes per string) to ensure that specific reported case is covered. The current test with 4 backslashes is good for thorough testing, but having the exact issue case would make it clearer that the bug is fixed.
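
As a rough illustration of why the two-backslash case is worth pinning down, the snippet below applies the original and fixed single-quoted patterns from the diff to that exact input. It is a regex-level sketch using only `re`, not a sqlparse test: anchoring with `match` at the first quote is an assumption about how the lexer tries its rules, and the variable names are hypothetical.

```python
import re

# Single-quoted string patterns copied from the diff in sqlparse/keywords.py.
ORIGINAL_SINGLE = re.compile(r"'(''|\\'|[^'])*'")
FIXED_SINGLE = re.compile(r"'(''|\\\\|\\'|[^'])*'")

# The issue example: two backslashes inside each string literal.
issue_sql = r"SELECT '\\', '\\'"

# Position of the first opening quote, where a string rule would be tried.
start = issue_sql.index("'")

# The original pattern reads the second backslash plus the closing quote as
# an escaped quote and keeps consuming up to the next quote character.
original_token = ORIGINAL_SINGLE.match(issue_sql, start).group(0)

# The fixed pattern consumes the backslash pair first and stops at the real
# closing quote.
fixed_token = FIXED_SINGLE.match(issue_sql, start).group(0)
```

A test at the `lexer.tokenize` level with this input would cover the reported case end to end; the sketch only shows the regex behaviour behind it.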


def test_escaped_quote_in_string():
    # Test that escaped quotes still work
    sql = r"SELECT 'it''s a test'"
    tokens = list(lexer.tokenize(sql))
    assert tokens[2] == (T.Literal.String.Single, "'it''s a test'")


def test_backslash_escaped_quote_in_string():
    # Test backslash-escaped quotes
    sql = r"SELECT 'it\'s a test'"
    tokens = list(lexer.tokenize(sql))
    assert tokens[2] == (T.Literal.String.Single, "'it\\'s a test'")