URL Encoding Explained: Percent-Encoding Guide
Every URL you craft passes through percent-encoding. Understand how it works, why it matters for SEO and security, and when to use encodeURIComponent vs encodeURI — with hands-on examples.
What Is URL Encoding?
URL encoding — formally called percent-encoding — is the mechanism for representing characters in a URI that are not allowed or have special meaning. When you see %20 instead of a space, or %26 instead of an ampersand, that's percent-encoding at work. It is defined by RFC 3986 and is fundamental to how the web handles data in URLs.
The process is simple: each byte of the character's UTF-8 representation is written as a percent sign followed by two uppercase hexadecimal digits. A space (0x20) becomes %20. A euro sign (€, which is 0xE2 0x82 0xAC in UTF-8) becomes %E2%82%AC. This ensures any character from any script — Chinese, Arabic, emoji — can be safely embedded in a URL.
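As a rough illustration of the byte-by-byte process described above, the sketch below percent-encodes every UTF-8 byte of a string. The helper name percentEncodeAllBytes is made up for this example; in practice encodeURIComponent does the same job while leaving unreserved characters such as letters and digits untouched.

```javascript
// Illustrative helper (hypothetical name): percent-encode every UTF-8 byte.
function percentEncodeAllBytes(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the string
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

console.log(percentEncodeAllBytes(" "));      // %20
console.log(percentEncodeAllBytes("\u20AC")); // %E2%82%AC (the euro sign)
console.log(encodeURIComponent("\u20AC"));    // %E2%82%AC
```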
Without URL encoding, URLs would be ambiguous. Imagine a search query containing &: whatever parses the URL would interpret it as a parameter separator instead of part of the search term. Percent-encoding removes this ambiguity.
encodeURIComponent vs encodeURI: When to Use Each
JavaScript provides two built-in functions for URL encoding, and using the wrong one is a common source of bugs. encodeURIComponent is the stricter function: it encodes everything except A-Z, a-z, 0-9, and the characters - _ . ! ~ * ' ( ). Crucially, it also encodes /, ?, #, &, and =, which are structural characters in a URL.
encodeURI is the lenient function: it encodes spaces, non-ASCII characters, and a few unsafe ASCII characters such as " and <, but deliberately preserves URL-structure characters like /, ?, #, @, &, =, and +. This makes it suitable for encoding a complete URL whose structure must remain intact, and unsuitable for individual query values.
The rule of thumb is straightforward: use encodeURIComponent for any value being placed into a URL — query parameter values, path segments, fragment identifiers. Use encodeURI only when you have a fully formed URL string with spaces or non-ASCII characters and want to make it safe without breaking its structure.
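The difference is easiest to see side by side. The following is a small sketch, not a complete recipe; the sample value and the example.com address are placeholders chosen for illustration.

```javascript
// A value destined for a query parameter: structural characters are data here.
const value = "rock & roll/pop?";
const asComponent = encodeURIComponent(value);
const asWholeUri = encodeURI(value);
console.log(asComponent); // rock%20%26%20roll%2Fpop%3F
console.log(asWholeUri);  // rock%20&%20roll/pop?

// A fully formed URL (placeholder address): its structure must survive.
const fullUrl = "https://example.com/my files/r\u00E9sum\u00E9.pdf?lang=en";
const safeUrl = encodeURI(fullUrl);
console.log(safeUrl);
// https://example.com/my%20files/r%C3%A9sum%C3%A9.pdf?lang=en

// Using the strict function on a whole URL escapes its delimiters too.
const brokenUrl = encodeURIComponent(fullUrl);
console.log(brokenUrl);
// https%3A%2F%2Fexample.com%2Fmy%20files%2Fr%C3%A9sum%C3%A9.pdf%3Flang%3Den
```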
A third approach used in form submissions is application/x-www-form-urlencoded, where spaces become + instead of %20. This is the format browsers use when submitting HTML forms via GET or POST. Our tool supports this as the 'Query (+ space)' mode.
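In JavaScript, the standard URLSearchParams class serializes with this form-encoding convention, which makes it a convenient way to compare the two styles. The parameter names below are arbitrary examples.

```javascript
// Form encoding: spaces become +, and a literal + becomes %2B.
const form = new URLSearchParams({ q: "hello world", op: "1+1" });
const formEncoded = form.toString();
console.log(formEncoded); // q=hello+world&op=1%2B1

// Percent-encoding of the same value uses %20 for the space.
const percentEncoded = encodeURIComponent("hello world");
console.log(percentEncoded); // hello%20world

// Parsing a form-encoded string reverses the convention.
const parsed = new URLSearchParams(formEncoded);
console.log(parsed.get("q"));  // hello world
console.log(parsed.get("op")); // 1+1
```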
Characters That Must Be Encoded
RFC 3986 defines a set of 'unreserved' characters that never need encoding: uppercase and lowercase letters (A-Z, a-z), digits (0-9), and four special characters: hyphen (-), underscore (_), period (.), and tilde (~). Reserved characters such as slashes and question marks must be percent-encoded when they appear as data rather than in their reserved role. Everything else, including spaces and all non-ASCII characters, must always be percent-encoded.
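One wrinkle: encodeURIComponent leaves ! ' ( ) and * unencoded, although RFC 3986 lists them among its reserved characters. If a consumer insists on the strict unreserved set, a thin wrapper can cover the gap. This is a sketch of a commonly used pattern; the function name is an assumption made for this example.

```javascript
// Hypothetical wrapper: also escape the five characters that
// encodeURIComponent leaves alone but RFC 3986 treats as reserved.
function encodeRfc3986Component(text) {
  return encodeURIComponent(text).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeURIComponent("it's (new)!*"));     // it's%20(new)!*
console.log(encodeRfc3986Component("it's (new)!*")); // it%27s%20%28new%29%21%2A
console.log(encodeRfc3986Component("a-b_c.d~e"));    // a-b_c.d~e (unreserved, unchanged)
```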
Reserved characters serve as delimiters in URIs. The colon (:) separates the scheme from the authority, slashes (/) separate path segments, the question mark (?) separates the path from the query string, the ampersand (&) separates query parameters, and the hash (#) marks the fragment. When these characters appear as data rather than delimiters, they must be encoded.
Common examples: a space becomes %20 (or + in form encoding), an ampersand becomes %26, an equals sign becomes %3D, a plus sign becomes %2B, and a forward slash becomes %2F. Knowing these mappings prevents the subtle data-corruption bugs that arise when a value containing & is mistakenly treated as a parameter separator.
URL Encoding and SEO: Why It Matters
Search engines crawl and index URLs, and poorly encoded URLs create real SEO problems. If a URL contains un-encoded spaces or special characters, search engine crawlers may fail to fetch the page or may index a malformed version of the URL. Google's own documentation recommends using hyphens in URLs and properly encoding any special characters.
Canonical URLs — the definitive URL for a page — must be properly encoded. If your canonical tag contains a raw space while the actual URL uses %20, Google may treat them as two different pages, splitting your ranking signals. Consistent percent-encoding ensures that canonical URLs, hreflang tags, sitemap URLs, and Open Graph URLs all resolve to the same resource.
For international SEO, URL encoding is critical. If your site targets multiple languages and uses non-ASCII characters in URLs (e.g., /products/café), those characters must be percent-encoded in sitemaps, href attributes, and API calls even if modern browsers display the decoded version in the address bar.
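For a path like the one above, a short sketch shows what the encoded form looks like when generating sitemap entries or href values. The example.com host is a placeholder, and é is written as an escape so the result does not depend on how the source file is normalized.

```javascript
const path = "/products/caf\u00E9"; // "/products/café"

// encodeURI keeps the slashes and encodes only the non-ASCII character.
const encodedPath = encodeURI(path);
console.log(encodedPath); // /products/caf%C3%A9

// The URL class applies the same encoding when it parses the path.
const absolute = new URL(path, "https://example.com").href;
console.log(absolute); // https://example.com/products/caf%C3%A9

// Decoding restores the readable form that browsers display.
console.log(decodeURI(encodedPath) === path); // true
```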
Building Query Strings Correctly
A query string is the portion of a URL following the question mark. It consists of key-value pairs joined by ampersands: ?name=John%20Doe&city=New%20York&page=1. Both keys and values should be individually encoded with encodeURIComponent to prevent injection of unintended parameters.
A common mistake is encoding the entire query string at once with encodeURIComponent, which would encode the & and = delimiters themselves, producing an invalid query string. Instead, encode each key and value separately, then join them: encodeURIComponent(key) + '=' + encodeURIComponent(value).
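A minimal sketch of that approach follows. The helper name buildQueryString is made up for this example, and it assumes a flat object of simple values.

```javascript
// Hypothetical helper: encode each key and value separately, then join.
function buildQueryString(params) {
  return Object.entries(params)
    .map(([key, value]) =>
      encodeURIComponent(key) + "=" + encodeURIComponent(String(value)))
    .join("&");
}

const query = buildQueryString({ name: "John Doe", city: "New York", page: 1 });
console.log("?" + query); // ?name=John%20Doe&city=New%20York&page=1

// The mistake: encoding the assembled string escapes the delimiters as well.
const wrong = encodeURIComponent("name=John Doe&page=1");
console.log(wrong); // name%3DJohn%20Doe%26page%3D1
```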
Modern JavaScript provides the URLSearchParams API, which handles encoding automatically. You can append parameters with params.append('q', 'hello world'), and toString() produces q=hello+world using the form-encoding convention. If the receiving system expects %20 rather than + for spaces, encode manually with encodeURIComponent instead.
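URLSearchParams is also exposed on URL objects, so a full address can be assembled without manual string handling. The sketch below uses a placeholder host and arbitrary parameter names.

```javascript
const url = new URL("https://example.com/search");
url.searchParams.append("q", "hello world");
url.searchParams.append("tag", "c&c++");

// Delimiters inside values are escaped; spaces follow the form convention.
console.log(url.search); // ?q=hello+world&tag=c%26c%2B%2B
console.log(url.href);   // https://example.com/search?q=hello+world&tag=c%26c%2B%2B

// Reading the values back returns the original strings.
console.log(url.searchParams.get("tag")); // c&c++
```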
Zutily's Query String Builder tab lets you visually add, edit, and remove parameters, parse existing URLs, and generate the properly encoded result — no manual encoding required.
Double Encoding: The Silent Bug
Double encoding occurs when an already-encoded string is encoded again. For example, a space encoded to %20 gets encoded again to %2520 (because % becomes %25). This is one of the most common and frustrating URL-related bugs, and it can break API calls, deep links, redirects, and analytics tracking.
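The effect is easy to reproduce in a console. A short sketch:

```javascript
const once = encodeURIComponent("hello world");
const twice = encodeURIComponent(once); // the % itself gets encoded as %25

console.log(once);  // hello%20world
console.log(twice); // hello%2520world

// A single decode on the receiving side no longer yields the original text.
console.log(decodeURIComponent(twice)); // hello%20world
```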
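The fix is disciplined encoding at the point of construction: encode raw values once when building the URL, and never re-encode a URL that's already been assembled. If you receive a component value from an external source and aren't sure whether it's encoded, decode it first with decodeURIComponent, then re-encode if necessary. Apply this to individual values rather than to a whole URL, since decoding an entire URL turns escaped delimiters such as %26 back into structural characters, and be prepared for decodeURIComponent to throw a URIError on malformed input such as a lone %.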
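A decode-then-encode step for a single value might look like the sketch below. Both helper names are assumptions made for this example, and the approach is a heuristic: a raw value that legitimately contains text like %20 will be misread as already encoded.

```javascript
// Hypothetical helper: decode if possible, otherwise keep the input as-is.
function safeDecodeComponent(value) {
  try {
    return decodeURIComponent(value);
  } catch (err) {
    return value; // malformed escape sequence, e.g. a lone "%"
  }
}

// Hypothetical helper: bring a value of unknown state to a singly encoded form.
function normalizeComponent(value) {
  return encodeURIComponent(safeDecodeComponent(value));
}

console.log(normalizeComponent("hello world"));   // hello%20world
console.log(normalizeComponent("hello%20world")); // hello%20world
console.log(normalizeComponent("100%"));          // 100%25
```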
In server-side frameworks (Express, Django, Rails, Next.js), be aware that request parameters are often automatically decoded. If you then insert those values into a new URL without re-encoding, special characters will break the URL. Always encode when constructing, decode when reading.
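As a framework-agnostic sketch, suppose a handler has received an already decoded return path and needs to place it in a redirect. The /login route, the next parameter, and the function name are all hypothetical.

```javascript
// Hypothetical redirect builder: returnTo arrives decoded from the framework,
// so it must be encoded again before it goes into a new URL.
function buildLoginRedirect(returnTo) {
  return "/login?next=" + encodeURIComponent(returnTo);
}

const redirect = buildLoginRedirect("/search?q=a&b=c");
console.log(redirect); // /login?next=%2Fsearch%3Fq%3Da%26b%3Dc

// Without re-encoding, "&b=c" would become a parameter of /login itself.
const next = new URLSearchParams(redirect.split("?")[1]).get("next");
console.log(next); // /search?q=a&b=c
```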
URL Encoding in Different Languages
Every major programming language provides URL encoding utilities. In JavaScript, use encodeURIComponent. In Python, use urllib.parse.quote (component-level; pass safe='' because the default leaves / unencoded) or urllib.parse.urlencode (query strings). In Java, use URLEncoder.encode with the UTF-8 charset; note that it produces form encoding, so spaces become +. In PHP, use urlencode (+ for spaces) or rawurlencode (%20 for spaces).
The critical detail across all languages is specifying UTF-8 as the character encoding. Legacy systems sometimes default to Latin-1 or Windows-1252, which produces incorrect percent sequences for any non-ASCII character. Always explicitly set UTF-8 to ensure consistent results across platforms and languages.
Zutily's online URL Encoder/Decoder uses JavaScript's native encodeURIComponent, encodeURI, and their decode counterparts, all of which always work in UTF-8. All processing runs entirely in your browser; no data is sent to any server.
Common Pitfalls and Best Practices
Never use encodeURI to encode a query parameter value. Because encodeURI preserves &, =, and ?, a value like 'a=b&c=d' would be left intact: the receiver would see the value truncated to 'a=b' plus an unintended extra parameter c=d. Always use encodeURIComponent for individual values.
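The sketch below parses both results to show what the receiving side would see. The example.com address and the q parameter are placeholders.

```javascript
const userValue = "a=b&c=d";

// Wrong: encodeURI leaves & and = in place, so the value leaks into the structure.
const bad = new URL("https://example.com/?q=" + encodeURI(userValue));
console.log(bad.searchParams.get("q")); // a=b
console.log(bad.searchParams.get("c")); // d

// Right: encodeURIComponent keeps the whole string inside q.
const good = new URL("https://example.com/?q=" + encodeURIComponent(userValue));
console.log(good.search);                // ?q=a%3Db%26c%3Dd
console.log(good.searchParams.get("q")); // a=b&c=d
console.log(good.searchParams.has("c")); // false
```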
Be careful with path segments containing slashes. If a user's input is 'AC/DC' and you need it in a URL path, you must encode it to AC%2FDC. If you use encodeURI, the slash will be preserved, creating an unexpected path segment.
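A quick sketch of the AC/DC case, using a made-up /artists/ route:

```javascript
const artist = "AC/DC";

const goodPath = "/artists/" + encodeURIComponent(artist);
const badPath = "/artists/" + encodeURI(artist);

console.log(goodPath); // /artists/AC%2FDC
console.log(badPath);  // /artists/AC/DC

// Counting segments shows the extra one introduced by the raw slash.
console.log(goodPath.split("/").length); // 3
console.log(badPath.split("/").length);  // 4
```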
Watch out for the plus sign. In query strings, + traditionally represents a space (from form encoding), but in path segments, + is literal. If your API sends a + in a query value meaning an actual plus sign, you need %2B. This inconsistency catches many developers.
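The inconsistency shows up directly in JavaScript, where the form-aware parser and the plain decoder disagree about +. A short sketch with an arbitrary q parameter:

```javascript
// Form-style parsing treats + as a space.
const fromPlus = new URLSearchParams("q=1+1").get("q");
console.log(fromPlus); // 1 1

// An escaped plus survives as a literal plus sign.
const fromEscaped = new URLSearchParams("q=1%2B1").get("q");
console.log(fromEscaped); // 1+1

// decodeURIComponent leaves + alone, as is appropriate for path segments.
const pathStyle = decodeURIComponent("1+1");
console.log(pathStyle); // 1+1

// encodeURIComponent always produces the unambiguous form.
const encodedPlus = encodeURIComponent("1+1");
console.log(encodedPlus); // 1%2B1
```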
Test with international characters. A URL that works with ASCII input may silently break with emoji, Chinese characters, or Arabic text. Always test your encoding with multi-byte UTF-8 characters to catch encoding issues early.
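A lightweight round-trip check can serve as a starting point for such tests. The sample strings and helper names below are illustrative choices, not a complete test suite.

```javascript
const samples = ["caf\u00E9", "日本語", "مرحبا", "👍"];

// Hypothetical helper: does the value survive an encode/decode cycle?
function roundTrips(text) {
  return decodeURIComponent(encodeURIComponent(text)) === text;
}

for (const sample of samples) {
  console.log(encodeURIComponent(sample), roundTrips(sample));
}

// Hypothetical helper: encodeURIComponent throws a URIError on a lone
// surrogate, which can appear when a string is cut in the middle of an emoji.
function canEncode(text) {
  try {
    encodeURIComponent(text);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(canEncode("👍"));     // true
console.log(canEncode("\uD83D")); // false
```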