Question 1

Why romanize CJK instead of keeping the Unicode characters?

Accepted Answer

Browsers handle Unicode URLs fine, but everything downstream might not. Server logs, Slack snippets, copy-pasted links in email, analytics dashboards, monitoring tools, and many CLI utilities mangle non-ASCII paths or display them as percent-encoded gibberish (`%EC%84%9C%EC%9A%B8`). ASCII slugs survive every hop. The cost is a slight loss of "scannability" in the URL bar; the win is logs and dashboards stay readable. Some teams accept the gibberish trade-off and keep CJK URLs; others go the romanization route. Pick whichever pain you can absorb.
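To make the "percent-encoded gibberish" concrete, here is a minimal sketch using Python's standard `urllib.parse`. The path `/blog/서울` is a made-up example, not something the tool emits; it happens to be the string behind the escape sequence quoted above.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing the Korean word for "Seoul".
path = "/blog/서울"

# quote() percent-encodes the UTF-8 bytes of every non-ASCII character
# and leaves "/" alone by default.
encoded = quote(path)
print(encoded)           # /blog/%EC%84%9C%EC%9A%B8

# Decoding recovers the original, but only if every tool in the chain decodes.
print(unquote(encoded))  # /blog/서울
```

Each Hangul syllable becomes nine ASCII characters, and that is the form most log viewers and dashboards end up showing.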

Question 2

Does this strip the accent from `café` or keep it?

Accepted Answer

Stripped. The tool runs Unicode NFKD normalization, which decomposes `é` into the base letter `e` plus a combining acute accent (U+0301), then removes all combining marks. `naïve` → `naive`, `crème brûlée` → `creme-brulee`. This matches WordPress's default behavior and what Hugo and Jekyll produce when configured for ASCII output (Hugo's `removePathAccents` setting, Jekyll's `latin` slugify mode); out of the box, Hugo and Jekyll keep accented letters. If you need accent-preserving slugs, the URL has to stay Unicode: pure-ASCII rules and accent preservation are mutually exclusive.
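The decompose-then-drop step can be sketched with Python's standard `unicodedata` module. This is an illustration of the technique, not the tool's actual source; the function names are hypothetical, and the whitespace-to-hyphen step is simplified to keep the focus on accents.

```python
import re
import unicodedata

def strip_accents(text: str) -> str:
    """Decompose with NFKD, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFKD", text)
    # combining() returns a non-zero class for marks such as U+0301.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def accent_free_slug(text: str) -> str:
    """Simplified: lowercase, strip accents, join words with hyphens."""
    return re.sub(r"\s+", "-", strip_accents(text).lower().strip())

print(strip_accents("café"))             # cafe
print(strip_accents("naïve"))            # naive
print(accent_free_slug("Crème Brûlée"))  # creme-brulee
```

Letters that are not a base letter plus a mark (for example `ø` or `ß`) have no such decomposition and pass through unchanged here, so a real slugifier needs a separate transliteration table for them.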

Question 3

How long should a slug be?

Accepted Answer

Aim for 3–5 meaningful words, with a soft cap of 60–75 characters. Google's SEO documentation does not give a hard limit but advises "short, descriptive". Search snippets truncate URLs visually at around 60 characters; logs and dashboards display the full slug fine, but a 200-character path looks spammy in social previews and is hard to share by voice. WordPress applies no cap short of the 200-character limit on its slug column; Hugo, Jekyll, and most static-site frameworks also accept long slugs but recommend keeping titles concise.
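One way to apply a soft cap without chopping a word in half is to cut at the last hyphen before the limit. The sketch below is an assumption about how such a cap could work, not a description of the tool's length option; `cap_slug` and its default of 60 are hypothetical.

```python
def cap_slug(slug: str, max_len: int = 60) -> str:
    """Shorten a hyphenated slug to at most max_len characters,
    preferring to cut at a word boundary."""
    if len(slug) <= max_len:
        return slug
    cut = slug[:max_len]
    # If the cut landed mid-word, back up to the previous hyphen.
    # A single over-long word with no hyphen is cut hard instead.
    if slug[max_len] != "-" and "-" in cut:
        cut = cut[:cut.rfind("-")]
    return cut.rstrip("-")

print(cap_slug("ten-ways-to-improve-your-seo", 15))  # ten-ways-to
print(cap_slug("ten-ways-to-improve-your-seo", 8))   # ten-ways
```

Cutting at a hyphen keeps every remaining word intact, at the cost of sometimes landing a few characters under the limit.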

Question 4

Why does Kanji come out as Pinyin instead of Japanese readings?

Accepted Answer

Mapping a kanji to its Japanese reading requires a dictionary lookup (`日` could be `nichi`, `hi`, `jitsu`, or part of a compound like `nihon`), and the right answer depends on context. Picking the reading takes a morphological analyzer such as kuromoji or MeCab, and embedding one would mean shipping megabytes of dictionary data. The tool instead falls back to per-character romanization over the Unicode CJK Unified Ideographs block, which yields a Pinyin-style form (`日` → `ri`). For Japanese-heavy titles, hand-write the romaji in the slug field or use a CMS plugin backed by a dictionary.
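The difference between a per-character fallback and a context-aware lookup can be shown with a toy example. Everything here is hypothetical: the two-entry tables stand in for the real data, and the function is not the tool's implementation.

```python
# Toy per-character table (Pinyin without tone marks). A real table
# would cover the whole CJK Unified Ideographs block.
PER_CHAR = {"日": "ri", "本": "ben"}

# Toy compound dictionary, the kind of data a morphological analyzer ships.
COMPOUNDS = {"日本": "nihon"}

def romanize_per_char(text: str) -> list[str]:
    """Context-free fallback: one syllable per ideograph."""
    return [PER_CHAR[ch] for ch in text if ch in PER_CHAR]

print(romanize_per_char("日本"))  # ['ri', 'ben']
print(COMPOUNDS["日本"])          # nihon
```

The per-character path needs only a flat table, but it cannot know that the two characters together form a word with its own reading.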

Question 5

Apostrophes — why does `don't` become `dont` instead of `don-t`?

Accepted Answer

Apostrophes are dropped without inserting a separator because the surrounding letters belong to one word, not two. `don-t-think` reads awkwardly and breaks word recognition; `dont-think` matches what readers expect. Most slug libraries do the same. If your style guide requires the apostrophe split for some reason, post-process the output with a single find-replace.
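The ordering matters: apostrophes have to be deleted before other punctuation is turned into hyphens. A minimal sketch follows, with a hypothetical function name and an ASCII-only word pattern. Handling the curly apostrophe (U+2019) alongside the straight one is an assumption about what a slugifier should do, not a confirmed feature of the tool.

```python
import re

# Straight apostrophe plus the typographic one (assumed; see note above).
APOSTROPHES = re.compile("['\u2019]")
NON_ALNUM = re.compile(r"[^a-z0-9]+")

def slug_without_apostrophe_split(text: str) -> str:
    text = text.lower()
    # Step 1: delete apostrophes so "don't" stays one word.
    text = APOSTROPHES.sub("", text)
    # Step 2: everything else that is not a letter or digit becomes a separator.
    return NON_ALNUM.sub("-", text).strip("-")

print(slug_without_apostrophe_split("Don't think twice"))  # dont-think-twice
print(slug_without_apostrophe_split("Rock & Roll"))        # rock-roll
```

Swapping the two steps would produce `don-t-think-twice`, because the apostrophe would be treated like any other punctuation.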

Question 6

Can I add stop-word removal (`the`, `a`, `is`, …)?

Accepted Answer

Not built in: this tool keeps the result close to the input. Stop-word removal is opinionated (which words count?) and language-specific, and shortening "10 ways to improve your SEO" to "10-ways-improve-your-seo" saves three characters at the cost of slightly worse readability. Current SEO guidance generally favors leaving short stop words in the slug. If you really want to strip them, run the output through a quick sed or hand-edit; the tool does no harm by preserving them.
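If you do want to strip stop words, a post-processing pass over the finished slug is enough. The sketch below is one possible approach; the word list is a small hypothetical English sample, which is exactly the opinionated part you would need to tune.

```python
# Hypothetical, deliberately short English list; adjust to taste.
STOP_WORDS = frozenset({"a", "an", "the", "is", "to", "of"})

def drop_stop_words(slug: str, stop_words: frozenset = STOP_WORDS) -> str:
    """Remove stop words from an already-generated hyphenated slug."""
    kept = [w for w in slug.split("-") if w and w not in stop_words]
    # If every word was a stop word, keep the original rather than
    # return an empty slug.
    return "-".join(kept) if kept else slug

print(drop_stop_words("10-ways-to-improve-your-seo"))  # 10-ways-improve-your-seo
print(drop_stop_words("the-a"))                        # the-a
```

Working on whole hyphen-separated words avoids the classic regex mistake of deleting `the` from inside `theory`.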

URL Slug Generator

How to use

Examples

Korean title → romanized ASCII slug

Japanese mixed title → Hepburn slug

Latin title with diacritics, length cap

Related concepts

Related articles

Percent-encoding: reserved characters and the double-encoding bug

Slugifying URLs: Unicode, diacritics, and collisions
