Overview
Alienizer is a character substitution tool: It replaces Latin letters with characters from other writing systems — depending on the settings, the changes may be barely noticeable or clearly visible.
What Happens
The Latin alphabet shares its forms with many other scripts. A Cyrillic "а" (U+0430) and a Latin "a" (U+0061) look identical, but they are two different characters from two different cultures. A Greek "ο" and an Armenian "օ" are very similar to the Latin "o"; a Georgian "ო" already looks somewhat different. And of course there are characters like the Devanagari म, the Ethiopic ም, or the Adlam 𞤃: all carry the same "m" sound, yet visually they occupy entirely different dimensions.
Alienizer exploits these so-called homoglyphs: The program searches the entire Unicode character space (over 150,000 encoded characters from more than 150 writing systems) for characters that resemble a Latin letter (or carry the same sound, more on that below), ranks them by degree of similarity to the original letter, and substitutes accordingly.
The substitution of Latin letters with visually similar equivalents is also the basis of the IDN homograph attack, in which deceptively authentic domain names are registered for phishing. Alienizer uses this same mechanism not for deception, but to call into question the apparent stability of the relationship between sound and sign, word and meaning — in an era increasingly shaped by machine-generated text.
What You Can Do With It
Transform and export text. Download the transformed text as a file or copy it to the clipboard — for use in other programs, as a print template, or for further processing.
Inspect the substituted characters. Hovering over a substituted character displays the Unicode name, the writing system, and the visual distance value — making the otherwise invisible inner workings of the transformation readable.
Create chains of transformations. The result of a transformation can be returned directly to the input field and run through the Alienizer again. With each pass, additional substitutable characters are captured; the text moves further from its origin step by step.
Glitch text. A transformed text can also be copied and pasted into another environment (a chat, a document, a website) and reused there, where its altered encoding produces further effects: failed text searches, unexpected language detection behavior, disruptions in automatic text processing.
Combine presets and custom settings. The presets (Subtle, Threshold, Drastic, Total) define clearly distinct levels of substitution; the sliders allow continuous control over substitution rate, visual distance, and script selection. Additionally, you can choose which writing systems to work with.
Alienize your own web. The extension (Chrome) or add-on (Mozilla Firefox) alienizes the text of any page displayed in the browser. No data is transmitted: the program runs entirely in the local browser.
Which Scripts Are Used and How
Alienizer currently supports 46 writing systems: European alphabets (Cyrillic, Greek, Armenian, Georgian), South Asian syllabic scripts (Devanagari, Bengali, Tamil, Malayalam), East Asian scripts (Hiragana, Katakana), Semitic alphabets (Arabic, Hebrew, Syriac), African scripts (Ethiopic, Vai, Bamum, N'Ko, Adlam, Meroitic), and others.
To compare visual similarity, the included characters were pre-rendered as 48×48-pixel bitmaps by a Python program and compared against all Latin letters. The smaller the pixel distance value, the more similar the character. This distance value ranges from 0.0 (pixel-identical) to 1.0 (completely different). The tolerance threshold can be freely adjusted on the website: narrow for characters that look like perfect copies; wider for characters that begin to reveal their foreign origin.
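The page does not say which pixel metric the Python program uses. The following is a minimal sketch under one assumption: bitmaps are binary, and the distance is the fraction of pixel positions at which two bitmaps differ. That assumption yields the stated range from 0.0 (pixel-identical) to 1.0 (completely different). The names `Bitmap` and `pixel_distance` and the toy glyphs are hypothetical; no font rendering is involved.

```python
from typing import List

Bitmap = List[List[int]]  # rows of pixels; 1 = ink, 0 = background

def pixel_distance(a: Bitmap, b: Bitmap) -> float:
    """Fraction of pixel positions at which two equally sized bitmaps differ.

    Assumed metric for illustration; the real analysis may use another one.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("bitmaps must have the same dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        raise ValueError("bitmaps must not be empty")
    differing = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return differing / total

SIZE = 48  # bitmap edge length named in the text

blank = [[0] * SIZE for _ in range(SIZE)]
full = [[1] * SIZE for _ in range(SIZE)]
# Toy glyphs: a vertical bar four pixels wide, and the same bar moved one pixel right.
bar = [[1 if 22 <= x < 26 else 0 for x in range(SIZE)] for _ in range(SIZE)]
shifted_bar = [[1 if 23 <= x < 27 else 0 for x in range(SIZE)] for _ in range(SIZE)]

print(pixel_distance(bar, bar))          # 0.0
print(pixel_distance(blank, full))       # 1.0
print(pixel_distance(bar, shifted_bar))  # small value: only two pixel columns differ
```

A tolerance threshold then amounts to keeping only those candidates whose distance is at or below the chosen value.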
Many scripts, however, look so different from the Latin alphabet that they share no obvious visual connection with it: Devanagari, Ethiopic, Arabic, Adlam, Vai, Meroitic, and those from East Asia. Their Unicode names nevertheless reveal a sound: "DEVANAGARI LETTER KA," "ETHIOPIC SYLLABLE MA," "ADLAM SMALL LETTER MIIM." This information is systematically evaluated: A Latin "m" can be replaced by a Devanagari म, a Hebrew מ, or an Adlam 𞤃. Because there is no longer any visual connection to the original letter, this phonetic substitution is only applied from a certain level of alienization onward ("Threshold" in the presets, and more strongly in "Drastic" and "Total"). In this way, the estrangement effect in Alienizer can be gradually increased.
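How a sound can be read off a Unicode name is sketched below. The heuristic is an assumption made for illustration, not the documented method: it takes the last word of the character name and uses its first letter as the Latin equivalent. The function name `sound_from_name` is hypothetical. A real evaluation would need many script-specific rules, since this shortcut misreads names that end in a qualifier rather than a sound.

```python
import unicodedata
from typing import Optional

def sound_from_name(ch: str) -> Optional[str]:
    """Guess a Latin letter from the last word of a character's Unicode name.

    Simplified heuristic for illustration only.
    """
    name = unicodedata.name(ch, "")
    if " LETTER " not in name and " SYLLABLE " not in name:
        return None  # digits, punctuation, symbols, unnamed codepoints
    first = name.split()[-1][0].lower()
    return first if "a" <= first <= "z" else None

samples = [
    "\u092e",      # Devanagari म
    "\u05de",      # Hebrew מ
    "\u121d",      # Ethiopic ም
    "\U0001e903",  # Adlam 𞤃
]
for ch in samples:
    print(unicodedata.name(ch), "->", sound_from_name(ch))
```

All four sample characters map to "m" under this heuristic, which matches the substitutions named in the text.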
Technical Background
Unicode is the universal standard for text-based digital communication: Every character it encodes, across the world's writing systems, receives a unique number, a so-called codepoint. The Latin "a" has the codepoint U+0061, the Cyrillic "а" U+0430; visually they are nearly indistinguishable, but to the computer they are two entirely different characters. Digital texts do not consist of images of letters but of sequences of such numbers. Alienizer replaces the codepoints of Latin characters with codepoints from other writing systems whose visual similarity has been systematically measured in advance: A Python program generated similarity tables that the JavaScript algorithm of this site uses at runtime. The changes persist when the text is copied, shared, saved, or processed by machines.
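The difference between what the eye sees and what the machine stores can be checked with Python's standard `unicodedata` module. This is a small illustration, not part of Alienizer itself; the helper name `describe` is made up for the example.

```python
import unicodedata

latin_a = "a"          # U+0061
cyrillic_a = "\u0430"  # U+0430, renders like the Latin letter

def describe(ch: str) -> str:
    """Return the codepoint and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

print(describe(latin_a))      # U+0061 LATIN SMALL LETTER A
print(describe(cyrillic_a))   # U+0430 CYRILLIC SMALL LETTER A
print(latin_a == cyrillic_a)  # False

# Swapping one codepoint is enough to defeat a plain substring search.
original = "alien"
swapped = original.replace(latin_a, cyrillic_a)
print("alien" in swapped)     # False
```

The last line shows why transformed text can fail ordinary text searches even though it looks unchanged on screen.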
Runs entirely in the browser; no data is transmitted
Built on 167,586 precomputed visual similarity pairs from 46 writing systems
Deterministic: Every transformation with the same seed is reproducible
Character inspector: Hovering over a substituted character displays the Unicode name, script, and distance value from the original
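How a seed, a substitution rate, and a distance threshold can interact is sketched below in Python. Everything in the sketch is an assumption: the function `alienize`, its parameters, the tiny `TABLE`, and the distance values are invented for illustration and are not taken from the project's JavaScript code or its tables.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical excerpt of a similarity table: Latin letter -> (candidate, distance).
# The distance values are invented for illustration.
TABLE: Dict[str, List[Tuple[str, float]]] = {
    "a": [("\u0430", 0.00)],                    # Cyrillic а
    "e": [("\u0435", 0.00)],                    # Cyrillic е
    "o": [("\u03bf", 0.01), ("\u10dd", 0.35)],  # Greek ο, Georgian ო
}

def alienize(text: str, rate: float, max_distance: float, seed: int) -> str:
    """Replace letters with table candidates, reproducibly for a given seed."""
    rng = random.Random(seed)  # private generator: same seed, same choices
    out = []
    for ch in text:
        # Keep only candidates within the tolerance threshold.
        candidates = [c for c, d in TABLE.get(ch, []) if d <= max_distance]
        if candidates and rng.random() < rate:
            out.append(rng.choice(candidates))
        else:
            out.append(ch)
    return "".join(out)

first = alienize("a season of code", rate=0.5, max_distance=0.1, seed=42)
second = alienize("a season of code", rate=0.5, max_distance=0.1, seed=42)
print(first == second)  # True: identical seed and settings give identical output
```

Using a dedicated `random.Random` instance instead of the module-level functions keeps the result independent of any other random calls in the program.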
The program's source code (freely available under a CC license) and documentation of the Python analysis, including the tables used by Alienizer, can be found at github.com/roloffsimon/alienizer.