# Letter frequency analyzer

World's simplest crypto tool
This browser-based utility analyzes the frequency of occurrence of letters in the given plaintext or ciphertext and prints the letter counts, percentages, and totals in the output. It can also analyze the distribution of bigrams (pairs of letters), trigrams (triplets of letters), and n-grams (n consecutive letters). Created by cryptonerds from team Browserling.

## What is a letter frequency analyzer?

This online program performs letter frequency analysis and displays the letter distribution of any text. Frequency analysis is a cryptanalysis technique that counts letters or groups of letters in an encrypted text to aid in breaking it. It assumes that in any sufficiently large text written in a given language, each letter occurs with roughly the same relative frequency. For example, in most English texts, the letter "e" appears most often (12.7%), followed by the letters "t" (9.06%), "a" (8.17%), and "o" (7.5%). The least common letters in English are "x" (0.15%), "q" (0.095%), and "z" (0.074%). In Finnish, the dominant letters are "a" (12.2%), "i" (10.8%), and "n" (8.8%), and in Dutch, the major letters are "e" (18.9%), "n" (10%), and "a" (7.5%).

Frequency analysis can also find distributions of pairs of letters (bigrams), triplets of letters (trigrams), and longer letter groups (n-grams). With this statistical information, it's often possible to identify the language a text is written in. For example, it's been calculated that the most repeated bigrams in the English language are "th" (1.52%), "he" (1.25%), and "in" (0.94%). You can specify the length of a letter group in the options and examine the n-gram distribution of any language.

When generating bigrams, trigrams, or longer n-grams, you have three choices that control how they are generated. The "Join Words Together" option removes all spacing and special symbols between words and then splits the result into n-grams. For example, if you are generating bigrams from the input "hi world", the string is joined together as "hiworld" and split into the bigrams "hi, iw, wo, or, rl, ld". If the "Stop at Word Boundary" option is selected, n-grams are generated within each word separately and never span a non-word character. If the input is "hi world" and you're generating bigrams, the output is "hi, wo, or, rl, ld". In this mode, all words with fewer than n characters are dropped. With the same input "hi world" and trigrams, the word "hi" is dropped and you get "wor, orl, rld". The last option, "Include Word Boundaries", converts spaces and special symbols to the "·" symbol in the generated n-gram list. For example, if the input is "hi world" and bigrams are being computed, the output is "hi, i·, ·w, wo, or, rl, ld". If there are multiple special symbols in a row, they are collapsed and only one "·" appears in the output. For example, trigrams with the input "hello (world)" produce "hel, ell, llo, lo·, o·w, ·wo, wor, orl, rld, ld·".

Sometimes, you need to analyze more symbols than just letters. We thought of that and added the "Analyze All Symbols" option. When it's selected, you get the statistics of numbers, punctuation marks, and any other glyphs that are used in the text. By default, the program displays only the number of occurrences of letters (or groups of letters), but you can also display percentages or totals next to the counts. You can also rearrange the output of the analysis and sort the counts by frequency (highest to lowest) or alphabetically (letters from a to z). By default, the frequency analysis algorithm ignores the case of letters and treats the letters "A" and "a" as the same. If you require case-sensitive analysis, you can turn off the ignore-case option and count the frequency of the letters "A" and "a" separately. Cryptabulous!
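The three n-gram generation modes described above can be sketched in Python as follows. This is only an illustration of the described behavior, not the tool's actual code: the function name `ngrams`, the mode names `"join"`, `"stop"`, and `"boundary"`, and the choice to treat every non-letter character as a separator are all assumptions.

```python
import re

BOUNDARY = "·"  # symbol used to mark a word boundary in the output

def ngrams(text: str, n: int, mode: str = "join") -> list[str]:
    """Split text into letter n-grams (hypothetical helper).

    mode="join":     remove all non-letters, then slide a window of size n
    mode="stop":     slide the window inside each word separately
    mode="boundary": replace each run of non-letters with one "·" first
    """
    text = text.lower()
    # [^\W\d_]+ matches runs of letters only (no digits, no underscore).
    words = re.findall(r"[^\W\d_]+", text)
    if mode == "join":
        chunks = ["".join(words)]
    elif mode == "stop":
        chunks = words
    elif mode == "boundary":
        # A run of several separators collapses into a single marker.
        chunks = [re.sub(r"[\W\d_]+", BOUNDARY, text)]
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Chunks shorter than n yield an empty range and are dropped.
    return [c[i:i + n] for c in chunks for i in range(len(c) - n + 1)]

print(ngrams("hi world", 2, "join"))      # ['hi', 'iw', 'wo', 'or', 'rl', 'ld']
print(ngrams("hi world", 2, "stop"))      # ['hi', 'wo', 'or', 'rl', 'ld']
print(ngrams("hi world", 3, "stop"))      # ['wor', 'orl', 'rld']
print(ngrams("hi world", 2, "boundary"))  # ['hi', 'i·', '·w', 'wo', 'or', 'rl', 'ld']
```

How the real tool treats separators at the very start or end of the input is not stated, so the sketch simply keeps whatever marker the substitution produces there.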

## Letter frequency analyzer examples

Counting English Letters

In this example, we use the frequency analysis method to count the letters in a text fragment about Cubism. We select the option to sort letters alphabetically and display them together with their frequency counts. According to empirical statistical data, the most common letters of the English language are "e" (12.7%), "t" (9.1%), and "a" (8.2%). After running the frequency analysis algorithm on this text, we find the following statistics: "e" appears 35 times, which is 13.1%, "t" appears 33 times, which is 12.3%, "i" appears 20 times, which is 7.5%, and "a" appears 19 times, which is 7.1%. The counted letter data is fairly close to the typical letter distribution. It differs by a few percentage points in places because the empirical statistics are calculated from a large corpus of English text, while here we have just a small fragment.

Cubism is an artistic movement that emerged during the early 20th century. In Cubist artwork, objects are analyzed, broken up, and reassembled in an abstracted form. Instead of depicting objects from a single viewpoint, the artist depicts the subject from a multitude of viewpoints to represent the subject in a greater context.

a: 19
b: 9
c: 12
d: 10
e: 35
f: 5
g: 5
h: 6
i: 20
j: 4
k: 2
l: 5
m: 9
n: 18
o: 14
p: 6
r: 19
s: 17
t: 33
u: 9
v: 3
w: 3
x: 1
y: 3
z: 1

**Required options**

Get distribution of letters only.

Number of letters or symbols in a group.

Consider uppercase and lowercase letters equal.
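The counting step of this example can be approximated in a few lines of Python. This is a minimal sketch, not the tool's implementation; the function name `letter_frequencies` and its return shape (a dictionary mapping each letter to a count and a percentage) are assumptions made for illustration.

```python
from collections import Counter

def letter_frequencies(text: str, ignore_case: bool = True) -> dict[str, tuple[int, float]]:
    """Count letters and return {letter: (count, percent)} sorted alphabetically."""
    if ignore_case:
        text = text.lower()
    # Keep letters only; digits, spaces, and punctuation are skipped.
    counts = Counter(ch for ch in text if ch.isalpha())
    total = sum(counts.values())
    return {
        letter: (count, round(100 * count / total, 2))
        for letter, count in sorted(counts.items())
    }

for letter, (count, percent) in letter_frequencies("Hello").items():
    print(f"{letter}: {count} ({percent}%)")
# e: 1 (20.0%)
# h: 1 (20.0%)
# l: 2 (40.0%)
# o: 1 (20.0%)
```

Passing `ignore_case=False` counts "A" and "a" separately, which mirrors turning off the ignore-case option.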

Frequency Analysis of Bigrams

In this example, we analyze the occurrences of letter pairs (letter bigrams) in the text. To do this, we set the length of the letter group to two. We also choose the "Stop at Word Boundary" bigram generation mode, which means that the words are kept separate and the last letter of the current word is not joined with the first letter of the next word. As bigrams always have two letters, single-letter words, such as "a" or "i", are skipped altogether. When we finish counting bigrams, we display their counts as fractions of the total count. The bigrams in the output are not sorted and are left in the order in which they first appear in the text. After the analysis, we can check the results and see that the most frequent letter pair in this text is "th" (10 occurrences), followed by "he" and "at" (8 occurrences each). The top positions of "th" and "he" agree with statistical studies of the English language.

"Gatsby believed in the green light, the orgastic future that year by year recedes before us. It eluded us then, but that’s no matter — tomorrow we will run faster, stretch out our arms farther… And then one fine morning — So we beat on, boats against the current, borne back ceaselessly into the past." — F. Scott Fitzgerald, The Great Gatsby

ga: 4 (4/206)
at: 8 (8/206)
ts: 3 (3/206)
sb: 2 (2/206)
by: 3 (3/206)
be: 3 (3/206)
el: 3 (3/206)
li: 2 (2/206)
ie: 1 (1/206)
ev: 1 (1/206)
ve: 1 (1/206)
ed: 3 (3/206)
in: 5 (5/206)
th: 10 (10/206)
he: 8 (8/206)
gr: 2 (2/206)
re: 7 (7/206)
ee: 1 (1/206)
en: 4 (4/206)
ig: 1 (1/206)
gh: 1 (1/206)
ht: 1 (1/206)
or: 5 (5/206)
rg: 1 (1/206)
as: 4 (4/206)
st: 5 (5/206)
ti: 1 (1/206)
ic: 1 (1/206)
fu: 1 (1/206)
ut: 3 (3/206)
tu: 1 (1/206)
ur: 3 (3/206)
ha: 2 (2/206)
ye: 2 (2/206)
ea: 5 (5/206)
ar: 4 (4/206)
ec: 1 (1/206)
ce: 2 (2/206)
de: 2 (2/206)
es: 2 (2/206)
ef: 1 (1/206)
fo: 1 (1/206)
us: 2 (2/206)
it: 2 (2/206)
lu: 1 (1/206)
ud: 1 (1/206)
bu: 1 (1/206)
no: 1 (1/206)
ma: 1 (1/206)
tt: 2 (2/206)
te: 2 (2/206)
er: 4 (4/206)
to: 2 (2/206)
om: 1 (1/206)
mo: 2 (2/206)
rr: 2 (2/206)
ro: 1 (1/206)
ow: 1 (1/206)
we: 2 (2/206)
wi: 1 (1/206)
il: 1 (1/206)
ll: 1 (1/206)
ru: 1 (1/206)
un: 1 (1/206)
fa: 2 (2/206)
tr: 1 (1/206)
et: 1 (1/206)
tc: 1 (1/206)
ch: 1 (1/206)
ou: 2 (2/206)
rm: 1 (1/206)
ms: 1 (1/206)
rt: 1 (1/206)
an: 1 (1/206)
nd: 1 (1/206)
on: 2 (2/206)
ne: 3 (3/206)
fi: 2 (2/206)
rn: 2 (2/206)
ni: 1 (1/206)
ng: 1 (1/206)
so: 1 (1/206)
bo: 2 (2/206)
oa: 1 (1/206)
ag: 1 (1/206)
ai: 1 (1/206)
ns: 1 (1/206)
cu: 1 (1/206)
nt: 2 (2/206)
ba: 1 (1/206)
ac: 1 (1/206)
ck: 1 (1/206)
se: 1 (1/206)
le: 1 (1/206)
ss: 1 (1/206)
sl: 1 (1/206)
ly: 1 (1/206)
pa: 1 (1/206)
sc: 1 (1/206)
co: 1 (1/206)
ot: 1 (1/206)
tz: 1 (1/206)
zg: 1 (1/206)
ge: 1 (1/206)
ra: 1 (1/206)
al: 1 (1/206)
ld: 1 (1/206)

**Required options**

Get distribution of letters only.

Number of letters or symbols in a group.

Don't glue characters of different words together in groups.

Consider uppercase and lowercase letters equal.

Spanish Alphabet Distribution

In this example, we check how the statistical analysis of letters works for other languages and load Spanish text instead of English text. In the Spanish language, the letter distribution is slightly different from English, and the three most used letters are "e" (12.2%), "a" (11.5%), and "o" (8.7%). To make it easier to compare the results, we display the letters together with frequency percentages and sort them from the highest percentage to the lowest. In the output, we get the following statistics: 1st place is the letter "e" (13.05%), 2nd place is the letter "a" (10.59%), and 3rd place is the letter "o" (9.85%). These stats strongly suggest that the text is written in Spanish. Quick note: if we had used letter bigrams, they would likely confirm the input language even more strongly.

Saturno
Hablando de Saturno, es el que sigue en tamaño a Júpiter, más o menos 9 Tierras son las que se necesitarían para medirlo. Aunque tan grandulón, es el más ligero de todos los planetas. Los años de Saturno corresponden a 29.5 años en la Tierra. Saturno fue visto por Galileo en 1610. Y es el planeta que por su inclinación en el ecuador tiene 4 estaciones igual que nosotros acá en la Tierra. Los anillos de Saturno están compuestos por roca y hielo. Tienen aproximadamente 150.000 millas de diámetro y son muy delgados.

e: 53 (13.05%)
a: 43 (10.59%)
o: 40 (9.85%)
s: 38 (9.36%)
n: 35 (8.62%)
r: 27 (6.65%)
l: 26 (6.4%)
t: 24 (5.91%)
i: 22 (5.42%)
u: 19 (4.68%)
d: 15 (3.69%)
m: 11 (2.71%)
p: 10 (2.46%)
c: 9 (2.22%)
g: 6 (1.48%)
q: 5 (1.23%)
á: 5 (1.23%)
y: 4 (0.99%)
ñ: 3 (0.74%)
h: 2 (0.49%)
ó: 2 (0.49%)
b: 1 (0.25%)
j: 1 (0.25%)
ú: 1 (0.25%)
í: 1 (0.25%)
f: 1 (0.25%)
v: 1 (0.25%)
x: 1 (0.25%)

**Required options**

Get distribution of letters only.

Number of letters or symbols in a group.

Consider uppercase and lowercase letters equal.
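The comparison made in this example can be sketched as a small scoring routine. The reference percentages below are the ones quoted on this page; everything else is an assumption for illustration, including the name `rank_languages` and the scoring rule (the sum of absolute differences over each language's three most common letters). It is a crude heuristic, not a reliable language detector.

```python
# Top-three letter frequencies (in percent) as quoted in the text above.
REFERENCE_TOP3 = {
    "English": {"e": 12.7, "t": 9.06, "a": 8.17},
    "Spanish": {"e": 12.2, "a": 11.5, "o": 8.7},
    "Finnish": {"a": 12.2, "i": 10.8, "n": 8.8},
    "Dutch": {"e": 18.9, "n": 10.0, "a": 7.5},
}

def rank_languages(observed: dict[str, float]) -> list[tuple[str, float]]:
    """Rank languages by distance between observed and reference percentages.

    A lower score means a closer match. Letters missing from `observed`
    are treated as 0%.
    """
    scores = {
        language: sum(abs(observed.get(letter, 0.0) - pct)
                      for letter, pct in reference.items())
        for language, reference in REFERENCE_TOP3.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])

# Percentages taken from the output of the Spanish example.
observed = {"e": 13.05, "a": 10.59, "o": 9.85, "t": 5.91, "i": 5.42, "n": 8.62}
ranking = rank_languages(observed)
print(ranking[0][0])  # Spanish
```

With only three reference letters per language the margin between candidates can be small, which is why bigram statistics are usually a stronger signal.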

Cracking a Cryptogram

In this example, we load an encrypted message (a cryptogram) in the input field and use our frequency analysis program to try to decode it. We select the analysis mode that calculates the statistics for all characters (not just letters but also numbers and punctuation symbols), make the analysis case-sensitive (by deactivating the ignore-case option), and select the display-percentages output mode. The four most frequent symbols (excluding the space) are "6", "E", "2", and "@". If we assume that the plaintext message was written in English and compare these symbols with the most frequent letters of the English language, we get "6" = "e", "E" = "t", "2" = "a", and "@" = "o". After a little bit of pondering and looking at the ASCII code points of these symbols, we notice that each one is shifted by 47 positions in the ASCII character table. That gives us a clue that the text could be encoded with the ROT47 cipher. Puzzle solved! Use the ROT47 substitution cipher to decrypt the secret message.

E96 D64C6E 4@56 :D Ighf 7@==@H65 3J Ifb`]
2AA=J E9:D 4@56 EH:46]
2==@H65 6?ECJ E:>6i e2> E@ f2> FE4]
2=A92 D:6CC2 E2?8@

⎵: 18 (15.13%)
6: 11 (9.24%)
E: 9 (7.56%)
2: 8 (6.72%)
@: 7 (5.88%)
=: 6 (5.04%)
4: 5 (4.2%)
:: 5 (4.2%)
D: 4 (3.36%)
C: 4 (3.36%)
5: 4 (3.36%)
9: 3 (2.52%)
f: 3 (2.52%)
H: 3 (2.52%)
J: 3 (2.52%)
]: 3 (2.52%)
↵: 3 (2.52%)
A: 3 (2.52%)
>: 3 (2.52%)
I: 2 (1.68%)
?: 2 (1.68%)
g: 1 (0.84%)
h: 1 (0.84%)
7: 1 (0.84%)
3: 1 (0.84%)
b: 1 (0.84%)
`: 1 (0.84%)
i: 1 (0.84%)
e: 1 (0.84%)
F: 1 (0.84%)
8: 1 (0.84%)

**Required options**

Get distribution of all symbols.

Number of letters or symbols in a group.

Consider uppercase and lowercase letters equal.
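To confirm the ROT47 guess, the cipher itself is easy to sketch in Python. ROT47 rotates the 94 printable ASCII characters from "!" (code 33) to "~" (code 126) by 47 positions, so applying it twice returns the original text. The function name `rot47` is just a label chosen for this illustration.

```python
def rot47(text: str) -> str:
    """Rotate printable ASCII characters (codes 33..126) by 47 positions."""
    result = []
    for ch in text:
        code = ord(ch)
        if 33 <= code <= 126:
            # 94 printable characters, so shifting by 47 is its own inverse.
            result.append(chr(33 + (code - 33 + 47) % 94))
        else:
            # Spaces, newlines, and non-ASCII characters are left as-is.
            result.append(ch)
    return "".join(result)

print(rot47("E96 D64C6E 4@56"))     # the secret code
print(rot47("2=A92 D:6CC2 E2?8@"))  # alpha sierra tango
```

The mapping agrees with the frequency-based guesses: "6" becomes "e", "E" becomes "t", "2" becomes "a", and "@" becomes "o".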

Analyze Birth Years

In this example, we paste a list of people and their birth years in the input and use the "Analyze All Symbols" option to find the most popular years they were born in. We set the character group length to 4 (a fourgram), which equals the number of digits in a year. When forming fourgrams, we stop at the end of each word, so that people's names and birth years are analyzed separately. In the resulting statistics, we find that the year 1995 appears most often (4 times), followed by the years 1996 (3 times), 2000 (3 times), and 1985 (2 times).

Oliver 1995
Bella 1996
Billy 1994
Jorge 1990
Helen 1995
Lyla 2000
Adam 1985
Karla 1995
Lukas 1992
Cali 1995
Hugo 2000
Mira 1996
Kate 1985
Axel 2000
Ariel 1996

1995: 4
1996: 3
2000: 3
1985: 2
oliv: 1
live: 1
iver: 1
bell: 1
ella: 1
bill: 1
illy: 1
1994: 1
jorg: 1
orge: 1
1990: 1
hele: 1
elen: 1
lyla: 1
adam: 1
karl: 1
arla: 1
luka: 1
ukas: 1
1992: 1
cali: 1
hugo: 1
mira: 1
kate: 1
axel: 1
arie: 1
riel: 1

**Required options**

Get distribution of all symbols.

Number of letters or symbols in a group.

Don't glue characters of different words together in groups.

Consider uppercase and lowercase letters equal.
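A rough Python equivalent of this example is shown below. It is a sketch under stated assumptions rather than the tool's own algorithm: the name `count_word_ngrams` is hypothetical, a "word" is taken to be any run of letters, digits, or underscores, and ties are listed in the order they first appear.

```python
import re
from collections import Counter

def count_word_ngrams(text: str, n: int) -> list[tuple[str, int]]:
    """Count n-grams inside each word, sorted from most to least frequent."""
    counts = Counter()
    # \w+ keeps digits, so years are analyzed as well as names.
    for token in re.findall(r"\w+", text.lower()):
        for i in range(len(token) - n + 1):
            counts[token[i:i + n]] += 1
    # most_common() orders equal counts by first occurrence.
    return counts.most_common()

sample = "Oliver 1995\nBella 1996\nHelen 1995"
for gram, count in count_word_ngrams(sample, 4):
    print(f"{gram}: {count}")
# 1995: 2
# oliv: 1
# live: 1
# iver: 1
# bell: 1
# ella: 1
# 1996: 1
# hele: 1
# elen: 1
```

Because the window never crosses a word boundary, a six-letter name such as "Oliver" contributes three fourgrams while a four-digit year contributes exactly one.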

Pro tips
Master online crypto tools

You can pass input to this tool via the **?input** query argument and it will automatically compute the output. Here's how to type it in your browser's address bar:

https://onlinecryptotools.com/analyze-letter-frequency?input=Cubism%20is%20an%20artistic%20movement%20that%20emerged%20during%20the%20early%2020th%20century.%20In%20Cubist%20artwork%2C%20objects%20are%20analyzed%2C%20broken%20up%2C%20and%20reassembled%20in%20an%20abstracted%20form.%20Instead%20of%20depicting%20objects%20from%20a%20single%20viewpoint%2C%20the%20artist%20depicts%20the%20subject%20from%20a%20multitude%20of%20viewpoints%20to%20represent%20the%20subject%20in%20a%20greater%20context.&group-length=1&analyze-letters-only=true&type=print-count&sort=sort-by-alphabet&ignore-case=true

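If you want to generate such links from a script, the query string can be assembled with Python's standard library. The parameter names and values below are copied from the example link above; the helper name `build_tool_url` is hypothetical, and whether the tool accepts other values for these parameters is not confirmed here.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://onlinecryptotools.com/analyze-letter-frequency"

def build_tool_url(text: str) -> str:
    """Build a link that preloads the tool with `text` (illustrative only)."""
    params = {
        "input": text,
        "group-length": 1,
        "analyze-letters-only": "true",
        "type": "print-count",
        "sort": "sort-by-alphabet",
        "ignore-case": "true",
    }
    # quote_via=quote encodes spaces as %20 (the default would use "+").
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"

print(build_tool_url("hi world"))
```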
All crypto tools

Quickly encrypt data using the ROT13 substitution cipher.

Quickly decrypt data that was encrypted with ROT13.

Quickly encrypt data using the ROT47 substitution cipher.

Quickly undo ROT47 encryption and find the original message.

Quickly count letters in the given text and print their distribution.

Quickly count words in the given text and print their distribution.

Coming soon
These crypto tools are on the way

ROT-2 Encrypt/Decrypt

Encrypt and decrypt numbers using the ROT2 cipher.

ROT-5 Encrypt/Decrypt

Encrypt and decrypt numbers using the ROT5 cipher.

ROT-n Encrypt/Decrypt

Encrypt and decrypt numbers using a custom ROT cipher.

Base26 Encode/Decode

Encode and decode data to/from base26 encoding.

Base32 Encode/Decode

Encode and decode data to/from base32 encoding.

Base58 Encode/Decode

Encode and decode data to/from base58 encoding.

Base62 Encode/Decode

Encode and decode data to/from base62 encoding.

Base64 Encode/Decode

Encode and decode data to/from base64 encoding.

ASCII85 Encode/Decode

Encode and decode data to/from ASCII85 encoding.

XOR Encrypt/Decrypt

Encrypt and decrypt data using the XOR algorithm.

AES Encrypt/Decrypt

Encrypt and decrypt data using the AES algorithm.

RC2 Encrypt/Decrypt

Encrypt and decrypt data using the RC2 algorithm.

RC4 Encrypt/Decrypt

Encrypt and decrypt data using the RC4 algorithm.

RC5 Encrypt/Decrypt

Encrypt and decrypt data using the RC5 algorithm.

RC6 Encrypt/Decrypt

Encrypt and decrypt data using the RC6 algorithm.

Akelarre Encrypt/Decrypt

Encrypt and decrypt data using the Akelarre algorithm.

Ake98 Encrypt/Decrypt

Encrypt and decrypt data using the Ake98 algorithm.

DES Encrypt/Decrypt

Encrypt and decrypt data using the DES algorithm.

Triple DES Encrypt/Decrypt

Encrypt and decrypt data using the 3DES algorithm.

Rabbit Encrypt/Decrypt

Encrypt and decrypt data using the Rabbit cipher.

Blowfish Encrypt/Decrypt

Encrypt and decrypt data using the Blowfish cipher.

Twofish Encrypt/Decrypt

Encrypt and decrypt data using the Twofish cipher.

KASUMI Encrypt/Decrypt

Encrypt and decrypt data using the KASUMI cipher.

Serpent Encrypt/Decrypt

Encrypt and decrypt data using the Serpent cipher.

Solitaire Encrypt/Decrypt

Encrypt and decrypt data using the Solitaire cipher.

Break a Substitution Cipher

Decode text encrypted with a simple substitution cipher.

Generate Random Numbers

Print a list of random numbers.

Generate Primes

Print a list of prime numbers.

Generate Twin Primes

Print a list of twin prime numbers.

Generate Cousin Primes

Print a list of cousin prime numbers.

Generate Prime Triplets

Print a list of prime triplets.

Generate Random Primes

Print a list of random prime numbers.

Check Prime Numbers

Check if the given numbers are prime numbers.

Create Random Passwords

Generate one or more random passwords.

Replace Message Alphabet

Rewrite the given message with a new alphabet.

Calculate Text Entropy

Compute the Shannon entropy of the given message.

Generate High Entropy Data

Create text with high Shannon entropy.

Generate Low Entropy Data

Create text with low Shannon entropy.

Analyze Random Data

Perform statistical analysis of random data.
