
Letter frequency analyzer

World's simplest crypto tool
This browser-based utility analyzes the frequency of occurrence of letters in the given plaintext or ciphertext and prints the letter counts, percentages, and totals in the output. It can also analyze the distribution of bigrams (pairs of letters), trigrams (triplets of letters), and n-grams (n consecutive letters). Created by cryptonerds from team Browserling.
Analysis Mode and Length
Get distribution of letters only.
Get distribution of all symbols.
Number of letters or symbols in a group.
Enter "1" to analyze individual letters or symbols, enter "2" to analyze pairs of letters or symbols (bigrams), enter "3" to analyze triplets of letters or symbols (trigrams), etc.
Bigrams, Trigrams, N-grams
These options work only for letter or symbol group lengths 2 and up.
Glue characters of different words together in groups. (A word is anything separated by one or more spaces, tabs, or newlines.)
Don't glue characters of different words together in groups.
Mark the word boundary using a middle dot symbol.
Format, Sort, and Case
Select statistics output format.
Choose how to sort the output.
Consider uppercase and lowercase letters equal.

What is a letter frequency analyzer?

This online program performs letter frequency analysis and displays the letter distribution of any text. Frequency analysis is a cryptanalysis technique that counts letters or groups of letters in encrypted text to aid in breaking it. It assumes that in sufficiently large texts written in the same language, each letter occurs with a roughly constant, characteristic frequency. For example, in most English texts, the letter "e" appears most often (12.7%), followed by the letters "t" (9.06%), "a" (8.17%), and "o" (7.5%). The least common letters in English are "x" (0.15%), "q" (0.095%), and "z" (0.074%). In Finnish, the dominant letters are "a" (12.2%), "i" (10.8%), and "n" (8.8%), and in Dutch, the major letters are "e" (18.9%), "n" (10%), and "a" (7.5%).
Frequency analysis can also find distributions of pairs of letters (bigrams), triplets of letters (trigrams), and longer letter groups (n-grams). This statistical information often makes it possible to identify the language a text is written in. For example, it's been calculated that the most repeated bigrams in the English language are "th" (1.52%), "he" (1.25%), and "in" (0.94%). You can specify the length of a letter group in the options and examine the n-gram distribution of any language.
When generating bigrams, trigrams, or longer n-grams, you have three choices that control how they are generated. The "Join Words Together" option removes all spacing and special symbols between words and then splits the result into n-grams. For example, if you are generating bigrams for the input "hi world", the string is joined together as "hiworld" and split into the bigrams "hi, iw, wo, or, rl, ld". If the option "Stop at Word Boundary" is selected, then the n-gram generator stops at the first non-word character. If the input is the plaintext "hi world" and you're generating bigrams, then the output will be "hi, wo, or, rl, ld". In this mode, all words with fewer than n characters are dropped. With the same input "hi world" and trigrams, the word "hi" is dropped, and you get "wor, orl, rld". The last option, "Include Word Boundaries", converts spaces and special symbols in the generated n-gram list to the "·" symbol. For example, if the input is "hi world" and bigrams are being computed, then the output will be "hi, i·, ·w, wo, or, rl, ld". If there are multiple special symbols in a row, they are collapsed and only one "·" appears in the output. For example, trigrams with the input "hello (world)" will produce "hel, ell, llo, lo·, o·w, ·wo, wor, orl, rld, ld·".
Sometimes, you need to analyze more symbols than just letters. We thought of that and added the "Analyze All Symbols" option. When it's selected, you get the statistics of numbers, punctuation marks, and any other glyphs that are used in the text. By default, the program displays only the number of occurrences of letters (or groups of letters), but you can also display percentages or totals alongside the counts. You can also rearrange the output of the analysis and sort the counts by frequency (highest to lowest) or alphabetically (letters from a to z). By default, the frequency analysis algorithm ignores the case of letters and treats the letters "A" and "a" as the same. If you require case-sensitive analysis, you can turn off the ignore-case option and count the frequency of the letters "A" and "a" separately. Cryptabulous!
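The three n-gram generation modes described above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the tool's actual implementation: the function name `ngrams` and the mode names "join", "stop", and "boundary" are made up for this sketch, and it assumes letters-only analysis with case ignored.

```python
import re

def ngrams(text, n, mode="join"):
    """Split text into letter n-grams (hypothetical helper, not the tool's code).

    mode="join"     glue all words together, then split ("Join Words Together")
    mode="stop"     build n-grams inside each word only ("Stop at Word Boundary")
    mode="boundary" turn each run of non-letters into one "·" ("Include Word Boundaries")
    """
    text = text.lower()
    if mode == "join":
        # Keep letters only and treat the whole text as one long chunk.
        chunks = ["".join(re.findall(r"[^\W\d_]", text))]
    elif mode == "stop":
        # Each run of letters is its own chunk; short words yield no n-grams.
        chunks = re.findall(r"[^\W\d_]+", text)
    elif mode == "boundary":
        # Collapse every run of non-letter characters into a single middle dot.
        chunks = [re.sub(r"[\W\d_]+", "·", text)]
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Slide a window of width n over every chunk.
    return [c[i:i + n] for c in chunks for i in range(len(c) - n + 1)]

print(ngrams("hi world", 2, "join"))      # ['hi', 'iw', 'wo', 'or', 'rl', 'ld']
print(ngrams("hi world", 3, "stop"))      # ['wor', 'orl', 'rld']
print(ngrams("hi world", 2, "boundary"))  # ['hi', 'i·', '·w', 'wo', 'or', 'rl', 'ld']
```

Counting the resulting list (for example with `collections.Counter`) gives the n-gram distribution.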

Letter frequency analyzer examples

Counting English Letters
In this example, we use the frequency analysis method to count the letters in a text fragment about Cubism. We select the option to sort letters alphabetically and display them together with the frequency counts. According to empirical statistical data, the most common letters of the English language are "e" (12.7%), "t" (9.1%), and "a" (8.2%). After running the frequency analysis algorithm on this text, we find the following statistics: "e" appears 35 times (13.1% of the 268 letters), "t" appears 33 times (12.3%), "i" appears 20 times (7.5%), and "a" appears 19 times (7.1%). The counted letter data is fairly close to the typical letter distribution. It differs by a few percentage points for some letters because the empirical statistics are calculated from a large corpus of English text, whereas here we have just a small fragment.
Cubism is an artistic movement that emerged during the early 20th century. In Cubist artwork, objects are analyzed, broken up, and reassembled in an abstracted form. Instead of depicting objects from a single viewpoint, the artist depicts the subject from a multitude of viewpoints to represent the subject in a greater context.
a: 19 b: 9 c: 12 d: 10 e: 35 f: 5 g: 5 h: 6 i: 20 j: 4 k: 2 l: 5 m: 9 n: 18 o: 14 p: 6 r: 19 s: 17 t: 33 u: 9 v: 3 w: 3 x: 1 y: 3 z: 1
Required options
These options will be used automatically if you select this example.
Get distribution of letters only.
Number of letters or symbols in a group.
Select statistics output format.
Choose how to sort the output.
Consider uppercase and lowercase letters equal.
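The letter counting in the example above can be approximated with Python's standard library. This is a minimal sketch under the assumption that "letters only" means characters for which `str.isalpha()` is true. The function name `letter_frequencies` and its `ignore_case` parameter are illustrative and are not part of the tool.

```python
from collections import Counter

def letter_frequencies(text, ignore_case=True):
    """Return {letter: (count, percent)} sorted alphabetically (illustrative helper)."""
    if ignore_case:
        text = text.lower()
    # Count letters only; digits, spaces, and punctuation are skipped.
    counts = Counter(ch for ch in text if ch.isalpha())
    total = sum(counts.values())
    return {ch: (n, round(100 * n / total, 2)) for ch, n in sorted(counts.items())}

for letter, (count, percent) in letter_frequencies("Cubism is an artistic movement").items():
    print(f"{letter}: {count} ({percent}%)")
```

With a short input like this, the percentages can drift far from the typical English distribution, which is the same small-sample effect the example describes.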
Frequency Analysis of Bigrams
In this example, we analyze the occurrences of letter pairs (letter bigrams) in the text. To do this, we set the length of the letter group to two. We also choose the "Stop at Word Boundary" bigram generation mode, which means that words are kept separate and the last letter of one word is not joined with the first letter of the next word. As bigrams always have two letters, single-letter words such as "a" or "i" are skipped altogether. When we finish counting bigrams, we display their counts as fractions of the total count. The output is not sorted; bigrams are listed in the order in which they first appear in the text. After the analysis, we can check the results and see that the most frequent letter pair is "th" (10 occurrences), followed by "he" and "at" (8 occurrences each). The top pairs "th" and "he" are consistent with statistical studies of the English language.
"Gatsby believed in the green light, the orgastic future that year by year recedes before us. It eluded us then, but that’s no matter — tomorrow we will run faster, stretch out our arms farther… And then one fine morning — So we beat on, boats against the current, borne back ceaselessly into the past." — F. Scott Fitzgerald, The Great Gatsby
ga: 4 (4/206) at: 8 (8/206) ts: 3 (3/206) sb: 2 (2/206) by: 3 (3/206) be: 3 (3/206) el: 3 (3/206) li: 2 (2/206) ie: 1 (1/206) ev: 1 (1/206) ve: 1 (1/206) ed: 3 (3/206) in: 5 (5/206) th: 10 (10/206) he: 8 (8/206) gr: 2 (2/206) re: 7 (7/206) ee: 1 (1/206) en: 4 (4/206) ig: 1 (1/206) gh: 1 (1/206) ht: 1 (1/206) or: 5 (5/206) rg: 1 (1/206) as: 4 (4/206) st: 5 (5/206) ti: 1 (1/206) ic: 1 (1/206) fu: 1 (1/206) ut: 3 (3/206) tu: 1 (1/206) ur: 3 (3/206) ha: 2 (2/206) ye: 2 (2/206) ea: 5 (5/206) ar: 4 (4/206) ec: 1 (1/206) ce: 2 (2/206) de: 2 (2/206) es: 2 (2/206) ef: 1 (1/206) fo: 1 (1/206) us: 2 (2/206) it: 2 (2/206) lu: 1 (1/206) ud: 1 (1/206) bu: 1 (1/206) no: 1 (1/206) ma: 1 (1/206) tt: 2 (2/206) te: 2 (2/206) er: 4 (4/206) to: 2 (2/206) om: 1 (1/206) mo: 2 (2/206) rr: 2 (2/206) ro: 1 (1/206) ow: 1 (1/206) we: 2 (2/206) wi: 1 (1/206) il: 1 (1/206) ll: 1 (1/206) ru: 1 (1/206) un: 1 (1/206) fa: 2 (2/206) tr: 1 (1/206) et: 1 (1/206) tc: 1 (1/206) ch: 1 (1/206) ou: 2 (2/206) rm: 1 (1/206) ms: 1 (1/206) rt: 1 (1/206) an: 1 (1/206) nd: 1 (1/206) on: 2 (2/206) ne: 3 (3/206) fi: 2 (2/206) rn: 2 (2/206) ni: 1 (1/206) ng: 1 (1/206) so: 1 (1/206) bo: 2 (2/206) oa: 1 (1/206) ag: 1 (1/206) ai: 1 (1/206) ns: 1 (1/206) cu: 1 (1/206) nt: 2 (2/206) ba: 1 (1/206) ac: 1 (1/206) ck: 1 (1/206) se: 1 (1/206) le: 1 (1/206) ss: 1 (1/206) sl: 1 (1/206) ly: 1 (1/206) pa: 1 (1/206) sc: 1 (1/206) co: 1 (1/206) ot: 1 (1/206) tz: 1 (1/206) zg: 1 (1/206) ge: 1 (1/206) ra: 1 (1/206) al: 1 (1/206) ld: 1 (1/206)
Required options
These options will be used automatically if you select this example.
Get distribution of letters only.
Number of letters or symbols in a group.
Don't glue characters of different words together in groups.
Select statistics output format.
Choose how to sort the output.
Consider uppercase and lowercase letters equal.
Spanish Alphabet Distribution
In this example, we check how the statistical analysis of letters works for other languages by loading Spanish text instead of English text. In Spanish, the letter distribution is slightly different from English, and the three most used letters are "e" (12.2%), "a" (11.5%), and "o" (8.7%). To make it easier to compare the results, we display the letters together with frequency percentages and sort them from the highest percentage to the lowest. In the output, we get the following statistics: 1st place is the letter "e" (13.05%), 2nd place is the letter "a" (10.59%), and 3rd place is the letter "o" (9.85%). These stats strongly suggest that the text is written in Spanish. Quick note: letter bigrams would likely confirm the input language with even greater confidence.
Saturno Hablando de Saturno, es el que sigue en tamaño a Júpiter, más o menos 9 Tierras son las que se necesitarían para medirlo. Aunque tan grandulón, es el más ligero de todos los planetas. Los años de Saturno corresponden a 29.5 años en la Tierra. Saturno fue visto por Galileo en 1610. Y es el planeta que por su inclinación en el ecuador tiene 4 estaciones igual que nosotros acá en la Tierra. Los anillos de Saturno están compuestos por roca y hielo. Tienen aproximadamente 150.000 millas de diámetro y son muy delgados.
e: 53 (13.05%) a: 43 (10.59%) o: 40 (9.85%) s: 38 (9.36%) n: 35 (8.62%) r: 27 (6.65%) l: 26 (6.4%) t: 24 (5.91%) i: 22 (5.42%) u: 19 (4.68%) d: 15 (3.69%) m: 11 (2.71%) p: 10 (2.46%) c: 9 (2.22%) g: 6 (1.48%) q: 5 (1.23%) á: 5 (1.23%) y: 4 (0.99%) ñ: 3 (0.74%) h: 2 (0.49%) ó: 2 (0.49%) b: 1 (0.25%) j: 1 (0.25%) ú: 1 (0.25%) í: 1 (0.25%) f: 1 (0.25%) v: 1 (0.25%) x: 1 (0.25%)
Required options
These options will be used automatically if you select this example.
Get distribution of letters only.
Number of letters or symbols in a group.
Select statistics output format.
Choose how to sort the output.
Consider uppercase and lowercase letters equal.
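The comparison made in the Spanish example can be turned into a rough language guesser. The sketch below is a crude heuristic, not something the tool does: it scores the observed percentages against the partial reference profiles quoted on this page (only the top letters of each language) using the mean absolute difference. The names `PROFILES`, `mean_abs_diff`, and `guess_language` are hypothetical.

```python
# Reference percentages quoted on this page; only the top letters are listed,
# so these profiles are deliberately incomplete.
PROFILES = {
    "English": {"e": 12.7, "t": 9.06, "a": 8.17, "o": 7.5},
    "Spanish": {"e": 12.2, "a": 11.5, "o": 8.7},
    "Finnish": {"a": 12.2, "i": 10.8, "n": 8.8},
    "Dutch": {"e": 18.9, "n": 10.0, "a": 7.5},
}

def mean_abs_diff(observed, profile):
    """Average distance between observed and reference percentages."""
    return sum(abs(observed.get(letter, 0.0) - pct)
               for letter, pct in profile.items()) / len(profile)

def guess_language(observed):
    """Pick the profile with the smallest average distance (crude heuristic)."""
    return min(PROFILES, key=lambda lang: mean_abs_diff(observed, PROFILES[lang]))

# Percentages taken from the Saturn example output above.
observed = {"e": 13.05, "a": 10.59, "o": 9.85, "s": 9.36,
            "n": 8.62, "t": 5.91, "i": 5.42}
print(guess_language(observed))  # Spanish
```

A more robust approach would compare full alphabets or bigram distributions, but that needs reference tables that are not given here.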
Cracking a Cryptogram
In this example, we load an encrypted message (a cryptogram) in the input field and use our frequency analysis program to try to decode it. We select the analysis mode that calculates the statistics for all characters (not just letters but also numbers and punctuation symbols), make the analysis case-sensitive (by deactivating the ignore-case option), and select the display-percentages output mode. The four most frequent symbols (excluding the space) are "6", "E", "2", and "@". If we assume that the plaintext message was written in English and match these symbols with the most frequent letters of the English language, we get "6" = "e", "E" = "t", "2" = "a", and "@" = "o". After a little bit of pondering and looking at the ASCII code points of these symbols, we notice that each symbol is exactly 47 positions away from its matching letter in the ASCII character table. That gives us a clue that the text could be encoded with the ROT47 cipher. Puzzle solved! Use the ROT47 substitution cipher to decrypt the secret message.
E96 D64C6E 4@56 :D Ighf 7@==@H65 3J Ifb`] 2AA=J E9:D 4@56 EH:46] 2==@H65 6?ECJ E:>6i e2> E@ f2> FE4] 2=A92 D:6CC2 E2?8@
⎵: 18 (15.13%) 6: 11 (9.24%) E: 9 (7.56%) 2: 8 (6.72%) @: 7 (5.88%) =: 6 (5.04%) 4: 5 (4.2%) :: 5 (4.2%) D: 4 (3.36%) C: 4 (3.36%) 5: 4 (3.36%) 9: 3 (2.52%) f: 3 (2.52%) H: 3 (2.52%) J: 3 (2.52%) ]: 3 (2.52%) ↵: 3 (2.52%) A: 3 (2.52%) >: 3 (2.52%) I: 2 (1.68%) ?: 2 (1.68%) g: 1 (0.84%) h: 1 (0.84%) 7: 1 (0.84%) 3: 1 (0.84%) b: 1 (0.84%) `: 1 (0.84%) i: 1 (0.84%) e: 1 (0.84%) F: 1 (0.84%) 8: 1 (0.84%)
Required options
These options will be used automatically if you select this example.
Get distribution of all symbols.
Number of letters or symbols in a group.
Select statistics output format.
Choose how to sort the output.
Consider uppercase and lowercase letters equal.
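The ROT47 guess from the cryptogram example can be checked with a short Python sketch. ROT47 rotates the 94 printable ASCII characters from "!" (code 33) to "~" (code 126) by 47 positions, so applying it twice restores the original text. The function name `rot47` is just a label for this sketch.

```python
def rot47(text):
    """Rotate printable ASCII characters (codes 33..126) by 47 positions."""
    out = []
    for ch in text:
        code = ord(ch)
        if 33 <= code <= 126:
            out.append(chr(33 + (code - 33 + 47) % 94))
        else:
            out.append(ch)  # spaces and newlines are left unchanged
    return "".join(out)

# The guessed symbol-to-letter pairs are all 47 code points apart.
for symbol, letter in [("6", "e"), ("E", "t"), ("2", "a"), ("@", "o")]:
    print(symbol, letter, ord(letter) - ord(symbol))

print(rot47("E96 D64C6E 4@56"))  # the secret code
```

Because the rotation is by half of 94, the same function both encrypts and decrypts.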
Analyze Birth Years
In this example, we paste a list of people and their birth years in the input and use the "Analyze All Symbols" option to find the most popular years they were born in. We set the character group length to 4 (4-grams), which equals the number of digits in a year. When forming 4-grams, we stop at the end of each word, so that people's names and birth years are analyzed separately. In the resulting statistics, we find that the year 1995 appears most often (4 times), followed by the years 1996 (3 times), 2000 (3 times), and 1985 (2 times).
Oliver 1995 Bella 1996 Billy 1994 Jorge 1990 Helen 1995 Lyla 2000 Adam 1985 Karla 1995 Lukas 1992 Cali 1995 Hugo 2000 Mira 1996 Kate 1985 Axel 2000 Ariel 1996
1995: 4 1996: 3 2000: 3 1985: 2 oliv: 1 live: 1 iver: 1 bell: 1 ella: 1 bill: 1 illy: 1 1994: 1 jorg: 1 orge: 1 1990: 1 hele: 1 elen: 1 lyla: 1 adam: 1 karl: 1 arla: 1 luka: 1 ukas: 1 1992: 1 cali: 1 hugo: 1 mira: 1 kate: 1 axel: 1 arie: 1 riel: 1
Required options
These options will be used automatically if you select this example.
Get distribution of all symbols.
Number of letters or symbols in a group.
Don't glue characters of different words together in groups.
Select statistics output format.
Choose how to sort the output.
Consider uppercase and lowercase letters equal.
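The birth-year analysis above can be reproduced with a small Python sketch. It assumes that a word is anything separated by whitespace and that every character counts (the "all symbols" mode). The function name `word_ngram_counts` is hypothetical.

```python
from collections import Counter

def word_ngram_counts(text, n):
    """Count n-grams of any characters without crossing word boundaries."""
    counts = Counter()
    for word in text.lower().split():
        # Words shorter than n produce no n-grams.
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts

people = ("Oliver 1995 Bella 1996 Billy 1994 Jorge 1990 Helen 1995 "
          "Lyla 2000 Adam 1985 Karla 1995 Lukas 1992 Cali 1995 Hugo 2000 "
          "Mira 1996 Kate 1985 Axel 2000 Ariel 1996")

# Sort by frequency, highest first; ties keep their order of first appearance.
for gram, count in word_ngram_counts(people, 4).most_common(4):
    print(f"{gram}: {count}")
```

Since every year is exactly four characters long, each year yields a single 4-gram, while longer names such as "Oliver" contribute several.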
Pro tips: Master online crypto tools
You can pass input to this tool via the ?input query argument and it will automatically compute the output. Here's how to type it in your browser's address bar.
https://onlinecryptotools.com/analyze-letter-frequency?input=Cubism%20is%20an%20artistic%20movement%20that%20emerged%20during%20the%20early%2020th%20century.%20In%20Cubist%20artwork%2C%20objects%20are%20analyzed%2C%20broken%20up%2C%20and%20reassembled%20in%20an%20abstracted%20form.%20Instead%20of%20depicting%20objects%20from%20a%20single%20viewpoint%2C%20the%20artist%20depicts%20the%20subject%20from%20a%20multitude%20of%20viewpoints%20to%20represent%20the%20subject%20in%20a%20greater%20context.&group-length=1&analyze-letters-only=true&type=print-count&sort=sort-by-alphabet&ignore-case=true
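A URL like the one above can also be built programmatically. The sketch below uses only the query parameter names and values that appear in that example URL. Other accepted values are not documented here, so treat them as unknown. The helper name `build_url` is made up for this illustration.

```python
from urllib.parse import urlencode, quote

BASE_URL = "https://onlinecryptotools.com/analyze-letter-frequency"

def build_url(text):
    """Build a tool URL using the parameters seen in the example above."""
    params = {
        "input": text,
        "group-length": 1,
        "analyze-letters-only": "true",
        "type": "print-count",
        "sort": "sort-by-alphabet",
        "ignore-case": "true",
    }
    # quote_via=quote encodes spaces as %20 (as in the example) instead of "+".
    return BASE_URL + "?" + urlencode(params, quote_via=quote)

print(build_url("Cubism is an artistic movement"))
```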