Step 1 of 8

Corpus

Paste text, upload a .txt file, or extract readable text from a website URL.

Choose a source

Switching tabs preserves the underlying text.

Raw text preview

The first few hundred characters, exactly as the preprocessor will see them.

— no text loaded —

Statistics

Computed live from the raw text.

Characters: 0
Words (whitespace): 0
Sentences: 0
Estimated vocabulary: 0
Before any cleaning or limiting
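
As a rough illustration of how the four statistics above could be computed from raw text, here is a minimal sketch. The function name `corpus_stats`, the sentence-splitting rule, and the vocabulary estimate are all assumptions made for this example; the page does not say how its own counts are derived.

```python
import re


def corpus_stats(raw_text: str) -> dict:
    """Compute rough statistics on raw, uncleaned text (hypothetical helper)."""
    # "Words (whitespace)": split on any run of whitespace.
    words = raw_text.split()

    # Assumption: a sentence ends at a run of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", raw_text) if s.strip()]

    # Assumption: vocabulary is estimated as the number of distinct tokens
    # after lowercasing and stripping surrounding punctuation.
    vocab = {w.strip(".,;:!?\"'()[]").lower() for w in words}
    vocab.discard("")

    return {
        "characters": len(raw_text),
        "words": len(words),
        "sentences": len(sentences),
        "estimated_vocabulary": len(vocab),
    }


if __name__ == "__main__":
    print(corpus_stats("The cat sat. The dog sat!"))
```

With no text loaded, every count in this sketch is 0, which matches the empty state shown above.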
The corpus is quite short (0 words). 4-grams need at least a few hundred words to be meaningful.
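
To see why short corpora are a problem for 4-grams, it can help to count them directly: a text of N words contains at most N - 3 four-word windows, and in a small corpus almost every window occurs only once. The sketch below assumes whitespace tokenization; `ngram_counts` is a hypothetical helper, not part of the tool.

```python
from collections import Counter


def ngram_counts(words, n=4):
    """Count each n-word window in a list of tokens (hypothetical helper)."""
    # range() is empty when there are fewer than n words, so short
    # inputs simply produce an empty Counter.
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


if __name__ == "__main__":
    words = "the cat sat on the mat and the cat sat on the rug".split()
    counts = ngram_counts(words)
    repeated = [gram for gram, c in counts.items() if c > 1]
    print(len(words), "words ->", sum(counts.values()), "4-grams,",
          len(repeated), "seen more than once")
```

An n-gram model can only learn from windows that repeat, so the ratio of repeated to total 4-grams is a quick sanity check on whether a corpus is large enough.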
Corpus text stored locally