You need to preprocess the corpus before splitting it. Go to preprocessing.
Ratios
Sliders must sum to 100%. The default is the conventional 70 / 10 / 20.
train70%
validation10%
test20%
Total100%
Distribution
How tokens are allocated.
Training
Validation
Test
Tip: Sentences are shuffled with a fixed seed, then validation/test words outside the training vocabulary become
<UNK>.