Add ignored prefixes and suffixes to test data. This ensures that the implementation does not rely on texts starting at offset zero inside their byte arrays.
Use ICU to find line break boundaries.
Implement line breaking on code point boundaries.
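
A minimal sketch of the first and third items, not the actual implementation: it assumes texts are stored as UTF-8 in a byte array and addressed by (offset, length), and it uses a simple byte-width limit as a stand-in for the real line-breaking rule (it does not use ICU). The names `break_on_code_points` and `with_padding` are hypothetical.

```python
def break_on_code_points(buf: bytes, offset: int, length: int, max_bytes: int):
    """Split buf[offset:offset+length] into (start, end) byte ranges of at most
    max_bytes bytes each, never splitting a UTF-8 code point.

    Assumes the addressed slice is valid UTF-8 (an assumption of this sketch).
    """
    if max_bytes < 4:
        raise ValueError("max_bytes must be >= 4 so any code point fits")
    end = offset + length
    ranges = []
    start = offset
    while start < end:
        stop = min(start + max_bytes, end)
        # Back up while the cut would land on a continuation byte (0b10xxxxxx).
        # Bytes at or past `end` are never inspected.
        while stop < end and stop > start and (buf[stop] & 0xC0) == 0x80:
            stop -= 1
        ranges.append((start, stop))
        start = stop
    return ranges


def with_padding(text: str, prefix: bytes = b"\xff\xfe", suffix: bytes = b"\x80\x80"):
    """Embed text between junk bytes that must be ignored; returns (buf, offset, length)."""
    data = text.encode("utf-8")
    return prefix + data + suffix, len(prefix), len(data)


if __name__ == "__main__":
    buf, off, n = with_padding("aé€😀b")
    for s, e in break_on_code_points(buf, off, n, 4):
        print(buf[s:e].decode("utf-8"))
```

The padding bytes are deliberately not valid text, so code that ignores the offset or reads past the end of the slice produces different pieces or fails to decode.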