# Character access
context.char()          # Current char
context.prev_char(n)    # Nth previous char
context.next_char(n)    # Nth next char

# Word access
context.prev_word(n)    # Nth previous word
context.next_word(n)    # Nth next word

# Position
context.position()      # Current index
context.text_length()   # Total length
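To make the accessors above concrete, here is a minimal sketch of a context object that supports the same method names. The `Context` class is a hypothetical stand-in written for illustration, not the library's implementation. It assumes that `n` counts from 1 (so `prev_char(1)` is the character immediately before the current one), that words are runs of `\w` characters, and that out-of-range lookups return an empty string; the real object may behave differently on all three points.

```python
import re


class Context:
    """Hypothetical stand-in for the context object (illustration only)."""

    def __init__(self, text: str, index: int) -> None:
        self._text = text
        self._index = index

    # Character access
    def char(self) -> str:
        return self._text[self._index]

    def prev_char(self, n: int = 1) -> str:
        i = self._index - n
        # Assumption: out-of-range lookups yield "" instead of raising.
        return self._text[i] if 0 <= i < len(self._text) else ""

    def next_char(self, n: int = 1) -> str:
        i = self._index + n
        return self._text[i] if 0 <= i < len(self._text) else ""

    # Word access (assumption: a word is a run of \w characters)
    def prev_word(self, n: int = 1) -> str:
        words = re.findall(r"\w+", self._text[: self._index])
        return words[-n] if 1 <= n <= len(words) else ""

    def next_word(self, n: int = 1) -> str:
        words = re.findall(r"\w+", self._text[self._index + 1 :])
        return words[n - 1] if 1 <= n <= len(words) else ""

    # Position
    def position(self) -> int:
        return self._index

    def text_length(self) -> int:
        return len(self._text)


text = "Dr. Smith arrived. He sat down."
ctx = Context(text, text.index("."))  # positioned on the period after "Dr"

print(ctx.char(), ctx.prev_word(1), ctx.next_word(1))
```

A rule that inspects `ctx.prev_word(1)` at this position would see `Dr`, which is the kind of signal an abbreviation check needs.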
Rule Result Values
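| Constant     | Value | Meaning                   |
|--------------|-------|---------------------------|
| BOUNDARY     | 1.0   | Definitely a boundary     |
| NOT_BOUNDARY | 0.0   | Definitely not a boundary |
| MAYBE        | 0.5   | Need more context         |
| LIKELY       | 0.75  | Probably a boundary       |
| VERY_LIKELY  | 0.90  | Almost certainly          |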
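As a rough illustration of how these values could drive a decision, the sketch below turns a list of rule results into a yes/no answer. The `resolve` function and its policy are assumptions made for this example: a definite `NOT_BOUNDARY` vetoes the split, a definite `BOUNDARY` forces it, and otherwise the strongest result is compared against a threshold. The library's actual combination logic is not described here and may differ.

```python
from typing import Sequence

# Constants as listed in the table above.
BOUNDARY = 1.0
NOT_BOUNDARY = 0.0
MAYBE = 0.5
LIKELY = 0.75
VERY_LIKELY = 0.90


def resolve(results: Sequence[float], threshold: float = LIKELY) -> bool:
    """Hypothetical policy for combining rule results (illustration only)."""
    # Assumption: a definite "no" from any rule wins over everything else.
    if NOT_BOUNDARY in results:
        return False
    # Assumption: a definite "yes" wins over the uncertain values.
    if BOUNDARY in results:
        return True
    # Otherwise use the strongest remaining signal; no results means MAYBE.
    return max(results, default=MAYBE) >= threshold


print(resolve([MAYBE, VERY_LIKELY]))      # strongest signal clears the threshold
print(resolve([MAYBE]))                   # not enough evidence to split
print(resolve([NOT_BOUNDARY, BOUNDARY]))  # the veto takes priority
```

Raising `threshold` makes this sketch more conservative about splitting, and lowering it makes it more eager.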
Troubleshooting
False Positives (Over-splitting)
If sentences are split too aggressively:
Add the abbreviation to the custom set
Increase min_sentence_length
Enable aggressive_abbreviations mode
False Negatives (Under-splitting)
If sentences aren't split enough:
Check for missing abbreviations
Review Rule 84 (semicolon handling)
Try excluding rules that may be blocking the split
Debug Mode
detector = TokenBoundaryDetector(debug=True)
explanations = detector.explain(text)
# Shows all rules evaluated and their results