

A small spaCy pipeline component for matching within document sentences using regular expressions.


  • spaCy component for sentence-by-sentence pattern matching
  • Find matches with complex patterns using the power of regular expressions
  • Easily convert simple keywords into valid regular expressions
  • Specify matching patterns as well as patterns to ignore
  • Annotate matches for NER (Named Entity Recognition) tasks
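To make the keyword and ignore-pattern features above more concrete, here is a minimal standard-library sketch of the underlying idea. It does not use sentenCy or spaCy, and it is not the package's implementation: `keywords_to_regex`, `split_sentences`, and `match_sentences` are hypothetical helpers, the sentence splitter is deliberately naive, and the assumption that keywords must appear in order within a sentence is ours.

```python
import re


def keywords_to_regex(keywords: str) -> str:
    """Hypothetical helper: build a case-insensitive pattern that requires
    the whitespace-separated keywords to appear in the given order."""
    words = [re.escape(word) for word in keywords.split()]
    return "(?i)" + ".*".join(words)


def split_sentences(text: str) -> list[str]:
    """Naive splitter for illustration only: collapse whitespace, then break
    after '.', '!' or '?' when followed by whitespace."""
    flat = " ".join(text.split())
    return [s for s in re.split(r"(?<=[.!?])\s+", flat) if s]


def match_sentences(text: str, keep_regex: str, ignore_regex: str) -> list[str]:
    """Return sentences that match keep_regex and do not match ignore_regex."""
    return [
        sentence
        for sentence in split_sentences(text)
        if re.search(keep_regex, sentence) and not re.search(ignore_regex, sentence)
    ]


sample = """
Screening for abdominal aortic aneurysm.
Impression: There is evidence of a fusiform
abdominal aortic aneurysm measuring 3.4 cm.
"""

keep = keywords_to_regex("abdominal aortic aneurysm")
ignore = keywords_to_regex("screening aneurysm")

# The screening sentence matches the ignore pattern and is dropped;
# only the impression sentence remains.
matches = match_sentences(sample, keep, ignore)
for sentence in matches:
    print(sentence)
```

Evaluating patterns one sentence at a time keeps an ignore phrase such as "screening" from suppressing a genuine finding reported elsewhere in the same document.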


pip install sentency


The following minimal example demonstrates the main features of sentenCy.

import spacy
from spacy import displacy

from sentency.regex import regexize_keywords
from sentency.sentency import Sentex

text = """
Screening for abdominal aortic aneurysm.
Impression: There is evidence of a fusiform
abdominal aortic aneurysm measuring 3.4 cm.
"""

aaa_keywords = "abdominal aortic aneurysm"
ignore_keywords = "screening aneurysm"

keyword_regex = regexize_keywords(aaa_keywords)
ignore_regex = regexize_keywords(ignore_keywords)

nlp = spacy.load("en_core_web_sm")
# Add the sentex component, configured with the match and ignore patterns
nlp.add_pipe(
    "sentex", config={
        "sentence_regex": keyword_regex,
        "ignore_regex": ignore_regex,
        "annotate_ents": True,
        "label": "AAA"
    }
)

doc = nlp(text)

displacy.render(doc, style="ent", options={"ents": ["AAA"]})


This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.