sentency package
Submodules
sentency.logs module
sentency.regex module
sentency.regex.regexize_keywords(keyword_str, keyword_delimiter=' ', line_delimiter='\n', case_insensitive=True)
Convert a string of keywords into a regular expression that can be used as input for the sentenCy Sentex component. Keywords within a line are separated by keyword_delimiter, and lines, each forming one alternative group in the pattern, are separated by line_delimiter.
Example usage:
>>> from sentency.regex import regexize_keywords
>>> keyword_str = "abdominal aortic aneurysm\naneurysm abdominal aorta"
>>> regexize_keywords(keyword_str)
(?i)((abdominal.*aortic.*aneurysm)|(aneurysm.*abdominal.*aorta))
- keyword_str: str,
- The keyword string to be converted into a regular expression.
- keyword_delimiter: str,
- The delimiter separating individual keywords in keyword_str. Default is ' '.
- line_delimiter: str,
- The delimiter separating lines (keyword groups) in keyword_str. Default is '\n'.
- case_insensitive: bool,
- Whether the regular expression should be case-insensitive. Default is True.
- RETURNS: str,
- The regular expression.
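To make the delimiter handling concrete, here is a minimal sketch that reproduces the documented example output using only the standard library. `regexize_keywords_sketch` is a hypothetical stand-in inferred from the example above, not the library's actual source. Skipping empty keywords and leaving keywords unescaped are assumptions.

```python
import re


def regexize_keywords_sketch(keyword_str, keyword_delimiter=" ",
                             line_delimiter="\n", case_insensitive=True):
    """Hypothetical re-implementation inferred from the documented example."""
    groups = []
    for line in keyword_str.split(line_delimiter):
        # Assumption: empty keywords (e.g. from doubled delimiters) are skipped.
        keywords = [k for k in line.split(keyword_delimiter) if k]
        if keywords:
            # Keywords on one line must appear in order, with anything between.
            # Assumption: keywords are inserted as-is, without re.escape.
            groups.append("(" + ".*".join(keywords) + ")")
    # Each line becomes one alternative of the overall pattern.
    pattern = "(" + "|".join(groups) + ")"
    return ("(?i)" if case_insensitive else "") + pattern


pattern = regexize_keywords_sketch(
    "abdominal aortic aneurysm\naneurysm abdominal aorta"
)
print(pattern)

# The resulting string is an ordinary regular expression.
match = re.search(pattern, "Infrarenal Abdominal Aortic Aneurysm, 4.5 cm")
print(match.group(0))
```

Because the keywords are joined with `.*`, a group matches only when its keywords occur in the listed order. That is presumably why the example supplies a second line with the reversed word order.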
sentency.sentency module
class sentency.sentency.Sentex(nlp: spacy.language.Language, sentence_regex: str, ignore_regex: str, annotate_ents: bool, label: str)
Bases: object
Sentex is a spaCy pipeline component that adds spans to the list Doc._.sentex based on regular expression matches within each sentence of the document. If an ignore_regex is given, sentences matching that regular expression will be ignored.
- nlp: Language,
- Required by spaCy so that this component can be used as a factory.
- sentence_regex : str,
- A regular expression to match spans within each sentence of the document.
- ignore_regex : str,
- A regular expression to identify sentences that should be ignored.
- annotate_ents: bool,
- Write/overwrite matches to Doc.ents
- label: str,
- If annotate_ents == True, the label for the matched entity
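The interaction between sentence_regex and ignore_regex can be illustrated without spaCy. The sketch below is an assumption-laden illustration of the behaviour described above, not the component's implementation: `sentex_sketch` is a hypothetical helper, sentences are passed in as plain strings rather than taken from a Doc, and the results are (sentence index, text) tuples rather than spaCy spans.

```python
import re


def sentex_sketch(sentences, sentence_regex, ignore_regex=None):
    """Collect regex matches per sentence, skipping ignored sentences.

    Hypothetical illustration only: the real component works on spaCy
    sentences and stores spans in Doc._.sentex.
    """
    matches = []
    for index, sentence in enumerate(sentences):
        # A sentence matching ignore_regex contributes no matches at all.
        if ignore_regex and re.search(ignore_regex, sentence):
            continue
        for m in re.finditer(sentence_regex, sentence):
            matches.append((index, m.group(0)))
    return matches


sentences = [
    "There is an abdominal aortic aneurysm.",
    "No evidence of abdominal aortic aneurysm.",
    "The lungs are clear.",
]

found = sentex_sketch(
    sentences,
    sentence_regex=r"(?i)abdominal.*aortic.*aneurysm",
    ignore_regex=r"(?i)\bno evidence\b",  # illustrative negation pattern
)
print(found)
```

Only the first sentence yields a match here: the second contains the keywords but is excluded by the ignore pattern, and the third does not match at all.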
class sentency.sentency.Size(nlp: spacy.language.Language, size_regex: str, sentex_only: bool, annotate_ents: bool, label: str)
Bases: object
Size is a spaCy pipeline component that adds spans to the list Doc._.size based on regular expression matches within each sentence of the document.
- nlp: Language,
- Required by spaCy so that this component can be used as a factory.
- size_regex : str,
- A regular expression to match spans within each sentence of the document.
- sentex_only : bool,
- Only match in sentences with Sentex-matched entities
- annotate_ents: bool,
- Write/overwrite matches to Doc.ents
- label: str,
- If annotate_ents == True, the label for the matched entity
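The effect of sentex_only can be sketched in the same spirit. Everything below is illustrative: `size_sketch` and `SIZE_REGEX` are hypothetical names, the measurement pattern is an assumption rather than a default shipped with the package, and plain strings stand in for spaCy sentences.

```python
import re

# Assumed example pattern: a number followed by a cm or mm unit.
SIZE_REGEX = r"(?i)\d+(?:\.\d+)?\s*(?:cm|mm)"


def size_sketch(sentences, size_regex, sentex_regex, sentex_only):
    """Collect size matches per sentence.

    Hypothetical illustration only: with sentex_only=True, sentences
    without a match for sentex_regex are skipped.
    """
    sizes = []
    for index, sentence in enumerate(sentences):
        if sentex_only and not re.search(sentex_regex, sentence):
            continue
        for m in re.finditer(size_regex, sentence):
            sizes.append((index, m.group(0)))
    return sizes


sentences = [
    "Abdominal aortic aneurysm measuring 4.5 cm.",
    "A 3 mm nodule in the right lung.",
]
sentex_regex = r"(?i)aortic.*aneurysm"

all_sizes = size_sketch(sentences, SIZE_REGEX, sentex_regex, sentex_only=False)
aneurysm_sizes = size_sketch(sentences, SIZE_REGEX, sentex_regex, sentex_only=True)
print(all_sizes)
print(aneurysm_sizes)
```

With sentex_only disabled, both measurements are collected. With it enabled, only the measurement in the sentence that also mentions the aneurysm remains.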
Module contents
Top-level package for sentency.