sentency package

Submodules

sentency.logs module

sentency.logs.get_console_handler() → logging.StreamHandler

Get console handler.

Returns: logging.StreamHandler, a handler that writes log records to stdout

sentency.logs.get_logger(name: str = 'sentency.logs', log_level: Union[str, int] = 40) → logging.Logger

Get logger.

name: str, logger name. Default is 'sentency.logs'
log_level: str or int, logging level; can be a level name or its integer value. Default is 40 (logging.ERROR)
Returns: logging.Logger, logger instance
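The two helpers above can be approximated with the standard library alone. The following is a minimal sketch, not the package's implementation: the function names `make_console_handler` and `make_logger`, the format string, and the guard against duplicate handlers are all assumptions made for illustration.

```python
import logging
import sys
from typing import Union


def make_console_handler() -> logging.StreamHandler:
    # Hypothetical stand-in for get_console_handler(): a handler bound to stdout.
    handler = logging.StreamHandler(sys.stdout)
    # The format string is an assumption; the documentation does not specify one.
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    return handler


def make_logger(name: str = "sentency.logs",
                log_level: Union[str, int] = 40) -> logging.Logger:
    # Hypothetical stand-in for get_logger().
    logger = logging.getLogger(name)
    # Logger.setLevel accepts either a level name ("ERROR") or an integer (40).
    logger.setLevel(log_level)
    # Assumed guard: avoid attaching a second handler on repeated calls.
    if not logger.handlers:
        logger.addHandler(make_console_handler())
    return logger


logger = make_logger("sketch.demo", "WARNING")
logger.warning("written to stdout")
logger.info("suppressed, because INFO is below WARNING")
```

The default level of 40 corresponds to `logging.ERROR`, so a logger created with the defaults only emits error and critical records.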

sentency.regex module

sentency.regex.regexize_keywords(keyword_str, keyword_delimiter=' ', line_delimiter='\n', case_insensitive=True)

Convert a string of keywords into a regular expression that can be used as input for the sentenCy sentex component. You can separate keywords individually or into groups using keyword and line delimiters.

Example usage:

>>> from sentency.regex import regexize_keywords
>>> keyword_str = "abdominal aortic aneurysm\naneurysm abdominal aorta"
>>> regexize_keywords(keyword_str)
(?i)((abdominal.*aortic.*aneurysm)|(aneurysm.*abdominal.*aorta))

keyword_str: str, the keyword string to convert into a regular expression.
keyword_delimiter: str, the delimiter separating individual keywords in keyword_str. Default is ' '.
line_delimiter: str, the delimiter separating lines (keyword groups) in keyword_str. Default is '\n'.
case_insensitive: bool, whether the regular expression should be case-insensitive. Default is True.
RETURNS: str, the regular expression.
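To make the conversion concrete, here is a minimal sketch that reproduces the documented example output using only the `re` module. It is a reconstruction from the example, not the package's source: the name `regexize_keywords_sketch` is hypothetical, and details such as skipping empty keywords and not escaping regex metacharacters are assumptions.

```python
import re


def regexize_keywords_sketch(keyword_str, keyword_delimiter=" ",
                             line_delimiter="\n", case_insensitive=True):
    # Hypothetical re-implementation inferred from the documented example.
    groups = []
    for line in keyword_str.split(line_delimiter):
        # Assumption: empty keywords (e.g. from doubled delimiters) are dropped.
        keywords = [k for k in line.split(keyword_delimiter) if k]
        if keywords:
            # Keywords on one line must appear in order, with anything between.
            groups.append("(" + ".*".join(keywords) + ")")
    # Lines are alternatives: any one group may match.
    pattern = "(" + "|".join(groups) + ")"
    return "(?i)" + pattern if case_insensitive else pattern


pattern = regexize_keywords_sketch(
    "abdominal aortic aneurysm\naneurysm abdominal aorta"
)
print(pattern)
print(bool(re.search(pattern, "Abdominal and thoracic AORTIC aneurysm noted.")))
```

Each line becomes one ordered group joined by `.*`, and the groups are combined with `|`, so a sentence matches when it contains all the keywords of at least one line in the given order.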

sentency.sentency module

class sentency.sentency.Sentex(nlp: spacy.language.Language, sentence_regex: str, ignore_regex: str, annotate_ents: bool, label: str)

Bases: object

Sentex is a spaCy pipeline component that adds spans to the list Doc._.sentex based on regular expression matches within each sentence of the document. If an ignore_regex is given, sentences matching that regular expression will be ignored.

nlp: Language,
A required argument for spaCy to use this as a factory.
sentence_regex: str,
A regular expression to match spans within each sentence of the document.
ignore_regex: str,
A regular expression to identify sentences that should be ignored.
annotate_ents: bool,
Whether to write (or overwrite) matches to Doc.ents.
label: str,
The label for the matched entity when annotate_ents is True.

set_annotations(doc)

Modify the document in place. Logic taken from spacy.pipeline.entityruler.EntityRuler
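The sentence-level matching that Sentex describes can be illustrated without spaCy. This is a minimal sketch under stated assumptions: the function `match_sentences` is hypothetical, sentences are split with a naive punctuation rule rather than spaCy's sentence segmentation, and matches are returned as plain strings rather than stored as spans on Doc._.sentex.

```python
import re


def match_sentences(text, sentence_regex, ignore_regex=""):
    # Hypothetical helper: returns (sentence, matched_text) pairs.
    # Naive splitter (assumption); the real component relies on spaCy sentences.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    hits = []
    for sentence in sentences:
        # Sentences matching ignore_regex are skipped entirely.
        if ignore_regex and re.search(ignore_regex, sentence):
            continue
        for match in re.finditer(sentence_regex, sentence):
            hits.append((sentence, match.group(0)))
    return hits


report = (
    "There is an abdominal aortic aneurysm. "
    "No evidence of aneurysm in the thoracic aorta."
)
hits = match_sentences(report, r"(?i)aneurysm", ignore_regex=r"(?i)\bno evidence of\b")
for sentence, matched in hits:
    print(matched, "|", sentence)
```

Only the first sentence produces a hit here, because the second one matches the ignore pattern before the sentence regex is tried.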

class sentency.sentency.Size(nlp: spacy.language.Language, size_regex: str, sentex_only: bool, annotate_ents: bool, label: str)

Bases: object

Size is a spaCy pipeline component that adds spans to the list Doc._.size based on regular expression matches within each sentence of the document.

nlp: Language,
A required argument for spaCy to use this as a factory.
size_regex: str,
A regular expression to match spans within each sentence of the document.
sentex_only: bool,
Whether to match only in sentences that contain Sentex-matched entities.
annotate_ents: bool,
Whether to write (or overwrite) matches to Doc.ents.
label: str,
The label for the matched entity when annotate_ents is True.

parse_size(text: str) → float

process(doc: spacy.tokens.doc.Doc) → spacy.tokens.doc.Doc

set_annotations(doc)

Modify the document in place. Logic taken from spacy.pipeline.entityruler.EntityRuler
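The interplay of size_regex and sentex_only can be sketched in plain Python. Everything below is illustrative rather than the package's behaviour: `SIZE_REGEX`, `parse_size_sketch`, and `find_sizes` are hypothetical names, and how the real parse_size treats units or multi-dimensional measurements is not documented here, so the sketch simply returns the first number as a float.

```python
import re

# Assumed example pattern: a number followed by a unit of mm or cm.
SIZE_REGEX = r"\d+(?:\.\d+)?\s*(?:mm|cm)"


def parse_size_sketch(text: str) -> float:
    # Hypothetical stand-in for parse_size(): first number in the text, no unit handling.
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no numeric size found in {text!r}")
    return float(match.group(0))


def find_sizes(sentences, size_regex, sentex_regex=None):
    # Hypothetical helper: returns (matched_text, value) pairs.
    sizes = []
    for sentence in sentences:
        # Mimics sentex_only=True: skip sentences without a Sentex-style match.
        if sentex_regex is not None and not re.search(sentex_regex, sentence):
            continue
        for match in re.finditer(size_regex, sentence):
            sizes.append((match.group(0), parse_size_sketch(match.group(0))))
    return sizes


sentences = ["Aneurysm measures 4.5 cm.", "Renal cyst of 12 mm."]
all_sizes = find_sizes(sentences, SIZE_REGEX)
aneurysm_sizes = find_sizes(sentences, SIZE_REGEX, sentex_regex=r"(?i)aneurysm")
print(all_sizes)
print(aneurysm_sizes)
```

Passing a `sentex_regex` restricts the size search to sentences that also mention the finding of interest, which is the effect the sentex_only option describes.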

sentency.sentency.create_sentex_component(nlp: spacy.language.Language, name: str, sentence_regex: str, ignore_regex: str, annotate_ents: bool, label: str)

sentency.sentency.create_size_component(nlp: spacy.language.Language, name: str, size_regex: str, sentex_only: bool, annotate_ents: bool, label: str)

Module contents

Top-level package for sentency.