transformers

The Transformer class allows for transformation of spaCy tokens.

Transformer

Replaces a token with a fixed string when the evaluator flags it, and returns the token unchanged otherwise.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| evaluator | Evaluator | Evaluates if the token should be processed or not. | required |
| replace | str | Replaces token based on the token evaluation. | required |

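Example
import spacy
from spacy_cleaner.processing.evaluators import StopwordsEvaluator
from spacy_cleaner.processing.transformers import Transformer

tok = spacy.blank("en")("the")[0]  # any spaCy Token will do
transformer = Transformer(StopwordsEvaluator(), replace="")
transformer.transform(tok)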
Source code in spacy_cleaner/processing/transformers.py
class Transformer:
    """Transforms a token using the evaluator.

    Args:
        evaluator: Evaluates if the token should be processed or not.
        replace: Replaces token based on the token evaluation.

    Example:
        ```python
        from spacy_cleaner.processing.evaluators import StopwordsEvaluator

        transformer = Transformer(StopwordsEvaluator(), replace="")
        transformer.transform(tok)
        ```
    """

    def __init__(self, evaluator: evaluators.Evaluator, replace: str) -> None:
        self.evaluator = evaluator
        self.replace = replace

    def transform(self, tok: tokens.Token) -> Union[str, tokens.Token]:
        """Processes a token using the evaluator.

        Args:
            tok: The token to be evaluated.

        Returns:
            A string or token depending on evaluation.
        """
        return self.replace if self.evaluator.evaluate(tok) else tok
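
The listing above relies on a small contract: any object with an evaluate(tok) method that returns a truthy or falsy value can act as the evaluator. The sketch below exercises both branches without requiring spaCy. FakeToken and ShortWordEvaluator are hypothetical stand-ins, not part of spacy_cleaner, and the Transformer shown is a local copy of the logic in the listing with the spaCy type hints removed.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class FakeToken:
    """Hypothetical stand-in for a spaCy Token; only `.text` is modelled."""

    text: str


class ShortWordEvaluator:
    """Hypothetical evaluator: flags tokens shorter than `min_len` characters."""

    def __init__(self, min_len: int = 3) -> None:
        self.min_len = min_len

    def evaluate(self, tok: Any) -> bool:
        return len(tok.text) < self.min_len


class Transformer:
    """Local copy of the logic in the source listing, without spaCy type hints."""

    def __init__(self, evaluator: Any, replace: str) -> None:
        self.evaluator = evaluator
        self.replace = replace

    def transform(self, tok: Any) -> Union[str, Any]:
        # Flagged tokens become the replacement string; others pass through.
        return self.replace if self.evaluator.evaluate(tok) else tok


transformer = Transformer(ShortWordEvaluator(min_len=3), replace="")

short_tok = FakeToken("of")
long_tok = FakeToken("spaCy")

replaced = transformer.transform(short_tok)  # evaluator is True: replacement string
kept = transformer.transform(long_tok)       # evaluator is False: the same token object
```

Note that the unflagged token comes back as the same object, not as a copy or as its text, so callers receive a mixed return type.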

transform(tok)

Processes a token using the evaluator.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tok | Token | The token to be evaluated. | required |

Returns:

| Type | Description |
| --- | --- |
| Union[str, Token] | The replace string if the evaluator flags the token, otherwise the original token. |
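
Because the return value is either a string or a token, calling code usually has to normalize it before joining results. The following is a minimal sketch of one way to do that, not the library's own pipeline. FakeToken, fake_transform, to_text, and clean are hypothetical names introduced for illustration, and fake_transform only mimics the shape of a transform call.

```python
from typing import Callable, Iterable, Union


class FakeToken:
    """Hypothetical stand-in for a spaCy Token; only `.text` is modelled."""

    def __init__(self, text: str) -> None:
        self.text = text


def to_text(result: Union[str, FakeToken]) -> str:
    # A transform call yields either a replacement string or the token itself.
    return result if isinstance(result, str) else result.text


def clean(toks: Iterable[FakeToken], transform: Callable) -> str:
    pieces = [to_text(transform(tok)) for tok in toks]
    # Drop empty replacements so they do not leave double spaces behind.
    return " ".join(piece for piece in pieces if piece)


STOPWORDS = {"the", "a"}  # illustrative list, not spaCy's stop word set


def fake_transform(tok: FakeToken) -> Union[str, FakeToken]:
    # Mimics transform() with replace="" and a stop word check.
    return "" if tok.text.lower() in STOPWORDS else tok


toks = [FakeToken(word) for word in "The cat sat on a mat".split()]
cleaned = clean(toks, fake_transform)
```

With a real spaCy Token the same isinstance check applies, since Token exposes its string form through the text attribute.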

Source code in spacy_cleaner/processing/transformers.py
def transform(self, tok: tokens.Token) -> Union[str, tokens.Token]:
    """Processes a token using the evaluator.

    Args:
        tok: The token to be evaluated.

    Returns:
        A string or token depending on evaluation.
    """
    return self.replace if self.evaluator.evaluate(tok) else tok