bionty.Bionty

class bionty.Bionty(source, version=None, species=None, *, reference_id=None, synonyms_field=None, include_id_prefixes=None, include_name_prefixes=None, exclude_id_prefixes=None, exclude_name_prefixes=None, **kwargs)

Bases: object

Biological entity as a Bionty.

See Guide for background.

Attributes

source (property)

Name of the source.

species (property)

Name of the species.

version (property)

Name of the version.

Methods

curate(df, column=None, reference_id=None, case_sensitive=True)

Curate index of passed DataFrame to conform with default identifier.

  • If column is None, checks the existing index for compliance with the default identifier.

  • If column denotes an entity identifier, tries to map that identifier to the default identifier.

Parameters:
  • df – The input Pandas DataFrame to curate.

  • column – The column in the passed Pandas DataFrame to curate.

  • reference_id – The reference column in the ontology Pandas DataFrame. Defaults to ‘ontology_id’.

  • case_sensitive – Whether the curation should be case sensitive or not. Defaults to True.

Return type:

DataFrame

Returns:

The DataFrame with the curated index and a boolean __curated__ column that indicates compliance with the default identifier.

Examples

>>> import pandas as pd
>>> import bionty as bt
>>> df = pd.DataFrame(index=["Boettcher cell", "bone marrow cell"])
>>> ct = bt.CellType()
>>> curated_df = ct.curate(df, reference_id=ct.name)
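
The mapping behaviour described above can be pictured with a small standard-library sketch. It is not bionty's implementation: `curate_values`, the `name_to_id` table, and the `ID:…` identifiers are hypothetical, and keeping the original value when no mapping exists is an assumption.

```python
def curate_values(values, name_to_id, case_sensitive=True):
    """Map values to a default identifier and flag which ones were mapped.

    Hypothetical helper for illustration only; it is not part of bionty.
    """
    if case_sensitive:
        lookup = dict(name_to_id)
    else:
        # Normalise the reference keys once for case-insensitive lookups.
        lookup = {name.casefold(): id_ for name, id_ in name_to_id.items()}

    rows = []
    for value in values:
        key = value if case_sensitive else value.casefold()
        mapped = lookup.get(key)
        rows.append(
            {
                # Assumption: unmapped values keep their original label.
                "index": mapped if mapped is not None else value,
                "__curated__": mapped is not None,
            }
        )
    return rows


# Made-up reference table standing in for an ontology's name -> ID mapping.
name_to_id = {"Boettcher cell": "ID:0001", "bone marrow cell": "ID:0002"}
values = ["Boettcher cell", "Bone Marrow Cell", "unknown cell"]

strict = curate_values(values, name_to_id)
relaxed = curate_values(values, name_to_id, case_sensitive=False)
```

With `case_sensitive=True` only the exact spelling is mapped; the relaxed call also maps "Bone Marrow Cell".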

df()

Pandas DataFrame of the ontology.

Return type:

DataFrame

Returns:

A Pandas DataFrame of the ontology indexed by the passed reference_id or “ontology_id” if not specified.

Examples

>>> import bionty as bt
>>> bt.Gene().df()

fuzzy_match(string, reference_id, synonyms_field='synonyms', case_sensitive=True, return_ranked_results=False)

Fuzzy matching of a given string using RapidFuzz.

Parameters:
  • string – The input string to match against the reference_id ontology values.

  • reference_id – The BiontyField of the ontology that the input string is matched against.

  • synonyms_field – The field containing synonyms to also match against. If None, synonyms are not matched.

  • case_sensitive – Whether the match is case sensitive.

  • return_ranked_results – Whether to return all entries ranked by matching ratios.

Return type:

str

Returns:

Best match of the input string.

Examples

>>> import bionty as bt
>>> ct = bt.CellType()
>>> ct.fuzzy_match("T cells", ct.name)
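
To illustrate the idea of ranking candidates by a matching ratio, here is a minimal sketch that uses the standard library's `difflib` in place of RapidFuzz. The `rank_matches` helper and the candidate names are hypothetical, and `difflib` scores differ from RapidFuzz scores, so this shows only the principle, not the values bionty returns.

```python
from difflib import SequenceMatcher


def rank_matches(query, candidates, case_sensitive=True):
    """Rank candidates by similarity to query, best first.

    Hypothetical helper: difflib's ratio stands in for a RapidFuzz scorer.
    """

    def score(candidate):
        if case_sensitive:
            return SequenceMatcher(None, query, candidate).ratio()
        return SequenceMatcher(None, query.lower(), candidate.lower()).ratio()

    scored = [(candidate, score(candidate)) for candidate in candidates]
    # Highest ratio first, analogous to return_ranked_results=True.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up candidate names standing in for an ontology's name field.
names = ["B cell", "bone marrow cell", "T cell"]
ranked = rank_matches("T cells", names)
best_match = ranked[0][0]  # analogous to the single best match
```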

inspect(identifiers, reference_id, return_df=False)

Inspect whether a list of identifiers is mappable to the entity reference.

Parameters:
  • identifiers – Identifiers that will be checked against the Ontology.

  • reference_id – The BiontyField of the ontology to compare against. Examples are ‘ontology_id’ to map against the ontology ID or ‘name’ to map against the ontology's name field.

  • return_df – Whether to return a Pandas DataFrame.

Return type:

Union[DataFrame, dict[str, list[str]]]

Returns:

  • A dictionary that maps the input identifiers (keys) to the ontology field (values) if return_df is False.

  • A Pandas DataFrame with the curated index and a boolean __curated__ column that indicates compliance with the default identifier if return_df is True.

Examples

>>> import pandas as pd
>>> import bionty as bt
>>> df = pd.DataFrame(index=["Boettcher cell", "bone marrow cell"])
>>> ct = bt.CellType()
>>> ct.inspect(df.index, reference_id=ct.name)
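
As a rough picture of a `dict[str, list[str]]` result, the sketch below partitions identifiers by whether they occur in a reference field. The `split_by_mappability` helper, the `"mapped"` and `"not_mapped"` keys, and the reference values are assumptions chosen for illustration; the dictionary layout bionty actually returns is not specified here.

```python
def split_by_mappability(identifiers, reference_values):
    """Group identifiers by whether they appear in the reference values.

    Hypothetical helper; the key names are an assumption, not bionty's API.
    """
    reference = set(reference_values)
    result = {"mapped": [], "not_mapped": []}
    for identifier in identifiers:
        key = "mapped" if identifier in reference else "not_mapped"
        result[key].append(identifier)
    return result


# Made-up reference names standing in for an ontology's name field.
reference_names = ["Boettcher cell", "bone marrow cell"]
report = split_by_mappability(
    ["Boettcher cell", "bone marrow cell", "unknown cell"], reference_names
)
```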

lookup(field='name')

Return an auto-complete object for the values of the given field.

Parameters:

field – The field to lookup the values for. Adapt this parameter to, for example, ‘ontology_id’ to lookup by ID. Defaults to ‘name’.

Return type:

tuple

Returns:

A NamedTuple of lookup information for the entity's values.

Examples

>>> import bionty as bt
>>> gene_lookup = bt.Gene().lookup()
>>> gene_lookup.TEF
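
The following sketch shows how a NamedTuple can provide attribute-style, auto-completable access to records, using only the standard library. `build_lookup`, the record layout, and the `ID:…` values are hypothetical; bionty's own lookup object may sanitize names differently.

```python
import re
from collections import namedtuple


def build_lookup(records, field="name"):
    """Expose records as attributes of a namedtuple keyed by `field`.

    Hypothetical helper for illustration; not bionty's implementation.
    """
    # Replace characters that are not valid in Python identifiers.
    keys = [re.sub(r"\W", "_", str(record[field])) for record in records]
    # rename=True swaps any remaining invalid or duplicate names for _0, _1, ...
    Lookup = namedtuple("Lookup", keys, rename=True)
    return Lookup(*records)


# Made-up records standing in for rows of an ontology table.
records = [
    {"name": "T cell", "ontology_id": "ID:0001"},
    {"name": "bone marrow cell", "ontology_id": "ID:0002"},
]

by_name = build_lookup(records)                      # default field='name'
by_id = build_lookup(records, field="ontology_id")   # look up by ID instead
```

Here `by_name.T_cell` returns the first record, and `by_id.ID_0002` returns the second.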

map_synonyms(identifiers, reference_id, *, synonyms_field='synonyms', return_mapper=False)

Maps input identifiers against Ontology synonyms.

Parameters:
  • identifiers – Identifiers that will be mapped against an Ontology field (BiontyField).

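  • reference_id – The BiontyField of the ontology representing the identifiers.

  • synonyms_field – The field of the ontology that contains the synonyms. Defaults to ‘synonyms’.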

  • return_mapper – Whether to return a dictionary of {identifiers : <mapped reference_id values>}.

Return type:

Union[Dict[str, str], List[str]]

Returns:

  • A list of mapped reference_id values if return_mapper is False.

  • A dictionary of mapped values with mappable identifiers as keys and values mapped to reference_id as values if return_mapper is True.

Examples

>>> import pandas as pd
>>> import bionty as bt
>>> gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
>>> gn = bt.Gene(source="ensembl", version="release-108")
>>> mapping = gn.map_synonyms(gene_symbols, gn.symbol)
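
The two return shapes can be sketched with plain dictionaries. This is an illustration under stated assumptions, not bionty's code: `map_to_reference` is a hypothetical helper, the gene symbols are made up, and storing synonyms as a pipe-separated string is an assumption about the table layout.

```python
def map_to_reference(identifiers, records, field="symbol",
                     synonyms_field="synonyms", return_mapper=False):
    """Replace synonyms with the reference value they belong to.

    Hypothetical helper; assumes synonyms are stored as "A|B|C" strings.
    """
    known = {record[field] for record in records}
    synonym_to_value = {}
    for record in records:
        for synonym in filter(None, record[synonyms_field].split("|")):
            # Keep the first reference value seen for a given synonym.
            synonym_to_value.setdefault(synonym, record[field])

    # Only identifiers that are synonyms (and not already valid) get mapped.
    mapper = {
        identifier: synonym_to_value[identifier]
        for identifier in identifiers
        if identifier not in known and identifier in synonym_to_value
    }
    if return_mapper:
        return mapper
    # Unmappable identifiers are passed through unchanged (an assumption).
    return [mapper.get(identifier, identifier) for identifier in identifiers]


# Made-up records standing in for rows of a gene table.
records = [
    {"symbol": "GENE1", "synonyms": "G1|GN1"},
    {"symbol": "GENE2", "synonyms": ""},
]
identifiers = ["GENE1", "G1", "GENE2", "UNKNOWN"]

mapped_list = map_to_reference(identifiers, records)
mapper = map_to_reference(identifiers, records, return_mapper=True)
```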