EDAM: An ontology of bioinformatics operations, types of data and identifiers, topics, and formats
1.2
application/rdf+xml
14:12:2012 12:45
3263
EDAM
EDAM (EMBRACE Data And Methods) is an ontology of bioinformatics operations (tool, application, or workflow functions), types of data, topics (application domains), and data formats. EDAM is applied to organising tools and data, finding suitable tools in catalogues, and integrating them into complex applications or workflows. Semantic annotations with EDAM are applicable to diverse entities such as Web services, databases, programmatic libraries, standalone tools and toolkits, interactive applications, data schemas, data sets, or publications within bioinformatics. Annotation with EDAM may also contribute to data provenance, and EDAM terms and synonyms can be used in text mining. EDAM, and in particular the EDAM Data sub-ontology, also serves as a markup vocabulary for bioinformatics data on the Semantic Web.
EDAM editors: Jon Ison and Matus Kalas. Co-authors: Inge Jonassen, Dan Bolser, Hamish McWilliam, Mahmut Uludag, James Malone, Rodrigo Lopez, Steve Pettifer, and Peter Rice. Funding: No funding exclusively targeted the development of EDAM; contributions came from these projects: EMBRACE (FP6, EU), EMBOSS (BBSRC, UK), eSysbio and FUGE Bioinformatics Platform (both Research Council of Norway). See http://edamontology.org for documentation and licence.
Jon Ison
Matúš Kalaš
EDAM http://edamontology.org/ "EDAM relations and concept properties"
EDAM_data http://edamontology.org/data_ "EDAM types of data"
EDAM_format http://edamontology.org/format_ "EDAM data formats"
EDAM_operation http://edamontology.org/operation_ "EDAM operations"
EDAM_topic http://edamontology.org/topic_ "EDAM topics"
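The ID-space declarations above pair each prefix with a namespace URI. As a minimal sketch of how such a table might be used, the following Python expands a prefixed identifier into a full URI. The function name `expand` and the local ID `0000` are illustrative assumptions, not part of EDAM.

```python
# Prefix-to-namespace table copied from the ID-space declarations above.
EDAM_NAMESPACES = {
    "EDAM": "http://edamontology.org/",
    "EDAM_data": "http://edamontology.org/data_",
    "EDAM_format": "http://edamontology.org/format_",
    "EDAM_operation": "http://edamontology.org/operation_",
    "EDAM_topic": "http://edamontology.org/topic_",
}


def expand(prefixed_id: str) -> str:
    """Turn 'PREFIX:local' into a full URI using the table above."""
    prefix, sep, local = prefixed_id.partition(":")
    if not sep or prefix not in EDAM_NAMESPACES:
        raise ValueError(f"unknown or malformed identifier: {prefixed_id!r}")
    return EDAM_NAMESPACES[prefix] + local


# '0000' is a placeholder local ID, not a claim about a real concept.
print(expand("EDAM_data:0000"))
```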
Jon Ison, Matus Kalas
operations "EDAM operations"
data "EDAM types of data"
topics "EDAM topics"
formats "EDAM data formats"
identifiers "EDAM types of identifiers"
relations "EDAM relations"
concept_properties "EDAM concept properties"
edam "EDAM"
bioinformatics "Bioinformatics"
Singular, bioinformatics-specific operations that are functions of tools, workflows, or scripts, or can be performed manually.
operations "EDAM operations"
Types of data that are relevant in bioinformatics, commonly used as inputs, outputs, or intermediate data of analyses, or provided by databases and portals.
data "EDAM types of data"
Application domains of bioinformatics tools and resources; topics of research, studies, or analyses; approaches, techniques, and paradigms within - or directly related to - bioinformatics.
topics "EDAM topics"
Data formats commonly used in - and specific to - bioinformatics. Many format concepts in EDAM include references to their definition and documentation.
formats "EDAM data formats"
Types of identifiers that identify biological or computational entities; including resource-specific data accessions. Several identifier concepts in EDAM include regular expressions and examples.
identifiers "EDAM types of identifiers"
Types of relations - defined in EDAM - that apply between concepts, entities subject to semantic annotation, and entities and concepts (or possibly even vice versa).
relations "EDAM relations"
Types of concept properties and property modifiers that are defined and used in EDAM in addition to the standard markup from the OBO and OWL formats and the standard Semantic Web vocabularies.
concept_properties "EDAM concept properties"
Created in
concept_properties
Version in which a concept was created.
true
Obsolete since
Version in which a concept was made obsolete.
concept_properties
true
Regular expression
true
'Regular expression' concept property ('regex' metadata tag) specifies the allowed values of types of identifiers (accessions). Applicable to some other types of data, too.
concept_properties
Example
true
'Example' concept property ('example' metadata tag) lists examples of valid values of types of identifiers (accessions). Applicable to some other types of data, too.
concept_properties
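The 'regex' and 'example' properties described above lend themselves to a simple consistency check: every listed example should match the regular expression. The sketch below assumes a hypothetical identifier type whose pattern and examples are invented for illustration. It uses whole-string matching, which is an assumption about how the pattern is meant to be applied.

```python
import re

# Hypothetical identifier type; the pattern and examples are invented
# for illustration and are not taken from EDAM.
identifier_type = {
    "name": "Toy accession",
    "regex": r"[A-Z]{2}[0-9]{4}",
    "examples": ["AB1234", "XY0001"],
}


def is_valid(value: str, concept: dict) -> bool:
    """Return True if the whole value matches the concept's regex."""
    return re.fullmatch(concept["regex"], value) is not None


def failing_examples(concept: dict) -> list:
    """Return the listed examples that do NOT match the regex."""
    return [ex for ex in concept["examples"] if not is_valid(ex, concept)]


# An empty list means the examples and the regex agree.
print(failing_examples(identifier_type))
```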
Documentation
'Documentation' trailing modifier (qualifier, 'documentation') of 'xref' links of 'Format' concepts. When 'true', the link is pointing to a page with explanation, description, documentation, or specification of the given data format.
true
concept_properties
Specification
relations
edam
bioinformatics
has format
'A has_format B' defines for the subject A, that it has the object B as its data format.
Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that is (or is in a role of) 'Data', or an input, output, input or output argument of an 'Operation'. Object B can either be a concept that is a 'Format', or in unexpected cases an entity outside of an ontology that is a 'Format' or is in the role of a 'Format'. In EDAM, 'has_format' is not explicitly defined between EDAM concepts, only the inverse 'is_format_of'.
false
false
false
OBO_REL:is_a
false
'OBI:has_quality' might be seen narrower in the sense that it only relates subjects that are an 'independent_continuant' (snap:IndependentContinuant) with objects that are a 'quality' (snap:Quality), and is broader in the sense that it relates with any qualities of the subject.
relations
edam
bioinformatics
has function
'A has_function B' defines for the subject A, that it has the object B as its function.
Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated). Object B can either be a concept that is (or is in a role of) a function, or an entity outside of an ontology that is (or is in a role of) a function specification. In the scope of EDAM, 'has_function' serves only for relating annotated entities outside of EDAM with 'Operation' concepts.
false
false
false
true
OBO_REL:is_a
OBO_REL:bearer_of
In very unusual cases.
true
'OBI:has_function' only relates subjects that are an 'independent_continuant' (snap:IndependentContinuant), so for example no processes, with objects that are a 'function' (snap:Function). It does not define explicitly that the object is a function of the subject.
Is it defined anywhere? Not in the 'unknown' version of RO. 'OBO_REL:bearer_of' is narrower in the sense that it only relates ontological categories (concepts) that are an 'independent_continuant' (snap:IndependentContinuant) with ontological categories that are a 'specifically_dependent_continuant' (snap:SpecificallyDependentContinuant), and broader in the sense that it relates with any borne objects not just functions of the subject.
OBO_REL:bearer_of
relations
edam
bioinformatics
has identifier
'A has_identifier B' defines for the subject A, that it has the object B as its identifier.
Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated). Object B can either be a concept that is an 'Identifier', or an entity outside of an ontology that is an 'Identifier' or is in the role of an 'Identifier'. In EDAM, 'has_identifier' is not explicitly defined between EDAM concepts, only the inverse 'is_identifier_of'.
false
false
false
false
OBO_REL:is_a
relations
edam
bioinformatics
has input
'A has_input B' defines for the subject A, that it has the object B as a necessary or actual input or input argument.
Subject A can either be a concept that is or has an 'Operation' function, or an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that has an 'Operation' function or is an 'Operation'. Object B can be any concept or entity. In EDAM, only 'has_input' is explicitly defined between EDAM concepts ('Operation' 'has_input' 'Data'). The inverse, 'is_input_of', is not explicitly defined.
false
false
false
OBO_REL:is_a
true
OBO_REL:has_participant
In very unusual cases.
true
'OBI:has_specified_input' only relates subjects that are a 'planned process' (http://purl.obolibrary.org/obo/OBI_0000011) with objects that are a 'continuant' (snap:Continuant).
'OBO_REL:has_participant' is narrower in the sense that it only relates ontological categories (concepts) that are a 'process' (span:Process) with ontological categories that are a 'continuant' (snap:Continuant), and broader in the sense that it relates with any participating objects not just inputs or input arguments of the subject.
OBO_REL:has_participant
relations
edam
bioinformatics
has output
'A has_output B' defines for the subject A, that it has the object B as a necessary or actual output or output argument.
Subject A can either be a concept that is or has an 'Operation' function, or an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that has an 'Operation' function or is an 'Operation'. Object B can be any concept or entity. In EDAM, only 'has_output' is explicitly defined between EDAM concepts ('Operation' 'has_output' 'Data'). The inverse, 'is_output_of', is not explicitly defined.
false
false
false
OBO_REL:is_a
true
OBO_REL:has_participant
In very unusual cases.
true
'OBI:has_specified_output' only relates subjects that are a 'planned process' (http://purl.obolibrary.org/obo/OBI_0000011) with objects that are a 'continuant' (snap:Continuant).
'OBO_REL:has_participant' is narrower in the sense that it only relates ontological categories (concepts) that are a 'process' (span:Process) with ontological categories that are a 'continuant' (snap:Continuant), and broader in the sense that it relates with any participating objects not just outputs or output arguments of the subject. It is also not clear whether an output (result) actually participates in the process that generates it.
OBO_REL:has_participant
relations
edam
bioinformatics
has topic
'A has_topic B' defines for the subject A, that it has the object B as its topic (A is in the scope of a topic B).
Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated). Object B can either be a concept that is a 'Topic', or in unexpected cases an entity outside of an ontology that is a 'Topic' or is in the role of a 'Topic'. In EDAM, only 'has_topic' is explicitly defined between EDAM concepts ('Operation' or 'Data' 'has_topic' 'Topic'). The inverse, 'is_topic_of', is not explicitly defined.
false
false
false
OBO_REL:is_a
true
In very unusual cases.
true
'ao:hasTopic' is narrower in the sense that it only relates subjects that are an annotation, and it is broader in the sense that it relates with any resource.
'OBI:has_quality' might be seen narrower in the sense that it only relates subjects that are an 'independent_continuant' (snap:IndependentContinuant) with objects that are a 'quality' (snap:Quality), and is broader in the sense that it relates with any qualities of the subject.
'is about' is narrower in the sense that it only relates subjects that are information artifacts and the relation is not necessarily the one of having a topic. It is broader in the sense that it relates with any object.
relations
edam
bioinformatics
is format of
'A is_format_of B' defines for the subject A, that it is a data format of the object B.
Subject A can either be a concept that is a 'Format', or in unexpected cases an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that is a 'Format' or is in the role of a 'Format'. Object B can be any concept or entity outside of an ontology that is (or is in a role of) 'Data', or an input, output, input or output argument of an 'Operation'. In EDAM, only 'is_format_of' is explicitly defined between EDAM concepts ('Format' 'is_format_of' 'Data'). The inverse, 'has_format', is not explicitly defined.
false
false
false
false
OBO_REL:is_a
OBO_REL:quality_of
Is it defined anywhere? Not in the 'unknown' version of RO. 'OBO_REL:quality_of' might be seen narrower in the sense that it only relates subjects that are a 'quality' (snap:Quality) with objects that are an 'independent_continuant' (snap:IndependentContinuant), and is broader in the sense that it relates any qualities of the object.
OBO_REL:quality_of
relations
edam
bioinformatics
is function of
'A is_function_of B' defines for the subject A, that it is a function of the object B.
Subject A can either be a concept that is (or is in a role of) a function, or an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that is (or is in a role of) a function specification. Object B can be any concept or entity. Within EDAM itself, 'is_function_of' is not used.
false
false
false
true
OBO_REL:is_a
OBO_REL:function_of
OBO_REL:inheres_in
In very unusual cases.
true
Is it defined anywhere? Not in the 'unknown' version of RO. 'OBO_REL:function_of' only relates subjects that are a 'function' (snap:Function) with objects that are an 'independent_continuant' (snap:IndependentContinuant), so for example no processes. It does not define explicitly that the subject is a function of the object.
OBO_REL:function_of
Is it defined anywhere? Not in the 'unknown' version of RO. 'OBO_REL:inheres_in' is narrower in the sense that it only relates ontological categories (concepts) that are a 'specifically_dependent_continuant' (snap:SpecificallyDependentContinuant) with ontological categories that are an 'independent_continuant' (snap:IndependentContinuant), and broader in the sense that it relates any borne subjects not just functions.
OBO_REL:inheres_in
relations
edam
bioinformatics
is identifier of
'A is_identifier_of B' defines for the subject A, that it is an identifier of the object B.
Subject A can either be a concept that is an 'Identifier', or an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that is an 'Identifier' or is in the role of an 'Identifier'. Object B can be any concept or entity outside of an ontology. In EDAM, only 'is_identifier_of' is explicitly defined between EDAM concepts (only 'Identifier' 'is_identifier_of' 'Data'). The inverse, 'has_identifier', is not explicitly defined.
false
false
false
OBO_REL:is_a
false
relations
edam
bioinformatics
is input of
'A is_input_of B' defines for the subject A, that it is a necessary or actual input or input argument of the object B.
Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated). Object B can either be a concept that is or has an 'Operation' function, or an entity outside of an ontology that has an 'Operation' function or is an 'Operation'. In EDAM, 'is_input_of' is not explicitly defined between EDAM concepts, only the inverse 'has_input'.
false
false
false
OBO_REL:is_a
true
OBO_REL:participates_in
In very unusual cases.
true
'OBI:is_specified_input_of' only relates subjects that are a 'continuant' (snap:Continuant) with objects that are a 'planned process' (http://purl.obolibrary.org/obo/OBI_0000011).
'OBO_REL:participates_in' is narrower in the sense that it only relates ontological categories (concepts) that are a 'continuant' (snap:Continuant) with ontological categories that are a 'process' (span:Process), and broader in the sense that it relates any participating subjects not just inputs or input arguments.
OBO_REL:participates_in
relations
edam
bioinformatics
is output of
'A is_output_of B' defines for the subject A, that it is a necessary or actual output or output argument of the object B.
Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated). Object B can either be a concept that is or has an 'Operation' function, or an entity outside of an ontology that has an 'Operation' function or is an 'Operation'. In EDAM, 'is_output_of' is not explicitly defined between EDAM concepts, only the inverse 'has_output'.
false
false
false
OBO_REL:is_a
true
OBO_REL:participates_in
In very unusual cases.
true
'OBI:is_specified_output_of' only relates subjects that are a 'continuant' (snap:Continuant) with objects that are a 'planned process' (http://purl.obolibrary.org/obo/OBI_0000011).
'OBO_REL:participates_in' is narrower in the sense that it only relates ontological categories (concepts) that are a 'continuant' (snap:Continuant) with ontological categories that are a 'process' (span:Process), and broader in the sense that it relates any participating subjects not just outputs or output arguments. It is also not clear whether an output (result) actually participates in the process that generates it.
OBO_REL:participates_in
relations
edam
bioinformatics
is topic of
'A is_topic_of B' defines for the subject A, that it is a topic of the object B (a topic A is the scope of B).
Subject A can either be a concept that is a 'Topic', or in unexpected cases an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that is a 'Topic' or is in the role of a 'Topic'. Object B can be any concept or entity outside of an ontology. In EDAM, 'is_topic_of' is not explicitly defined between EDAM concepts, only the inverse 'has_topic'.
false
false
false
OBO_REL:is_a
true
OBO_REL:quality_of
In very unusual cases.
true
Is it defined anywhere? Not in the 'unknown' version of RO. 'OBO_REL:quality_of' might be seen narrower in the sense that it only relates subjects that are a 'quality' (snap:Quality) with objects that are an 'independent_continuant' (snap:IndependentContinuant), and is broader in the sense that it relates any qualities of the object.
OBO_REL:quality_of
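The relation descriptions above state that only one direction of each inverse pair is explicitly defined between EDAM concepts ('has_input', 'has_output', 'has_topic', 'is_format_of', 'is_identifier_of'). A consumer that needs the other direction could derive it. The following is a minimal sketch of that idea over plain (subject, relation, object) tuples. The tuple representation and the function name `with_inverses` are assumptions made for illustration, not an EDAM API.

```python
# Explicitly defined relation -> its inverse, as named in the text above.
INVERSE = {
    "has_input": "is_input_of",
    "has_output": "is_output_of",
    "has_topic": "is_topic_of",
    "is_format_of": "has_format",
    "is_identifier_of": "has_identifier",
}


def with_inverses(triples):
    """Return the input triples plus one derived inverse triple for each."""
    result = set(triples)
    for subj, rel, obj in triples:
        inverse = INVERSE.get(rel)
        if inverse is not None:
            result.add((obj, inverse, subj))
    return result


# Concept-level statements quoted in the relation descriptions.
explicit = {
    ("Operation", "has_input", "Data"),
    ("Format", "is_format_of", "Data"),
}
for triple in sorted(with_inverses(explicit)):
    print(triple)
```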
Resource type
true
data
edam
A type of computational resource used in bioinformatics.
data
beta12orEarlier
beta12orEarlier
bioinformatics
beta12orEarlier
data
edam
bioinformatics
data
Data
Information, represented in an information artefact (data record) that is 'understandable' by dedicated computational tools that can use the data as input or produce it as output.
Data record
Datum
Data set
EDAM does not distinguish a data record (a tool-understandable information artefact) from data or datum (its content, the tool-understandable encoding of information).
Data record
EDAM does not distinguish the multiplicity of data, such as one data item (datum) versus a collection of data (data set).
Datum
EDAM does not distinguish the multiplicity of data, such as one data item (datum) versus a collection of data (data set).
Data set
GFO 'Perpetuant' is in general broader than data, but it may be seen narrower in the sense of being a concrete individual.
In theory, data does not need to have a purpose, but in all regular cases it does. Remark: the EDAM Data sub-ontology focuses on scientific data (SIO_000472), in particular bioinformatics (SIO_010065) and biological (SIO_010019) data.
IAO 'data item' is a closely related ontological category (concept) broader in the sense of being any type of data in any role, and narrower in the sense of being limited to a 'generically_dependent_continuant' (snap:GenericallyDependentContinuant), standing in relation of aboutness to some entity (http://purl.obolibrary.org/obo/IAO_0000030), and to data that is intended to be a truthful statement about something.
IAO 'information content entity' is a closely related ontological category (concept) broader in the sense of covering any type of data in any role, and narrower in the sense of being limited to a 'generically_dependent_continuant' (snap:GenericallyDependentContinuant) and standing in relation of aboutness to some entity.
Data, however, does not necessarily contain statements, and is not necessarily about an entity.
Tool
true
edam
beta12orEarlier
beta12orEarlier
A bioinformatics package or tool, e.g. a standalone application or web service.
data
bioinformatics
data
Database
true
beta12orEarlier
data
beta12orEarlier
bioinformatics
A digital data archive typically based around a relational model but sometimes using an object-oriented, tree or graph-based model.
edam
data
Ontology
An ontology of biological or bioinformatics concepts and relations, a controlled vocabulary, structured glossary etc.
bioinformatics
edam
data
beta12orEarlier
data
Directory metadata
bioinformatics
data
edam
beta12orEarlier
A directory on disk from which files are read.
data
MeSH vocabulary
true
data
beta12orEarlier
Controlled vocabulary from National Library of Medicine. The MeSH thesaurus is used to index articles in biomedical journals for the Medline/PubMED databases.
data
bioinformatics
beta12orEarlier
edam
HGNC vocabulary
true
bioinformatics
data
Controlled vocabulary for gene names (symbols) from HUGO Gene Nomenclature Committee.
beta12orEarlier
beta12orEarlier
edam
data
UMLS vocabulary
true
edam
bioinformatics
data
beta12orEarlier
beta12orEarlier
data
Compendium of controlled vocabularies for the biomedical domain (Unified Medical Language System).
beta12orEarlier
data
identifiers
bioinformatics
edam
identifier
Identifier
A text token, number or something else which identifies an entity, but which may not be persistent (stable) or unique (the same identifier may identify multiple things).
ID
Almost exact but limited to identifying resources.
Database entry
true
data
An entry (retrievable via URL) from a biological database.
bioinformatics
beta12orEarlier
beta12orEarlier
data
edam
Molecular mass
data
bioinformatics
edam
data
beta12orEarlier
Mass of a molecule.
Molecular charge
PDBML:pdbx_formal_charge
bioinformatics
data
beta12orEarlier
Net charge of a molecule.
edam
data
Chemical formula
beta12orEarlier
A specification of a chemical structure.
bioinformatics
data
Chemical structure specification
edam
data
QSAR descriptor
QSAR descriptors have numeric values that quantify chemical information encoded in a symbolic representation of a molecule. They are used in quantitative structure activity relationship (QSAR) applications. Many subtypes of individual descriptors (not included in EDAM) cover various types of protein properties.
edam
beta12orEarlier
data
A QSAR quantitative descriptor (name-value pair) of chemical structure.
data
bioinformatics
Raw sequence
Non-sequence characters may be used for example for gaps and translation stop.
data
edam
A raw molecular sequence (string of characters) which might include ambiguity, unknown positions and non-sequence characters.
data
bioinformatics
beta12orEarlier
Sequence record
data
data
A molecular sequence and associated metadata.
edam
beta12orEarlier
bioinformatics
Sequence set
data
bioinformatics
data
SO:0001260
beta12orEarlier
This concept may be used for arbitrary sequence sets and associated data arising from processing.
edam
A collection of multiple molecular sequences and associated metadata that do not (typically) correspond to molecular sequence database records or entries and which (typically) are derived from some analytical method.
Sequence mask character
A character used to replace (mask) other characters in a molecular sequence.
data
bioinformatics
edam
beta12orEarlier
data
Sequence mask type
bioinformatics
data
beta12orEarlier
Sequence masking is where specific characters or positions in a molecular sequence are masked (replaced) with another character (the mask character). The mask type indicates what is masked, for example regions that are not of interest or which are information-poor including acidic protein regions, basic protein regions, proline-rich regions, low compositional complexity regions, short-periodicity internal repeats, simple repeats and low complexity regions. Masked sequences are used in database search to eliminate statistically significant but biologically uninteresting hits.
A label (text token) describing the type of sequence masking to perform.
edam
data
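To make the masking idea above concrete, here is a minimal sketch that replaces given regions of a sequence with a mask character. How the regions are found (low complexity, repeats, and so on) is outside the sketch. The function name, the 0-based end-exclusive coordinates, and the default mask character 'X' are assumptions for illustration.

```python
def mask_regions(sequence: str, regions, mask_char: str = "X") -> str:
    """Replace every position inside the given regions with mask_char.

    regions is an iterable of (start, end) pairs, 0-based and
    end-exclusive (an assumption of this sketch).
    """
    if len(mask_char) != 1:
        raise ValueError("mask_char must be a single character")
    chars = list(sequence)
    for start, end in regions:
        # Clamp to the sequence so out-of-range regions are ignored.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)


# Mask positions 2, 3 and 4 of a made-up protein sequence.
print(mask_regions("MKTAYIAKQR", [(2, 5)]))
```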
DNA sense specification
bioinformatics
data
beta12orEarlier
edam
DNA strand specification
data
The forward or 'top' strand might specify that a sequence is to be used as given, whereas the reverse or 'bottom' strand specifies that the reverse complement of the sequence is to be used.
The strand of a DNA sequence (forward or reverse).
Strand
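A small sketch of how a strand specification might be applied to a DNA sequence: 'forward' returns the sequence as given, 'reverse' returns its reverse complement. The strand labels and the function name are assumptions for illustration. Characters other than A, C, G and T (for example ambiguity codes) are left unchanged here.

```python
# Complement table for unambiguous DNA bases, upper and lower case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def apply_strand(sequence: str, strand: str) -> str:
    """Return the sequence as given, or its reverse complement."""
    if strand == "forward":
        return sequence
    if strand == "reverse":
        # Complement each base, then reverse the order.
        return sequence.translate(COMPLEMENT)[::-1]
    raise ValueError(f"unknown strand: {strand!r}")


print(apply_strand("ATGC", "forward"))
print(apply_strand("ATGC", "reverse"))
```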
Sequence length specification
bioinformatics
data
beta12orEarlier
A specification of sequence length(s).
data
edam
Sequence metadata
This is used for such things as a report including the sequence identifier, type and length.
Basic or general information concerning molecular sequences.
beta12orEarlier
data
edam
data
bioinformatics
Sequence feature source
data
beta12orEarlier
How the annotation of a sequence feature (for example in EMBL or Swiss-Prot) was derived.
edam
This might be the name and version of a software tool, the name of a database, or 'curated' to indicate a manual annotation (made by a human).
data
bioinformatics
Database hits (sequence)
data
A report of sequence hits and associated data from searching a sequence database (for example a BLAST search). This will typically include a list of scores (often with statistical evaluation) and a set of alignments for the hits.
The score list includes the alignment score, percentage of the query sequence matched, length of the database sequence entry in this alignment, identifier of the database sequence entry, excerpt of the database sequence entry description etc.
edam
data
bioinformatics
beta12orEarlier
Database hits (secondary)
bioinformatics
A report of hits from a search of a protein secondary or domain database.
data
data
edam
beta12orEarlier
Methods might use fingerprints, motifs, profiles, hidden Markov models, sequence alignment etc to provide a mapping of a query protein sequence to a secondary database (Prosite, Blocks, ProDom, Prints, Pfam etc.). In this way the query is classified as a member of a known protein family or group. See concepts under 'Protein features'.
Sequence signature model
true
data
bioinformatics
data
edam
beta12orEarlier
beta12orEarlier
Data files used by motif or profile methods.
Sequence signature
true
data
edam
beta12orEarlier
A classifier of sequences such as a sequence motif, profile or other diagnostic element.
beta12orEarlier
data
bioinformatics
Sequence alignment (words)
Alignment of exact matches between subsequences (words) within two or more molecular sequences.
data
Sequence word alignment
data
beta12orEarlier
edam
bioinformatics
Dotplot
data
beta12orEarlier
edam
bioinformatics
A dotplot of sequence similarities identified from word-matching or character comparison.
data
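The word-matching approach behind a dotplot can be sketched as follows: every pair of positions (i, j) where a word of fixed length starts in both sequences becomes a dot. This is a minimal illustration under stated assumptions (exact matches only, 0-based positions, a hypothetical function name), not a description of any particular tool.

```python
def dotplot(seq_a: str, seq_b: str, word_size: int = 2) -> set:
    """Return the set of (i, j) where seq_a[i:] and seq_b[j:] share a word."""
    # Index every word of seq_b by its start positions.
    index = {}
    for j in range(len(seq_b) - word_size + 1):
        index.setdefault(seq_b[j:j + word_size], []).append(j)
    # Look up every word of seq_a in that index.
    dots = set()
    for i in range(len(seq_a) - word_size + 1):
        for j in index.get(seq_a[i:i + word_size], []):
            dots.add((i, j))
    return dots


# A diagonal run of dots indicates a shared subsequence.
print(sorted(dotplot("GATTACA", "ATTAC", word_size=3)))
```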
Sequence alignment
Alignment of multiple molecular sequences.
beta12orEarlier
data
edam
bioinformatics
data
Sequence alignment parameter
beta12orEarlier
bioinformatics
edam
data
data
Some simple value controlling a sequence alignment (or similar 'match') operation.
Sequence similarity score
edam
data
beta12orEarlier
bioinformatics
data
A value representing molecular sequence similarity.
Sequence alignment metadata
bioinformatics
edam
data
data
beta12orEarlier
Report of general information on a sequence alignment, typically including a description, sequence identifiers and alignment score.
Sequence alignment report
beta12orEarlier
edam
data
An informative report of molecular sequence alignment-derived data or metadata.
bioinformatics
This is a broad data type and is used as a placeholder for other, more specific types.
data
Sequence profile alignment
bioinformatics
data
data
A profile-profile alignment (each profile typically representing a sequence alignment).
beta12orEarlier
edam
Sequence-profile alignment
Data associated with the alignment might also be included, e.g. ranked list of best-scoring sequences and a graphical representation of scores.
bioinformatics
data
edam
data
Alignment of one or more molecular sequence(s) to one or more sequence profile(s) (each profile typically representing a sequence alignment).
beta12orEarlier
Sequence distance matrix
Methods might perform character compatibility analysis or identify patterns of similarity in an alignment or data matrix.
data
A matrix of estimated evolutionary distance between molecular sequences, such as is suitable for phylogenetic tree calculation.
bioinformatics
edam
Phylogenetic distance matrix
data
Moby:phylogenetic_distance_matrix
beta12orEarlier
Phylogenetic character data
data
beta12orEarlier
edam
bioinformatics
data
Basic character data from which a phylogenetic tree may be generated.
As defined, this concept would also include molecular sequences, microsatellites, polymorphisms (RAPDs, RFLPs, or AFLPs), restriction sites and fragments.
Phylogenetic tree
bioinformatics
Moby:myTree
beta12orEarlier
data
The raw data (not just an image) from which a phylogenetic tree is directly generated or plotted, such as topology, lengths (in time or in expected amounts of variance) and a confidence interval for each length.
Moby:phylogenetic_tree
data
A phylogenetic tree is usually constructed from a set of sequences from which an alignment (or data matrix) is calculated. See also 'Phylogenetic tree image'.
Moby:Tree
Phylogeny
edam
Comparison matrix
data
bioinformatics
Substitution matrix
edam
Matrix of integer or floating point numbers for amino acid or nucleotide sequence comparison.
beta12orEarlier
data
The comparison matrix might include matrix name, optional comment, height and width (or size) of matrix, an index row/column (of characters) and data rows/columns (of integers or floats).
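The components listed above (name, optional comment, size, an index of characters, and rows of numbers) map naturally onto a small record type. The sketch below is one possible in-memory layout, with a toy nucleotide matrix whose scores are invented for illustration. The class and field names are assumptions, not an EDAM or file-format definition.

```python
from dataclasses import dataclass


@dataclass
class ComparisonMatrix:
    name: str
    index: str          # characters labelling both rows and columns
    rows: list          # one list of numbers per index character
    comment: str = ""   # optional free-text comment

    @property
    def size(self) -> tuple:
        """Height and width of the matrix."""
        return (len(self.rows), len(self.rows[0]) if self.rows else 0)

    def score(self, a: str, b: str):
        """Look up the comparison score for characters a and b."""
        return self.rows[self.index.index(a)][self.index.index(b)]


# Toy match/mismatch matrix; the values are hypothetical.
toy = ComparisonMatrix(
    name="toy_identity",
    index="ACGT",
    rows=[[1, -1, -1, -1],
          [-1, 1, -1, -1],
          [-1, -1, 1, -1],
          [-1, -1, -1, 1]],
    comment="hypothetical scores for illustration",
)
print(toy.size, toy.score("A", "A"), toy.score("A", "G"))
```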
Protein topology
true
edam
beta12orEarlier
bioinformatics
beta12orEarlier
data
The location and size of the secondary structure elements and intervening loop regions are usually indicated.
data
Predicted or actual protein topology represented as a string of protein secondary structure elements.
Protein features (secondary structure)
edam
beta12orEarlier
data
bioinformatics
data
The location and size of the secondary structure elements and intervening loop regions are typically given. The report can include disulphide bonds and post-translationally formed peptide bonds (crosslinks).
Protein secondary structure
Secondary structure assignment (predicted or real) of a protein.
Protein features (super-secondary)
data
bioinformatics
edam
A report of predicted or actual super-secondary structure of protein sequence(s).
data
Protein structure report (super-secondary)
beta12orEarlier
Super-secondary structures include leucine zippers, coiled coils, Helix-Turn-Helix etc.
Secondary structure alignment (protein)
beta12orEarlier
Alignment of the (1D representations of) secondary structure of two or more proteins.
data
bioinformatics
edam
data
Secondary structure alignment metadata (protein)
true
beta12orEarlier
edam
beta12orEarlier
data
An informative report on protein secondary structure alignment-derived data or metadata.
bioinformatics
data
RNA secondary structure record
beta12orEarlier
bioinformatics
data
edam
Moby:RNAStructML
This includes thermodynamically stable or evolutionarily conserved structures such as knots, pseudoknots etc.
data
An informative report of secondary structure (predicted or real) of an RNA molecule.
Secondary structure alignment (RNA)
edam
beta12orEarlier
data
Alignment of the (1D representations of) secondary structure of two or more RNA molecules.
bioinformatics
data
Moby:RNAStructAlignmentML
Secondary structure alignment metadata (RNA)
true
beta12orEarlier
data
data
edam
An informative report of RNA secondary structure alignment-derived data or metadata.
beta12orEarlier
bioinformatics
Structure
3D coordinate and associated data for a macromolecular tertiary (3D) structure or part of a structure.
Structure data
data
The coordinate data may be predicted or real.
bioinformatics
data
beta12orEarlier
edam
Tertiary structure record
true
An entry from a molecular tertiary (3D) structure database.
beta12orEarlier
beta12orEarlier
bioinformatics
data
data
edam
Database hits (structure)
This includes alignment and score data.
edam
beta12orEarlier
Results (hits) from searching a database of tertiary structure.
data
data
bioinformatics
Structure alignment
edam
data
bioinformatics
Alignment (superimposition) of molecular tertiary (3D) structures.
data
beta12orEarlier
A tertiary structure alignment will include the untransformed coordinates of one macromolecule, followed by the second (or subsequent) structure(s) with all the coordinates transformed (by rotation / translation) to give a superposition.
Structure alignment report
bioinformatics
data
An informative report on molecular tertiary structure alignment-derived data.
edam
data
This is a broad data type and is used as a placeholder for other, more specific types.
beta12orEarlier
Structure similarity score
data
bioinformatics
edam
A value representing molecular structure similarity, measured from structure alignment or some other type of structure comparison.
data
beta12orEarlier
Structural (3D) profile
bioinformatics
edam
data
data
Some type of structural (3D) profile or template (representing a structure or structure alignment).
beta12orEarlier
3D profile
Structural (3D) profile alignment
A 3D profile-3D profile alignment (each profile representing structures or a structure alignment).
data
data
edam
bioinformatics
Structural profile alignment
beta12orEarlier
Sequence-3D profile alignment
beta12orEarlier
data
bioinformatics
data
An alignment of a sequence to a 3D profile (representing structures or a structure alignment).
edam
Sequence-structural profile alignment
Protein sequence-structure scoring matrix
edam
Matrix of values used for scoring sequence-structure compatibility.
data
beta12orEarlier
bioinformatics
data
Sequence-structure alignment
data
An alignment of molecular sequence to structure (from threading sequence(s) through 3D structure or representation of structure(s)).
data
beta12orEarlier
bioinformatics
edam
Amino acid annotation
beta12orEarlier
data
edam
bioinformatics
An informative report about a specific amino acid.
data
Peptide annotation
beta12orEarlier
data
An informative report about a specific peptide.
bioinformatics
edam
data
Protein report
bioinformatics
edam
data
Gene product annotation
beta12orEarlier
An informative report about one or more specific protein molecules or protein structural domains, derived from analysis of primary (sequence or structural) data.
data
Protein property
data
edam
A report of primarily non-positional data describing intrinsic physical, chemical or other properties of a protein molecule or model.
bioinformatics
The report may be based on analysis of nucleic acid sequence or structural data. This is a broad data type and is used as a placeholder for other, more specific types.
data
beta12orEarlier
Protein physicochemical property
Protein features (3D motif)
beta12orEarlier
data
Protein structure report (3D motif)
edam
data
An informative report on the 3D structural motifs in a protein.
This might include conformation of conserved substructures, conserved geometry (spatial arrangement) of secondary structure or protein backbone, role and functions etc.
bioinformatics
Protein domain classification
bioinformatics
beta12orEarlier
data
Data concerning the classification of the sequences and/or structures of protein structural domain(s).
edam
data
Protein features (domains)
edam
beta12orEarlier
The report will typically include a graphic of the location of domains in a sequence, with associated data such as lists of related sequences, literature references, etc.
Protein structural domains
Protein domain assignment
Summary of structural domains or 3D folds in a protein or polypeptide chain.
bioinformatics
data
data
Protein architecture report
beta12orEarlier
bioinformatics
edam
data
data
Protein structure report (architecture)
Protein property (architecture)
An informative report on architecture (spatial arrangement of secondary structure) of a protein structure.
Protein folding report
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
Protein report (folding)
data
data
edam
bioinformatics
Protein property (folding)
A report on an analysis or model of protein folding properties, folding pathways, residues or sites that are key to protein folding, nucleation or stabilization centers etc.
beta12orEarlier
Protein features (mutation)
true
Protein report (mutation)
beta13
Protein structure report (mutation)
data
data
Protein property (mutation)
beta12orEarlier
bioinformatics
Data on the effect of (typically point) mutation on protein folding, stability, structure and function.
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
edam
Protein interaction raw data
data
beta12orEarlier
edam
Protein-protein interaction data from, for example, yeast two-hybrid analysis, protein microarrays, immunoaffinity chromatography followed by mass spectrometry, phage display etc.
bioinformatics
data
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
Protein interaction
bioinformatics
data
data
Protein report (interaction)
beta12orEarlier
An informative report on the interactions (predicted or known) of a protein, protein domain or part of a protein with itself or some other molecule(s), which might be another protein, nucleic acid or some other ligand.
edam
Protein family
edam
Protein family annotation
bioinformatics
data
beta12orEarlier
data
An informative report on a specific protein family or other group of classified proteins.
Vmax
The maximum initial velocity or rate of a reaction. It is the limiting velocity as substrate concentrations get very large.
data
data
edam
beta12orEarlier
bioinformatics
Km
data
beta12orEarlier
data
Km is the concentration (usually in Molar units) of substrate that leads to half-maximal velocity of an enzyme-catalysed reaction.
edam
bioinformatics
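The Vmax and Km definitions above fit the standard Michaelis-Menten rate law, v = Vmax * [S] / (Km + [S]). The ontology itself does not name this equation, so the following is only a minimal sketch under that assumption; the function name and argument names are illustrative.

```python
def michaelis_menten_rate(substrate: float, vmax: float, km: float) -> float:
    """Initial reaction rate under the Michaelis-Menten rate law (assumed model).

    substrate and km must share the same concentration unit (e.g. molar);
    the result has the same unit as vmax.
    """
    if substrate < 0 or vmax < 0 or km <= 0:
        raise ValueError("substrate and vmax must be >= 0 and km must be > 0")
    return vmax * substrate / (km + substrate)


# At [S] == Km the rate is half of Vmax, matching the Km definition above.
half_max = michaelis_menten_rate(substrate=2.0, vmax=10.0, km=2.0)
```

As the substrate concentration grows very large, the returned rate approaches `vmax`, which is the limiting behaviour described in the Vmax definition.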
Nucleotide base annotation
bioinformatics
An informative report about a specific nucleotide base.
data
edam
beta12orEarlier
data
Nucleic acid property
Nucleic acid physicochemical property
data
bioinformatics
data
beta12orEarlier
The report may be based on analysis of nucleic acid sequence or structural data. This is a broad data type and is used as a placeholder for other, more specific types.
A report of primarily non-positional data describing intrinsic physical, chemical or other properties of a nucleic acid molecule.
edam
Codon usage report
This is a broad data type and is used as a placeholder for other, more specific types.
data
bioinformatics
data
Data derived from analysis of codon usage (typically a codon usage table) of DNA sequences.
edam
beta12orEarlier
Gene annotation
data
bioinformatics
Gene annotation (functional)
data
Moby:GeneInfo
Moby_namespace:Human_Readable_Description
beta12orEarlier
edam
This might include the gene name, description, summary and so on. It can include details about the function of a gene, such as its encoded protein or a functional classification of the gene sequence according to the encoded protein(s).
Moby:gene
Gene report
An informative report on a particular locus, gene, gene system or groups of genes.
Gene classification
true
beta12orEarlier
edam
A report on the classification of nucleic acid / gene sequences according to the functional classification of their gene products.
bioinformatics
data
beta12orEarlier
data
Nucleic acid features (variation)
beta12orEarlier
data
A report on stable, naturally occurring mutations in a nucleotide sequence, including alleles, single-base substitutions, deletions and insertions, RFLPs and other polymorphisms.
bioinformatics
Sequence variation annotation
data
edam
SO:0001059
Gene annotation (chromosome)
This includes basic information, e.g. chromosome number, length, karyotype features, chromosome sequence etc.
data
data
An informative report on a specific chromosome.
edam
beta12orEarlier
bioinformatics
Genotype/phenotype annotation
beta12orEarlier
data
data
edam
bioinformatics
An informative report on the set of genes (or allelic forms) present in an individual, organism or cell and associated with a specific physical characteristic, or a report concerning an organism's traits and phenotypes.
Nucleic acid features (primers)
edam
data
beta12orEarlier
bioinformatics
data
Report on matches to PCR primers and hybridization oligos in a nucleic acid sequence.
Experiment annotation (PCR assay data)
data
PCR assay data
beta12orEarlier
bioinformatics
Data on a PCR assay or electronic / virtual PCR.
edam
data
Sequence trace
edam
beta12orEarlier
data
bioinformatics
This is the raw data produced by a DNA sequencing machine.
Fluorescence trace data generated by an automated DNA sequencer, which can be interpreted as a molecular sequence (reads), given associated sequencing metadata such as base-call quality scores.
data
Sequence assembly
data
SO:0000353
SO:0001248
beta12orEarlier
An assembly of fragments of a (typically genomic) DNA sequence.
edam
data
Typically, an assembly is a collection of contigs (for example ESTs and genomic DNA fragments) that are ordered, aligned and merged. Annotation of the assembled sequence might be included.
bioinformatics
Perhaps surprisingly, the definition of 'SO:assembly' is narrower than that of 'SO:sequence_assembly'.
SO:0001248
Radiation Hybrid (RH) scores
bioinformatics
data
edam
Radiation hybrid (RH) scores for one or more markers.
data
Radiation Hybrid (RH) scores are used in Radiation Hybrid mapping.
beta12orEarlier
Gene annotation (linkage)
An informative report on the linkage of alleles.
data
data
beta12orEarlier
edam
bioinformatics
Gene expression profile
Gene expression pattern
bioinformatics
Data quantifying the level of expression of (typically) multiple genes, derived for example from microarray experiments.
beta12orEarlier
edam
data
data
Experiment annotation (microarray)
beta12orEarlier
This might specify which raw data file relates to which sample and information on hybridisations, e.g. which are technical and which are biological replicates.
data
bioinformatics
edam
Information on a microarray experiment such as conditions, protocol, sample:data relationships etc.
data
Experimental design annotation
Oligonucleotide probe data
true
Data on oligonucleotide probes (typically for use with DNA microarrays).
bioinformatics
data
beta13
edam
data
beta12orEarlier
SAGE experimental data
true
edam
data
beta12orEarlier
data
beta12orEarlier
Serial analysis of gene expression (SAGE) experimental data
bioinformatics
Output from a serial analysis of gene expression (SAGE) experiment.
MPSS experimental data
true
data
edam
beta12orEarlier
beta12orEarlier
Massively parallel signature sequencing (MPSS) experimental data
Massively parallel signature sequencing (MPSS) data.
bioinformatics
data
SBS experimental data
true
data
bioinformatics
edam
beta12orEarlier
Sequencing by synthesis (SBS) experimental data
Sequencing by synthesis (SBS) data.
data
beta12orEarlier
Sequence tag profile (with gene assignment)
Tag to gene assignments (tag mapping) of SAGE, MPSS and SBS data. Typically this is the sequencing-based expression profile annotated with gene identifiers.
bioinformatics
data
beta12orEarlier
edam
data
Protein X-ray crystallographic data
beta12orEarlier
data
X-ray crystallography data.
bioinformatics
edam
data
Protein NMR data
bioinformatics
beta12orEarlier
data
edam
data
Protein nuclear magnetic resonance (NMR) raw data.
Protein circular dichroism (CD) spectroscopic data
data
data
Protein secondary structure from protein coordinate or circular dichroism (CD) spectroscopic data.
bioinformatics
beta12orEarlier
edam
Electron microscopy volume map
edam
EM volume map
Volume map data from electron microscopy.
bioinformatics
data
beta12orEarlier
data
Electron microscopy model
This might include the location in the model of the known features of a particular macromolecule.
Annotation on a structural 3D model (volume map) from electron microscopy.
data
edam
beta12orEarlier
bioinformatics
data
2D PAGE image
beta12orEarlier
Two-dimensional gel electrophoresis image.
edam
data
data
Two-dimensional gel electrophoresis image
bioinformatics
Mass spectrometry spectra
data
Spectra from mass spectrometry.
beta12orEarlier
data
edam
bioinformatics
Peptide mass fingerprint
edam
beta12orEarlier
A set of peptide masses (peptide mass fingerprint) from mass spectrometry.
bioinformatics
data
data
Protein fingerprint
Peak list
Peptide identification
beta12orEarlier
bioinformatics
data
Protein or peptide identifications with evidence supporting the identifications, typically from comparing a peptide mass fingerprint (from mass spectrometry) to a sequence database.
edam
data
Pathway or network annotation
true
beta12orEarlier
bioinformatics
data
An informative report about a specific biological pathway or network, typically including a map (diagram) of the pathway.
data
edam
beta12orEarlier
Biological pathway map
true
beta12orEarlier
data
A map (typically a diagram) of a biological pathway.
data
bioinformatics
edam
beta12orEarlier
Data resource definition
edam
data
A definition of a data resource serving one or more types of data, including metadata and links to the resource or data proper.
bioinformatics
data
beta12orEarlier
Workflow metadata
beta12orEarlier
edam
data
bioinformatics
Basic information, annotation or documentation concerning a workflow (but not the workflow itself).
data
Biological model
edam
data
bioinformatics
beta12orEarlier
A biological model which can be represented in mathematical terms.
data
Statistical estimate score
data
bioinformatics
A value representing estimated statistical significance of some observed data; typically sequence database hits.
beta12orEarlier
data
edam
EMBOSS database resource definition
data
edam
beta12orEarlier
Resource definition for an EMBOSS database.
bioinformatics
data
Version information
beta12orEarlier
Information on a version of software or data, for example name, version number and release date.
Development status / maturity may be part of the version information, for example in case of tools, standards, or some data records.
bioinformatics
data
data
edam
Database cross-mapping
The cross-mapping is typically a table where each row is an accession number and each column is a database being cross-referenced. The cells give the accession number or identifier of the corresponding entry in a database. If a cell in the table is not filled then no mapping could be found for the database. Additional information might be given on version, date etc.
A mapping of the accession numbers (or other database identifier) of entries between (typically) two biological or biomedical databases.
data
edam
data
bioinformatics
beta12orEarlier
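The table layout described for a database cross-mapping (one row per accession, one column per database, empty cells where no mapping was found) could be held in memory roughly as follows. This is a minimal sketch; the accessions and database names are hypothetical placeholders, not real identifiers.

```python
from typing import Dict, Optional

# Row key: accession number. Column key: cross-referenced database.
# All accessions and database names below are hypothetical.
CrossMapping = Dict[str, Dict[str, Optional[str]]]

cross_mapping: CrossMapping = {
    "ACC0001": {"db_a": "A-17", "db_b": "B-204"},
    "ACC0002": {"db_a": "A-18", "db_b": None},  # None marks an unfilled cell
}


def lookup(mapping: CrossMapping, accession: str, database: str) -> Optional[str]:
    """Return the mapped identifier, or None if the cell is empty or absent."""
    return mapping.get(accession, {}).get(database)
```

Using `None` for an unfilled cell keeps "no mapping could be found" distinct from an empty-string identifier.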
Data index
beta12orEarlier
bioinformatics
An index of data of biological relevance.
data
edam
data
Data index report
beta12orEarlier
data
data
edam
bioinformatics
Database index annotation
A report of an analysis of an index of biological data.
Database metadata
bioinformatics
edam
data
data
beta12orEarlier
Basic information on bioinformatics database(s) or other data sources such as name, type, description, URL etc.
Tool metadata
data
edam
data
Basic information about one or more bioinformatics applications or packages, such as name, type, description, or other documentation.
beta12orEarlier
bioinformatics
Job metadata
data
Textual metadata on a submitted or completed job.
edam
Moby:PDGJOB
data
bioinformatics
beta12orEarlier
User metadata
data
edam
data
Textual metadata on a software author or end-user, for example a person or other software.
bioinformatics
beta12orEarlier
Small molecule annotation
data
data
An informative report on a specific chemical compound.
Small molecule report
bioinformatics
Chemical compound annotation
beta12orEarlier
edam
Cell line annotation
beta12orEarlier
Report on a particular strain of organism cell line, including plants, viruses, fungi and bacteria. The data typically includes strain number, organism type, growth conditions, source and so on.
bioinformatics
data
data
edam
Organism strain data
Scent annotation
bioinformatics
data
An informative report about a specific scent.
edam
beta12orEarlier
data
Ontology term
bioinformatics
A term (name) from an ontology.
edam
data
beta12orEarlier
data
Ontology concept metadata
bioinformatics
data
Data concerning or derived from a concept from a biological ontology.
data
beta12orEarlier
edam
Keyword
Moby:Global_Keyword
Moby:Wildcard_Query
Moby:BooleanQueryString
data
beta12orEarlier
data
Moby:QueryString
bioinformatics
edam
Keyword(s) or phrase(s) used (typically) for text-searching purposes.
Boolean operators (AND, OR and NOT) and wildcard characters may be allowed.
Bibliographic reference
beta12orEarlier
Moby:Publication
edam
Citation
Reference
data
A bibliographic reference might include information such as authors, title, journal name, date and (possibly) a link to the abstract or full-text of the article if available.
Bibliographic data that uniquely identifies a scientific article, book or other published material.
bioinformatics
data
Moby:GCP_SimpleCitation
Article
data
A body of scientific text, typically a full text article from a scientific journal.
beta12orEarlier
edam
data
bioinformatics
Text mining report
A text mining abstract will typically include an annotated list of words or sentences extracted from one or more scientific articles.
beta12orEarlier
data
data
bioinformatics
An abstract of the results of text mining.
edam
Entity identifier
true
An identifier of a biological entity or phenomenon.
identifier
beta12orEarlier
beta12orEarlier
data
bioinformatics
edam
identifiers
Data resource identifier
true
beta12orEarlier
An identifier of a data resource.
edam
identifier
identifiers
beta12orEarlier
bioinformatics
data
Identifier (typed)
identifiers
edam
data
bioinformatics
An identifier that identifies a particular type of data.
This concept exists only to assist EDAM maintenance and navigation in graphical browsers. It does not add semantic information. This branch provides an alternative organisation of the concepts nested under 'Accession' and 'Name'. All concepts under here are already included under 'Accession' or 'Name'.
identifier
beta12orEarlier
Tool identifier
data
beta12orEarlier
An identifier of a bioinformatics tool, e.g. an application or web service.
bioinformatics
identifier
identifiers
edam
Discrete entity identifier
true
identifier
identifiers
bioinformatics
data
beta12orEarlier
Name or other identifier of a discrete entity (any biological thing with a distinct, discrete physical existence).
beta12orEarlier
edam
Entity feature identifier
true
identifiers
identifier
data
edam
beta12orEarlier
Name or other identifier of an entity feature (a physical part or region of a discrete biological entity, or a feature that can be mapped to such a thing).
beta12orEarlier
bioinformatics
Entity collection identifier
true
edam
Name or other identifier of a collection of discrete biological entities.
data
identifiers
identifier
bioinformatics
beta12orEarlier
beta12orEarlier
Phenomenon identifier
true
data
identifiers
edam
beta12orEarlier
Name or other identifier of a physical, observable biological occurrence or event.
beta12orEarlier
bioinformatics
identifier
Molecule identifier
identifier
edam
identifiers
beta12orEarlier
data
bioinformatics
Name or other identifier of a molecule.
Atom identifier
beta12orEarlier
data
Identifier (e.g. character symbol) of a specific atom.
identifier
bioinformatics
identifiers
edam
Molecule name
identifiers
edam
beta12orEarlier
identifier
bioinformatics
data
Name of a specific molecule.
Molecule type
Protein|DNA|RNA
bioinformatics
beta12orEarlier
A label (text token) describing the type of a molecule.
data
edam
data
For example, 'Protein', 'DNA', 'RNA' etc.
Chemical identifier
true
bioinformatics
edam
beta12orEarlier
beta12orEarlier
Unique identifier of a chemical compound.
identifiers
identifier
data
Chromosome name
identifier
beta12orEarlier
edam
Name of a chromosome.
identifiers
bioinformatics
data
Peptide identifier
beta12orEarlier
identifier
Identifier of a peptide chain.
data
bioinformatics
identifiers
edam
Protein identifier
identifiers
bioinformatics
edam
identifier
data
Identifier of a protein.
beta12orEarlier
Compound name
identifier
beta12orEarlier
bioinformatics
edam
identifiers
Chemical name
Unique name of a chemical compound.
data
Chemical registry number
identifiers
edam
identifier
Unique registry number of a chemical compound.
bioinformatics
beta12orEarlier
data
Ligand identifier
true
beta12orEarlier
Code word for a ligand, for example from a PDB file.
identifier
edam
data
identifiers
beta12orEarlier
bioinformatics
Drug identifier
edam
data
Identifier of a drug.
identifier
beta12orEarlier
bioinformatics
identifiers
Amino acid identifier
bioinformatics
identifier
Residue identifier
beta12orEarlier
Identifier of an amino acid.
data
identifiers
edam
Nucleotide identifier
identifier
Name or other identifier of a nucleotide.
bioinformatics
data
beta12orEarlier
identifiers
edam
Monosaccharide identifier
Identifier of a monosaccharide.
bioinformatics
beta12orEarlier
identifiers
edam
data
identifier
Chemical name (ChEBI)
This is the recommended chemical name for use for example in database annotation.
edam
Unique name from Chemical Entities of Biological Interest (ChEBI) of a chemical compound.
identifier
identifiers
bioinformatics
data
beta12orEarlier
ChEBI chemical name
Chemical name (IUPAC)
bioinformatics
beta12orEarlier
edam
identifiers
IUPAC chemical name
data
identifier
IUPAC recommended name of a chemical compound.
Chemical name (INN)
edam
identifier
data
International Non-proprietary Name (INN or 'generic name') of a chemical compound, assigned by the World Health Organization (WHO).
bioinformatics
identifiers
beta12orEarlier
INN chemical name
Chemical name (brand)
data
identifiers
edam
bioinformatics
Brand chemical name
beta12orEarlier
identifier
Brand name of a chemical compound.
Chemical name (synonymous)
identifiers
Synonymous name of a chemical compound.
edam
data
bioinformatics
identifier
Synonymous chemical name
beta12orEarlier
Chemical registry number (CAS)
identifiers
bioinformatics
identifier
edam
data
beta12orEarlier
CAS registry number of a chemical compound.
CAS chemical registry number
Chemical registry number (Beilstein)
data
identifiers
Beilstein registry number of a chemical compound.
bioinformatics
identifier
Beilstein chemical registry number
beta12orEarlier
edam
Chemical registry number (Gmelin)
bioinformatics
Gmelin registry number of a chemical compound.
identifier
beta12orEarlier
identifiers
edam
data
Gmelin chemical registry number
HET group name
beta12orEarlier
Component identifier code
bioinformatics
Short ligand name
identifier
data
3-letter code word for a ligand (HET group) from a PDB file, for example ATP.
identifiers
edam
Amino acid name
identifier
edam
data
bioinformatics
identifiers
String of one or more ASCII characters representing an amino acid.
beta12orEarlier
Nucleotide code
data
bioinformatics
edam
String of one or more ASCII characters representing a nucleotide.
identifier
identifiers
beta12orEarlier
Polypeptide chain ID
Polypeptide chain identifier
This is typically a character (for the chain) appended to a PDB identifier, e.g. 1cukA
PDB chain identifier
Protein chain identifier
PDB strand id
PDBML:pdbx_PDB_strand_id
edam
Identifier of a polypeptide chain from a protein.
data
identifiers
Chain identifier
bioinformatics
identifier
WHATIF: chain
beta12orEarlier
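The "PDB identifier plus chain character" form mentioned above (e.g. 1cukA) can be split as sketched below. The sketch assumes a four-character PDB identifier followed by exactly one chain character; other layouts are not handled, and the function name is illustrative.

```python
from typing import Tuple


def split_chain_id(chain_id: str) -> Tuple[str, str]:
    """Split e.g. '1cukA' into ('1cuk', 'A').

    Assumes a 4-character PDB identifier followed by a single chain character.
    """
    if len(chain_id) != 5:
        raise ValueError(f"expected 5 characters, got {len(chain_id)}: {chain_id!r}")
    return chain_id[:4], chain_id[4]
```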
Protein name
identifiers
bioinformatics
beta12orEarlier
edam
data
identifier
Name of a protein.
Enzyme identifier
data
Name or other identifier of an enzyme or record from a database of enzymes.
bioinformatics
identifiers
beta12orEarlier
edam
identifier
EC number
bioinformatics
Enzyme Commission number
beta12orEarlier
data
Moby:Annotated_EC_Number
edam
EC
An Enzyme Commission (EC) number of an enzyme.
EC code
Moby:EC_Number
[0-9]+\.-\.-\.-|[0-9]+\.[0-9]+\.-\.-|[0-9]+\.[0-9]+\.[0-9]+\.-|[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+
identifier
identifiers
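The regular expression listed for EC number can be applied as a whole-string check. The following is a minimal sketch using Python's `re` module; the pattern is copied from the record above, while the function name is illustrative.

```python
import re

# Pattern copied verbatim from the EC number record: one to four numeric
# levels, with unspecified trailing levels written as '-'.
EC_PATTERN = re.compile(
    r"[0-9]+\.-\.-\.-"
    r"|[0-9]+\.[0-9]+\.-\.-"
    r"|[0-9]+\.[0-9]+\.[0-9]+\.-"
    r"|[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"
)


def is_ec_number(text: str) -> bool:
    """True if the whole string matches the EC number pattern."""
    return EC_PATTERN.fullmatch(text) is not None
```

`fullmatch` is used rather than `match` so that trailing characters (for example `1.1.1.1x`) are rejected. Note that the pattern does not accept an `EC ` prefix.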
Enzyme name
beta12orEarlier
Name of an enzyme.
data
identifier
identifiers
bioinformatics
edam
Restriction enzyme name
identifier
data
beta12orEarlier
edam
Name of a restriction enzyme.
bioinformatics
identifiers
Sequence position specification
beta12orEarlier
A specification (partial or complete) of one or more positions or regions of a molecular sequence or map.
data
data
bioinformatics
edam
Sequence feature ID
beta12orEarlier
edam
bioinformatics
A unique identifier of a molecular sequence feature, for example an ID of a feature that is unique within the scope of the GFF file.
identifier
data
identifiers
Sequence position
PDBML:_atom_site.id
beta12orEarlier
data
WHATIF: number
data
SO:0000735
A position of a single point (base or residue) in a sequence, or part of such a specification.
bioinformatics
edam
WHATIF: PDBx_atom_site
Sequence range
edam
data
data
Specification of range(s) of sequence positions.
bioinformatics
beta12orEarlier
Nucleic acid feature identifier
true
bioinformatics
Name or other identifier of a nucleic acid feature.
beta12orEarlier
beta12orEarlier
data
identifiers
identifier
edam
Protein feature identifier
true
edam
bioinformatics
identifier
Name or other identifier of a protein feature.
beta12orEarlier
data
identifiers
beta12orEarlier
Sequence feature key
A feature key indicates the biological nature of the feature or information about changes to or versions of the sequence.
data
bioinformatics
beta12orEarlier
The type of a sequence feature, typically a term or accession from the Sequence Ontology, for example an EMBL or Swiss-Prot sequence feature key.
Sequence feature type
edam
data
Sequence feature method
Sequence feature qualifier
Feature qualifiers hold information about a feature beyond that provided by the feature key and location.
data
edam
bioinformatics
data
Typically one of the EMBL or Swiss-Prot feature qualifiers.
beta12orEarlier
Sequence feature label
A feature label identifies a feature of a sequence database entry. When used with the database name and the entry's primary accession number, it is a unique identifier of that feature.
beta12orEarlier
bioinformatics
Typically an EMBL or Swiss-Prot feature label.
data
Sequence feature name
edam
data
EMBOSS Uniform Feature Object
The name of a sequence feature-containing entity adhering to the standard feature naming scheme used by all EMBOSS applications.
bioinformatics
data
beta12orEarlier
UFO
edam
data
Codon name
true
bioinformatics
beta12orEarlier
identifier
beta12orEarlier
data
String of one or more ASCII characters representing a codon.
identifiers
edam
Gene identifier
edam
beta12orEarlier
An identifier of a gene, such as a name/symbol or a unique identifier of a gene in a database.
data
bioinformatics
identifier
identifiers
Moby:GeneAccessionList
Gene symbol
edam
data
The short name of a gene; a single word that does not contain white space characters. It is typically derived from the gene name.
Moby_namespace:Global_GeneCommonName
identifier
identifiers
beta12orEarlier
Moby_namespace:Global_GeneSymbol
bioinformatics
Gene ID (NCBI)
bioinformatics
beta12orEarlier
NCBI gene ID
identifiers
An NCBI unique identifier of a gene.
edam
http://www.geneontology.org/doc/GO.xrf_abbs:LocusID
data
Gene identifier (Entrez)
http://www.geneontology.org/doc/GO.xrf_abbs:NCBI_Gene
Entrez gene ID
identifier
Gene identifier (NCBI)
NCBI geneid
Gene identifier (NCBI RefSeq)
true
beta12orEarlier
An NCBI RefSeq unique identifier of a gene.
edam
bioinformatics
data
identifier
identifiers
beta12orEarlier
Gene identifier (NCBI UniGene)
true
edam
identifiers
An NCBI UniGene unique identifier of a gene.
bioinformatics
identifier
data
beta12orEarlier
beta12orEarlier
Gene identifier (Entrez)
true
data
[0-9]+
edam
beta12orEarlier
beta12orEarlier
identifier
An Entrez unique identifier of a gene.
bioinformatics
identifiers
Gene ID (CGD)
beta12orEarlier
Identifier of a gene or feature from the CGD database.
data
edam
CGD ID
identifiers
bioinformatics
identifier
Gene ID (DictyBase)
identifiers
bioinformatics
Identifier of a gene from DictyBase.
data
identifier
beta12orEarlier
edam
Gene ID (Ensembl)
bioinformatics
data
edam
Unique identifier for a gene (or other feature) from the Ensembl database.
beta12orEarlier
identifier
identifiers
Ensembl Gene ID
Gene ID (SGD)
identifiers
data
identifier
edam
bioinformatics
beta12orEarlier
SGD identifier
S[0-9]+
Identifier of an entry from the SGD database.
Gene ID (GeneDB)
data
[a-zA-Z_0-9\.-]*
bioinformatics
edam
Moby_namespace:GeneDB
Identifier of a gene from the GeneDB database.
identifier
beta12orEarlier
identifiers
GeneDB identifier
TIGR identifier
bioinformatics
identifiers
identifier
data
edam
Identifier of an entry from the TIGR database.
beta12orEarlier
TAIR accession (gene)
Identifier of a gene from the TAIR database.
beta12orEarlier
edam
data
identifiers
bioinformatics
identifier
Gene:[0-9]{7}
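Several of the gene identifier records above carry a regular expression (SGD, GeneDB, TAIR). A small lookup table makes it possible to check a candidate identifier against the pattern of its concept. This is a sketch only: the patterns are copied from the records, while the dictionary layout and function name are assumptions made for illustration.

```python
import re

# Concept label -> pattern, copied from the corresponding records above.
ID_PATTERNS = {
    "Gene ID (SGD)": r"S[0-9]+",
    "Gene ID (GeneDB)": r"[a-zA-Z_0-9\.-]*",
    "TAIR accession (gene)": r"Gene:[0-9]{7}",
}


def matches_pattern(concept: str, value: str) -> bool:
    """True if the whole value matches the pattern recorded for the concept."""
    return re.fullmatch(ID_PATTERNS[concept], value) is not None
```

The GeneDB pattern ends in `*`, so as written it also accepts the empty string; callers may want to reject empty values separately.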
Protein domain ID
bioinformatics
identifier
identifiers
This is typically a character or string concatenated with a PDB identifier and a chain identifier.
edam
data
Identifier of a protein structural domain.
beta12orEarlier
SCOP domain identifier
identifier
bioinformatics
beta12orEarlier
identifiers
edam
data
Identifier of a protein domain (or other node) from the SCOP database.
CATH domain ID
identifiers
bioinformatics
CATH domain identifier
1nr3A00
data
identifier
edam
Identifier of a protein domain from CATH.
beta12orEarlier
SCOP concise classification string (sccs)
edam
identifiers
An sccs includes the class (alphabetical), fold, superfamily and family (all numerical) to which a given domain belongs.
data
beta12orEarlier
identifier
bioinformatics
A SCOP concise classification string (sccs) is a compact representation of a SCOP domain classification.
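The four components named for an sccs (alphabetical class, then numerical fold, superfamily and family) can be pulled apart as sketched below. The sketch assumes the components are dot-separated and that the class is a single lowercase letter; neither detail is stated in the record, and the example value is illustrative.

```python
import re
from typing import NamedTuple


class Sccs(NamedTuple):
    scop_class: str
    fold: int
    superfamily: int
    family: int


# Assumed layout: <class letter>.<fold>.<superfamily>.<family>
_SCCS_PATTERN = re.compile(r"([a-z])\.([0-9]+)\.([0-9]+)\.([0-9]+)")


def parse_sccs(text: str) -> Sccs:
    """Parse a string such as 'a.4.5.1' into its four components."""
    match = _SCCS_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not an sccs under the assumed layout: {text!r}")
    return Sccs(match.group(1), int(match.group(2)),
                int(match.group(3)), int(match.group(4)))
```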
SCOP sunid
33229
A sunid uniquely identifies an entry in the SCOP hierarchy, including leaves (the SCOP domains) and higher level nodes including entries corresponding to the protein level.
identifier
edam
bioinformatics
data
SCOP unique identifier
sunid
identifiers
Unique identifier (number) of an entry in the SCOP hierarchy, for example 33229.
beta12orEarlier
CATH node ID
beta12orEarlier
CATH node identifier
A code number identifying a node from the CATH database.
identifiers
bioinformatics
CATH code
data
identifier
3.30.1190.10.1.1.1.1.1
edam
Kingdom name
identifier
beta12orEarlier
identifiers
The name of a biological kingdom (Bacteria, Archaea, or Eukaryotes).
bioinformatics
data
edam
Species name
beta12orEarlier
identifier
Organism species
The name of a species (typically a taxonomic group) of organism.
identifiers
bioinformatics
edam
data
Strain name
The name of a strain of an organism variant, typically a plant, virus or bacterium.
beta12orEarlier
identifier
bioinformatics
identifiers
edam
data
URI
edam
bioinformatics
beta12orEarlier
A string of characters that names or otherwise identifies a resource on the Internet.
data
data
Database identifier
edam
identifier
data
beta12orEarlier
An identifier of a biological or bioinformatics database.
bioinformatics
identifiers
Directory name
identifiers
bioinformatics
data
edam
The name of a directory.
beta12orEarlier
identifier
File name
beta12orEarlier
data
edam
The name (or part of a name) of a file (of any type).
bioinformatics
identifier
identifiers
Ontology name
identifiers
bioinformatics
data
Name of an ontology of biological or bioinformatics concepts and relations.
identifier
edam
beta12orEarlier
URL
Moby:Link
beta12orEarlier
data
Moby:URL
A Uniform Resource Locator (URL).
edam
bioinformatics
data
URN
edam
data
data
beta12orEarlier
A Uniform Resource Name (URN).
bioinformatics
LSID
A Life Science Identifier (LSID) - a unique identifier of some data.
data
Life Science Identifier
LSIDs provide a standard way to locate and describe data. An LSID is represented as a Uniform Resource Name (URN) with the following format: URN:LSID:<Authority>:<Namespace>:<ObjectID>[:<Version>]
data
beta12orEarlier
bioinformatics
edam
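The LSID layout given above (URN:LSID:<Authority>:<Namespace>:<ObjectID>[:<Version>]) can be illustrated with a minimal Python sketch. The function name parse_lsid, the LSID tuple type and the example identifier are hypothetical, and treating the "URN" and "LSID" prefixes as case-insensitive is an assumption rather than something stated in this record.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    version: Optional[str]


def parse_lsid(text: str) -> LSID:
    """Split URN:LSID:<Authority>:<Namespace>:<ObjectID>[:<Version>] into its parts."""
    parts = text.split(":")
    # Five fields without a version, six with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    # Assumption: the two fixed prefixes are compared case-insensitively.
    if parts[0].upper() != "URN" or parts[1].upper() != "LSID":
        raise ValueError(f"not an LSID: {text!r}")
    if any(not part for part in parts[2:]):
        raise ValueError(f"empty LSID component in {text!r}")
    version = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], version)


# Hypothetical identifier, used only for illustration.
example = parse_lsid("URN:LSID:example.org:proteins:P12345:2")
```

Here example.authority is "example.org" and example.version is "2"; an LSID without the optional last field yields version None.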
Database name
edam
identifiers
bioinformatics
beta12orEarlier
The name of a biological or bioinformatics database.
data
identifier
Sequence database name
true
beta13
edam
bioinformatics
The name of a molecular sequence database.
identifier
beta12orEarlier
identifiers
data
Enumerated file name
edam
The name of a file (of any type) with restricted possible values.
data
beta12orEarlier
bioinformatics
identifier
identifiers
File name extension
beta12orEarlier
data
bioinformatics
identifier
A file extension consists of the characters appearing after the final '.' in the file name.
identifiers
edam
The extension of a file name.
File base name
A file base name is the file name stripped of its directory specification and extension.
edam
data
identifier
The base name of a file.
beta12orEarlier
bioinformatics
identifiers
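The two definitions above (extension: the characters after the final '.'; base name: the file name stripped of its directory specification and extension) can be sketched in Python as follows. The function names are hypothetical, and returning an empty extension for names without a '.' is an assumption of this sketch.

```python
import os.path


def file_extension(path: str) -> str:
    """Characters after the final '.' in the file name ('' if there is no '.')."""
    name = os.path.basename(path)
    _, dot, ext = name.rpartition(".")
    return ext if dot else ""


def file_base_name(path: str) -> str:
    """File name without its directory specification and extension."""
    name = os.path.basename(path)
    stem, dot, _ = name.rpartition(".")
    return stem if dot else name
```

With these definitions, "/data/seqs/P12345.fasta" has extension "fasta" and base name "P12345", while "archive.tar.gz" has extension "gz" and base name "archive.tar". A name such as ".bashrc" is read literally, giving an empty base name; tools that treat leading-dot files specially would differ here.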
QSAR descriptor name
bioinformatics
Name of a QSAR descriptor.
beta12orEarlier
data
identifiers
identifier
edam
Database entry identifier
true
beta12orEarlier
edam
identifier
This concept is required for completeness. It should never have child concepts.
identifiers
beta12orEarlier
An identifier of an entry from a database where the same type of identifier is used for objects (data) of different semantic type.
bioinformatics
data
Sequence identifier
beta12orEarlier
identifier
An identifier of molecular sequence(s) or entries from a molecular sequence database.
edam
identifiers
bioinformatics
data
Sequence set ID
beta12orEarlier
identifiers
edam
data
bioinformatics
An identifier of a set of molecular sequence(s).
identifier
Sequence signature identifier
true
Identifier of a sequence signature (motif or profile) for example from a database of sequence patterns.
identifiers
bioinformatics
data
edam
beta12orEarlier
beta12orEarlier
identifier
Sequence alignment ID
bioinformatics
identifier
beta12orEarlier
identifiers
edam
Identifier of a molecular sequence alignment, for example a record from an alignment database.
data
Phylogenetic distance matrix identifier
true
identifiers
data
Identifier of a phylogenetic distance matrix.
beta12orEarlier
beta12orEarlier
identifier
edam
bioinformatics
Phylogenetic tree ID
identifier
beta12orEarlier
identifiers
Identifier of a phylogenetic tree for example from a phylogenetic tree database.
bioinformatics
edam
data
Comparison matrix identifier
identifiers
beta12orEarlier
data
Substitution matrix identifier
bioinformatics
edam
identifier
An identifier of a comparison matrix.
Structure ID
bioinformatics
A unique and persistent identifier of a molecular tertiary structure, typically an entry from a structure database.
identifiers
identifier
data
edam
beta12orEarlier
Structural (3D) profile ID
bioinformatics
beta12orEarlier
identifiers
edam
identifier
Identifier or name of a structural (3D) profile or template (representing a structure or structure alignment).
Structural profile identifier
data
Structure alignment ID
data
identifier
edam
Identifier of an entry from a database of tertiary structure alignments.
beta12orEarlier
bioinformatics
identifiers
Amino acid index ID
data
beta12orEarlier
identifiers
bioinformatics
edam
identifier
Identifier of an index of amino acid physicochemical and biochemical property data.
Protein interaction ID
identifiers
data
Identifier of a report of protein interactions, typically from a protein interaction database.
edam
bioinformatics
identifier
beta12orEarlier
Protein family identifier
identifier
beta12orEarlier
bioinformatics
identifiers
Identifier of a protein family.
edam
Protein secondary database record identifier
data
Codon usage table name
identifiers
Unique name of a codon usage table.
bioinformatics
data
identifier
edam
beta12orEarlier
Transcription factor identifier
edam
Identifier of a transcription factor (or a TF binding site).
data
identifier
beta12orEarlier
bioinformatics
identifiers
Microarray experiment annotation ID
identifiers
Identifier of an entry from a database of microarray data.
edam
data
beta12orEarlier
bioinformatics
identifier
Electron microscopy model ID
edam
beta12orEarlier
Identifier of an entry from a database of electron microscopy data.
identifier
bioinformatics
identifiers
data
Gene expression report ID
Gene expression profile identifier
edam
identifier
identifiers
data
beta12orEarlier
Accession of a report of gene expression (e.g. a gene expression profile) from a database.
bioinformatics
Genotype and phenotype annotation ID
data
beta12orEarlier
edam
identifier
bioinformatics
identifiers
Identifier of an entry from a database of genotypes and phenotypes.
Pathway or network identifier
data
edam
identifiers
bioinformatics
beta12orEarlier
identifier
Identifier of an entry from a database of biological pathways or networks.
Workflow ID
bioinformatics
identifiers
data
beta12orEarlier
edam
Identifier of a biological or biomedical workflow, typically from a database of workflows.
identifier
Data resource definition identifier
identifier
Identifier of a data type definition from some provider.
identifiers
bioinformatics
data
beta12orEarlier
edam
Biological model identifier
beta12orEarlier
Identifier of a mathematical model, typically an entry from a database.
bioinformatics
identifiers
data
edam
identifier
Compound identifier
beta12orEarlier
identifier
Small molecule identifier
Identifier of an entry from a database of chemicals.
edam
bioinformatics
Chemical compound identifier
data
identifiers
Ontology concept ID
A unique (typically numerical) identifier of a concept in an ontology of biological or bioinformatics concepts and relations.
identifier
bioinformatics
Ontology concept ID
edam
beta12orEarlier
data
identifiers
Article ID
identifier
Unique identifier of a scientific article.
data
bioinformatics
identifiers
beta12orEarlier
edam
FlyBase ID
edam
identifier
data
identifiers
bioinformatics
FB[a-zA-Z_0-9]{2}[0-9]{7}
beta12orEarlier
Identifier of an object from the FlyBase database.
WormBase name
edam
data
identifier
identifiers
beta12orEarlier
Name of an object from the WormBase database, usually a human-readable name.
bioinformatics
WormBase class
data
A WormBase class describes the type of object such as 'sequence' or 'protein'.
edam
bioinformatics
Class of an object from the WormBase database.
identifier
identifiers
beta12orEarlier
Sequence accession
Sequence accession number
data
identifiers
bioinformatics
identifier
A persistent, unique identifier of a molecular sequence database entry.
beta12orEarlier
edam
Sequence type
data
edam
Sequence type might reflect the molecule (protein, nucleic acid etc) or the sequence itself (gapped, ambiguous etc).
A label (text token) describing a type of molecular sequence.
beta12orEarlier
bioinformatics
data
EMBOSS Uniform Sequence Address
bioinformatics
beta12orEarlier
data
identifiers
edam
The name of a sequence-based entity adhering to the standard sequence naming scheme used by all EMBOSS applications.
EMBOSS USA
identifier
Sequence accession (protein)
beta12orEarlier
identifiers
Protein sequence accession number
data
bioinformatics
edam
Accession number of a protein sequence database entry.
identifier
Sequence accession (nucleic acid)
beta12orEarlier
Accession number of a nucleotide sequence database entry.
bioinformatics
data
Nucleotide sequence accession number
identifier
identifiers
edam
RefSeq accession
RefSeq ID
identifiers
beta12orEarlier
identifier
Accession number of a RefSeq database entry.
(NC|AC|NG|NT|NW|NZ|NM|NR|XM|XR|NP|AP|XP|YP|ZP)_[0-9]+
data
edam
bioinformatics
UniProt accession (extended)
true
bioinformatics
1.0
Accession number of a UniProt (protein sequence) database entry. May contain version or isoform number.
[A-NR-Z][0-9][A-Z][A-Z0-9][A-Z0-9][0-9]|[OPQ][0-9][A-Z0-9][A-Z0-9][A-Z0-9][0-9]|[A-NR-Z][0-9][A-Z][A-Z0-9][A-Z0-9][0-9]\.[0-9]+|[OPQ][0-9][A-Z0-9][A-Z0-9][A-Z0-9][0-9]\.[0-9]+|[A-NR-Z][0-9][A-Z][A-Z0-9][A-Z0-9][0-9]-[0-9]+|[OPQ][0-9][A-Z0-9][A-Z0-9][A-Z0-9][0-9]-[0-9]+
data
edam
identifiers
Q7M1G0|P43353-2|P01012.107
beta12orEarlier
identifier
PIR identifier
edam
PIR ID
beta12orEarlier
identifiers
An identifier of a PIR sequence database entry.
PIR accession number
data
bioinformatics
identifier
TREMBL accession
true
data
identifier
bioinformatics
identifiers
1.2
edam
Identifier of a TREMBL sequence database entry.
beta12orEarlier
Gramene primary identifier
Primary identifier of a Gramene database entry.
data
identifiers
Gramene primary ID
edam
identifier
bioinformatics
beta12orEarlier
EMBL/GenBank/DDBJ ID
beta12orEarlier
identifiers
Identifier of a (nucleic acid) entry from the EMBL/GenBank/DDBJ databases.
data
identifier
bioinformatics
edam
Sequence cluster ID (UniGene)
A unique identifier of an entry (gene cluster) from the NCBI UniGene database.
UniGene cluster ID
bioinformatics
UniGene identifier
edam
identifiers
beta12orEarlier
identifier
UniGene ID
UniGene cluster id
data
dbEST accession
beta12orEarlier
identifier
data
dbEST ID
edam
Identifier of a dbEST database entry.
identifiers
bioinformatics
dbSNP ID
Identifier of a dbSNP database entry.
data
edam
dbSNP identifier
bioinformatics
beta12orEarlier
identifier
identifiers
EMBOSS sequence type
true
edam
identifiers
bioinformatics
data
The EMBOSS type of a molecular sequence.
beta12orEarlier
identifier
See the EMBOSS documentation (http://emboss.sourceforge.net/) for a definition of what this includes.
beta12orEarlier
EMBOSS listfile
bioinformatics
data
beta12orEarlier
List of EMBOSS Uniform Sequence Addresses (EMBOSS listfile).
edam
data
Sequence cluster ID
identifier
beta12orEarlier
edam
identifiers
data
An identifier of a cluster of molecular sequence(s).
bioinformatics
Sequence cluster ID (COG)
identifier
identifiers
bioinformatics
edam
Unique identifier of an entry from the COG database.
beta12orEarlier
data
COG ID
Sequence motif identifier
identifier
beta12orEarlier
edam
identifiers
bioinformatics
data
Identifier of a sequence motif, for example an entry from a motif database.
Sequence profile ID
A sequence profile typically represents a sequence alignment.
edam
Identifier of a sequence profile.
identifier
data
bioinformatics
identifiers
beta12orEarlier
ELM ID
bioinformatics
beta12orEarlier
data
Identifier of an entry from the ELMdb database of protein functional sites.
edam
identifier
identifiers
Prosite accession number
Prosite ID
edam
PS[0-9]{5}
data
Accession number of an entry from the Prosite database.
bioinformatics
identifiers
identifier
beta12orEarlier
HMMER hidden Markov model ID
identifiers
edam
beta12orEarlier
data
Unique identifier or name of a HMMER hidden Markov model.
identifier
bioinformatics
JASPAR profile ID
Unique identifier or name of a profile from the JASPAR database.
bioinformatics
data
identifier
beta12orEarlier
identifiers
edam
Sequence alignment type
bioinformatics
data
A label (text token) describing the type of a sequence alignment.
Possible values include for example the EMBOSS alignment types, BLAST alignment types and so on.
data
edam
beta12orEarlier
BLAST sequence alignment type
true
The type of a BLAST sequence alignment.
identifiers
bioinformatics
edam
beta12orEarlier
data
beta12orEarlier
identifier
Phylogenetic tree type
nj|upgmp
data
edam
A label (text token) describing the type of a phylogenetic tree.
bioinformatics
data
beta12orEarlier
For example 'nj', 'upgmp' etc.
TreeBASE study accession number
bioinformatics
identifier
identifiers
beta12orEarlier
data
edam
Accession number of an entry from the TreeBASE database.
TreeFam accession number
identifiers
edam
data
Accession number of an entry from the TreeFam database.
bioinformatics
identifier
beta12orEarlier
Comparison matrix type
For example 'blosum', 'pam', 'gonnet', 'id' etc. Comparison matrix type may be required where a series of matrices of a certain type are used.
A label (text token) describing the type of a comparison matrix.
Substitution matrix type
beta12orEarlier
bioinformatics
blosum|pam|gonnet|id
data
edam
data
Comparison matrix name
Unique name or identifier of a comparison matrix.
identifiers
identifier
Substitution matrix name
bioinformatics
edam
See for example http://www.ebi.ac.uk/Tools/webservices/help/matrix.
data
beta12orEarlier
PDB ID
edam
identifier
[a-zA-Z_0-9]{4}
identifiers
beta12orEarlier
PDB identifier
bioinformatics
data
PDBID
An identifier of an entry from the PDB database.
AAindex ID
data
edam
Identifier of an entry from the AAindex database.
bioinformatics
identifier
beta12orEarlier
identifiers
BIND accession number
data
beta12orEarlier
Accession number of an entry from the BIND database.
identifier
edam
bioinformatics
identifiers
IntAct accession number
data
identifier
Accession number of an entry from the IntAct database.
edam
beta12orEarlier
bioinformatics
identifiers
EBI\-[0-9]+
Protein family name
Name of a protein family.
identifiers
identifier
beta12orEarlier
data
edam
bioinformatics
InterPro entry name
identifier
edam
identifiers
data
Name of an InterPro entry, usually indicating the type of protein matches for that entry.
beta12orEarlier
bioinformatics
InterPro accession
IPR[0-9]{6}
bioinformatics
IPR015590
edam
identifier
InterPro primary accession number
beta12orEarlier
Every InterPro entry has a unique accession number to provide a persistent citation of database records.
data
InterPro primary accession
identifiers
Primary accession number of an InterPro entry.
InterPro secondary accession
identifiers
Secondary accession number of an InterPro entry.
identifier
InterPro secondary accession number
bioinformatics
data
beta12orEarlier
edam
Gene3D ID
identifiers
bioinformatics
Unique identifier of an entry from the Gene3D database.
identifier
edam
beta12orEarlier
data
PIRSF ID
Unique identifier of an entry from the PIRSF database.
identifiers
edam
beta12orEarlier
bioinformatics
identifier
PIRSF[0-9]{6}
data
PRINTS code
The unique identifier of an entry in the PRINTS database.
edam
PR[0-9]{5}
data
identifier
identifiers
beta12orEarlier
bioinformatics
Pfam accession number
beta12orEarlier
data
edam
identifier
bioinformatics
Accession number of a Pfam entry.
PF[0-9]{5}
identifiers
SMART accession number
data
SM[0-9]{5}
beta12orEarlier
identifier
bioinformatics
identifiers
Accession number of an entry from the SMART database.
edam
Superfamily hidden Markov model number
identifiers
beta12orEarlier
data
Unique identifier (number) of a hidden Markov model from the Superfamily database.
identifier
bioinformatics
edam
TIGRFam ID
identifier
beta12orEarlier
Accession number of an entry (family) from the TIGRFam database.
identifiers
edam
TIGRFam accession number
data
bioinformatics
ProDom accession number
beta12orEarlier
identifiers
bioinformatics
data
ProDom is a protein domain family database.
PD[0-9]+
edam
identifier
A ProDom domain family accession number.
TRANSFAC accession number
edam
identifiers
Identifier of an entry from the TRANSFAC database.
identifier
beta12orEarlier
bioinformatics
data
ArrayExpress accession number
[AEP]-[a-zA-Z_0-9]{4}-[0-9]+
bioinformatics
identifiers
edam
ArrayExpress experiment ID
beta12orEarlier
identifier
data
Accession number of an entry from the ArrayExpress database.
PRIDE experiment accession number
identifier
PRIDE experiment accession number.
beta12orEarlier
identifiers
[0-9]+
bioinformatics
edam
data
EMDB ID
bioinformatics
edam
data
Identifier of an entry from the EMDB electron microscopy database.
identifiers
beta12orEarlier
identifier
GEO accession number
beta12orEarlier
data
edam
GDS[0-9]+
bioinformatics
identifiers
identifier
Accession number of an entry from the GEO database.
GermOnline ID
bioinformatics
identifier
data
beta12orEarlier
Identifier of an entry from the GermOnline database.
edam
identifiers
EMAGE ID
data
identifiers
identifier
beta12orEarlier
bioinformatics
edam
Identifier of an entry from the EMAGE database.
Disease ID
bioinformatics
beta12orEarlier
Identifier of an entry from a database of diseases.
edam
data
identifier
identifiers
HGVbase ID
data
beta12orEarlier
identifiers
Identifier of an entry from the HGVbase database.
identifier
bioinformatics
edam
HIVDB identifier
true
Identifier of an entry from the HIVDB database.
beta12orEarlier
identifier
bioinformatics
beta12orEarlier
identifiers
edam
data
OMIM ID
Identifier of an entry from the OMIM database.
[*#+%^]?[0-9]{6}
identifiers
bioinformatics
edam
beta12orEarlier
identifier
data
KEGG object identifier
bioinformatics
data
Unique identifier of an object from one of the KEGG databases (excluding the GENES division).
identifier
edam
beta12orEarlier
identifiers
Pathway ID (reactome)
identifier
identifiers
bioinformatics
data
Reactome ID
edam
beta12orEarlier
REACT_[0-9]+(\.[0-9]+)?
Identifier of an entry from the Reactome database.
Pathway ID (aMAZE)
true
identifier
Identifier of an entry from the aMAZE database.
bioinformatics
beta12orEarlier
identifiers
aMAZE ID
beta12orEarlier
data
edam
Pathway ID (BioCyc)
edam
identifiers
data
Identifier of a pathway from the BioCyc biological pathways database.
identifier
beta12orEarlier
bioinformatics
BioCyc pathway ID
Pathway ID (INOH)
bioinformatics
identifiers
Identifier of an entry from the INOH database.
INOH identifier
identifier
edam
beta12orEarlier
data
Pathway ID (PATIKA)
edam
beta12orEarlier
identifiers
identifier
bioinformatics
PATIKA ID
Identifier of an entry from the PATIKA database.
data
Pathway ID (CPDB)
beta12orEarlier
identifiers
data
bioinformatics
CPDB ID
This concept refers to identifiers used by the databases collated in CPDB; CPDB identifiers are not independently defined.
edam
identifier
Identifier of an entry from the CPDB (ConsensusPathDB) biological pathways database, which is an identifier from an external database integrated into CPDB.
Pathway ID (Panther)
beta12orEarlier
data
Identifier of a biological pathway from the Panther Pathways database.
identifier
bioinformatics
edam
Panther Pathways ID
PTHR[0-9]{5}
identifiers
MIRIAM identifier
MIR:[0-9]{8}
identifier
MIR:00100005
edam
Unique identifier of a MIRIAM data resource.
This is the identifier used internally by MIRIAM for a data type.
beta12orEarlier
data
bioinformatics
identifiers
MIRIAM data type name
identifier
The name of a data type from the MIRIAM database.
data
identifiers
beta12orEarlier
edam
bioinformatics
MIRIAM URI
identifier
urn:miriam:pubmed:16333295|urn:miriam:obo.go:GO%3A0045202
A MIRIAM URI consists of the URI of the MIRIAM data type (PubMed, UniProt etc) followed by the identifier of an element of that data type, for example PMID for a publication or an accession number for a GO term.
data
identifiers
The URI (URL or URN) of a data entity from the MIRIAM database.
beta12orEarlier
bioinformatics
edam
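As a rough illustration of the URN form shown in the examples above (urn:miriam:<data type>:<element identifier>), the following Python sketch splits such a URI into its two parts and decodes percent-escapes. The function name split_miriam_urn is hypothetical, and only the URN form is handled; URL-form MIRIAM URIs are outside this sketch.

```python
from typing import Tuple
from urllib.parse import unquote


def split_miriam_urn(uri: str) -> Tuple[str, str]:
    """Return (data type, element identifier) for a urn:miriam: URI."""
    prefix = "urn:miriam:"
    if not uri.startswith(prefix):
        raise ValueError(f"not a MIRIAM URN: {uri!r}")
    # Everything up to the next ':' names the data type; the rest is the element.
    data_type, sep, element = uri[len(prefix):].partition(":")
    if not sep or not data_type or not element:
        raise ValueError(f"incomplete MIRIAM URN: {uri!r}")
    # Percent-escapes such as %3A (':') are decoded in the element identifier.
    return data_type, unquote(element)
```

Applied to the two example values of this record, the sketch gives ("pubmed", "16333295") and ("obo.go", "GO:0045202").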
MIRIAM data type primary name
bioinformatics
identifiers
The primary name of a MIRIAM data type is taken from a controlled vocabulary.
UniProt|Enzyme Nomenclature
data
edam
The primary name of a data type from the MIRIAM database.
beta12orEarlier
identifier
A protein entity has the MIRIAM data type 'UniProt', and an enzyme has the MIRIAM data type 'Enzyme Nomenclature'.
UniProt|Enzyme Nomenclature
MIRIAM data type synonymous name
identifiers
bioinformatics
data
beta12orEarlier
identifier
edam
A synonymous name for a MIRIAM data type taken from a controlled vocabulary.
A synonymous name of a data type from the MIRIAM database.
Taverna workflow ID
identifier
identifiers
bioinformatics
Unique identifier of a Taverna workflow.
edam
data
beta12orEarlier
Biological model name
data
identifiers
identifier
Name of a biological (mathematical) model.
bioinformatics
beta12orEarlier
edam
BioModel ID
(BIOMD|MODEL)[0-9]{10}
data
identifier
bioinformatics
edam
beta12orEarlier
Unique identifier of an entry from the BioModel database.
identifiers
PubChem CID
identifier
beta12orEarlier
edam
Chemical structure specified in PubChem Compound Identification (CID), a non-zero integer identifier for a unique chemical structure.
bioinformatics
PubChem compound accession identifier
[0-9]+
identifiers
data
ChemSpider ID
identifier
bioinformatics
edam
[0-9]+
beta12orEarlier
data
Identifier of an entry from the ChemSpider database.
identifiers
ChEBI ID
beta12orEarlier
identifiers
edam
bioinformatics
CHEBI:[0-9]+
identifier
data
ChEBI identifier
Identifier of an entry from the ChEBI database.
BioPax concept ID
identifier
An identifier of a concept from the BioPax ontology.
beta12orEarlier
bioinformatics
identifiers
data
edam
GO concept ID
data
[0-9]{7}|GO:[0-9]{7}
identifiers
edam
An identifier of a concept from The Gene Ontology.
beta12orEarlier
bioinformatics
identifier
GO concept identifier
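Several records in this section carry a regular expression describing the identifier's syntax. A minimal Python sketch of how such patterns might be used to suggest which identifier types a string could belong to is given below. The patterns are copied from the records above; the PATTERNS table, the function name candidate_types and the choice to require a match of the whole string are assumptions of this sketch, not part of EDAM.

```python
import re

# Patterns copied from the corresponding records in this section.
PATTERNS = {
    "InterPro accession": r"IPR[0-9]{6}",
    "Prosite accession number": r"PS[0-9]{5}",
    "Pfam accession number": r"PF[0-9]{5}",
    "ChEBI ID": r"CHEBI:[0-9]+",
    "GO concept ID": r"[0-9]{7}|GO:[0-9]{7}",
}

COMPILED = {name: re.compile(pattern) for name, pattern in PATTERNS.items()}


def candidate_types(text: str) -> list:
    """Names of all identifier types whose pattern matches the whole string."""
    return [name for name, rx in COMPILED.items() if rx.fullmatch(text)]
```

For example, candidate_types("IPR015590") returns ["InterPro accession"], and both "GO:0045202" and "0045202" are reported as GO concept IDs. A syntactic match only shows that a string is well formed; it does not show that the entry exists in the database.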
MeSH concept ID
identifiers
identifier
edam
data
bioinformatics
An identifier of a concept from the MeSH vocabulary.
beta12orEarlier
HGNC concept ID
identifier
An identifier of a concept from the HGNC controlled vocabulary.
beta12orEarlier
data
edam
bioinformatics
identifiers
NCBI taxonomy ID
beta12orEarlier
identifier
9662|3483|182682
A stable unique identifier for each taxon (for a species, a family, an order, or any other group) in the NCBI taxonomy database.
NCBI tax ID
edam
[1-9][0-9]{0,8}
data
identifiers
NCBI taxonomy identifier
bioinformatics
Plant Ontology concept ID
data
beta12orEarlier
identifier
bioinformatics
edam
An identifier of a concept from the Plant Ontology (PO).
identifiers
UMLS concept ID
identifier
beta12orEarlier
identifiers
An identifier of a concept from the UMLS vocabulary.
bioinformatics
edam
data
FMA concept ID
identifier
Classifies anatomical entities according to their shared characteristics (genus) and distinguishing characteristics (differentia). Specifies the part-whole and spatial relationships of the entities, morphological transformation of the entities during prenatal development and the postnatal life cycle and principles, rules and definitions according to which classes and relationships in the other three components of FMA are represented.
An identifier of a concept from Foundational Model of Anatomy.
FMA:[0-9]+
beta12orEarlier
bioinformatics
edam
data
identifiers
EMAP concept ID
identifier
identifiers
edam
An identifier of a concept from the EMAP mouse ontology.
data
beta12orEarlier
bioinformatics
ChEBI concept ID
bioinformatics
edam
data
An identifier of a concept from the ChEBI ontology.
identifiers
identifier
beta12orEarlier
MGED concept ID
data
bioinformatics
identifiers
edam
identifier
An identifier of a concept from the MGED ontology.
beta12orEarlier
myGrid concept ID
An identifier of a concept from the myGrid ontology.
The ontology is provided as two components, the service ontology and the domain ontology. The domain ontology provides concepts for core bioinformatics data types and their relations. The service ontology describes the physical and operational features of web services.
edam
beta12orEarlier
data
identifier
identifiers
bioinformatics
PubMed ID
4963447
beta12orEarlier
edam
PubMed unique identifier of an article.
data
[1-9][0-9]{0,8}
bioinformatics
identifier
PMID
identifiers
Digital Object Identifier
data
bioinformatics
Digital Object Identifier (DOI) of a published article.
identifier
(doi\:)?[0-9]{2}\.[0-9]{4}/.*
identifiers
beta12orEarlier
edam
Medline UI
identifier
The use of Medline UI has been replaced by the PubMed unique identifier.
Medline UI (unique identifier) of an article.
beta12orEarlier
edam
Medline unique identifier
identifiers
data
bioinformatics
Tool name
identifier
bioinformatics
data
beta12orEarlier
The name of a computer package, application, method or function.
identifiers
edam
Tool name (signature)
Signature methods from http://www.ebi.ac.uk/Tools/InterProScan/help.html#results include BlastProDom, FPrintScan, HMMPIR, HMMPfam, HMMSmart, HMMTigr, ProfileScan, ScanRegExp, SuperFamily and HAMAP.
edam
bioinformatics
identifiers
The unique name of a signature (sequence classifier) method.
identifier
data
beta12orEarlier
Tool name (BLAST)
BLAST name
beta12orEarlier
The name of a BLAST tool.
identifiers
edam
data
bioinformatics
identifier
This includes 'blastn', 'blastp', 'blastx', 'tblastn' and 'tblastx'.
Tool name (FASTA)
identifiers
This includes 'fasta3', 'fastx3', 'fasty3', 'fastf3', 'fasts3' and 'ssearch'.
identifier
edam
data
bioinformatics
The name of a FASTA tool.
beta12orEarlier
Tool name (EMBOSS)
bioinformatics
edam
identifier
beta12orEarlier
data
identifiers
The name of an EMBOSS application.
Tool name (EMBASSY package)
beta12orEarlier
The name of an EMBASSY package.
bioinformatics
identifier
identifiers
edam
data
QSAR descriptor (constitutional)
QSAR constitutional descriptor
bioinformatics
edam
data
beta12orEarlier
data
A QSAR constitutional descriptor.
QSAR descriptor (electronic)
data
QSAR electronic descriptor
bioinformatics
data
A QSAR electronic descriptor.
edam
beta12orEarlier
QSAR descriptor (geometrical)
bioinformatics
beta12orEarlier
edam
A QSAR geometrical descriptor.
data
data
QSAR geometrical descriptor
QSAR descriptor (topological)
edam
A QSAR topological descriptor.
QSAR topological descriptor
data
bioinformatics
data
beta12orEarlier
QSAR descriptor (molecular)
data
edam
QSAR molecular descriptor
beta12orEarlier
data
A QSAR molecular descriptor.
bioinformatics
Sequence set (protein)
beta12orEarlier
data
bioinformatics
edam
data
Any collection of multiple protein sequences and associated metadata that do not (typically) correspond to common sequence database records or database entries.
Sequence set (nucleic acid)
data
data
edam
Any collection of multiple nucleotide sequences and associated metadata that do not (typically) correspond to common sequence database records or database entries.
beta12orEarlier
bioinformatics
Sequence cluster
beta12orEarlier
The cluster might include sequence identifiers, short descriptions, alignment and summary information.
bioinformatics
edam
data
data
A set of sequences that have been clustered or otherwise classified as belonging to a group including (typically) sequence cluster information.
Psiblast checkpoint file
true
A file of intermediate results from a PSIBLAST search that is used for priming the search in the next PSIBLAST iteration.
bioinformatics
beta12orEarlier
A Psiblast checkpoint file uses ASN.1 Binary Format and usually has the extension '.asn'.
beta12orEarlier
edam
data
data
HMMER synthetic sequences set
true
beta12orEarlier
data
beta12orEarlier
Sequences generated by the HMMER package in FASTA-style format.
bioinformatics
edam
data
Proteolytic digest
edam
data
A protein sequence cleaved into peptide fragments (by enzymatic or chemical cleavage) with fragment masses.
beta12orEarlier
bioinformatics
data
Restriction digest
beta12orEarlier
data
data
edam
SO:0000412
Restriction digest fragments from digesting a nucleotide sequence with restriction sites using a restriction endonuclease.
bioinformatics
PCR primers
bioinformatics
edam
data
Oligonucleotide primer(s) for PCR and DNA amplification, for example a minimal primer set.
data
beta12orEarlier
vectorstrip cloning vector definition file
true
beta12orEarlier
File of sequence vectors used by the EMBOSS vectorstrip application, or any file in the same format.
data
edam
data
bioinformatics
beta12orEarlier
Primer3 internal oligo mishybridizing library
true
beta12orEarlier
bioinformatics
edam
data
A library of nucleotide sequences to avoid during hybridization events. Hybridization of the internal oligo to sequences in this library is avoided, rather than priming from them. The file is in a restricted FASTA format.
beta12orEarlier
data
Primer3 mispriming library file
true
edam
beta12orEarlier
data
beta12orEarlier
data
A nucleotide sequence library of sequences to avoid during amplification (for example repetitive sequences, or possibly the sequences of genes in a gene family that should not be amplified). The file must be in a restricted FASTA format.
bioinformatics
primersearch primer pairs sequence record
true
data
edam
File of one or more pairs of primer sequences, as used by the EMBOSS primersearch application.
beta12orEarlier
bioinformatics
beta12orEarlier
data
Sequence cluster (protein)
beta12orEarlier
Protein sequence cluster
bioinformatics
A cluster of protein sequences.
data
The sequences are typically related, for example a family of sequences.
edam
data
Sequence cluster (nucleic acid)
bioinformatics
beta12orEarlier
Nucleotide sequence cluster
data
data
The sequences are typically related, for example a family of sequences.
A cluster of nucleotide sequences.
edam
Sequence length
edam
data
data
beta12orEarlier
The size (length) of a sequence, subsequence or region in a sequence.
bioinformatics
Word size
data
data
edam
beta12orEarlier
Size of a sequence word.
bioinformatics
Word length
Word size is used for example in word-based sequence database search methods.
Window size
edam
Size of a sequence window.
data
beta12orEarlier
A window is a region of fixed size but not fixed position over a molecular sequence. It is typically moved (computationally) over a sequence during scoring.
data
bioinformatics
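The notion of a window described above (a region of fixed size but not fixed position, moved over a sequence during scoring) can be sketched in Python as follows. The function names, the step parameter and the use of GC fraction as the per-window score are illustrative assumptions; any scoring function could take its place.

```python
from typing import Iterator, Tuple


def sliding_windows(sequence: str, window_size: int, step: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (start position, window) for each full window over the sequence."""
    if window_size < 1 or step < 1:
        raise ValueError("window_size and step must be positive")
    # Only complete windows are produced; a sequence shorter than the window yields nothing.
    for start in range(0, len(sequence) - window_size + 1, step):
        yield start, sequence[start:start + window_size]


def gc_fraction(window: str) -> float:
    """Fraction of G and C characters in a non-empty window (an example score)."""
    return sum(ch in "GCgc" for ch in window) / len(window)


# Score every window of size 4 over a short, made-up nucleotide sequence.
scores = [(start, gc_fraction(w)) for start, w in sliding_windows("ACGTAC", 4)]
```

The window size stays constant while the start position advances by step, which is the sense in which the window has fixed size but not fixed position.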
Sequence length range
data
Specification of range(s) of length of sequences.
beta12orEarlier
data
bioinformatics
edam
Sequence information report
true
edam
bioinformatics
Report on basic information about a molecular sequence such as name, accession number, type (nucleic or protein), length, description etc.
beta12orEarlier
data
beta12orEarlier
data
Sequence property
Sequence properties report
edam
An informative report about non-positional sequence features, typically a report on general molecular sequence properties derived from sequence analysis.
data
bioinformatics
data
beta12orEarlier
Feature record
Sequence features
Features
Annotation of positional features of molecular sequence(s), i.e. that can be mapped to position(s) in the sequence.
data
edam
data
General sequence features
This includes annotation of positional sequence features, organized into a standard feature table, or any other report of sequence features. General feature reports are a source of sequence feature table information although internal conversion would be required.
SO:0000110
bioinformatics
beta12orEarlier
Sequence features report
Sequence features (comparative)
true
bioinformatics
beta12orEarlier
data
edam
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
beta13
Comparative data on sequence features such as statistics, intersections (and data on intersections), differences etc.
data
Sequence property (protein)
true
data
beta12orEarlier
data
bioinformatics
beta12orEarlier
edam
A report of general sequence properties derived from protein sequence data.
Sequence property (nucleic acid)
true
beta12orEarlier
edam
data
A report of general sequence properties derived from nucleotide sequence data.
beta12orEarlier
data
bioinformatics
Sequence complexity
bioinformatics
Sequence property (complexity)
data
edam
beta12orEarlier
data
A report on sequence complexity, for example low-complexity or repeat regions in sequences.
Sequence ambiguity
beta12orEarlier
bioinformatics
Sequence property (ambiguity)
A report on ambiguity in molecular sequence(s).
edam
data
data
Sequence composition
Sequence property (composition)
edam
data
bioinformatics
data
A report (typically a table) on character or word composition / frequency of molecular sequence(s).
beta12orEarlier
Peptide molecular weight hits
edam
data
beta12orEarlier
data
bioinformatics
A report on peptide fragments of certain molecular weight(s) in one or more protein sequences.
Sequence composition (base position variability)
data
edam
data
bioinformatics
Report on or plot of third base position variability in a nucleotide sequence.
beta12orEarlier
Sequence composition table
true
data
beta12orEarlier
data
edam
beta12orEarlier
A table of character or word composition / frequency of a molecular sequence.
bioinformatics
Sequence composition (base frequencies)
bioinformatics
data
edam
A table of base frequencies of a nucleotide sequence.
data
beta12orEarlier
Sequence composition (base words)
data
edam
data
A table of word composition of a nucleotide sequence.
beta12orEarlier
bioinformatics
Amino acid frequencies
data
data
A table of amino acid frequencies of a protein sequence.
beta12orEarlier
bioinformatics
Sequence composition (amino acid frequencies)
edam
Amino acid word frequencies
data
edam
beta12orEarlier
Sequence composition (amino acid words)
bioinformatics
data
A table of amino acid word composition of a protein sequence.
DAS sequence feature annotation
true
edam
data
beta12orEarlier
bioinformatics
Annotation of a molecular sequence in DAS format.
beta12orEarlier
data
Sequence feature table
Feature table
Annotation of positional sequence features, organized into a standard feature table.
data
beta12orEarlier
edam
bioinformatics
data
Map
bioinformatics
A map of (typically one) DNA sequence annotated with positional or non-positional features.
data
beta12orEarlier
DNA map
edam
data
Nucleic acid features
edam
bioinformatics
Feature table (nucleic acid)
Nucleic acid feature table
Nucleotide sequence-specific feature annotation (positional features of a nucleotide sequence).
data
This includes nucleotide sequence feature annotation in any known sequence feature table format and any other report of nucleic acid features.
data
beta12orEarlier
Protein features
Protein feature table
This includes protein sequence feature annotation in any known sequence feature table format and any other report of protein features.
edam
data
Feature table (protein)
bioinformatics
data
Protein sequence-specific feature annotation (positional features of a protein sequence).
beta12orEarlier
Genetic map
Moby:GeneticMap
Linkage map
beta12orEarlier
data
edam
A genetic (linkage) map indicates the proximity of two genes on a chromosome, whether two genes are linked and the frequency with which they are transmitted together to offspring. Genetic maps are limited to genetic markers of traits observable only in whole organisms.
data
bioinformatics
A map showing the relative positions of genetic markers in a nucleic acid sequence, based on estimation of non-physical distance such as recombination frequencies.
Sequence map
A map of genetic markers in a contiguous, assembled genomic sequence, with the sizes and separation of markers measured in base pairs.
data
data
beta12orEarlier
bioinformatics
A sequence map typically includes annotation on significant subsequences such as contigs, haplotypes and genes. The contigs shown will (typically) be a set of small overlapping clones representing a complete chromosomal segment.
edam
Physical map
data
beta12orEarlier
bioinformatics
Distance in a physical map is measured in base pairs. A physical map might be ordered relative to a reference map (typically a genetic map) in the process of genome sequencing.
edam
data
A map of DNA (linear or circular) annotated with physical features or landmarks such as restriction sites, cloned DNA fragments, genes or genetic markers, along with the physical distances between them.
Sequence signature map
true
bioinformatics
beta12orEarlier
edam
beta12orEarlier
data
data
Image of a sequence with matches to signatures, motifs or profiles.
Cytogenetic map
bioinformatics
This is the lowest-resolution physical map and can provide only rough estimates of physical (base pair) distances. Like a genetic map, it is limited to genetic markers of traits observable only in whole organisms.
Cytogenic map
data
Chromosome map
edam
Cytologic map
A map showing banding patterns derived from direct observation of a stained chromosome.
data
beta12orEarlier
DNA transduction map
A gene map showing distances between loci based on relative cotransduction frequencies.
data
data
bioinformatics
beta12orEarlier
edam
Gene map
beta12orEarlier
data
edam
data
Sequence map of a single gene annotated with genetic features such as introns, exons, untranslated regions, polyA signals, promoters, enhancers and (possibly) mutations defining alleles of a gene.
bioinformatics
Plasmid map
beta12orEarlier
Sequence map of a plasmid (circular DNA).
bioinformatics
data
data
edam
Genome map
edam
bioinformatics
data
data
Sequence map of a whole genome.
beta12orEarlier
Restriction map
data
data
beta12orEarlier
Image of the restriction enzyme cleavage sites (restriction sites) in a nucleic acid sequence.
bioinformatics
edam
InterPro compact match image
true
edam
data
data
bioinformatics
beta12orEarlier
The sequence(s) might be screened against InterPro, or be the sequences from the InterPro entry itself. Each protein is represented as a scaled horizontal line with colored bars indicating the position of the matches.
Image showing matches between protein sequence(s) and InterPro Entries.
beta12orEarlier
InterPro detailed match image
true
The sequence(s) might be screened against InterPro, or be the sequences from the InterPro entry itself.
Image showing detailed information on matches between protein sequence(s) and InterPro Entries.
beta12orEarlier
data
edam
bioinformatics
data
beta12orEarlier
InterPro architecture image
true
bioinformatics
beta12orEarlier
data
Image showing the architecture of InterPro domains in a protein sequence.
The sequence(s) might be screened against InterPro, or be the sequences from the InterPro entry itself. Domain architecture is shown as a series of non-overlapping domains in the protein.
data
edam
beta12orEarlier
SMART protein schematic
true
SMART protein schematic in PNG format.
beta12orEarlier
beta12orEarlier
edam
data
data
bioinformatics
GlobPlot domain image
true
data
bioinformatics
edam
Images based on GlobPlot prediction of intrinsically disordered regions and globular domains in protein sequences.
beta12orEarlier
beta12orEarlier
data
Sequence features (motifs)
data
Report on the location of matches to profiles, motifs (conserved or functional patterns) or other signatures in one or more sequences.
beta12orEarlier
Use this concept if another, more specific concept is not available.
bioinformatics
data
edam
Sequence features (repeats)
data
data
Location of short repetitive subsequences (repeat sequences) in (typically nucleotide) sequences.
The report might include derived data such as classification, annotation, organization, periodicity etc.
Repeat sequence map
edam
beta12orEarlier
bioinformatics
Nucleic acid features (gene and transcript structure)
edam
bioinformatics
data
A report on predicted or actual gene structure, regions which make an RNA product and features such as promoters, coding regions, splice sites etc.
beta12orEarlier
data
Gene annotation (structure)
Nucleic acid features (mobile genetic elements)
data
data
edam
beta12orEarlier
A report on a region of a nucleic acid sequence containing mobile genetic elements.
Nucleic acid features (transposons)
This includes transposons, plasmids, bacteriophage elements and group II introns.
bioinformatics
Nucleic acid features (PolyA signal or site)
beta12orEarlier
A polyA signal is required for endonuclease cleavage of an RNA transcript that is followed by polyadenylation. A polyA site is a site on an RNA transcript to which adenine residues will be added during post-transcriptional polyadenylation.
bioinformatics
data
A region or site in a eukaryotic or eukaryotic viral RNA sequence which directs endonuclease cleavage or polyadenylation of an RNA transcript.
data
PolyA site
PolyA signal
edam
Nucleic acid features (quadruplexes)
beta12orEarlier
data
bioinformatics
data
A report on quadruplex-forming motifs in a nucleotide sequence.
edam
Nucleic acid features (CpG island and isochore)
data
A report or plot of CpG rich regions (isochores) in a nucleotide sequence.
bioinformatics
beta12orEarlier
edam
data
Nucleic acid features (restriction sites)
bioinformatics
data
Report on restriction enzyme recognition sites (restriction sites) in a nucleic acid sequence.
edam
data
beta12orEarlier
Nucleic acid features (nucleosome exclusion sequences)
A report on nucleosome formation potential or exclusion sequence(s).
data
beta12orEarlier
edam
bioinformatics
data
Nucleic acid features (splice sites)
Nucleic acid report (RNA splicing)
Nucleic acid report (RNA splice model)
bioinformatics
data
data
beta12orEarlier
A report on splice sites in a nucleotide sequence or alternative RNA splicing events.
edam
Nucleic acid features (matrix/scaffold attachment sites)
edam
data
beta12orEarlier
bioinformatics
A report on matrix/scaffold attachment regions (MARs/SARs) in a DNA sequence.
data
Gene features (exonic splicing enhancer)
true
data
A report on exonic splicing enhancers (ESE) in an exon.
edam
bioinformatics
beta13
beta12orEarlier
data
Nucleic acid features (microRNA)
data
A report on microRNA sequence (miRNA) or precursor, microRNA targets, miRNA binding sites in an RNA sequence etc.
edam
beta12orEarlier
data
bioinformatics
Nucleic acid features (operon)
data
beta12orEarlier
A report on operons (operators, promoters and genes) from a bacterial genome.
edam
bioinformatics
The report for a query sequence or gene might include the predicted operon leader and trailer gene, gene composition of the operon and associated information, as well as information on the query.
Gene features (operon)
data
Gene features (promoter)
beta12orEarlier
data
data
A report on whole promoters or promoter elements (transcription start sites, RNA polymerase binding site, transcription factor binding sites, promoter enhancers etc) in a DNA sequence.
edam
bioinformatics
Nucleic acid features (coding sequence)
Gene features (coding sequence)
beta12orEarlier
Gene features (coding region)
A report on protein-coding regions including coding sequences (CDS), exons, translation initiation sites and open reading frames.
Gene annotation (translation)
bioinformatics
data
edam
data
Gene features (SECIS element)
true
data
data
beta13
A report on a selenocysteine insertion sequence (SECIS) element in a DNA sequence.
beta12orEarlier
edam
bioinformatics
Gene features (TFBS)
data
A report on the transcription factor binding sites (TFBS) in a DNA sequence.
bioinformatics
data
edam
beta12orEarlier
Protein features (sites)
true
A report on predicted or known key residue positions (sites) in a protein sequence, such as binding or functional sites.
bioinformatics
beta12orEarlier
beta12orEarlier
data
Use this concept for collections of specific sites which are not necessarily contiguous, rather than contiguous stretches of amino acids.
edam
data
Protein features (signal peptides)
beta12orEarlier
edam
A report on the location of signal peptides or signal peptide cleavage sites in protein sequences.
data
bioinformatics
data
Protein features (cleavage sites)
data
bioinformatics
edam
beta12orEarlier
A report on cleavage sites (for a proteolytic enzyme or agent) in a protein sequence.
data
Protein features (post-translation modifications)
data
data
Post-translation modification
beta12orEarlier
bioinformatics
A report on post-translation modifications in a protein sequence, typically describing the specific sites involved.
edam
Protein features (post-translation modification sites)
Protein features (active sites)
data
beta12orEarlier
Enzyme active site
A report on catalytic residues (active site) of an enzyme.
data
edam
bioinformatics
Protein features (binding sites)
bioinformatics
A report on ligand-binding (non-catalytic) residues of a protein, such as sites that bind metal, prosthetic groups or lipids.
beta12orEarlier
data
data
edam
Protein features (epitopes)
true
A report on antigenic determinant sites (epitopes) in proteins, from sequence and / or structural data.
beta13
data
Epitope mapping is commonly done during vaccine design.
beta12orEarlier
bioinformatics
edam
data
Protein features (nucleic acid binding sites)
bioinformatics
beta12orEarlier
edam
data
data
A report on RNA and DNA-binding proteins and binding sites in protein sequences.
MHC Class I epitopes report
true
data
edam
A report on epitopes that bind to MHC class I molecules.
beta12orEarlier
bioinformatics
data
beta12orEarlier
MHC Class II epitopes report
true
A report on predicted epitopes that bind to MHC class II molecules.
data
data
beta12orEarlier
beta12orEarlier
edam
bioinformatics
Protein features (PEST sites)
true
data
bioinformatics
beta13
'PEST' motifs target proteins for proteolytic degradation and reduce the half-lives of proteins dramatically.
data
edam
A report or plot of PEST sites in a protein sequence.
beta12orEarlier
Sequence database hits scores list
true
data
beta12orEarlier
bioinformatics
edam
beta12orEarlier
Scores from a sequence database search (for example a BLAST search).
data
Sequence database hits alignments list
true
edam
beta12orEarlier
data
Alignments from a sequence database search (for example a BLAST search).
beta12orEarlier
bioinformatics
data
Sequence database hits evaluation data
true
data
data
bioinformatics
edam
beta12orEarlier
A report on the evaluation of the significance of sequence similarity scores from a sequence database search (for example a BLAST search).
beta12orEarlier
MEME motif alphabet
true
bioinformatics
data
data
beta12orEarlier
beta12orEarlier
Alphabet for the motifs (patterns) that MEME will search for.
edam
MEME background frequencies file
true
beta12orEarlier
data
beta12orEarlier
bioinformatics
edam
data
MEME background frequencies file.
MEME motifs directive file
true
beta12orEarlier
edam
File of directives for ordering and spacing of MEME motifs.
data
beta12orEarlier
data
bioinformatics
Dirichlet distribution
data
data
edam
bioinformatics
beta12orEarlier
Dirichlet distribution used by hidden Markov model analysis programs.
HMM emission and transition counts
bioinformatics
beta12orEarlier
data
Emission and transition counts of a hidden Markov model, generated once the HMM has been determined, for example after residues/gaps have been assigned to match, delete and insert states.
data
edam
Regular expression
bioinformatics
beta12orEarlier
data
Regular expression pattern.
data
edam
Sequence motif
edam
bioinformatics
Any specific or conserved pattern (typically expressed as a regular expression) in a molecular sequence.
data
beta12orEarlier
data
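To make the link between a sequence motif and a regular expression concrete, the following Python sketch scans a protein sequence with a pattern. The helper name `find_motif` and the test sequence are hypothetical; the pattern used is the well-known N-glycosylation site motif (N, then any residue except P, then S or T, then any residue except P), written here in Python `re` syntax.

```python
import re


def find_motif(pattern, sequence):
    """Return (start, end, text) for each non-overlapping match.

    Positions are 0-based with an exclusive end, as in Python slicing.
    """
    return [(m.start(), m.end(), m.group())
            for m in re.finditer(pattern, sequence)]


# N-glycosylation motif; the sequence below is made up for illustration.
hits = find_motif(r"N[^P][ST][^P]", "MKNGSAANPTQ")
print(hits)
```

Note that `re.finditer` reports non-overlapping matches only; overlapping occurrences would need a lookahead-based pattern.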
Sequence profile
edam
bioinformatics
data
data
beta12orEarlier
Some type of statistical model representing a (typically multiple) sequence alignment.
Protein signature
data
edam
beta12orEarlier
InterPro entry
data
An entry (sequence classifier and associated data) from the InterPro database.
bioinformatics
Prosite nucleotide pattern
true
beta12orEarlier
data
A nucleotide regular expression pattern from the Prosite database.
beta12orEarlier
bioinformatics
data
edam
Prosite protein pattern
true
beta12orEarlier
data
edam
beta12orEarlier
bioinformatics
data
A protein regular expression pattern from the Prosite database.
Position frequency matrix
PFM
A profile (typically representing a sequence alignment) that is a simple matrix of nucleotide (or amino acid) counts per position.
bioinformatics
beta12orEarlier
edam
data
data
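A position frequency matrix can be sketched as per-column character counts over an ungapped alignment. In the Python example below, the function name `position_frequency_matrix`, the dictionary-of-lists layout and the default DNA alphabet are assumptions chosen for illustration.

```python
def position_frequency_matrix(aligned, alphabet="ACGT"):
    """Count each alphabet character at each column of equal-length sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    pfm = {ch: [0] * length for ch in alphabet}
    for seq in aligned:
        for i, ch in enumerate(seq.upper()):
            if ch in pfm:  # characters outside the alphabet are ignored
                pfm[ch][i] += 1
    return pfm


pfm = position_frequency_matrix(["ACGT", "ACGA", "TCGT"])
for ch, row in pfm.items():
    print(ch, row)
```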
Position weight matrix
Contributions of individual sequences to the matrix might be uneven (weighted).
data
A profile (typically representing a sequence alignment) that is a weighted matrix of nucleotide (or amino acid) counts per position.
data
edam
beta12orEarlier
PWM
bioinformatics
Information content matrix
edam
A profile (typically representing a sequence alignment) derived from a matrix of nucleotide (or amino acid) counts per position that reflects information content at each position.
data
ICM
bioinformatics
data
beta12orEarlier
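One common way to derive an information content matrix from per-position counts is to scale each character's frequency by the column's information content, log2(alphabet size) minus the Shannon entropy. The Python sketch below follows that scheme; the function name `information_content_matrix`, the dictionary layout and the absence of a small-sample correction are assumptions, and other tools may compute the matrix differently.

```python
import math


def information_content_matrix(counts):
    """Convert {char: [count per column]} into per-character bits per column."""
    alphabet = list(counts)
    length = len(counts[alphabet[0]])
    max_bits = math.log2(len(alphabet))
    icm = {ch: [0.0] * length for ch in alphabet}
    for i in range(length):
        total = sum(counts[ch][i] for ch in alphabet)
        if total == 0:
            continue  # empty column: leave all entries at 0.0
        probs = {ch: counts[ch][i] / total for ch in alphabet}
        entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
        column_bits = max_bits - entropy
        for ch in alphabet:
            icm[ch][i] = probs[ch] * column_bits
    return icm


# Column 0 is fully conserved (2 bits); column 1 is uniform (0 bits).
icm = information_content_matrix(
    {"A": [4, 1], "C": [0, 1], "G": [0, 1], "T": [0, 1]}
)
print(icm)
```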
Hidden Markov model
edam
data
data
A hidden Markov model representation of a set or alignment of sequences.
beta12orEarlier
bioinformatics
HMM
Fingerprint
bioinformatics
One or more fingerprints (sequence classifiers) as used in the PRINTS database.
data
edam
data
beta12orEarlier
Domainatrix signature
true
beta12orEarlier
beta12orEarlier
bioinformatics
data
edam
A protein signature of the type used in the EMBASSY Signature package.
data
HMMER NULL hidden Markov model
true
bioinformatics
NULL hidden Markov model representation used by the HMMER package.
data
data
beta12orEarlier
edam
beta12orEarlier
Protein family signature
A protein family signature (sequence classifier) from the InterPro database.
data
data
bioinformatics
edam
beta12orEarlier
Protein family signatures cover all domains in the matching proteins and span more than 80% of the protein length, with no adjacent protein domain signatures or protein region signatures.
Protein domain signature
Protein domain signatures identify structural or functional domains or other units with defined boundaries.
bioinformatics
A protein domain signature (sequence classifier) from the InterPro database.
edam
data
beta12orEarlier
data
Protein region signature
A protein region signature defines a region which cannot be described as a protein family or domain signature.
bioinformatics
data
beta12orEarlier
A protein region signature (sequence classifier) from the InterPro database.
edam
data
Protein repeat signature
A protein repeat signature (sequence classifier) from the InterPro database.
beta12orEarlier
data
edam
A protein repeat signature is a repeated protein motif that, as a single copy, is not expected to fold independently into a globular domain.
data
bioinformatics
Protein site signature
edam
data
bioinformatics
A protein site signature (sequence classifier) from the InterPro database.
data
beta12orEarlier
A protein site signature is a classifier for a specific site in a protein.
Protein conserved site signature
beta12orEarlier
edam
A protein conserved site signature (sequence classifier) from the InterPro database.
bioinformatics
A protein conserved site signature is any short sequence pattern that may contain one or more unique residues and cannot be described as an active site, binding site or post-translational modification.
data
data
Protein active site signature
beta12orEarlier
edam
data
bioinformatics
data
A protein active site signature (sequence classifier) from the InterPro database.
A protein active site signature corresponds to an enzyme catalytic pocket, that is, residues involved in enzymatic reactions for which mutational data is typically available. An active site typically includes non-contiguous residues, therefore multiple signatures may be required to describe an active site.
Protein binding site signature
data
beta12orEarlier
A protein binding site signature corresponds to a site that reversibly binds chemical compounds, which are not themselves substrates of the enzymatic reaction. This includes enzyme cofactors and residues involved in electron transport or protein structure modification.
data
bioinformatics
A protein binding site signature (sequence classifier) from the InterPro database.
edam
Protein post-translational modification signature
beta12orEarlier
A protein post-translational modification signature corresponds to sites that undergo modification of the primary structure, typically to activate or de-activate a function, for example by methylation, sumoylation or glycosylation. The modification might be permanent or reversible.
A protein post-translational modification signature (sequence classifier) from the InterPro database.
data
bioinformatics
edam
data
Sequence alignment (pair)
data
Alignment of exactly two molecular sequences.
bioinformatics
beta12orEarlier
data
edam
Sequence alignment (multiple)
true
data
edam
Alignment of more than two molecular sequences.
beta12orEarlier
beta12orEarlier
bioinformatics
data
Sequence alignment (nucleic acid)
data
Alignment of multiple nucleotide sequences.
edam
bioinformatics
data
beta12orEarlier
Sequence alignment (protein)
bioinformatics
beta12orEarlier
data
edam
data
Alignment of multiple protein sequences.
Sequence alignment (hybrid)
data
bioinformatics
beta12orEarlier
Hybrid sequence alignments include for example genomic DNA to EST, cDNA or mRNA.
data
Alignment of multiple molecular sequences of different types.
edam
Sequence alignment (nucleic acid pair)
bioinformatics
Alignment of exactly two nucleotide sequences.
beta12orEarlier
data
edam
data
Sequence alignment (protein pair)
beta12orEarlier
Alignment of exactly two protein sequences.
data
edam
bioinformatics
data
Hybrid sequence alignment (pair)
true
Alignment of exactly two molecular sequences of different types.
edam
bioinformatics
beta12orEarlier
data
data
beta12orEarlier
Multiple nucleotide sequence alignment
true
edam
beta12orEarlier
data
Alignment of more than two nucleotide sequences.
bioinformatics
data
beta12orEarlier
Multiple protein sequence alignment
true
data
edam
Alignment of more than two protein sequences.
beta12orEarlier
data
beta12orEarlier
bioinformatics
Alignment score or penalty
A simple floating point number defining the penalty for opening or extending a gap in an alignment.
data
edam
bioinformatics
data
beta12orEarlier
Score end gaps control
true
bioinformatics
edam
Whether end gaps are scored or not.
beta12orEarlier
beta12orEarlier
data
data
Aligned sequence order
true
Controls the order of sequences in an output sequence alignment.
data
data
bioinformatics
edam
beta12orEarlier
beta12orEarlier
Gap opening penalty
bioinformatics
beta12orEarlier
data
edam
A penalty for opening a gap in an alignment.
data
Gap extension penalty
edam
beta12orEarlier
data
A penalty for extending a gap in an alignment.
bioinformatics
data
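Gap opening and gap extension penalties are often combined in an affine gap cost. The Python sketch below assumes the convention in which the first gap position pays the opening penalty and each further position pays the extension penalty; some programs charge the extension penalty on the first position as well. The function name `affine_gap_cost` and the default values are illustrative assumptions.

```python
def affine_gap_cost(gap_length, gap_open=10.0, gap_extend=0.5):
    """Return the total penalty for a single gap of `gap_length` positions.

    Assumed convention: cost = gap_open + gap_extend * (gap_length - 1).
    """
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    if gap_length == 0:
        return 0.0
    return gap_open + gap_extend * (gap_length - 1)


for n in (1, 2, 4):
    print(n, affine_gap_cost(n))
```

Under this scheme one long gap is cheaper than several short gaps of the same total length, which is the usual motivation for separating the two penalties.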
Gap separation penalty
data
edam
bioinformatics
A penalty for gaps that are close together in an alignment.
data
beta12orEarlier
Terminal gap penalty
true
beta12orEarlier
data
bioinformatics
A penalty for gaps at the termini of an alignment, either from the N/C terminal of protein or 5'/3' terminal of nucleotide sequences.
data
edam
beta12orEarlier
Match reward score
The score for a 'match' used in various sequence database search applications with simple scoring schemes.
data
edam
beta12orEarlier
data
bioinformatics
Mismatch penalty score
The score (penalty) for a 'mismatch' used in various alignment and sequence database search applications with simple scoring schemes.
beta12orEarlier
bioinformatics
data
edam
data
Drop off score
beta12orEarlier
data
bioinformatics
data
This is the threshold drop in score at which extension of word alignment is halted.
edam
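The role of a drop off score can be sketched with an ungapped rightward extension that stops once the running score has fallen a set amount below the best score seen so far. The function name `extend_right`, the scoring values and the use of "greater than or equal to" as the stopping test are assumptions for this Python example; actual search programs differ in the details.

```python
def extend_right(query, subject, q_start, s_start,
                 match=1, mismatch=-3, drop_off=5):
    """Extend an ungapped alignment rightwards from the given start positions.

    Returns (best_score, best_length). Extension halts when the running
    score is at least `drop_off` below the best score seen so far.
    """
    score = best = 0
    best_len = 0
    i = 0
    while q_start + i < len(query) and s_start + i < len(subject):
        same = query[q_start + i] == subject[s_start + i]
        score += match if same else mismatch
        i += 1
        if score > best:
            best, best_len = score, i
        elif best - score >= drop_off:
            break
    return best, best_len


result = extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0)
print(result)
```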
Gap opening penalty (integer)
true
data
beta12orEarlier
A simple integer defining the penalty for opening a gap in an alignment.
beta12orEarlier
bioinformatics
edam
data
Gap opening penalty (float)
true
A simple floating point number defining the penalty for opening a gap in an alignment.
beta12orEarlier
bioinformatics
beta12orEarlier
data
data
edam
Gap extension penalty (integer)
true
edam
beta12orEarlier
beta12orEarlier
A simple integer defining the penalty for extending a gap in an alignment.
bioinformatics
data
data
Gap extension penalty (float)
true
data
beta12orEarlier
edam
data
A simple floating point number defining the penalty for extending a gap in an alignment.
beta12orEarlier
bioinformatics
Gap separation penalty (integer)
true
edam
beta12orEarlier
data
data
bioinformatics
beta12orEarlier
A simple integer defining the penalty for gaps that are close together in an alignment.
Gap separation penalty (float)
true
A simple floating point number defining the penalty for gaps that are close together in an alignment.
data
bioinformatics
beta12orEarlier
data
beta12orEarlier
edam
Terminal gap opening penalty
data
A number defining the penalty for opening gaps at the termini of an alignment, either from the N/C terminal of protein or 5'/3' terminal of nucleotide sequences.
edam
data
beta12orEarlier
bioinformatics
Terminal gap extension penalty
bioinformatics
data
data
edam
beta12orEarlier
A number defining the penalty for extending gaps at the termini of an alignment, either from the N/C terminal of protein or 5'/3' terminal of nucleotide sequences.
Sequence identity
beta12orEarlier
data
edam
bioinformatics
data
Sequence identity is the number (%) of matches (identical characters) in positions from an alignment of two molecular sequences.
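A percentage identity can be sketched from two aligned sequences as follows. This Python example assumes the denominator is the full alignment length including gap columns; other conventions (for example, dividing by the length of the shorter sequence) give different values. The function name `percent_identity` and the gap character are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return the percentage of alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = len(aligned_a)
    if columns == 0:
        return 0.0
    # A column counts as a match only if both characters agree and are not gaps.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b)
                  if a == b and a != gap)
    return 100.0 * matches / columns


identity = percent_identity("ACGT-A", "ACCTGA")
print(round(identity, 2))
```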
Sequence similarity
Data Type is float probably.
edam
Sequence similarity is the similarity (expressed as a percentage) of two molecular sequences calculated from their alignment, a scoring matrix for scoring character substitutions and penalties for gap insertion and extension.
data
data
bioinformatics
beta12orEarlier
Sequence alignment metadata (quality report)
true
Data on molecular sequence alignment quality (estimated accuracy).
beta12orEarlier
edam
data
bioinformatics
data
beta12orEarlier
Sequence alignment report (site conservation)
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation. Use this concept for calculated substitution rates, relative site variability, data on sites with biased properties, highly conserved or very poorly conserved sites, regions, blocks etc.
data
data
Data on character conservation in a molecular sequence alignment.
beta12orEarlier
edam
bioinformatics
Sequence alignment report (site correlation)
Data on correlations between sites in a molecular sequence alignment, typically to identify possible covarying positions and predict contacts or structural constraints in protein structures.
beta12orEarlier
data
bioinformatics
edam
data
Sequence-profile alignment (Domainatrix signature)
true
beta12orEarlier
data
Alignment of molecular sequences to a Domainatrix signature (representing a sequence alignment).
edam
beta12orEarlier
bioinformatics
data
Sequence-profile alignment (HMM)
bioinformatics
data
Alignment of molecular sequence(s) to a hidden Markov model(s).
edam
data
beta12orEarlier
Sequence-profile alignment (fingerprint)
edam
beta12orEarlier
bioinformatics
data
data
Alignment of molecular sequences to a protein fingerprint from the PRINTS database.
Phylogenetic continuous quantitative data
Continuous quantitative data that may be read during phylogenetic tree calculation.
edam
Quantitative traits
Phylogenetic continuous quantitative characters
beta12orEarlier
data
bioinformatics
data
Phylogenetic discrete data
Discretely coded characters
data
bioinformatics
beta12orEarlier
Phylogenetic discrete states
edam
data
Character data with discrete states that may be read during phylogenetic tree calculation.
Discrete characters
Phylogenetic character cliques
data
beta12orEarlier
edam
Phylogenetic report (cliques)
data
One or more cliques of mutually compatible characters that are generated, for example from analysis of discrete character data, and are used to generate a phylogeny.
bioinformatics
Phylogenetic invariants
bioinformatics
Phylogenetic report (invariants)
data
data
beta12orEarlier
Phylogenetic invariants data for testing alternative tree topologies.
edam
Phylogenetic tree report
data
data
A report of data concerning or derived from a phylogenetic tree, or from comparing two or more phylogenetic trees.
This is a broad data type and is used as a placeholder for other, more specific types.
bioinformatics
beta12orEarlier
Phylogenetic tree-derived report
edam
DNA substitution model
beta12orEarlier
Sequence alignment report (DNA substitution model)
edam
data
A model of DNA substitution that explains a DNA sequence alignment, derived from phylogenetic tree analysis.
data
bioinformatics
Phylogenetic tree report (DNA substitution model)
Phylogenetic tree report (tree shape)
data
Data about the shape of a phylogenetic tree.
data
edam
beta12orEarlier
bioinformatics
Phylogenetic tree report (tree evaluation)
bioinformatics
Data on the confidence of a phylogenetic tree.
edam
data
data
beta12orEarlier
Phylogenetic tree report (tree distances)
Distances, such as Branch Score distance, between two or more phylogenetic trees.
data
edam
data
beta12orEarlier
bioinformatics
Phylogenetic tree report (tree stratigraphic)
bioinformatics
data
edam
Molecular clock and stratigraphic (age) data derived from phylogenetic tree analysis.
data
beta12orEarlier
Phylogenetic character contrasts
beta12orEarlier
edam
data
Independent contrasts for characters used in a phylogenetic tree, or covariances, regressions and correlations between characters for those contrasts.
bioinformatics
Phylogenetic report (character contrasts)
data
Comparison matrix (integers)
true
bioinformatics
beta12orEarlier
Matrix of integer numbers for sequence comparison.
data
edam
data
Substitution matrix (integers)
beta12orEarlier
Comparison matrix (floats)
true
data
beta12orEarlier
bioinformatics
edam
Substitution matrix (floats)
data
Matrix of floating point numbers for sequence comparison.
beta12orEarlier
Comparison matrix (nucleotide)
data
data
bioinformatics
Nucleotide substitution matrix
beta12orEarlier
edam
Matrix of integer or floating point numbers for nucleotide comparison.
Comparison matrix (amino acid)
Amino acid substitution matrix
edam
data
data
Matrix of integer or floating point numbers for amino acid comparison.
beta12orEarlier
bioinformatics
Nucleotide comparison matrix (integers)
true
Matrix of integer numbers for nucleotide comparison.
beta12orEarlier
beta12orEarlier
data
Nucleotide substitution matrix (integers)
edam
data
bioinformatics
Nucleotide comparison matrix (floats)
true
edam
Nucleotide substitution matrix (floats)
beta12orEarlier
data
bioinformatics
Matrix of floating point numbers for nucleotide comparison.
data
beta12orEarlier
Amino acid comparison matrix (integers)
true
data
beta12orEarlier
data
beta12orEarlier
bioinformatics
Amino acid substitution matrix (integers)
edam
Matrix of integer numbers for amino acid comparison.
Amino acid comparison matrix (floats)
true
data
beta12orEarlier
data
Amino acid substitution matrix (floats)
bioinformatics
edam
Matrix of floating point numbers for amino acid comparison.
beta12orEarlier
Protein features (membrane regions)
bioinformatics
An informative report on trans- or intra-membrane regions of a protein, typically describing physicochemical properties of the secondary structure elements.
This might include the location and size of the membrane spanning segments and intervening loop regions, transmembrane region IN/OUT orientation relative to the membrane, plus the following data for each amino acid: A Z-coordinate (the distance to the membrane center), the free energy of membrane insertion (calculated in a sliding window over the sequence) and a reliability score. The z-coordinate implies information about re-entrant helices, interfacial helices, the tilt of a transmembrane helix and loop lengths.
beta12orEarlier
Intramembrane region report
data
edam
data
Transmembrane region report
Protein report (membrane protein)
Nucleic acid structure
data
edam
bioinformatics
beta12orEarlier
3D coordinate and associated data for a nucleic acid tertiary (3D) structure.
data
Protein structure
beta12orEarlier
data
edam
3D coordinate and associated data for a protein tertiary (3D) structure.
bioinformatics
data
Protein-ligand complex
This includes interactions of proteins with atoms, ions and small molecules or macromolecules such as nucleic acids or other polypeptides. For stable inter-polypeptide interactions use 'Protein complex' instead.
data
bioinformatics
edam
beta12orEarlier
The structure of a protein in complex with a ligand, typically a small molecule such as an enzyme substrate or cofactor, but possibly another macromolecule.
data
Carbohydrate structure
bioinformatics
3D coordinate and associated data for a carbohydrate (3D) structure.
beta12orEarlier
edam
data
data
Small molecule structure
data
beta12orEarlier
bioinformatics
CHEBI:23367
edam
data
3D coordinate and associated data for the (3D) structure of a small molecule, such as any common chemical compound.
DNA structure
data
data
bioinformatics
3D coordinate and associated data for a DNA tertiary (3D) structure.
beta12orEarlier
edam
RNA structure record
edam
3D coordinate and associated data for an RNA tertiary (3D) structure.
bioinformatics
beta12orEarlier
data
data
tRNA structure record
data
beta12orEarlier
3D coordinate and associated data for a tRNA tertiary (3D) structure, including tmRNA, snoRNAs etc.
data
edam
bioinformatics
Protein chain
bioinformatics
3D coordinate and associated data for the tertiary (3D) structure of a polypeptide chain.
data
beta12orEarlier
edam
data
Protein domain
data
data
beta12orEarlier
bioinformatics
edam
3D coordinate and associated data for the tertiary (3D) structure of a protein domain.
Protein structure (all atoms)
3D coordinate and associated data for a protein tertiary (3D) structure (all atoms).
data
bioinformatics
data
edam
beta12orEarlier
Protein structure (C-alpha atoms)
data
beta12orEarlier
data
bioinformatics
edam
C-beta atoms from amino acid side-chains may be included.
3D coordinate and associated data for a protein tertiary (3D) structure (typically C-alpha atoms only).
Protein chain (all atoms)
true
data
edam
beta12orEarlier
3D coordinate and associated data for a polypeptide chain tertiary (3D) structure (all atoms).
data
beta12orEarlier
bioinformatics
Protein chain (C-alpha atoms)
true
data
edam
beta12orEarlier
3D coordinate and associated data for a polypeptide chain tertiary (3D) structure (typically C-alpha atoms only).
data
bioinformatics
C-beta atoms from amino acid side-chains may be included.
beta12orEarlier
Protein domain (all atoms)
true
data
data
beta12orEarlier
3D coordinate and associated data for a protein domain tertiary (3D) structure (all atoms).
beta12orEarlier
bioinformatics
edam
Protein domain (C-alpha atoms)
true
beta12orEarlier
beta12orEarlier
bioinformatics
C-beta atoms from amino acid side-chains may be included.
data
edam
data
3D coordinate and associated data for a protein domain tertiary (3D) structure (typically C-alpha atoms only).
Structure alignment (pair)
data
data
beta12orEarlier
edam
bioinformatics
Alignment (superimposition) of exactly two molecular tertiary (3D) structures.
Structure alignment (multiple)
true
data
edam
Alignment (superimposition) of more than two molecular tertiary (3D) structures.
data
beta12orEarlier
bioinformatics
beta12orEarlier
Structure alignment (protein)
data
edam
bioinformatics
data
Alignment (superimposition) of protein tertiary (3D) structures.
beta12orEarlier
Structure alignment (nucleic acid)
bioinformatics
data
Alignment (superimposition) of nucleic acid tertiary (3D) structures.
beta12orEarlier
edam
data
Structure alignment (protein pair)
Alignment (superimposition) of exactly two protein tertiary (3D) structures.
beta12orEarlier
data
bioinformatics
data
edam
Multiple protein tertiary structure alignment
true
bioinformatics
beta12orEarlier
beta12orEarlier
Alignment (superimposition) of more than two protein tertiary (3D) structures.
edam
data
data
Structure alignment (protein all atoms)
data
beta12orEarlier
Alignment (superimposition) of protein tertiary (3D) structures (all atoms considered).
edam
bioinformatics
data
Structure alignment (protein C-alpha atoms)
beta12orEarlier
bioinformatics
edam
data
C-beta atoms from amino acid side-chains may be considered.
data
Alignment (superimposition) of protein tertiary (3D) structures (typically C-alpha atoms only considered).
Pairwise protein tertiary structure alignment (all atoms)
true
edam
beta12orEarlier
beta12orEarlier
data
data
Alignment (superimposition) of exactly two protein tertiary (3D) structures (all atoms considered).
bioinformatics
Pairwise protein tertiary structure alignment (C-alpha atoms)
true
beta12orEarlier
edam
data
data
beta12orEarlier
bioinformatics
Alignment (superimposition) of exactly two protein tertiary (3D) structures (typically C-alpha atoms only considered).
C-beta atoms from amino acid side-chains may be included.
Multiple protein tertiary structure alignment (all atoms)
true
edam
beta12orEarlier
Alignment (superimposition) of more than two protein tertiary (3D) structures (all atoms considered).
data
data
bioinformatics
beta12orEarlier
Multiple protein tertiary structure alignment (C-alpha atoms)
true
data
data
beta12orEarlier
bioinformatics
beta12orEarlier
C-beta atoms from amino acid side-chains may be included.
edam
Alignment (superimposition) of more than two protein tertiary (3D) structures (typically C-alpha atoms only considered).
Structure alignment (nucleic acid pair)
Alignment (superimposition) of exactly two nucleic acid tertiary (3D) structures.
data
data
bioinformatics
beta12orEarlier
edam
Multiple nucleic acid tertiary structure alignment
true
beta12orEarlier
data
Alignment (superimposition) of more than two nucleic acid tertiary (3D) structures.
data
edam
beta12orEarlier
bioinformatics
Structure alignment (RNA)
beta12orEarlier
Alignment (superimposition) of RNA tertiary (3D) structures.
edam
data
bioinformatics
data
Structural transformation matrix
beta12orEarlier
edam
Matrix to transform (rotate/translate) 3D coordinates, typically the transformation necessary to superimpose two molecular structures.
data
data
bioinformatics
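As an illustration of how such a matrix might be applied, the sketch below assumes a 4x4 homogeneous matrix (rotation in the upper-left 3x3 block, translation in the last column). This layout is an assumption made for the example; EDAM does not prescribe a representation, and `apply_transform` is a hypothetical helper name.

```python
def apply_transform(matrix, point):
    """Apply a 4x4 homogeneous transformation matrix to one (x, y, z) point.

    Assumed layout (not specified by EDAM): rotation in the upper-left
    3x3 block, translation in the fourth column.
    """
    x, y, z = point
    vec = (x, y, z, 1.0)
    # Only the first three rows contribute to the transformed coordinates.
    return tuple(
        sum(matrix[row][col] * vec[col] for col in range(4))
        for row in range(3)
    )


# Rotate 90 degrees about the z axis, then translate by +1 along x.
ROTATE_Z_THEN_SHIFT_X = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

moved = apply_transform(ROTATE_Z_THEN_SHIFT_X, (1.0, 0.0, 0.0))
```

Applying the same matrix to every atom of one structure would place it in the coordinate frame of the other.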
DaliLite hit table
true
data
data
beta12orEarlier
DaliLite hit table of protein chain tertiary structure alignment data.
beta12orEarlier
bioinformatics
The significant and top-scoring hits for regions of the compared structures are shown. Data such as Z-scores, number of aligned residues, root-mean-square deviation (RMSD) of atoms and sequence identity are given.
edam
Molecular similarity score
true
beta12orEarlier
bioinformatics
edam
A score reflecting structural similarities of two molecules.
data
data
beta12orEarlier
Root-mean-square deviation
Root-mean-square deviation (RMSD) is calculated to measure the average distance between superimposed macromolecular coordinates.
bioinformatics
edam
beta12orEarlier
data
data
RMSD
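A minimal sketch of the RMSD calculation, assuming the two coordinate sets are already superimposed and list equivalent atoms in the same order. Strictly, the value is the square root of the mean squared distance rather than the plain average distance. The function name `rmsd` is chosen for this example only.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of superimposed (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between one pair of equivalent atoms.
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))
```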
Tanimoto similarity score
bioinformatics
A ligand fingerprint is derived from ligand structural data from a Protein DataBank file. It reflects the elements or groups present or absent, covalent bonds and bond orders and the bonded environment in terms of SATIS codes and BLEEP atom types.
edam
data
A measure of the similarity between two ligand fingerprints.
data
beta12orEarlier
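The similarity measure can be sketched as follows, assuming each fingerprint is represented as the set of features (bits) that are present. How a fingerprint is actually derived from SATIS codes and BLEEP atom types is not covered here, and the treatment of two empty fingerprints is a convention chosen for this example.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' features."""
    a, b = set(fingerprint_a), set(fingerprint_b)
    union = a | b
    if not union:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(union)
```

The score ranges from 0.0 (no shared features) to 1.0 (identical fingerprints).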
3D-1D scoring matrix
data
beta12orEarlier
data
A matrix of 3D-1D scores reflecting the probability of amino acids to occur in different tertiary structural environments.
bioinformatics
edam
Amino acid index
edam
A table of 20 numerical values which quantify a property (e.g. physicochemical or biochemical) of the common amino acids.
bioinformatics
data
data
beta12orEarlier
Amino acid index (chemical classes)
data
beta12orEarlier
Chemical classification (small, aliphatic, aromatic, polar, charged etc) of amino acids.
bioinformatics
data
edam
Amino acid pair-wise contact potentials
beta12orEarlier
edam
bioinformatics
data
data
Statistical protein contact potentials.
Amino acid index (molecular weight)
edam
bioinformatics
beta12orEarlier
data
Molecular weights of amino acids.
data
Amino acid index (hydropathy)
beta12orEarlier
data
Hydrophobic, hydrophilic or charge properties of amino acids.
edam
data
bioinformatics
Amino acid index (White-Wimley data)
edam
data
Experimental free energy values for the water-interface and water-octanol transitions for the amino acids.
data
bioinformatics
beta12orEarlier
Amino acid index (van der Waals radii)
beta12orEarlier
data
data
edam
Van der Waals radii of atoms for different amino acid residues.
bioinformatics
Enzyme property
An informative report on a specific enzyme.
bioinformatics
data
data
Enzyme report
edam
beta12orEarlier
Protein report (enzyme)
Restriction enzyme property
edam
data
data
Restriction enzyme report
Protein report (restriction enzyme)
This might include name of enzyme, organism, isoschizomers, methylation, source, suppliers, literature references, or data on restriction enzyme patterns such as name of enzyme, recognition site, length of pattern, number of cuts made by enzyme, details of blunt or sticky end cut etc.
beta12orEarlier
bioinformatics
Restriction enzyme pattern data
An informative report on a specific restriction enzyme such as enzyme reference data.
Peptide molecular weights
The report might include associated data such as frequency of peptide fragment molecular weights.
data
List of molecular weight(s) of one or more proteins or peptides, for example cut by proteolytic enzymes or reagents.
beta12orEarlier
bioinformatics
edam
data
Peptide hydrophobic moment
data
Hydrophobic moment is a peptide's hydrophobicity measured for different angles of rotation.
Report on the hydrophobic moment of a polypeptide sequence.
bioinformatics
data
beta12orEarlier
edam
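One common way to compute a hydrophobic moment is the formulation usually attributed to Eisenberg and co-workers, sketched below. The per-residue hydrophobicity values are supplied by the caller (any scale could be used), the result is not normalised by sequence length, and the default angle of 100 degrees per residue corresponds to an ideal alpha helix. All names are hypothetical.

```python
import math


def hydrophobic_moment(hydrophobicities, angle_deg=100.0):
    """Hydrophobic moment for per-residue hydrophobicity values.

    angle_deg is the rotation per residue around the helix axis
    (about 100 degrees for an alpha helix).
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(h * math.sin(n * delta) for n, h in enumerate(hydrophobicities))
    cos_sum = sum(h * math.cos(n * delta) for n, h in enumerate(hydrophobicities))
    return math.hypot(sin_sum, cos_sum)


def moment_profile(hydrophobicities, angles=range(0, 181, 10)):
    """Moment evaluated for several rotation angles, as (angle, moment) pairs."""
    return [(a, hydrophobic_moment(hydrophobicities, a)) for a in angles]
```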
Protein aliphatic index
The aliphatic index is the relative protein volume occupied by aliphatic side chains.
beta12orEarlier
The aliphatic index of a protein.
bioinformatics
data
edam
data
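A sketch of the aliphatic index using the widely cited formulation of Ikai (1980), in which the mole percentages of alanine, valine, isoleucine and leucine are weighted by the relative volumes of their side chains. EDAM itself does not fix a formula, so the coefficients 2.9 and 3.9 are an assumption taken from that formulation.

```python
def aliphatic_index(sequence):
    """Aliphatic index of a protein sequence given in one-letter code."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / len(seq)

    # Weights 2.9 (Val) and 3.9 (Ile, Leu) are relative to alanine.
    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )
```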
Protein sequence hydropathy plot
edam
beta12orEarlier
A protein sequence with annotation on hydrophobic or hydrophilic / charged regions, hydrophobicity plot etc.
Hydrophobic moment is a peptide's hydrophobicity measured for different angles of rotation.
data
data
bioinformatics
Protein charge plot
data
A plot of the mean charge of the amino acids within a window of specified length as the window is moved along a protein sequence.
data
edam
bioinformatics
beta12orEarlier
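The sliding-window calculation behind such a plot might look like the sketch below. The charge assignment (+1 for K and R, -1 for D and E, 0 otherwise) is a simplification assumed for this example; real tools may treat histidine and terminal groups differently.

```python
# Assumed, simplified residue charges at neutral pH.
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def charge_profile(sequence, window=5):
    """Mean charge for each window position along a protein sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    charges = [CHARGE.get(residue, 0.0) for residue in seq]
    return [
        sum(charges[start:start + window]) / window
        for start in range(len(seq) - window + 1)
    ]
```

Each value in the returned list is one point of the plot, indexed by the window start position.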
Protein solubility
Protein solubility data
edam
data
data
The solubility or atomic solvation energy of a protein sequence or structure.
bioinformatics
beta12orEarlier
Protein crystallizability
data
data
edam
Data on the crystallizability of a protein sequence.
Protein crystallizability data
bioinformatics
beta12orEarlier
Protein globularity
data
edam
Data on the stability, intrinsic disorder or globularity of a protein sequence.
beta12orEarlier
data
Protein globularity data
bioinformatics
Protein titration curve
data
bioinformatics
beta12orEarlier
data
The titration curve of a protein.
edam
Protein isoelectric point
The isoelectric point of a protein.
data
bioinformatics
data
edam
beta12orEarlier
Protein pKa value
data
bioinformatics
beta12orEarlier
The pKa value of a protein.
data
edam
Protein hydrogen exchange rate
data
data
The hydrogen exchange rate of a protein.
bioinformatics
edam
beta12orEarlier
Protein extinction coefficient
beta12orEarlier
data
edam
The extinction coefficient of a protein.
data
bioinformatics
Protein optical density
bioinformatics
beta12orEarlier
data
edam
data
The optical density of a protein.
Protein subcellular localization
true
beta13
An informative report on protein subcellular localization (nuclear, cytoplasmic, mitochondrial, chloroplast, plastid, membrane etc) or destination (exported / extracellular proteins).
data
data
Protein report (subcellular localization)
beta12orEarlier
edam
bioinformatics
Peptide immunogenicity data
This includes data on peptide ligands that elicit an immune response (immunogens), allergic cross-reactivity, predicted antigenicity (Hopp and Woods plot) etc. These data are useful in the development of peptide-specific antibodies or multi-epitope vaccines. Methods might use sequence data (for example motifs) and / or structural data.
beta12orEarlier
Peptide immunogenicity report
data
bioinformatics
data
edam
Peptide immunogenicity
A report on allergenicity / immunogenicity of peptides and proteins.
MHC peptide immunogenicity report
true
edam
data
bioinformatics
beta12orEarlier
data
A report on the immunogenicity of MHC class I or class II binding peptides.
beta13
Protein structure report
Protein structure report (domain)
Protein report (structure)
Annotation on or structural information derived from one or more specific protein 3D structure(s) or structural domains.
beta12orEarlier
Protein property (structural)
edam
data
bioinformatics
data
Protein structure-derived report
Protein structural property
Protein structural quality report
beta12orEarlier
Protein report (structural quality)
data
edam
Model validation might involve checks for atomic packing, steric clashes, agreement with electron density maps etc.
data
Protein property (structural quality)
Protein structure report (quality evaluation)
bioinformatics
Report on the quality of a protein three-dimensional model.
Protein residue interactions
bioinformatics
Data on inter-atomic or inter-residue contacts, distances and interactions in protein structure(s) or on the interactions of protein atoms or residues with non-protein groups.
data
edam
Atom interaction data
beta12orEarlier
Residue interaction data
data
Protein flexibility or motion report
Protein structure report (flexibility or motion)
beta12orEarlier
Protein flexibility or motion
edam
data
Informative report on flexibility or motion of a protein structure.
Protein property (flexibility or motion)
bioinformatics
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
data
Protein solvent accessibility
data
data
Data on the solvent accessible or buried surface area of a protein structure.
beta12orEarlier
edam
bioinformatics
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation. This concept covers definitions of the protein surface, interior and interfaces, accessible and buried residues, surface accessible pockets, interior inaccessible cavities etc.
Protein surface report
Data on the surface properties (shape, hydropathy, electrostatic patches etc) of a protein structure.
data
beta12orEarlier
bioinformatics
edam
Protein structure report (surface)
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
data
Ramachandran plot
bioinformatics
edam
Phi/psi angle data or a Ramachandran plot of a protein structure.
beta12orEarlier
data
data
Protein dipole moment
edam
beta12orEarlier
bioinformatics
Data on the net charge distribution (dipole moment) of a protein structure.
data
data
Protein distance matrix
data
edam
A matrix of distances between amino acid residues (for example the C-alpha atoms) in a protein structure.
data
beta12orEarlier
bioinformatics
Protein contact map
data
bioinformatics
edam
data
beta12orEarlier
An amino acid residue contact map for a protein structure.
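The relationship between a distance matrix and a contact map can be sketched as follows, assuming one representative point per residue (for example the C-alpha atom). The 8.0 Angstrom cutoff is a frequently used value but is an assumption here, as are the function names.

```python
import math


def distance_matrix(coords):
    """Symmetric matrix of Euclidean distances between residue positions."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(coords, cutoff=8.0):
    """Boolean matrix: True where two residues lie within the cutoff distance."""
    return [[d <= cutoff for d in row] for row in distance_matrix(coords)]
```

A contact map is thus a thresholded view of the distance matrix, which is why the two data types are often produced together.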
Protein residue 3D cluster
data
data
edam
Report on clusters of contacting residues in protein structures such as a key structural residue network.
bioinformatics
beta12orEarlier
Protein hydrogen bonds
edam
data
bioinformatics
data
beta12orEarlier
Patterns of hydrogen bonding in protein structures.
Protein non-canonical interactions
data
beta12orEarlier
edam
Non-canonical atomic interactions in protein structures.
data
Protein non-canonical interactions report
bioinformatics
CATH node
data
CATH classification node report
beta12orEarlier
bioinformatics
edam
data
Information on a node from the CATH database.
The report (for example http://www.cathdb.info/cathnode/1.10.10.10) includes CATH code (of the node and upper levels in the hierarchy), classification text (of appropriate levels in hierarchy), list of child nodes, representative domain and other relevant data and links.
SCOP node
bioinformatics
edam
beta12orEarlier
data
Information on a node from the SCOP database.
SCOP classification node
data
EMBASSY domain classification
true
An EMBASSY domain classification file (DCF) of classification and other data for domains from SCOP or CATH, in EMBL-like format.
edam
bioinformatics
beta12orEarlier
data
data
beta12orEarlier
CATH class
bioinformatics
edam
data
data
Information on a protein 'class' node from the CATH database.
beta12orEarlier
CATH architecture
data
bioinformatics
Information on a protein 'architecture' node from the CATH database.
edam
beta12orEarlier
data
CATH topology
beta12orEarlier
bioinformatics
Information on a protein 'topology' node from the CATH database.
edam
data
data
CATH homologous superfamily
Information on a protein 'homologous superfamily' node from the CATH database.
data
beta12orEarlier
data
bioinformatics
edam
CATH structurally similar group
edam
Information on a protein 'structurally similar group' node from the CATH database.
bioinformatics
beta12orEarlier
data
data
CATH functional category
data
beta12orEarlier
edam
bioinformatics
Information on a protein 'functional category' node from the CATH database.
data
Protein fold recognition report
true
beta12orEarlier
edam
data
Methods use some type of mapping between sequence and fold, for example secondary structure prediction and alignment, profile comparison, sequence properties, homologous sequence search, kernel machines etc. Domains and folds might be taken from SCOP or CATH.
bioinformatics
A report on known protein structural domains or folds that are recognized (identified) in protein sequence(s).
beta12orEarlier
data
Protein-protein interaction
data
data
edam
Data on protein-protein interaction(s).
bioinformatics
For example, an informative report about a specific protein complex (of multiple polypeptide chains).
Protein structure report (protein complex)
beta12orEarlier
Protein-ligand interaction
data
data
bioinformatics
beta12orEarlier
Data on protein-ligand (small molecule) interaction(s).
edam
Protein-nucleic acid interaction
Data on protein-DNA/RNA interaction(s).
beta12orEarlier
data
data
bioinformatics
edam
Nucleic acid melting profile
data
edam
beta12orEarlier
bioinformatics
data
Data on the dissociation characteristics of a double-stranded nucleic acid molecule (DNA or a DNA/RNA hybrid) during heating.
A melting (stability) profile shows the calculated free energy required to unwind and separate the nucleic acid strands, plotted for sliding windows over a sequence.
Nucleic acid stability profile
Nucleic acid enthalpy
data
Enthalpy of hybridized or double stranded nucleic acid (DNA or RNA/DNA).
data
edam
beta12orEarlier
bioinformatics
Nucleic acid entropy
data
beta12orEarlier
data
Entropy of hybridized or double stranded nucleic acid (DNA or RNA/DNA).
edam
bioinformatics
Nucleic acid melting temperature
true
bioinformatics
beta12orEarlier
beta12orEarlier
Melting temperature of hybridized or double stranded nucleic acid (DNA or RNA/DNA).
data
edam
data
Nucleic acid stitch profile
A stitch profile diagram shows partly melted DNA conformations (with probabilities) at a range of temperatures. For example, a stitch profile might show possible loop openings with their location, size, probability and fluctuations at a given temperature.
bioinformatics
Stitch profile of hybridized or double stranded nucleic acid (DNA or RNA/DNA).
data
data
beta12orEarlier
edam
DNA base pair stacking energies data
edam
data
data
beta12orEarlier
bioinformatics
DNA base pair stacking energies data.
DNA base pair twist angle data
edam
bioinformatics
DNA base pair twist angle data.
data
beta12orEarlier
data
DNA base trimer roll angles data
data
data
edam
DNA base trimer roll angles data.
beta12orEarlier
bioinformatics
Vienna RNA parameters
true
bioinformatics
edam
beta12orEarlier
data
beta12orEarlier
RNA parameters used by the Vienna package.
data
Vienna RNA structure constraints
true
beta12orEarlier
beta12orEarlier
data
bioinformatics
Structure constraints used by the Vienna package.
data
edam
Vienna RNA concentration data
true
beta12orEarlier
bioinformatics
edam
data
data
RNA concentration data used by the Vienna package.
beta12orEarlier
Vienna RNA calculated energy
true
bioinformatics
data
edam
RNA calculated energy data generated by the Vienna package.
beta12orEarlier
beta12orEarlier
data
Base pairing probability matrix dotplot
data
data
Dotplot of RNA base pairing probability matrix.
Such as generated by the Vienna package.
beta12orEarlier
edam
bioinformatics
Nucleic acid folding report
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
bioinformatics
Nucleic acid report (folding model)
A report on an analysis of RNA/DNA folding, minimum folding energies for DNA or RNA sequences, energy landscape of RNA mutants etc.
beta12orEarlier
data
data
Nucleic acid report (folding)
edam
Codon usage table
beta12orEarlier
data
A codon usage table might include the codon usage table name, optional comments and a table with columns for codons and corresponding codon usage data. A genetic code can be extracted from or represented by a codon usage table.
Table of codon usage data calculated from one or more nucleic acid sequences.
bioinformatics
data
edam
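A minimal sketch of deriving a codon usage table from a single coding sequence, assuming the sequence starts in frame. Trailing bases that do not form a complete codon are ignored, and the fraction reported is relative to all codons counted rather than per amino acid; both choices are assumptions for this example.

```python
from collections import Counter


def codon_usage(sequence):
    """Map each codon to (count, fraction of all codons) for an in-frame sequence."""
    seq = sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop an incomplete trailing codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: (n, n / total) for codon, n in sorted(counts.items())}
```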
Genetic code
bioinformatics
edam
data
A genetic code need not include detailed codon usage information.
data
A genetic code for an organism.
beta12orEarlier
Codon adaptation index
true
data
edam
data
beta12orEarlier
bioinformatics
beta12orEarlier
CAI
A simple measure of synonymous codon usage bias often used to predict gene expression levels.
Codon usage bias plot
data
bioinformatics
edam
A plot of the synonymous codon usage calculated for windows over a nucleotide sequence.
data
beta12orEarlier
Synonymous codon usage statistic plot
Nc statistic
true
data
beta12orEarlier
The effective number of codons used in a gene sequence. This reflects how far codon usage of a gene departs from equal usage of synonymous codons.
edam
data
bioinformatics
beta12orEarlier
Codon usage fraction difference
beta12orEarlier
edam
The differences in codon usage fractions between two codon usage tables.
data
data
bioinformatics
Pharmacogenomic annotation
Data on the influence of genotype on drug response.
The report might correlate gene expression or single-nucleotide polymorphisms with drug efficacy or toxicity.
bioinformatics
beta12orEarlier
data
edam
data
Disease annotation
data
beta12orEarlier
bioinformatics
data
An informative report on a specific disease.
edam
Gene annotation (linkage disequilibrium)
data
beta12orEarlier
bioinformatics
data
edam
A report on linkage disequilibrium; the non-random association of alleles or polymorphisms at two or more loci (not necessarily on the same chromosome).
Heat map
A heat map is a table where rows and columns correspond to different genes and contexts (for example, cells or samples) and the cell color represents the level of expression of a gene in that context.
data
bioinformatics
edam
data
beta12orEarlier
A graphical 2D tabular representation of gene expression data, typically derived from a DNA microarray experiment.
Affymetrix probe sets library file
true
bioinformatics
data
edam
beta12orEarlier
data
beta12orEarlier
CDF file
Affymetrix library file of information about which probes belong to which probe set.
Affymetrix probe sets information library file
true
beta12orEarlier
GIN file
bioinformatics
data
beta12orEarlier
data
Affymetrix library file of information about the probe sets such as the gene name with which the probe set is associated.
edam
Molecular weights standard fingerprint
Standard protonated molecular masses from trypsin (modified porcine trypsin, Promega) and keratin peptides, used in EMBOSS.
bioinformatics
data
data
edam
beta12orEarlier
Pathway or network (metabolic)
data
bioinformatics
edam
data
A report typically including a map (diagram) of a metabolic pathway.
beta12orEarlier
This includes carbohydrate, energy, lipid, nucleotide, amino acid, glycan, PK/NRP, cofactor/vitamin, secondary metabolite, xenobiotics etc.
Pathway or network (genetic information processing)
data
A report typically including a map (diagram) of a genetic information processing pathway.
bioinformatics
data
edam
beta12orEarlier
Pathway or network (environmental information processing)
beta12orEarlier
bioinformatics
edam
A report typically including a map (diagram) of an environmental information processing pathway.
data
data
Pathway or network (signal transduction)
beta12orEarlier
edam
data
data
bioinformatics
A report typically including a map (diagram) of a signal transduction pathway.
Pathway or network (cellular process)
beta12orEarlier
data
edam
data
A report typically including a map (diagram) of a cellular process pathway.
bioinformatics
Pathway or network (disease)
data
edam
A report typically including a map (diagram) of a disease pathway, typically a diagram for a human disease.
bioinformatics
data
beta12orEarlier
Pathway or network (drug structure relationship)
edam
data
beta12orEarlier
bioinformatics
data
A report typically including a map (diagram) of drug structure relationships.
Pathway or network (protein-protein interaction)
data
beta12orEarlier
data
A report typically including a map (diagram) of a network of protein interactions.
edam
bioinformatics
MIRIAM datatype
beta12orEarlier
bioinformatics
An entry (data type) from the Minimal Information Requested in the Annotation of Biochemical Models (MIRIAM) database of data resources.
edam
A MIRIAM entry describes a MIRIAM data type including the official name, synonyms, root URI, identifier pattern (regular expression applied to a unique identifier of the data type) and documentation. Each data type can be associated with several resources. Each resource is a physical location of a service (typically a database) providing information on the elements of a data type. Several resources may exist for each data type, providing the same (mirrors) or different information. MIRIAM provides a stable and persistent reference to its data types.
data
data
E-value
Expectation value
A simple floating point number defining the lower or upper limit of an expectation value (E-value).
beta12orEarlier
data
edam
bioinformatics
data
An expectation value (E-value) is the number of observations at least as extreme as the one observed that are expected to occur by random chance. The E-value describes the number of hits with a given score or better that are expected to occur at random when searching a database of a particular size. It decreases exponentially with the score (S) of a hit. A low E-value indicates a more significant score.
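The exponential dependence on the score can be illustrated with the Karlin-Altschul form E = K * m * n * exp(-lambda * S), where m and n are the query and database lengths. The default values of K and lambda below are placeholders chosen for the example; in practice they depend on the scoring system and are estimated by the search tool.

```python
import math


def expectation_value(score, query_len, db_len, k=0.1, lam=0.3):
    """E-value for a hit with raw score `score` (Karlin-Altschul form).

    k and lam are illustrative placeholder parameters, not values
    taken from any particular scoring matrix.
    """
    return k * query_len * db_len * math.exp(-lam * score)
```

With all other parameters fixed, raising the score by a constant amount divides the E-value by a constant factor.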
Z-value
A z-value might be specified as a threshold for reporting hits from database searches.
The z-value is the number of standard deviations a data value is above or below a mean value.
bioinformatics
data
edam
beta12orEarlier
data
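A short sketch of the calculation, assuming the mean and standard deviation are taken over a full population of background values (a sample standard deviation could be used instead). The function name is hypothetical.

```python
import statistics


def z_value(value, population):
    """Number of standard deviations `value` lies above or below the mean."""
    mean = statistics.fmean(population)
    sd = statistics.pstdev(population)  # population standard deviation (assumed)
    if sd == 0:
        raise ValueError("standard deviation is zero; z-value is undefined")
    return (value - mean) / sd
```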
P-value
The P-value is the probability of obtaining by random chance a result that is at least as extreme as an observed result, assuming the null hypothesis is true.
A P-value might be specified as a threshold for reporting hits from database searches.
data
data
beta12orEarlier
bioinformatics
edam
Database version information
beta12orEarlier
Information on a database (or ontology) version, for example name, version number and release date.
edam
bioinformatics
Ontology version information
data
data
Tool version information
data
data
edam
beta12orEarlier
Information on an application version, for example name, version number and release date.
bioinformatics
CATH version information
true
Information on a version of the CATH database.
edam
beta12orEarlier
data
bioinformatics
beta12orEarlier
data
Swiss-Prot to PDB mapping
true
data
edam
beta12orEarlier
data
bioinformatics
beta12orEarlier
Cross-mapping of Swiss-Prot codes to PDB identifiers.
Sequence database cross-references
true
bioinformatics
edam
beta12orEarlier
data
data
beta12orEarlier
Cross-references from a sequence record to other databases.
Job status
data
Values for EBI services are 'DONE' (job has finished and the results can then be retrieved), 'ERROR' (the job failed or no results were found), 'NOT_FOUND' (the job id is no longer available; job results might be deleted), 'PENDING' (the job is in a queue waiting processing), 'RUNNING' (the job is currently being processed).
bioinformatics
data
edam
beta12orEarlier
Metadata on the status of a submitted job.
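The status values listed for EBI services could be modelled as an enumeration, as in the sketch below. The helper `is_finished` and the decision to treat every state other than 'PENDING' and 'RUNNING' as terminal are assumptions made for this example, not part of any service API.

```python
from enum import Enum


class JobStatus(Enum):
    DONE = "DONE"            # finished, results can be retrieved
    ERROR = "ERROR"          # job failed or no results were found
    NOT_FOUND = "NOT_FOUND"  # job id no longer available
    PENDING = "PENDING"      # queued, waiting to be processed
    RUNNING = "RUNNING"      # currently being processed


def is_finished(status_text):
    """True if the status string names a state in which polling can stop."""
    status = JobStatus(status_text.strip())
    return status not in (JobStatus.PENDING, JobStatus.RUNNING)
```

An unknown status string raises `ValueError`, which makes unexpected values visible rather than silently ignored.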
Job ID
true
beta12orEarlier
1.0
edam
bioinformatics
The (typically numeric) unique identifier of a submitted job.
identifiers
data
identifier
Job type
A label (text token) describing the type of job, for example interactive or non-interactive.
beta12orEarlier
bioinformatics
data
data
edam
Tool log
beta12orEarlier
edam
data
bioinformatics
A report of tool-specific metadata on some analysis or process performed, for example a log of diagnostic or error messages.
data
DaliLite log file
true
beta12orEarlier
bioinformatics
data
data
edam
DaliLite log file describing all the steps taken by a DaliLite alignment of two protein structures.
beta12orEarlier
STRIDE log file
true
edam
beta12orEarlier
beta12orEarlier
data
STRIDE log file.
data
bioinformatics
NACCESS log file
true
data
NACCESS log file.
data
bioinformatics
beta12orEarlier
beta12orEarlier
edam
EMBOSS wordfinder log file
true
data
edam
beta12orEarlier
beta12orEarlier
bioinformatics
EMBOSS wordfinder log file.
data
EMBOSS domainatrix log file
true
bioinformatics
edam
data
beta12orEarlier
data
EMBOSS (EMBASSY) domainatrix application log file.
beta12orEarlier
EMBOSS sites log file
true
EMBOSS (EMBASSY) sites application log file.
data
beta12orEarlier
edam
data
beta12orEarlier
bioinformatics
EMBOSS supermatcher error file
true
bioinformatics
edam
beta12orEarlier
beta12orEarlier
data
EMBOSS (EMBASSY) supermatcher error file.
data
EMBOSS megamerger log file
true
data
data
EMBOSS megamerger log file.
beta12orEarlier
beta12orEarlier
edam
bioinformatics
EMBOSS whichdb log file
true
beta12orEarlier
data
EMBOSS whichdb log file.
bioinformatics
edam
data
beta12orEarlier
EMBOSS vectorstrip log file
true
EMBOSS vectorstrip log file.
data
bioinformatics
data
edam
beta12orEarlier
beta12orEarlier
Username
beta12orEarlier
identifiers
edam
data
identifier
bioinformatics
A username on a computer system.
Password
identifier
bioinformatics
data
edam
identifiers
A password on a computer system.
beta12orEarlier
Email address
identifiers
Moby:EmailAddress
A valid email address of an end-user.
Moby:Email
bioinformatics
identifier
edam
data
beta12orEarlier
Person name
bioinformatics
data
The name of a person.
identifiers
identifier
beta12orEarlier
edam
Number of iterations
edam
beta12orEarlier
bioinformatics
data
Number of iterations of an algorithm.
data
Number of output entities
data
data
beta12orEarlier
edam
Number of entities (for example database hits, sequences, alignments etc) to write to an output file.
bioinformatics
Hit sort order
true
bioinformatics
data
data
edam
beta12orEarlier
beta12orEarlier
Controls the order of hits (reported matches) in an output file from a database search.
Drug annotation
data
beta12orEarlier
bioinformatics
An informative report on a specific drug.
data
edam
Phylogenetic tree image
See also 'Phylogenetic tree'
bioinformatics
beta12orEarlier
An image (for viewing or printing) of a phylogenetic tree including (typically) a plot of rooted or unrooted phylogenies, cladograms, circular trees or phenograms and associated information.
data
edam
data
RNA secondary structure image
beta12orEarlier
Image of RNA secondary structure, knots, pseudoknots etc.
data
bioinformatics
edam
data
Protein secondary structure image
beta12orEarlier
Image of protein secondary structure.
data
data
edam
bioinformatics
Structure image
Image of one or more molecular tertiary (3D) structures.
data
data
bioinformatics
beta12orEarlier
edam
Sequence alignment image
edam
bioinformatics
Image of two or more aligned molecular sequences possibly annotated with alignment features.
data
beta12orEarlier
data
Structure image (small molecule)
edam
beta12orEarlier
Small molecule structure image
The molecular identifier and formula are typically included.
An image of the structure of a small chemical compound.
data
data
Chemical structure image
bioinformatics
Fate map
beta12orEarlier
A fate map is a plan of an early-stage embryo such as a blastula, showing areas that are of significance to development.
edam
data
data
bioinformatics
Microarray spots image
data
edam
data
An image of spots from a microarray experiment.
beta12orEarlier
bioinformatics
BioPax
true
data
bioinformatics
data
beta12orEarlier
A term from the BioPax ontology.
beta12orEarlier
edam
GO
true
Moby:GO_Term
Moby:GOTerm
beta12orEarlier
A term definition from The Gene Ontology (GO).
Moby:Annotated_GO_Term_With_Probability
data
Moby:Annotated_GO_Term
Gene Ontology term
beta12orEarlier
bioinformatics
data
edam
MeSH
true
beta12orEarlier
beta12orEarlier
bioinformatics
A term from the MeSH vocabulary.
data
edam
data
HGNC
true
bioinformatics
beta12orEarlier
edam
A term from the HGNC controlled vocabulary.
data
beta12orEarlier
data
NCBI taxonomy vocabulary
true
data
A term from the NCBI taxonomy vocabulary.
edam
beta12orEarlier
bioinformatics
beta12orEarlier
data
Plant ontology term
true
bioinformatics
data
edam
A term from the Plant Ontology (PO).
data
beta12orEarlier
beta12orEarlier
UMLS
true
data
A term from the UMLS vocabulary.
edam
beta12orEarlier
beta12orEarlier
bioinformatics
data
FMA
true
edam
data
beta12orEarlier
bioinformatics
A term from the Foundational Model of Anatomy.
beta12orEarlier
data
Classifies anatomical entities according to their shared characteristics (genus) and distinguishing characteristics (differentia). Specifies the part-whole and spatial relationships of the entities, morphological transformation of the entities during prenatal development and the postnatal life cycle and principles, rules and definitions according to which classes and relationships in the other three components of FMA are represented.
EMAP
true
beta12orEarlier
data
edam
beta12orEarlier
bioinformatics
A term from the EMAP mouse ontology.
data
ChEBI
true
edam
beta12orEarlier
data
data
bioinformatics
beta12orEarlier
A term from the ChEBI ontology.
MGED
true
data
A term from the MGED ontology.
beta12orEarlier
data
beta12orEarlier
bioinformatics
edam
myGrid
true
beta12orEarlier
bioinformatics
A term from the myGrid ontology.
data
data
edam
beta12orEarlier
The ontology is provided as two components, the service ontology and the domain ontology. The domain ontology provides concepts for core bioinformatics data types and their relations. The service ontology describes the physical and operational features of web services.
GO (biological process)
true
Data Type is an enumerated string.
beta12orEarlier
data
A term definition for a biological process from the Gene Ontology (GO).
edam
data
bioinformatics
beta12orEarlier
GO (molecular function)
true
beta12orEarlier
bioinformatics
Data Type is an enumerated string.
data
beta12orEarlier
edam
data
A term definition for a molecular function from the Gene Ontology (GO).
GO (cellular component)
true
A term definition for a cellular component from the Gene Ontology (GO).
beta12orEarlier
data
Data Type is an enumerated string.
data
beta12orEarlier
bioinformatics
edam
Ontology relation type
bioinformatics
A relation type defined in an ontology.
data
edam
beta12orEarlier
data
Ontology concept definition
The definition of a concept from an ontology.
data
data
bioinformatics
beta12orEarlier
edam
Ontology concept comment
bioinformatics
edam
A comment on a concept from an ontology.
data
beta12orEarlier
data
Ontology concept reference
true
data
edam
beta12orEarlier
data
bioinformatics
Reference for a concept from an ontology.
beta12orEarlier
doc2loc document information
true
Information on a published article provided by the doc2loc program.
The doc2loc output includes the URL, format, type and availability code of a document for every service provider.
data
edam
beta12orEarlier
data
beta12orEarlier
bioinformatics
PDB residue number
A residue identifier (a string) from a PDB file.
edam
data
data
bioinformatics
beta12orEarlier
PDBML:PDB_residue_no
WHATIF: pdb_number
Atomic coordinate
Cartesian coordinate
edam
data
bioinformatics
beta12orEarlier
Cartesian coordinate of an atom (in a molecular structure).
data
Atomic x coordinate
Cartesian x coordinate of an atom (in a molecular structure).
beta12orEarlier
WHATIF: PDBx_Cartn_x
edam
data
bioinformatics
data
PDBML:_atom_site.Cartn_x in PDBML
Cartesian x coordinate
Atomic y coordinate
data
data
WHATIF: PDBx_Cartn_y
edam
Cartesian y coordinate of an atom (in a molecular structure).
PDBML:_atom_site.Cartn_y in PDBML
beta12orEarlier
bioinformatics
Cartesian y coordinate
Atomic z coordinate
bioinformatics
data
Cartesian z coordinate
data
edam
beta12orEarlier
Cartesian z coordinate of an atom (in a molecular structure).
WHATIF: PDBx_Cartn_z
PDBML:_atom_site.Cartn_z
PDB atom name
Identifier (a string) of a specific atom from a PDB file for a molecular structure.
bioinformatics
identifier
WHATIF: atom_type
PDBML:pdbx_PDB_atom_name
identifiers
WHATIF: alternate_atom
WHATIF: PDBx_auth_atom_id
beta12orEarlier
edam
WHATIF: PDBx_type_symbol
data
Protein atom
beta12orEarlier
data
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
data
bioinformatics
Data on a single atom from a protein structure.
CHEBI:33250
Atom data
edam
Protein residue
Residue
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
edam
Data on a single amino acid residue position in a protein structure.
bioinformatics
data
beta12orEarlier
data
Atom name
data
beta12orEarlier
identifier
edam
Name of an atom.
bioinformatics
identifiers
PDB residue name
identifiers
beta12orEarlier
bioinformatics
WHATIF: type
edam
identifier
data
Three-letter amino acid residue names as used in PDB files.
PDB model number
identifiers
WHATIF: model_number
edam
beta12orEarlier
bioinformatics
identifier
Identifier of a model structure from a PDB file.
Model number
PDBML:pdbx_PDB_model_num
data
CATH domain report
true
edam
bioinformatics
beta12orEarlier
data
The report (for example http://www.cathdb.info/domain/1cukA01) includes CATH codes for levels in the hierarchy for the domain, level descriptions and relevant data and links.
Summary of domain classification information for a CATH domain.
data
beta13
CATH representative domain sequences (ATOM)
true
data
FASTA sequence database (based on ATOM records in PDB) for CATH domains (clustered at different levels of sequence identity).
beta12orEarlier
beta12orEarlier
edam
data
bioinformatics
CATH representative domain sequences (COMBS)
true
beta12orEarlier
beta12orEarlier
bioinformatics
FASTA sequence database (based on COMBS sequence data) for CATH domains (clustered at different levels of sequence identity).
data
edam
data
CATH domain sequences (ATOM)
true
data
beta12orEarlier
data
beta12orEarlier
edam
FASTA sequence database for all CATH domains (based on PDB ATOM records).
bioinformatics
CATH domain sequences (COMBS)
true
data
beta12orEarlier
edam
bioinformatics
beta12orEarlier
FASTA sequence database for all CATH domains (based on COMBS sequence data).
data
Sequence version information
edam
bioinformatics
data
Information on a molecular sequence version.
beta12orEarlier
data
Score or penalty
A numerical value, either some type of scored value arising for example from a prediction method or a value used in a scoring scheme, which might reduce the final score (penalty).
beta12orEarlier
edam
bioinformatics
data
data
Protein report (function)
true
beta12orEarlier
edam
bioinformatics
Report on general functional properties of specific protein(s).
beta13
For properties that can be mapped to a sequence, use 'Sequence report' instead.
data
data
Gene name (ASPGD)
edam
http://www.geneontology.org/doc/GO.xrf_abbs:ASPGD_LOCUS
identifier
identifiers
data
Name of a gene from Aspergillus Genome Database.
beta12orEarlier
bioinformatics
Gene name (CGD)
data
identifier
beta12orEarlier
http://www.geneontology.org/doc/GO.xrf_abbs:CGD_LOCUS
bioinformatics
edam
Name of a gene from Candida Genome Database.
identifiers
Gene name (dictyBase)
Name of a gene from dictyBase database.
beta12orEarlier
edam
identifiers
bioinformatics
data
http://www.geneontology.org/doc/GO.xrf_abbs:dictyBase
identifier
Gene name (EcoGene primary)
identifier
EcoGene primary gene name
identifiers
data
Primary name of a gene from EcoGene Database.
edam
beta12orEarlier
bioinformatics
http://www.geneontology.org/doc/GO.xrf_abbs:ECOGENE_G
Gene name (MaizeGDB)
beta12orEarlier
identifier
identifiers
bioinformatics
edam
Name of a gene from MaizeGDB (maize genes) database.
data
http://www.geneontology.org/doc/GO.xrf_abbs:MaizeGDB_Locus
Gene name (SGD)
data
bioinformatics
beta12orEarlier
identifier
edam
Name of a gene from Saccharomyces Genome Database.
identifiers
http://www.geneontology.org/doc/GO.xrf_abbs:SGD_LOCUS
Gene name (TGD)
data
http://www.geneontology.org/doc/GO.xrf_abbs:TGD_LOCUS
identifiers
Name of a gene from Tetrahymena Genome Database.
edam
bioinformatics
identifier
beta12orEarlier
Gene name (CGSC)
Symbol of a gene from the E. coli Genetic Stock Center.
bioinformatics
data
identifiers
identifier
http://www.geneontology.org/doc/GO.xrf_abbs: CGSC
edam
beta12orEarlier
Gene name (HGNC)
Symbol of a gene approved by the HUGO Gene Nomenclature Committee.
HGNC gene symbol
Gene name (HUGO)
HGNC:[0-9]{1,5}
HUGO gene symbol
HGNC gene name
edam
identifier
http://www.geneontology.org/doc/GO.xrf_abbs: HGNC_gene
HUGO gene name
identifiers
Official gene name
bioinformatics
HGNC symbol
HUGO symbol
beta12orEarlier
data
Gene name (MGD)
identifier
beta12orEarlier
http://www.geneontology.org/doc/GO.xrf_abbs: MGD
MGI:[0-9]+
edam
bioinformatics
identifiers
data
Symbol of a gene from the Mouse Genome Database.
Gene name (Bacillus subtilis)
data
bioinformatics
beta12orEarlier
identifier
Symbol of a gene from Bacillus subtilis Genome Sequence Project.
identifiers
http://www.geneontology.org/doc/GO.xrf_abbs: SUBTILISTG
edam
Gene ID (PlasmoDB)
identifier
http://www.geneontology.org/doc/GO.xrf_abbs: ApiDB_PlasmoDB
beta12orEarlier
bioinformatics
Identifier of a gene from PlasmoDB Plasmodium Genome Resource.
identifiers
data
edam
Gene ID (EcoGene)
data
EcoGene Accession
edam
identifiers
Identifier of a gene from EcoGene Database.
identifier
beta12orEarlier
EcoGene ID
bioinformatics
Gene ID (FlyBase)
http://www.geneontology.org/doc/GO.xrf_abbs: FlyBase
data
http://www.geneontology.org/doc/GO.xrf_abbs: FB
Gene identifier from FlyBase database.
bioinformatics
beta12orEarlier
identifier
identifiers
edam
Gene ID (GeneDB Glossina morsitans)
true
beta12orEarlier
Gene identifier from Glossina morsitans GeneDB database.
beta13
http://www.geneontology.org/doc/GO.xrf_abbs: GeneDB_Gmorsitans
identifier
bioinformatics
identifiers
data
edam
Gene ID (GeneDB Leishmania major)
true
beta12orEarlier
http://www.geneontology.org/doc/GO.xrf_abbs: GeneDB_Lmajor
bioinformatics
identifiers
Gene identifier from Leishmania major GeneDB database.
data
beta13
identifier
edam
Gene ID (GeneDB Plasmodium falciparum)
true
beta12orEarlier
identifiers
edam
http://www.geneontology.org/doc/GO.xrf_abbs: GeneDB_Pfalciparum
Gene identifier from Plasmodium falciparum GeneDB database.
identifier
beta13
data
bioinformatics
Gene ID (GeneDB Schizosaccharomyces pombe)
true
bioinformatics
identifier
edam
beta12orEarlier
http://www.geneontology.org/doc/GO.xrf_abbs: GeneDB_Spombe
Gene identifier from Schizosaccharomyces pombe GeneDB database.
beta13
data
identifiers
Gene ID (GeneDB Trypanosoma brucei)
true
http://www.geneontology.org/doc/GO.xrf_abbs: GeneDB_Tbrucei
beta12orEarlier
identifier
edam
data
bioinformatics
beta13
identifiers
Gene identifier from Trypanosoma brucei GeneDB database.
Gene ID (Gramene)
Gene identifier from Gramene database.
http://www.geneontology.org/doc/GO.xrf_abbs: GR_GENE
beta12orEarlier
bioinformatics
data
identifiers
identifier
edam
http://www.geneontology.org/doc/GO.xrf_abbs: GR_gene
Gene ID (Virginia microbial)
bioinformatics
Gene identifier from Virginia Bioinformatics Institute microbial database.
identifiers
data
edam
http://www.geneontology.org/doc/GO.xrf_abbs: VMD
identifier
http://www.geneontology.org/doc/GO.xrf_abbs: PAMGO_VMD
beta12orEarlier
Gene ID (SGN)
beta12orEarlier
identifiers
identifier
Gene identifier from Sol Genomics Network.
edam
data
bioinformatics
http://www.geneontology.org/doc/GO.xrf_abbs: SGN
Gene ID (WormBase)
identifier
http://www.geneontology.org/doc/GO.xrf_abbs: WormBase
Gene identifier used by WormBase database.
beta12orEarlier
bioinformatics
edam
http://www.geneontology.org/doc/GO.xrf_abbs: WB
WBGene[0-9]{8}
data
identifiers
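Several of the identifier concepts above (HGNC, MGD, WormBase) carry a regular expression for the identifier syntax. The following is a minimal sketch of how such patterns might be used to check a candidate identifier. The dictionary keys and the function name are illustrative and not part of EDAM, and treating each pattern as a whole-string match is an assumption.

```python
import re

# Patterns copied verbatim from the concept entries above.
# The keys are illustrative labels, not EDAM concept identifiers.
PATTERNS = {
    "HGNC": re.compile(r"HGNC:[0-9]{1,5}"),
    "MGD": re.compile(r"MGI:[0-9]+"),
    "WormBase": re.compile(r"WBGene[0-9]{8}"),
}


def matches_pattern(kind: str, value: str) -> bool:
    """Return True if the whole string matches the pattern registered for `kind`.

    Assumption: the patterns are meant to match the entire identifier,
    so fullmatch() is used rather than search().
    """
    return PATTERNS[kind].fullmatch(value) is not None
```

For example, `matches_pattern("WormBase", "WBGene00000001")` accepts a string with exactly eight digits after the prefix, while a shorter digit run is rejected.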
Gene synonym
true
identifier
Any name (other than the recommended one) for a gene.
Gene name synonym
data
identifiers
bioinformatics
beta12orEarlier
edam
beta12orEarlier
ORF name
The name of an open reading frame attributed by a sequencing project.
identifier
edam
bioinformatics
identifiers
beta12orEarlier
data
Sequence assembly component
true
data
beta12orEarlier
bioinformatics
beta12orEarlier
A component of a larger sequence assembly.
edam
data
Chromosome annotation (aberration)
true
edam
A report on a chromosome aberration such as abnormalities in chromosome structure.
beta12orEarlier
beta12orEarlier
bioinformatics
data
data
Clone ID
data
identifiers
An identifier of a clone (cloned molecular sequence) from a database.
identifier
bioinformatics
beta12orEarlier
edam
PDB insertion code
edam
An insertion code (part of the residue number) for an amino acid residue from a PDB file.
data
data
PDBML:pdbx_PDB_ins_code
WHATIF: insertion_code
bioinformatics
beta12orEarlier
Atomic occupancy
The sum of the occupancies of all the atom types at a site should not normally significantly exceed 1.0.
beta12orEarlier
data
The fraction of an atom type present at a site in a molecular structure.
bioinformatics
edam
data
WHATIF: PDBx_occupancy
Isotropic B factor
bioinformatics
beta12orEarlier
WHATIF: PDBx_B_iso_or_equiv
data
edam
data
Isotropic B factor (atomic displacement parameter) for an atom from a PDB file.
Deletion map
Deletion-based cytogenetic map
bioinformatics
edam
data
A cytogenetic map showing chromosome banding patterns in mutant cell lines relative to the wild type.
beta12orEarlier
A cytogenetic map is built from a set of mutant cell lines with sub-chromosomal deletions and a reference wild-type line ('genome deletion panel'). The panel is used to map markers onto the genome by comparing mutant to wild-type banding patterns. Markers are linked (occur in the same deleted region) if they share the same banding pattern (presence or absence) as the deletion panel.
data
QTL map
Quantitative trait locus map
bioinformatics
data
beta12orEarlier
data
A genetic map which shows the approximate location of quantitative trait loci (QTL) between two or more markers.
edam
Haplotype map
data
A map of haplotypes in a genome or other sequence, describing common patterns of genetic variation.
bioinformatics
Moby:Haplotyping_Study_obj
data
edam
beta12orEarlier
Map set
data
data
Moby:GCP_CorrelatedMapSet
bioinformatics
edam
Data describing a set of multiple genetic or physical maps, typically sharing a common set of features which are mapped.
Moby:GCP_CorrelatedLinkageMapSet
beta12orEarlier
Map feature
true
Mappable features may be based on Gramene's notion of map features; see http://www.gramene.org/db/cmap/feature_type_info.
beta12orEarlier
edam
Moby:MapFeature
A feature which may be mapped (positioned) on a genetic or other type of map.
data
beta12orEarlier
bioinformatics
data
Map type
beta12orEarlier
data
data
edam
Map types may be based on Gramene's notion of a map type; see http://www.gramene.org/db/cmap/map_type_info.
A designation of the type of map (genetic map, physical map, sequence map etc) or map set.
bioinformatics
Protein fold name
bioinformatics
beta12orEarlier
identifiers
edam
data
The name of a protein fold.
identifier
Taxon
Moby:BriefTaxonConcept
The name of a group of organisms belonging to the same taxonomic rank.
data
For a complete list of taxonomic ranks see https://www.phenoscape.org/wiki/Taxonomic_Rank_Vocabulary.
edam
Taxonomic rank
Taxonomy rank
beta12orEarlier
identifiers
Moby:PotentialTaxon
identifier
bioinformatics
Organism identifier
edam
data
bioinformatics
identifier
A unique identifier of a (group of) organisms.
beta12orEarlier
identifiers
Genus name
identifier
The name of a genus of organism.
edam
identifiers
bioinformatics
data
beta12orEarlier
Taxonomic classification
Moby:TaxonTCS
identifiers
Moby:TaxonScientificName
identifier
Taxonomic information
bioinformatics
Name components correspond to levels in a taxonomic hierarchy (e.g. 'Genus', 'Species', etc.). Meta information such as a reference where the name was defined and a date might be included.
Taxonomic name
Moby:iANT_organism-xml
data
The full name for a group of organisms, reflecting their biological classification and (usually) conforming to a standard nomenclature.
edam
beta12orEarlier
Moby:GCP_Taxon
Moby:TaxonName
iHOP organism ID
identifier
data
A unique identifier for an organism used in the iHOP database.
Moby_namespace:iHOPorganism
beta12orEarlier
bioinformatics
edam
identifiers
Genbank common name
beta12orEarlier
identifier
Common name for an organism as used in the GenBank database.
edam
bioinformatics
data
identifiers
NCBI taxon
beta12orEarlier
identifier
data
edam
bioinformatics
The name of a taxon from the NCBI taxonomy database.
identifiers
Synonym
true
An alternative for a word.
data
edam
data
beta12orEarlier
Alternative name
bioinformatics
beta12orEarlier
Misspelling
true
beta12orEarlier
bioinformatics
edam
beta12orEarlier
data
data
A common misspelling of a word.
Acronym
true
beta12orEarlier
data
edam
data
bioinformatics
An abbreviation of a phrase or word.
beta12orEarlier
Misnomer
true
bioinformatics
edam
beta12orEarlier
beta12orEarlier
data
A term which is likely to mislead as to its meaning.
data
Author ID
beta12orEarlier
identifiers
Moby:Author
edam
data
identifier
Information on the authors of a published work.
bioinformatics
DragonDB author identifier
data
edam
identifiers
bioinformatics
beta12orEarlier
identifier
An identifier representing an author in the DragonDB database.
Annotated URI
Moby:DescribedLink
data
data
bioinformatics
A URI along with annotation describing the data found at the address.
beta12orEarlier
edam
UniProt keywords
true
A controlled vocabulary for words and phrases that can appear in the keywords field (KW line) of entries from the UniProt database.
bioinformatics
data
beta12orEarlier
beta12orEarlier
edam
data
Gene ID (GeneFarm)
edam
beta12orEarlier
Identifier of a gene from the GeneFarm database.
identifiers
bioinformatics
identifier
Moby_namespace:GENEFARM_GeneID
data
Blattner number
edam
data
The Blattner identifier for a gene.
Moby_namespace:Blattner_number
identifiers
identifier
bioinformatics
beta12orEarlier
Gene ID (MIPS Maize)
true
bioinformatics
edam
identifiers
data
identifier
beta13
Moby_namespace:MIPS_GE_Maize
beta12orEarlier
Identifier for genetic elements in MIPS Maize database.
MIPS genetic element identifier (Maize)
Gene ID (MIPS Medicago)
true
bioinformatics
beta13
identifier
beta12orEarlier
MIPS genetic element identifier (Medicago)
Identifier for genetic elements in MIPS Medicago database.
Moby_namespace:MIPS_GE_Medicago
edam
data
identifiers
Gene name (DragonDB)
Moby_namespace:DragonDB_Gene
beta12orEarlier
bioinformatics
identifiers
data
edam
The name of an Antirrhinum Gene from the DragonDB database.
identifier
Gene name (Arabidopsis)
bioinformatics
data
beta12orEarlier
identifier
edam
A unique identifier for an Arabidopsis gene, which is an acronym or abbreviation of the gene name.
identifiers
Moby_namespace:ArabidopsisGeneSymbol
iHOP symbol
beta12orEarlier
identifiers
bioinformatics
edam
Moby_namespace:iHOPsymbol
A unique identifier of a protein or gene used in the iHOP database.
identifier
data
Gene name (GeneFarm)
identifiers
edam
GeneFarm gene ID
Moby_namespace:GENEFARM_GeneName
bioinformatics
beta12orEarlier
identifier
data
Name of a gene from the GeneFarm database.
Locus ID
edam
beta12orEarlier
identifiers
A unique name or other identifier of a genetic locus, typically conforming to a scheme that names loci (such as predicted genes) depending on their position in a molecular sequence, for example a completely sequenced genome or chromosome.
data
identifier
bioinformatics
Locus identifier
Locus name
Locus ID (AGI)
AGI identifier
identifiers
identifier
beta12orEarlier
AGI locus code
Arabidopsis gene loci number
edam
bioinformatics
http://www.geneontology.org/doc/GO.xrf_abbs:AGI_LocusCode
Locus identifier for Arabidopsis Genome Initiative (TAIR, TIGR and MIPS databases).
AGI ID
data
AT[1-5]G[0-9]{5}
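The AGI locus pattern above can be split into its parts with capture groups. This is a minimal sketch only: the function name is hypothetical, and reading the single digit after `AT` as the chromosome number follows the usual AGI naming convention rather than anything stated in this entry.

```python
import re
from typing import Optional, Tuple

# Same pattern as in the entry above, with capture groups added.
AGI_PATTERN = re.compile(r"AT([1-5])G([0-9]{5})")


def parse_agi_locus(code: str) -> Optional[Tuple[int, str]]:
    """Split an AGI locus code into (chromosome, serial).

    Returns None if the string does not match the pattern as a whole.
    Assumption: the digit after 'AT' denotes the chromosome (1 to 5).
    """
    match = AGI_PATTERN.fullmatch(code)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

The serial part is kept as a string so that leading zeros are preserved.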
Locus ID (ASPGD)
Identifier for loci from ASPGD (Aspergillus Genome Database).
http://www.geneontology.org/doc/GO.xrf_abbs: ASPGDID
data
beta12orEarlier
http://www.geneontology.org/doc/GO.xrf_abbs: ASPGD
bioinformatics
identifier
identifiers
edam
Locus ID (MGG)
edam
identifiers
http://www.geneontology.org/doc/GO.xrf_abbs: Broad_MGG
beta12orEarlier
identifier
bioinformatics
Identifier for loci from Magnaporthe grisea Database at the Broad Institute.
data
Locus ID (CGD)
http://www.geneontology.org/doc/GO.xrf_abbs: CGD
data
CGDID
bioinformatics
beta12orEarlier
http://www.geneontology.org/doc/GO.xrf_abbs: CGDID
CGD locus identifier
edam
identifiers
identifier
Identifier for loci from CGD (Candida Genome Database).
Locus ID (CMR)
http://www.geneontology.org/doc/GO.xrf_abbs: JCVI_CMR
Locus identifier for Comprehensive Microbial Resource at the J. Craig Venter Institute.
data
edam
http://www.geneontology.org/doc/GO.xrf_abbs: TIGR_CMR
bioinformatics
beta12orEarlier
identifiers
identifier
NCBI locus tag
identifier
data
bioinformatics
Identifier for loci from NCBI database.
http://www.geneontology.org/doc/GO.xrf_abbs: NCBI_locus_tag
identifiers
edam
Locus ID (NCBI)
Moby_namespace:LocusID
beta12orEarlier
Locus ID (SGD)
SGDID
bioinformatics
identifier
Identifier for loci from SGD (Saccharomyces Genome Database).
identifiers
data
http://www.geneontology.org/doc/GO.xrf_abbs: SGD
beta12orEarlier
edam
http://www.geneontology.org/doc/GO.xrf_abbs: SGDID
Locus ID (MMP)
bioinformatics
beta12orEarlier
identifiers
Moby_namespace:MMP_Locus
Identifier of loci from Maize Mapping Project.
identifier
edam
data
Locus ID (DictyBase)
identifier
bioinformatics
identifiers
data
Identifier of locus from DictyBase (Dictyostelium discoideum).
beta12orEarlier
edam
Moby_namespace:DDB_gene
Locus ID (EntrezGene)
Moby_namespace:EntrezGene_EntrezGeneID
identifiers
beta12orEarlier
edam
identifier
Moby_namespace:EntrezGene_ID
data
Identifier of a locus from EntrezGene database.
bioinformatics
Locus ID (MaizeGDB)
identifier
edam
identifiers
beta12orEarlier
Moby_namespace:MaizeGDB_Locus
Identifier of locus from MaizeGDB (Maize genome database).
bioinformatics
data
Quantitative trait locus
true
Moby:SO_QTL
edam
A QTL sometimes, but not necessarily, corresponds to a gene.
data
A stretch of DNA that is closely linked to the genes underlying a quantitative trait (a phenotype that varies in degree and depends upon the interactions between multiple genes and their environment).
data
beta12orEarlier
beta12orEarlier
QTL
bioinformatics
Gene ID (KOME)
data
bioinformatics
edam
Moby_namespace:GeneId
beta12orEarlier
identifier
identifiers
Identifier of a gene from the KOME database.
Locus ID (Tropgene)
data
identifiers
edam
beta12orEarlier
Moby:Tropgene_locus
Identifier of a locus from the Tropgene database.
identifier
bioinformatics
Alignment
bioinformatics
data
data
beta12orEarlier
edam
An alignment of molecular sequences, structures or profiles derived from them.
Atomic property
bioinformatics
data
beta12orEarlier
edam
Data for an atom (in a molecular structure).
data
General atomic property
UniProt keyword
http://www.geneontology.org/doc/GO.xrf_abbs: SP_KW
Moby_namespace:SP_KW
data
A word or phrase that can appear in the keywords field (KW line) of entries from the UniProt database.
data
bioinformatics
edam
beta12orEarlier
Ordered locus name
true
beta12orEarlier
edam
A name for a genetic locus conforming to a scheme that names loci (such as predicted genes) depending on their position in a molecular sequence, for example a completely sequenced genome or chromosome.
bioinformatics
identifiers
data
identifier
beta12orEarlier
Map position
PDBML:_atom_site.id
Moby:GCP_MapPoint
Moby:GCP_MapPosition
Moby:GCP_MapInterval
This includes positions in genomes based on a reference sequence. A position may be specified for any mappable object, i.e. anything that may have positional information such as a physical position in a chromosome.
beta12orEarlier
Moby:Locus
Moby:MapPosition
Moby:HitPosition
Moby:GenePosition
data
A position in a map (for example a genetic map), either a single position (point) or a region / interval.
data
edam
bioinformatics
Locus
Moby:Position
Amino acid property
Amino acid data
beta12orEarlier
data
edam
data
bioinformatics
Data concerning the intrinsic physical (e.g. structural) or chemical properties of one, more or all amino acids.
Annotation
true
A human-readable collection of information which (typically) is generated or collated by hand and which describes a biological entity, phenomenon or associated primary (e.g. sequence or structural) data, as distinct from the primary data itself and computer-generated reports derived from it.
bioinformatics
beta12orEarlier
data
data
beta13
This is a broad data type and is used as a placeholder for other, more specific types.
edam
Map attribute
bioinformatics
beta12orEarlier
data
An attribute of a molecular map (genetic or physical).
edam
data
Vienna RNA structural data
true
beta12orEarlier
edam
beta12orEarlier
data
data
bioinformatics
Data used by the Vienna RNA analysis package.
Sequence mask parameter
data
Data used to replace (mask) characters in a molecular sequence.
bioinformatics
edam
data
beta12orEarlier
Enzyme kinetics data
This is a broad data type and is used as a placeholder for other, more specific types.
beta12orEarlier
edam
Data concerning chemical reaction(s) catalysed by enzyme(s).
data
data
bioinformatics
Michaelis Menten plot
edam
beta12orEarlier
data
A plot giving an approximation of the kinetics of an enzyme-catalysed reaction, assuming simple kinetics (i.e. no intermediate or product inhibition, allostericity or cooperativity). It plots the initial reaction rate against the substrate concentration (S), from which the maximum rate (vmax) is apparent.
data
bioinformatics
Hanes Woolf plot
A plot based on the Michaelis Menten equation of enzyme kinetics, plotting the ratio of the initial substrate concentration (S) to the reaction velocity (v) against S.
edam
data
bioinformatics
beta12orEarlier
data
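As an illustration of the two plot types above, the sketch below computes Michaelis Menten rates and recovers the kinetic constants through the Hanes Woolf linearisation, commonly written as S/v = S/vmax + Km/vmax. The function names are hypothetical, Km (the Michaelis constant) is standard enzyme-kinetics notation not mentioned in the entries themselves, and ordinary least squares is just one simple way to fit the line.

```python
def michaelis_menten(s, vmax, km):
    """Initial reaction rate v for substrate concentration s."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, velocity):
    """Estimate (vmax, km) from paired substrate/velocity measurements.

    Fits the straight line S/v = S/vmax + km/vmax by ordinary least
    squares, so slope = 1/vmax and intercept = km/vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km
```

With noise-free rates generated by `michaelis_menten`, the fit returns the original constants up to floating-point error; real measurements would need at least two distinct substrate concentrations and non-zero velocities.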
Experimental data
true
Experimental measurement data
edam
data
beta13
bioinformatics
beta12orEarlier
Raw data from or annotation on laboratory experiments.
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
data
Genome version information
Information on a genome version.
data
bioinformatics
beta12orEarlier
edam
data
Evidence
beta12orEarlier
bioinformatics
data
data
edam
Typically a statement about some data or results, including evidence or the source of a statement, which may include computational prediction, laboratory experiment, literature reference etc.
Sequence record lite
data
edam
bioinformatics
A molecular sequence and minimal metadata, typically an identifier of the sequence and/or a comment.
data
beta12orEarlier
Sequence
edam
data
One or more molecular sequences, possibly with associated annotation.
data
bioinformatics
This concept is a placeholder of concepts for primary sequence data including raw sequences and sequence records. It should not normally be used for derivatives such as sequence alignments, motifs or profiles.
beta12orEarlier
Sequence record lite (nucleic acid)
bioinformatics
edam
data
A nucleic acid sequence and minimal metadata, typically an identifier of the sequence and/or a comment.
data
beta12orEarlier
Sequence record lite (protein)
A protein sequence and minimal metadata, typically an identifier of the sequence and/or a comment.
data
beta12orEarlier
edam
bioinformatics
data
Report
beta12orEarlier
data
document
data
bioinformatics
This is a broad data type and is used as a placeholder for other, more specific types. The notions of 'data', 'report', 'annotation' and 'metadata' are somewhat subjective and overlapping. 'Report' is used to indicate human-readable collections of information which (typically) are computer-generated from analysis of primary (e.g. core sequence or structural) data, as distinct from the primary data itself or human-generated annotation on an entity. 'Annotation' indicates human-readable collections of information which (typically) are generated or collated by hand and which describe a biological entity, phenomenon or associated primary (e.g. sequence or structural) data, as distinct from the primary data itself and computer-generated reports derived from it. 'Metadata' indicates data concerning or describing some core data, as distinct from the primary data that is being described. This includes metadata on the origin, source, history, ownership or location of some thing.
Document
edam
A human-readable collection of information including annotation on a biological entity or phenomenon, computer-generated reports of analysis of primary (e.g. sequence or structural) data, metadata about the primary data, and any free (essentially unformatted) text, as distinct from the primary data itself.
Text
Molecular property (general)
data
beta12orEarlier
General data for a molecule.
bioinformatics
General molecular property
data
edam
Structural data
true
edam
beta13
Data concerning molecular structure.
data
data
beta12orEarlier
bioinformatics
This is a broad data type and is used as a placeholder for other, more specific types.
Sequence motif (nucleic acid)
bioinformatics
beta12orEarlier
data
data
edam
A nucleotide sequence motif.
Sequence motif (protein)
edam
An amino acid sequence motif.
bioinformatics
beta12orEarlier
data
data
Search parameter
data
data
Some simple value controlling a search operation, typically a search of a database.
beta12orEarlier
edam
bioinformatics
Database hits
edam
data
bioinformatics
A report of hits from searching a database of some type.
beta12orEarlier
data
Secondary structure
The secondary structure assignment (predicted or real) of a nucleic acid or protein.
beta12orEarlier
data
data
edam
bioinformatics
Matrix
An array of numerical values where (typically) the rows and columns correspond to molecular entities and the values are comparative data, for example, distances between molecular sequences.
bioinformatics
data
This is a broad data type and is used as a placeholder for other, more specific types.
beta12orEarlier
data
edam
Alignment report
This is a broad data type and is used as a placeholder for other, more specific types.
edam
data
An informative report about a molecular alignment of some type, including alignment-derived data or metadata.
bioinformatics
beta12orEarlier
data
Nucleic acid report
data
bioinformatics
beta12orEarlier
data
edam
An informative report about one or more specific nucleic acid molecules, derived from analysis of primary (sequence or structural) data.
Structure report
Structure-derived report
data
beta12orEarlier
bioinformatics
An informative report on general information, properties or features of one or more molecular tertiary (3D) structures.
data
edam
Nucleic acid structure report
A report on nucleic acid structure-derived data, describing structural properties of a DNA molecule, or any other annotation or information about specific nucleic acid 3D structure(s).
data
beta12orEarlier
Nucleic acid property (structural)
bioinformatics
Nucleic acid structural property
This includes reports on the stiffness, curvature, twist/roll data or other conformational parameters or properties.
edam
data
Molecular property
bioinformatics
data
Physicochemical property
data
edam
A report on the physical (e.g. structural) or chemical properties of molecules, or parts of a molecule.
SO:0000400
beta12orEarlier
DNA base structural data
data
edam
beta12orEarlier
data
Structural data for DNA base pairs or runs of bases, such as energy or angle data.
bioinformatics
Database entry version information
edam
data
Information on a database (or ontology) entry version, such as name (or other identifier) or parent database, unique identifier of entry, data, author and so on.
data
beta12orEarlier
bioinformatics
Accession
data
bioinformatics
edam
A persistent (stable) and unique identifier, typically identifying an object (entry) from a database.
beta12orEarlier
identifier
identifiers
Nucleic acid features (SNP)
An SNP is an individual point mutation or substitution of a single nucleotide.
beta12orEarlier
data
edam
data
SNP annotation
Single nucleotide polymorphism
Annotation on a single nucleotide polymorphism (SNP) in a DNA sequence.
bioinformatics
Data reference
beta12orEarlier
data
Reference to a dataset (or a cross-reference between two datasets), typically one or more entries in a biological database or ontology.
edam
bioinformatics
A list of database accessions or identifiers is usually included.
data
Job identifier
bioinformatics
identifiers
beta12orEarlier
edam
An identifier of a submitted job.
data
identifier
Name
edam
A name of a thing, which need not necessarily uniquely identify it.
Symbolic name
bioinformatics
identifier
identifiers
data
beta12orEarlier
Closely related, but focusing on labeling and human readability rather than on identification.
Type
data
data
edam
beta12orEarlier
A label (text token) describing the type of a thing, typically an enumerated string (a string with one of a limited set of values).
bioinformatics
User ID
edam
bioinformatics
identifier
data
beta12orEarlier
An identifier of a software end-user (typically a person).
identifiers
KEGG organism code
edam
A three-letter code used in the KEGG databases to uniquely identify organisms.
beta12orEarlier
bioinformatics
identifier
identifiers
data
Gene name (KEGG GENES)
edam
identifiers
Moby_namespace:GeneId
Name of an entry (gene) from the KEGG GENES database.
data
bioinformatics
identifier
[a-zA-Z_0-9]+:[a-zA-Z_0-9\.-]*
KEGG GENES entry name
beta12orEarlier
BioCyc ID
edam
identifiers
bioinformatics
beta12orEarlier
Identifier of an object from one of the BioCyc databases.
identifier
data
Compound ID (BioCyc)
edam
beta12orEarlier
Identifier of a compound from the BioCyc chemical compounds database.
identifier
bioinformatics
BioCyc compound identifier
identifiers
data
BioCyc compound ID
Reaction ID (BioCyc)
identifiers
bioinformatics
beta12orEarlier
edam
data
Identifier of a biological reaction from the BioCyc reactions database.
identifier
Enzyme ID (BioCyc)
beta12orEarlier
BioCyc enzyme ID
edam
identifier
Identifier of an enzyme from the BioCyc enzymes database.
data
identifiers
bioinformatics
Reaction ID
identifier
Identifier of a biological reaction from a database.
edam
bioinformatics
identifiers
beta12orEarlier
data
Identifier (hybrid)
edam
beta12orEarlier
This branch provides an alternative organisation of the concepts nested under 'Accession' and 'Name'. All concepts under here are already included under 'Accession' or 'Name'.
An identifier that is re-used for data objects of fundamentally different types (typically served from a single database).
identifiers
data
bioinformatics
identifier
Molecular property identifier
identifiers
bioinformatics
Identifier of a molecular property.
edam
data
beta12orEarlier
identifier
Codon usage table identifier
data
identifier
edam
bioinformatics
beta12orEarlier
identifiers
Identifier of a codon usage table, for example a genetic code.
FlyBase primary identifier
Primary identifier of an object from the FlyBase database.
identifiers
bioinformatics
identifier
edam
data
beta12orEarlier
WormBase identifier
identifier
identifiers
Identifier of an object from the WormBase database.
beta12orEarlier
bioinformatics
data
edam
WormBase wormpep ID
bioinformatics
beta12orEarlier
CE[0-9]{5}
identifiers
Protein identifier used by WormBase database.
edam
identifier
data
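The regular expression given for this concept can be applied as a whole-string check. The following is a minimal sketch only: the function name `is_wormpep_id` is hypothetical, and the choice of full-match semantics (rather than searching within a longer string) is an assumption, since the ontology does not say how the pattern is meant to be anchored.

```python
import re

# Pattern copied verbatim from the concept's regular-expression value.
WORMPEP_ID = re.compile(r"CE[0-9]{5}")


def is_wormpep_id(text: str) -> bool:
    """Return True only if the entire string matches the pattern.

    Full-match anchoring is an assumption made for this sketch.
    """
    return WORMPEP_ID.fullmatch(text) is not None
```

The same approach would apply to the other concepts in this section that carry a pattern, such as DrugBank ID or UniParc accession.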
Nucleic acid features (codon)
true
beta12orEarlier
data
data
bioinformatics
An informative report on a trinucleotide sequence that encodes an amino acid, including the triplet sequence, the encoded amino acid or whether it is a start or stop codon.
beta12orEarlier
edam
Map identifier
data
identifiers
identifier
edam
bioinformatics
beta12orEarlier
An identifier of a map of a molecular sequence.
Person identifier
identifiers
beta12orEarlier
bioinformatics
edam
data
identifier
An identifier of a software end-user (typically a person).
Nucleic acid identifier
identifiers
bioinformatics
beta12orEarlier
edam
identifier
data
Name or other identifier of a nucleic acid molecule.
Translation frame specification
edam
data
Frame for translation of DNA (3 forward and 3 reverse frames relative to a chromosome).
data
bioinformatics
beta12orEarlier
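The six frames mentioned in the definition can be illustrated with a short sketch. The frame labels (+1 to +3 for the forward strand, -1 to -3 for the reverse complement) and the helper names are assumptions made for illustration; the ontology does not prescribe a numbering scheme.

```python
# Complement table for unambiguous DNA bases (ambiguity codes are not handled).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq: str) -> dict:
    """Map a frame label (+1..+3, -1..-3) to the codon-aligned subsequence.

    The labelling convention is hypothetical; tools differ on how they
    number reverse frames.
    """
    frames = {}
    for strand, strand_seq in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            sub = strand_seq[offset:]
            # Drop a trailing partial codon so the length is a multiple of 3.
            sub = sub[: len(sub) - len(sub) % 3]
            frames[strand * (offset + 1)] = sub
    return frames
```

Each returned subsequence can then be passed to whatever codon-translation routine is in use.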
Genetic code identifier
bioinformatics
data
identifier
identifiers
beta12orEarlier
An identifier of a genetic code.
edam
Genetic code name
bioinformatics
identifier
identifiers
beta12orEarlier
data
Informal name for a genetic code, typically an organism name.
edam
File format name
beta12orEarlier
bioinformatics
identifiers
Name of a file format such as HTML, PNG, PDF, EMBL, GenBank and so on.
edam
identifier
data
Sequence profile type
beta12orEarlier
A label (text token) describing a type of sequence profile such as frequency matrix, Gribskov profile, hidden Markov model etc.
data
edam
bioinformatics
data
Operating system name
bioinformatics
Name of a computer operating system such as Linux, PC or Mac.
beta12orEarlier
edam
identifier
data
identifiers
Mutation type
true
bioinformatics
A type of point or block mutation, including insertion, deletion, change, duplication and moves.
beta12orEarlier
data
edam
data
beta12orEarlier
Logical operator
data
edam
bioinformatics
identifier
A logical operator such as OR, AND, XOR, and NOT.
identifiers
beta12orEarlier
Results sort order
bioinformatics
A control of the order of data that is output, for example the order of sequences in an alignment.
edam
data
data
beta12orEarlier
Possible options include sorting by score, by rank, by increasing P-value (probability, i.e. most statistically significant hits given first) and so on.
Toggle
true
data
edam
beta12orEarlier
bioinformatics
A simple parameter that is a toggle (boolean value), typically a control for a modal tool.
beta12orEarlier
data
Sequence width
true
The width of an output sequence or alignment.
beta12orEarlier
bioinformatics
data
beta12orEarlier
edam
data
Gap penalty
A penalty for introducing or extending a gap in an alignment.
data
beta12orEarlier
data
edam
bioinformatics
Nucleic acid melting temperature
Melting temperature
data
data
A temperature concerning nucleic acid denaturation, typically the temperature at which the two strands of a hybridized or double stranded nucleic acid (DNA or RNA/DNA) molecule separate.
bioinformatics
edam
beta12orEarlier
Concentration
data
data
The concentration of a chemical compound.
beta12orEarlier
bioinformatics
edam
Window step size
Size of the incremental 'step' by which a sequence window is moved over a sequence.
bioinformatics
data
beta12orEarlier
data
edam
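A sliding-window scan shows how a step size interacts with a window size. This is a sketch under stated assumptions: the generator name `windows` and its parameters are hypothetical, and trailing positions that cannot fill a whole window are simply skipped.

```python
def windows(seq: str, size: int, step: int):
    """Yield (start, window) pairs, advancing the window start by `step`.

    Only full-length windows are produced; a trailing partial window
    is skipped (an assumption of this sketch).
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]
```

With a step equal to the window size the windows tile the sequence without overlap; a smaller step makes consecutive windows overlap.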
EMBOSS graph
true
An image of a graph generated by the EMBOSS suite.
bioinformatics
beta12orEarlier
beta12orEarlier
data
data
edam
EMBOSS report
true
edam
beta12orEarlier
data
beta12orEarlier
An application report generated by the EMBOSS suite.
bioinformatics
data
Sequence offset
An offset for a single-point sequence position.
data
edam
data
beta12orEarlier
bioinformatics
Threshold
edam
A value that serves as a threshold for a tool (usually to control scoring or output).
beta12orEarlier
data
data
bioinformatics
Protein report (transcription factor)
true
data
edam
An informative report on a transcription factor protein.
data
beta12orEarlier
Transcription factor binding site data
This might include conformational or physicochemical properties, as well as sequence information for transcription factor(s) binding sites.
bioinformatics
beta13
Database category name
true
identifiers
beta12orEarlier
identifier
The name of a category of biological or bioinformatics database.
edam
bioinformatics
beta12orEarlier
data
Sequence profile name
true
data
bioinformatics
edam
identifier
Name of a sequence profile.
beta12orEarlier
beta12orEarlier
identifiers
Color
true
beta12orEarlier
beta12orEarlier
data
Specification of one or more colors.
data
edam
bioinformatics
Rendering parameter
Graphics parameter
Graphical parameter
A parameter that is used to control rendering (drawing) to a device or image.
bioinformatics
data
beta12orEarlier
data
edam
Sequence name
identifier
data
edam
bioinformatics
identifiers
Any arbitrary name of a molecular sequence.
beta12orEarlier
Date
edam
beta12orEarlier
data
A temporal date.
bioinformatics
data
Word composition
true
edam
beta12orEarlier
beta12orEarlier
data
bioinformatics
Word composition data for a molecular sequence.
data
Fickett testcode plot
data
beta12orEarlier
data
A plot of Fickett testcode statistic (identifying protein coding regions) in a nucleotide sequence.
edam
bioinformatics
Sequence similarity plot
A plot of sequence similarities identified from word-matching or character comparison.
bioinformatics
data
beta12orEarlier
edam
data
Helical wheel
An image of a peptide sequence looking down the axis of the helix, for highlighting amphipathicity and other properties.
bioinformatics
edam
data
beta12orEarlier
data
Helical net
edam
An image of a peptide sequence in a simple 3,4,3,4 repeating pattern that emulates at a simple level the arrangement of residues around an alpha helix.
bioinformatics
data
Useful for highlighting amphipathicity and other properties.
beta12orEarlier
data
Protein sequence properties plot
true
A plot of general physicochemical properties of a protein sequence.
edam
beta12orEarlier
data
beta12orEarlier
data
bioinformatics
Protein ionization curve
data
A plot of pK versus pH for a protein.
bioinformatics
beta12orEarlier
data
edam
Sequence composition plot
A plot of character or word composition / frequency of a molecular sequence.
beta12orEarlier
data
edam
bioinformatics
data
Nucleic acid density plot
edam
data
data
Density plot (of base composition) for a nucleotide sequence.
beta12orEarlier
bioinformatics
Sequence trace image
bioinformatics
beta12orEarlier
data
edam
Image of a sequence trace (nucleotide sequence versus probabilities of each of the 4 bases).
data
Nucleic acid features (siRNA)
A report on siRNA duplexes in mRNA.
bioinformatics
data
data
beta12orEarlier
edam
Sequence set (stream)
true
data
This concept may be used for sequence sets that are expected to be read and processed a single sequence at a time.
data
beta12orEarlier
edam
bioinformatics
beta12orEarlier
A collection of multiple molecular sequences and (typically) associated metadata that is intended for sequential processing.
FlyBase secondary identifier
beta12orEarlier
identifier
identifiers
bioinformatics
Secondary identifiers are used to handle entries that were merged with or split from other entries in the database.
Secondary identifier of an object from the FlyBase database.
edam
data
Cardinality
true
beta12orEarlier
data
beta12orEarlier
edam
data
bioinformatics
The number of a certain thing.
Exactly 1
true
bioinformatics
edam
A single thing.
data
data
beta12orEarlier
beta12orEarlier
1 or more
true
beta12orEarlier
edam
bioinformatics
beta12orEarlier
One or more things.
data
data
Exactly 2
true
data
beta12orEarlier
edam
beta12orEarlier
bioinformatics
Exactly two things.
data
2 or more
true
data
edam
data
beta12orEarlier
bioinformatics
beta12orEarlier
Two or more things.
Sequence checksum
Hash
edam
Hash code
A fixed-size datum calculated (by using a hash function) for a molecular sequence, typically for purposes of error detection or indexing.
data
Hash value
bioinformatics
Hash sum
data
beta12orEarlier
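As an illustration of a fixed-size datum computed from a sequence, the sketch below derives two common checksums. The choice of CRC-32 and MD5, the function name `sequence_checksums`, and the normalisation step (removing whitespace and upper-casing) are all assumptions; the ontology does not name a particular hash function.

```python
import hashlib
import zlib


def sequence_checksums(seq: str) -> dict:
    """Return CRC-32 and MD5 checksums of a sequence as hex strings.

    Whitespace is removed and letters are upper-cased first so that
    formatting differences do not change the checksum (an assumption
    of this sketch, not a rule from the ontology).
    """
    normalized = "".join(seq.split()).upper().encode("ascii")
    return {
        "crc32": format(zlib.crc32(normalized) & 0xFFFFFFFF, "08x"),
        "md5": hashlib.md5(normalized).hexdigest(),
    }
```

Two records whose checksums differ cannot hold the same normalised sequence, which is what makes such values useful for error detection and indexing.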
Protein features (chemical modification)
Protein modification annotation
data
bioinformatics
Data on a chemical modification of a protein.
beta12orEarlier
MOD:00000
GO:0006464
edam
data
Error
data
data
bioinformatics
Data on an error generated by a computer system or tool.
beta12orEarlier
edam
Database entry metadata
data
bioinformatics
Basic information on any arbitrary database entry.
edam
data
beta12orEarlier
Gene cluster
true
A cluster of similar genes.
beta12orEarlier
data
edam
beta13
data
bioinformatics
Sequence record full
SO:2000061
A molecular sequence and comprehensive metadata (such as a feature table), typically corresponding to a full entry from a molecular sequence database.
data
beta12orEarlier
edam
bioinformatics
data
Plasmid identifier
data
beta12orEarlier
identifier
An identifier of a plasmid in a database.
bioinformatics
edam
identifiers
Mutation ID
A unique identifier of a specific mutation catalogued in a database.
beta12orEarlier
identifier
identifiers
bioinformatics
edam
data
Mutation annotation (basic)
true
data
beta12orEarlier
edam
data
beta12orEarlier
bioinformatics
Information describing the mutation itself, the organ site, tissue and type of lesion where the mutation has been identified, description of the patient origin and life-style.
Mutation annotation (prevalence)
true
edam
data
beta12orEarlier
An informative report on the prevalence of mutation(s), including data on samples and mutation prevalence (e.g. by tumour type).
bioinformatics
data
beta12orEarlier
Mutation annotation (prognostic)
true
data
beta12orEarlier
An informative report on mutation prognostic data, such as information on patient cohort, the study settings and the results of the study.
edam
beta12orEarlier
data
bioinformatics
Mutation annotation (functional)
true
data
beta12orEarlier
data
beta12orEarlier
bioinformatics
edam
An informative report on the functional properties of mutant proteins including transcriptional activities, promotion of cell growth and tumorigenicity, dominant negative effects, capacity to induce apoptosis, cell-cycle arrest or checkpoints in human cells and so on.
Codon number
data
data
The number of a codon, for instance, at which a mutation is located.
bioinformatics
beta12orEarlier
edam
Tumor annotation
beta12orEarlier
An informative report on a specific tumor including nature and origin of the sample, anatomic site, organ or tissue, tumor type, including morphology and/or histologic type, and so on.
bioinformatics
edam
data
data
Server metadata
bioinformatics
edam
beta12orEarlier
data
Basic information about a server on the web, such as an SRS server.
data
Database field name
identifier
beta12orEarlier
data
identifiers
bioinformatics
The name of a field in a database.
edam
Sequence cluster ID (SYSTERS)
identifier
edam
bioinformatics
data
beta12orEarlier
identifiers
SYSTERS cluster ID
Unique identifier of a sequence cluster from the SYSTERS database.
Ontology metadata
bioinformatics
data
edam
data
Data concerning a biological ontology.
beta12orEarlier
Raw SCOP domain classification
true
Raw SCOP domain classification data files.
beta12orEarlier
These are the parsable data files provided by SCOP.
beta13
data
bioinformatics
data
edam
Raw CATH domain classification
true
beta12orEarlier
bioinformatics
data
beta13
Raw CATH domain classification data files.
data
These are the parsable data files provided by CATH.
edam
Heterogen annotation
bioinformatics
edam
data
An informative report on the types of small molecules or 'heterogens' (non-protein groups) that are represented in PDB files.
beta12orEarlier
data
Phylogenetic property values
true
beta12orEarlier
data
data
Phylogenetic property values data.
edam
beta12orEarlier
bioinformatics
Sequence set (bootstrapped)
A collection of sequences output from a bootstrapping (resampling) procedure.
data
bioinformatics
data
Bootstrapping is often performed in phylogenetic analysis.
beta12orEarlier
edam
Phylogenetic consensus tree
true
data
beta12orEarlier
beta12orEarlier
bioinformatics
A consensus phylogenetic tree derived from comparison of multiple trees.
edam
data
Schema
edam
data
A data schema for organising or transforming data of some type.
data
bioinformatics
beta12orEarlier
DTD
edam
beta12orEarlier
data
bioinformatics
A DTD (document type definition).
data
XML Schema
edam
data
data
An XML Schema.
XSD
beta12orEarlier
bioinformatics
Relax-NG schema
edam
data
A RELAX NG schema.
bioinformatics
data
beta12orEarlier
XSLT stylesheet
data
data
An XSLT stylesheet.
beta12orEarlier
edam
bioinformatics
Data resource definition name
The name of a data type.
identifier
beta12orEarlier
identifiers
bioinformatics
data
edam
OBO file format name
bioinformatics
beta12orEarlier
Name of an OBO file format such as OBO-XML, plain and so on.
edam
identifiers
identifier
data
Gene ID (MIPS)
MIPS genetic element identifier
identifiers
beta12orEarlier
data
bioinformatics
Identifier for genetic elements in the MIPS database.
identifier
edam
Sequence identifier (protein)
true
identifier
data
beta12orEarlier
edam
identifiers
beta12orEarlier
bioinformatics
An identifier of protein sequence(s) or protein sequence database entries.
Sequence identifier (nucleic acid)
true
beta12orEarlier
identifiers
An identifier of nucleotide sequence(s) or nucleotide sequence database entries.
edam
beta12orEarlier
identifier
data
bioinformatics
EMBL accession
edam
data
An accession number of an entry from the EMBL sequence database.
EMBL accession number
EMBL identifier
beta12orEarlier
EMBL ID
identifiers
bioinformatics
identifier
UniProt ID
edam
UniProt entry name
data
UniProt identifier
beta12orEarlier
bioinformatics
UniProtKB entry name
An identifier of a polypeptide in the UniProt database.
identifier
UniProtKB identifier
identifiers
GenBank accession
bioinformatics
GenBank identifier
identifiers
GenBank ID
GenBank accession number
beta12orEarlier
edam
data
Accession number of an entry from the GenBank sequence database.
identifier
Gramene secondary identifier
identifier
Gramene secondary ID
Secondary (internal) identifier of a Gramene database entry.
identifiers
edam
Gramene internal ID
bioinformatics
data
Gramene internal identifier
beta12orEarlier
Sequence variation ID
bioinformatics
identifier
beta12orEarlier
identifiers
data
An identifier of an entry from a database of molecular sequence variation.
edam
Gene ID
identifiers
identifier
beta12orEarlier
Gene accession
Gene code
bioinformatics
A unique (and typically persistent) identifier of a gene in a database, that is (typically) different to the gene name/symbol.
edam
data
Gene name (AceView)
beta12orEarlier
edam
Name of an entry (gene) from the AceView genes database.
identifier
identifiers
AceView gene name
data
bioinformatics
Gene ID (ECK)
http://www.geneontology.org/doc/GO.xrf_abbs: ECK
E. coli K-12 gene identifier
identifier
Identifier of an E. coli K-12 gene from the EcoGene database.
data
edam
bioinformatics
beta12orEarlier
ECK accession
identifiers
Gene ID (HGNC)
bioinformatics
identifier
edam
Identifier for a gene approved by the HUGO Gene Nomenclature Committee.
beta12orEarlier
data
HGNC ID
identifiers
Gene name
beta12orEarlier
bioinformatics
identifiers
edam
data
identifier
The name of a gene, (typically) assigned by a person and/or according to a naming scheme. It may contain white space characters and is typically more intuitive and readable than a gene symbol. It (typically) may be used to identify similar genes in different species and to derive a gene symbol.
Gene name (NCBI)
edam
identifier
bioinformatics
Name of an entry (gene) from the NCBI genes database.
NCBI gene name
identifiers
data
beta12orEarlier
SMILES string
edam
data
beta12orEarlier
A specification of a chemical structure in SMILES format.
bioinformatics
data
STRING ID
beta12orEarlier
identifier
data
bioinformatics
Unique identifier of an entry from the STRING database of protein-protein interactions.
identifiers
edam
Virus annotation
beta12orEarlier
bioinformatics
data
data
edam
An informative report on a specific virus.
Virus annotation (taxonomy)
edam
beta12orEarlier
data
data
An informative report on the taxonomy of a specific virus.
bioinformatics
Reaction ID (SABIO-RK)
bioinformatics
data
[0-9]+
edam
beta12orEarlier
identifier
Identifier of a biological reaction from the SABIO-RK reactions database.
identifiers
Carbohydrate structure report
data
bioinformatics
edam
beta12orEarlier
data
Annotation on or information derived from one or more specific carbohydrate 3D structure(s).
GI number
edam
identifiers
bioinformatics
Nucleotide sequence GI number is shown in the VERSION field of the database record. Protein sequence GI number is shown in the CDS/db_xref field of a nucleotide database record, and the VERSION field of a protein database record.
A series of digits that are assigned consecutively to each sequence record processed by NCBI. The GI number bears no resemblance to the Accession number of the sequence record.
gi number
NCBI GI number
data
identifier
beta12orEarlier
NCBI version
accession.version
Nucleotide sequence version contains two letters followed by six digits, a dot, and a version number (or for older nucleotide sequence records, the format is one letter followed by five digits, a dot, and a version number). Protein sequence version contains three letters followed by five digits, a dot, and a version number.
identifier
beta12orEarlier
An identifier assigned to sequence records processed by NCBI, made of the accession number of the database record followed by a dot and a version number.
identifiers
bioinformatics
edam
NCBI accession.version
data
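The layouts described in the comment above can be expressed as patterns. The sketch below encodes only those three layouts (two letters plus six digits, one letter plus five digits, three letters plus five digits, each followed by a dot and a version number); it assumes upper-case letters, and real NCBI records may use other accession layouts that it would reject. The names `NUCLEOTIDE`, `PROTEIN` and `classify_version` are hypothetical.

```python
import re

# Patterns derived from the textual description only; not an official grammar.
NUCLEOTIDE = re.compile(r"(?:[A-Z]{2}[0-9]{6}|[A-Z][0-9]{5})\.([0-9]+)")
PROTEIN = re.compile(r"[A-Z]{3}[0-9]{5}\.([0-9]+)")


def classify_version(text: str):
    """Return (kind, version number) for a matching string, else None."""
    for kind, pattern in (("nucleotide", NUCLEOTIDE), ("protein", PROTEIN)):
        match = pattern.fullmatch(text)
        if match:
            return kind, int(match.group(1))
    return None
```

Because the letter and digit counts differ between the layouts, a string can match at most one of the two patterns.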
Cell line name
edam
data
The name of a cell line.
bioinformatics
identifier
beta12orEarlier
identifiers
Cell line name (exact)
The name of a cell line.
data
beta12orEarlier
identifier
edam
identifiers
bioinformatics
Cell line name (truncated)
data
identifiers
identifier
edam
beta12orEarlier
The name of a cell line.
bioinformatics
Cell line name (no punctuation)
identifiers
beta12orEarlier
edam
data
bioinformatics
identifier
The name of a cell line.
Cell line name (assonant)
edam
bioinformatics
beta12orEarlier
data
identifier
The name of a cell line.
identifiers
Enzyme ID
Enzyme accession
beta12orEarlier
identifiers
A unique, persistent identifier of an enzyme.
data
identifier
bioinformatics
edam
REBASE enzyme number
edam
data
beta12orEarlier
bioinformatics
identifiers
Identifier of an enzyme from the REBASE enzymes database.
identifier
DrugBank ID
edam
bioinformatics
Unique identifier of a drug from the DrugBank database.
DB[0-9]{5}
identifier
identifiers
data
beta12orEarlier
GI number (protein)
A unique identifier assigned to NCBI protein sequence records.
Nucleotide sequence GI number is shown in the VERSION field of the database record. Protein sequence GI number is shown in the CDS/db_xref field of a nucleotide database record, and the VERSION field of a protein database record.
bioinformatics
edam
identifier
beta12orEarlier
protein gi
identifiers
protein gi number
data
Bit score
data
beta12orEarlier
data
Bit scores are normalized with respect to the scoring system and therefore can be used to compare alignment scores from different searches.
edam
A score derived from the alignment of two sequences, which is then normalized with respect to the scoring system.
bioinformatics
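One widely used normalisation of this kind is the Karlin-Altschul bit score used by BLAST, S' = (lambda * S - ln K) / ln 2, where S is the raw alignment score and lambda and K are statistical parameters of the scoring system. The sketch below assumes that formula; the ontology itself does not say which normalisation is meant, and the function and parameter names are hypothetical.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to a bit score.

    Uses S' = (lam * S - ln K) / ln 2, where `lam` and `k` are the
    Karlin-Altschul parameters of the scoring system (assumed known).
    """
    if lam <= 0 or k <= 0:
        raise ValueError("lam and k must be positive")
    return (lam * raw_score - math.log(k)) / math.log(2)
```

Since lambda and K absorb the properties of the substitution matrix and gap costs, bit scores computed this way are comparable across searches that used different scoring systems.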
Translation phase specification
data
bioinformatics
Phase
Phase for translation of DNA (0, 1 or 2) relative to a fragment of the coding sequence.
edam
beta12orEarlier
data
Metadata
beta12orEarlier
data
edam
Provenance metadata
bioinformatics
This is a broad data type and is used as a placeholder for other, more specific types.
data
Data concerning or describing some core data, as distinct from the primary data that is being described. This includes metadata on the origin, source, history, ownership or location of some thing.
Ontology identifier
identifiers
edam
Any arbitrary identifier of an ontology.
data
identifier
beta12orEarlier
bioinformatics
Ontology concept name
beta12orEarlier
identifiers
identifier
bioinformatics
data
The name of a concept in an ontology.
edam
Genome build identifier
data
edam
identifiers
An identifier of a build of a particular genome.
bioinformatics
beta12orEarlier
identifier
Pathway or network name
edam
beta12orEarlier
data
identifiers
bioinformatics
identifier
The name of a biological pathway or network.
Pathway ID (KEGG)
[a-zA-Z_0-9]{2,3}[0-9]{5}
identifier
beta12orEarlier
data
edam
identifiers
KEGG pathway ID
bioinformatics
Identifier of a pathway from the KEGG pathway database.
Pathway ID (NCI-Nature)
Identifier of a pathway from the NCI-Nature pathway database.
[a-zA-Z_0-9]+
data
bioinformatics
identifier
identifiers
beta12orEarlier
edam
Pathway ID (ConsensusPathDB)
Identifier of a pathway from the ConsensusPathDB pathway database.
bioinformatics
identifier
beta12orEarlier
edam
data
identifiers
Sequence cluster ID (UniRef)
edam
UniRef entry accession
bioinformatics
data
Unique identifier of an entry from the UniRef database.
identifier
beta12orEarlier
UniRef cluster id
identifiers
Sequence cluster ID (UniRef100)
Unique identifier of an entry from the UniRef100 database.
UniRef100 entry accession
identifiers
bioinformatics
UniRef100 cluster id
data
identifier
edam
beta12orEarlier
Sequence cluster ID (UniRef90)
Unique identifier of an entry from the UniRef90 database.
edam
bioinformatics
identifier
identifiers
data
UniRef90 cluster id
beta12orEarlier
UniRef90 entry accession
Sequence cluster ID (UniRef50)
identifiers
UniRef50 cluster id
identifier
data
Unique identifier of an entry from the UniRef50 database.
edam
bioinformatics
UniRef50 entry accession
beta12orEarlier
Ontological data
true
beta13
edam
Data concerning an ontology.
This is a broad data type and is used as a placeholder for other, more specific types.
beta12orEarlier
data
data
bioinformatics
RNA family annotation
bioinformatics
data
An informative report on a specific RNA family or other group of classified RNA sequences.
beta12orEarlier
edam
data
RNA family identifier
identifiers
Identifier of an RNA family, typically an entry from an RNA sequence classification database.
data
bioinformatics
edam
identifier
beta12orEarlier
RFAM accession
identifier
beta12orEarlier
edam
Stable accession number of an entry (RNA family) from the RFAM database.
identifiers
bioinformatics
data
Protein signature type
data
bioinformatics
edam
A label (text token) describing a type of protein family signature (sequence classifier) from the InterPro database.
data
beta12orEarlier
Domain-nucleic acid interaction
beta12orEarlier
data
data
bioinformatics
edam
Data on protein domain-DNA/RNA interaction(s).
Domain-domain interaction
bioinformatics
data
data
beta12orEarlier
Data on protein domain-protein domain interaction(s).
edam
Domain-domain interaction (indirect)
true
data
data
Data on indirect protein domain-protein domain interaction(s).
beta12orEarlier
bioinformatics
edam
beta12orEarlier
Sequence accession (hybrid)
data
beta12orEarlier
Accession number of a nucleotide or protein sequence database entry.
edam
bioinformatics
identifiers
identifier
2D PAGE data
true
beta12orEarlier
edam
bioinformatics
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
data
Data concerning two-dimensional polyacrylamide gel electrophoresis.
data
beta13
Experiment annotation (2D PAGE)
data
edam
An informative report on a two-dimensional gel electrophoresis experiment, gel or spots in a gel.
bioinformatics
data
beta12orEarlier
Pathway or network accession
edam
identifier
identifiers
data
beta12orEarlier
A persistent, unique identifier of a biological pathway or network (typically a database entry).
bioinformatics
Secondary structure alignment
beta12orEarlier
bioinformatics
data
Alignment of the (1D representations of) secondary structure of two or more molecules.
edam
data
ASTD ID
Identifier of an object from the ASTD database.
identifier
beta12orEarlier
bioinformatics
identifiers
data
edam
ASTD ID (exon)
Identifier of an exon from the ASTD database.
bioinformatics
beta12orEarlier
identifier
edam
data
identifiers
ASTD ID (intron)
edam
identifiers
Identifier of an intron from the ASTD database.
identifier
beta12orEarlier
data
bioinformatics
ASTD ID (polya)
identifier
beta12orEarlier
edam
identifiers
data
Identifier of a polyA signal from the ASTD database.
bioinformatics
ASTD ID (tss)
edam
identifier
identifiers
data
bioinformatics
beta12orEarlier
Identifier of a transcription start site from the ASTD database.
2D PAGE spot (annotated)
bioinformatics
data
edam
data
An informative report on individual spot(s) from a two-dimensional (2D PAGE) gel.
beta12orEarlier
2D PAGE spot annotation
Spot ID
Unique identifier of a spot from a two-dimensional (protein) gel.
data
edam
identifiers
beta12orEarlier
identifier
bioinformatics
Spot serial number
beta12orEarlier
data
edam
bioinformatics
identifiers
Unique identifier of a spot from a two-dimensional (protein) gel in the SWISS-2DPAGE database.
identifier
Spot ID (HSC-2DPAGE)
beta12orEarlier
identifier
bioinformatics
data
edam
identifiers
Unique identifier of a spot from a two-dimensional (protein) gel from the HSC-2DPAGE database.
Protein-motif interaction
true
data
beta12orEarlier
Data on the interaction of a protein (or protein domain) with specific structural (3D) and/or sequence motifs.
edam
data
beta13
bioinformatics
Strain identifier
identifier
data
beta12orEarlier
identifiers
Identifier of a strain of an organism variant, typically a plant, virus or bacterium.
edam
bioinformatics
CABRI accession
edam
beta12orEarlier
data
bioinformatics
identifiers
A unique identifier of an item from the CABRI database.
identifier
Experiment annotation (genotype)
data
edam
Metadata on a genotype experiment including case control, population, and family studies. These might use array based methods and re-sequencing methods.
beta12orEarlier
bioinformatics
data
Genotype experiment ID
data
Identifier of an entry from a database of genotype experiment metadata.
edam
identifiers
identifier
beta12orEarlier
bioinformatics
EGA accession
Identifier of an entry from the EGA database.
beta12orEarlier
bioinformatics
identifiers
identifier
data
edam
IPI protein ID
identifier
bioinformatics
beta12orEarlier
identifiers
IPI[0-9]{8}
data
edam
Identifier of a protein entry catalogued in the International Protein Index (IPI) database.
RefSeq accession (protein)
beta12orEarlier
Accession number of a protein from the RefSeq database.
identifier
identifiers
edam
bioinformatics
data
RefSeq protein ID
EPD ID
bioinformatics
identifiers
EPD identifier
identifier
data
Identifier of an entry (promoter) from the EPD database.
edam
beta12orEarlier
TAIR accession
Identifier of an entry from the TAIR database.
beta12orEarlier
edam
data
bioinformatics
identifiers
identifier
TAIR accession (At gene)
edam
Identifier of an Arabidopsis thaliana gene from the TAIR database.
identifiers
bioinformatics
beta12orEarlier
data
identifier
UniSTS accession
data
Identifier of an entry from the UniSTS database.
bioinformatics
edam
identifiers
beta12orEarlier
identifier
UNITE accession
edam
identifiers
bioinformatics
Identifier of an entry from the UNITE database.
data
beta12orEarlier
identifier
UTR accession
bioinformatics
data
edam
identifiers
Identifier of an entry from the UTR database.
identifier
beta12orEarlier
UniParc accession
data
UPI
UniParc ID
identifier
Accession number of a UniParc (protein sequence) database entry.
UPI[A-F0-9]{10}
identifiers
edam
bioinformatics
beta12orEarlier
mFLJ/mKIAA number
bioinformatics
identifier
data
Identifier of an entry from the Rouge or HUGE databases.
identifiers
beta12orEarlier
edam
Fungi annotation
An informative report on a specific fungus.
data
data
beta12orEarlier
edam
bioinformatics
Fungi annotation (anamorph)
bioinformatics
edam
An informative report on a specific fungus anamorph.
data
beta12orEarlier
data
Nucleic acid features (exon)
Gene features (exon)
beta12orEarlier
data
data
An informative report on an exon in a nucleotide sequence.
bioinformatics
edam
Protein ID (Ensembl)
bioinformatics
Unique identifier for a protein from the Ensembl database.
data
Ensembl protein ID
beta12orEarlier
identifier
edam
identifiers
Gene annotation (transcript)
An informative report on a specific gene transcript, clone or EST.
edam
Gene annotation (clone or EST)
Gene transcript annotation
beta12orEarlier
bioinformatics
data
data
Toxin annotation
edam
bioinformatics
An informative report on a specific toxin.
data
data
beta12orEarlier
Protein report (membrane protein)
true
beta12orEarlier
data
bioinformatics
An informative report on a membrane protein.
data
edam
beta12orEarlier
Protein-drug interaction
beta12orEarlier
bioinformatics
edam
data
Informative report on protein-drug interaction(s) including binding affinity data.
data
Map data
true
edam
bioinformatics
Data concerning a map of molecular sequence(s).
beta13
This is a broad data type and is used as a placeholder for other, more specific types.
data
beta12orEarlier
data
Phylogenetic raw data
beta12orEarlier
bioinformatics
data
edam
data
This is a broad data type and is used as a placeholder for other, more specific types.
Data concerning phylogeny, typically of molecular sequences.
Phylogenetic data
Protein data
true
data
data
beta13
edam
beta12orEarlier
bioinformatics
Data concerning one or more protein molecules.
This is a broad data type and is used as a placeholder for other, more specific types.
Nucleic acid data
true
bioinformatics
This is a broad data type and is used as a placeholder for other, more specific types.
data
data
beta12orEarlier
beta13
Data concerning one or more nucleic acid molecules.
edam
Article data
true
beta13
edam
bioinformatics
Data concerning the scientific literature.
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation. It includes concepts that are best described as scientific text or closely concerned with or derived from text.
data
beta12orEarlier
data
Parameter
bioinformatics
data
edam
Tool parameter
Tool-specific parameter
data
Parameter or primitive
beta12orEarlier
Typically a simple numerical or string value that controls the operation of a tool.
Slightly narrower in the sense of changing the characteristics of a system/function.
Molecular data
true
This is a broad data type and is used as a placeholder for other, more specific types.
beta12orEarlier
beta13
Data concerning a specific type of molecule.
edam
bioinformatics
data
data
Molecule-specific data
Molecule report
data
data
edam
beta12orEarlier
Molecular report
bioinformatics
An informative report on a specific molecule.
Organism annotation
bioinformatics
data
edam
beta12orEarlier
data
An informative report on a specific organism.
Experiment annotation
edam
Annotation on a wet lab experiment, such as experimental conditions.
data
data
beta12orEarlier
bioinformatics
Nucleic acid features (mutation)
data
Annotation on a mutation.
edam
Mutation annotation
data
beta12orEarlier
bioinformatics
Sequence parameter
data
data
bioinformatics
beta12orEarlier
edam
A parameter concerning calculations on molecular sequences.
Sequence tag profile
bioinformatics
data
Output from a serial analysis of gene expression (SAGE), massively parallel signature sequencing (MPSS) or sequencing by synthesis (SBS) experiment. In all cases this is a list of short sequence tags and the number of times each tag is observed.
Sequencing-based expression profile
beta12orEarlier
edam
SAGE, MPSS and SBS experiments are usually performed to study gene expression. The sequence tags are typically subsequently annotated (after a database search) with the mRNA (and therefore gene) the tag was extracted from.
data
Mass spectrometry data
beta12orEarlier
Data concerning a mass spectrometry measurement.
bioinformatics
data
edam
data
Protein structure raw data
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
bioinformatics
Raw data from experimental methods for determining protein structure.
data
edam
data
beta12orEarlier
Mutation identifier
identifier
beta12orEarlier
bioinformatics
edam
data
identifiers
An identifier of a mutation.
Alignment data
true
bioinformatics
edam
This is a broad data type and is used as a placeholder for other, more specific types. This includes entities derived from sequences and structures such as motifs and profiles.
data
data
beta12orEarlier
Data concerning an alignment of two or more molecular sequences, structures or derived data.
beta13
Data index data
true
This is a broad data type and is used as a placeholder for other, more specific types.
Database index
beta13
bioinformatics
Data concerning an index of data.
data
edam
data
beta12orEarlier
Amino acid name (single letter)
identifier
identifiers
bioinformatics
edam
Single letter amino acid identifier, e.g. G.
data
beta12orEarlier
Amino acid name (three letter)
identifier
data
edam
Three letter amino acid identifier, e.g. GLY.
bioinformatics
identifiers
beta12orEarlier
Amino acid name (full name)
Full name of an amino acid, e.g. Glycine.
identifiers
bioinformatics
beta12orEarlier
edam
data
identifier
Toxin identifier
data
identifier
bioinformatics
Identifier of a toxin.
identifiers
edam
beta12orEarlier
ArachnoServer ID
beta12orEarlier
edam
identifier
bioinformatics
Unique identifier of a toxin from the ArachnoServer database.
identifiers
data
Expressed gene list
A simple summary of expressed genes.
edam
Gene annotation (expressed gene list)
data
bioinformatics
beta12orEarlier
data
BindingDB Monomer ID
Unique identifier of a monomer from the BindingDB database.
beta12orEarlier
identifiers
edam
identifier
bioinformatics
data
GO concept name
true
data
The name of a concept from the GO ontology.
edam
bioinformatics
beta12orEarlier
beta12orEarlier
identifiers
identifier
GO concept ID (biological process)
An identifier of a 'biological process' concept from the Gene Ontology.
bioinformatics
identifiers
identifier
edam
data
beta12orEarlier
[0-9]{7}|GO:[0-9]{7}
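The pattern recorded above accepts either a bare seven-digit number or the same number prefixed with `GO:`. The following sketch shows one way to validate a value against that pattern and bring it to the prefixed form. It assumes Python's standard `re` module; the name `normalise_go_id` and the choice of the prefixed form as the canonical one are assumptions made for illustration, not something EDAM prescribes.

```python
import re
from typing import Optional

# Pattern copied verbatim from the GO concept ID record above.
GO_ID_PATTERN = re.compile(r"[0-9]{7}|GO:[0-9]{7}")


def normalise_go_id(value: str) -> Optional[str]:
    """Return the 'GO:'-prefixed form of a valid ID, or None if invalid."""
    # fullmatch is required: the pattern is an unanchored alternation,
    # so a plain search would also accept longer strings.
    if GO_ID_PATTERN.fullmatch(value) is None:
        return None
    return value if value.startswith("GO:") else "GO:" + value
```

The pattern checks syntax only; it cannot tell which branch of the Gene Ontology a concept belongs to.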
GO concept ID (molecular function)
An identifier of a 'molecular function' concept from the Gene Ontology.
data
bioinformatics
identifier
[0-9]{7}|GO:[0-9]{7}
identifiers
beta12orEarlier
edam
GO concept name (cellular component)
true
identifier
bioinformatics
beta12orEarlier
identifiers
edam
beta12orEarlier
The name of a concept for a cellular component from the GO ontology.
data
Northern blot image
An image arising from a Northern Blot experiment.
data
beta12orEarlier
edam
bioinformatics
data
Blot ID
identifier
edam
data
identifiers
beta12orEarlier
Unique identifier of a blot from a Northern Blot.
bioinformatics
BlotBase blot ID
identifiers
bioinformatics
identifier
data
Unique identifier of a blot from a Northern Blot from the BlotBase database.
beta12orEarlier
edam
Hierarchy
Hierarchy annotation
bioinformatics
edam
A biological hierarchy which might include data describing the hierarchy proper, hierarchy components and associated annotation.
beta12orEarlier
data
data
Hierarchy identifier
true
bioinformatics
identifier
data
Identifier of an entry from a database of biological hierarchies.
identifiers
edam
beta12orEarlier
beta12orEarlier
Brite hierarchy ID
edam
identifier
Identifier of an entry from the Brite database of biological hierarchies.
bioinformatics
beta12orEarlier
identifiers
data
Cancer type
true
beta12orEarlier
bioinformatics
beta12orEarlier
edam
A type (represented as a string) of cancer.
data
data
BRENDA organism ID
bioinformatics
beta12orEarlier
data
A unique identifier for an organism used in the BRENDA database.
edam
identifier
identifiers
UniGene taxon
data
identifiers
beta12orEarlier
identifier
bioinformatics
UniGene organism abbreviation
edam
The name of a taxon using the controlled vocabulary of the UniGene database.
UTRdb taxon
bioinformatics
identifier
data
edam
The name of a taxon using the controlled vocabulary of the UTRdb database.
beta12orEarlier
identifiers
Catalogue identifier
An identifier of a catalogue of biological resources.
beta12orEarlier
identifier
identifiers
data
edam
bioinformatics
CABRI catalogue name
bioinformatics
beta12orEarlier
identifiers
data
The name of a catalogue of biological resources from the CABRI database.
edam
identifier
Secondary structure alignment metadata
true
data
edam
data
bioinformatics
beta12orEarlier
beta12orEarlier
An informative report on protein secondary structure alignment-derived data or metadata.
Molecular interaction
bioinformatics
Molecular interaction data
edam
beta12orEarlier
data
Physical, chemical or other information concerning the interaction of two or more molecules (or parts of molecules).
data
Pathway or network
data
Primary data about a specific biological pathway or network (the nodes and connections within the pathway or network).
edam
beta12orEarlier
bioinformatics
Network
data
Small molecule data
true
data
beta13
edam
beta12orEarlier
Data concerning one or more small molecules.
data
This is a broad data type and is used as a placeholder for other, more specific types.
bioinformatics
Genotype and phenotype data
true
Data concerning a particular genotype, phenotype or a genotype / phenotype relation.
bioinformatics
data
data
edam
beta13
beta12orEarlier
Microarray data
This is a broad data type and is used as a placeholder for other, more specific types. See also http://edamontology.org/data_0931
data
edam
beta12orEarlier
Image or hybridisation data for a microarray, typically a study of gene expression.
data
Gene expression data
bioinformatics
Compound ID (KEGG)
KEGG compound identifier
identifier
beta12orEarlier
Unique identifier of a chemical compound from the KEGG database.
edam
C[0-9]+
KEGG compound ID
bioinformatics
data
identifiers
RFAM name
beta12orEarlier
identifiers
edam
data
Name (not necessarily stable) of an entry (RNA family) from the RFAM database.
bioinformatics
identifier
Reaction ID (KEGG)
edam
Identifier of a biological reaction from the KEGG reactions database.
beta12orEarlier
data
identifiers
R[0-9]+
identifier
bioinformatics
Drug ID (KEGG)
data
identifiers
identifier
D[0-9]+
edam
Unique identifier of a drug from the KEGG Drug database.
bioinformatics
beta12orEarlier
Ensembl ID
identifier
identifiers
data
beta12orEarlier
edam
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl database.
ENS[A-Z]*[FPTG][0-9]{11}
bioinformatics
ICD identifier
identifier
data
bioinformatics
beta12orEarlier
identifiers
edam
[A-Z][0-9]+(\.[-[0-9]+])?
An identifier of a disease from the International Classification of Diseases (ICD) database.
Sequence cluster ID (CluSTr)
identifier
CluSTr cluster ID
identifiers
beta12orEarlier
edam
[0-9A-Za-z]+:[0-9]+:[0-9]{1,5}(\.[0-9])?
bioinformatics
CluSTr ID
Unique identifier of a sequence cluster from the CluSTr database.
data
KEGG Glycan ID
beta12orEarlier
bioinformatics
edam
data
identifier
Unique identifier of a glycan ligand from the KEGG GLYCAN database (a subset of KEGG LIGAND).
identifiers
G[0-9]+
TCDB ID
data
A unique identifier of a family from the transport classification database (TCDB) of membrane transport proteins.
identifier
[0-9]+\.[A-Z]\.[0-9]+\.[0-9]+\.[0-9]+
TC number
edam
OBO file for regular expression.
bioinformatics
beta12orEarlier
identifiers
MINT ID
beta12orEarlier
data
edam
identifier
Unique identifier of an entry from the MINT database of protein-protein interactions.
identifiers
MINT\-[0-9]{1,5}
bioinformatics
DIP ID
data
identifiers
bioinformatics
DIP[\:\-][0-9]{3}[EN]
beta12orEarlier
edam
identifier
Unique identifier of an entry from the DIP database of protein-protein interactions.
Signaling Gateway protein ID
bioinformatics
identifier
A[0-9]{6}
data
Unique identifier of a protein listed in the UCSD-Nature Signaling Gateway Molecule Pages database.
identifiers
edam
beta12orEarlier
Protein modification ID
Identifier of a protein modification catalogued in a database.
beta12orEarlier
bioinformatics
data
identifiers
edam
identifier
RESID ID
beta12orEarlier
edam
AA[0-9]{4}
identifier
Identifier of a protein modification catalogued in the RESID database.
bioinformatics
identifiers
data
RGD ID
Identifier of an entry from the RGD database.
beta12orEarlier
[0-9]{4,7}
identifiers
edam
bioinformatics
data
identifier
TAIR accession (protein)
beta12orEarlier
Identifier of a protein sequence from the TAIR database.
identifiers
AASequence:[0-9]{10}
data
edam
identifier
bioinformatics
Compound ID (HMDB)
identifiers
beta12orEarlier
bioinformatics
data
HMDB[0-9]{5}
Identifier of a small molecule metabolite from the Human Metabolome Database (HMDB).
identifier
HMDB ID
edam
LIPID MAPS ID
LM(FA|GL|GP|SP|ST|PR|SL|PK)[0-9]{4}([0-9a-zA-Z]{4})?
edam
identifiers
data
bioinformatics
Identifier of an entry from the LIPID MAPS database.
LM ID
identifier
beta12orEarlier
PeptideAtlas ID
data
PAp[0-9]{8}
identifiers
edam
identifier
beta12orEarlier
PDBML:pdbx_PDB_strand_id
bioinformatics
Identifier of a peptide from the PeptideAtlas peptide databases.
Molecular interaction ID
bioinformatics
beta12orEarlier
Identifier of a report of molecular interactions from a database (typically).
identifier
data
edam
identifiers
BioGRID interaction ID
A unique identifier of an interaction from the BioGRID database.
bioinformatics
beta12orEarlier
data
[0-9]+
identifier
identifiers
edam
Enzyme ID (MEROPS)
S[0-9]{2}\.[0-9]{3}
Unique identifier of a peptidase enzyme from the MEROPS database.
beta12orEarlier
MEROPS ID
bioinformatics
edam
identifiers
identifier
data
Mobile genetic element ID
beta12orEarlier
edam
identifier
identifiers
bioinformatics
data
An identifier of a mobile genetic element.
ACLAME ID
identifiers
An identifier of a mobile genetic element from the Aclame database.
mge:[0-9]+
beta12orEarlier
data
edam
bioinformatics
identifier
SGD ID
data
identifier
PWY[a-zA-Z_0-9]{2}\-[0-9]{3}
identifiers
Identifier of an entry from the Saccharomyces genome database (SGD).
beta12orEarlier
bioinformatics
edam
Book ID
data
bioinformatics
identifiers
identifier
edam
Unique identifier of a book.
beta12orEarlier
ISBN
identifier
bioinformatics
data
edam
identifiers
The International Standard Book Number (ISBN) is for identifying printed books.
(ISBN)?(-13|-10)?[:]?[ ]?([0-9]{2,3}[ -]?)?[0-9]{1,5}[ -]?[0-9]{1,7}[ -]?[0-9]{1,6}[ -]?([0-9]|X)
beta12orEarlier
Compound ID (3DMET)
edam
bioinformatics
identifiers
Identifier of a metabolite from the 3DMET database.
beta12orEarlier
data
identifier
3DMET ID
B[0-9]{5}
MatrixDB interaction ID
beta12orEarlier
edam
bioinformatics
identifier
data
A unique identifier of an interaction from the MatrixDB database.
identifiers
([A-NR-Z][0-9][A-Z][A-Z0-9][A-Z0-9][0-9])_.*|([OPQ][0-9][A-Z0-9][A-Z0-9][A-Z0-9][0-9]_.*)|(GAG_.*)|(MULT_.*)|(PFRAG_.*)|(LIP_.*)|(CAT_.*)
cPath ID
edam
A unique identifier for pathways, reactions, complexes and small molecules from the cPath (Pathway Commons) database.
beta12orEarlier
These identifiers are unique within the cPath database; however, they are not stable between releases.
identifiers
identifier
[0-9]+
data
bioinformatics
PubChem bioassay ID
bioinformatics
beta12orEarlier
data
identifier
edam
[0-9]+
Identifier of an assay from the PubChem database.
identifiers
PubChem identifier
Identifier of an entry from the PubChem database.
bioinformatics
edam
beta12orEarlier
identifier
data
identifiers
Reaction ID (MACie)
Identifier of an enzyme reaction mechanism from the MACie database.
identifier
bioinformatics
edam
MACie entry number
M[0-9]{4}
identifiers
data
beta12orEarlier
Gene ID (miRBase)
identifier
miRNA name
beta12orEarlier
miRNA identifier
MI[0-9]{7}
Identifier for a gene from the miRBase database.
identifiers
miRNA ID
data
edam
bioinformatics
Gene ID (ZFIN)
ZDB\-GENE\-[0-9]+\-[0-9]+
identifier
edam
beta12orEarlier
bioinformatics
Identifier for a gene from the Zebrafish information network genome (ZFIN) database.
data
identifiers
Reaction ID (Rhea)
edam
data
beta12orEarlier
identifier
Identifier of an enzyme-catalysed reaction from the Rhea database.
[0-9]{5}
bioinformatics
identifiers
Pathway ID (Unipathway)
Identifier of a biological pathway from the Unipathway database.
upaid
identifier
bioinformatics
edam
beta12orEarlier
data
identifiers
UPA[0-9]{5}
Compound ID (ChEMBL)
identifier
Identifier of a small molecule from the ChEMBL database.
data
identifiers
edam
[0-9]+
beta12orEarlier
ChEMBL ID
bioinformatics
LGICdb identifier
data
[a-zA-Z_0-9]+
identifier
beta12orEarlier
bioinformatics
identifiers
Unique identifier of an entry from the Ligand-gated ion channel (LGICdb) database.
edam
Reaction kinetics ID (SABIO-RK)
edam
identifiers
beta12orEarlier
bioinformatics
identifier
[0-9]+
Identifier of a biological reaction (kinetics entry) from the SABIO-RK reactions database.
data
PharmGKB ID
data
edam
identifier
beta12orEarlier
identifiers
PA[0-9]+
bioinformatics
Identifier of an entry from the pharmacogenetics and pharmacogenomics knowledge base (PharmGKB).
Pathway ID (PharmGKB)
beta12orEarlier
bioinformatics
PA[0-9]+
Identifier of a pathway from the pharmacogenetics and pharmacogenomics knowledge base (PharmGKB).
edam
identifier
identifiers
data
Disease ID (PharmGKB)
data
edam
Identifier of a disease from the pharmacogenetics and pharmacogenomics knowledge base (PharmGKB).
bioinformatics
identifiers
PA[0-9]+
identifier
beta12orEarlier
Drug ID (PharmGKB)
bioinformatics
identifiers
identifier
Identifier of a drug from the pharmacogenetics and pharmacogenomics knowledge base (PharmGKB).
data
PA[0-9]+
beta12orEarlier
edam
Drug ID (TTD)
beta12orEarlier
identifiers
Identifier of a drug from the Therapeutic Target Database (TTD).
data
edam
bioinformatics
identifier
DAP[0-9]+
Target ID (TTD)
bioinformatics
identifier
beta12orEarlier
Identifier of a target protein from the Therapeutic Target Database (TTD).
identifiers
TTDS[0-9]+
edam
data
Cell type identifier
edam
A unique identifier of a type or group of cells.
identifier
data
bioinformatics
identifiers
beta12orEarlier
NeuronDB ID
edam
identifiers
beta12orEarlier
identifier
A unique identifier of a neuron from the NeuronDB database.
data
[0-9]+
bioinformatics
NeuroMorpho ID
data
edam
bioinformatics
identifiers
identifier
[a-zA-Z_0-9]+
A unique identifier of a neuron from the NeuroMorpho database.
beta12orEarlier
Compound ID (ChemIDplus)
beta12orEarlier
[0-9]+
ChemIDplus ID
data
edam
identifiers
Identifier of a chemical from the ChemIDplus database.
bioinformatics
identifier
Pathway ID (SMPDB)
SMP[0-9]{5}
bioinformatics
data
edam
identifier
identifiers
Identifier of a pathway from the Small Molecule Pathway Database (SMPDB).
beta12orEarlier
BioNumbers ID
beta12orEarlier
data
bioinformatics
identifiers
identifier
edam
Identifier of an entry from the BioNumbers database of key numbers and associated data in molecular biology.
[0-9]+
T3DB ID
Unique identifier of a toxin from the Toxin and Toxin Target Database (T3DB).
identifier
T3D[0-9]+
data
beta12orEarlier
identifiers
bioinformatics
edam
Carbohydrate identifier
Identifier of a carbohydrate.
bioinformatics
edam
identifiers
beta12orEarlier
data
identifier
GlycomeDB ID
identifier
identifiers
data
edam
[0-9]+
beta12orEarlier
bioinformatics
Identifier of an entry from the GlycomeDB database.
LipidBank ID
Identifier of an entry from the LipidBank database.
identifier
edam
bioinformatics
identifiers
[a-zA-Z_0-9]+[0-9]+
data
beta12orEarlier
CDD ID
Identifier of a conserved domain from the Conserved Domain Database.
identifier
edam
data
cd[0-9]{5}
bioinformatics
identifiers
beta12orEarlier
MMDB ID
beta12orEarlier
data
bioinformatics
identifier
identifiers
An identifier of an entry from the MMDB database.
MMDB accession
[0-9]{1,5}
edam
iRefIndex ID
[0-9]+
beta12orEarlier
data
bioinformatics
Unique identifier of an entry from the iRefIndex database of protein-protein interactions.
identifiers
identifier
edam
ModelDB ID
edam
beta12orEarlier
identifiers
[0-9]+
bioinformatics
Unique identifier of an entry from the ModelDB database.
identifier
data
Pathway ID (DQCS)
bioinformatics
beta12orEarlier
identifiers
Identifier of a signaling pathway from the Database of Quantitative Cellular Signaling (DQCS).
data
edam
[0-9]+
identifier
Ensembl ID (Homo sapiens)
true
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database (Homo sapiens division).
ENS([EGTP])[0-9]{11}
edam
bioinformatics
identifier
beta12orEarlier
beta12orEarlier
data
identifiers
Ensembl ID ('Bos taurus')
true
ENSBTA([EGTP])[0-9]{11}
beta12orEarlier
identifiers
edam
bioinformatics
identifier
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Bos taurus' division).
data
beta12orEarlier
Ensembl ID ('Canis familiaris')
true
bioinformatics
edam
identifiers
identifier
beta12orEarlier
ENSCAF([EGTP])[0-9]{11}
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Canis familiaris' division).
data
beta12orEarlier
Ensembl ID ('Cavia porcellus')
true
beta12orEarlier
data
beta12orEarlier
identifiers
identifier
ENSCPO([EGTP])[0-9]{11}
bioinformatics
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Cavia porcellus' division).
edam
Ensembl ID ('Ciona intestinalis')
true
identifier
ENSCIN([EGTP])[0-9]{11}
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Ciona intestinalis' division).
beta12orEarlier
data
bioinformatics
edam
identifiers
beta12orEarlier
Ensembl ID ('Ciona savignyi')
true
identifiers
edam
identifier
beta12orEarlier
beta12orEarlier
ENSCSAV([EGTP])[0-9]{11}
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Ciona savignyi' division).
data
bioinformatics
Ensembl ID ('Danio rerio')
true
ENSDAR([EGTP])[0-9]{11}
beta12orEarlier
beta12orEarlier
bioinformatics
identifiers
data
edam
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Danio rerio' division).
identifier
Ensembl ID ('Dasypus novemcinctus')
true
identifier
bioinformatics
data
beta12orEarlier
ENSDNO([EGTP])[0-9]{11}
beta12orEarlier
identifiers
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Dasypus novemcinctus' division).
edam
Ensembl ID ('Echinops telfairi')
true
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Echinops telfairi' division).
data
edam
beta12orEarlier
identifier
bioinformatics
identifiers
beta12orEarlier
ENSETE([EGTP])[0-9]{11}
Ensembl ID ('Erinaceus europaeus')
true
beta12orEarlier
data
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Erinaceus europaeus' division).
identifier
beta12orEarlier
bioinformatics
ENSEEU([EGTP])[0-9]{11}
edam
identifiers
Ensembl ID ('Felis catus')
true
data
edam
beta12orEarlier
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Felis catus' division).
ENSFCA([EGTP])[0-9]{11}
identifier
beta12orEarlier
identifiers
bioinformatics
Ensembl ID ('Gallus gallus')
true
bioinformatics
beta12orEarlier
identifier
edam
data
beta12orEarlier
identifiers
ENSGAL([EGTP])[0-9]{11}
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Gallus gallus' division).
Ensembl ID ('Gasterosteus aculeatus')
true
edam
identifier
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Gasterosteus aculeatus' division).
bioinformatics
beta12orEarlier
ENSGAC([EGTP])[0-9]{11}
data
identifiers
beta12orEarlier
Ensembl ID ('Homo sapiens')
true
identifier
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Homo sapiens' division).
beta12orEarlier
beta12orEarlier
data
bioinformatics
edam
ENSHUM([EGTP])[0-9]{11}
identifiers
Ensembl ID ('Loxodonta africana')
true
identifiers
ENSLAF([EGTP])[0-9]{11}
identifier
beta12orEarlier
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Loxodonta africana' division).
beta12orEarlier
bioinformatics
data
edam
Ensembl ID ('Macaca mulatta')
true
bioinformatics
beta12orEarlier
beta12orEarlier
ENSMMU([EGTP])[0-9]{11}
identifiers
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Macaca mulatta' division).
data
identifier
edam
Ensembl ID ('Monodelphis domestica')
true
ENSMOD([EGTP])[0-9]{11}
bioinformatics
edam
beta12orEarlier
identifier
identifiers
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Monodelphis domestica' division).
beta12orEarlier
data
Ensembl ID ('Mus musculus')
true
edam
data
identifiers
identifier
beta12orEarlier
ENSMUS([EGTP])[0-9]{11}
beta12orEarlier
bioinformatics
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Mus musculus' division).
Ensembl ID ('Myotis lucifugus')
true
data
identifiers
beta12orEarlier
edam
bioinformatics
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Myotis lucifugus' division).
identifier
beta12orEarlier
ENSMLU([EGTP])[0-9]{11}
Ensembl ID ('Ornithorhynchus anatinus')
true
identifiers
ENSOAN([EGTP])[0-9]{11}
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Ornithorhynchus anatinus' division).
edam
beta12orEarlier
bioinformatics
identifier
data
beta12orEarlier
Ensembl ID ('Oryctolagus cuniculus')
true
edam
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Oryctolagus cuniculus' division).
bioinformatics
beta12orEarlier
data
identifiers
beta12orEarlier
ENSOCU([EGTP])[0-9]{11}
identifier
Ensembl ID ('Oryzias latipes')
true
beta12orEarlier
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Oryzias latipes' division).
data
ENSORL([EGTP])[0-9]{11}
identifiers
edam
identifier
beta12orEarlier
bioinformatics
Ensembl ID ('Otolemur garnettii')
true
bioinformatics
ENSSAR([EGTP])[0-9]{11}
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Otolemur garnettii' division).
beta12orEarlier
identifiers
edam
beta12orEarlier
data
identifier
Ensembl ID ('Pan troglodytes')
true
identifier
ENSPTR([EGTP])[0-9]{11}
data
identifiers
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Pan troglodytes' division).
edam
bioinformatics
beta12orEarlier
beta12orEarlier
Ensembl ID ('Rattus norvegicus')
true
identifier
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Rattus norvegicus' division).
bioinformatics
edam
identifiers
beta12orEarlier
ENSRNO([EGTP])[0-9]{11}
beta12orEarlier
data
Ensembl ID ('Spermophilus tridecemlineatus')
true
beta12orEarlier
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Spermophilus tridecemlineatus' division).
ENSSTO([EGTP])[0-9]{11}
data
identifier
identifiers
beta12orEarlier
edam
bioinformatics
Ensembl ID ('Takifugu rubripes')
true
edam
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Takifugu rubripes' division).
beta12orEarlier
beta12orEarlier
identifier
ENSFRU([EGTP])[0-9]{11}
bioinformatics
identifiers
data
Ensembl ID ('Tupaia belangeri')
true
beta12orEarlier
bioinformatics
beta12orEarlier
data
edam
identifier
identifiers
ENSTBE([EGTP])[0-9]{11}
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Tupaia belangeri' division).
Ensembl ID ('Xenopus tropicalis')
true
identifiers
ENSXET([EGTP])[0-9]{11}
edam
data
bioinformatics
beta12orEarlier
beta12orEarlier
identifier
Identifier of an entry (exon, gene, transcript or protein) from the Ensembl 'core' database ('Xenopus tropicalis' division).
CATH identifier
identifier
edam
Identifier of a protein domain (or other node) from the CATH database.
identifiers
data
bioinformatics
beta12orEarlier
CATH node ID (family)
data
edam
2.10.10.10
A code number identifying a family from the CATH database.
beta12orEarlier
identifiers
identifier
bioinformatics
Enzyme ID (CAZy)
bioinformatics
CAZy ID
beta12orEarlier
Identifier of an enzyme from the CAZy enzymes database.
data
identifiers
identifier
edam
Clone ID (IMAGE)
bioinformatics
identifiers
IMAGE cloneID
I.M.A.G.E. cloneID
edam
data
beta12orEarlier
identifier
A unique identifier assigned by the I.M.A.G.E. consortium to a clone (cloned molecular sequence).
GO concept ID (cellular compartment)
beta12orEarlier
data
[0-9]{7}|GO:[0-9]{7}
edam
identifier
An identifier of a 'cellular compartment' concept from the Gene Ontology.
identifiers
bioinformatics
GO concept identifier (cellular compartment)
Chromosome name (BioCyc)
beta12orEarlier
identifier
bioinformatics
identifiers
Name of a chromosome as used in the BioCyc database.
data
edam
CleanEx entry name
bioinformatics
An identifier of a gene expression profile from the CleanEx database.
beta12orEarlier
data
identifier
edam
identifiers
CleanEx dataset code
identifiers
An identifier of (typically a list of) gene expression experiments catalogued in the CleanEx database.
beta12orEarlier
edam
bioinformatics
data
identifier
Genome metadata
beta12orEarlier
data
data
edam
bioinformatics
Provenance metadata or other general information concerning a genome as a whole.
Protein ID (CORUM)
Unique identifier for a protein complex from the CORUM database.
identifier
CORUM complex ID
data
beta12orEarlier
identifiers
bioinformatics
edam
CDD PSSM-ID
bioinformatics
identifiers
Unique identifier of a position-specific scoring matrix from the CDD database.
identifier
data
edam
beta12orEarlier
Protein ID (CuticleDB)
Unique identifier for a protein from the CuticleDB database.
identifiers
edam
identifier
bioinformatics
beta12orEarlier
CuticleDB ID
data
DBD ID
beta12orEarlier
data
Identifier of a predicted transcription factor from the DBD database.
bioinformatics
identifier
edam
identifiers
Oligonucleotide probe annotation
edam
data
General annotation on an oligonucleotide probe.
data
beta12orEarlier
bioinformatics
Oligonucleotide ID
data
beta12orEarlier
bioinformatics
edam
Identifier of an oligonucleotide from a database.
identifiers
identifier
dbProbe ID
bioinformatics
edam
data
identifiers
beta12orEarlier
identifier
Identifier of an oligonucleotide probe from the dbProbe database.
Dinucleotide property
Physicochemical property data for one or more dinucleotides.
data
beta12orEarlier
edam
data
bioinformatics
DiProDB ID
Identifier of a dinucleotide property from the DiProDB database.
edam
beta12orEarlier
data
identifier
identifiers
bioinformatics
Protein features (disordered structure)
Protein structure report (disordered structure)
An informative report about disordered structure in a protein.
data
data
bioinformatics
beta12orEarlier
edam
Protein ID (DisProt)
Unique identifier for a protein from the DisProt database.
edam
identifier
identifiers
bioinformatics
beta12orEarlier
DisProt ID
data
Embryo annotation
beta12orEarlier
data
data
edam
bioinformatics
Annotation on an embryo or concerning embryological development.
Transcript ID (Ensembl)
Unique identifier for a gene transcript from the Ensembl database.
beta12orEarlier
bioinformatics
edam
identifiers
data
identifier
Ensembl Transcript ID
Inhibitor annotation
edam
data
An informative report on one or more small molecules that are enzyme inhibitors.
data
beta12orEarlier
bioinformatics
Promoter ID
Moby:GeneAccessionList
identifiers
beta12orEarlier
edam
data
bioinformatics
An identifier of a promoter of a gene that is catalogued in a database.
identifier
EST accession
data
identifiers
beta12orEarlier
Identifier of an EST sequence.
identifier
bioinformatics
edam
COGEME EST ID
Identifier of an EST sequence from the COGEME database.
bioinformatics
beta12orEarlier
identifiers
identifier
edam
data
COGEME unisequence ID
identifiers
A unisequence is a single sequence assembled from ESTs.
beta12orEarlier
Identifier of a unisequence from the COGEME database.
bioinformatics
edam
identifier
data
Protein family ID (GeneFarm)
bioinformatics
Accession number of an entry (family) from the TIGRFam database.
identifier
identifiers
beta12orEarlier
data
GeneFarm family ID
edam
Family name
edam
identifier
bioinformatics
identifiers
The name of a family of organisms.
data
beta12orEarlier
Genus name (virus)
true
beta12orEarlier
edam
data
bioinformatics
beta13
The name of a genus of viruses.
identifiers
identifier
Family name (virus)
true
bioinformatics
data
identifiers
The name of a family of viruses.
beta13
beta12orEarlier
edam
identifier
Database name (SwissRegulon)
true
data
bioinformatics
beta12orEarlier
edam
identifier
The name of a SwissRegulon database.
beta13
identifiers
Sequence feature ID (SwissRegulon)
This can be the name of a gene, the ID of a TFBS, or genomic coordinates in the form "chr:start..end".
data
A feature identifier as used in the SwissRegulon database.
identifiers
edam
beta12orEarlier
identifier
bioinformatics
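The "chr:start..end" coordinate form mentioned for SwissRegulon feature identifiers can be told apart from gene names or TFBS IDs with a small parser. The following is a minimal sketch in Python; the function name `parse_genomic_coordinates` and the exact regular expression are assumptions made for illustration and are not defined by SwissRegulon or EDAM.

```python
import re
from typing import Optional, Tuple

# Assumed shape of the coordinate form quoted above: "chr:start..end".
_COORD_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)\.\.(?P<end>\d+)$")


def parse_genomic_coordinates(text: str) -> Optional[Tuple[str, int, int]]:
    """Return (chromosome, start, end) if text looks like 'chr:start..end', else None."""
    match = _COORD_RE.match(text.strip())
    if match is None:
        return None  # probably a gene name or a TFBS ID instead
    return match.group("chrom"), int(match.group("start")), int(match.group("end"))
```

A value for which the function returns `None` would then be treated as one of the other two identifier forms.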
FIG ID
data
identifiers
beta12orEarlier
A unique identifier of a gene in the NMPDR database.
bioinformatics
identifier
A FIG ID consists of four parts: a prefix, genome id, locus type and id number.
edam
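The four parts of a FIG ID (prefix, genome id, locus type and id number) can be separated as sketched below. The concrete layout `fig|<genome id>.<locus type>.<id number>`, with a genome id of the form `digits.digits`, is an assumption about how such IDs are commonly written and is not stated in the text above; the example value used in the usage note is purely illustrative.

```python
import re
from typing import NamedTuple, Optional


class FigId(NamedTuple):
    prefix: str
    genome_id: str
    locus_type: str
    id_number: int


# Assumed layout: fig|<genome id>.<locus type>.<id number>
_FIG_RE = re.compile(r"^(fig)\|(\d+\.\d+)\.([A-Za-z]+)\.(\d+)$")


def parse_fig_id(text: str) -> Optional[FigId]:
    """Split a FIG ID into its four parts, or return None if it does not fit the assumed layout."""
    match = _FIG_RE.match(text)
    if match is None:
        return None
    return FigId(match.group(1), match.group(2), match.group(3), int(match.group(4)))
```

Under these assumptions, `parse_fig_id("fig|83333.1.peg.4")` would yield the prefix `fig`, genome id `83333.1`, locus type `peg` and id number `4`.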
Gene ID (Xenbase)
data
beta12orEarlier
bioinformatics
A unique identifier of a gene in the Xenbase database.
identifiers
identifier
edam
Gene ID (Genolist)
bioinformatics
data
edam
identifier
A unique identifier of a gene in the Genolist database.
beta12orEarlier
identifiers
Gene name (Genolist)
Name of an entry (gene) from the Genolist genes database.
data
identifier
Genolist gene name
beta12orEarlier
bioinformatics
identifiers
edam
ABS ID
bioinformatics
identifiers
edam
Identifier of an entry (promoter) from the ABS database.
beta12orEarlier
ABS identifier
identifier
data
AraC-XylS ID
identifier
beta12orEarlier
identifiers
Identifier of a transcription factor from the AraC-XylS database.
data
edam
bioinformatics
Gene name (HUGO)
true
Name of an entry (gene) from the HUGO database.
bioinformatics
edam
beta12orEarlier
beta12orEarlier
data
identifiers
identifier
Locus ID (PseudoCAP)
Identifier of a locus from the PseudoCAP database.
identifiers
edam
bioinformatics
beta12orEarlier
identifier
data
Locus ID (UTR)
data
Identifier of a locus from the UTR database.
beta12orEarlier
edam
identifiers
bioinformatics
identifier
MonosaccharideDB ID
beta12orEarlier
identifier
identifiers
data
Unique identifier of a monosaccharide from the MonosaccharideDB database.
bioinformatics
edam
Database name (CMD)
true
edam
identifier
identifiers
beta13
The name of a subdivision of the Collagen Mutation Database (CMD).
data
beta12orEarlier
bioinformatics
Database name (Osteogenesis)
true
data
beta12orEarlier
identifiers
bioinformatics
The name of a subdivision of the Osteogenesis database.
edam
identifier
beta13
Genome identifier
identifier
edam
data
identifiers
beta12orEarlier
An identifier of a particular genome.
bioinformatics
GenomeReviews ID
data
An identifier of a particular genome.
bioinformatics
identifiers
edam
beta12orEarlier
identifier
GlycoMap ID
identifier
bioinformatics
data
Identifier of an entry from the GlycosciencesDB database.
beta12orEarlier
edam
[0-9]+
identifiers
Carbohydrate conformational map
A conformational energy map of the glycosidic linkages in a carbohydrate molecule.
bioinformatics
beta12orEarlier
data
data
edam
Nucleic acid features (intron)
Gene features (intron)
beta12orEarlier
edam
bioinformatics
An informative report on an intron in a nucleotide sequence.
data
data
Transcription factor name
beta12orEarlier
The name of a transcription factor.
identifiers
edam
data
identifier
bioinformatics
TCID
identifier
edam
bioinformatics
data
identifiers
beta12orEarlier
Identifier of a membrane transport protein from the Transport Classification Database (TCDB).
Pfam domain name
identifier
data
identifiers
edam
PF[0-9]{5}
bioinformatics
Name of a domain from the Pfam database.
beta12orEarlier
Pfam clan ID
Accession number of a Pfam clan.
identifier
data
edam
bioinformatics
identifiers
CL[0-9]{4}
beta12orEarlier
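The two Pfam patterns listed above (`PF[0-9]{5}` and `CL[0-9]{4}`) can be applied as whole-string checks. This is a minimal sketch; the constant and function names are hypothetical, and only the two regular expressions come from the entries above.

```python
import re

# Patterns copied from the Pfam entries above.
PFAM_DOMAIN_PATTERN = re.compile(r"PF[0-9]{5}")
PFAM_CLAN_PATTERN = re.compile(r"CL[0-9]{4}")


def matches_pattern(pattern: "re.Pattern[str]", value: str) -> bool:
    """True if the whole value conforms to the given identifier pattern."""
    # fullmatch avoids accepting values that merely contain a valid identifier.
    return pattern.fullmatch(value) is not None
```

Using `fullmatch` rather than `search` is a deliberate choice here, so that a string such as `xPF00069y` would not be accepted.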
Gene ID (VectorBase)
edam
identifiers
data
Identifier for a gene from the VectorBase database.
bioinformatics
identifier
beta12orEarlier
VectorBase ID
UTRSite ID
beta12orEarlier
data
bioinformatics
identifiers
identifier
Identifier of an entry from the UTRSite database of regulatory motifs in eukaryotic UTRs.
edam
Sequence motif metadata
data
bioinformatics
beta12orEarlier
Annotation on a specific or conserved pattern in a molecular sequence, such as its context in genes or proteins, its role, origin or method of construction, etc.
data
edam
Locus annotation
true
beta12orEarlier
beta12orEarlier
Locus report
data
An informative report on a particular locus.
bioinformatics
edam
data
Protein name (UniProt)
identifiers
edam
beta12orEarlier
data
Official name of a protein as used in the UniProt database.
identifier
bioinformatics
Term ID list
data
edam
The concepts are typically provided as a persistent identifier or some other link to the source ontologies. Evidence of the validity of the annotation might be included.
data
beta12orEarlier
bioinformatics
One or more terms from one or more controlled vocabularies which are annotations on an entity.
HAMAP ID
data
edam
identifiers
identifier
beta12orEarlier
Name of a protein family from the HAMAP database.
bioinformatics
Identifier with metadata
data
beta12orEarlier
edam
bioinformatics
Basic information concerning an identifier of data (typically including the identifier itself). For example, a gene symbol with information concerning its provenance.
data
Gene symbol annotation
true
beta12orEarlier
bioinformatics
data
edam
Annotation about a gene symbol.
beta12orEarlier
data
Transcript ID
identifiers
data
bioinformatics
identifier
beta12orEarlier
Identifier of an RNA transcript.
edam
HIT ID
beta12orEarlier
identifiers
edam
identifier
data
Identifier of an RNA transcript from the H-InvDB database.
bioinformatics
HIX ID
beta12orEarlier
A unique identifier of a gene cluster in the H-InvDB database.
data
identifier
identifiers
edam
bioinformatics
HPA antibody id
bioinformatics
Identifier of an antibody from the HPA database.
beta12orEarlier
identifier
identifiers
edam
data
IMGT/HLA ID
identifier
edam
identifiers
beta12orEarlier
Identifier of a human major histocompatibility complex (HLA) or other protein from the IMGT/HLA database.
bioinformatics
data
Gene ID (JCVI)
bioinformatics
data
beta12orEarlier
edam
A unique identifier of a gene assigned by the J. Craig Venter Institute (JCVI).
identifier
identifiers
Kinase name
identifiers
The name of a kinase protein.
data
bioinformatics
identifier
beta12orEarlier
edam
ConsensusPathDB entity ID
beta12orEarlier
bioinformatics
identifiers
edam
Identifier of a physical entity from the ConsensusPathDB database.
identifier
data
ConsensusPathDB entity name
bioinformatics
data
edam
Name of a physical entity from the ConsensusPathDB database.
identifier
beta12orEarlier
identifiers
CCAP strain number
beta12orEarlier
The number of a strain of algae and protozoa from the CCAP database.
identifiers
data
edam
identifier
bioinformatics
Stock number
edam
bioinformatics
data
identifier
An identifier of stock from a catalogue of biological resources.
identifiers
beta12orEarlier
Stock number (TAIR)
identifier
edam
data
identifiers
A stock number from The Arabidopsis Information Resource (TAIR).
bioinformatics
beta12orEarlier
REDIdb ID
data
Identifier of an entry from the RNA editing database (REDIdb).
identifier
identifiers
bioinformatics
edam
beta12orEarlier
SMART domain name
edam
beta12orEarlier
identifier
Name of a domain from the SMART database.
bioinformatics
identifiers
data
Protein family ID (PANTHER)
identifiers
Panther family ID
data
bioinformatics
Accession number of an entry (family) from the PANTHER database.
beta12orEarlier
edam
identifier
RNAVirusDB ID
data
A unique identifier for a virus from the RNAVirusDB database.
edam
Could list (or reference) other taxa here from https://www.phenoscape.org/wiki/Taxonomic_Rank_Vocabulary.
identifiers
identifier
beta12orEarlier
bioinformatics
Virus ID
An accession of annotation on a (group of) viruses (catalogued in a database).
bioinformatics
beta12orEarlier
data
identifier
edam
identifiers
NCBI Genome Project ID
beta12orEarlier
edam
identifiers
identifier
bioinformatics
data
An identifier of a genome project assigned by NCBI.
NCBI genome accession
identifiers
bioinformatics
beta12orEarlier
edam
data
identifier
A unique identifier of a whole genome assigned by the NCBI.
Sequence profile metadata
bioinformatics
edam
data
beta12orEarlier
data
Annotation on a sequence profile such as its name, length, technical details about the profile or its construction, the biological role or annotation and so on.
Protein ID (TopDB)
data
identifiers
identifier
Unique identifier for a membrane protein from the TopDB database.
TopDB ID
bioinformatics
beta12orEarlier
edam
Gel identifier
data
identifier
bioinformatics
beta12orEarlier
Identifier of a two-dimensional (protein) gel.
identifiers
edam
Reference map name (SWISS-2DPAGE)
Name of a reference map gel from the SWISS-2DPAGE database.
identifiers
identifier
data
edam
beta12orEarlier
bioinformatics
Protein ID (PeroxiBase)
Unique identifier for a peroxidase protein from the PeroxiBase database.
beta12orEarlier
bioinformatics
identifier
data
PeroxiBase ID
identifiers
edam
SISYPHUS ID
bioinformatics
Identifier of an entry from the SISYPHUS database of tertiary structure alignments.
beta12orEarlier
identifiers
edam
identifier
data
ORF ID
Accession of an open reading frame (catalogued in a database).
bioinformatics
edam
identifiers
identifier
data
beta12orEarlier
ORF identifier
identifier
beta12orEarlier
An identifier of an open reading frame.
bioinformatics
edam
data
identifiers
Linucs ID
edam
beta12orEarlier
identifiers
Identifier of an entry from the GlycosciencesDB database.
data
identifier
bioinformatics
Protein ID (LGICdb)
identifiers
Unique identifier for a ligand-gated ion channel protein from the LGICdb database.
edam
LGICdb ID
data
bioinformatics
identifier
beta12orEarlier
MaizeDB ID
bioinformatics
data
identifiers
identifier
Identifier of an EST sequence from the MaizeDB database.
edam
beta12orEarlier
Gene ID (MfunGD)
identifier
A unique identifier of a gene in the MfunGD database.
data
bioinformatics
identifiers
beta12orEarlier
edam
Orpha number
identifiers
An identifier of a disease from the Orpha database.
data
bioinformatics
beta12orEarlier
identifier
edam
Protein ID (EcID)
identifier
bioinformatics
Unique identifier for a protein from the EcID database.
data
beta12orEarlier
edam
identifiers
Clone ID (RefSeq)
identifier
identifiers
edam
data
beta12orEarlier
bioinformatics
A unique identifier of a cDNA molecule catalogued in the RefSeq database.
Protein ID (ConoServer)
data
bioinformatics
Unique identifier for a cone snail toxin protein from the ConoServer database.
identifiers
beta12orEarlier
identifier
edam
GeneSNP ID
beta12orEarlier
identifier
identifiers
bioinformatics
data
edam
Identifier of a GeneSNP database entry.
Lipid identifier
bioinformatics
edam
beta12orEarlier
identifier
data
identifiers
Identifier of a lipid.
Databank
true
beta12orEarlier
edam
A flat-file (textual) data archive.
beta12orEarlier
data
bioinformatics
data
Web portal
true
beta12orEarlier
data
edam
bioinformatics
data
beta12orEarlier
A web site providing data (web pages) on a common theme to an HTTP client.
Gene ID (VBASE2)
edam
identifiers
VBASE2 ID
beta12orEarlier
identifier
bioinformatics
Identifier for a gene from the VBASE2 database.
data
DPVweb ID
bioinformatics
A unique identifier for a virus from the DPVweb database.
identifier
beta12orEarlier
edam
identifiers
data
DPVweb virus ID
Pathway ID (BioSystems)
data
edam
[0-9]+
bioinformatics
Identifier of a pathway from the BioSystems pathway database.
beta12orEarlier
identifier
identifiers
Experimental data (proteomics)
true
bioinformatics
edam
data
beta12orEarlier
beta12orEarlier
Data concerning a proteomics experiment.
data
Abstract
bioinformatics
data
An abstract of a scientific article.
edam
beta12orEarlier
data
Lipid structure
data
bioinformatics
3D coordinate and associated data for a lipid structure.
data
beta12orEarlier
edam
Drug structure
bioinformatics
3D coordinate and associated data for the (3D) structure of a drug.
data
beta12orEarlier
edam
data
Toxin structure
beta12orEarlier
data
3D coordinate and associated data for the (3D) structure of a toxin.
bioinformatics
data
edam
Position-specific scoring matrix
data
A simple matrix of numbers, where each value (or column of values) is derived from analysis of the corresponding position in a sequence alignment.
edam
bioinformatics
data
beta12orEarlier
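One simple way to derive such a per-position matrix from an alignment is to record residue frequencies column by column. The sketch below is an illustration only: it assumes an ungapped nucleotide alignment and a default alphabet of "ACGT", and it produces plain frequencies rather than the log-odds scores that many scoring matrices use. The function name is hypothetical.

```python
from typing import Dict, List, Sequence


def position_frequency_matrix(
    alignment: Sequence[str], alphabet: str = "ACGT"
) -> Dict[str, List[float]]:
    """Map each residue to its frequency at every alignment column."""
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    matrix: Dict[str, List[float]] = {}
    for residue in alphabet:
        # One value per column: the fraction of sequences carrying this residue there.
        matrix[residue] = [
            sum(1 for seq in alignment if seq[col] == residue) / len(alignment)
            for col in range(length)
        ]
    return matrix
```

Each list in the result corresponds to one row of the matrix, and index `i` of every row refers to position `i` of the alignment.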
Distance matrix
data
edam
data
beta12orEarlier
bioinformatics
A matrix of distances between molecular entities, where a value (distance) is (typically) derived from comparison of two entities and reflects their similarity.
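As an illustration of a distance derived from pairwise comparison, the sketch below builds a matrix of p-distances (the proportion of differing positions) between equal-length sequences. The choice of p-distance and the function names are assumptions for this example; the definition above does not prescribe a particular measure.

```python
from typing import List, Sequence


def p_distance(a: str, b: str) -> float:
    """Proportion of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(sequences: Sequence[str]) -> List[List[float]]:
    """Symmetric matrix of pairwise p-distances, with zeros on the diagonal."""
    return [[p_distance(a, b) for b in sequences] for a in sequences]
```

Smaller values indicate more similar entities, so the matrix reflects similarity inversely.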
Structural distance matrix
data
edam
bioinformatics
beta12orEarlier
Distances (values representing similarity) between a group of molecular structures.
data
Article metadata
Bibliographic data concerning scientific article(s).
data
edam
beta12orEarlier
data
bioinformatics
Ontology concept
bioinformatics
data
data
A concept from a biological ontology.
This includes any fields from the concept definition such as concept name, definition, comments and so on.
beta12orEarlier
edam
Codon usage bias
data
data
beta12orEarlier
A numerical measure of differences in the frequency of occurrence of synonymous codons in DNA sequences.
edam
bioinformatics
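One commonly used measure of this kind is relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons were used equally. The sketch below computes it for a single set of synonymous codons; RSCU is offered here only as an example of such a measure, and the function name and the in-frame reading of the input are assumptions.

```python
from collections import Counter
from typing import Dict, Sequence

# The six leucine codons of the standard genetic code.
LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")


def rscu(coding_sequence: str, synonymous_codons: Sequence[str]) -> Dict[str, float]:
    """Relative synonymous codon usage for one group of synonymous codons."""
    seq = coding_sequence.upper()
    # Read the sequence in frame from position 0, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if c in synonymous_codons)
    total = sum(counts.values())
    if total == 0:
        return {codon: 0.0 for codon in synonymous_codons}
    n = len(synonymous_codons)
    # observed / expected, where expected = total / n under equal usage
    return {codon: counts[codon] * n / total for codon in synonymous_codons}
```

A value of 1.0 would mean a codon is used exactly as often as expected under equal usage; values above or below 1.0 indicate bias.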
Experiment annotation (Northern blot)
data
bioinformatics
data
beta12orEarlier
edam
General annotation on a Northern Blot experiment.
Nucleic acid features (VNTR)
VNTRs occur in non-coding regions of DNA and consist of a sub-sequence that is repeated a multiple (and varied) number of times.
data
beta12orEarlier
edam
Variable number of tandem repeat polymorphism
bioinformatics
VNTR annotation
data
Annotation on a variable number of tandem repeat (VNTR) polymorphism in a DNA sequence.
Nucleic acid features (microsatellite)
bioinformatics
beta12orEarlier
A microsatellite polymorphism is a very short subsequence that is repeated a variable number of times between individuals. These repeats consist of the nucleotides cytosine and adenosine.
Microsatellite annotation
edam
data
Annotation on a microsatellite polymorphism in a DNA sequence.
data
Nucleic acid features (RFLP)
data
Annotation on a restriction fragment length polymorphism (RFLP) in a DNA sequence.
data
edam
An RFLP is defined by the presence or absence of a specific restriction site of a bacterial restriction enzyme.
RFLP annotation
bioinformatics
beta12orEarlier
Radiation hybrid map
data
beta12orEarlier
RH map
edam
data
A map showing distance between genetic markers estimated by radiation-induced breaks in a chromosome.
bioinformatics
The radiation method can break very closely linked markers, providing a more detailed map. Most genetic markers and subsequences may be located to a defined map position and with more precise estimates of distance than a linkage map.
ID list
edam
bioinformatics
data
beta12orEarlier
A simple list of data identifiers (such as database accessions), possibly with additional basic information on the addressed data.
data
Phylogenetic gene frequencies data
data
bioinformatics
data
Gene frequencies data that may be read during phylogenetic tree calculation.
beta12orEarlier
edam
Sequence set (polymorphic)
true
A set of sub-sequences displaying some type of polymorphism, typically indicating the sequence in which they occur, their position and other metadata.
bioinformatics
beta12orEarlier
data
edam
beta13
data
DRCAT resource
data
edam
beta12orEarlier
data
An entry (resource) from the DRCAT bioinformatics resource catalogue.
bioinformatics
Protein complex
bioinformatics
edam
data
3D coordinate and associated data for a multi-protein complex; two or more polypeptide chains in a stable, functional association with one another.
beta12orEarlier
data
Protein structural motif
data
3D coordinate and associated data for a protein (3D) structural motif; any group of contiguous or non-contiguous amino acid residues but typically those forming a feature with a structural or functional role.
edam
bioinformatics
beta12orEarlier
data
Lipid structure report
Annotation on or information derived from one or more specific lipid 3D structure(s).
bioinformatics
data
edam
data
beta12orEarlier
Secondary structure image
Image of one or more molecular secondary structures.
edam
bioinformatics
data
data
beta12orEarlier
Secondary structure report
edam
An informative report on general information, properties or features of one or more molecular secondary structures.
bioinformatics
data
beta12orEarlier
Secondary structure-derived report
data
DNA features
true
data
beta12orEarlier
bioinformatics
DNA sequence-specific feature annotation (not in a feature table).
beta12orEarlier
data
edam
Nucleic acid features (RNA features)
bioinformatics
RNA features
Features concerning RNA or regions of DNA that encode an RNA molecule.
beta12orEarlier
data
data
edam
Plot
true
beta12orEarlier
data
edam
bioinformatics
Biological data that is plotted as a graph of some type.
beta12orEarlier
data
Nucleic acid features (polymorphism annotation)
beta12orEarlier
Annotation on a polymorphism.
edam
Polymorphism annotation
data
data
bioinformatics
Sequence record (protein)
data
edam
Protein sequence record
bioinformatics
data
beta12orEarlier
A protein sequence and associated metadata.
Sequence record (nucleic acid)
edam
data
Nucleic acid sequence record
bioinformatics
A nucleic acid sequence and associated metadata.
Nucleotide sequence record
data
beta12orEarlier
Sequence record full (protein)
data
beta12orEarlier
data
bioinformatics
A protein sequence and comprehensive metadata (such as a feature table), typically corresponding to a full entry from a molecular sequence database.
SO:2000061
edam
Sequence record full (nucleic acid)
data
edam
SO:2000061
beta12orEarlier
A nucleic acid sequence and comprehensive metadata (such as a feature table), typically corresponding to a full entry from a molecular sequence database.
bioinformatics
data
Biological model accession
identifier
Accession of a mathematical model, typically an entry from a database.
beta12orEarlier
edam
identifiers
data
bioinformatics
Cell type name
beta12orEarlier
data
bioinformatics
identifiers
The name of a type or group of cells.
edam
identifier
Cell type accession
identifier
identifiers
Accession of a type or group of cells (catalogued in a database).
beta12orEarlier
data
bioinformatics
edam
Compound accession
Chemical compound accession
edam
beta12orEarlier
Accession of an entry from a database of chemicals.
identifier
data
identifiers
Small molecule accession
bioinformatics
Drug accession
data
beta12orEarlier
identifiers
edam
Accession of a drug.
identifier
bioinformatics
Toxin name
bioinformatics
edam
data
beta12orEarlier
identifier
Name of a toxin.
identifiers
Toxin accession
identifier
identifiers
bioinformatics
data
edam
Accession of a toxin (catalogued in a database).
beta12orEarlier
Monosaccharide accession
bioinformatics
identifier
beta12orEarlier
data
identifiers
edam
Accession of a monosaccharide (catalogued in a database).
Drug name
Common name of a drug.
identifiers
bioinformatics
edam
data
beta12orEarlier
identifier
Carbohydrate accession
bioinformatics
beta12orEarlier
edam
Accession of an entry from a database of carbohydrates.
data
identifier
identifiers
Molecule accession
bioinformatics
beta12orEarlier
data
identifiers
identifier
edam
Accession of a specific molecule (catalogued in a database).
Data resource definition accession
Accession of a data definition (catalogued in a database).
bioinformatics
beta12orEarlier
data
edam
identifier
identifiers
Genome accession
beta12orEarlier
An accession of a particular genome (in a database).
data
identifiers
edam
identifier
bioinformatics
Map accession
identifiers
beta12orEarlier
edam
An accession of a map of a molecular sequence (deposited in a database).
bioinformatics
data
identifier
Lipid accession
Accession of an entry from a database of lipids.
identifiers
data
edam
identifier
beta12orEarlier
bioinformatics
Peptide ID
Accession of a peptide deposited in a database.
identifier
identifiers
data
bioinformatics
beta12orEarlier
edam
Protein accession
Accession of a protein deposited in a database.
identifier
bioinformatics
data
beta12orEarlier
identifiers
edam
Organism accession
edam
beta12orEarlier
identifier
identifiers
data
An accession of annotation on a (group of) organisms (catalogued in a database).
bioinformatics
Organism name
identifiers
The name of an organism (or group of organisms).
data
Moby:OrganismsLongName
beta12orEarlier
edam
Moby:InfraspecificEpithet
Moby:FirstEpithet
Moby:BriefOccurrenceRecord
Moby:OccurrenceRecord
bioinformatics
Moby:OrganismsShortName
identifier
Moby:Organism_Name
Protein family accession
edam
bioinformatics
identifiers
beta12orEarlier
Accession of a protein family (that is deposited in a database).
data
identifier
Transcription factor accession
beta12orEarlier
edam
identifier
bioinformatics
data
Accession of an entry from a database of transcription factors or binding sites.
identifiers
Strain accession
identifier
edam
Identifier of a strain of an organism variant, typically a plant, virus or bacterium.
data
identifiers
bioinformatics
beta12orEarlier
Virus identifier
edam
identifiers
identifier
beta12orEarlier
An accession of annotation on a (group of) viruses (catalogued in a database).
data
bioinformatics
Sequence features metadata
data
edam
Metadata on sequence features.
data
bioinformatics
beta12orEarlier
Gramene identifier
bioinformatics
identifier
Identifier of a Gramene database entry.
data
identifiers
edam
beta12orEarlier
DDBJ accession
An identifier of an entry from the DDBJ sequence database.
identifier
bioinformatics
data
DDBJ identifier
DDBJ ID
edam
identifiers
beta12orEarlier
DDBJ accession number
ConsensusPathDB identifier
An identifier of an entity from the ConsensusPathDB database.
identifier
beta12orEarlier
data
edam
identifiers
bioinformatics
Sequence data
true
Data concerning molecular sequence(s).
bioinformatics
beta13
edam
data
data
beta12orEarlier
This is a broad data type and is used as a placeholder for other, more specific types.
Codon usage data
true
Data concerning codon usage.
beta12orEarlier
This is a broad data type and is used as a placeholder for other, more specific types.
bioinformatics
data
beta13
data
edam
Article report
data
Data concerning or derived from the analysis of a scientific article.
edam
bioinformatics
data
beta12orEarlier
Sequence report
edam
An informative report derived from molecular sequence analysis, including annotation on positional features (such as a feature table) or non-positional properties, and reports of general information (metadata).
bioinformatics
data
Sequence-derived report
data
beta12orEarlier
Protein secondary structure report
edam
data
data
beta12orEarlier
bioinformatics
An informative report about the properties or features of one or more protein secondary structures.
Hopp and Woods plot
data
bioinformatics
data
edam
beta12orEarlier
A Hopp and Woods plot of predicted antigenicity of a peptide or protein.
Nucleic acid melting curve
bioinformatics
beta12orEarlier
A melting curve of a double-stranded nucleic acid molecule (DNA or DNA/RNA).
Shows the proportion of nucleic acid which is double-stranded versus temperature.
edam
data
data
Nucleic acid probability profile
data
Shows the probability of a base pair not being melted (i.e. remaining as double-stranded DNA) at a specified temperature.
data
bioinformatics
A probability profile of a double-stranded nucleic acid molecule (DNA or DNA/RNA).
beta12orEarlier
edam
Nucleic acid temperature profile
data
Melting map
beta12orEarlier
edam
bioinformatics
data
Plots melting temperature versus base position.
A temperature profile of a double-stranded nucleic acid molecule (DNA or DNA/RNA).
Pathway or network (gene regulation)
beta12orEarlier
A report typically including a map (diagram) of a gene regulatory network.
data
data
bioinformatics
edam
2D PAGE image (annotated)
2D PAGE image annotation
edam
beta12orEarlier
bioinformatics
data
data
An informative report on a two-dimensional (2D PAGE) gel.
Oligonucleotide probe sets annotation
data
data
General annotation on a set of oligonucleotide probes, such as the gene name with which the probe set is associated and which probes belong to the set.
bioinformatics
beta12orEarlier
edam
Microarray image
edam
bioinformatics
An image from a microarray experiment which (typically) allows a visualisation of probe hybridisation and gene-expression data.
data
data
Gene expression image
beta12orEarlier
Image
data
data
edam
Biological or biomedical data that may be rendered, for example displayed on screen or plotted on a graph of some type.
beta12orEarlier
bioinformatics
Sequence image
bioinformatics
data
edam
beta12orEarlier
data
Image of a molecular sequence, possibly with sequence features or properties shown.
Protein hydropathy data
data
beta12orEarlier
edam
A report on protein properties concerning hydropathy.
Protein hydropathy report
data
bioinformatics
Workflow data
true
data
bioinformatics
beta12orEarlier
data
edam
Data concerning a computational workflow.
beta13
Workflow
A computational workflow.
data
bioinformatics
beta12orEarlier
edam
data
Secondary structure data
true
Data concerning molecular secondary structure.
bioinformatics
beta12orEarlier
beta13
data
data
edam
Raw sequence (protein)
edam
A raw protein sequence (string of characters).
bioinformatics
beta12orEarlier
data
data
Raw sequence (nucleic acid)
data
A raw nucleic acid sequence.
edam
data
beta12orEarlier
bioinformatics
Protein sequence
data
data
One or more protein sequences, possibly with associated annotation.
edam
bioinformatics
beta12orEarlier
Nucleic acid sequence
bioinformatics
beta12orEarlier
edam
One or more nucleic acid sequences, possibly with associated annotation.
data
data
Reaction data
bioinformatics
Reaction annotation
edam
data
This is a broad data type and is used as a placeholder for other, more specific types.
data
Enzyme kinetics annotation
Data concerning a biochemical reaction, typically data and more general annotation on the kinetics of an enzyme-catalysed reaction.
beta12orEarlier
Peptide property
data
Data concerning small peptides.
data
Peptide data
edam
beta12orEarlier
bioinformatics
Protein classification
bioinformatics
edam
beta12orEarlier
Data concerning the classification of protein sequences or structures.
data
Protein classification data
data
This is a broad data type and is used as a placeholder for other, more specific types.
Sequence motif data
true
This is a broad data type and is used as a placeholder for other, more specific types.
beta12orEarlier
beta13
bioinformatics
Data concerning specific or conserved patterns in molecular sequences.
data
data
edam
Sequence profile data
true
Data concerning models representing a (typically multiple) sequence alignment.
data
beta13
This is a broad data type and is used as a placeholder for other, more specific types.
bioinformatics
beta12orEarlier
data
edam
Pathway or network data
true
beta12orEarlier
data
beta13
data
bioinformatics
Data concerning a specific biological pathway or network.
edam
Pathway or network report
An informative report concerning or derived from the analysis of a biological pathway or network, such as a map (diagram) or annotation.
beta12orEarlier
bioinformatics
data
edam
data
Nucleic acid thermodynamic data
data
edam
data
beta12orEarlier
Nucleic acid thermodynamic property
A thermodynamic or kinetic property of a nucleic acid molecule.
Nucleic acid property (thermodynamic or kinetic)
bioinformatics
Nucleic acid classification
data
data
This is a broad data type and is used as a placeholder for other, more specific types.
edam
bioinformatics
Nucleic acid classification data
Data concerning the classification of nucleic acid sequences or structures.
beta12orEarlier
Classification
bioinformatics
data
This can include an entire classification, components such as classifiers, assignments of entities to a classification and so on.
edam
Classification data
data
Data concerning a classification of molecular sequences, structures or other entities.
beta12orEarlier
Protein features (key folding sites)
data
bioinformatics
beta12orEarlier
A report on key residues involved in protein folding.
data
edam
Protein torsion angle data
edam
data
beta12orEarlier
data
Torsion angle data for a protein structure.
bioinformatics
Torsion angle data
Protein structure image
Structure image (protein)
edam
data
bioinformatics
beta12orEarlier
data
An image of protein structure.
Phylogenetic character weights
data
beta12orEarlier
edam
Weights for sequence positions or characters in phylogenetic analysis where zero is defined as unweighted.
bioinformatics
data
Sequence annotation track
Genome track
Genome-browser track
Annotation of one particular positional feature on a biomolecular (typically genome) sequence, suitable for import and display in a genome browser.
Annotation track
Genomic track
data
data
edam
bioinformatics
Genome annotation track
beta12orEarlier
UniProt accession
UniProt accession number
identifier
bioinformatics
UniProtKB accession
Accession number of a UniProt (protein sequence) database entry.
[A-NR-Z][0-9][A-Z][A-Z0-9][A-Z0-9][0-9]|[OPQ][0-9][A-Z0-9][A-Z0-9][A-Z0-9][0-9]
identifiers
TrEMBL entry accession
Swiss-Prot entry accession
data
edam
UniProt entry accession
P43353|Q7M1G0|Q9C199|A5A6J6
beta12orEarlier
UniProtKB accession number
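The regular expression and the example accessions listed for this concept can be checked against each other with a short script. This is a minimal sketch; the names `UNIPROT_ACCESSION`, `is_uniprot_accession` and `EXAMPLES` are hypothetical, while the pattern and the example values are copied from the entry above.

```python
import re

# Pattern copied verbatim from the UniProt accession entry above.
UNIPROT_ACCESSION = re.compile(
    r"[A-NR-Z][0-9][A-Z][A-Z0-9][A-Z0-9][0-9]"
    r"|[OPQ][0-9][A-Z0-9][A-Z0-9][A-Z0-9][0-9]"
)

# Example values from the entry above, separated by "|".
EXAMPLES = "P43353|Q7M1G0|Q9C199|A5A6J6".split("|")


def is_uniprot_accession(value: str) -> bool:
    """True if the whole value matches the six-character pattern above."""
    return UNIPROT_ACCESSION.fullmatch(value) is not None
```

The pattern as listed only describes six-character accessions, so any longer accession string would be rejected by this check.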
NCBI genetic code ID
identifiers
identifier
Identifier of a genetic code in the NCBI list of genetic codes.
bioinformatics
data
beta12orEarlier
edam
16
[1-9][0-9]?
Ontology concept identifier
beta12orEarlier
identifier
Identifier of a concept in an ontology of biological or bioinformatics concepts and relations.
bioinformatics
data
identifiers
edam
GO concept name (biological process)
true
beta12orEarlier
beta12orEarlier
bioinformatics
The name of a concept for a biological process from the GO ontology.
identifiers
edam
data
identifier
GO concept name (molecular function)
true
edam
The name of a concept for a molecular function from the GO ontology.
identifier
identifiers
beta12orEarlier
bioinformatics
data
beta12orEarlier
Taxonomy
Taxonomic data
data
edam
beta12orEarlier
Data concerning the classification, identification and naming of organisms.
This is a broad data type and is used as a placeholder for other, more specific types.
data
bioinformatics
Protein ID (EMBL/GenBank/DDBJ)
data
identifiers
edam
This qualifier consists of a stable ID portion (3+5 format with 3 position letters and 5 numbers) plus a version number after the decimal point. When the protein sequence encoded by the CDS changes, only the version number of the /protein_id value is incremented; the stable part of the /protein_id remains unchanged and as a result will permanently be associated with a given protein; this qualifier is valid only on CDS features which translate into a valid protein.
EMBL/GENBANK/DDBJ coding feature protein identifier, issued by International collaborators.
beta13
identifier
bioinformatics
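The stable ID and version parts of such a protein identifier can be separated as sketched below, following the "3 letters and 5 numbers, then a version after the decimal point" description in the entry. The function names and the sample value in the usage note are hypothetical, and the regular expression covers only the 3+5 layout described here.

```python
import re
from typing import Optional, Tuple

# 3 letters + 5 digits (stable part), a dot, then the version number.
_PROTEIN_ID_RE = re.compile(r"^([A-Z]{3}[0-9]{5})\.([0-9]+)$")


def split_protein_id(protein_id: str) -> Optional[Tuple[str, int]]:
    """Return (stable_id, version), or None if the value does not fit the 3+5 layout."""
    match = _PROTEIN_ID_RE.match(protein_id)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def same_protein_record(a: str, b: str) -> bool:
    """True if two identifiers share the stable part, regardless of version."""
    pa, pb = split_protein_id(a), split_protein_id(b)
    return pa is not None and pb is not None and pa[0] == pb[0]
```

For a made-up value such as `CAA12345.2`, the stable part would be `CAA12345` and the version `2`; a later version of the same record would keep the stable part and increment only the version.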
Core data
beta13
data
data
edam
bioinformatics
A type of data that (typically) corresponds to entries from the primary biological databases and which is (typically) the primary input or output of a tool, i.e. the data the tool processes or generates, as distinct from metadata and identifiers which describe and identify such core data, parameters that control the behaviour of tools, reports of derivative data generated by tools and annotation.
Core data entities typically have a format and may be identified by an accession number.
Sequence feature identifier
Name or other identifier of molecular sequence feature(s).
data
beta13
identifier
edam
bioinformatics
identifiers
Structure identifier
bioinformatics
An identifier of a molecular tertiary structure, typically an entry from a structure database.
identifiers
identifier
edam
data
beta13
Matrix identifier
edam
beta13
identifier
identifiers
An identifier of an array of numerical values, such as a comparison matrix.
bioinformatics
data
Protein sequence composition
Sequence property (protein composition)
A report (typically a table) on character or word composition / frequency of protein sequence(s).
bioinformatics
data
data
beta13
edam
Nucleic acid sequence composition
beta13
A report (typically a table) on character or word composition / frequency of nucleic acid sequence(s).
bioinformatics
data
edam
Sequence property (nucleic acid composition)
data
Protein domain classification node
bioinformatics
edam
data
A node from a classification of protein structural domain(s).
beta13
data
CAS number
edam
Unique numerical identifier of chemicals in the scientific literature, as assigned by the Chemical Abstracts Service.
bioinformatics
identifier
data
CAS registry number
beta13
identifiers
ATC code
beta13
Unique identifier of a drug conforming to the Anatomical Therapeutic Chemical (ATC) Classification System, a drug classification system controlled by the WHO Collaborating Centre for Drug Statistics Methodology (WHOCC).
edam
bioinformatics
identifier
identifiers
data
UNII
Unique Ingredient Identifier
A unique, unambiguous, alphanumeric identifier of a chemical substance as catalogued by the Substance Registration System of the Food and Drug Administration (FDA).
beta13
identifier
bioinformatics
identifiers
data
edam
Geotemporal metadata
Basic information concerning geographical location or time.
data
data
beta13
edam
bioinformatics
System metadata
data
data
beta13
Metadata concerning the software, hardware or other aspects of a computer system.
edam
bioinformatics
Sequence feature name
beta13
edam
data
A name of a sequence feature, e.g. the name of a feature to be displayed to an end-user.
identifiers
identifier
bioinformatics
Experimental measurement
Measurement data
Raw experimental data
Measurement metadata
Experimentally measured data
Measured data
This is a broad data type and is used as a placeholder for other, more specific types. It is primarily intended to help navigation of EDAM and would not typically be used for annotation.
edam
Measurement
Experimental measurement data
bioinformatics
Raw data such as measurements or other results from laboratory experiments, as generated from laboratory hardware.
data
beta13
data
Raw microarray data
Such data as found in Affymetrix CEL or GPR files.
data
Raw data (typically MIAME-compliant) for hybridisations from a microarray experiment.
edam
bioinformatics
beta13
data
Processed microarray data
data
edam
Such data as found in Affymetrix .CHP files or data from other software such as RMA or dChip.
Gene expression report
Microarray probe set data
beta13
Gene annotation (expression)
data
bioinformatics
Data generated from processing and analysis of probe set data from a microarray experiment.
Normalised microarray data
Gene expression data matrix
The final processed (normalised) data for a set of hybridisations in a microarray experiment.
This combines data from all hybridisations.
beta13
edam
bioinformatics
data
Gene expression matrix
data
Sample annotation
data
bioinformatics
beta13
This might include compound and dose in a dose response experiment.
data
Annotation on a biological sample, for example experimental factors and their values.
edam
Microarray annotation
data
beta13
Annotation on the array itself used in a microarray experiment.
bioinformatics
edam
This might include gene identifiers, genomic coordinates, probe oligonucleotide sequences etc.
data
Microarray protocol annotation
Annotation on laboratory and/or data processing protocols used in a microarray experiment.
beta13
data
data
This might describe e.g. the normalisation methods used to process the raw data.
edam
bioinformatics
Microarray hybridisation data
beta13
data
Data concerning the hybridisations measured during a microarray experiment.
edam
bioinformatics
data
Protein features (topological domains)
bioinformatics
Summary of topological domains such as cytoplasmic regions in a protein.
beta13
Protein topological domains
data
data
edam
Sequence features (compositionally-biased regions)
A report of regions in a molecular sequence that are biased to certain characters.
edam
data
beta13
data
bioinformatics
Protein features (sequence variants)
data
A report on the protein sequence variants produced e.g. from alternative splicing, alternative promoter usage, alternative initiation and ribosomal frameshifting.
edam
bioinformatics
beta13
data
Nucleic acid features (difference and change)
data
edam
A report on features in a nucleic acid sequence that indicate changes to or differences between sequences.
data
bioinformatics
beta13
Nucleic acid features (expression signal)
data
data
edam
A report on regions within a nucleic acid sequence containing a signal that alters a biological function.
beta13
bioinformatics
Nucleic acid features (binding)
A report on regions of a nucleic acid sequence that bind some other molecule.
This includes ribosome binding sites (Shine-Dalgarno sequence in prokaryotes).
data
edam
beta13
data
bioinformatics
Nucleic acid features (repeats)
edam
beta13
bioinformatics
A report on repetitive elements within a nucleic acid sequence.
data
data
This includes long terminal repeats (LTRs), i.e. sequences (typically retroviral) directly repeated at both ends of a defined sequence, and other types of repeating unit.
Nucleic acid features (replication and recombination)
data
data
This includes binding sites for initiation of replication (origin of replication), regions where transfer is initiated during the conjugation or mobilization (origin of transfer), starting sites for DNA duplication (origin of replication) and regions which are eliminated through any kind of recombination.
bioinformatics
A report on regions within a nucleic acid sequence that are involved in DNA replication or recombination.
edam
beta13
Nucleic acid features (structure)
beta13
edam
bioinformatics
data
data
A report on regions within a nucleic acid sequence which form secondary or tertiary (3D) structures.
Protein features (repeats)
beta13
data
data
edam
bioinformatics
Location of short repetitive subsequences (repeat sequences) in a protein sequence.
Protein features (motifs)
Use this concept if another, more specific concept is not available.
edam
bioinformatics
beta13
Report on the location of matches to profiles, motifs (conserved or functional patterns) or other signatures in one or more protein sequences.
data
data
Nucleic acid features (motifs)
bioinformatics
data
edam
beta13
Report on the location of matches to profiles, motifs (conserved or functional patterns) or other signatures in one or more nucleic acid sequences.
Use this concept if another, more specific concept is not available.
data
Nucleic acid features (d-loop)
edam
data
A displacement loop is a region of mitochondrial DNA in which one of the strands is displaced by an RNA molecule.
data
A report on displacement loops in a mitochondrial DNA sequence.
bioinformatics
beta13
Nucleic acid features (stem loop)
A stem loop is a hairpin structure; a double-helical structure formed when two complementary regions of a single strand of an RNA or DNA molecule form base pairs.
bioinformatics
A report on stem loops in a DNA sequence.
data
data
beta13
edam
Nucleic acid features (mRNA features)
edam
bioinformatics
data
beta13
Features concerning messenger RNA (mRNA) molecules including precursor RNA, primary (unprocessed) transcript and fully processed molecules.
data
This includes 5' untranslated regions (5'UTR), coding sequences (CDS), exons, intervening sequences (introns) and 3' untranslated regions (3'UTR).
mRNA features
Nucleic acid features (signal or transit peptide)
data
data
beta13
bioinformatics
edam
A signal peptide coding sequence encodes an N-terminal domain of a secreted protein, which is involved in attaching the polypeptide to a membrane leader sequence. A transit peptide coding sequence encodes an N-terminal domain of a nuclear-encoded organellar protein, which is involved in import of the protein into the organelle.
A report on a coding sequence for a signal or transit peptide.
Nucleic acid features (non-coding RNA)
Non-coding RNA features
beta13
Features concerning non-coding or functional RNA molecules, including tRNA and rRNA.
data
edam
bioinformatics
ncRNA features
data
Nucleic acid features (transcriptional)
beta13
data
Features concerning transcription of DNA into RNA including the regulation of transcription.
data
edam
This includes promoters, CAAT signals, TATA signals, -35 signals, -10 signals, GC signals, primer binding sites for initiation of transcription or reverse transcription, enhancers, attenuators, terminators and ribosome binding sites.
bioinformatics
Nucleic acid features (STS)
beta13
A report on sequence tagged sites (STS) in nucleic acid sequences.
edam
bioinformatics
data
Sequence tagged sites are short DNA sequences that are unique within a genome and serve as mapping landmarks. Being detectable by PCR, they allow a genome to be mapped via an ordering of STSs.
data
Nucleic acid features (immunoglobulin gene structure)
A report on predicted or actual immunoglobulin gene structure including constant, switch and variable regions and diversity, joining and variable segments.
edam
data
beta13
data
bioinformatics
SCOP class
Information on a 'class' node from the SCOP database.
data
beta13
data
bioinformatics
edam
SCOP fold
edam
data
Information on a 'fold' node from the SCOP database.
beta13
data
bioinformatics
SCOP superfamily
bioinformatics
data
Information on a 'superfamily' node from the SCOP database.
data
beta13
edam
SCOP family
bioinformatics
beta13
Information on a 'family' node from the SCOP database.
data
data
edam
SCOP protein
data
bioinformatics
beta13
Information on a 'protein' node from the SCOP database.
edam
data
SCOP species
bioinformatics
data
Information on a 'species' node from the SCOP database.
beta13
edam
data
Experiment annotation (mass spectrometry)
General annotation on a mass spectrometry experiment.
beta13
bioinformatics
data
edam
data
Gene family annotation
data
data
edam
An informative report on a particular family of genes, typically a set of genes with similar sequence that originate from duplication of a common ancestor gene.
bioinformatics
beta13
Protein image
edam
An image of a protein.
bioinformatics
beta13
data
data
Protein alignment
data
edam
beta13
data
bioinformatics
An alignment of protein sequences and/or structures.
Experiment annotation (sequencing)
data
data
Data on a sequencing experiment, including samples, sampling, preparation, sequencing, and analysis.
bioinformatics
1.0
edam
Sequence assembly report
data
1.1
edam
bioinformatics
data
An informative report about a DNA sequence assembly.
Genome index
edam
An index of a genome sequence.
Many sequence alignment tasks involving many or very large sequences rely on a precomputed index of the sequence to accelerate the alignment.
1.1
data
data
bioinformatics
Experiment annotation (GWAS)
Experiment annotation (genome-wide association study)
data
Metadata on a genome-wide association study (GWAS).
1.1
data
edam
bioinformatics
Cytoband position
bioinformatics
edam
data
data
Information might include start and end position in a chromosome sequence, chromosome identifier, name of band and so on.
The position of a cytogenetic band in a genome.
1.2
Cell type ontology ID
identifiers
identifier
bioinformatics
beta12orEarlier
CL_[0-9]{7}
edam
data
Cell type ontology concept ID.
CL ID
1.2
Kinetic model
Mathematical model of a network that contains biochemical kinetics.
1.2
bioinformatics
data
data
edam
SMILES
formats
bioinformatics
format
Chemical structure specified in Simplified Molecular Input Line Entry System (SMILES) line notation.
edam
beta12orEarlier
InChI
format
bioinformatics
edam
beta12orEarlier
Chemical structure specified in IUPAC International Chemical Identifier (InChI) line notation.
formats
mf
format
The general MF query format consists of a series of valid atomic symbols, with an optional number or range.
bioinformatics
beta12orEarlier
edam
Chemical structure specified by Molecular Formula (MF), including a count of each element in a compound.
formats
inchikey
bioinformatics
An InChIKey identifier is not human-readable but is more suitable for web searches than an InChI chemical structure specification.
beta12orEarlier
format
The InChIKey (hashed InChI) is a fixed-length (27-character) condensed digital representation of an InChI chemical structure specification. It uniquely identifies a chemical compound.
edam
formats
smarts
bioinformatics
formats
edam
beta12orEarlier
SMILES ARbitrary Target Specification (SMARTS) format for chemical structure specification, which is an extension of the SMILES line notation.
format
unambiguous pure
bioinformatics
formats
format
edam
Alphabet for a molecular sequence with possible unknown positions but without ambiguity or non-sequence characters.
beta12orEarlier
nucleotide
Non-sequence characters may be used for example for gaps.
format
beta12orEarlier
formats
bioinformatics
Alphabet for a nucleotide sequence with possible ambiguity, unknown positions and non-sequence characters.
edam
protein
beta12orEarlier
bioinformatics
Alphabet for a protein sequence with possible ambiguity, unknown positions and non-sequence characters.
format
formats
edam
Non-sequence characters may be used for gaps and translation stop.
consensus
formats
Alphabet for the consensus of two or more molecular sequences.
bioinformatics
edam
beta12orEarlier
format
pure nucleotide
edam
Alphabet for a nucleotide sequence with possible ambiguity and unknown positions but without non-sequence characters.
formats
beta12orEarlier
bioinformatics
format
unambiguous pure nucleotide
formats
Alphabet for a nucleotide sequence (characters ACGTU only) with possible unknown positions but without ambiguity or non-sequence characters.
bioinformatics
beta12orEarlier
edam
format
dna
format
beta12orEarlier
formats
edam
bioinformatics
Alphabet for a DNA sequence with possible ambiguity, unknown positions and non-sequence characters.
rna
edam
Alphabet for an RNA sequence with possible ambiguity, unknown positions and non-sequence characters.
bioinformatics
beta12orEarlier
formats
format
unambiguous pure dna
formats
beta12orEarlier
format
edam
Alphabet for a DNA sequence (characters ACGT only) with possible unknown positions but without ambiguity or non-sequence characters.
bioinformatics
pure dna
format
beta12orEarlier
formats
edam
bioinformatics
Alphabet for a DNA sequence with possible ambiguity and unknown positions but without non-sequence characters.
unambiguous pure rna sequence
bioinformatics
beta12orEarlier
Alphabet for an RNA sequence (characters ACGU only) with possible unknown positions but without ambiguity or non-sequence characters.
formats
format
edam
pure rna
beta12orEarlier
edam
bioinformatics
Alphabet for an RNA sequence with possible ambiguity and unknown positions but without non-sequence characters.
format
formats
unambiguous pure protein
bioinformatics
Alphabet for any protein sequence with possible unknown positions but without ambiguity or non-sequence characters.
format
beta12orEarlier
edam
formats
pure protein
format
edam
formats
Alphabet for any protein sequence with possible ambiguity and unknown positions but without non-sequence characters.
beta12orEarlier
bioinformatics
UniGene entry format
true
formats
edam
beta12orEarlier
format
bioinformatics
A UniGene entry includes a set of transcript sequences assigned to the same transcription locus (gene or expressed pseudogene), with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.
Format of an entry from UniGene.
beta12orEarlier
COG sequence cluster format
true
edam
formats
Format of an entry from the COG database of clusters of (related) protein sequences.
bioinformatics
beta12orEarlier
format
beta12orEarlier
EMBL feature location
Format for sequence positions (feature location) as used in the DDBJ/EMBL/GenBank databases.
Feature location
beta12orEarlier
format
bioinformatics
edam
formats
quicktandem
beta12orEarlier
format
edam
bioinformatics
Report format for tandem repeats in a nucleotide sequence (format generated by the Sanger Centre quicktandem program).
formats
Sanger inverted repeats
Report format for inverted repeats in a nucleotide sequence (format generated by the Sanger Centre inverted program).
formats
edam
beta12orEarlier
format
bioinformatics
EMBOSS repeat
formats
bioinformatics
format
beta12orEarlier
Report format for tandem repeats in a sequence (an EMBOSS report format).
edam
est2genome format
Format of a report on exon-intron structure generated by EMBOSS est2genome.
edam
formats
bioinformatics
beta12orEarlier
format
restrict format
bioinformatics
beta12orEarlier
edam
format
formats
Report format for restriction enzyme recognition sites used by the EMBOSS restrict program.
restover format
Report format for restriction enzyme recognition sites used by the EMBOSS restover program.
beta12orEarlier
format
formats
edam
bioinformatics
REBASE restriction sites
Report format for restriction enzyme recognition sites used by the REBASE database.
format
edam
bioinformatics
formats
beta12orEarlier
FASTA search results format
format
bioinformatics
formats
This includes (typically) score data, alignment data and a histogram (of observed and expected distribution of E values).
beta12orEarlier
Format of results of a sequence database search using FASTA.
edam
BLAST results
formats
This includes score data, alignment data and summary table.
format
edam
beta12orEarlier
bioinformatics
Format of results of a sequence database search using some variant of BLAST.
mspcrunch
Format of results of a sequence database search using some variant of MSPCrunch.
bioinformatics
formats
beta12orEarlier
format
edam
Smith-Waterman format
formats
bioinformatics
Format of results of a sequence database search using some variant of Smith-Waterman.
beta12orEarlier
format
edam
dhf
bioinformatics
formats
format
Format of EMBASSY domain hits file (DHF) of hits (sequences) with domain classification information.
The hits are relatives of a SCOP or CATH family and are found from a search of a sequence database.
beta12orEarlier
edam
lhf
format
The hits are putative ligand-binding sequences and are found from a search of a sequence database.
Format of EMBASSY ligand hits file (LHF) of database hits (sequences) with ligand classification information.
edam
bioinformatics
formats
beta12orEarlier
InterPro hits format
edam
bioinformatics
beta12orEarlier
Results format for searches of the InterPro database.
formats
format
InterPro protein view report format
beta12orEarlier
edam
bioinformatics
formats
The report includes a classification of regions in a query protein sequence which are assigned to a known InterPro protein family or group.
Format of results of a search of the InterPro database showing matches of query protein sequence(s) to InterPro entries.
format
InterPro match table format
bioinformatics
edam
format
formats
The table presents matches between query proteins (rows) and signature methods (columns) for this entry. Alternatively the sequence(s) might be from the InterPro entry itself. The match position in the protein sequence and match status (true positive, false positive etc.) are indicated.
beta12orEarlier
Format of results of a search of the InterPro database showing matches between protein sequence(s) and signatures for an InterPro entry.
HMMER Dirichlet prior
Dirichlet distribution HMMER format.
beta12orEarlier
formats
bioinformatics
format
edam
MEME Dirichlet prior
bioinformatics
Dirichlet distribution MEME format.
beta12orEarlier
edam
formats
format
HMMER emission and transition
formats
bioinformatics
format
beta12orEarlier
Format of a report from the HMMER package on the emission and transition counts of a hidden Markov model.
edam
prosite-pattern
formats
edam
beta12orEarlier
Format of a regular expression pattern from the Prosite database.
bioinformatics
format
EMBOSS sequence pattern
beta12orEarlier
Format of an EMBOSS sequence pattern.
bioinformatics
edam
formats
format
meme-motif
formats
bioinformatics
format
edam
beta12orEarlier
A motif in the format generated by the MEME program.
prosite-profile
bioinformatics
beta12orEarlier
formats
format
edam
Sequence profile (sequence classifier) format used in the PROSITE database.
JASPAR format
formats
A profile (sequence classifier) in the format used in the JASPAR database.
bioinformatics
edam
beta12orEarlier
format
MEME background Markov model
beta12orEarlier
edam
format
formats
bioinformatics
Format of the model of random sequences used by MEME.
HMMER format
formats
format
bioinformatics
beta12orEarlier
Format of a hidden Markov model representation used by the HMMER package.
edam
HMMER-aln
formats
FASTA-style format for multiple sequences aligned by the HMMER package to an HMM.
edam
beta12orEarlier
format
bioinformatics
DIALIGN format
beta12orEarlier
edam
formats
bioinformatics
Format of multiple sequences aligned by the DIALIGN package.
format
daf
formats
beta12orEarlier
bioinformatics
format
edam
EMBASSY 'domain alignment file' (DAF) format, containing a sequence alignment of protein domains belonging to the same SCOP or CATH family.
The format is clustal-like and includes annotation of domain family classification information.
Sequence-MEME profile alignment
Format for alignment of molecular sequences to MEME profiles (position-dependent scoring matrices) as generated by the MAST tool from the MEME package.
beta12orEarlier
bioinformatics
formats
format
edam
HMMER profile alignment (sequences versus HMMs)
formats
beta12orEarlier
format
bioinformatics
Format used by the HMMER package for an alignment of a sequence against a hidden Markov model database.
edam
HMMER profile alignment (HMM versus sequences)
formats
beta12orEarlier
Format used by the HMMER package for an alignment of a hidden Markov model against a sequence database.
bioinformatics
edam
format
Phylip distance matrix
format
edam
Format of PHYLIP phylogenetic distance matrix data.
formats
beta12orEarlier
Data Type must include the distance matrix, probably as pairs of sequence identifiers with a distance (integer or float).
bioinformatics
ClustalW dendrogram
formats
edam
format
Dendrogram (tree file) format generated by ClustalW.
beta12orEarlier
bioinformatics
Phylip tree raw
formats
edam
beta12orEarlier
bioinformatics
Raw data file format used by Phylip from which a phylogenetic tree is directly generated or plotted.
format
Phylip continuous quantitative characters
PHYLIP file format for continuous quantitative character data.
bioinformatics
format
beta12orEarlier
formats
edam
Phylogenetic property values format
true
format
formats
bioinformatics
beta12orEarlier
beta12orEarlier
edam
Format of phylogenetic property data.
Phylip character frequencies format
edam
PHYLIP file format for phylogenetics character frequency data.
format
beta12orEarlier
bioinformatics
formats
Phylip discrete states format
bioinformatics
format
edam
formats
beta12orEarlier
Format of PHYLIP discrete states data.
Phylip cliques format
edam
Format of PHYLIP cliques data.
bioinformatics
formats
format
beta12orEarlier
Phylip tree format
beta12orEarlier
edam
format
formats
bioinformatics
Phylogenetic tree data format used by the PHYLIP program.
TreeBASE format
format
The format of an entry from the TreeBASE database of phylogenetic data.
beta12orEarlier
formats
bioinformatics
edam
TreeFam format
The format of an entry from the TreeFam database of phylogenetic data.
format
formats
edam
beta12orEarlier
bioinformatics
Phylip tree distance format
format
formats
beta12orEarlier
bioinformatics
edam
Format for distances, such as Branch Score distance, between two or more phylogenetic trees as used by the Phylip package.
dssp
Format of an entry from the DSSP database (Dictionary of Secondary Structure in Proteins).
bioinformatics
edam
formats
format
The DSSP database is built using the DSSP application which defines secondary structure, geometrical features and solvent exposure of proteins, given atomic coordinates in PDB format.
beta12orEarlier
hssp
beta12orEarlier
format
Entry format of the HSSP database (Homology-derived Secondary Structure in Proteins).
bioinformatics
formats
edam
Dot-bracket format
Vienna RNA secondary structure format
Format of RNA secondary structure in dot-bracket notation, originally generated by the Vienna RNA package/server.
beta12orEarlier
Vienna RNA format
edam
formats
format
bioinformatics
Vienna local RNA secondary structure format
beta12orEarlier
edam
bioinformatics
Format of local RNA secondary structure components with free energy values, generated by the Vienna RNA package/server.
formats
format
PDB database entry format
edam
format
Format of an entry (or part of an entry) from the PDB database.
beta12orEarlier
bioinformatics
PDB entry format
formats
PDB format
Entry format of PDB database in PDB format.
beta12orEarlier
bioinformatics
format
PDB
edam
formats
mmCIF
mmcif
edam
formats
beta12orEarlier
bioinformatics
Entry format of PDB database in mmCIF format.
format
PDBML
format
beta12orEarlier
formats
edam
bioinformatics
Entry format of PDB database in PDBML (XML) format.
Domainatrix 3D-1D scoring matrix format
true
formats
bioinformatics
Format of a matrix of 3D-1D scores used by the EMBOSS Domainatrix applications.
format
beta12orEarlier
beta12orEarlier
edam
aaindex
beta12orEarlier
format
formats
edam
bioinformatics
Amino acid index format used by the AAindex database.
IntEnz enzyme report format
true
bioinformatics
IntEnz is the master copy of the Enzyme Nomenclature, the recommendations of the NC-IUBMB on the Nomenclature and Classification of Enzyme-Catalysed Reactions.
edam
beta12orEarlier
beta12orEarlier
formats
format
Format of an entry from IntEnz (The Integrated Relational Enzyme Database).
BRENDA enzyme report format
true
edam
beta12orEarlier
beta12orEarlier
formats
Format of an entry from the BRENDA enzyme database.
format
bioinformatics
KEGG REACTION enzyme report format
true
beta12orEarlier
Format of an entry from the KEGG REACTION database of biochemical reactions.
format
edam
formats
bioinformatics
beta12orEarlier
KEGG ENZYME enzyme report format
true
beta12orEarlier
edam
Format of an entry from the KEGG ENZYME database.
format
beta12orEarlier
formats
bioinformatics
REBASE proto enzyme report format
true
format
bioinformatics
beta12orEarlier
beta12orEarlier
formats
edam
Format of an entry from the proto section of the REBASE enzyme database.
REBASE withrefm enzyme report format
true
formats
format
Format of an entry from the withrefm section of the REBASE enzyme database.
bioinformatics
beta12orEarlier
edam
beta12orEarlier
Pcons report format
edam
Pcons ranks protein models by assessing their quality based on the occurrence of recurring common three-dimensional structural patterns. Pcons returns a score reflecting the overall global quality and a score for each individual residue in the protein reflecting the local residue quality.
format
beta12orEarlier
formats
Format of output of the Pcons Model Quality Assessment Program (MQAP).
bioinformatics
ProQ report format
Format of output of the ProQ protein model quality predictor.
ProQ is a neural network-based predictor that predicts the quality of a protein model based on the number of structural features.
edam
formats
bioinformatics
beta12orEarlier
format
SMART domain assignment report format
true
beta12orEarlier
The SMART output file includes data on genetically mobile domains / analysis of domain architectures, including phyletic distributions, functional class, tertiary structures and functionally important residues.
Format of SMART domain assignment data.
beta12orEarlier
format
edam
bioinformatics
formats
BIND entry format
true
formats
edam
beta12orEarlier
beta12orEarlier
bioinformatics
format
Entry format for the BIND database of protein interaction.
IntAct entry format
true
formats
Entry format for the IntAct database of protein interaction.
bioinformatics
beta12orEarlier
format
edam
beta12orEarlier
InterPro entry format
true
format
This includes signature metadata, sequence references and a reference to the signature itself. There is normally a header (entry accession numbers and name), abstract, taxonomy information, example proteins etc. Each entry also includes a match list which gives a number of different views of the signature matches for the sequences in each InterPro entry.
beta12orEarlier
edam
formats
beta12orEarlier
Entry format for the InterPro database of protein signatures (sequence classifiers) and classified sequences.
bioinformatics
InterPro entry abstract format
true
format
beta12orEarlier
Entry format for the textual abstract of signatures in an InterPro entry and its protein matches.
formats
References are included and a functional inference is made where possible.
beta12orEarlier
edam
bioinformatics
Gene3D entry format
true
bioinformatics
beta12orEarlier
beta12orEarlier
edam
format
formats
Entry format for the Gene3D protein secondary database.
PIRSF entry format
true
formats
format
beta12orEarlier
bioinformatics
beta12orEarlier
Entry format for the PIRSF protein secondary database.
edam
PRINTS entry format
true
formats
edam
Entry format for the PRINTS protein secondary database.
bioinformatics
beta12orEarlier
beta12orEarlier
format
Panther Families and HMMs entry format
true
formats
bioinformatics
Entry format for the Panther library of protein families and subfamilies.
format
beta12orEarlier
edam
beta12orEarlier
Pfam entry format
true
edam
beta12orEarlier
formats
format
beta12orEarlier
bioinformatics
Entry format for the Pfam protein secondary database.
SMART entry format
true
beta12orEarlier
formats
edam
Entry format for the SMART protein secondary database.
beta12orEarlier
format
bioinformatics
Superfamily entry format
true
format
beta12orEarlier
Entry format for the Superfamily protein secondary database.
beta12orEarlier
formats
bioinformatics
edam
TIGRFam entry format
true
beta12orEarlier
bioinformatics
Entry format for the TIGRFam protein secondary database.
edam
formats
format
beta12orEarlier
ProDom entry format
true
beta12orEarlier
bioinformatics
formats
format
edam
beta12orEarlier
Entry format for the ProDom protein domain classification database.
FSSP entry format
true
beta12orEarlier
bioinformatics
Entry format for the FSSP database.
beta12orEarlier
formats
format
edam
findkm
A report format for the kinetics of enzyme-catalysed reaction(s) in a format generated by EMBOSS findkm. This includes Michaelis Menten plot, Hanes Woolf plot, Michaelis Menten constant (Km) and maximum velocity (Vmax).
formats
bioinformatics
edam
beta12orEarlier
format
Ensembl gene report format
true
formats
beta12orEarlier
format
Entry format of Ensembl genome database.
bioinformatics
beta12orEarlier
edam
DictyBase gene report format
true
beta12orEarlier
Entry format of DictyBase genome database.
edam
bioinformatics
beta12orEarlier
formats
format
CGD gene report format
true
Entry format of Candida Genome database.
beta12orEarlier
format
beta12orEarlier
bioinformatics
edam
formats
DragonDB gene report format
true
format
edam
beta12orEarlier
formats
Entry format of DragonDB genome database.
bioinformatics
beta12orEarlier
EcoCyc gene report format
true
bioinformatics
edam
beta12orEarlier
format
beta12orEarlier
formats
Entry format of EcoCyc genome database.
FlyBase gene report format
true
Entry format of FlyBase genome database.
bioinformatics
beta12orEarlier
beta12orEarlier
edam
formats
format
Gramene gene report format
true
bioinformatics
edam
Entry format of Gramene genome database.
beta12orEarlier
beta12orEarlier
format
formats
KEGG GENES gene report format
true
beta12orEarlier
formats
Entry format of KEGG GENES genome database.
beta12orEarlier
bioinformatics
edam
format
MaizeGDB gene report format
true
format
Entry format of the Maize genetics and genomics database (MaizeGDB).
edam
bioinformatics
beta12orEarlier
beta12orEarlier
formats
MGD gene report format
true
edam
formats
format
beta12orEarlier
beta12orEarlier
bioinformatics
Entry format of the Mouse Genome Database (MGD).
RGD gene report format
true
beta12orEarlier
format
Entry format of the Rat Genome Database (RGD).
beta12orEarlier
edam
bioinformatics
formats
SGD gene report format
true
beta12orEarlier
Entry format of the Saccharomyces Genome Database (SGD).
bioinformatics
format
edam
beta12orEarlier
formats
GeneDB gene report format
true
format
beta12orEarlier
formats
beta12orEarlier
bioinformatics
edam
Entry format of the Sanger GeneDB genome database.
TAIR gene report format
true
format
Entry format of The Arabidopsis Information Resource (TAIR) genome database.
edam
beta12orEarlier
bioinformatics
formats
beta12orEarlier
WormBase gene report format
true
edam
formats
beta12orEarlier
beta12orEarlier
bioinformatics
format
Entry format of the WormBase genomes database.
ZFIN gene report format
true
Entry format of the Zebrafish Information Network (ZFIN) genome database.
beta12orEarlier
formats
edam
beta12orEarlier
format
bioinformatics
TIGR gene report format
true
bioinformatics
beta12orEarlier
beta12orEarlier
edam
formats
Entry format of the TIGR genome database.
format
dbSNP polymorphism report format
true
format
bioinformatics
beta12orEarlier
Entry format for the dbSNP database.
formats
beta12orEarlier
edam
OMIM entry format
true
Format of an entry from the OMIM database of genotypes and phenotypes.
edam
beta12orEarlier
beta12orEarlier
formats
bioinformatics
format
HGVbase entry format
true
formats
beta12orEarlier
beta12orEarlier
bioinformatics
format
Format of a record from the HGVbase database of genotypes and phenotypes.
edam
HIVDB entry format
true
edam
Format of a record from the HIVDB database of genotypes and phenotypes.
formats
format
bioinformatics
beta12orEarlier
beta12orEarlier
KEGG DISEASE entry format
true
formats
bioinformatics
beta12orEarlier
Format of an entry from the KEGG DISEASE database.
format
edam
beta12orEarlier
Primer3 primer
edam
beta12orEarlier
Report format on PCR primers and hybridization oligos as generated by the Whitehead primer3 program.
bioinformatics
formats
format
ABI
formats
beta12orEarlier
edam
format
A format of raw sequence read data from an Applied Biosystems sequencing machine.
bioinformatics
mira
formats
beta12orEarlier
bioinformatics
format
Format of MIRA sequence trace information file.
edam
CAF
formats
beta12orEarlier
format
edam
bioinformatics
Common Assembly Format (CAF). A sequence assembly format including contigs, base-call qualities, and other metadata.
exp
edam
beta12orEarlier
formats
bioinformatics
format
Sequence assembly project file EXP format.
SCF
bioinformatics
edam
formats
Staden Chromatogram Files format (SCF) of base-called sequence reads, qualities, and other metadata.
beta12orEarlier
format
PHD
bioinformatics
beta12orEarlier
formats
format
PHD sequence trace format to store serialised chromatogram data (reads).
edam
dat
beta12orEarlier
edam
Affymetrix image data file format
Format of Affymetrix data file of raw image data.
bioinformatics
formats
format
cel
Format of Affymetrix data file of information about (raw) expression levels of the individual probes.
Affymetrix probe raw data format
beta12orEarlier
bioinformatics
format
edam
formats
affymetrix
bioinformatics
format
beta12orEarlier
Format of Affymetrix gene cluster files (hc-genes.txt, hc-chips.txt) from hierarchical clustering.
edam
formats
ArrayExpress entry format
true
beta12orEarlier
format
Entry format for the ArrayExpress microarrays database.
edam
bioinformatics
formats
beta12orEarlier
affymetrix-exp
beta12orEarlier
edam
format
formats
Affymetrix experimental conditions data file format
Affymetrix data file format for information about experimental conditions and protocols.
bioinformatics
CHP
Format of Affymetrix data file of information about (normalised) expression levels of the individual probes.
format
edam
formats
bioinformatics
beta12orEarlier
Affymetrix probe normalised data format
EMDB entry format
true
edam
beta12orEarlier
Format of an entry from the Electron Microscopy DataBase (EMDB).
bioinformatics
formats
beta12orEarlier
format
KEGG PATHWAY entry format
true
format
The format of an entry from the KEGG PATHWAY database of pathway maps for molecular interactions and reaction networks.
formats
beta12orEarlier
edam
bioinformatics
beta12orEarlier
MetaCyc entry format
true
beta12orEarlier
The format of an entry from the MetaCyc metabolic pathways database.
beta12orEarlier
bioinformatics
formats
format
edam
HumanCyc entry format
true
bioinformatics
beta12orEarlier
format
beta12orEarlier
edam
formats
The format of a report from the HumanCyc metabolic pathways database.
INOH entry format
true
beta12orEarlier
bioinformatics
The format of an entry from the INOH signal transduction pathways database.
format
formats
edam
beta12orEarlier
PATIKA entry format
true
edam
beta12orEarlier
format
formats
The format of an entry from the PATIKA biological pathways database.
beta12orEarlier
bioinformatics
Reactome entry format
true
beta12orEarlier
The format of an entry from the Reactome biological pathways database.
beta12orEarlier
format
formats
edam
bioinformatics
aMAZE entry format
true
beta12orEarlier
edam
The format of an entry from the aMAZE biological pathways and molecular interactions database.
format
bioinformatics
formats
beta12orEarlier
CPDB entry format
true
beta12orEarlier
edam
beta12orEarlier
The format of an entry from the CPDB database.
bioinformatics
format
formats
Panther Pathways entry format
true
format
bioinformatics
The format of an entry from the Panther Pathways database.
edam
beta12orEarlier
formats
beta12orEarlier
Taverna workflow format
formats
bioinformatics
beta12orEarlier
format
Format of Taverna workflows.
edam
BioModel mathematical model format
true
Format of mathematical models from the BioModel database.
format
beta12orEarlier
Models are annotated and linked to relevant data resources, such as publications, databases of compounds and pathways, controlled vocabularies, etc.
bioinformatics
formats
edam
beta12orEarlier
KEGG LIGAND entry format
true
format
bioinformatics
beta12orEarlier
formats
beta12orEarlier
The format of an entry from the KEGG LIGAND chemical database.
edam
KEGG COMPOUND entry format
true
bioinformatics
beta12orEarlier
format
formats
The format of an entry from the KEGG COMPOUND database.
beta12orEarlier
edam
KEGG PLANT entry format
true
edam
The format of an entry from the KEGG PLANT database.
bioinformatics
beta12orEarlier
format
beta12orEarlier
formats
KEGG GLYCAN entry format
true
beta12orEarlier
edam
bioinformatics
The format of an entry from the KEGG GLYCAN database.
formats
beta12orEarlier
format
PubChem entry format
true
edam
bioinformatics
beta12orEarlier
formats
beta12orEarlier
format
The format of an entry from PubChem.
ChemSpider entry format
true
edam
bioinformatics
beta12orEarlier
The format of an entry from a database of chemical structures and property predictions.
formats
format
beta12orEarlier
ChEBI entry format
true
ChEBI includes an ontological classification defining relations between entities or classes of entities.
The format of an entry from Chemical Entities of Biological Interest (ChEBI).
formats
beta12orEarlier
edam
bioinformatics
beta12orEarlier
format
MSDchem ligand dictionary entry format
true
beta12orEarlier
bioinformatics
formats
beta12orEarlier
The format of an entry from the MSDchem ligand dictionary.
edam
format
HET group dictionary entry format
bioinformatics
beta12orEarlier
format
formats
edam
The format of an entry from the HET group dictionary (HET groups from PDB files).
KEGG DRUG entry format
true
formats
beta12orEarlier
beta12orEarlier
format
edam
bioinformatics
The format of an entry from the KEGG DRUG database.
PubMed citation
formats
Format of bibliographic reference as used by the PubMed database.
bioinformatics
beta12orEarlier
edam
format
Medline Display Format
format
edam
bioinformatics
Bibliographic reference information, including citation information, is included.
formats
Format for abstracts of scientific articles from the Medline database.
beta12orEarlier
CiteXplore-core
edam
formats
CiteXplore 'core' citation format including title, journal, authors and abstract.
format
beta12orEarlier
bioinformatics
CiteXplore-all
edam
beta12orEarlier
formats
bioinformatics
format
CiteXplore 'all' citation format includes all known details such as MeSH terms and cross-references.
pmc
formats
format
Article format of the PubMed Central database.
edam
beta12orEarlier
bioinformatics
iHOP text mining abstract format
format
iHOP abstract format.
bioinformatics
edam
formats
beta12orEarlier
Oscar3
edam
bioinformatics
beta12orEarlier
formats
Text mining abstract format from the Oscar 3 application.
format
Oscar 3 performs chemistry-specific parsing of chemical documents. It attempts to identify chemical names, ontology concepts and chemical data from a document.
PDB atom record format
true
Format of an ATOM record (describing data for an individual atom) from a PDB file.
format
beta12orEarlier
bioinformatics
formats
edam
beta13
CATH chain report format
true
beta12orEarlier
Format of CATH domain classification information for a polypeptide chain.
beta12orEarlier
edam
The report (for example http://www.cathdb.info/chain/1cukA) includes chain identifiers, domain identifiers and CATH codes for domains in a given protein chain.
format
formats
bioinformatics
CATH PDB report format
true
The report (for example http://www.cathdb.info/pdb/1cuk) includes chain identifiers, domain identifiers and CATH codes for domains in a given PDB file.
Format of CATH domain classification information for a protein PDB file.
beta12orEarlier
formats
bioinformatics
format
edam
beta12orEarlier
NCBI gene report format
true
formats
beta12orEarlier
beta12orEarlier
Entry (gene) format of the NCBI database.
bioinformatics
format
edam
GeneIlluminator gene report format
true
Moby:GI_Gene
beta12orEarlier
bioinformatics
format
edam
formats
beta12orEarlier
This includes a gene name and abbreviation of the name which may be in a name space indicating the gene status and relevant organisation.
Report format for biological functions associated with a gene name and its alternative names (synonyms, homonyms), as generated by the GeneIlluminator service.
BacMap gene card format
true
format
edam
beta12orEarlier
Moby:BacMapGeneCard
beta12orEarlier
formats
bioinformatics
Format of a report on the DNA and protein sequences for a given gene label from a bacterial chromosome map in the BacMap database.
ColiCard report format
true
beta12orEarlier
beta12orEarlier
formats
bioinformatics
Format of a report on Escherichia coli genes, proteins and molecules from the CyberCell Database (CCDB).
format
Moby:ColiCard
edam
PlasMapper TextMap
Map of a plasmid (circular DNA) in PlasMapper TextMap format.
bioinformatics
beta12orEarlier
format
edam
formats
newick
Phylogenetic tree Newick (text) format.
format
formats
beta12orEarlier
bioinformatics
edam
TreeCon format
Phylogenetic tree TreeCon (text) format.
bioinformatics
beta12orEarlier
edam
format
formats
Nexus format
formats
beta12orEarlier
format
Phylogenetic tree Nexus (text) format.
edam
bioinformatics
beta12orEarlier
formats
bioinformatics
edam
format
Format
A defined way or layout of representing and structuring data in a computer file, blob, string, message, or elsewhere.
The main focus in EDAM lies on formats as means of structuring data exchanged between different tools or resources. The serialisation, compression, or encoding of concrete data formats/models is not in scope of EDAM. Format 'is format of' Data.
Exchange format
Data format
Data model
File format
A defined data format has its implicit or explicit data model, and EDAM does not distinguish the two. Some data models, however, do not have any standard way of serialisation into an exchange format, and those are thus not considered formats in EDAM. (Remark: an even broader, or closely related, term to 'Data model' would be 'Information model'.)
Data model
File format denotes only formats of a computer file, but the same formats apply also to data blobs or exchanged messages.
File format
Closely related concept focusing on the specification of a data format.
GFO 'Perpetuant' is in general broader than format, but it may be seen narrower in the sense of being a concrete individual and in the way of exhibiting presentials.
'Compression and encoding' defines additional 'formatting' and/or encoding on top of the primary format.
BFO 'quality' is narrower in the sense that it is a 'dependent_continuant' (snap:DependentContinuant), and broader in the sense that it is any quality not just the data format.
Format can be a quality of a data record.
Atomic data format
true
format
bioinformatics
edam
Data format for an individual atom.
formats
beta13
beta12orEarlier
Sequence record format
formats
beta12orEarlier
edam
Data format for a molecular sequence record.
bioinformatics
format
Sequence feature annotation format
bioinformatics
Data format for molecular sequence feature information.
edam
formats
beta12orEarlier
format
Alignment format
beta12orEarlier
Data format for molecular sequence alignment information.
edam
formats
format
bioinformatics
acedb
format
formats
ACEDB sequence format.
beta12orEarlier
bioinformatics
edam
clustal sequence format
true
beta12orEarlier
formats
edam
ClustalW output format.
beta12orEarlier
bioinformatics
format
codata
beta12orEarlier
format
Codata entry format.
bioinformatics
formats
edam
dbid
bioinformatics
formats
format
edam
beta12orEarlier
FASTA format variant with database name before ID.
EMBL format
bioinformatics
EMBL
EMBL sequence format
EMBL entry format.
beta12orEarlier
edam
formats
format
Staden experiment format
formats
edam
beta12orEarlier
format
Staden experiment file format.
bioinformatics
FASTA format
FASTA sequence format
beta12orEarlier
format
FASTA format including NCBI-style IDs.
edam
formats
FASTA
bioinformatics
FASTQ
formats
beta12orEarlier
edam
bioinformatics
format
FASTQ short read format ignoring quality scores.
FASTQ-illumina
format
formats
beta12orEarlier
bioinformatics
FASTQ Illumina 1.3 short read format.
edam
FASTQ-sanger
beta12orEarlier
formats
edam
bioinformatics
format
FASTQ short read format with phred quality.
FASTQ-solexa
beta12orEarlier
formats
edam
bioinformatics
format
FASTQ Solexa/Illumina 1.0 short read format.
fitch program
edam
beta12orEarlier
formats
Fitch program format.
bioinformatics
format
gcg
formats
edam
format
bioinformatics
beta12orEarlier
GCG sequence format.
GenBank format
format
bioinformatics
formats
GenBank entry format.
beta12orEarlier
edam
genpept
GenPept protein entry format.
format
Currently identical to refseqp format
edam
bioinformatics
beta12orEarlier
formats
GFF2-seq
bioinformatics
format
beta12orEarlier
formats
GFF feature file format with sequence in the header.
edam
GFF3-seq
formats
edam
beta12orEarlier
GFF3 feature file format with sequence.
bioinformatics
format
giFASTA format
beta12orEarlier
bioinformatics
format
formats
FASTA sequence format including NCBI-style GIs.
edam
hennig86
edam
Hennig86 output sequence format.
bioinformatics
beta12orEarlier
format
formats
ig
beta12orEarlier
format
formats
bioinformatics
Intelligenetics sequence format.
edam
igstrict
format
Intelligenetics sequence format (strict version).
beta12orEarlier
formats
edam
bioinformatics
jackknifer
formats
format
beta12orEarlier
bioinformatics
edam
Jackknifer interleaved and non-interleaved sequence format.
mase format
Mase program sequence format.
edam
beta12orEarlier
bioinformatics
format
formats
mega-seq
format
Mega interleaved and non-interleaved sequence format.
edam
beta12orEarlier
formats
bioinformatics
msf
edam
format
formats
GCG MSF (multiple sequence file) file format.
beta12orEarlier
bioinformatics
nbrf
bioinformatics
edam
formats
beta12orEarlier
format
NBRF/PIR entry sequence format.
nexus-seq
edam
bioinformatics
formats
Nexus/paup interleaved sequence format.
format
beta12orEarlier
pdbatom
PDB sequence format (ATOM lines).
bioinformatics
beta12orEarlier
formats
format
pdb format in EMBOSS.
edam
pdbatomnuc
formats
format
pdbnuc format in EMBOSS.
beta12orEarlier
PDB nucleotide sequence format (ATOM lines).
bioinformatics
edam
pdbseqresnuc
edam
beta12orEarlier
PDB nucleotide sequence format (SEQRES lines).
format
formats
bioinformatics
pdbnucseq format in EMBOSS.
pdbseqres
bioinformatics
edam
pdbseq format in EMBOSS.
beta12orEarlier
PDB sequence format (SEQRES lines).
format
formats
Pearson format
Plain old FASTA sequence format (unspecified format for IDs).
edam
bioinformatics
format
beta12orEarlier
formats
phylip sequence format
true
bioinformatics
beta12orEarlier
formats
beta12orEarlier
Phylip interleaved sequence format.
edam
format
phylipnon sequence format
true
bioinformatics
formats
edam
beta12orEarlier
beta12orEarlier
Phylip non-interleaved sequence format.
format
raw
format
Raw sequence format with no non-sequence characters.
beta12orEarlier
bioinformatics
formats
edam
refseqp
Currently identical to genpept format
bioinformatics
format
formats
edam
RefSeq protein entry sequence format.
beta12orEarlier
selex sequence format
true
Selex sequence format.
edam
beta12orEarlier
beta12orEarlier
format
formats
bioinformatics
Staden format
edam
beta12orEarlier
Staden suite sequence format.
formats
bioinformatics
format
Stockholm format
formats
Stockholm multiple sequence alignment format (used by Pfam and Rfam).
edam
format
bioinformatics
beta12orEarlier
strider format
formats
beta12orEarlier
DNA Strider output sequence format.
edam
format
bioinformatics
UniProtKB format
beta12orEarlier
UniProtKB entry sequence format.
formats
format
bioinformatics
edam
plain text format (unformatted)
formats
Plain text sequence format (essentially unformatted).
beta12orEarlier
edam
format
bioinformatics
treecon sequence format
true
TreeCon output sequence format.
edam
bioinformatics
format
beta12orEarlier
beta12orEarlier
formats
ASN.1 sequence format
format
bioinformatics
beta12orEarlier
NCBI ASN.1-based sequence format.
formats
edam
DAS format
formats
edam
bioinformatics
beta12orEarlier
format
DAS sequence (XML) format (any type).
das sequence format
dasdna
format
formats
bioinformatics
The use of this format is deprecated.
edam
DAS sequence (XML) format (nucleotide-only).
beta12orEarlier
debug-seq
format
edam
bioinformatics
beta12orEarlier
EMBOSS debugging trace sequence format of full internal data content.
formats
jackknifernon
bioinformatics
format
Jackknifer output sequence non-interleaved format.
beta12orEarlier
edam
formats
meganon sequence format
true
formats
edam
bioinformatics
Mega non-interleaved output sequence format.
format
beta12orEarlier
beta12orEarlier
NCBI format
beta12orEarlier
There are several variants of this.
bioinformatics
NCBI FASTA sequence format with NCBI-style IDs.
edam
formats
format
nexusnon
bioinformatics
beta12orEarlier
edam
Nexus/paup non-interleaved sequence format.
formats
format
GFF2
format
formats
edam
General Feature Format (GFF) of sequence features.
bioinformatics
beta12orEarlier
GFF3
Generic Feature Format version 3 (GFF3) of sequence features.
format
formats
beta12orEarlier
bioinformatics
edam
pir
format
beta12orEarlier
formats
PIR feature format.
bioinformatics
edam
swiss feature
true
bioinformatics
beta12orEarlier
edam
beta12orEarlier
format
Swiss-Prot feature format.
formats
DASGFF
edam
format
beta12orEarlier
DASGFF feature
bioinformatics
formats
DAS GFF (XML) feature format.
das feature
debug-feat
bioinformatics
format
edam
formats
beta12orEarlier
EMBOSS debugging trace feature format of full internal data content.
EMBL feature
true
beta12orEarlier
edam
format
formats
bioinformatics
beta12orEarlier
EMBL feature format.
GenBank feature
true
bioinformatics
edam
beta12orEarlier
format
GenBank feature format.
beta12orEarlier
formats
ClustalW format
bioinformatics
ClustalW format for (aligned) sequences.
formats
format
beta12orEarlier
edam
debug
beta12orEarlier
formats
edam
format
bioinformatics
EMBOSS alignment format for debugging trace of full internal data content.
FASTA-aln
edam
FASTA format for (aligned) sequences.
bioinformatics
format
beta12orEarlier
formats
markx0
Pearson MARKX0 alignment format.
beta12orEarlier
format
formats
edam
bioinformatics
markx1
edam
beta12orEarlier
bioinformatics
format
Pearson MARKX1 alignment format.
formats
markx10
bioinformatics
format
formats
Pearson MARKX10 alignment format.
beta12orEarlier
edam
markx2
format
bioinformatics
Pearson MARKX2 alignment format.
beta12orEarlier
formats
edam
markx3
Pearson MARKX3 alignment format.
format
formats
beta12orEarlier
edam
bioinformatics
match
edam
beta12orEarlier
bioinformatics
format
formats
Alignment format for start and end of matches between sequence pairs.
mega
format
beta12orEarlier
formats
Mega format for (typically aligned) sequences.
bioinformatics
edam
meganon
beta12orEarlier
Mega non-interleaved format for (typically aligned) sequences.
format
formats
bioinformatics
edam
msf alignment format
true
bioinformatics
formats
MSF format for (aligned) sequences.
beta12orEarlier
edam
format
beta12orEarlier
nexus alignment format
true
edam
format
formats
beta12orEarlier
beta12orEarlier
bioinformatics
Nexus/paup format for (aligned) sequences.
nexusnon alignment format
true
format
edam
beta12orEarlier
Nexus/paup non-interleaved format for (aligned) sequences.
bioinformatics
formats
beta12orEarlier
pair
format
bioinformatics
beta12orEarlier
formats
EMBOSS simple sequence pair alignment format.
edam
Phylip format
formats
bioinformatics
edam
Phylip format for (aligned) sequences.
beta12orEarlier
format
Phylipnon
formats
bioinformatics
Phylip non-interleaved format for (aligned) sequences.
beta12orEarlier
edam
format
scores format
edam
formats
Alignment format for score values for pairs of sequences.
beta12orEarlier
bioinformatics
format
selex
SELEX format for (aligned) sequences.
formats
beta12orEarlier
edam
format
bioinformatics
EMBOSS simple format
formats
edam
beta12orEarlier
format
bioinformatics
EMBOSS simple multiple alignment format.
srs format
Simple multiple sequence (alignment) format for SRS.
formats
bioinformatics
format
beta12orEarlier
edam
srspair
bioinformatics
edam
formats
format
Simple sequence pair (alignment) format for SRS.
beta12orEarlier
T-Coffee format
bioinformatics
beta12orEarlier
formats
T-Coffee program alignment format.
edam
format
TreeCon-seq
beta12orEarlier
formats
TreeCon format for (aligned) sequences.
edam
format
bioinformatics
Phylogenetic tree format
bioinformatics
beta12orEarlier
Data format for a phylogenetic tree.
format
formats
edam
Biological pathway or network format
Data format for a biological pathway or network.
bioinformatics
format
formats
edam
beta12orEarlier
Sequence-profile alignment format
format
bioinformatics
Data format for a sequence-profile alignment.
edam
beta12orEarlier
formats
Sequence-profile alignment (HMM) format
true
Data format for a sequence-HMM profile alignment.
formats
bioinformatics
beta12orEarlier
edam
format
beta12orEarlier
Amino acid index format
bioinformatics
beta12orEarlier
edam
formats
format
Data format for an amino acid index.
Article format
formats
format
Data format for a full-text scientific article.
bioinformatics
edam
Literature format
beta12orEarlier
Text mining report format
formats
Data format for an abstract (report) from text mining.
format
beta12orEarlier
edam
bioinformatics
Enzyme kinetics report format
bioinformatics
edam
format
formats
beta12orEarlier
Data format for reports on enzyme kinetics.
Small molecule report format
format
Format of a report on a chemical compound.
bioinformatics
Chemical compound annotation format
edam
formats
beta12orEarlier
Gene annotation format
bioinformatics
format
formats
beta12orEarlier
Format of a report on a particular locus, gene, gene system or groups of genes.
edam
Workflow format
formats
Format of a workflow.
beta12orEarlier
edam
format
bioinformatics
Tertiary structure format
edam
Data format for a molecular tertiary structure.
beta12orEarlier
formats
bioinformatics
format
Biological model format
true
formats
edam
1.2
beta12orEarlier
bioinformatics
format
Data format for a biological model.
Chemical formula format
bioinformatics
edam
format
formats
Text format of a chemical formula.
beta12orEarlier
Phylogenetic character data format
bioinformatics
Format of raw (unplotted) phylogenetic data.
format
formats
edam
beta12orEarlier
Phylogenetic continuous quantitative character format
beta12orEarlier
edam
Format of phylogenetic continuous quantitative character data.
bioinformatics
formats
format
Phylogenetic discrete states format
edam
format
beta12orEarlier
bioinformatics
Format of phylogenetic discrete states data.
formats
Phylogenetic tree report (cliques) format
format
beta12orEarlier
bioinformatics
Format of phylogenetic cliques data.
formats
edam
Phylogenetic tree report (invariants) format
formats
format
Format of phylogenetic invariants data.
edam
bioinformatics
beta12orEarlier
Electron microscopy model format
true
bioinformatics
Annotation format for electron microscopy models.
format
beta12orEarlier
beta12orEarlier
formats
edam
Phylogenetic tree report (tree distances) format
edam
bioinformatics
formats
beta12orEarlier
format
Format for phylogenetic tree distance data.
Polymorphism report format
true
formats
format
1.0
Format for sequence polymorphism data.
bioinformatics
beta12orEarlier
edam
Protein family report format
formats
edam
format
bioinformatics
beta12orEarlier
Format for reports on a protein family.
Molecular interaction format
Format for molecular interaction data.
bioinformatics
beta12orEarlier
edam
format
formats
Sequence assembly format
Format for sequence assembly data.
bioinformatics
formats
beta12orEarlier
format
edam
Microarray experiment data format
Format for microarray experimental data.
beta12orEarlier
formats
edam
format
bioinformatics
Sequence trace format
format
beta12orEarlier
edam
formats
bioinformatics
Format for sequence trace data (i.e. including base call information).
Gene expression report format
beta12orEarlier
bioinformatics
edam
Format for a report on gene expression.
format
formats
Genotype and phenotype annotation format
true
bioinformatics
beta12orEarlier
format
edam
Format of a report on genotype / phenotype information.
beta12orEarlier
formats
Map format
bioinformatics
format
beta12orEarlier
formats
edam
Format of a map of (typically one) molecular sequence annotated with features.
Nucleic acid features (primers) format
formats
Format of a report on PCR primers or hybridization oligos in a nucleic acid sequence.
format
edam
beta12orEarlier
bioinformatics
Protein report format
beta12orEarlier
format
formats
bioinformatics
edam
Format of a report of general information about a specific protein.
Protein report (enzyme) format
true
bioinformatics
format
edam
beta12orEarlier
beta12orEarlier
Format of a report of general information about a specific enzyme.
formats
3D-1D scoring matrix format
formats
Format of a matrix of 3D-1D scores (amino acid environment probabilities).
beta12orEarlier
format
edam
bioinformatics
Protein structure report (quality evaluation) format
edam
formats
beta12orEarlier
Format of a report on the quality of a protein three-dimensional model.
bioinformatics
format
Database hits (sequence) format
beta12orEarlier
formats
format
Format of a report on sequence hits and associated data from searching a sequence database.
bioinformatics
edam
Sequence distance matrix format
Format of a matrix of genetic distances between molecular sequences.
beta12orEarlier
bioinformatics
format
edam
formats
Sequence motif format
beta12orEarlier
format
Format of a sequence motif.
edam
bioinformatics
formats
Sequence profile format
edam
formats
bioinformatics
beta12orEarlier
Format of a sequence profile.
format
Hidden Markov model format
bioinformatics
Format of a hidden Markov model.
edam
formats
format
beta12orEarlier
Dirichlet distribution format
format
formats
bioinformatics
beta12orEarlier
edam
Data format of a Dirichlet distribution.
HMM emission and transition counts format
beta12orEarlier
format
edam
formats
bioinformatics
Data format for the emission and transition counts of a hidden Markov model.
RNA secondary structure format
Format for secondary structure (predicted or real) of an RNA molecule.
edam
format
bioinformatics
beta12orEarlier
formats
Protein secondary structure format
beta12orEarlier
format
Format for secondary structure (predicted or real) of a protein molecule.
edam
formats
bioinformatics
Sequence range format
format
formats
edam
bioinformatics
Format used to specify range(s) of sequence positions.
beta12orEarlier
pure
bioinformatics
beta12orEarlier
formats
edam
Alphabet for a molecular sequence with possible unknown positions but without non-sequence characters.
format
unpure
formats
edam
bioinformatics
beta12orEarlier
Alphabet for a molecular sequence with possible unknown positions but possibly with non-sequence characters.
format
unambiguous sequence
formats
format
bioinformatics
beta12orEarlier
edam
Alphabet for a molecular sequence with possible unknown positions but without ambiguity characters.
ambiguous
edam
formats
format
Alphabet for a molecular sequence with possible unknown positions and possible ambiguity characters.
beta12orEarlier
bioinformatics
Sequence features (repeats) format
Format used for map of repeats in molecular (typically nucleotide) sequences.
format
bioinformatics
edam
beta12orEarlier
formats
Nucleic acid features (restriction sites) format
Format used for report on restriction enzyme recognition sites in nucleotide sequences.
bioinformatics
formats
format
beta12orEarlier
edam
Gene features (coding region) format
formats
Format used for report on coding regions in nucleotide sequences.
beta12orEarlier
format
bioinformatics
edam
Sequence cluster format
format
edam
Format used for clusters of molecular sequences.
bioinformatics
formats
beta12orEarlier
Sequence cluster format (protein)
beta12orEarlier
Format used for clusters of protein sequences.
bioinformatics
edam
format
formats
Sequence cluster format (nucleic acid)
beta12orEarlier
edam
format
bioinformatics
formats
Format used for clusters of nucleotide sequences.
Gene cluster format
true
formats
Format used for clusters of genes.
edam
format
beta12orEarlier
beta13
bioinformatics
EMBL-like (text)
A text format resembling EMBL entry format.
bioinformatics
edam
formats
beta12orEarlier
format
This concept may be used for the many non-standard EMBL-like text formats.
FASTQ-like format (text)
beta12orEarlier
formats
This concept may be used for non-standard FASTQ short read-like formats.
format
bioinformatics
A text format resembling FASTQ short read format.
edam
EMBLXML
edam
formats
beta12orEarlier
bioinformatics
format
XML format for EMBL entries.
cdsxml
bioinformatics
format
beta12orEarlier
edam
XML format for EMBL entries.
formats
insdxml
formats
format
XML format for EMBL entries.
edam
bioinformatics
beta12orEarlier
geneseq
format
edam
beta12orEarlier
Geneseq sequence format.
formats
bioinformatics
UniProt-like (text)
bioinformatics
format
formats
edam
beta12orEarlier
A text sequence format resembling UniProtKB entry format.
UniProt format
formats
UniProt entry sequence format.
format
beta12orEarlier
bioinformatics
edam
ipi
formats
bioinformatics
format
IPI sequence format.
edam
beta12orEarlier
medline
format
beta12orEarlier
bioinformatics
formats
Abstract format used by the MEDLINE database.
edam
Ontology format
edam
bioinformatics
beta12orEarlier
format
Format used for ontologies.
formats
OBO format
A serialisation format conforming to the Open Biomedical Ontologies (OBO) model.
bioinformatics
format
beta12orEarlier
edam
formats
OWL format
formats
A serialisation format conforming to the Web Ontology Language (OWL) model.
beta12orEarlier
edam
bioinformatics
format
FASTA-like (text)
edam
beta12orEarlier
This concept may also be used for the many non-standard FASTA-like formats.
formats
bioinformatics
A text format resembling FASTA format.
format
Sequence record full format
bioinformatics
format
edam
formats
Data format for a molecular sequence record, typically corresponding to a full entry from a molecular sequence database.
beta12orEarlier
Sequence record lite format
Data format for a molecular sequence record 'lite', typically molecular sequence and minimal metadata, such as an identifier of the sequence and/or a comment.
bioinformatics
format
edam
formats
beta12orEarlier
EMBL format (XML)
formats
edam
format
bioinformatics
This is a placeholder for other more specific concepts. It should not normally be used for annotation.
An XML format for EMBL entries.
beta12orEarlier
GenBank-like format (text)
beta12orEarlier
edam
This concept may be used for the non-standard GenBank-like text formats.
bioinformatics
formats
A text format resembling GenBank entry (plain text) format.
format
Sequence feature table format (text)
bioinformatics
format
beta12orEarlier
Text format for a sequence feature table.
edam
formats
Strain data format
true
format
1.0
beta12orEarlier
formats
edam
bioinformatics
Format of a report on organism strain data / cell line.
CIP strain data format
true
formats
beta12orEarlier
edam
beta12orEarlier
bioinformatics
format
Format for a report of strain data as used for CIP database entries.
phylip property values
true
formats
beta12orEarlier
beta12orEarlier
edam
PHYLIP file format for phylogenetic property data.
bioinformatics
format
STRING entry format (HTML)
true
formats
Entry format (HTML) for the STRING database of protein interaction.
beta12orEarlier
format
edam
beta12orEarlier
bioinformatics
STRING entry format (XML)
format
beta12orEarlier
formats
edam
Entry format (XML) for the STRING database of protein interaction.
bioinformatics
GFF
bioinformatics
beta12orEarlier
GFF feature format (of indeterminate version).
formats
format
edam
GTF
edam
beta12orEarlier
bioinformatics
Gene Transfer Format (GTF), a restricted version of GFF.
format
formats
FASTA-HTML
format
bioinformatics
formats
beta12orEarlier
FASTA format wrapped in HTML elements.
edam
EMBL-HTML
beta12orEarlier
formats
format
edam
bioinformatics
EMBL entry format wrapped in HTML elements.
BioCyc enzyme report format
true
bioinformatics
formats
format
edam
Format of an entry from the BioCyc enzyme database.
beta12orEarlier
beta12orEarlier
ENZYME enzyme report format
true
formats
format
beta12orEarlier
beta12orEarlier
Format of an entry from the Enzyme nomenclature database (ENZYME).
bioinformatics
edam
PseudoCAP gene report format
true
Format of a report on a gene from the PseudoCAP database.
edam
beta12orEarlier
format
beta12orEarlier
formats
bioinformatics
GeneCards gene report format
true
bioinformatics
edam
beta12orEarlier
format
beta12orEarlier
Format of a report on a gene from the GeneCards database.
formats
beta12orEarlier
formats
edam
bioinformatics
format
Textual format
Textual format.
Data in text format can be compressed into binary format, or can be a value of an XML element or attribute. Markup formats are not considered textual (or more precisely, not plain-textual).
Tabular format
Plain text
Many textual formats used in bioinformatics are tabular (tab-separated values, TSV), typically with an additional header in their own format.
Tabular format
fileext.com synonyms are in general narrower in the sense of being defined only as file formats, as opposed to data formats.
fileext.com synonyms are in general narrower in the sense of being defined only as file formats, as opposed to data formats.
beta12orEarlier
formats
edam
bioinformatics
format
HTML
HTML format.
Hypertext Markup Language
fileext.com synonyms are in general narrower in the sense of being defined only as file formats, as opposed to data formats.
beta12orEarlier
formats
edam
bioinformatics
format
XML
eXtensible Markup Language (XML) format.
Data in XML format can be serialised into text or binary format.
Extensible Markup Language
fileext.com synonyms are in general narrower in the sense of being defined only as file formats, as opposed to data formats.
Binary format
edam
formats
bioinformatics
Binary format.
beta12orEarlier
Only specific native binary formats are listed under 'Binary format' in EDAM. Generic binary formats - such as any data being zipped, or any XML data being serialised into the Efficient XML Interchange (EXI) format - are not modelled in EDAM. Refer to http://wsio.org/compression_004.
format
URI format
true
bioinformatics
format
beta12orEarlier
Typical textual representation of a URI.
formats
edam
beta13
NCI-Nature pathway entry format
true
beta12orEarlier
format
The format of an entry from the NCI-Nature pathways database.
bioinformatics
beta12orEarlier
formats
edam
Format (typed)
edam
A broad class of format distinguished by the scientific nature of the data that is identified.
format
beta12orEarlier
This concept exists only to assist EDAM maintenance and navigation in graphical browsers. It does not add semantic information. The concept branch under 'Format (typed)' provides an alternative organisation of the concepts nested under the other top-level branches ('Binary', 'HTML', 'RDF', 'Text' and 'XML'). All concepts under here are already included under those branches.
bioinformatics
formats
BioXSD
BioXSD XML format of basic bioinformatics types of data (sequence records, alignments, feature records, references to resources, and more).
formats
bioinformatics
beta12orEarlier
format
BioXSD XML format
edam
RDF format
A serialisation format conforming to the Resource Description Framework (RDF) model.
beta12orEarlier
bioinformatics
format
edam
formats
GenBank-HTML
GenBank entry format wrapped in HTML elements.
bioinformatics
beta12orEarlier
edam
formats
format
Protein features (domains) format
true
formats
Format of a report on protein features (domain composition).
bioinformatics
beta12orEarlier
beta12orEarlier
format
edam
EMBL-like format
edam
formats
This concept may be used for the many non-standard EMBL-like formats.
A format resembling EMBL entry (plain text) format.
bioinformatics
format
beta12orEarlier
FASTQ-like format
bioinformatics
beta12orEarlier
This concept may be used for non-standard FASTQ short read-like formats.
formats
format
A format resembling FASTQ short read format.
edam
FASTA-like
format
edam
A format resembling FASTA format.
formats
beta12orEarlier
bioinformatics
This concept may be used for the many non-standard FASTA-like formats.
uniprotkb-like format
A sequence format resembling uniprotkb entry format.
beta12orEarlier
format
edam
formats
bioinformatics
Sequence feature table format
edam
beta12orEarlier
format
Format for a sequence feature table.
bioinformatics
formats
OBO
OBO ontology text format.
format
formats
bioinformatics
edam
beta12orEarlier
OBO-XML
edam
OBO ontology XML format.
bioinformatics
formats
beta12orEarlier
format
Sequence record format (text)
formats
bioinformatics
format
beta12orEarlier
edam
Data format for a molecular sequence record.
Sequence record format (XML)
format
bioinformatics
Data format for a molecular sequence record.
edam
formats
beta12orEarlier
Sequence feature table format (XML)
formats
bioinformatics
beta12orEarlier
format
XML format for a sequence feature table.
edam
Alignment format (text)
beta12orEarlier
bioinformatics
edam
format
formats
Text format for molecular sequence alignment information.
Alignment format (XML)
beta12orEarlier
bioinformatics
edam
formats
XML format for molecular sequence alignment information.
format
Phylogenetic tree format (text)
formats
bioinformatics
Text format for a phylogenetic tree.
beta12orEarlier
edam
format
Phylogenetic tree format (XML)
formats
format
XML format for a phylogenetic tree.
edam
beta12orEarlier
bioinformatics
EMBL-like (XML)
edam
beta12orEarlier
formats
format
This concept may be used for any non-standard EMBL-like XML formats.
bioinformatics
An XML format resembling EMBL entry format.
GenBank-like format
bioinformatics
This concept may be used for the non-standard GenBank-like formats.
edam
formats
format
beta12orEarlier
A format resembling GenBank entry (plain text) format.
STRING entry format
true
bioinformatics
formats
beta12orEarlier
Entry format for the STRING database of protein interaction.
format
edam
beta12orEarlier
Sequence assembly format (text)
Text format for sequence assembly data.
bioinformatics
edam
format
beta12orEarlier
formats
Amino acid identifier format
true
beta12orEarlier
bioinformatics
Text format (representation) of amino acid residues.
format
beta13
edam
formats
completely unambiguous
formats
Alphabet for a molecular sequence without any unknown positions or ambiguity characters.
format
edam
beta12orEarlier
bioinformatics
completely unambiguous pure
Alphabet for a molecular sequence without unknown positions, ambiguity or non-sequence characters.
edam
bioinformatics
format
beta12orEarlier
formats
completely unambiguous pure nucleotide
edam
format
beta12orEarlier
Alphabet for a nucleotide sequence (characters ACGTU only) without unknown positions, ambiguity or non-sequence characters.
bioinformatics
formats
completely unambiguous pure dna
Alphabet for a DNA sequence (characters ACGT only) without unknown positions, ambiguity or non-sequence characters.
formats
bioinformatics
edam
format
beta12orEarlier
completely unambiguous pure rna sequence
formats
format
beta12orEarlier
Alphabet for an RNA sequence (characters ACGU only) without unknown positions, ambiguity or non-sequence characters.
edam
bioinformatics
Raw sequence format
Format of a raw molecular sequence (i.e. the alphabet used).
beta12orEarlier
formats
edam
format
bioinformatics
BAM
BAM format, the binary, BGZF-formatted compressed version of SAM format for alignment of nucleotide sequences (e.g. sequencing reads) to (a) reference sequence(s). May contain base-call and alignment qualities and other data.
beta12orEarlier
format
formats
edam
bioinformatics
SAM
edam
Sequence Alignment/Map (SAM) format for alignment of nucleotide sequences (e.g. sequencing reads) to (a) reference sequence(s). May contain base-call and alignment qualities and other data.
beta12orEarlier
The format supports short and long reads (up to 128Mbp) produced by different sequencing platforms and is used to hold mapped data within the GATK and across the Broad Institute, the Sanger Centre, and throughout the 1000 Genomes project.
format
formats
bioinformatics
SBML
formats
beta12orEarlier
edam
format
Systems Biology Markup Language (SBML), the standard XML format for models of biological processes such as metabolism, cell signaling, and gene regulation.
bioinformatics
completely unambiguous pure protein
edam
formats
format
Alphabet for any protein sequence without unknown positions, ambiguity or non-sequence characters.
beta12orEarlier
bioinformatics
Bibliographic reference format
bioinformatics
formats
beta12orEarlier
edam
Format of a bibliographic reference.
format
Sequence annotation track format
bioinformatics
formats
edam
beta12orEarlier
Format of a sequence annotation track.
format
Alignment format (pair only)
format
Data format for molecular sequence alignment information that can hold sequence alignment(s) of only 2 sequences.
formats
bioinformatics
edam
beta12orEarlier
Sequence variation annotation format
bioinformatics
edam
beta12orEarlier
Format of sequence variation annotation.
formats
format
markx0 variant
format
formats
edam
beta12orEarlier
Some variant of Pearson MARKX alignment format.
bioinformatics
mega variant
beta12orEarlier
edam
Some variant of Mega format for (typically aligned) sequences.
bioinformatics
format
formats
Phylip format variant
bioinformatics
Some variant of Phylip format for (aligned) sequences.
format
edam
formats
beta12orEarlier
AB1
format
AB1 uses the generic binary Applied Biosystems, Inc. Format (ABIF).
AB1 binary format of raw DNA sequence reads (output of Applied Biosystems' sequencing analysis software). Contains an electropherogram and the DNA base sequence.
bioinformatics
formats
edam
beta12orEarlier
ACE
format
edam
bioinformatics
beta12orEarlier
ACE sequence assembly format including contigs, base-call qualities, and other metadata (version Aug 1998 and onwards).
formats
BED
BED detail format includes 2 additional columns (http://genome.ucsc.edu/FAQ/FAQformat#format1.7) and BED 15 includes 3 additional columns for experiment scores (http://genomewiki.ucsc.edu/index.php/Microarray_track).
Browser Extensible Data (BED) format of sequence annotation track, typically to be displayed in a genome browser.
edam
bioinformatics
beta12orEarlier
format
formats
bigBed
bigBed format for large sequence annotation tracks, similar to textual BED format.
format
beta12orEarlier
bioinformatics
formats
edam
WIG
edam
bioinformatics
Wiggle format (WIG) of a sequence annotation track that consists of a value for each sequence position. Typically to be displayed in a genome browser.
format
beta12orEarlier
formats
bigWig
formats
format
beta12orEarlier
bigWig format for large sequence annotation tracks that consist of a value for each sequence position. Similar to textual WIG format.
edam
bioinformatics
PSL
beta12orEarlier
formats
edam
bioinformatics
format
PSL format of alignments, typically generated by BLAT or psLayout. Can be displayed in a genome browser like a sequence annotation track.
MAF
beta12orEarlier
format
Typically generated by Multiz and TBA aligners; can be displayed in a genome browser like a sequence annotation track. This should not be confused with MIRA Assembly Format or Mutation Annotation Format.
formats
edam
Multiple Alignment Format (MAF) supporting alignments of whole genomes with rearrangements, directions, multiple pieces to the alignment, and so forth.
bioinformatics
2bit
format
edam
2bit binary format of nucleotide sequences using 2 bits per nucleotide. It additionally encodes unknown nucleotides and lower-case 'masking'.
formats
bioinformatics
beta12orEarlier
.nib
formats
format
beta12orEarlier
.nib (nibble) binary format of a nucleotide sequence using 4 bits per nucleotide (including unknown) and its lower-case 'masking'.
bioinformatics
edam
genePred
bioinformatics
beta12orEarlier
edam
formats
format
genePred format has 3 main variations (http://genome.ucsc.edu/FAQ/FAQformat#format9 http://www.broadinstitute.org/software/igv/genePred). They reflect UCSC Browser DB tables.
genePred table format for gene prediction tracks.
pgSnp
edam
beta12orEarlier
bioinformatics
formats
format
Personal Genome SNP (pgSnp) format for sequence variation tracks (indels and polymorphisms), supported by the UCSC Genome Browser.
axt
axt format of alignments, typically produced from BLASTZ.
format
beta12orEarlier
edam
formats
bioinformatics
LAV
beta12orEarlier
format
LAV format of alignments generated by BLASTZ and LASTZ.
bioinformatics
formats
edam
Pileup
bioinformatics
beta12orEarlier
format
formats
Pileup format of alignment of sequences (e.g. sequencing reads) to (a) reference sequence(s). Contains aligned bases per base of the reference sequence(s).
edam
VCF
Variant Call Format (VCF) for sequence variation (indels, polymorphisms, structural variation).
bioinformatics
beta12orEarlier
edam
formats
format
SRF
edam
beta12orEarlier
formats
bioinformatics
Sequence Read Format (SRF) of sequence trace data. Supports submission to the NCBI Short Read Archive.
format
ZTR
edam
format
formats
ZTR format for storing chromatogram data from DNA sequencing instruments.
beta12orEarlier
bioinformatics
GVF
edam
beta12orEarlier
formats
format
bioinformatics
Genome Variation Format (GVF). A GFF3-compatible format with defined header and attribute tags for sequence variation.
BCF
edam
format
BCF, the binary version of Variant Call Format (VCF) for sequence variation (indels, polymorphisms, structural variation).
bioinformatics
beta12orEarlier
formats
Matrix format
beta13
formats
format
Format of a matrix (array) of numerical values.
bioinformatics
edam
Protein domain classification format
bioinformatics
beta13
formats
edam
Format of data concerning the classification of the sequences and/or structures of protein structural domain(s).
format
Raw SCOP domain classification format
edam
These are the parsable data files provided by SCOP.
beta13
format
formats
Format of raw SCOP domain classification data files.
bioinformatics
Raw CATH domain classification format
bioinformatics
edam
format
Format of raw CATH domain classification data files.
beta13
These are the parsable data files provided by CATH.
formats
CATH domain report format
bioinformatics
The report (for example http://www.cathdb.info/domain/1cukA01) includes CATH codes for levels in the hierarchy for the domain, level descriptions and relevant data and links.
format
beta13
Format of a summary of domain classification information for a CATH domain.
edam
formats
SBRML
format
1.0
bioinformatics
formats
edam
Systems Biology Result Markup Language (SBRML), the standard XML format for simulated or calculated results (e.g. trajectories) of systems biology models.
BioPAX
edam
format
formats
1.0
BioPAX is an exchange format for pathway data, with its data model defined in OWL.
bioinformatics
EBI Application Result XML
bioinformatics
edam
format
1.0
EBI Application Result XML is a format returned by sequence similarity search Web services at EBI.
formats
PSI MI XML (MIF)
edam
XML Molecular Interaction Format (MIF), standardised by HUPO PSI MI.
formats
bioinformatics
MIF
1.0
format
phyloXML
formats
1.0
bioinformatics
edam
format
phyloXML is a standardised XML format for phylogenetic trees, networks, and associated data.
NeXML
formats
bioinformatics
NeXML is a standardised XML format for rich phyloinformatic data.
edam
1.0
format
MAGE-ML
formats
1.0
MAGE-ML XML format for microarray expression data, standardised by MGED (now FGED).
format
bioinformatics
edam
MAGE-TAB
MAGE-TAB textual format for microarray expression data, standardised by MGED (now FGED).
formats
1.0
bioinformatics
edam
format
GCDML
bioinformatics
GCDML XML format for genome and metagenome metadata according to MIGS/MIMS/MIMARKS information standards, standardised by the Genomic Standards Consortium (GSC).
format
edam
1.0
formats
GTrack
GTrack is an optimised tabular format for genome/sequence feature tracks unifying the power of other tabular formats (e.g. GFF3, BED, WIG).
bioinformatics
format
edam
formats
1.0
Biological pathway or network report format
edam
formats
bioinformatics
format
beta12orEarlier
Data format for a report of information derived from a biological pathway or network.
Experiment annotation format
format
edam
bioinformatics
formats
Data format for annotation on a laboratory experiment.
beta12orEarlier
Cytoband format
format
formats
edam
1.2
Reflects a UCSC Browser DB table.
Cytoband format for chromosome cytobands.
bioinformatics
CopasiML
formats
CopasiML, the native format of COPASI.
bioinformatics
edam
format
1.2
CellML
1.2
formats
bioinformatics
edam
CellML, the format for mathematical models of biological and other networks.
format
1.2
formats
edam
bioinformatics
format
PSI MI TAB (MITAB)
Tabular Molecular Interaction format (MITAB), standardised by HUPO PSI MI.
1.2
formats
edam
bioinformatics
format
PSI-PAR
Protein affinity format (PSI-PAR), standardised by HUPO PSI MI. It is compatible with PSI MI XML (MIF) and uses the same XML Schema.
1.2
formats
edam
bioinformatics
format
mzML
mzML format for raw spectrometer output data, standardised by HUPO PSI MSS.
mzML is the successor and unifier of the mzData format developed by PSI and mzXML developed at the Seattle Proteome Center.
1.2
formats
bioinformatics
edam
format
Mass spectrometry data format
Format for mass spectrometry data.
1.2
formats
edam
bioinformatics
format
TraML
TraML (Transition Markup Language) is the format for mass spectrometry transitions, standardised by HUPO PSI MSS.
1.2
formats
edam
bioinformatics
format
mzIdentML
mzIdentML is the exchange format for peptides and proteins identified from mass spectra, standardised by HUPO PSI PI. It can be used for outputs of proteomics search engines.
1.2
formats
edam
bioinformatics
format
mzQuantML
mzQuantML is the format for quantitation values associated with peptides, proteins and small molecules from mass spectra, standardised by HUPO PSI PI. It can be used for outputs of quantitation software for proteomics.
1.2
formats
edam
bioinformatics
format
GelML
GelML is the format for describing the process of gel electrophoresis, standardised by HUPO PSI PS.
1.2
formats
edam
bioinformatics
format
spML
spML is the format for describing proteomics sample processing, other than using gels, prior to mass spectrometric protein identification, standardised by HUPO PSI PS. It may also be applicable for metabolomics.
OWL Functional Syntax
A human-readable encoding for the Web Ontology Language (OWL).
1.2
bioinformatics
format
edam
formats
Manchester OWL Syntax
A syntax for writing OWL class expressions.
This format was influenced by the OWL Abstract Syntax and the DL style syntax.
1.2
bioinformatics
format
edam
formats
KRSS2 Syntax
A superset of the "Description-Logic Knowledge Representation System Specification from the KRSS Group of the ARPA Knowledge Sharing Effort".
This format is used in Protege 4.
1.2
bioinformatics
format
edam
formats
Turtle
The Terse RDF Triple Language (Turtle) is a human-friendly serialization format for RDF (Resource Description Framework) graphs.
The SPARQL Query Language incorporates a very similar syntax.
1.2
bioinformatics
format
edam
formats
N-Triples
A plain text serialisation format for RDF (Resource Description Framework) graphs, and a subset of the Turtle (Terse RDF Triple Language) format.
N-Triples should not be confused with Notation 3 which is a superset of Turtle.
1.2
bioinformatics
format
edam
formats
Notation3
A shorthand non-XML serialization of the Resource Description Framework model, designed with human-readability in mind.
N3
bioinformatics
format
edam
formats
RDF/XML
Resource Description Framework (RDF) XML format.
RDF/XML is a serialization syntax for OWL DL, but not for OWL Full.
1.2
bioinformatics
format
RDF
edam
formats
OWL/XML
formats
OWL ontology XML serialisation format.
1.2
OWL
edam
bioinformatics
format
beta12orEarlier
operations
edam
bioinformatics
operation
Operation
A function that processes a set of inputs and results in a set of outputs, or associates arguments (inputs) with values (outputs). Special cases are: a) An operation that consumes no input (has no input arguments). Such an operation is either a constant function, or an operation depending only on the underlying state. b) An operation that may modify the underlying state but has no output. c) The singular-case operation with no input or output, that still may modify the underlying state.
Function
Computational operation
Computational procedure
Computational method
Computational subroutine
Function (programming)
Lambda abstraction
Mathematical operation
Mathematical function
Process
Computational tool
sumo:Function
Operation is a function that is computational. It typically has input(s) and output(s), which are always data.
Function
Process can have a function (as its quality/attribute), and can also perform an operation with inputs and outputs.
Process
Computational tool provides one or more operations.
Computational tool
Operation is a function that is computational. It typically has input(s) and output(s), which are always data. In addition, one may think of 'biotop:Disposition' (parent of 'biotop:Function') being also a 'biotop:Quality'.
However, operation is not a GFO 'Concept' present only in someone's mind.
GFO 'Perpetuant' is in general broader than operation, but it may be seen as narrower in the sense of being a concrete individual and exhibiting presentials.
BFO 'function' is narrower in the sense that it is a 'realizable_entity' (snap:RealizableEntity) and a 'dependent_continuant' (snap:DependentContinuant), and broader in the sense that it does not need to have input(s) and output(s).
Function, including an operation, can be a quality/property of e.g. a computational tool.
Function, including an operation, can have a role of a quality/property in semantic annotation of e.g. a computational tool.
Process can have a function (as its quality/property), and can also have (perform) an operation with inputs and outputs.
Process can have a function (as its quality/property), and can also have (perform) an operation with inputs and outputs.
Process can have a function (as its quality/property), and can also have (perform) an operation with inputs and outputs.
However, one may think that an operation is not a process.
However, one may think that an operation is not a process and not a physical entity.
Method may in addition focus on how to achieve the result, not just on what to achieve as with operation.
Search and retrieval
bioinformatics
operations
operation
Database retrieval
Search or query a data resource and retrieve entries and / or annotation.
beta12orEarlier
edam
Data retrieval (database cross-reference)
true
beta13
operation
operations
Search database to retrieve all relevant references to a particular entity or entry.
beta12orEarlier
edam
bioinformatics
Annotation
bioinformatics
Annotate an entity (typically a biological or biomedical database entity) with terms from a controlled vocabulary.
This is a broad concept and is used as a placeholder for other, more specific concepts.
edam
beta12orEarlier
operation
operations
Data indexing
Database indexing
bioinformatics
operations
edam
Generate an index of (typically a file of) biological data.
beta12orEarlier
operation
Data index analysis
edam
beta12orEarlier
Analyse an index of biological data.
bioinformatics
operation
operations
Database index analysis
Annotation retrieval (sequence)
true
beta12orEarlier
Retrieve basic information about a molecular sequence.
operations
bioinformatics
edam
beta12orEarlier
operation
Sequence generation
edam
Generate a molecular sequence by some means.
beta12orEarlier
operation
operations
bioinformatics
Sequence editing
operation
operations
beta12orEarlier
bioinformatics
edam
Edit or change a molecular sequence, either randomly or specifically.
Sequence merging
operations
Sequence splicing
operation
Merge two or more (typically overlapping) molecular sequences.
beta12orEarlier
edam
bioinformatics
Sequence conversion
operation
beta12orEarlier
operations
bioinformatics
edam
Convert a molecular sequence from one type to another.
Sequence complexity calculation
operations
edam
Calculate sequence complexity, for example to find low-complexity regions in sequences.
operation
beta12orEarlier
bioinformatics
Sequence ambiguity calculation
beta12orEarlier
operations
Calculate sequence ambiguity, for example identify regions in protein or nucleotide sequences with many ambiguity codes.
bioinformatics
operation
edam
Sequence composition calculation
edam
Calculate character or word composition or frequency of a molecular sequence.
operations
bioinformatics
beta12orEarlier
operation
Repeat sequence analysis
Repeat sequences include tandem repeats, inverted or palindromic repeats, DNA microsatellites (Simple Sequence Repeats or SSRs), interspersed repeats, maximal duplications and reverse, complemented and reverse complemented repeats etc. Repeat units can be exact or imperfect, in tandem or dispersed, of specified or unspecified length.
edam
beta12orEarlier
Find and/or analyse repeat sequences in (typically nucleotide) sequences.
operation
operations
bioinformatics
Sequence motif discovery
beta12orEarlier
edam
operations
Motif discovery
Discover new motifs or conserved patterns in sequences or sequence alignments (de-novo discovery).
operation
bioinformatics
Motifs and patterns might be conserved or over-represented (occur with improbable frequency).
Sequence motif recognition
bioinformatics
Sequence motif detection
operation
Find (scan for) known motifs, patterns and regular expressions in molecular sequence(s).
edam
beta12orEarlier
operations
Motif detection
Motif recognition
Sequence motif comparison
Find motifs shared by molecular sequences.
beta12orEarlier
operations
edam
operation
bioinformatics
Transcription regulatory sequence analysis
true
beta12orEarlier
operation
edam
Analyse the sequence, conformational or physicochemical properties of transcription regulatory elements in DNA sequences.
beta13
For example transcription factor binding sites (TFBS) analysis to predict accessibility of DNA to binding factors.
bioinformatics
operations
Conserved transcription regulatory sequence identification
operation
Identify common, conserved (homologous) or synonymous transcriptional regulatory motifs (transcription factor binding sites).
beta12orEarlier
bioinformatics
operations
edam
For example cross-species comparison of transcription factor binding sites (TFBS). Methods might analyse co-regulated or co-expressed genes, or sets of oppositely expressed genes.
Protein property calculation (from structure)
operations
This might be a residue-level search for properties such as solvent accessibility, hydropathy, secondary structure, ligand-binding etc.
bioinformatics
Protein structural property calculation
operation
edam
beta12orEarlier
Extract, calculate or predict non-positional (physical or chemical) properties of a protein from processing a protein (3D) structure.
Protein flexibility and motion analysis
bioinformatics
Use this concept for analysis of flexible and rigid residues, local chain deformability, regions undergoing conformational change, molecular vibrations or fluctuational dynamics, domain motions or other large-scale structural transitions in a protein structure.
Analyse flexibility and motion in protein structure.
operations
beta12orEarlier
operation
edam
Protein structural motif recognition
Protein structural feature identification
beta12orEarlier
operations
operation
This includes conserved substructures and conserved geometry, such as spatial arrangement of secondary structure or protein backbone. Methods might use structure alignment, structural templates, searches for similar electrostatic potential and molecular surface shape, surface-mapping of phylogenetic information etc.
Identify or screen for 3D structural motifs in protein structure(s).
bioinformatics
edam
Protein domain recognition
Identify structural domains in a protein structure from first principles (for example calculations on structural compactness).
bioinformatics
edam
operations
beta12orEarlier
operation
Protein architecture analysis
Analyse the architecture (spatial arrangement of secondary structure) of protein structure(s).
operations
operation
beta12orEarlier
bioinformatics
edam
Residue interaction calculation
operation
WHATIF: SymShellOneXML
WHATIF: SymShellFiveXML
Calculate or extract inter-atomic, inter-residue or residue-atom contacts, distances and interactions in protein structure(s).
bioinformatics
WHATIF:ListContactsNormal
beta12orEarlier
operations
WHATIF:ListSideChainContactsNormal
WHATIF:ListSideChainContactsRelaxed
edam
WHATIF: SymShellTwoXML
WHATIF:ListContactsRelaxed
WHATIF: SymShellTenXML
Torsion angle calculation
operation
bioinformatics
Calculate, visualise or analyse phi/psi angles of a protein structure.
beta12orEarlier
operations
edam
Protein property calculation
Protein property rendering
operation
beta12orEarlier
This includes methods to render and visualise the properties of a protein sequence.
operations
bioinformatics
edam
Calculate (or predict) physical or chemical properties of a protein, including any non-positional properties of the molecular sequence, from processing a protein sequence.
Peptide immunogenicity prediction
edam
bioinformatics
operations
operation
Predict antigenicity, allergenicity / immunogenicity, allergic cross-reactivity etc. of peptides and proteins.
This is usually done in the development of peptide-specific antibodies or multi-epitope vaccines. Methods might use sequence data (for example motifs) and / or structural data.
beta12orEarlier
Feature prediction
SO:0000110
bioinformatics
beta12orEarlier
Predict, recognise and identify positional features in molecular sequences such as key functional sites or regions.
operation
edam
operations
Sequence feature detection
Data retrieval (feature table)
true
operations
operation
beta12orEarlier
bioinformatics
edam
Extract a sequence feature table from a sequence database entry.
beta13
Feature table query
beta12orEarlier
edam
operations
Query the features (in a feature table) of molecular sequence(s).
bioinformatics
operation
Feature comparison
edam
Feature table comparison
operations
operation
Compare the feature tables of two or more molecular sequences.
Sequence feature comparison
beta12orEarlier
bioinformatics
Data retrieval (sequence alignment)
true
Display basic information about a sequence alignment.
beta13
operations
beta12orEarlier
edam
operation
bioinformatics
Sequence alignment analysis
beta12orEarlier
operation
bioinformatics
Analyse a molecular sequence alignment.
operations
edam
Sequence alignment comparison
bioinformatics
edam
operation
See also 'Sequence profile alignment'.
beta12orEarlier
Compare (typically by aligning) two molecular sequence alignments.
operations
Sequence alignment conversion
operation
Convert a molecular sequence alignment from one type to another (for example amino acid to coding nucleotide sequence).
edam
beta12orEarlier
operations
bioinformatics
Nucleic acid property processing
true
edam
bioinformatics
beta13
Process (read and / or write) physicochemical property data of nucleic acids.
beta12orEarlier
operations
operation
Nucleic acid property calculation
edam
Calculate or predict physical or chemical properties of nucleic acid molecules, including any non-positional properties of the molecular sequence.
operation
beta12orEarlier
operations
bioinformatics
Splice transcript prediction
operations
Predict splicing alternatives or transcript isoforms from analysis of sequence data.
operation
beta12orEarlier
bioinformatics
edam
Frameshift error detection
operation
Methods include sequence alignment (if related sequences are available) and word-based sequence comparison.
edam
operations
bioinformatics
Detect frameshift errors in DNA sequences (from sequencing projects).
beta12orEarlier
Vector sequence detection
operation
edam
Detect vector sequences in nucleotide sequence, typically by comparison to a set of known vector sequences.
operations
bioinformatics
beta12orEarlier
Protein secondary structure prediction
beta12orEarlier
edam
Predict secondary structure of protein sequences.
operations
Secondary structure prediction (protein)
operation
bioinformatics
Methods might use amino acid composition, local sequence information, multiple sequence alignments, physicochemical features, estimated energy content, statistical algorithms, hidden Markov models, support vector machines, kernel machines, neural networks etc.
Protein super-secondary structure prediction
bioinformatics
Super-secondary structures include leucine zippers, coiled coils, Helix-Turn-Helix etc.
Predict super-secondary structure of protein sequence(s).
operations
edam
operation
beta12orEarlier
Transmembrane protein prediction
edam
beta12orEarlier
bioinformatics
operation
Predict transmembrane proteins or transmembrane (helical) domains or regions in protein sequences.
operations
Transmembrane protein analysis
bioinformatics
operation
operations
Use this (or child) concept for analysis of transmembrane domains (buried and exposed faces), transmembrane helices, helix topology, orientation, inter-helical contacts, membrane dipping (re-entrant) loops and other secondary structure etc. Methods might use pattern discovery, hidden Markov models, sequence alignment, structural profiles, amino acid property analysis, comparison to known domains or some combination (hybrid methods).
Analyse transmembrane protein(s), typically by processing sequence and / or structural data, and write an informative report for example about the protein and its transmembrane domains / regions.
beta12orEarlier
edam
Structure prediction
edam
beta12orEarlier
bioinformatics
operation
Predict tertiary structure of a molecular (biopolymer) sequence.
operations
Residue interaction prediction
Methods usually involve multiple sequence alignment analysis.
beta12orEarlier
Predict contacts, non-covalent interactions and distance (constraints) between amino acids in protein sequences.
operation
operations
edam
bioinformatics
Protein interaction raw data analysis
beta12orEarlier
bioinformatics
operation
edam
operations
Analyse experimental protein-protein interaction data from, for example, yeast two-hybrid analysis, protein microarrays, immunoaffinity chromatography followed by mass spectrometry, phage display etc.
Protein-protein interaction prediction (from protein sequence)
Identify or predict protein-protein interactions, interfaces, binding sites etc in protein sequences.
operations
operation
bioinformatics
edam
beta12orEarlier
Protein-protein interaction prediction (from protein structure)
bioinformatics
Identify or predict protein-protein interactions, interfaces, binding sites etc in protein structures.
beta12orEarlier
operations
edam
operation
Protein interaction network analysis
edam
operation
beta12orEarlier
Analyse a network of protein interactions.
operations
bioinformatics
Protein interaction network comparison
operation
Compare two or more networks of protein interactions.
edam
operations
beta12orEarlier
bioinformatics
RNA secondary structure prediction
beta12orEarlier
Predict RNA secondary structure (for example knots, pseudoknots, alternative structures etc).
bioinformatics
operations
Methods might use RNA motifs, predicted intermolecular contacts, or RNA sequence-structure compatibility (inverse RNA folding).
operation
edam
Nucleic acid folding analysis
operation
Nucleic acid folding
bioinformatics
operations
Nucleic acid folding modelling
Analyse some aspect of RNA/DNA folding, typically by processing sequence and/or structural data.
edam
beta12orEarlier
Data retrieval (restriction enzyme annotation)
true
Retrieve information on restriction enzymes or restriction enzyme sites.
bioinformatics
edam
beta12orEarlier
operations
Restriction enzyme information retrieval
beta13
operation
Genetic marker identification
true
edam
operations
beta13
bioinformatics
A genetic marker is any DNA sequence of known chromosomal location that is associated with and specific to a particular gene or trait. This includes short sequences surrounding a SNP, Sequence-Tagged Sites (STS) which are well suited for PCR amplification, longer minisatellite sequences etc.
Identify genetic markers in DNA sequences.
operation
beta12orEarlier
Genetic mapping
operation
Generate a genetic (linkage) map of a DNA sequence (typically a chromosome) showing the relative positions of genetic markers based on estimation of non-physical distances.
bioinformatics
Mapping involves ordering genetic loci along a chromosome and estimating the genetic (non-physical) distance between loci. A genetic map shows the relative (not physical) position of known genes and genetic markers.
Genetic map generation
edam
Linkage mapping
beta12orEarlier
operations
Linkage analysis
edam
For example, estimate how close two genes are on a chromosome by calculating how often they are transmitted together to an offspring, ascertain whether two genes are linked and parental linkage, calculate linkage map distance etc.
beta12orEarlier
bioinformatics
Analyse genetic linkage.
operation
operations
Codon usage table generation
operations
edam
operation
beta12orEarlier
bioinformatics
Calculate codon usage statistics and create a codon usage table.
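A minimal sketch of codon usage table generation, assuming the input is an in-frame coding sequence read from its first base. The function name `codon_usage` is hypothetical and not part of EDAM or any tool it annotates.

```python
from collections import Counter


def codon_usage(cds: str) -> dict[str, float]:
    """Return the relative frequency of each codon in an in-frame coding sequence."""
    seq = cds.upper()
    # Read non-overlapping triplets from frame 0; a trailing partial codon is dropped.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}
```

Real tools typically also report per-amino-acid fractions and per-thousand counts; this sketch only covers the basic frequency table.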
Codon usage table comparison
Compare two or more codon usage tables.
bioinformatics
edam
operation
beta12orEarlier
operations
Codon usage analysis
edam
synon: Codon usage table analysis
operation
Process (read and / or write) codon usage data, e.g. analyse codon usage tables or codon usage in molecular sequences.
bioinformatics
operations
synon: Codon usage data analysis
beta12orEarlier
Base position variability plotting
bioinformatics
beta12orEarlier
Identify and plot third base position variability in a nucleotide sequence.
operation
operations
edam
Sequence word comparison
beta12orEarlier
bioinformatics
edam
operations
Find exact character or word matches between molecular sequences without full sequence alignment.
operation
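As an illustration of word-based comparison without alignment, the sketch below collects the words (k-mers) two sequences have in common. The name `shared_words` and the default word length of 3 are assumptions made for the example.

```python
def shared_words(seq_a: str, seq_b: str, k: int = 3) -> set[str]:
    """Return the set of length-k words that occur in both sequences."""
    # Enumerate every overlapping word of length k in each sequence.
    words_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    words_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return words_a & words_b
```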
Sequence distance matrix generation
bioinformatics
Calculate a sequence distance matrix or otherwise estimate genetic distances between molecular sequences.
operations
edam
beta12orEarlier
operation
Phylogenetic distance matrix generation
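One simple way to estimate distances between sequences is the p-distance, the fraction of differing positions between two aligned sequences of equal length. The sketch below assumes the sequences are already aligned; the function names are hypothetical, and real methods usually apply a substitution-model correction that is not shown here.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def sequence_distance_matrix(seqs: list[str]) -> list[list[float]]:
    """Symmetric all-versus-all matrix of p-distances."""
    return [[p_distance(a, b) for b in seqs] for a in seqs]
```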
Sequence redundancy removal
operation
Compare two or more molecular sequences, identify and remove redundant sequences based on some criteria.
edam
operations
bioinformatics
beta12orEarlier
Sequence clustering
Sequence cluster generation
Build clusters of similar sequences, typically using scores from pair-wise alignment or other comparison of the sequences.
beta12orEarlier
bioinformatics
operations
edam
operation
The clusters may be output or used internally for some other purpose.
Sequence alignment construction
Sequence alignment computation
operation
operations
Sequence alignment
bioinformatics
Sequence alignment generation
edam
Align (identify equivalent sites within) molecular sequences.
beta12orEarlier
Hybrid sequence alignment construction
true
Align two or more molecular sequences of different types (for example genomic DNA to EST, cDNA or mRNA).
edam
operation
Hybrid sequence alignment
operations
beta13
beta12orEarlier
bioinformatics
Structure-based sequence alignment construction
Align molecular sequences using sequence and structural information.
bioinformatics
beta12orEarlier
operation
operations
edam
Structure-based sequence alignment
Structure alignment construction
operations
bioinformatics
Structure alignment
edam
beta12orEarlier
Align (superimpose) molecular tertiary structures.
operation
Sequence profile generation
operation
beta12orEarlier
Generate some type of sequence profile (for example a hidden Markov model) from a sequence alignment.
operations
bioinformatics
edam
Structural (3D) profile generation
operation
edam
beta12orEarlier
Generate some type of structural (3D) profile or template from a structure or structure alignment.
operations
Structural profile generation
bioinformatics
Sequence profile alignment construction
Align sequence profiles (representing sequence alignments).
beta12orEarlier
See also 'Sequence alignment comparison'.
bioinformatics
operation
Sequence profile alignment
edam
operations
Structural (3D) profile alignment construction
bioinformatics
Structural (3D) profile alignment
Align structural (3D) profiles or templates (representing structures or structure alignments).
Structural profile alignment
operation
edam
beta12orEarlier
operations
Sequence-profile alignment construction
operation
A sequence profile typically represents a sequence alignment. Methods might perform one-to-one, one-to-many or many-to-many comparisons.
operations
Sequence-profile alignment
Align molecular sequence(s) to sequence profile(s).
beta12orEarlier
bioinformatics
edam
Sequence-3D profile alignment construction
Sequence-3D profile alignment
Methods might perform one-to-one, one-to-many or many-to-many comparisons.
Align molecular sequence(s) to structural (3D) profile(s) or template(s) (representing a structure or structure alignment).
operations
edam
bioinformatics
beta12orEarlier
operation
Protein threading
Sequence-structure alignment
Align molecular sequence to structure in 3D space (threading).
beta12orEarlier
edam
Use this concept for methods that evaluate sequence-structure compatibility by assessing residue interactions in 3D. Methods might perform one-to-one, one-to-many or many-to-many comparisons.
operation
operations
bioinformatics
Protein fold recognition
bioinformatics
edam
Protein fold prediction
operation
Recognize (predict and identify) known protein structural domains or folds in protein sequence(s).
Methods use some type of mapping between sequence and fold, for example secondary structure prediction and alignment, profile comparison, sequence properties, homologous sequence search, kernel machines etc. Domains and folds might be taken from SCOP or CATH.
Protein domain prediction
operations
beta12orEarlier
Data retrieval (metadata and documentation)
operation
operations
This includes documentation, general information and other metadata on entities such as databases, database entries and tools.
beta12orEarlier
Data retrieval (metadata)
Search for and retrieve data concerning or describing some core data, as distinct from the primary data that is being described.
bioinformatics
Data retrieval (documentation)
edam
Literature search
bioinformatics
edam
operation
operations
Query the biomedical and informatics literature.
beta12orEarlier
Text mining
Process and analyse text (typically the biomedical and informatics literature) to extract information from it.
edam
Text data mining
operations
beta12orEarlier
operation
bioinformatics
Virtual PCR
Perform in-silico (virtual) PCR.
operation
operations
edam
bioinformatics
beta12orEarlier
PCR primer design
operations
bioinformatics
Design or predict oligonucleotide primers for PCR and DNA amplification etc.
edam
Primer design involves predicting or selecting primers that are specific to a provided PCR template. Primers can be designed with certain properties such as size of product desired, primer size etc. The output might be a minimal or overlapping primer set.
PCR primer prediction
beta12orEarlier
operation
Microarray probe design
operation
Microarray probe prediction
edam
bioinformatics
beta12orEarlier
operations
Predict and/or optimize oligonucleotide probes for DNA microarrays, for example for transcription profiling of genes, or for genomes and gene families.
Sequence assembly
Combine (align and merge) overlapping fragments of a DNA sequence to reconstruct the original sequence.
bioinformatics
edam
operation
operations
For example, assemble overlapping reads from paired-end sequencers into contigs (a contiguous sequence corresponding to read overlaps), or assemble contigs, for example ESTs and genomic DNA fragments, depending on the detected fragment overlaps.
beta12orEarlier
Microarray data standardization and normalization
edam
This includes statistical analysis, for example of variability amongst microarray experiments, comparison of heterogeneous microarray platforms etc.
operations
Standardize or normalize microarray data.
operation
bioinformatics
beta12orEarlier
Sequencing-based expression profile data processing
true
operations
beta12orEarlier
beta12orEarlier
bioinformatics
operation
edam
Process (read and / or write) SAGE, MPSS or SBS experimental data.
Gene expression profile clustering
edam
Perform cluster analysis of gene expression (microarray) data, for example clustering of similar gene expression profiles.
beta12orEarlier
operation
operations
bioinformatics
Gene expression profile generation
Gene expression profiling
beta12orEarlier
Generate a gene expression profile or pattern, for example from microarray data.
edam
operations
Expression profiling
bioinformatics
operation
Gene expression profile comparison
Compare gene expression profiles or patterns.
operation
beta12orEarlier
edam
operations
bioinformatics
Functional profiling
true
operation
beta12orEarlier
bioinformatics
operations
Interpret (in functional terms) and annotate gene expression data.
beta12orEarlier
edam
EST and cDNA sequence analysis
true
bioinformatics
operation
For example, identify full-length cDNAs from EST sequences or detect potential EST antisense transcripts.
beta12orEarlier
beta12orEarlier
edam
operations
Analyse EST or cDNA sequences.
Structural genomics target selection
true
operations
bioinformatics
beta12orEarlier
operation
Methods will typically navigate a graph of protein families of known structure.
Identify and select targets for protein structural determination.
edam
beta12orEarlier
Protein secondary structure assignment
bioinformatics
edam
Assign secondary structure from protein coordinate or experimental data.
operation
beta12orEarlier
operations
Protein structure assignment
edam
operation
operations
Assign a protein tertiary structure (3D coordinates) from raw experimental data.
beta12orEarlier
bioinformatics
Protein model evaluation
beta12orEarlier
operations
WHATIF: UseFileDB
operation
Model validation might involve checks for atomic packing, steric clashes (bumps), volume irregularities, agreement with electron density maps, number of amino acid residues, percentage of residues with missing or bad atoms, irregular Ramachandran Z-scores, irregular Chi-1 / Chi-2 normality scores, RMS-Z score on bonds and angles etc.
edam
bioinformatics
Evaluate the quality or correctness of a protein three-dimensional model.
Protein model refinement
WHATIF: CorrectedPDBasXML
operation
bioinformatics
edam
The PDB file format has had difficulties, inconsistencies and errors. Corrections can include identifying a meaningful sequence, removal of alternate atoms, correction of nomenclature problems, removal of incomplete residues and spurious waters, addition or removal of water, modelling of missing side chains, optimisation of cysteine bonds, regularisation of bond lengths, bond angles and planarities etc.
operations
beta12orEarlier
Refine (after evaluation) a model of protein structure to reduce steric clashes, volume irregularities etc.
Phylogenetic tree construction
Phylogenetic tree generation
edam
beta12orEarlier
Construct a phylogenetic tree.
Phylogenetic trees are usually constructed from a set of sequences from which an alignment (or data matrix) is calculated.
bioinformatics
operations
operation
Phylogenetic tree analysis
bioinformatics
edam
operation
operations
Analyse an existing phylogenetic tree or trees, typically to detect features or make predictions.
beta12orEarlier
Phylogenetic tree comparison
operations
operation
For example, to produce a consensus tree, subtrees, supertrees, calculate distances between trees or test topological similarity between trees (e.g. a congruence index) etc.
Compare two or more phylogenetic trees.
beta12orEarlier
bioinformatics
edam
Phylogenetic tree editing
edam
operation
Edit a phylogenetic tree.
beta12orEarlier
bioinformatics
operations
Phylogenetic footprinting / shadowing
operation
Infer a phylogenetic tree by comparing orthologous sequences in different species, particularly many closely related species (phylogenetic shadowing).
beta12orEarlier
bioinformatics
operations
edam
A phylogenetic 'shadow' represents the additive differences between individual sequences. By masking or 'shadowing' variable positions a conserved sequence is produced with few or none of the variations, which is then compared to the sequences of interest to identify significant regions of conservation.
Protein folding simulation
edam
Simulate the folding of a protein.
beta12orEarlier
bioinformatics
operations
operation
Protein folding pathway prediction
edam
operations
beta12orEarlier
bioinformatics
operation
Predict the folding pathway(s) or non-native structural intermediates of a protein.
Protein SNP mapping
bioinformatics
edam
Map and model the effects of single nucleotide polymorphisms (SNPs) on protein structure(s).
operations
operation
beta12orEarlier
Protein modelling (mutation)
operations
beta12orEarlier
operation
Methods might predict silent or pathological mutations.
edam
Predict the effect of point mutation on a protein structure, in terms of structural effects and protein folding, stability and function.
Protein mutation modelling
bioinformatics
Immunogen design
true
edam
operation
beta12orEarlier
beta12orEarlier
Design molecules that elicit an immune response (immunogens).
operations
bioinformatics
Zinc finger protein domain prediction and optimisation
edam
operations
beta12orEarlier
Predict and optimise zinc finger protein domains for DNA/RNA binding (for example for transcription factors and nucleases).
bioinformatics
operation
Enzyme kinetics calculation
operation
Calculate Km, Vmax and derived data for an enzyme reaction.
edam
beta12orEarlier
operations
bioinformatics
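As a sketch of how Km and Vmax might be estimated, the example below fits the Michaelis-Menten equation v = Vmax·[S] / (Km + [S]) through its Lineweaver-Burk (double-reciprocal) linearisation, 1/v = (Km/Vmax)·(1/[S]) + 1/Vmax. The function name is hypothetical, and this is only one of several possible approaches; the linearisation is sensitive to noise at low substrate concentrations, so non-linear regression is often preferred in practice.

```python
def fit_michaelis_menten(substrate: list[float], velocity: list[float]) -> tuple[float, float]:
    """Estimate (Km, Vmax) by ordinary least squares on the double-reciprocal plot."""
    xs = [1.0 / s for s in substrate]  # 1/[S]
    ys = [1.0 / v for v in velocity]   # 1/v
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # Km / Vmax
    intercept = mean_y - slope * mean_x   # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax
```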
File reformatting
edam
operation
operations
Reformat a file of data (or equivalent entity in memory).
beta12orEarlier
bioinformatics
File validation
bioinformatics
operation
operations
beta12orEarlier
edam
Test and validate the format and content of a data file.
Plotting and rendering
operation
edam
Visualise, plot or render (graphically) biomolecular data such as molecular sequences or structures.
Visualisation
bioinformatics
operations
beta12orEarlier
Sequence database search
operations
beta12orEarlier
operation
bioinformatics
This excludes direct retrieval methods (e.g. the dbfetch program).
edam
Search a sequence database by sequence comparison and retrieve similar sequences.
Structure database search
operation
bioinformatics
beta12orEarlier
operations
Search a tertiary structure database by sequence and/or structure comparison and retrieve structures and associated data.
edam
Protein secondary database search
edam
operations
Search a secondary protein database (of classification information) to assign a protein sequence(s) to a known protein family or group.
beta12orEarlier
Methods might use fingerprints, profiles, hidden Markov models, sequence alignment etc to provide a mapping to a secondary database (Prosite, Blocks, ProDom, Prints, Pfam etc.).
operation
bioinformatics
Protein sequence classification
Motif database search
operations
bioinformatics
Screen a sequence against a motif or pattern database.
beta12orEarlier
edam
operation
Sequence profile database search
operations
Search a database of sequence profiles with a query sequence.
operation
bioinformatics
edam
beta12orEarlier
Transmembrane protein database search
true
edam
operation
bioinformatics
beta12orEarlier
operations
Search a database of transmembrane proteins, for example for sequence or structural similarities.
beta12orEarlier
Sequence retrieval (by code)
edam
beta12orEarlier
operations
operation
bioinformatics
Query a database and retrieve sequences with a given entry code or accession number.
Sequence retrieval (by keyword)
edam
beta12orEarlier
bioinformatics
operations
operation
Query a database and retrieve sequences containing a given keyword.
Sequence database search (by sequence)
beta12orEarlier
Search a sequence database and retrieve sequences that are similar to a query sequence.
operations
bioinformatics
Sequence similarity search
edam
operation
Sequence database search (by motif or pattern)
operation
operations
bioinformatics
edam
beta12orEarlier
Search a sequence database and retrieve sequences matching a given sequence motif or pattern, such as a Prosite pattern or regular expression.
Sequence database search (by amino acid composition)
beta12orEarlier
operation
operations
Search a sequence database and retrieve sequences of a given amino acid composition.
edam
bioinformatics
Sequence database search (by physicochemical property)
operation
beta12orEarlier
Search a sequence database and retrieve sequences with a specified physicochemical property.
operations
edam
bioinformatics
Sequence database search (by sequence using word-based methods)
operation
bioinformatics
Search a sequence database and retrieve sequences that are similar to a query sequence using a word-based method.
operations
Sequence similarity search (word-based methods)
Word-based methods (for example BLAST, gapped BLAST, MEGABLAST, WU-BLAST etc.) are usually quicker than alignment-based methods. They may or may not handle gaps.
edam
beta12orEarlier
Sequence database search (by sequence using profile-based methods)
operations
This includes tools based on PSI-BLAST.
beta12orEarlier
bioinformatics
Search a sequence database and retrieve sequences that are similar to a query sequence using a sequence profile-based method, or with a supplied profile as query.
Sequence similarity search (profile-based methods)
edam
operation
Sequence database search (by sequence using local alignment-based methods)
This includes tools based on the Smith-Waterman algorithm or FASTA.
Sequence similarity search (local alignment-based methods)
operation
beta12orEarlier
bioinformatics
operations
Search a sequence database for sequences that are similar to a query sequence using a local alignment-based method.
edam
Sequence database search (by sequence using global alignment-based methods)
edam
bioinformatics
Sequence similarity search (global alignment-based methods)
beta12orEarlier
operations
Search sequence(s) or a sequence database for sequences that are similar to a query sequence using a global alignment-based method.
This includes tools based on the Needleman and Wunsch algorithm.
operation
Sequence database search (by sequence for primer sequences)
edam
Sequence similarity search (primer sequences)
operation
operations
STSs are genetic markers that are easily detected by the polymerase chain reaction (PCR) using specific primers.
bioinformatics
Search a DNA database (for example a database of conserved sequence tags) for matches to Sequence-Tagged Site (STS) primer sequences.
beta12orEarlier
Sequence database search (by molecular weight)
beta12orEarlier
edam
Peptide mass fingerprinting
bioinformatics
operations
Protein fingerprinting
operation
Search sequence(s) or a sequence database for sequences which match a set of peptide masses, for example a peptide mass fingerprint from mass spectrometry.
Sequence database search (by isoelectric point)
bioinformatics
edam
operation
Search sequence(s) or a sequence database for sequences of a given isoelectric point.
operations
beta12orEarlier
Structure retrieval (by code)
beta12orEarlier
operation
bioinformatics
Query a tertiary structure database and retrieve entries with a given entry code or accession number.
operations
edam
Structure retrieval (by keyword)
beta12orEarlier
operation
operations
Query a tertiary structure database and retrieve entries containing a given keyword.
edam
bioinformatics
Structure database search (by sequence)
operation
Search a tertiary structure database and retrieve structures with a sequence similar to a query sequence.
operations
bioinformatics
Structure retrieval by sequence
edam
beta12orEarlier
Structure database search (by structure)
operation
Structural similarity search
operations
Structure retrieval by structure
bioinformatics
edam
Search a tertiary structure database and retrieve structures that are similar to a query structure.
beta12orEarlier
Sequence annotation
Annotate a molecular sequence record with terms from a controlled vocabulary.
edam
beta12orEarlier
operations
bioinformatics
operation
Genome annotation
Annotate a genome sequence with terms from a controlled vocabulary.
operations
bioinformatics
beta12orEarlier
edam
operation
Nucleic acid sequence reverse and complement
Generate the reverse and / or complement of a nucleotide sequence.
operation
beta12orEarlier
operations
edam
bioinformatics
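A minimal sketch of this operation for unambiguous DNA bases. The function name `reverse_complement` is an assumption for the example; IUPAC ambiguity codes and RNA are not handled and pass through unchanged.

```python
# Translation table mapping each DNA base to its complement (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.translate(_COMPLEMENT)[::-1]
```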
Random sequence generation
beta12orEarlier
operations
operation
bioinformatics
Generate a random sequence, for example, with a specific character composition.
edam
Nucleic acid restriction digest
operations
edam
Generate digest fragments for a nucleotide sequence containing restriction sites.
bioinformatics
operation
beta12orEarlier
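The sketch below illustrates a restriction digest on the top strand of a linear sequence. The function name `digest` and its `cut_offset` parameter (the number of bases into the recognition site at which the strand is cut) are assumptions for the example; overhangs on the opposite strand and circular molecules are not modelled.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[str]:
    """Split a linear sequence at every occurrence of a recognition site."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset          # position of the cut within the sequence
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)   # look for the next site
    fragments.append(seq[start:])       # remainder after the last cut
    return fragments
```

For EcoRI, which recognises GAATTC and cuts after the first G, the call would be `digest(seq, "GAATTC", 1)`.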
Protein sequence cleavage
edam
Cleave a protein sequence into peptide fragments (by enzymatic or chemical cleavage) and calculate the fragment masses.
bioinformatics
beta12orEarlier
operations
operation
Sequence mutation and randomization
operations
edam
operation
bioinformatics
Mutate a molecular sequence by a specified amount or shuffle it to produce a randomized sequence with the same overall composition.
beta12orEarlier
Sequence masking
bioinformatics
operations
beta12orEarlier
edam
operation
For example, SNPs or repeats in a DNA sequence might be masked.
Mask characters in a molecular sequence (replacing those characters with a mask character).
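A minimal sketch of masking a region of a sequence, assuming 0-based, half-open coordinates and "N" as the default mask character. The function name `mask_region` is hypothetical.

```python
def mask_region(seq: str, start: int, end: int, mask_char: str = "N") -> str:
    """Replace seq[start:end] with the mask character, keeping the length unchanged."""
    if not 0 <= start <= end <= len(seq):
        raise ValueError("region must lie within the sequence")
    return seq[:start] + mask_char * (end - start) + seq[end:]
```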
Sequence cutting
operation
beta12orEarlier
operations
edam
bioinformatics
Cut (remove) characters or a region from a molecular sequence.
Restriction site creation
beta12orEarlier
operation
bioinformatics
edam
Create (or remove) restriction sites in sequences, for example using silent mutations.
operations
DNA translation
operation
Translate a DNA sequence into protein.
operations
beta12orEarlier
edam
bioinformatics
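As an illustration, the sketch below translates a DNA sequence in frame 0 using the standard genetic code. The names `CODON_TABLE` and `translate` are assumptions for the example; alternative genetic codes and other reading frames are not covered, and codons containing unrecognised characters are reported as "X".

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    seq = dna.upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)
```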
DNA transcription
Transcribe a nucleotide sequence into mRNA sequence(s).
operation
edam
beta12orEarlier
bioinformatics
operations
Sequence composition calculation (nucleic acid)
beta12orEarlier
operations
edam
bioinformatics
Calculate base frequency or word composition of a nucleotide sequence.
operation
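A minimal sketch of both calculations named in this definition: base frequency and word (k-mer) composition. The function names are hypothetical.

```python
from collections import Counter


def base_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each character in a non-empty sequence."""
    counts = Counter(seq.upper())
    return {base: n / len(seq) for base, n in counts.items()}


def word_counts(seq: str, k: int) -> Counter:
    """Counts of every overlapping word of length k."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
```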
Sequence composition calculation (protein)
bioinformatics
beta12orEarlier
Calculate amino acid frequency or word composition of a protein sequence.
edam
operation
operations
Repeat sequence detection
operations
Find (and possibly render) short repetitive subsequences (repeat sequences) in (typically nucleotide) sequences.
bioinformatics
operation
edam
beta12orEarlier
Repeat sequence organisation analysis
Analyse repeat sequence organization such as periodicity.
beta12orEarlier
edam
operations
bioinformatics
operation
Protein hydropathy calculation (from structure)
edam
beta12orEarlier
operations
Analyse the hydrophobic, hydrophilic or charge properties of a protein structure.
bioinformatics
operation
Protein solvent accessibility calculation
Calculate solvent accessible or buried surface areas in protein structures.
bioinformatics
edam
operation
beta12orEarlier
operations
Protein hydropathy cluster calculation
operation
Identify clusters of hydrophobic or charged residues in a protein structure.
bioinformatics
beta12orEarlier
operations
edam
Protein dipole moment calculation
edam
Calculate whether a protein structure has an unusually large net charge (dipole moment).
bioinformatics
operation
beta12orEarlier
operations
Protein surface and interior calculation
Identify the protein surface and interior, surface accessible pockets, interior inaccessible cavities etc.
bioinformatics
beta12orEarlier
edam
operations
operation
Binding site prediction (from structure)
edam
operation
Ligand-binding and active site prediction (from structure)
bioinformatics
beta12orEarlier
Identify or predict catalytic residues, active sites or other ligand-binding sites in protein structures.
operations
Protein-nucleic acid binding site analysis
beta12orEarlier
edam
operation
operations
bioinformatics
Analyse RNA or DNA-binding sites in protein structure.
Protein peeling
edam
operation
bioinformatics
Decompose a structure into compact or globular fragments (protein peeling).
operations
beta12orEarlier
Protein distance matrix calculation
operations
edam
Calculate a matrix of distances between residues (for example the C-alpha atoms) in a protein structure.
operation
bioinformatics
beta12orEarlier
Protein contact map calculation
operations
Calculate a residue contact map (typically all-versus-all inter-residue contacts) for a protein structure.
operation
bioinformatics
edam
beta12orEarlier
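The sketch below derives a residue distance matrix and a contact map from one representative coordinate per residue (for example the C-alpha atom). The function names and the 8 Å cutoff are assumptions chosen for illustration; EDAM does not prescribe a threshold.

```python
import math

Coord = tuple[float, float, float]


def residue_distance_matrix(coords: list[Coord]) -> list[list[float]]:
    """All-versus-all Euclidean distances between residue coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(coords: list[Coord], cutoff: float = 8.0) -> list[list[bool]]:
    """True where two residues lie within the cutoff distance of each other."""
    return [[d <= cutoff for d in row] for row in residue_distance_matrix(coords)]
```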
Protein residue cluster calculation
beta12orEarlier
edam
Calculate clusters of contacting residues in protein structures.
operations
Clusters of contacting residues might be key structural residues.
operation
bioinformatics
Hydrogen bond calculation
Identify potential hydrogen bonds between amino acids and other groups.
operations
edam
WHATIF:ShowHydrogenBonds
operation
WHATIF:ShowHydrogenBondsM
bioinformatics
beta12orEarlier
The output might include the atoms involved in the bond, bond geometric parameters and bond enthalpy.
WHATIF:HasHydrogenBonds
Residue non-canonical interaction detection
Calculate non-canonical atomic interactions in protein structures.
bioinformatics
operations
beta12orEarlier
edam
operation
Ramachandran plot calculation
bioinformatics
operations
operation
beta12orEarlier
Calculate a Ramachandran plot of a protein structure.
edam
Ramachandran plot evaluation
operation
Analyse (typically to validate) a Ramachandran plot of a protein structure.
edam
operations
beta12orEarlier
bioinformatics
Protein molecular weight calculation
operations
beta12orEarlier
edam
Calculate the molecular weight of a protein sequence or fragments.
operation
bioinformatics
Protein extinction coefficient calculation
Predict extinction coefficients or optical density of a protein sequence.
operations
operation
edam
bioinformatics
beta12orEarlier
Protein pH-dependent property calculation
beta12orEarlier
Calculate pH-dependent properties from pKa calculations of a protein sequence.
operation
bioinformatics
edam
operations
Protein hydropathy calculation (from sequence)
edam
beta12orEarlier
operations
Hydropathy calculation on a protein sequence.
operation
bioinformatics
Protein titration curve plotting
bioinformatics
Plot a protein titration curve.
edam
operation
operations
beta12orEarlier
Protein isoelectric point calculation
operations
Calculate isoelectric point of a protein sequence.
bioinformatics
beta12orEarlier
operation
edam
Protein hydrogen exchange rate calculation
operation
beta12orEarlier
bioinformatics
operations
edam
Estimate hydrogen exchange rate of a protein sequence.
Protein hydrophobic region calculation
beta12orEarlier
Calculate hydrophobic or hydrophilic / charged regions of a protein sequence.
edam
bioinformatics
operations
operation
Protein aliphatic index calculation
edam
bioinformatics
operations
Calculate aliphatic index (relative volume occupied by aliphatic side chains) of a protein.
beta12orEarlier
operation
Protein hydrophobic moment plotting
operations
edam
Calculate the hydrophobic moment of a peptide sequence and recognize amphiphilicity.
operation
bioinformatics
Hydrophobic moment is a peptide's hydrophobicity measured for different angles of rotation.
beta12orEarlier
Protein globularity prediction
edam
operation
operations
bioinformatics
beta12orEarlier
Predict the stability or globularity of a protein sequence, whether it is intrinsically unfolded etc.
Protein solubility prediction
edam
operation
operations
bioinformatics
beta12orEarlier
Predict the solubility or atomic solvation energy of a protein sequence.
Protein crystallizability prediction
operations
Predict crystallizability of a protein sequence.
operation
edam
beta12orEarlier
bioinformatics
Protein signal peptide detection (eukaryotes)
Detect or predict signal peptides (and typically predict subcellular localization) of eukaryotic proteins.
operation
operations
bioinformatics
edam
beta12orEarlier
Protein signal peptide detection (bacteria)
beta12orEarlier
Detect or predict signal peptides (and typically predict subcellular localization) of bacterial proteins.
operations
bioinformatics
operation
edam
MHC peptide immunogenicity prediction
beta12orEarlier
bioinformatics
operation
operations
edam
Predict MHC class I or class II binding peptides, promiscuous binding peptides, immunogenicity etc.
Protein feature prediction (from sequence)
edam
bioinformatics
Predict, recognise and identify positional features in protein sequences such as functional sites or regions and secondary structure.
operation
Sequence feature detection (protein)
beta12orEarlier
Methods typically involve scanning for known motifs, patterns and regular expressions.
operations
Nucleic acid feature prediction
operation
edam
operations
beta12orEarlier
bioinformatics
Methods typically involve scanning for known motifs, patterns and regular expressions.
Predict, recognise and identify features in nucleotide sequences such as functional sites or regions, typically by scanning for known motifs, patterns and regular expressions.
Sequence feature detection (nucleic acid)
Epitope mapping
operation
beta12orEarlier
Epitope mapping is commonly done during vaccine design.
operations
Predict antigenic determinant sites (epitopes) in protein sequences.
edam
bioinformatics
Protein post-translation modification site prediction
Methods might predict sites of methylation, N-terminal myristoylation, N-terminal acetylation, sumoylation, palmitoylation, phosphorylation, sulfation, glycosylation, glycosylphosphatidylinositol (GPI) modification sites (GPI lipid anchor signals) etc.
operations
operation
Predict post-translation modification sites in protein sequences.
beta12orEarlier
edam
bioinformatics
Protein signal peptide detection
operations
beta12orEarlier
bioinformatics
operation
Detect or predict signal peptides and signal peptide cleavage sites in protein sequences.
Methods might use sequence motifs and features, amino acid composition, profiles, machine-learned classifiers, etc.
edam
Binding site prediction (from sequence)
operations
beta12orEarlier
edam
Ligand-binding and active site prediction (from sequence)
Predict catalytic residues, active sites or other ligand-binding sites in protein sequences.
bioinformatics
operation
Protein-nucleic acid binding prediction
operation
bioinformatics
Predict RNA and DNA-binding sites in protein sequences.
operations
beta12orEarlier
edam
Protein folding site prediction
beta12orEarlier
Predict protein sites that are key to protein folding, such as possible sites of nucleation or stabilization.
operations
bioinformatics
operation
edam
Protein cleavage site prediction
operation
Detect or predict cleavage sites (enzymatic or chemical) in protein sequences.
beta12orEarlier
operations
bioinformatics
edam
Epitope mapping (MHC Class I)
operation
beta12orEarlier
Predict epitopes that bind to MHC class I molecules.
edam
operations
bioinformatics
Epitope mapping (MHC Class II)
bioinformatics
operations
beta12orEarlier
operation
edam
Predict epitopes that bind to MHC class II molecules.
Whole gene prediction
operation
edam
Detect, predict and identify whole gene structure in DNA sequences. This includes protein coding regions, exon-intron structure, regulatory regions etc.
operations
bioinformatics
beta12orEarlier
Gene component prediction
operations
Detect, predict and identify genetic elements such as promoters, coding regions, splice sites, etc. in DNA sequences.
bioinformatics
beta12orEarlier
edam
Methods for gene prediction might be ab initio, based on phylogenetic comparisons, use motifs, sequence features, support vector machine, alignment etc.
operation
Transposon prediction
edam
bioinformatics
Detect or predict transposons, retrotransposons / retrotransposition signatures etc.
operations
operation
beta12orEarlier
PolyA signal detection
beta12orEarlier
bioinformatics
operation
edam
operations
Detect polyA signals in nucleotide sequences.
Quadruplex formation site detection
Quadruplex structure prediction
beta12orEarlier
edam
operation
Quadruplex (4-stranded) structures are formed by guanine-rich regions and are implicated in various important biological processes and as therapeutic targets.
operations
Detect quadruplex-forming motifs in nucleotide sequences.
bioinformatics
CpG island and isochore detection
CpG island and isochores rendering
operations
beta12orEarlier
operation
CpG island and isochores detection
Find CpG rich regions in a nucleotide sequence or isochores in genome sequences.
An isochore is a long region (> 3 KB) of DNA with very uniform GC content, in contrast to the rest of the genome. Isochores tend to have more genes, higher local melting or denaturation temperatures, and different flexibility. Methods might calculate fractional GC content or variation of GC content, predict methylation status of CpG islands etc. This includes methods that visualise CpG rich regions in a nucleotide sequence, for example plot isochores in a genome sequence.
bioinformatics
edam
Restriction site recognition
Find and identify restriction enzyme cleavage sites (restriction sites) in (typically) DNA sequences, for example to generate a restriction map.
beta12orEarlier
edam
operations
bioinformatics
operation
Nucleosome formation or exclusion sequence prediction
Identify or predict nucleosome exclusion sequences (nucleosome free regions) in DNA.
beta12orEarlier
edam
operation
bioinformatics
operations
Splice site prediction
operation
Methods might require a pre-mRNA or genomic DNA sequence.
Identify, predict or analyse splice sites in nucleotide sequences.
edam
bioinformatics
operations
beta12orEarlier
Integrated gene prediction
operation
bioinformatics
beta12orEarlier
Predict whole gene structure using a combination of multiple methods to achieve better predictions.
edam
operations
Operon prediction
operations
bioinformatics
operation
beta12orEarlier
Find operons (operators, promoters and genes) in bacterial genes.
edam
Coding region prediction
beta12orEarlier
bioinformatics
operations
operation
Predict protein-coding regions (CDS or exon) or open reading frames in nucleotide sequences.
edam
Selenocysteine insertion sequence (SECIS) prediction
edam
beta12orEarlier
SECIS elements are around 60 nucleotides in length, with a stem-loop structure that directs the cell to translate UGA codons as selenocysteines.
Predict selenocysteine insertion sequence (SECIS) in a DNA sequence.
operation
operations
bioinformatics
Transcription regulatory element prediction
This includes promoters, enhancers, silencers and boundary elements / insulators, regulatory protein or transcription factor binding sites etc. Methods might be specific to a particular genome and use motifs, word-based / grammatical methods, position-specific frequency matrices, discriminative pattern analysis etc.
operation
beta12orEarlier
operations
edam
bioinformatics
Identify or predict transcription regulatory motifs, patterns, elements or regions in DNA sequences.
Translation initiation site prediction
bioinformatics
edam
operation
operations
Predict translation initiation sites, possibly by searching a database of sites.
beta12orEarlier
Promoter prediction
edam
beta12orEarlier
operations
Methods might recognize CG content, CpG islands, splice sites, polyA signals etc.
bioinformatics
Identify or predict whole promoters or promoter elements (transcription start sites, RNA polymerase binding site, transcription factor binding sites, promoter enhancers etc) in DNA sequences.
operation
Transcription regulatory element prediction (DNA-cis)
Identify, predict or analyse cis-regulatory elements (TATA box, Pribnow box, SOS box, CAAT box, CCAAT box, operator etc.) in DNA sequences.
edam
Cis-regulatory elements (cis-elements) regulate the expression of genes located on the same strand. Cis-elements are found in the 5' promoter region of the gene, in an intron, or in the 3' untranslated region. Cis-elements are often binding sites of one or more trans-acting factors.
operation
beta12orEarlier
operations
bioinformatics
Transcription regulatory element prediction (RNA-cis)
beta12orEarlier
operation
operations
bioinformatics
edam
Identify, predict or analyse cis-regulatory elements (for example riboswitches) in RNA sequences.
Cis-regulatory elements (cis-elements) regulate genes located on the same strand from which the element was transcribed. A riboswitch is a region of an mRNA molecule that binds a small target molecule that regulates the gene's activity.
Transcription regulatory element prediction (trans)
operation
Functional RNA identification
bioinformatics
operations
Identify or predict functional RNA sequences with a gene regulatory role (trans-regulatory elements) or targets.
Trans-regulatory elements regulate genes distant from the gene from which they were transcribed.
edam
beta12orEarlier
Matrix/scaffold attachment site prediction
beta12orEarlier
operation
Identify matrix/scaffold attachment regions (MARs/SARs) in DNA sequences.
operations
bioinformatics
edam
MAR/SAR sites often flank a gene or gene cluster and are found near cis-regulatory sequences. They might contribute to transcription regulation.
Transcription factor binding site prediction
bioinformatics
Identify or predict transcription factor binding sites in DNA sequences.
operation
operations
beta12orEarlier
edam
Exonic splicing enhancer prediction
edam
beta12orEarlier
bioinformatics
operations
operation
Identify or predict exonic splicing enhancers (ESE) in exons.
An exonic splicing enhancer (ESE) is a 6-base DNA sequence motif in an exon that enhances or directs splicing of pre-mRNA or heterogeneous nuclear RNA (hnRNA) into mRNA.
Sequence alignment quality evaluation
beta12orEarlier
operations
edam
Evaluate molecular sequence alignment accuracy.
operation
bioinformatics
Evaluation might be purely sequence-based or use structural information.
Sequence alignment analysis (conservation)
operations
operation
Use this concept for methods that calculate substitution rates, estimate relative site variability, identify sites with biased properties, derive a consensus sequence, or identify highly conserved or very poorly conserved sites, regions, blocks etc.
edam
bioinformatics
beta12orEarlier
Analyse character conservation in a molecular sequence alignment, for example to derive a consensus sequence.
Sequence alignment analysis (site correlation)
bioinformatics
beta12orEarlier
operations
edam
operation
This is typically done to identify possible covarying positions and predict contacts or structural constraints in protein structures.
Analyse correlations between sites in a molecular sequence alignment.
Sequence alignment analysis (chimeric sequence detection)
operation
Detect chimeric sequences (chimeras) from a sequence alignment.
edam
operations
bioinformatics
A chimera includes regions from two or more phylogenetically distinct sequences. They are usually artifacts of PCR and are thought to occur when a prematurely terminated amplicon reanneals to another DNA strand and is subsequently copied to completion in later PCR cycles.
beta12orEarlier
Sequence alignment analysis (recombination detection)
operations
bioinformatics
Tools might use a genetic algorithm, quartet-mapping, bootscanning, graphical methods, random forest model and so on.
Detect recombination (hotspots and coldspots) and identify recombination breakpoints in a sequence alignment.
operation
beta12orEarlier
edam
Sequence alignment analysis (indel detection)
Identify insertion, deletion and duplication events from a sequence alignment.
Tools might use a genetic algorithm, quartet-mapping, bootscanning, graphical methods, random forest model and so on.
bioinformatics
operations
beta12orEarlier
operation
edam
Nucleosome formation potential prediction
true
operation
operations
beta12orEarlier
Predict nucleosome formation potential of DNA sequences.
edam
beta12orEarlier
bioinformatics
Nucleic acid thermodynamic property calculation
beta12orEarlier
operations
edam
Calculate a thermodynamic property of DNA or DNA/RNA, such as melting temperature, enthalpy and entropy.
operation
bioinformatics
Nucleic acid melting profile plotting
operation
operations
bioinformatics
A melting profile is used to visualise and analyse partly melted DNA conformations.
beta12orEarlier
Calculate and plot a DNA or DNA/RNA melting profile.
edam
Nucleic acid stitch profile plotting
edam
Calculate and plot a DNA or DNA/RNA stitch profile.
operations
bioinformatics
beta12orEarlier
operation
A stitch profile represents the alternative conformations that partly melted DNA can adopt in a temperature range.
Nucleic acid melting curve plotting
bioinformatics
edam
Calculate and plot a DNA or DNA/RNA melting curve.
beta12orEarlier
operations
operation
Nucleic acid probability profile plotting
Calculate and plot a DNA or DNA/RNA probability profile.
edam
bioinformatics
operation
operations
beta12orEarlier
Nucleic acid temperature profile plotting
operations
bioinformatics
edam
beta12orEarlier
operation
Calculate and plot a DNA or DNA/RNA temperature profile.
Nucleic acid curvature calculation
bioinformatics
operation
This includes properties such as.
beta12orEarlier
edam
Calculate curvature and flexibility / stiffness of a nucleotide sequence.
operations
microRNA detection
beta12orEarlier
Identify or predict microRNA sequences (miRNA) and precursors or microRNA targets / binding sites in an RNA sequence.
edam
bioinformatics
operation
operations
tRNA gene prediction
edam
beta12orEarlier
operations
operation
Identify or predict tRNA genes in genomic sequences (tRNA).
bioinformatics
siRNA binding specificity prediction
edam
operations
operation
beta12orEarlier
Assess binding specificity of putative siRNA sequence(s), for example for a functional assay, typically with respect to designing specific siRNA sequences.
bioinformatics
Protein secondary structure prediction (integrated)
beta12orEarlier
bioinformatics
operations
operation
edam
Predict secondary structure of protein sequence(s) using multiple methods to achieve better predictions.
Protein secondary structure prediction (helices)
Predict helical secondary structure of protein sequences.
beta12orEarlier
bioinformatics
operations
operation
edam
Protein secondary structure prediction (turns)
beta12orEarlier
operations
edam
Predict turn structure (for example beta hairpin turns) of protein sequences.
bioinformatics
operation
Protein secondary structure prediction (coils)
edam
bioinformatics
Predict open coils, non-regular secondary structure and intrinsically disordered / unstructured regions of protein sequences.
operations
operation
beta12orEarlier
Protein secondary structure prediction (disulfide bonds)
Predict cysteine bonding state and disulfide bond partners in protein sequences.
edam
operation
operations
beta12orEarlier
bioinformatics
GPCR prediction
Predict G protein-coupled receptors (GPCR).
edam
G protein-coupled receptor (GPCR) prediction
beta12orEarlier
bioinformatics
operations
operation
GPCR analysis
operation
G protein-coupled receptor (GPCR) analysis
bioinformatics
beta12orEarlier
operations
Analyse G-protein coupled receptor proteins (GPCRs).
edam
Protein structure prediction
Predict tertiary structure (backbone and side-chain conformation) of protein sequences.
operation
operations
edam
bioinformatics
beta12orEarlier
Nucleic acid structure prediction
operations
Methods might identify thermodynamically stable or evolutionarily conserved structures.
operation
bioinformatics
Predict tertiary structure of DNA or RNA.
beta12orEarlier
edam
Ab initio structure prediction
bioinformatics
beta12orEarlier
edam
operation
Predict tertiary structure of protein sequence(s) without homologs of known structure.
operations
Protein modelling
Protein structure comparative modelling
Homology structure modelling
The model might be of a whole, part or aspect of protein structure. Molecular modelling methods might use sequence-structure alignment, structural templates, molecular dynamics, energy minimization etc.
Homology modelling
bioinformatics
Build a three-dimensional protein model based on known structures (for example, those of homologs).
beta12orEarlier
operation
Comparative modelling
edam
operations
Protein docking
edam
bioinformatics
This includes protein-protein interactions, protein-nucleic acid, protein-ligand binding etc. Methods might predict whether the molecules are likely to bind in vivo, their conformation when bound, the strength of the interaction, possible mutations to achieve bonding and so on.
operation
beta12orEarlier
Model the structure of a protein in complex with a small molecule or another macromolecule.
operations
Protein modelling (backbone)
operations
edam
beta12orEarlier
Methods might require a preliminary C(alpha) trace.
operation
Model protein backbone conformation.
bioinformatics
Protein modelling (side chains)
Model, analyse or edit amino acid side chain conformation in protein structure, optimize side-chain packing, hydrogen bonding etc.
bioinformatics
edam
operation
Methods might use a residue rotamer library.
operations
beta12orEarlier
Protein modelling (loops)
Model loop conformation in protein structures.
edam
operations
bioinformatics
operation
beta12orEarlier
Protein-ligand docking
edam
bioinformatics
Virtual ligand screening
Methods aim to predict the position and orientation of a ligand bound to a protein receptor or enzyme.
operations
beta12orEarlier
Model protein-ligand (for example protein-peptide) binding using comparative modelling or other techniques.
operation
Structured RNA prediction and optimisation
operations
bioinformatics
Predict or optimise RNA sequences (sequence pools) with likely secondary and tertiary structure for in vitro selection.
edam
beta12orEarlier
RNA inverse folding
operation
Nucleic acid folding family identification
SNP detection
edam
operations
operation
bioinformatics
This includes functional SNPs for large-scale genotyping purposes, disease-associated non-synonymous SNPs etc.
beta12orEarlier
Single nucleotide polymorphism detection
Find single nucleotide polymorphisms (SNPs) between sequences.
Radiation Hybrid Mapping
operations
operation
beta12orEarlier
Generate a physical (radiation hybrid) map of genetic markers in a DNA sequence using provided radiation hybrid (RH) scores for one or more markers.
bioinformatics
edam
Functional mapping
true
beta12orEarlier
operations
This can involve characterization of the underlying quantitative trait loci (QTLs) or nucleotides (QTNs).
beta12orEarlier
operation
edam
Map the genetic architecture of dynamic complex traits.
bioinformatics
Haplotype inference
Haplotype mapping
bioinformatics
Haplotype reconstruction
edam
Haplotype inference can help in population genetic studies and the identification of complex disease genes, and is typically based on aligned single nucleotide polymorphism (SNP) fragments. Haplotype comparison is a useful way to characterize the genetic variation between individuals. An individual's haplotype describes which nucleotide base occurs at each position for a set of common SNPs. Tools might use combinatorial functions (for example parsimony) or a likelihood function or model with optimization such as minimum error correction (MEC) model, expectation-maximization algorithm (EM), genetic algorithm or Markov chain Monte Carlo (MCMC).
beta12orEarlier
Infer haplotypes, either alleles at multiple loci that are transmitted together on the same chromosome, or a set of single nucleotide polymorphisms (SNPs) on a single chromatid that are statistically associated.
operation
operations
Linkage disequilibrium calculation
Calculate linkage disequilibrium; the non-random association of alleles or polymorphisms at two or more loci (not necessarily on the same chromosome).
operations
operation
Linkage disequilibrium is identified where a combination of alleles (or genetic markers) occurs more or less frequently in a population than expected by chance formation of haplotypes.
edam
bioinformatics
beta12orEarlier
Genetic code prediction
operations
bioinformatics
Predict genetic code from analysis of codon usage data.
edam
beta12orEarlier
operation
Dotplot plotting
bioinformatics
operations
edam
Draw a dotplot of sequence similarities identified from word-matching or character comparison.
beta12orEarlier
operation
Pairwise sequence alignment construction
bioinformatics
beta12orEarlier
edam
Methods might perform one-to-one, one-to-many or many-to-many comparisons.
Pairwise sequence alignment
Align exactly two molecular sequences.
operations
operation
Multiple sequence alignment construction
beta12orEarlier
Multiple sequence alignment
operations
This includes methods that use an existing alignment, for example to incorporate sequences into an alignment, or combine several multiple alignments into a single, improved alignment.
Align two or more molecular sequences.
edam
bioinformatics
operation
Pairwise sequence alignment construction (local)
Local alignment methods identify regions of local similarity.
bioinformatics
operations
operation
Pairwise sequence alignment (local)
Locally align exactly two molecular sequences.
Local pairwise sequence alignment construction
beta12orEarlier
edam
Pairwise sequence alignment construction (global)
operations
Pairwise sequence alignment (global)
bioinformatics
beta12orEarlier
Global alignment methods identify similarity across the entire length of the sequences.
Globally align exactly two molecular sequences.
operation
Global pairwise sequence alignment construction
edam
Multiple sequence alignment construction (local)
Locally align two or more molecular sequences.
operation
edam
Local alignment methods identify regions of local similarity.
beta12orEarlier
bioinformatics
Local multiple sequence alignment construction
operations
Multiple sequence alignment (local)
Multiple sequence alignment construction (global)
edam
Global alignment methods identify similarity across the entire length of the sequences.
Globally align two or more molecular sequences.
operations
beta12orEarlier
Multiple sequence alignment (global)
Global multiple sequence alignment construction
operation
bioinformatics
Multiple sequence alignment construction (constrained)
operation
Constrained multiple sequence alignment construction
Multiple sequence alignment (constrained)
edam
bioinformatics
operations
Align two or more molecular sequences with user-defined constraints.
beta12orEarlier
Multiple sequence alignment construction (consensus)
operation
Multiple sequence alignment (consensus)
Align two or more molecular sequences using multiple methods to achieve higher quality.
beta12orEarlier
Consensus multiple sequence alignment construction
bioinformatics
operations
edam
Multiple sequence alignment construction (phylogenetic tree-based)
Multiple sequence alignment (phylogenetic tree-based)
operations
bioinformatics
Align multiple sequences using relative gap costs calculated from neighbors in a supplied phylogenetic tree.
beta12orEarlier
Phylogenetic tree-based multiple sequence alignment construction
This is supposed to give a more biologically meaningful alignment than standard alignments.
operation
edam
Secondary structure alignment construction
Align molecular secondary structure (represented as a 1D string).
edam
beta12orEarlier
operations
Secondary structure alignment
operation
bioinformatics
Protein secondary structure alignment construction
Secondary structure alignment (protein)
bioinformatics
Align protein secondary structures.
operation
Protein secondary structure alignment
beta12orEarlier
edam
operations
RNA secondary structure alignment construction
operation
beta12orEarlier
Secondary structure alignment (RNA)
bioinformatics
edam
Align RNA secondary structures.
RNA secondary structure alignment
operations
Pairwise structure alignment construction
Align (superimpose) exactly two molecular tertiary structures.
bioinformatics
Pairwise structure alignment
operation
operations
edam
beta12orEarlier
Multiple structure alignment construction
Multiple structure alignment
beta12orEarlier
operations
edam
This includes methods that use an existing alignment.
bioinformatics
operation
Align (superimpose) two or more molecular tertiary structures.
Structure alignment (protein)
true
operation
Align protein tertiary structures.
edam
operations
beta12orEarlier
bioinformatics
beta13
Structure alignment (RNA)
true
beta12orEarlier
operation
Align RNA tertiary structures.
edam
bioinformatics
operations
beta13
Pairwise structure alignment construction (local)
edam
bioinformatics
operations
operation
beta12orEarlier
Local pairwise structure alignment construction
Local alignment methods identify regions of local similarity, common substructures etc.
Pairwise structure alignment (local)
Locally align (superimpose) exactly two molecular tertiary structures.
Pairwise structure alignment construction (global)
Pairwise structure alignment (global)
Globally align (superimpose) exactly two molecular tertiary structures.
operations
edam
beta12orEarlier
bioinformatics
operation
Global alignment methods identify similarity across the entire structures.
Global pairwise structure alignment construction
Multiple structure alignment construction (local)
Multiple structure alignment (local)
edam
Locally align (superimpose) two or more molecular tertiary structures.
beta12orEarlier
operations
Local alignment methods identify regions of local similarity, common substructures etc.
bioinformatics
Local multiple structure alignment construction
operation
Multiple structure alignment construction (global)
Globally align (superimpose) two or more molecular tertiary structures.
Global multiple structure alignment construction
Global alignment methods identify similarity across the entire structures.
operation
edam
beta12orEarlier
bioinformatics
Multiple structure alignment (global)
operations
Sequence profile alignment construction (pairwise)
Align exactly two molecular profiles.
beta12orEarlier
operations
Sequence profile alignment (pairwise)
bioinformatics
operation
Pairwise sequence profile alignment construction
edam
Methods might perform one-to-one, one-to-many or many-to-many comparisons.
Sequence profile alignment construction (multiple)
Sequence profile alignment (multiple)
bioinformatics
Multiple sequence profile alignment construction
beta12orEarlier
operation
edam
Align two or more molecular profiles.
operations
Structural (3D) profile alignment construction (pairwise)
beta12orEarlier
bioinformatics
Structural (3D) profile alignment (pairwise)
Methods might perform one-to-one, one-to-many or many-to-many comparisons.
operations
operation
Align exactly two molecular structural (3D) profiles.
edam
Pairwise structural (3D) profile alignment construction
Structural (3D) profile alignment construction (multiple)
bioinformatics
Structural (3D) profile alignment (multiple)
edam
beta12orEarlier
Multiple structural (3D) profile alignment construction
operations
Align two or more molecular 3D profiles.
operation
Data retrieval (tool metadata)
Search and retrieve names of or documentation on bioinformatics tools, for example by keyword or by the function they perform.
Data retrieval (tool annotation)
Tool information retrieval
beta12orEarlier
operations
edam
bioinformatics
operation
Data retrieval (database metadata)
edam
operations
beta12orEarlier
Database information retrieval
Search and retrieve names of or documentation on bioinformatics databases or query terms, for example by keyword.
Data retrieval (database annotation)
operation
bioinformatics
PCR primer design (for large scale sequencing)
bioinformatics
operation
operations
beta12orEarlier
edam
Predict primers for large scale sequencing.
PCR primer design (for genotyping polymorphisms)
operations
edam
operation
Predict primers for genotyping polymorphisms, for example single nucleotide polymorphisms (SNPs).
beta12orEarlier
bioinformatics
PCR primer design (for gene transcription profiling)
operation
edam
bioinformatics
beta12orEarlier
operations
Predict primers for gene transcription profiling.
PCR primer design (for conserved primers)
bioinformatics
operations
Predict primers that are conserved across multiple genomes or species.
beta12orEarlier
operation
edam
PCR primer design (based on gene structure)
operation
beta12orEarlier
bioinformatics
operations
Predict primers based on gene structure, promoters, exon-exon junctions etc.
edam
PCR primer design (for methylation PCRs)
operations
Predict primers for methylation PCRs.
bioinformatics
edam
beta12orEarlier
operation
Sequence assembly (mapping assembly)
edam
The final sequence will resemble the backbone sequence. Mapping assemblers are usually much faster and less memory intensive than de-novo assemblers.
bioinformatics
beta12orEarlier
Sequence assembly by combining fragments using an existing backbone sequence.
operation
operations
Sequence assembly (de-novo assembly)
beta12orEarlier
De-novo assemblers are much slower and more memory intensive than mapping assemblers.
operations
bioinformatics
Sequence assembly by combining fragments into a new, previously unknown sequence.
edam
operation
Sequence assembly (genome assembly)
edam
bioinformatics
operation
beta12orEarlier
Sequence assembly on a very large scale, such as assembly of whole genomes.
operations
Sequence assembly (EST assembly)
Assemblers must handle (or are complicated by) alternative splicing, trans-splicing, single-nucleotide polymorphism (SNP), recoding, and post-transcriptional modification.
operations
Sequence assembly for EST sequences (transcribed mRNA).
bioinformatics
beta12orEarlier
edam
operation
Tag mapping
bioinformatics
beta12orEarlier
Tag mapping might assign experimentally obtained tags to known transcripts or annotate potential virtual tags in a genome.
edam
Make gene to tag assignments (tag mapping) of SAGE, MPSS and SBS data, by annotating tags with ontology concepts.
operation
operations
Tag to gene assignment
SAGE data processing
true
operation
bioinformatics
edam
Serial analysis of gene expression data processing
Process (read and / or write) serial analysis of gene expression (SAGE) data.
operations
beta12orEarlier
beta12orEarlier
MPSS data processing
true
operations
beta12orEarlier
edam
bioinformatics
Massively parallel signature sequencing data processing
beta12orEarlier
operation
Process (read and / or write) massively parallel signature sequencing (MPSS) data.
SBS data processing
true
bioinformatics
operations
edam
operation
Process (read and / or write) sequencing by synthesis (SBS) data.
beta12orEarlier
beta12orEarlier
Sequencing by synthesis data processing
Heat map generation
Generate a heat map of gene expression from microarray data.
Heat maps usually use a coloring scheme to represent clusters. They can show how expression of mRNA by a set of genes was influenced by experimental conditions.
beta12orEarlier
operations
bioinformatics
operation
edam
Gene expression profile analysis
edam
operation
Analyse one or more gene expression profiles, typically to interpret them in functional terms.
bioinformatics
beta12orEarlier
operations
Functional profiling
Gene expression profile pathway mapping
operations
Map a gene expression profile to known biological pathways, for example, to identify or reconstruct a pathway.
beta12orEarlier
edam
operation
bioinformatics
Protein secondary structure assignment (from coordinate data)
bioinformatics
Assign secondary structure from protein coordinate data.
beta12orEarlier
operation
operations
edam
Protein secondary structure assignment (from CD data)
operation
operations
edam
bioinformatics
beta12orEarlier
Assign secondary structure from circular dichroism (CD) spectroscopic data.
Protein structure assignment (from X-ray crystallographic data)
operations
beta12orEarlier
bioinformatics
operation
edam
Assign a protein tertiary structure (3D coordinates) from raw X-ray crystallography data.
Protein structure assignment (from NMR data)
operation
operations
edam
Assign a protein tertiary structure (3D coordinates) from raw NMR spectroscopy data.
bioinformatics
beta12orEarlier
Phylogenetic tree construction (data centric)
Construct a phylogenetic tree from a specific type of data.
operations
beta12orEarlier
operation
bioinformatics
edam
Phylogenetic tree construction (method centric)
operation
operations
beta12orEarlier
Construct a phylogenetic tree using a specific method.
edam
bioinformatics
Phylogenetic tree construction (from molecular sequences)
Methods typically compare multiple molecular sequences and estimate evolutionary distances and relationships to infer gene families or make functional predictions.
operations
bioinformatics
operation
beta12orEarlier
Phylogenetic tree construction from molecular sequences.
edam
Phylogenetic tree construction (from continuous quantitative characters)
operation
Phylogenetic tree construction from continuous quantitative character data.
bioinformatics
beta12orEarlier
operations
edam
Phylogenetic tree construction (from gene frequencies)
beta12orEarlier
Phylogenetic tree construction from gene frequency data.
bioinformatics
operation
edam
operations
Phylogenetic tree construction (from polymorphism data)
bioinformatics
operation
beta12orEarlier
operations
Phylogenetic tree construction from polymorphism data including microsatellites, RFLP (restriction fragment length polymorphisms), RAPD (random-amplified polymorphic DNA) and AFLP (amplified fragment length polymorphisms) data.
edam
Phylogenetic species tree construction
beta12orEarlier
Construct a phylogenetic species tree, for example, from a genome-wide sequence comparison.
bioinformatics
edam
operations
operation
Phylogenetic tree construction (parsimony methods)
beta12orEarlier
operation
This includes evolutionary parsimony (invariants) methods.
Construct a phylogenetic tree by computing a sequence alignment and searching for the tree with the fewest character-state changes from the alignment.
operations
bioinformatics
edam
Phylogenetic tree construction (minimum distance methods)
Construct a phylogenetic tree by computing (or using precomputed) distances between sequences and searching for the tree with minimal discrepancies between pairwise distances.
edam
beta12orEarlier
bioinformatics
operations
This includes the neighbor joining (NJ) clustering method.
operation
Phylogenetic tree construction (maximum likelihood and Bayesian methods)
edam
beta12orEarlier
operations
operation
bioinformatics
Maximum likelihood methods search for a tree that maximizes a likelihood function, i.e. that is most likely given the data and model. Bayesian analysis estimates the probability of a tree for branch lengths and topology, typically using a Monte Carlo algorithm.
Construct a phylogenetic tree by relating sequence data to a hypothetical tree topology using a model of sequence evolution.
Phylogenetic tree construction (quartet methods)
Construct a phylogenetic tree by computing four-taxon trees (4-trees) and searching for the phylogeny that matches most closely.
operation
edam
operations
beta12orEarlier
bioinformatics
Phylogenetic tree construction (AI methods)
beta12orEarlier
operation
edam
operations
Construct a phylogenetic tree by using artificial-intelligence methods, for example genetic algorithms.
bioinformatics
Sequence alignment analysis (phylogenetic modelling)
bioinformatics
beta12orEarlier
operations
edam
Identify a plausible model of DNA substitution that explains a DNA sequence alignment.
operation
Phylogenetic tree analysis (shape)
beta12orEarlier
Analyse the shape (topology) of a phylogenetic tree.
bioinformatics
edam
operation
Phylogenetic tree topology analysis
operations
Phylogenetic tree bootstrapping
operations
bioinformatics
operation
beta12orEarlier
Apply bootstrapping or other measures to estimate confidence of a phylogenetic tree.
edam
Phylogenetic tree analysis (gene family prediction)
beta12orEarlier
edam
bioinformatics
operation
operations
Predict families of genes and gene function based on their position in a phylogenetic tree.
Phylogenetic tree analysis (natural selection)
bioinformatics
Analyse a phylogenetic tree to identify allele frequency distribution and change that is subject to evolutionary pressures (natural selection, genetic drift, mutation and gene flow). Identify the type of natural selection (such as stabilizing, balancing or disruptive).
Stabilizing/purifying (directional) selection favors a single phenotype and tends to decrease genetic diversity as a population stabilizes on a particular trait, selecting out trait extremes or deleterious mutations. In contrast, balancing selection maintains genetic polymorphisms (or multiple alleles), whereas disruptive (or diversifying) selection favors individuals at both extremes of a trait.
edam
operations
operation
beta12orEarlier
Phylogenetic tree construction (consensus)
edam
operations
bioinformatics
beta12orEarlier
Methods typically test for topological similarity between trees using, for example, a congruence index.
operation
Compare two or more phylogenetic trees to produce a consensus tree.
Phylogenetic sub/super tree detection
Compare two or more phylogenetic trees to detect subtrees or supertrees.
beta12orEarlier
bioinformatics
edam
operation
operations
Phylogenetic tree distances calculation
operations
bioinformatics
edam
operation
Compare two or more phylogenetic trees to calculate distances between trees.
beta12orEarlier
Phylogenetic tree annotation
operation
Annotate a phylogenetic tree with terms from a controlled vocabulary.
operations
bioinformatics
edam
beta12orEarlier
Peptide immunogen prediction and optimisation
Predict and optimise peptide ligands that elicit an immunological response.
beta12orEarlier
operations
edam
bioinformatics
operation
DNA vaccine prediction and optimisation
edam
operation
bioinformatics
beta12orEarlier
operations
Predict or optimise DNA to elicit (via DNA vaccination) an immunological response.
Sequence reformatting
operations
bioinformatics
operation
edam
Reformat (a file or other report of) molecular sequence(s).
beta12orEarlier
Sequence alignment reformatting
operation
beta12orEarlier
Reformat (a file or other report of) molecular sequence alignment(s).
operations
bioinformatics
edam
Codon usage table reformatting
edam
beta12orEarlier
operations
bioinformatics
Reformat a codon usage table.
operation
Sequence rendering
edam
operations
beta12orEarlier
bioinformatics
operation
Visualise, format or render a molecular sequence, possibly with sequence features or properties shown.
Sequence alignment rendering
Visualise, format or print a molecular sequence alignment.
bioinformatics
operations
operation
beta12orEarlier
edam
Sequence cluster rendering
Visualise, format or render sequence clusters.
operation
bioinformatics
edam
operations
beta12orEarlier
Phylogenetic tree rendering
operation
operations
bioinformatics
edam
beta12orEarlier
Visualise or plot a phylogenetic tree.
RNA secondary structure rendering
beta12orEarlier
Visualise RNA secondary structure, knots, pseudoknots etc.
operation
edam
bioinformatics
operations
Protein secondary structure rendering
bioinformatics
operations
edam
beta12orEarlier
Render and visualise protein secondary structure.
operation
Structure rendering
Visualise or render a molecular tertiary structure, for example a high-quality static picture or animation.
beta12orEarlier
bioinformatics
operation
edam
operations
Microarray data rendering
operations
beta12orEarlier
bioinformatics
edam
Visualise microarray data.
operation
Protein interaction network rendering
edam
bioinformatics
Identify and analyse networks of protein interactions.
operation
beta12orEarlier
operations
Map rendering
edam
operation
operations
Render and visualise a DNA map.
DNA map rendering
bioinformatics
beta12orEarlier
Sequence motif rendering
true
beta12orEarlier
edam
beta12orEarlier
bioinformatics
Render a sequence with motifs.
operations
operation
Restriction map rendering
beta12orEarlier
operation
bioinformatics
operations
edam
Visualise restriction maps in DNA sequences.
DNA linear map rendering
true
operation
bioinformatics
edam
Draw a linear map of DNA.
operations
beta12orEarlier
beta12orEarlier
DNA circular map rendering
Draw a circular map of DNA, for example a plasmid map.
bioinformatics
edam
operations
beta12orEarlier
operation
Operon rendering
beta12orEarlier
bioinformatics
edam
operation
operations
Visualise operon structure etc.
Nucleic acid folding family identification
true
operation
bioinformatics
beta12orEarlier
Identify folding families of related RNAs.
edam
beta12orEarlier
operations
Nucleic acid folding energy calculation
edam
operations
Compute energies of nucleic acid folding, e.g. minimum folding energies for DNA or RNA sequences or energy landscape of RNA mutants.
bioinformatics
beta12orEarlier
operation
Annotation retrieval
true
beta12orEarlier
edam
operations
Use this concept for tools which retrieve pre-existing annotations, not, for example, prediction methods that might make annotations.
operation
Retrieve existing annotation (or documentation), typically annotation on a database entity.
beta12orEarlier
bioinformatics
Protein function prediction
operation
For functional properties that can be mapped to a sequence, use 'Sequence feature detection (protein)' instead.
beta12orEarlier
edam
operations
Predict general functional properties of a protein.
bioinformatics
Protein function comparison
bioinformatics
beta12orEarlier
edam
operation
Compare the functional properties of two or more proteins.
operations
Sequence submission
operation
beta12orEarlier
edam
bioinformatics
Submit a molecular sequence to a database.
operations
Gene regulatory network analysis
Analyse a known network of gene regulation.
operations
bioinformatics
edam
operation
beta12orEarlier
Data loading
operation
Data submission
edam
beta12orEarlier
Prepare or load a user-specified data file so that it is available for use.
Database submission
operations
bioinformatics
WHATIF:UploadPDB
Sequence retrieval
beta12orEarlier
operation
operations
This includes direct retrieval methods (e.g. the dbfetch program) but not those that perform calculations on the sequence.
edam
bioinformatics
Query a sequence data resource (typically a database) and retrieve sequences and / or annotation.
Data retrieval (sequences)
Structure retrieval
operation
Query a tertiary structure data resource (typically a database) and retrieve structures, structure-related data and annotation.
edam
WHATIF:EchoPDB
operations
bioinformatics
This includes direct retrieval methods but not those that perform calculations on the sequence or structure.
WHATIF:DownloadPDB
beta12orEarlier
Surface rendering
edam
A dot has three coordinates (x,y,z) and (typically) a color.
bioinformatics
beta12orEarlier
operation
Calculate the positions of dots that are homogeneously distributed over the surface of a molecule.
WHATIF:GetSurfaceDots
operations
Protein atom surface calculation (accessible)
Calculate the solvent accessibility ('accessible surface') for each atom in a structure.
WHATIF:AtomAccessibilitySolvent
operation
WHATIF:AtomAccessibilitySolventPlus
beta12orEarlier
Waters are not considered.
edam
bioinformatics
operations
Protein atom surface calculation (accessible molecular)
operation
Waters are not considered.
operations
WHATIF:AtomAccessibilityMolecularPlus
WHATIF:AtomAccessibilityMolecular
edam
Calculate the solvent accessibility ('accessible molecular surface') for each atom in a structure.
bioinformatics
beta12orEarlier
Protein residue surface calculation (accessible)
Calculate the solvent accessibility ('accessible surface') for each residue in a structure.
beta12orEarlier
Solvent accessibility might be calculated for the backbone, sidechain and total (backbone plus sidechain).
operations
bioinformatics
WHATIF:ResidueAccessibilitySolvent
edam
operation
Protein residue surface calculation (vacuum accessible)
beta12orEarlier
edam
operations
WHATIF:ResidueAccessibilityVacuum
operation
bioinformatics
Calculate the solvent accessibility ('vacuum accessible surface') for each residue in a structure. This is the accessibility of the residue when taken out of the protein together with the backbone atoms of any residue it is covalently bound to.
Solvent accessibility might be calculated for the backbone, sidechain and total (backbone plus sidechain).
Protein residue surface calculation (accessible molecular)
beta12orEarlier
edam
bioinformatics
Solvent accessibility might be calculated for the backbone, sidechain and total (backbone plus sidechain).
operations
Calculate the solvent accessibility ('accessible molecular surface') for each residue in a structure.
WHATIF:ResidueAccessibilityMolecular
operation
Protein residue surface calculation (vacuum molecular)
operations
WHATIF:ResidueAccessibilityVacuumMolecular
Calculate the solvent accessibility ('vacuum molecular surface') for each residue in a structure. This is the accessibility of the residue when taken out of the protein together with the backbone atoms of any residue it is covalently bound to.
beta12orEarlier
edam
operation
bioinformatics
Solvent accessibility might be calculated for the backbone, sidechain and total (backbone plus sidechain).
Protein surface calculation (accessible molecular)
Calculate the solvent accessibility ('accessible molecular surface') for a structure as a whole.
WHATIF:TotAccessibilityMolecular
beta12orEarlier
operations
edam
operation
bioinformatics
Protein surface calculation (accessible)
operations
bioinformatics
operation
Calculate the solvent accessibility ('accessible surface') for a structure as a whole.
beta12orEarlier
WHATIF:TotAccessibilitySolvent
edam
Backbone torsion angle calculation
Calculate for each residue in a protein structure all its backbone torsion angles.
WHATIF:ResidueTorsionsBB
beta12orEarlier
operations
edam
bioinformatics
operation
Full torsion angle calculation
WHATIF:ResidueTorsions
Calculate for each residue in a protein structure all its torsion angles.
beta12orEarlier
operation
bioinformatics
operations
edam
Cysteine torsion angle calculation
edam
operation
operations
Calculate for each cysteine (bridge) all its torsion angles.
WHATIF:CysteineTorsions
bioinformatics
beta12orEarlier
Tau angle calculation
operation
For each amino acid in a protein structure calculate the backbone angle tau.
bioinformatics
Tau is the backbone angle N-Calpha-C (angle over the C-alpha).
beta12orEarlier
edam
WHATIF:ShowTauAngle
operations
Cysteine bridge detection
operation
beta12orEarlier
bioinformatics
Detect cysteine bridges (from coordinate data) in a protein structure.
WHATIF:ShowCysteineBridge
edam
operations
Free cysteine detection
WHATIF:ShowCysteineFree
bioinformatics
operations
Detect free cysteines in a protein structure.
beta12orEarlier
A free cysteine is neither involved in a cysteine bridge, nor functions as a ligand to a metal.
operation
edam
Metal-bound cysteine detection
Detect cysteines that are bound to metal in a protein structure.
operation
operations
WHATIF:ShowCysteineMetal
edam
bioinformatics
beta12orEarlier
Residue contact calculation (residue-nucleic acid)
operations
bioinformatics
edam
Calculate protein residue contacts with nucleic acids in a structure.
WHATIF:ShowProteiNucleicContacts
WHATIF:HasNucleicContacts
operation
beta12orEarlier
Residue contact calculation (residue-metal)
edam
WHATIF:HasMetalContactsPlus
operation
WHATIF:HasMetalContacts
bioinformatics
operations
Calculate protein residue contacts with metal in a structure.
beta12orEarlier
Residue contact calculation (residue-negative ion)
WHATIF:HasNegativeIonContacts
beta12orEarlier
bioinformatics
WHATIF:HasNegativeIonContactsPlus
edam
operations
Calculate ion contacts in a structure (all ions for all side chain atoms).
operation
Residue bump detection
Detect 'bumps' between residues in a structure, i.e. those with pairs of atoms whose van der Waals radii interpenetrate more than a defined distance.
operation
operations
edam
bioinformatics
beta12orEarlier
WHATIF:ShowBumps
Residue symmetry contact calculation
bioinformatics
operations
operation
beta12orEarlier
A symmetry contact is a contact between two atoms in different asymmetric units.
WHATIF:SymmetryContact
edam
Calculate the number of symmetry contacts made by residues in a protein structure.
Residue contact calculation (residue-ligand)
WHATIF:ShowDrugContacts
WHATIF:ShowDrugContactsShort
bioinformatics
beta12orEarlier
WHATIF:ShowLigandContacts
operation
Calculate contacts between residues and ligands in a protein structure.
operations
edam
Salt bridge calculation
WHATIF:HasSaltBridgePlus
operation
edam
operations
Calculate (and possibly score) salt bridges in a protein structure.
bioinformatics
WHATIF:ShowSaltBridgesH
WHATIF:HasSaltBridge
beta12orEarlier
WHATIF:ShowSaltBridges
Salt bridges are interactions between oppositely charged atoms in different residues. The output might include the inter-atomic distance.
Rotamer likelihood prediction
bioinformatics
beta12orEarlier
WHATIF:ShowLikelyRotamers100
WHATIF:ShowLikelyRotamers200
WHATIF:ShowLikelyRotamers300
WHATIF:ShowLikelyRotamers400
operation
WHATIF:ShowLikelyRotamers700
Output typically includes, for each residue position, the likelihoods for the 20 amino acid types with estimated reliability of the 20 likelihoods.
WHATIF:ShowLikelyRotamers900
operations
WHATIF:ShowLikelyRotamers800
WHATIF:ShowLikelyRotamers600
Predict rotamer likelihoods for all 20 amino acid types at each position in a protein structure.
WHATIF:ShowLikelyRotamers500
WHATIF:ShowLikelyRotamers
edam
Proline mutation value calculation
beta12orEarlier
operation
WHATIF:ProlineMutationValue
edam
Calculate for each position in a protein structure the chance that a proline, when introduced at this position, would increase the stability of the whole protein.
operations
bioinformatics
Residue packing validation
bioinformatics
operations
edam
Identify poorly packed residues in protein structures.
WHATIF: PackingQuality
operation
beta12orEarlier
Dihedral angle validation
WHATIF: ImproperQualityMax
edam
operation
beta12orEarlier
bioinformatics
Identify for each residue in a protein structure any improper dihedral (phi/psi) angles.
WHATIF: ImproperQualitySum
operations
PDB file sequence retrieval
true
beta12orEarlier
edam
bioinformatics
WHATIF: PDB_sequence
operations
beta12orEarlier
operation
Extract a molecular sequence from a PDB file.
HET group detection
HET groups usually correspond to ligands or lipids, but might also (not consistently) include groups that are attached to amino acids. Each HET group is supposed to have a unique three-letter code and a unique name, which might be given in the output.
beta12orEarlier
Identify HET groups in PDB files.
WHATIF: HETGroupNames
bioinformatics
operation
edam
operations
DSSP secondary structure assignment
true
operations
beta12orEarlier
edam
Determine for each residue the DSSP-determined secondary structure in three-state (HSC).
operation
bioinformatics
WHATIF: ResidueDSSP
beta12orEarlier
Structure reformatting
bioinformatics
WHATIF: PDBasXML
operation
beta12orEarlier
Reformat (a file or other report of) tertiary structure data.
edam
operations
Protein cysteine and disulfide bond assignment
beta12orEarlier
operations
bioinformatics
Assign cysteine bonding state and disulfide bond partners in protein structures.
edam
operation
Residue validation
The scoring function to identify poor quality residues may consider residues with bad atoms or atoms with high B-factor, residues in the N- or C-terminal position, residues adjacent to an unstructured residue, non-canonical residues, and glycine and proline (or residues adjacent to these).
WHATIF: UseResidueDB
operation
beta12orEarlier
Identify poor quality amino acid positions in protein structures.
operations
edam
bioinformatics
Structure retrieval (water)
operation
Query a tertiary structure database and retrieve water molecules.
bioinformatics
WHATIF:MovedWaterPDB
operations
edam
beta12orEarlier
siRNA duplex prediction
operations
edam
Identify or predict siRNA duplexes in RNA.
beta12orEarlier
operation
bioinformatics
Sequence alignment refinement
beta12orEarlier
edam
Refine an existing sequence alignment.
operations
bioinformatics
operation
Listfile processing
beta12orEarlier
Process an EMBOSS listfile (list of EMBOSS Uniform Sequence Addresses).
bioinformatics
operation
operations
edam
Sequence file processing
beta12orEarlier
edam
operations
operation
bioinformatics
Perform basic (non-analytical) operations on a report or file of sequences (which might include features), such as file concatenation, removal or ordering of sequences, or creation of a file of sequences.
Sequence alignment file processing
Perform basic (non-analytical) operations on a sequence alignment file, such as copying or removal and ordering of sequences.
operation
edam
operations
bioinformatics
beta12orEarlier
Small molecule data processing
true
operation
edam
Process (read and / or write) physicochemical property data for small molecules.
operations
beta12orEarlier
bioinformatics
beta13
Data retrieval (ontology annotation)
true
beta12orEarlier
bioinformatics
operation
Ontology information retrieval
operations
edam
Search and retrieve documentation on a bioinformatics ontology.
beta13
Data retrieval (ontology concept)
true
bioinformatics
operation
beta13
beta12orEarlier
Query an ontology and retrieve concepts or relations.
operations
Ontology retrieval
edam
Representative sequence identification
bioinformatics
beta12orEarlier
operation
operations
edam
Identify a representative sequence from a set of sequences, typically using scores from pair-wise alignment or other comparison of the sequences.
Structure file processing
bioinformatics
operations
beta12orEarlier
edam
Perform basic (non-analytical) operations on a file of molecular tertiary structural data.
operation
Data retrieval (sequence profile)
true
beta13
This includes direct retrieval methods that retrieve a profile by, e.g. the profile name.
bioinformatics
operation
edam
beta12orEarlier
operations
Query a profile data resource and retrieve one or more profile(s) and / or associated annotation.
Statistical calculation
true
edam
beta12orEarlier
beta12orEarlier
operations
bioinformatics
Perform a statistical data operation of some type, e.g. calibration or validation.
operation
3D-1D scoring matrix generation
Calculate a 3D-1D scoring matrix from analysis of protein sequence and structural data.
operation
operations
bioinformatics
A 3D-1D scoring matrix scores the probability of amino acids occurring in different structural environments.
beta12orEarlier
edam
Transmembrane protein rendering
operations
beta12orEarlier
bioinformatics
operation
Visualise transmembrane proteins, typically the transmembrane regions within a sequence.
edam
Demonstration
true
beta13
bioinformatics
edam
operation
operations
An operation serving purely illustrative (pedagogical) purposes.
beta12orEarlier
Data retrieval (pathway or network)
true
beta13
Query a biological pathways database and retrieve annotation on one or more pathways.
operation
bioinformatics
operations
beta12orEarlier
edam
Data retrieval (identifier)
true
edam
beta13
bioinformatics
Query a database and retrieve one or more data identifiers.
beta12orEarlier
operation
operations
Nucleic acid density plotting
edam
operation
operations
Calculate a density plot (of base composition) for a nucleotide sequence.
bioinformatics
beta12orEarlier
Sequence analysis
operations
edam
operation
bioinformatics
Sequence analysis (general)
beta12orEarlier
Analyse one or more known molecular sequences.
Sequence motif processing
operation
edam
beta12orEarlier
Process (read and / or write) molecular sequence motifs.
operations
bioinformatics
Protein interaction data processing
operations
bioinformatics
beta12orEarlier
edam
Process (read and / or write) protein interaction data.
operation
Protein structure analysis
operation
bioinformatics
beta12orEarlier
Structure analysis (protein)
Analyse protein tertiary structural data.
edam
operations
Annotation processing
true
operation
operations
bioinformatics
Process (read and / or write) annotation of some type, typically annotation on an entry from a biological or biomedical database.
edam
beta12orEarlier
beta12orEarlier
Sequence feature analysis
true
beta12orEarlier
operation
Analyse features in molecular sequences.
operations
bioinformatics
beta12orEarlier
edam
File processing
Data file processing
operations
File handling
bioinformatics
edam
operation
Report handling
beta12orEarlier
Process (read and / or write) a data file (or equivalent entity in memory). Processing is limited to basic (non-analytical) operations.
Gene expression analysis
true
beta12orEarlier
bioinformatics
operations
Analyse gene expression and regulation data.
beta12orEarlier
edam
operation
Structural (3D) profile processing
operations
operation
edam
bioinformatics
Process (read and / or write) one or more structural (3D) profile(s) or template(s) of some type.
beta12orEarlier
Data index processing
Process (read and / or write) an index of (typically a file of) biological data.
operations
operation
Database index processing
bioinformatics
beta12orEarlier
edam
Sequence profile processing
bioinformatics
Process (read and / or write) some type of sequence profile.
beta12orEarlier
edam
operation
operations
Protein function analysis
edam
operations
beta12orEarlier
Analyse protein function, typically by processing protein sequence and/or structural data, and generate an informative report.
This is a broad concept and is used as a placeholder for other, more specific concepts.
operation
bioinformatics
Protein folding analysis
beta12orEarlier
edam
operation
Protein folding modelling
Analyse protein folding, typically by processing sequence and / or structural data, and write an informative report.
This is a broad concept and is used as a placeholder for other, more specific concepts.
bioinformatics
operations
Protein secondary structure analysis
operation
edam
Analyse known protein secondary structure data.
operations
bioinformatics
Secondary structure analysis (protein)
beta12orEarlier
Physicochemical property data processing
true
bioinformatics
operation
beta12orEarlier
Process (read and / or write) data on the physicochemical property of a molecule.
operations
beta13
edam
Primer and probe design
bioinformatics
operations
Primer and probe prediction
Predict oligonucleotide primers or probes.
operation
beta12orEarlier
edam
Analysis and processing
operation
Process (read and / or write) data of a specific type, for example applying analytical methods.
beta12orEarlier
edam
bioinformatics
Calculation
operations
Computation
Database search
Search a database (or other data resource) with a supplied query and retrieve entries (or parts of entries) that are similar to the query.
Typically the query is compared to each entry and high scoring matches (hits) are returned. For example, a BLAST search of a sequence database.
operations
bioinformatics
operation
beta12orEarlier
edam
Data retrieval
Retrieve an entry (or part of an entry) from a data resource that matches a supplied query. This might include some primary data and annotation. The query is a data identifier or other indexed term. For example, retrieve a sequence record with the specified accession number, or matching supplied keywords.
edam
operations
beta12orEarlier
operation
bioinformatics
Information retrieval
Prediction, detection and recognition
operations
edam
Predict, recognise, detect or identify some properties of a biomolecule.
bioinformatics
beta12orEarlier
operation
Comparison
Compare two or more things to identify similarities.
edam
bioinformatics
operation
operations
beta12orEarlier
Optimisation and refinement
operation
bioinformatics
edam
Refine or optimise some data model.
beta12orEarlier
operations
Modelling and simulation
Model or simulate some biological entity or system.
edam
operation
operations
beta12orEarlier
bioinformatics
Data handling
true
beta12orEarlier
operation
operations
Perform basic operations on some data or a database.
bioinformatics
beta12orEarlier
edam
Evaluation and validation
edam
Validate or standardise some data.
Validation and standardisation
bioinformatics
operation
operations
beta12orEarlier
Mapping and assembly
operations
Map properties to positions on a biological entity (typically a molecular sequence or structure), or assemble such an entity from constituent parts.
operation
bioinformatics
beta12orEarlier
edam
This is a broad concept and is used as a placeholder for other, more specific concepts.
Design
true
edam
Design a biological entity (typically a molecular sequence or structure) with specific properties.
beta13
operation
operations
beta12orEarlier
bioinformatics
Microarray data processing
true
operation
bioinformatics
operations
beta12orEarlier
edam
Process (read and / or write) microarray data.
beta12orEarlier
Codon usage table processing
beta12orEarlier
bioinformatics
operations
Process (read and / or write) a codon usage table.
operation
edam
Data retrieval (codon usage table)
true
beta13
edam
bioinformatics
beta12orEarlier
Retrieve a codon usage table and / or associated annotation.
operations
operation
Gene expression profile processing
beta12orEarlier
operations
Process (read and / or write) a gene expression profile.
bioinformatics
operation
edam
Gene expression profile annotation
Annotate a gene expression profile with concepts from an ontology of gene functions.
operation
bioinformatics
beta12orEarlier
operations
edam
Gene regulatory network prediction
beta12orEarlier
operation
Predict a network of gene regulation.
edam
operations
bioinformatics
Pathway or network processing
operation
bioinformatics
operations
edam
beta12orEarlier
Generate, analyse or handle a biological pathway or network.
RNA secondary structure processing
operations
bioinformatics
edam
Process (read and / or write) RNA secondary structure data.
operation
beta12orEarlier
Structure processing (RNA)
true
bioinformatics
operations
edam
beta13
beta12orEarlier
Process (read and / or write) RNA tertiary structure data.
operation
RNA structure prediction
Predict RNA tertiary structure.
beta12orEarlier
bioinformatics
edam
operation
operations
DNA structure prediction
bioinformatics
beta12orEarlier
Predict DNA tertiary structure.
operation
edam
operations
Phylogenetic tree processing
edam
Process (read and / or write) a phylogenetic tree.
beta12orEarlier
operations
bioinformatics
operation
Protein secondary structure processing
beta12orEarlier
operations
edam
bioinformatics
operation
Process (read and / or write) protein secondary structure data.
Protein interaction network processing
edam
operation
Process (read and / or write) a network of protein interactions.
bioinformatics
beta12orEarlier
operations
Sequence processing
bioinformatics
edam
beta12orEarlier
operation
Process (read and / or write) one or more molecular sequences and associated annotation.
operations
Sequence processing (general)
Sequence processing (protein)
edam
bioinformatics
operation
beta12orEarlier
Process (read and / or write) a protein sequence and associated annotation.
operations
Sequence processing (nucleic acid)
operations
operation
bioinformatics
beta12orEarlier
edam
Process (read and / or write) a nucleotide sequence and associated annotation.
Sequence comparison
beta12orEarlier
edam
operations
bioinformatics
Compare two or more molecular sequences.
operation
Sequence cluster processing
operation
bioinformatics
beta12orEarlier
edam
operations
Process (read and / or write) a sequence cluster.
Feature table processing
operations
beta12orEarlier
operation
edam
bioinformatics
Process (read and / or write) a sequence feature table.
Gene and gene component prediction
Gene finding
bioinformatics
operations
beta12orEarlier
operation
edam
Detect, predict and identify genes or components of genes in DNA sequences.
GPCR classification
edam
Classify G-protein coupled receptors (GPCRs) into families and subfamilies.
beta12orEarlier
operations
bioinformatics
operation
G protein-coupled receptor (GPCR) classification
GPCR coupling selectivity prediction
operation
edam
operations
beta12orEarlier
bioinformatics
Predict G-protein coupled receptor (GPCR) coupling selectivity.
Structure processing (protein)
Process (read and / or write) a protein tertiary structure.
beta12orEarlier
operations
operation
bioinformatics
edam
Protein atom surface calculation
operation
edam
operations
beta12orEarlier
Calculate the solvent accessibility for each atom in a structure.
bioinformatics
Waters are not considered.
Protein residue surface calculation
operation
Calculate the solvent accessibility for each residue in a structure.
edam
operations
beta12orEarlier
bioinformatics
Protein surface calculation
operations
edam
beta12orEarlier
Calculate the solvent accessibility of a structure as a whole.
operation
bioinformatics
Sequence alignment processing
edam
operations
beta12orEarlier
bioinformatics
Process (read and / or write) a molecular sequence alignment.
operation
Protein-protein interaction prediction
bioinformatics
operations
beta12orEarlier
edam
Identify or predict protein-protein interactions, interfaces, binding sites etc.
operation
Structure processing
beta12orEarlier
Process (read and / or write) a molecular tertiary structure.
operation
edam
bioinformatics
operations
Map annotation
edam
bioinformatics
Annotate a DNA map of some type with terms from a controlled vocabulary.
operations
operation
beta12orEarlier
Data retrieval (protein annotation)
true
operation
Protein information retrieval
edam
beta12orEarlier
bioinformatics
operations
beta13
Retrieve information on a protein.
Data retrieval (phylogenetic tree)
true
operations
Retrieve a phylogenetic tree from a data resource.
bioinformatics
operation
beta13
edam
beta12orEarlier
Data retrieval (protein interaction annotation)
true
operation
edam
operations
beta12orEarlier
bioinformatics
beta13
Retrieve information on a protein interaction.
Data retrieval (protein family annotation)
true
operation
operations
Retrieve information on a protein family.
edam
beta12orEarlier
bioinformatics
beta13
Protein family information retrieval
Data retrieval (RNA family annotation)
true
beta13
beta12orEarlier
operations
RNA family information retrieval
operation
Retrieve information on an RNA family.
edam
bioinformatics
Data retrieval (gene annotation)
true
bioinformatics
beta13
Retrieve information on a specific gene.
Gene information retrieval
beta12orEarlier
operations
edam
operation
Data retrieval (genotype and phenotype annotation)
true
bioinformatics
operations
Retrieve information on a specific genotype or phenotype.
beta13
beta12orEarlier
Genotype and phenotype information retrieval
edam
operation
Protein architecture comparison
beta12orEarlier
operation
bioinformatics
Compare the architecture of two or more protein structures.
operations
edam
Protein architecture recognition
operation
Identify the architecture of a protein structure.
edam
operations
bioinformatics
beta12orEarlier
Molecular dynamics simulation
edam
Simulate molecular (typically protein) conformation using a computational model of physical forces and computer simulation.
bioinformatics
operation
operations
beta12orEarlier
Nucleic acid sequence analysis
operations
edam
bioinformatics
Sequence analysis (nucleic acid)
beta12orEarlier
operation
Analyse a nucleic acid sequence (using methods that are only applicable to nucleic acid sequences).
Protein sequence analysis
operations
beta12orEarlier
Sequence analysis (protein)
operation
edam
Analyse a protein sequence (using methods that are only applicable to protein sequences).
bioinformatics
Structure analysis
operations
Analyse known molecular tertiary structures.
operation
edam
bioinformatics
beta12orEarlier
Nucleic acid structure analysis
operations
operation
edam
bioinformatics
beta12orEarlier
Analyse nucleic acid tertiary structural data.
Secondary structure processing
beta12orEarlier
operations
edam
Process (read and / or write) a molecular secondary structure.
bioinformatics
operation
Structure comparison
Compare two or more molecular tertiary structures.
operation
bioinformatics
operations
beta12orEarlier
edam
Helical wheel rendering
Render a helical wheel representation of protein secondary structure.
operation
operations
beta12orEarlier
bioinformatics
edam
Topology diagram rendering
bioinformatics
operations
edam
Render a topology diagram of protein secondary structure.
beta12orEarlier
operation
Protein structure comparison
Compare protein tertiary structures.
beta12orEarlier
operations
Methods might identify structural neighbors, find structural similarities or define a structural core.
Structure comparison (protein)
operation
edam
bioinformatics
Protein secondary structure comparison
edam
bioinformatics
operation
Compare protein secondary structures.
beta12orEarlier
Protein secondary structure
Secondary structure comparison (protein)
operations
Protein subcellular localization prediction
bioinformatics
Predict the subcellular localization of a protein sequence.
The prediction might include subcellular localization (nuclear, cytoplasmic, mitochondrial, chloroplast, plastid, membrane etc) or export (extracellular proteins) of a protein.
beta12orEarlier
operations
Protein targeting prediction
edam
operation
Residue contact calculation (residue-residue)
operation
Calculate contacts between residues in a protein structure.
operations
bioinformatics
edam
beta12orEarlier
Hydrogen bond calculation (inter-residue)
beta12orEarlier
operations
bioinformatics
Identify potential hydrogen bonds between amino acid residues.
operation
edam
Protein interaction prediction
edam
beta12orEarlier
operations
operation
bioinformatics
Predict the interactions of proteins with other molecules.
Codon usage data processing
true
edam
operation
bioinformatics
operations
Process (read and / or write) codon usage data.
beta13
beta12orEarlier
Gene expression data processing
Process (read and / or write) gene expression (typically microarray) data.
bioinformatics
operations
Microarray data processing
beta12orEarlier
operation
edam
Gene expression (microarray) data processing
Gene regulatory network processing
operations
bioinformatics
Process (read and / or write) a network of gene regulation.
operation
beta12orEarlier
edam
Pathway or network analysis
beta12orEarlier
Network analysis
edam
Analyse a known biological pathway or network.
bioinformatics
operations
Pathway analysis
operation
Sequencing-based expression profile data analysis
true
operations
beta12orEarlier
operation
beta12orEarlier
Analyse SAGE, MPSS or SBS experimental data, typically to identify or quantify mRNA transcripts.
edam
bioinformatics
Splicing analysis
operations
beta12orEarlier
Analyse (e.g. characterize and model) alternative splicing events from comparing multiple nucleic acid sequences.
operation
edam
Splicing modelling
bioinformatics
Microarray raw data analysis
true
bioinformatics
operations
Analyse raw microarray data.
beta12orEarlier
beta12orEarlier
edam
operation
Nucleic acid data processing
beta12orEarlier
Process (read and / or write) nucleic acid sequence or structural data.
operation
operations
edam
bioinformatics
Protein data processing
operation
operations
beta12orEarlier
edam
bioinformatics
Process (read and / or write) protein sequence or structural data.
Sequence data processing
true
operation
edam
beta13
beta12orEarlier
Process (read and / or write) molecular sequence data.
operations
bioinformatics
Structural data processing
true
operations
edam
beta12orEarlier
operation
bioinformatics
beta13
Process (read and / or write) molecular structural data.
Text processing
edam
Process (read and / or write) text.
operations
beta12orEarlier
operation
bioinformatics
Sequence alignment analysis (protein)
operation
edam
operations
beta12orEarlier
bioinformatics
Analyse a protein sequence alignment, typically to detect features or make predictions.
Sequence alignment analysis (nucleic acid)
Analyse a nucleic acid sequence alignment, typically to detect features or make predictions.
operations
beta12orEarlier
edam
bioinformatics
operation
Nucleic acid sequence comparison
operation
beta12orEarlier
Compare two or more nucleic acid sequences.
operations
bioinformatics
Sequence comparison (nucleic acid)
edam
Protein sequence comparison
operations
operation
Compare two or more protein sequences.
bioinformatics
Sequence comparison (protein)
edam
beta12orEarlier
DNA back-translation
Back-translate a protein sequence into DNA.
operation
beta12orEarlier
operations
edam
bioinformatics
Sequence editing (nucleic acid)
beta12orEarlier
bioinformatics
Edit or change a nucleic acid sequence, either randomly or specifically.
operations
edam
operation
Sequence editing (protein)
beta12orEarlier
operation
edam
operations
Edit or change a protein sequence, either randomly or specifically.
bioinformatics
Sequence generation (nucleic acid)
operations
bioinformatics
operation
beta12orEarlier
Generate a nucleic acid sequence by some means.
edam
Sequence generation (protein)
edam
operation
bioinformatics
operations
beta12orEarlier
Generate a protein sequence by some means.
Sequence rendering (nucleic acid)
Visualise, format or render a nucleic acid sequence.
beta12orEarlier
edam
Various nucleic acid sequence analysis methods might generate a sequence rendering but are not (for brevity) listed under here.
operations
operation
bioinformatics
Sequence rendering (protein)
Various protein sequence analysis methods might generate a sequence rendering but are not (for brevity) listed under here.
edam
operation
bioinformatics
Visualise, format or render a protein sequence.
beta12orEarlier
operations
Nucleic acid structure comparison
operation
bioinformatics
operations
edam
Structure comparison (nucleic acid)
beta12orEarlier
Compare nucleic acid tertiary structures.
Structure processing (nucleic acid)
bioinformatics
operation
Process (read and / or write) nucleic acid tertiary structure data.
edam
beta12orEarlier
operations
DNA mapping
edam
Generate a map of a DNA sequence annotated with positional or non-positional features of some type.
beta12orEarlier
bioinformatics
operation
operations
Map data processing
operations
Process (read and / or write) a DNA map of some type.
bioinformatics
beta12orEarlier
edam
DNA map data processing
operation
Protein hydropathy calculation
operations
Analyse the hydrophobic, hydrophilic or charge properties of a protein (from analysis of sequence or structural information).
operation
bioinformatics
edam
beta12orEarlier
Binding site prediction
Identify or predict catalytic residues, active sites or other ligand-binding sites in protein sequences or structures.
operations
edam
bioinformatics
beta12orEarlier
operation
Ligand-binding and active site prediction
Sequence tagged site (STS) mapping
An STS is a short subsequence of known sequence and location that occurs only once in the chromosome or genome that is being mapped. Sources of STSs include expressed sequence tags (ESTs), simple sequence length polymorphisms (SSLPs), and random genomic sequences from cloned genomic DNA or database sequences.
bioinformatics
Generate a physical DNA map (sequence map) from analysis of sequence tagged sites (STS).
edam
beta12orEarlier
Sequence mapping
operations
operation
Alignment construction
beta12orEarlier
Alignment
bioinformatics
edam
operations
operation
Compare two or more entities, typically the sequence or structure (or derivatives) of macromolecules, to identify equivalent subunits.
Protein fragment weight comparison
Calculate the molecular weight of a protein (or fragments) and compare it to another protein or reference data.
beta12orEarlier
edam
bioinformatics
operation
operations
Protein property comparison
beta12orEarlier
Compare the physicochemical properties of two or more proteins (or reference data).
bioinformatics
operations
operation
edam
Secondary structure comparison
operation
beta12orEarlier
Compare two or more molecular secondary structures.
edam
bioinformatics
operations
Hopp and Woods plotting
edam
bioinformatics
operations
Generate a Hopp and Woods plot of antigenicity of a protein.
beta12orEarlier
operation
Microarray cluster textual view rendering
edam
operations
bioinformatics
operation
Visualise gene clusters with gene names.
beta12orEarlier
Microarray wave graph rendering
edam
Visualise clustered gene expression data as a set of waves, where each wave corresponds to a gene across samples on the X-axis.
This view can be rendered as a pie graph. The distance matrix is sorted by cluster number and typically represented as a diagonal matrix with distance values displayed in different color shades.
bioinformatics
operations
operation
beta12orEarlier
Microarray cluster temporal graph rendering
Microarray dendrograph rendering
operations
Microarray view rendering
Microarray checks view rendering
operation
edam
beta12orEarlier
Generate a dendrograph of raw, preprocessed or clustered microarray data.
bioinformatics
Microarray proximity map rendering
bioinformatics
operations
Generate a plot of distances (distance matrix) between genes.
operation
edam
Microarray distance map rendering
beta12orEarlier
Microarray tree or dendrogram view rendering
Microarray matrix tree plot rendering
Microarray 2-way dendrogram rendering
edam
operations
Visualise clustered gene expression data using a gene tree, array tree and color coded band of gene expression.
beta12orEarlier
operation
bioinformatics
Microarray principal component rendering
Generate a line graph drawn as sum of principal components (eigenvalue) and individual expression values.
edam
bioinformatics
operations
beta12orEarlier
operation
Microarray scatter plot rendering
operations
operation
edam
beta12orEarlier
bioinformatics
Generate a scatter plot of microarray data, typically after principal component analysis.
Whole microarray graph view rendering
beta12orEarlier
edam
bioinformatics
Visualise gene expression data where each band (or line graph) corresponds to a sample.
operations
operation
Microarray tree-map rendering
operation
Visualise gene expression data after hierarchical clustering for representing hierarchical relationships.
operations
beta12orEarlier
edam
bioinformatics
Microarray Box-Whisker plot rendering
bioinformatics
edam
operation
Visualise raw and pre-processed gene expression data, via a plot showing over- and under-expression along with mean, upper and lower quartiles.
operations
beta12orEarlier
Physical mapping
operation
operations
edam
bioinformatics
Generate a physical (sequence) map of a DNA sequence showing the physical distance (base pairs) between features or landmarks such as restriction sites, cloned DNA fragments, genes and other genetic markers.
beta12orEarlier
Analysis
true
beta12orEarlier
edam
beta12orEarlier
For non-analytical operations, see the 'Processing' branch.
bioinformatics
operations
operation
Apply analytical methods to existing data of a specific type.
Alignment analysis
Analyse an existing alignment of two or more molecular sequences, structures or derived data.
operation
operations
beta12orEarlier
edam
bioinformatics
Article analysis
beta12orEarlier
bioinformatics
operations
operation
edam
Analyse a body of scientific text (typically a full-text article from a scientific journal).
Molecular interaction analysis
true
bioinformatics
beta13
edam
operation
operations
beta12orEarlier
Analyse the interactions of two or more molecules (or parts of molecules) that are known to interact.
Protein interaction analysis
Analyse known protein-protein, protein-DNA/RNA or protein-ligand interactions.
operations
edam
operation
bioinformatics
beta12orEarlier
Residue contact calculation
bioinformatics
edam
Calculate contacts between residues and some other group in a protein structure.
beta12orEarlier
operations
operation
Alignment processing
beta12orEarlier
edam
Process (read and / or write) an alignment of two or more molecular sequences, structures or derived data.
operation
bioinformatics
operations
Structure alignment processing
edam
bioinformatics
beta12orEarlier
Process (read and / or write) a molecular tertiary (3D) structure alignment.
operations
operation
Codon usage bias calculation
operations
edam
beta12orEarlier
bioinformatics
Calculate codon usage bias.
operation
Codon usage bias plotting
bioinformatics
operation
edam
operations
Generate a codon usage bias plot.
beta12orEarlier
Codon usage fraction calculation
operations
edam
bioinformatics
beta12orEarlier
Calculate the differences in codon usage fractions between two sequences, sets of sequences, codon usage tables etc.
operation
Classification
operations
Assign molecular sequences, structures or other biological data to a specific group or category according to qualities it shares with that group or category.
bioinformatics
edam
beta12orEarlier
operation
Molecular interaction data processing
true
Process (read and / or write) molecular interaction data.
edam
bioinformatics
operation
operations
beta12orEarlier
beta13
Sequence classification
operation
bioinformatics
Assign molecular sequence(s) to a group or category.
edam
operations
beta12orEarlier
Structure classification
operations
edam
bioinformatics
beta12orEarlier
Assign molecular structure(s) to a group or category.
operation
Protein comparison
beta12orEarlier
edam
bioinformatics
operation
Compare two or more proteins (or some aspect) to identify similarities.
operations
Nucleic acid comparison
Compare two or more nucleic acids to identify similarities.
beta12orEarlier
operation
bioinformatics
operations
edam
Prediction, detection and recognition (protein)
operations
Predict, recognise, detect or identify some properties of proteins.
bioinformatics
beta12orEarlier
operation
edam
Prediction, detection and recognition (nucleic acid)
Predict, recognise, detect or identify some properties of nucleic acids.
edam
operation
operations
bioinformatics
beta12orEarlier
Structure editing
beta13
Edit, convert or otherwise change a molecular tertiary structure, either randomly or specifically.
edam
operations
bioinformatics
operation
Sequence alignment editing
operation
bioinformatics
operations
beta13
edam
Edit, convert or otherwise change a molecular sequence alignment, either randomly or specifically.
Pathway or network rendering
beta13
bioinformatics
operations
operation
edam
Render (visualise) a biological pathway or network.
Protein function prediction (from sequence)
Predict general (non-positional) functional properties of a protein from analysing its sequence.
For functional properties that can be mapped to a sequence, use 'Sequence feature detection (protein)' instead.
edam
operation
bioinformatics
operations
beta13
Protein site detection
bioinformatics
beta13
operations
edam
name: Sequence motif recognition (protein)
Predict, recognise and identify functional or other key sites within protein sequences, typically by scanning for known motifs, patterns and regular expressions.
operation
Protein property calculation (from sequence)
edam
bioinformatics
operations
operation
Calculate (or predict) physical or chemical properties of a protein, including any non-positional properties of the molecular sequence, from processing a protein sequence.
beta13
Protein feature prediction (from structure)
Predict, recognise and identify positional features in proteins from analysing protein structure.
operations
operation
beta13
bioinformatics
edam
Protein feature prediction
operations
operation
edam
beta13
Predict, recognise and identify positional features in proteins from analysing protein sequences or structures.
bioinformatics
Sequence screening
bioinformatics
beta13
Screen a molecular sequence(s) against a database (of some type) to identify similarities between the sequence and database entries.
edam
operations
operation
Protein interaction network prediction
operations
Predict a network of protein interactions.
edam
bioinformatics
operation
beta13
Nucleic acid design
operations
Design (or predict) nucleic acid sequences with specific chemical or physical properties.
bioinformatics
beta13
edam
operation
Editing
Edit, convert or otherwise change a data entity, either randomly or specifically.
bioinformatics
edam
operation
beta13
operations
Sequence assembly evaluation
bioinformatics
edam
operations
1.1
Evaluate a DNA sequence assembly, typically for purposes of quality control.
operation
Genome alignment construction
Genome alignment
1.1
edam
operations
Align two or more (typically huge) molecular sequences that represent genomes.
bioinformatics
operation
Localized reassembly
operation
1.1
operations
bioinformatics
Reconstruction of a sequence assembly in a localised area.
edam
Sequence assembly rendering
Assembly visualisation
edam
operations
1.1
bioinformatics
operation
Render and visualise a DNA sequence assembly.
Assembly rendering
Sequence assembly visualisation
Base-calling
Base calling
operation
operations
1.1
Phred base calling
Phred base-calling
bioinformatics
Identify base (nucleobase) sequence from fluorescence 'trace' data generated by an automated DNA sequencer.
edam
Bisulfite mapping
Bisulfite sequence alignment
The mapping of methylation sites in a DNA (genome) sequence.
bioinformatics
Bisulfite sequence mapping
operations
edam
operation
Bisulfite mapping follows high-throughput sequencing of DNA which has undergone bisulfite treatment followed by PCR amplification; unmethylated cytosines are specifically converted to thymine, allowing the methylation status of cytosine in the DNA to be detected.
1.1
Sequence contamination filtering
Identify and filter a (typically large) sequence data set to remove sequences from contaminants in the sample that was sequenced.
operations
operation
beta12orEarlier
bioinformatics
edam
Trim ends
bioinformatics
For example trim polyA tails, introns and primer sequence flanking the sequence of amplified exons, or other unwanted sequence.
operation
operations
1.1
Trim sequences (typically from an automated DNA sequencer) to remove misleading ends.
edam
Trim vector
operations
edam
1.1
operation
bioinformatics
Trim sequences (typically from an automated DNA sequencer) to remove sequence-specific end regions, typically contamination from vector sequences.
Trim to reference
operation
Trim sequences (typically from an automated DNA sequencer) to remove the sequence ends that extend beyond an assembled reference sequence.
bioinformatics
edam
1.1
operations
Sequence trimming
bioinformatics
operations
edam
Cut (remove) the end from a molecular sequence.
operation
1.1
Genome feature comparison
1.1
Compare the features of two genome sequences.
Genomic elements that might be compared include genes, indels, single nucleotide polymorphisms (SNPs), retrotransposons, tandem repeats and so on.
bioinformatics
operation
edam
operations
Sequencing error detection
beta12orEarlier
Short-read error correction
operation
Detect errors in DNA sequences generated from sequencing projects.
operations
edam
bioinformatics
Short read error correction
Genotyping
edam
Analyse DNA sequence data to identify differences between the genetic composition (genotype) of an individual compared to other individuals or a reference sequence.
Methods might consider cytogenetic analyses, copy number polymorphism (and calculate copy number calls for copy-number variation (CNV) regions), single nucleotide polymorphism (SNP), rare copy number variation (CNV) identification, loss of heterozygosity data and so on.
operations
1.1
bioinformatics
operation
Genetic variation analysis
Genetic variation annotation
edam
operation
bioinformatics
1.1
Analyse a genetic variation, for example to annotate its location, alleles, classification, and effects on individual transcripts predicted for a gene model.
Genetic variation annotation provides contextual interpretation of coding SNP consequences in transcripts. It allows comparisons to be made between variation data in different populations or strains for the same transcript.
Sequence variation analysis
operations
Oligonucleotide alignment construction
operation
operations
Short sequence read mapping
Align short oligonucleotide sequences (reads) to a larger (genomic) sequence.
Short read alignment
1.1
Read alignment
Oligonucleotide alignment
The purpose of read mapping is to identify the location of sequenced fragments within a reference genome and assumes that there is, in fact, at least local similarity between the fragment and reference sequences.
edam
Short read mapping
bioinformatics
Read mapping
Oligonucleotide mapping
Short oligonucleotide alignment
Split read mapping
operations
operation
1.1
edam
bioinformatics
A variant of oligonucleotide mapping where a read is mapped to two separate locations because of possible structural variation.
DNA barcoding
Sample barcoding
operation
1.1
edam
operations
bioinformatics
Analyse DNA sequences in order to identify a DNA barcode; short fragment(s) of DNA that are useful to diagnose the taxa of biological organisms.
SNP calling
bioinformatics
edam
operation
1.1
Operations usually score confidence in the prediction or some other statistical measure of evidence.
operations
Identify single nucleotide changes in base positions in sequencing data that differ from a reference genome and which might, especially by reference to population frequency or functional data, indicate a polymorphism.
Mutation detection
operations
bioinformatics
edam
1.1
operation
Detect mutations in multiple DNA sequences, for example, from the alignment and comparison of the fluorescent traces produced by DNA sequencing hardware.
Polymorphism detection
Chromatogram visualisation
edam
operations
1.1
operation
Chromatogram viewing
Visualise, format or render an image of a chromatogram.
bioinformatics
Methylation analysis
bioinformatics
Determine cytosine methylation states in nucleic acid sequences.
1.1
edam
operation
operations
Methylation calling
bioinformatics
Determine cytosine methylation status of specific positions in a nucleic acid sequence.
edam
operations
1.1
operation
Methylation level analysis (global)
operations
Measure the overall level of methyl cytosines in a genome from analysis of experimental data, typically from chromatographic methods and methyl accepting capacity assay.
bioinformatics
operation
edam
Global methylation analysis
1.1
Methylation level analysis (gene-specific)
bioinformatics
operation
operations
Many different techniques are available for this.
Measure the level of methyl cytosines in specific genes.
edam
1.1
Gene-specific methylation analysis
Genome rendering
Genome browsing
bioinformatics
Genome viewing
operation
edam
Genome visualisation
Visualise, format or render a nucleic acid sequence that is part of (and in context of) a complete genome sequence.
1.1
operations
Genome visualization
Genome comparison
bioinformatics
Genomic region matching
1.1
edam
operations
operation
Compare the sequence or features of two or more genomes, for example, to find matching regions.
Genome indexing
1.1
bioinformatics
Generate an index of a genome sequence.
operations
Many sequence alignment tasks involving many or very large sequences rely on a precomputed index of the sequence to accelerate the alignment.
operation
edam
Genome indexing (Burrows-Wheeler)
operations
operation
1.1
Generate an index of a genome sequence using the Burrows-Wheeler algorithm.
edam
bioinformatics
The Burrows-Wheeler Transform (BWT) is a permutation of the genome based on a suffix array algorithm.
Genome indexing (suffix arrays)
bioinformatics
operations
A suffix array consists of the lexicographically sorted list of suffixes of a genome.
operation
edam
suffix arrays
Generate an index of a genome sequence using a suffix array algorithm.
1.1
Spectrum analysis
Mass spectrum analysis
operation
Analyse a spectrum from a mass spectrometry (or other) experiment.
edam
bioinformatics
1.1
Spectral analysis
operations
Peak detection
operation
operations
edam
Identify peaks in a spectrum from a mass spectrometry experiment.
bioinformatics
Peak finding
1.1
Scaffolding
operation
Scaffolds may be positioned along a chromosome physical map to create a "golden path".
1.1
edam
operations
bioinformatics
Link together a non-contiguous series of genomic sequences into a scaffold, consisting of sequences separated by gaps of known length. The sequences that are linked are typically contigs, that is, contiguous sequences corresponding to read overlaps.
Scaffold construction
Scaffold gap completion
bioinformatics
Different techniques are used to generate gap sequences to connect contigs, depending on the size of the gap. For small (5-20kb) gaps, PCR amplification and sequencing is used. For large (>20kb) gaps, fragments are cloned (e.g. in BAC (Bacterial artificial chromosomes) vectors) and then sequenced.
operations
1.1
operation
Fill the gaps in a sequence assembly (scaffold) by merging in additional sequences.
edam
Sequencing quality control
bioinformatics
operations
operation
Sequencing QC
Raw sequence data quality control.
edam
Analyse raw sequence data from a sequencing pipeline and identify problems.
1.1
Read pre-processing
1.1
operations
Pre-process sequence reads to ensure (or improve) quality and reliability.
operation
bioinformatics
This is a broad concept and is used as a placeholder for other, more specific concepts. For example, process paired-end reads to trim low-quality ends, remove short sequences, identify sequence inserts, detect chimeric reads, or remove low-quality sequences including vector, adaptor, low-complexity and contaminant sequences. Sequences might come from a genomic DNA library, EST libraries, an SSH library and so on.
edam
Sequence read pre-processing
Species frequency estimation
1.1
Estimate the frequencies of different species from analysis of the molecular sequences, typically of DNA recovered from environmental samples.
operations
bioinformatics
operation
edam
Peak calling
1.1
Protein binding peak detection
Identify putative protein-binding regions in a genome sequence from analysis of ChIP-sequencing data or ChIP-on-chip data.
bioinformatics
ChIP-sequencing combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to generate a set of reads, which are aligned to a genome sequence. The enriched areas contain the binding sites of DNA-associated proteins, for example, a transcription factor binding site. ChIP-on-chip, in contrast, combines chromatin immunoprecipitation ('ChIP') with microarray ('chip').
edam
operations
operation
Differential expression analysis
operations
operation
1.1
Differential expression analysis is used, for example, to identify which genes are up-regulated (increased expression) or down-regulated (decreased expression) between a group treated with a drug and a control group.
edam
Differentially expressed gene identification
bioinformatics
Identify (typically from analysis of microarray or RNA-seq data) genes whose expression levels are significantly different between two sample groups.
Gene set testing
edam
Analyse gene expression patterns (typically from DNA microarray datasets) to identify sets of genes that are associated with a specific trait, condition, clinical outcome etc.
bioinformatics
Gene sets can be defined beforehand by biological function, chromosome locations and so on.
operations
1.1
operation
Variant classification
operations
edam
Variants are typically classified by their position (intronic, exonic, etc.) in a gene transcript and (for variants in coding exons) by their effect on the protein sequence (synonymous, non-synonymous, frameshifting, etc.)
operation
bioinformatics
1.1
Classify variants based on their potential effect on genes, especially functional effects on the expressed proteins.
Variant prioritization
edam
bioinformatics
1.1
operation
operations
Identify biologically interesting variants by prioritizing individual variants, for example, homozygous variants absent in control genomes.
Variant prioritization can be used, for example, to produce a list of variants responsible for 'knocking out' genes in specific genomes. Methods include amino acid substitution, aggregative approaches, probabilistic approaches, inheritance and unified likelihood frameworks.
Variant mapping
bioinformatics
operations
Variant calling
Identify and map genomic alterations, including single nucleotide polymorphisms, short indels and structural variants, in a genome sequence.
edam
Methods often utilise a database of aligned reads.
operation
1.1
Structural variation discovery
operations
Methods might involve analysis of whole-genome array comparative genome hybridization or single-nucleotide polymorphism arrays, paired-end mapping of sequencing data, or analysis of short reads from new sequencing technologies.
bioinformatics
operation
1.1
edam
Detect large regions in a genome subject to copy-number variation, or other structural variations in genome(s).
Exome analysis
Analyse sequencing data from experiments aiming to selectively sequence the coding regions of the genome.
Exome sequencing is considered a cheap alternative to whole genome sequencing.
bioinformatics
Targeted exome capture
1.1
edam
Exome sequence analysis
operation
operations
Read depth analysis
Analyse mapping density (read depth) of (typically) short reads from sequencing platforms, for example, to detect deletions and duplications.
1.1
operations
bioinformatics
operation
edam
Gene expression QTL analysis
operation
operations
bioinformatics
1.1
edam
expression quantitative trait loci profiling
expression QTL profiling
Combine classical quantitative trait loci (QTL) analysis with gene expression profiling, for example, to describe cis- and trans-controlling elements for the expression of phenotype-associated genes.
eQTL profiling
Copy number estimation
operation
Estimate the number of copies of loci of particular gene(s) in DNA sequences typically from gene-expression profiling technology based on microarray hybridization-based experiments. For example, estimate copy number (or marker dosage) of a dominant marker in samples from polyploid plant cells or tissues, or chromosomal gains and losses in tumors.
bioinformatics
edam
1.1
operations
Transcript copy number estimation
Methods typically implement some statistical model for hypothesis testing, and estimate total copy number, i.e. they do not distinguish the quantities of the two inherited chromosomes (specific copy number).
Primer removal
bioinformatics
Remove forward and/or reverse primers from nucleic acid sequences (typically PCR products).
operations
operation
1.2
edam
Transcriptome assembly
Infer a transcriptome sequence by analysis of short sequence reads.
bioinformatics
edam
operation
operations
1.2
Transcriptome assembly (de novo)
Infer a transcriptome sequence without the aid of a reference genome, i.e. by comparing short sequences (reads) to each other.
de novo transcriptome assembly
bioinformatics
edam
operation
operations
1.2
Transcriptome assembly (mapping)
Infer a transcriptome sequence by mapping short reads to a reference genome.
bioinformatics
edam
operation
operations
1.2
beta12orEarlier
topics
edam
bioinformatics
topic
Topic
A category denoting a rather broad domain or field of interest, of study, application, work, data, or technology. Topics have no clearly defined borders between each other.
sumo:FieldOfStudy
GFO 'Category' is in general broader than topic, but it may be seen narrower in the sense that it can be instantiated.
GFO 'Perpetuant' is in general broader than topic, but depending on metaphysical (non-)beliefs it may be seen narrower in the sense of being a concrete individual and exhibiting presentials.
BFO 'quality' is narrower in the sense that it is a 'dependent_continuant' (snap:DependentContinuant), and broader in the sense that it is any quality not just the topic.
Topic can be a quality of an entity.
Nucleic acid analysis
edam
beta12orEarlier
bioinformatics
topic
topics
Nucleic acid informatics
Processing and analysis of nucleic acid data, typically (but not exclusively) nucleic acid sequence analysis.
Nucleic acid bioinformatics
Nucleic acids
Protein analysis
beta12orEarlier
Processing and analysis of protein data, typically molecular sequence and structural data.
bioinformatics
Protein bioinformatics
edam
Proteins
topic
Protein informatics
topics
Metabolites
beta12orEarlier
Topic concerning the reactants or products of metabolism, for example small molecules such as vitamins, polyols, nucleotides and amino acids.
bioinformatics
This concept excludes macromolecules such as proteins and nucleic acids.
topics
topic
edam
Sequence analysis
Processing and analysis of molecular sequences (monomer composition of polymers) including related concepts such as sequence sites, alignments, motifs and profiles.
bioinformatics
topic
BioCatalogue:Sequence Analysis
topics
Sequences
edam
beta12orEarlier
Structure analysis
beta12orEarlier
topics
This includes related concepts such as structural properties, alignments and structural motifs.
edam
Structural bioinformatics
topic
bioinformatics
Processing and analysis of molecular secondary or tertiary (3D) structure, typically of proteins and nucleic acids.
Computational structural biology
Structure prediction
bioinformatics
topic
edam
beta12orEarlier
Topic concerning the prediction of molecular (secondary or tertiary) structure.
topics
Alignment
true
edam
Topic concerning the alignment (equivalence between sites) of molecular sequences, structures or profiles (representing a sequence or structure alignment).
beta12orEarlier
beta12orEarlier
topic
topics
bioinformatics
Phylogenetics
Phylogenetic simulation
bioinformatics
BioCatalogue:Phylogeny
edam
This includes diverse phylogenetic methods, including phylogenetic tree construction, typically from molecular sequence or morphological data, methods that simulate DNA sequence evolution, a phylogenetic tree or the underlying data, or which estimate or use molecular clock and stratigraphic (age) data.
BioCatalogue:Statistical Robustness
topics
Phylogenetic clocks, dating and stratigraphy
Topic concerning the study of evolutionary relationships amongst organisms; phylogenetic trees, gene transfer, mode of selection / evolution etc.
topic
beta12orEarlier
Functional genomics
Topic concerning the study of gene or protein functions and their interactions.
bioinformatics
topics
edam
topic
beta12orEarlier
Ontology
topics
BioCatalogue:Ontology Lookup
bioinformatics
This includes the annotation of entities (typically biological database entries) with concepts from a controlled vocabulary.
Applied ontology
beta12orEarlier
Topic concerning ontologies, controlled vocabularies, structured glossaries or other related resources.
Ontologies
topic
edam
BioCatalogue:Ontology
Data search and retrieval
Topic concerning the search and query of data sources (typically biological databases or ontologies) in order to retrieve entries or other information.
BioCatalogue:Structure Retrieval
BioCatalogue:Image Retrieval
bioinformatics
edam
BioCatalogue:Sequence Retrieval
BioCatalogue:Data Retrieval
BioCatalogue:Identifier Retrieval
topic
This includes, for example, search, query and retrieval of molecular sequences and associated data.
topics
Data retrieval
beta12orEarlier
Data handling
Topic for the generic management of biological data including basic handling of files and databases, datatypes, workflows and annotation.
beta12orEarlier
topics
edam
topic
bioinformatics
Data types, processing and visualisation
Data visualisation
edam
bioinformatics
Data rendering and visualisation
Topic for the plotting or rendering (drawing on a computer screen) of molecular sequences, structures or other biomolecular data.
beta12orEarlier
topic
Data rendering
Data plotting
topics
Nucleic acid thermodynamics
Nucleic acid properties
DNA melting
edam
topic
This includes the study of thermal and conformational properties including DNA or DNA/RNA denaturation (melting).
Topic concerning the study of the thermodynamic properties of a nucleic acid.
topics
Nucleic acid denaturation
beta12orEarlier
bioinformatics
Nucleic acid physicochemistry
Nucleic acid structure analysis
bioinformatics
edam
The processing and analysis of nucleic acid (secondary or tertiary) structural data.
topic
topics
beta12orEarlier
RNA
Topic concerning RNA sequences and structures.
topic
edam
bioinformatics
topics
beta12orEarlier
Nucleic acid restriction
topics
bioinformatics
topic
edam
Topic for the study of restriction enzymes, their cleavage sites and the restriction of nucleic acids.
beta12orEarlier
Mapping
topic
edam
beta12orEarlier
topics
Topic concerning the mapping of complete (typically nucleotide) sequences.
bioinformatics
Codon usage analysis
bioinformatics
topic
edam
topics
Topic concerning the study of codon usage in nucleotide sequence(s), genetic codes and so on.
beta12orEarlier
Translation
Topic concerning the translation of mRNA into protein.
bioinformatics
beta12orEarlier
topic
edam
topics
Gene finding
BioCatalogue:Gene Prediction
bioinformatics
edam
Gene discovery
Topic that aims to identify, predict, model or analyse genes or gene structure in DNA sequences.
This includes the study of promoters, coding regions, splice sites, etc. Methods for gene prediction might be ab initio, based on phylogenetic comparisons, use motifs, sequence features, support vector machine, alignment etc.
Gene prediction
topics
beta12orEarlier
topic
Transcription
BioCatalogue:Transcription Factors
bioinformatics
beta12orEarlier
topic
Topic concerning the transcription of DNA into mRNA.
edam
topics
Promoters
true
topic
edam
Topic concerning promoters in DNA sequences (region of DNA that facilitates the transcription of a particular gene by binding RNA polymerase and transcription factor proteins).
BioCatalogue:Promoter Prediction
beta13
topics
bioinformatics
beta12orEarlier
Nucleic acid folding
true
beta12orEarlier
edam
Topic concerning the folding (in 3D space) of nucleic acid molecules.
topic
bioinformatics
beta12orEarlier
topics
Gene structure and RNA splicing
topics
This includes splice sites, splicing patterns, splice alternatives or variants, isoforms, etc.
RNA splicing
topic
Gene structure
beta12orEarlier
bioinformatics
edam
Topic concerning introns, exons, gene fusion and RNA splicing (post-transcription RNA modification involving the removal of introns and joining of exons).
Proteomics
BioCatalogue:Proteomics
Protein expression
Proteomics uses high-throughput methods to separate, characterize and identify expressed proteins or analyse protein expression data (for example in different cells or tissues).
bioinformatics
edam
Topic concerning the study of whole proteomes of organisms.
beta12orEarlier
topic
topics
Structural genomics
edam
beta12orEarlier
BioCatalogue:Structural Genomics
topic
topics
bioinformatics
Informatics resources concerning the elucidation of the three-dimensional structure for all (available) proteins in a given organism.
Protein properties
topics
beta12orEarlier
Protein physicochemistry
Topic for the study of the physical and biochemical properties of peptides and proteins.
topic
edam
bioinformatics
Protein interactions
topics
BioCatalogue:Protein Interaction
Topic concerning protein-protein, protein-DNA/RNA and protein-ligand interactions, including analysis of known interactions and prediction of putative interactions.
edam
beta12orEarlier
bioinformatics
This includes experimental (e.g. yeast two-hybrid) and computational analysis techniques.
topic
Protein folding and stability
bioinformatics
edam
beta12orEarlier
topic
topics
Topic concerning protein folding (in 3D space) and protein sequence-structure-function relationships, for example the effect of mutation.
Two-dimensional gel electrophoresis
true
topics
beta12orEarlier
bioinformatics
beta13
Topic concerning two-dimensional gel electrophoresis image and related data.
topic
edam
Mass spectrometry
true
topic
topics
beta13
Topic concerning mass spectrometry and related data.
beta12orEarlier
edam
bioinformatics
Protein microarrays
true
beta13
edam
beta12orEarlier
bioinformatics
topics
topic
Topic concerning protein microarray data.
Protein hydropathy
topic
bioinformatics
topics
beta12orEarlier
Topic for the study of the hydrophobic, hydrophilic and charge properties of a protein.
edam
Protein targeting and localization
edam
beta12orEarlier
topic
topics
bioinformatics
Topic for the study of how proteins are transported within and without the cell, including signal peptides, protein subcellular localization and export.
Protein cleavage sites and proteolysis
bioinformatics
topic
edam
topics
beta12orEarlier
Topic concerning enzyme or chemical cleavage sites and proteolytic or mass calculations on a protein sequence.
Protein structure comparison
true
beta12orEarlier
topics
Use this concept for methods that are exclusively for protein structure.
beta12orEarlier
bioinformatics
topic
Topic concerning the comparison of two or more protein structures.
edam
Protein residue interaction analysis
beta12orEarlier
bioinformatics
Protein residue interactions
topic
topics
The processing and analysis of inter-atomic or inter-residue interactions in protein (3D) structures.
edam
Protein-protein interactions
Topic concerning protein-protein interactions, protein complexes, protein functional coupling etc.
beta12orEarlier
bioinformatics
edam
topics
topic
Protein-ligand interactions
bioinformatics
topic
edam
topics
Topic concerning protein-ligand (small molecule) interactions.
BioCatalogue:Ligand Interaction
beta12orEarlier
Protein-nucleic acid interactions
topics
bioinformatics
topic
Topic concerning protein-DNA/RNA interactions.
edam
beta12orEarlier
Protein design
bioinformatics
topic
Topic concerning the design of proteins with specific properties, typically by designing changes (via site-directed mutagenesis) to an existing protein.
topics
beta12orEarlier
edam
G protein-coupled receptors (GPCR)
true
beta12orEarlier
beta12orEarlier
Topic concerning G-protein coupled receptors (GPCRs).
bioinformatics
topic
edam
topics
Carbohydrates
edam
Topic concerning carbohydrates, typically including structural information.
topics
bioinformatics
beta12orEarlier
topic
Lipids
edam
topics
Topic concerning lipids and their structures.
beta12orEarlier
bioinformatics
topic
Small molecules
Topic concerning small molecules of biological significance, typically including structural information.
edam
topic
beta12orEarlier
CHEBI:23367
bioinformatics
Small molecules include organic molecules, metal-organic compounds, small polypeptides, small polysaccharides and oligonucleotides. Structural data is usually included.
topics
Sequence editing
true
beta12orEarlier
Edit, convert or otherwise change a molecular sequence, either randomly or specifically.
edam
beta12orEarlier
bioinformatics
topic
topics
Sequence composition analysis
topic
topics
bioinformatics
beta12orEarlier
Processing and analysis of the basic character composition of molecular sequences, for example character or word frequency, ambiguity, complexity or repeats.
edam
Sequence motifs
Topic concerning conserved patterns (motifs) in molecular sequences, that (typically) describe functional or other key sites.
bioinformatics
topic
topics
Motifs
edam
beta12orEarlier
Sequence comparison
topic
bioinformatics
topics
BioCatalogue:Nucleotide Sequence Similarity
The comparison might be on the basis of sequence, physico-chemical or some other properties of the sequences.
Topic concerning the comparison of two or more molecular sequences.
edam
beta12orEarlier
BioCatalogue:Protein Sequence Similarity
Sequence sites and features
beta12orEarlier
bioinformatics
topic
edam
topics
Topic concerning the positional features, such as functional and other key sites, in molecular sequences.
Sequence database search
true
beta12orEarlier
topic
edam
The query is a sequence-based entity such as another sequence, a motif or profile.
beta12orEarlier
Search and retrieve molecular sequences that are similar to a sequence-based query (typically a simple sequence).
topics
bioinformatics
Sequence clustering
Topic concerning the comparison and grouping together of molecular sequences on the basis of their similarities.
topics
bioinformatics
topic
beta12orEarlier
edam
This includes systems that generate, process and analyse sequence clusters.
Protein structural motifs and surfaces
topic
topics
Protein surfaces
bioinformatics
beta12orEarlier
Topic concerning (3D) structural features or common 3D motifs within protein structures, including the surface of a protein structure, such as biological interfaces with other molecules.
This includes solvent-exposed surfaces, internal cavities, the analysis of shape, hydropathy, electrostatic patches and so on.
Structural motifs
Protein structural features
Protein structural motifs
edam
Structural (3D) profiles
Structural profiles
edam
bioinformatics
beta12orEarlier
topics
The processing, analysis or use of some type of structural (3D) profile or template; a computational entity (typically a numerical matrix) that is derived from and represents a structure or structure alignment.
topic
Protein structure prediction
BioCatalogue:Protein Structure Prediction
edam
topic
topics
Topic concerning the prediction, modelling, recognition or design of protein secondary or tertiary structure or other structural features.
beta12orEarlier
bioinformatics
Nucleic acid structure prediction
topic
BioCatalogue:Nucleotide Structure Prediction
BioCatalogue:Nucleotide Tertiary Structure
Nucleic acid folding
Topic concerning the folding of nucleic acid molecules and particularly the prediction or design of (typically RNA) secondary or tertiary structure.
bioinformatics
edam
BioCatalogue:Nucleotide Secondary Structure
beta12orEarlier
topics
RNA/DNA structure prediction
Ab initio structure prediction
beta12orEarlier
bioinformatics
de novo protein structure prediction
edam
Topic for the prediction of three-dimensional structure of a (typically protein) sequence from first principles, using a physics-based or empirical scoring function and without using explicit structural templates.
topics
topic
Homology modelling
topics
beta12orEarlier
topic
edam
Topic for the modelling of the three-dimensional structure of a protein using known sequence and structural data.
bioinformatics
Comparative modelling
Molecular dynamics
bioinformatics
beta12orEarlier
Topic concerning the simulation of molecular (typically protein) conformation using a computational model of physical forces and computer simulation.
Molecular flexibility
edam
Molecular motions
topics
This includes resources concerning flexibility and motion in protein and other molecular structures.
topic
Molecular docking
Topic for modelling the structure of proteins in complex with small molecules or other macromolecules.
topics
bioinformatics
edam
topic
beta12orEarlier
Protein secondary structure prediction
topic
beta12orEarlier
edam
Topic concerning the prediction of secondary or supersecondary structure of protein sequences.
topics
bioinformatics
BioCatalogue:Protein Secondary Structure
Protein tertiary structure prediction
Topic concerning the prediction of tertiary structure of protein sequences.
beta12orEarlier
bioinformatics
edam
BioCatalogue:Protein Tertiary Structure
topic
topics
Protein fold recognition
topic
beta12orEarlier
bioinformatics
Topic concerning the recognition (prediction and assignment) of known protein structural domains or folds in protein sequence(s).
edam
topics
Sequence alignment
bioinformatics
BioCatalogue:Protein Multiple Alignment
Topic concerning the alignment of molecular sequences or sequence profiles (representing sequence alignments).
topic
BioCatalogue:Protein Pairwise Alignment
edam
beta12orEarlier
This includes the generation of alignments (the identification of equivalent sites), the analysis of alignments, editing, visualisation, alignment databases, the alignment (equivalence between sites) of sequence profiles (representing sequence alignments) and so on.
BioCatalogue:Nucleotide Multiple Alignment
BioCatalogue:Protein Sequence Alignment
topics
BioCatalogue:Nucleotide Pairwise Alignment
BioCatalogue:Nucleotide Sequence Alignment
Structure alignment
beta12orEarlier
This includes the generation, storage, analysis, rendering etc. of structure alignments.
edam
topics
bioinformatics
topic
Topic concerning the superimposition of molecular tertiary structures or structural (3D) profiles (representing a structure or structure alignment).
Structure alignment generation
Threading
edam
beta12orEarlier
Topic concerning the alignment of molecular sequences to structures, structural (3D) profiles or templates (representing a structure or structure alignment).
topics
bioinformatics
topic
Sequence-structure alignment
Sequence profiles and HMMs
Sequence profiles include position-specific scoring matrix (position weight matrix), hidden Markov models etc.
beta12orEarlier
bioinformatics
Topic concerning sequence profiles; typically a positional, numerical matrix representing a sequence alignment.
topics
edam
topic
Phylogeny reconstruction
topic
BioCatalogue:Tree Inference
topics
Currently too specific for the topic sub-ontology (but might be unobsoleted).
Topic concerning the reconstruction of a phylogeny (evolutionary relatedness amongst organisms), for example, by building a phylogenetic tree.
BioCatalogue:Evolutionary Distance Measurements
bioinformatics
beta12orEarlier
edam
Phylogenomics
topics
edam
topic
Topic concerning the integrated study of evolutionary relationships and whole genome data, for example, in the analysis of species trees, horizontal gene transfer and evolutionary reconstruction.
bioinformatics
beta12orEarlier
Virtual PCR
true
Topic concerning simulated polymerase chain reaction (PCR).
topic
edam
topics
PCR
beta12orEarlier
bioinformatics
beta13
Polymerase chain reaction
Sequence assembly
bioinformatics
Topic concerning the assembly of fragments of a DNA sequence to reconstruct the original sequence.
beta12orEarlier
topic
edam
topics
Genetic variation
Mutation and polymorphism
edam
bioinformatics
DNA variation
topics
beta12orEarlier
Topic concerning DNA sequence variation (mutation and polymorphism) data.
topic
Microarrays
beta12orEarlier
topics
Topic concerning microarrays, for example, to process microarray data or design probes and experiments.
DNA microarrays
edam
topic
BioCatalogue:Microarrays
bioinformatics
Pharmacoinformatics
topic
Computational pharmacology
bioinformatics
edam
Topic for the application of information technology to drug research, including the structure, effects of and response to drugs, drug design and so on.
beta12orEarlier
topics
Transcriptomics
Gene expression resources
Gene expression profiling
Expression profiling
edam
Topic concerning primarily raw or processed gene (mRNA) expression (typically microarray) data, including the analysis of gene expression levels, by identifying, quantifying or comparing mRNA transcripts and the interpretation (in functional terms) of gene expression data.
Gene expression analysis
bioinformatics
http://edamontology.org/topic_0197
topics
topic
beta12orEarlier
This includes microarray data, northern blots, gene-indexed expression profiles and any annotation on genetic information that is used in the synthesis of a protein.
Gene regulation
Gene regulation resources
bioinformatics
beta12orEarlier
topic
edam
Topic concerning primarily the regulation of gene expression.
topics
Pharmacogenomics
bioinformatics
Pharmacogenetics
topic
topics
Topic concerning the influence of genotype on drug response, for example by correlating gene expression or single-nucleotide polymorphisms with drug efficacy or toxicity.
edam
beta12orEarlier
Drug design
edam
bioinformatics
beta12orEarlier
topic
This includes methods that search compound collections, identify or search a database of antimicrobial peptides, generate or analyse drug 3D conformations, identify drug targets with structural docking etc.
Topic concerning the design of drugs or potential drug compounds.
topics
Fish
edam
The resource may be specific to a fish, a group of fish or all fish.
topics
Topic concerning fish, e.g. information on a specific fish genome including molecular sequences, genes and annotation.
bioinformatics
beta12orEarlier
topic
Flies
Fly
edam
topics
beta12orEarlier
topic
bioinformatics
Topic concerning flies, e.g. information on a specific fly genome including molecular sequences, genes and annotation.
The resource may be specific to a fly, a group of flies or all flies.
Mice or rats
Topic concerning mice or rats, e.g. information on a specific genome including molecular sequences, genes and annotation.
topics
Mouse or rat
beta12orEarlier
topic
edam
bioinformatics
The resource may be specific to a group of mice / rats or all mice / rats.
Worms
beta12orEarlier
topics
Worm
bioinformatics
The resource may be specific to a worm, a group of worms or all worms.
edam
Topic concerning worms, e.g. information on a specific worm genome including molecular sequences, genes and annotation.
topic
Literature analysis
topic
topics
BioCatalogue: Document Discovery
BioCatalogue: Literature retrieval
beta12orEarlier
The processing and analysis of the bioinformatics literature and bibliographic data, such as literature search and query.
bioinformatics
Literature search and analysis
Literature sources
edam
Text mining
BioCatalogue:Document Clustering
BioCatalogue:Document Similarity
edam
Text data mining
topic
BioCatalogue:Text Mining
beta12orEarlier
Topic concerning the analysis of the biomedical and informatics literature.
bioinformatics
topics
BioCatalogue:Named Entity Recognition
Annotation
edam
Ontology annotation
Topic for the annotation of entities (typically biological database entries) with terms from a controlled vocabulary.
bioinformatics
BioCatalogue:Ontology Annotation
topics
BioCatalogue:Genome Annotation
beta12orEarlier
topic
Data processing and validation
topic
beta12orEarlier
Data file handling
Report processing
topics
Report handling
bioinformatics
This includes editing, reformatting, conversion, transformation, validation, debugging, indexing and so on.
edam
Topic concerning basic manipulations of files or reports of generic biological data.
File handling
Sequence annotation
true
edam
Annotate a molecular sequence.
bioinformatics
topics
beta12orEarlier
topic
beta12orEarlier
Genome annotation
true
topic
topics
Annotate a genome.
beta12orEarlier
beta12orEarlier
bioinformatics
BioCatalogue:Genome Annotation
edam
NMR
true
Topic concerning raw NMR data.
edam
topic
bioinformatics
beta13
topics
beta12orEarlier
Sequence classification
Methods include sequence motifs, profiles and other diagnostic elements which (typically) represent conserved patterns (of residues or properties) in molecular sequences.
topics
Topic concerning the classification of molecular sequences based on some measure of their similarity.
beta12orEarlier
topic
edam
bioinformatics
Protein classification
bioinformatics
beta12orEarlier
topic
edam
Topic concerning primarily the classification of proteins (from sequence or structural data) into clusters, groups, families etc.
topics
Sequence motif or profile
true
beta12orEarlier
Topic concerning sequence motifs, or sequence profiles derived from an alignment of molecular sequences of a particular type.
edam
This includes comparison, discovery, recognition etc. of sequence motifs.
topics
bioinformatics
topic
beta12orEarlier
Protein modifications
edam
MOD:00000
topic
bioinformatics
EDAM does not describe all possible protein modifications. For fine-grained annotation of protein modification use the Gene Ontology (children of concept GO:0006464) and/or the Protein Modifications ontology (children of concept MOD:00000)
topics
GO:0006464
beta12orEarlier
Topic concerning protein chemical modifications, e.g. post-translational modifications.
Protein post-translational modification
Pathways, networks and models
Topic concerning biological pathways, networks and other models, including their construction and analysis.
BioCatalogue:Pathways
Network or pathway analysis
edam
http://edamontology.org/topic_3076
beta13
bioinformatics
BioCatalogue:Pathway Retrieval
topic
topics
Informatics
true
bioinformatics
topic
beta12orEarlier
beta12orEarlier
A database concerning biological data management and modelling, including datatypes, workflows and models. A sub-discipline of bioinformatics; the application of information technology to a specialised biological area.
edam
topics
Literature data resources
beta12orEarlier
topics
topic
edam
Data resources for the biological or biomedical literature, either a primary source of literature or some derivative.
bioinformatics
Laboratory resources
topic
Topic concerning biological resources for use in the lab including cell lines, viruses, plasmids, phages, DNA probes and primers and so on.
beta12orEarlier
topics
edam
bioinformatics
Cell culture resources
beta12orEarlier
topics
Topic concerning general cell culture or data on specific cell lines.
edam
topic
bioinformatics
Ecoinformatics
Topic concerning the application of information technology to the ecological and environmental sciences.
Computational ecology
edam
beta12orEarlier
bioinformatics
topic
topics
Ecological informatics
Electron microscopy
true
edam
Topic concerning electron microscopy data.
bioinformatics
beta12orEarlier
topics
beta13
topic
Cell cycle
true
topic
beta12orEarlier
topics
edam
Topic concerning the cell cycle including key genes and proteins.
beta13
bioinformatics
Peptides and amino acids
bioinformatics
topics
edam
Topic concerning the physicochemical, biochemical or structural properties of amino acids or peptides.
topic
beta12orEarlier
Organelle genes and proteins
beta12orEarlier
bioinformatics
topics
edam
Topic concerning a specific organelle, or organelles in general, typically the genes and proteins (or genome and proteome).
topic
Ribosomal genes and proteins
Ribosome genes and proteins
topic
bioinformatics
topics
Topic concerning ribosomes, typically of ribosome-related genes and proteins.
beta12orEarlier
edam
Scents
true
beta13
edam
beta12orEarlier
topic
A database about scents.
topics
bioinformatics
Drugs and targets
topic
edam
topics
beta12orEarlier
Topic concerning the structures of drugs, drug targets, their interactions and binding affinities.
bioinformatics
Genome, proteome and model organisms
topic
General information on one or more organisms, genomes (including molecular sequences and maps, genes and annotation) and proteomes may be included.
topics
beta12orEarlier
Genome map
edam
bioinformatics
Topic concerning the genome, proteome or other information about a specific organism, such as a model organism, or group of organisms.
Genomics
BioCatalogue:Functional Genomics
beta12orEarlier
Topic concerning whole genomes of one or more organisms, or genomes in general, such as meta-information on genomes, genome projects, gene names etc.
bioinformatics
topic
topics
BioCatalogue:Genomics
edam
Genes, gene family or system
bioinformatics
topics
edam
topic
Topic concerning particular gene(s), gene system or groups of genes.
beta12orEarlier
Chromosomes
edam
Topic concerning chromosomes.
beta12orEarlier
bioinformatics
topics
topic
Genotype and phenotype
bioinformatics
beta12orEarlier
topics
edam
Genotyping
Topic concerning the study of the genetic constitution of a living entity, such as an individual, an organism, a cell and so on, typically with respect to particular observable phenotypic traits, or resources concerning such traits, which might be an aspect of biochemistry, physiology, morphology, anatomy, development and so on.
Genotype and phenotype resources
topic
Gene expression and microarray
true
topic
beta12orEarlier
bioinformatics
edam
topics
beta12orEarlier
Topic concerning gene expression e.g. microarray data, northern blots, gene-indexed expression profiles etc.
Probes and primers
edam
beta12orEarlier
bioinformatics
topics
Topic concerning molecular probes (e.g. a peptide probe or DNA microarray probe) or primers (e.g. for PCR).
topic
Disease resources
beta12orEarlier
Topic concerning diseases.
topics
edam
bioinformatics
topic
Specific protein resources
topic
beta12orEarlier
topics
Specific protein
edam
Topic concerning a particular protein, protein family or other group of proteins.
bioinformatics
Taxonomy
bioinformatics
topic
Topic concerning organism classification, identification and naming.
topics
beta12orEarlier
edam
Protein sequence analysis
topic
topics
BioCatalogue:Protein Sequence Analysis
Processing and analysis of protein sequences and sequence-based entities such as alignments, motifs and profiles.
edam
beta12orEarlier
bioinformatics
Nucleic acid sequence analysis
topic
topics
edam
bioinformatics
Processing and analysis of nucleotide sequences and sequence-based entities such as alignments, motifs and profiles.
BioCatalogue:Nucleotide Sequence Analysis
beta12orEarlier
Repeat sequences
Repeat sequence
edam
topics
beta12orEarlier
bioinformatics
topic
BioCatalogue:Repeats
Topic concerning the repetitive nature of molecular sequences.
Low complexity sequences
edam
topics
beta12orEarlier
bioinformatics
Topic concerning the (character) complexity of molecular sequences, particularly regions of low complexity.
topic
Proteome
true
edam
bioinformatics
beta13
topics
beta12orEarlier
Topic concerning a specific proteome including protein sequences and annotation.
topic
DNA
The DNA sequences might be coding or non-coding sequences.
bioinformatics
topics
edam
topic
Topic concerning DNA sequences and structure, including processes such as methylation and replication.
beta12orEarlier
DNA analysis
mRNA, EST or cDNA database
beta12orEarlier
edam
mRNA, EST or cDNA
topics
Transcriptome database
Topic concerning data resources for messenger RNA (mRNA), expressed sequence tag (EST) or complementary DNA (cDNA) sequences.
topic
Transcriptome
bioinformatics
Functional and non-coding RNA
bioinformatics
topic
beta12orEarlier
edam
Non-coding RNA
For example, piwi-interacting RNA (piRNA), small nuclear RNA (snRNA) and small nucleolar RNA (snoRNA).
Topic concerning functional or non-coding RNA sequences.
Functional RNA
topics
rRNA
topics
bioinformatics
Topic concerning one or more ribosomal RNA (rRNA) sequences.
beta12orEarlier
edam
topic
tRNA
topic
bioinformatics
Topic concerning one or more transfer RNA (tRNA) sequences.
edam
beta12orEarlier
topics
Protein secondary structure
topics
This includes assignment, analysis, comparison, prediction, rendering etc. of secondary structure data.
bioinformatics
topic
Topic concerning protein secondary structure or secondary structure alignments.
edam
Protein secondary structure analysis
beta12orEarlier
RNA structure and alignment
beta12orEarlier
Topic concerning RNA secondary or tertiary structure and alignments.
RNA alignment
edam
topics
RNA structure alignment
bioinformatics
topic
RNA structure
Protein tertiary structure
topics
beta12orEarlier
Protein tertiary structure analysis
Topic concerning protein tertiary structures.
edam
topic
bioinformatics
Nucleic acid classification
topics
beta12orEarlier
topic
bioinformatics
Topic concerning nucleic acid classification (typically sequence classification).
edam
Protein families
Topic concerning primarily proteins that have been classified as members of a protein family (or other grouping).
A protein families database might include the classifier (e.g. a sequence profile) used to build the classification.
beta12orEarlier
topics
Protein sequence classification
Protein secondary
topic
edam
bioinformatics
Protein domains and folds
topic
BioCatalogue:Domains
topics
edam
bioinformatics
Topic concerning protein tertiary structural domains and folds.
beta12orEarlier
Nucleic acid sequence alignment
bioinformatics
topic
Topic concerning nucleotide sequence alignments.
edam
beta12orEarlier
topics
Protein sequence alignment
topics
topic
A sequence profile typically represents a sequence alignment.
Topic concerning protein sequence alignments.
bioinformatics
edam
beta12orEarlier
Nucleic acid sites and features
Nucleic acid functional sites
topics
Topic concerning positional features such as functional sites in nucleotide sequences.
bioinformatics
topic
Nucleic acid features
edam
beta12orEarlier
Protein sites and features
edam
Protein functional sites
topic
topics
Topic concerning positional features such as functional sites in protein sequences.
beta12orEarlier
Protein sequence features
bioinformatics
Transcription factors and regulatory sites
Topic concerning transcription factors; proteins that bind to DNA and control transcription of DNA to mRNA, either promoting (as an activator) or blocking (as a repressor) the binding to DNA of RNA polymerase, and also transcriptional regulatory sites, elements and regions (such as promoters, enhancers, silencers and boundary elements / insulators) in nucleotide sequences.
This includes promoters, enhancers, silencers and boundary elements / insulators. This includes sequence and structural information, binding profiles etc., and may also include the transcription factor binding site in DNA.
beta12orEarlier
Transcriptional regulatory sites
bioinformatics
Transcription factor and binding site
Transcription factors
edam
topic
topics
Phosphorylation sites
true
beta12orEarlier
Topic concerning protein phosphorylation and phosphorylation sites in protein sequences.
edam
topic
topics
bioinformatics
1.0
Metabolic pathways
topic
Topic concerning metabolic pathways.
edam
bioinformatics
beta12orEarlier
topics
Signaling pathways
topic
beta12orEarlier
Topic concerning signaling pathways.
topics
edam
bioinformatics
Protein and peptide identification
beta12orEarlier
Peptide identification and proteolysis
edam
bioinformatics
Proteomics data resources
Topic concerning protein and peptide identification including proteomics experiments such as mass spectrometry, two-dimensional gel electrophoresis and protein microarrays.
Proteomics data
This includes the results of any methods that separate, characterize and identify expressed proteins.
topics
topic
Workflows
true
topics
1.0
bioinformatics
beta12orEarlier
Topic concerning biological or biomedical analytical workflows or pipelines.
topic
edam
Data types and objects
true
topics
1.0
beta12orEarlier
topic
bioinformatics
Topic concerning structuring data into basic types and (computational) objects.
edam
Biological models
beta12orEarlier
topics
Topic concerning mathematical or other models of biological processes.
bioinformatics
topic
This includes databases of models and methods to construct or analyse a model.
edam
BioCatalogue:Model Creation
Mitochondrial genes and proteins
Mitochondria genes and proteins
topic
topics
Topic concerning mitochondria, typically of mitochondrial genes and proteins.
bioinformatics
beta12orEarlier
edam
Plants
bioinformatics
topics
beta12orEarlier
Topic concerning plants, e.g. information on a specific plant genome including molecular sequences, genes and annotation.
topic
edam
The resource may be specific to a plant, a group of plants or all plants.
Plant
Viruses
bioinformatics
The resource may be specific to a virus, a group of viruses or all viruses.
topics
beta12orEarlier
Virus
Topic concerning viruses, e.g. sequence and structural data, interactions of viral proteins, or a viral genome including molecular sequences, genes and annotation.
edam
topic
Fungi
topics
Fungal
The resource may be specific to a fungus, a group of fungi or all fungi.
bioinformatics
Topic concerning fungi and molds, e.g. information on a specific fungal genome including molecular sequences, genes and annotation.
edam
topic
beta12orEarlier
Pathogens
Topic concerning pathogens, e.g. information on a specific pathogen genome including molecular sequences, genes and annotation.
edam
bioinformatics
The resource may be specific to a pathogen, a group of pathogens or all pathogens.
beta12orEarlier
topics
topic
Pathogen
Arabidopsis
topic
beta12orEarlier
bioinformatics
Topic concerning Arabidopsis-specific data.
topics
edam
Rice
beta12orEarlier
Topic concerning rice-specific data.
topics
edam
bioinformatics
topic
Genetic mapping and linkage
Informatics resources that aim to identify, map or analyse genetic markers in DNA sequences, for example to produce a genetic (linkage) map of a chromosome or genome or to analyse genetic linkage and synteny.
Genetic linkage
topics
topic
edam
bioinformatics
beta12orEarlier
Linkage mapping
Comparative genomics
BioCatalogue:Comparative Genomics
bioinformatics
topic
edam
beta12orEarlier
topics
Topic concerning the study (typically comparison) of the sequence, structure or function of multiple genomes.
Mobile genetic elements
edam
bioinformatics
Topic concerning mobile genetic elements, such as transposons, plasmids, bacteriophage elements and group II introns.
beta12orEarlier
topic
topics
Human disease
true
Topic concerning human diseases, typically describing the genes, mutations and proteins implicated in disease.
bioinformatics
edam
beta12orEarlier
topic
topics
beta13
Immunoinformatics
Topic for the application of information technology to immunology such as immunological processes, immunological genes, proteins and peptide ligands, antigens and so on.
edam
beta12orEarlier
topics
bioinformatics
Computational immunology
topic
Membrane proteins
Topic concerning a protein or region of a protein that spans a membrane.
topics
bioinformatics
Transmembrane proteins
topic
beta12orEarlier
edam
Enzymes and reactions
topics
bioinformatics
edam
beta12orEarlier
topic
Topic concerning proteins that catalyze chemical reactions, the kinetics of enzyme-catalysed reactions, enzyme nomenclature etc.
Structure comparison
beta12orEarlier
bioinformatics
Topic concerning the comparison of two or more molecular structures.
topics
This might involve comparison of secondary or tertiary (3D) structural information.
edam
topic
Protein function analysis
beta12orEarlier
edam
Topic for the study of protein function.
bioinformatics
topics
topic
Prokaryotes and archaea
Prokaryote and archaea
Topic concerning specific bacteria or archaea, e.g. information on a specific prokaryote genome including molecular sequences, genes and annotation.
beta12orEarlier
topic
topics
bioinformatics
edam
The resource may be specific to a prokaryote, a group of prokaryotes or all prokaryotes.
Protein databases
topics
Protein data resources
Topic concerning protein data resources.
beta12orEarlier
topic
bioinformatics
edam
Structure determination
Raw structural data analysis
Structure assignment
topics
Structural determination
edam
Topic concerning experimental methods for biomolecular structure determination, such as X-ray crystallography, nuclear magnetic resonance (NMR), circular dichroism (CD) spectroscopy, including the assignment or modelling of molecular structure from such data.
beta12orEarlier
topic
Structural assignment
bioinformatics
Cell biology resources
Topic concerning cells, such as key genes and proteins involved in the cell cycle.
bioinformatics
edam
topic
topics
beta12orEarlier
Classification
true
edam
topics
Topic focused on identifying, grouping, or naming things in a structured way according to some schema based on observable relationships.
beta12orEarlier
bioinformatics
topic
beta13
Lipoproteins
topic
topics
beta12orEarlier
Topic concerning lipoproteins (protein-lipid assemblies).
bioinformatics
edam
Phylogeny visualisation
true
bioinformatics
edam
topics
beta12orEarlier
beta12orEarlier
topic
BioCatalogue:Tree Display
Visualise a phylogeny, for example, render a phylogenetic tree.
Chemoinformatics
edam
bioinformatics
Chemical informatics
BioCatalogue:Chemoinformatics
topic
Computational chemistry
topics
beta12orEarlier
Cheminformatics
Topic for the application of information technology to chemistry.
Systems biology
BioCatalogue:Systems Biology
topic
edam
bioinformatics
topics
Topic concerning the holistic modelling and analysis of biological systems and the interactions therein.
beta12orEarlier
Biostatistics
Topic for the application of statistical methods to biological problems.
beta12orEarlier
BioCatalogue:Biostatistics
Biometry
bioinformatics
topics
topic
Biometrics
edam
Structure database search
true
edam
Search for and retrieve molecular structures that are similar to a structure-based query (typically another structure or part of a structure).
topic
beta12orEarlier
beta12orEarlier
bioinformatics
topics
The query is a structure-based entity such as another structure, a 3D (structural) motif, 3D profile or template.
Molecular modelling
edam
Topic for the construction, analysis, evaluation, refinement etc. of models of a molecule's properties or behaviour.
beta12orEarlier
topics
bioinformatics
topic
Protein function prediction
Topic concerning the prediction of functional properties of a protein.
bioinformatics
topics
topic
BioCatalogue:Function Prediction
edam
beta12orEarlier
SNPs
A SNP is a DNA sequence variation where a single nucleotide differs between members of a species or paired chromosomes in an individual.
Topic concerning single nucleotide polymorphisms (SNPs) and associated data, for example, the discovery and annotation of SNPs.
edam
bioinformatics
beta12orEarlier
topics
topic
Transmembrane protein prediction
true
topics
beta12orEarlier
beta12orEarlier
Predict transmembrane domains and topology in protein sequences.
topic
bioinformatics
edam
Nucleic acid structure comparison
true
beta12orEarlier
beta12orEarlier
topic
Use this concept for methods that are exclusively for nucleic acid structures.
topics
Topic concerning the comparison of two or more nucleic acid (typically RNA) secondary or tertiary structures.
edam
bioinformatics
Cancer
Informatics resources dedicated to the study of cancer, for example, genes and proteins implicated in cancer.
topics
edam
beta12orEarlier
Cancer resources
topic
bioinformatics
Toxins and targets
beta12orEarlier
topic
Topic concerning structural and associated data for toxic chemical substances.
topics
bioinformatics
edam
Tool topic
true
beta12orEarlier
edam
topic
bioinformatics
A topic concerning primarily bioinformatics software tools, typically the broad function or purpose of a tool.
topics
beta12orEarlier
Study topic
true
beta12orEarlier
bioinformatics
edam
topic
topics
beta12orEarlier
A general area of bioinformatics study, typically the broad scope or category of content of a bioinformatics journal or conference proceeding.
Nomenclature
topics
bioinformatics
beta12orEarlier
Topic concerning biological nomenclature (naming), symbols and terminology.
edam
topic
Disease genes and proteins
edam
beta12orEarlier
Topic concerning the genes, gene variations and proteins involved in one or more specific diseases.
topic
bioinformatics
topics
Protein structure analysis
Topic concerning protein secondary or tertiary structural data and/or associated annotation.
edam
beta12orEarlier
http://edamontology.org/topic_3040
Protein structure
topics
bioinformatics
topic
Humans
bioinformatics
edam
topic
The resource may be specific to a human, a group of humans or all humans.
Topic concerning the human genome, including molecular sequences, genes, annotation, maps and viewers, the human proteome or human beings in general.
Human
topics
beta12orEarlier
Gene resources
Gene database
Informatics resource (typically a database) primarily focussed on genes.
edam
bioinformatics
topic
beta12orEarlier
topics
Gene resource
Yeast
edam
topics
topic
beta12orEarlier
Topic concerning yeast, e.g. information on a specific yeast genome including molecular sequences, genes and annotation.
bioinformatics
Eukaryotes
beta12orEarlier
edam
The resource may be specific to a eukaryote, a group of eukaryotes or all eukaryotes.
Topic concerning eukaryotes or data concerning eukaryotes, e.g. information on a specific eukaryote genome including molecular sequences, genes and annotation.
topic
Eukaryote
topics
bioinformatics
Invertebrates
Topic concerning invertebrates, e.g. information on a specific invertebrate genome including molecular sequences, genes and annotation.
topics
bioinformatics
beta12orEarlier
The resource may be specific to an invertebrate, a group of invertebrates or all invertebrates.
edam
topic
Vertebrates
beta12orEarlier
Topic concerning vertebrates, e.g. information on a specific vertebrate genome including molecular sequences, genes and annotation.
topics
topic
bioinformatics
edam
The resource may be specific to a vertebrate, a group of vertebrates or all vertebrates.
Vertebrate
Unicellular eukaryotes
bioinformatics
Topic concerning unicellular eukaryotes, e.g. information on a unicellular eukaryote genome including molecular sequences, genes and annotation.
Unicellular eukaryote
topics
The resource may be specific to a unicellular eukaryote, a group of unicellular eukaryotes or all unicellular eukaryotes.
beta12orEarlier
topic
edam
Protein structure alignment
topics
beta12orEarlier
Topic concerning protein secondary or tertiary structure alignments.
edam
topic
bioinformatics
X-ray crystallography
true
bioinformatics
topics
edam
beta12orEarlier
topic
Topic concerning X-ray crystallography data.
beta13
Ontologies, nomenclature and classification
topics
beta12orEarlier
Topic concerning conceptualisation, categorisation and naming, including resources that help to identify, group or name things in a structured way according to some schema based on observable relationships.
topic
edam
bioinformatics
Immunity genes, immunoproteins and antigens
edam
topics
beta12orEarlier
bioinformatics
This includes T cell receptors (TR), major histocompatibility complex (MHC), immunoglobulin superfamily (IgSF) / antibodies, major histocompatibility complex superfamily (MhcSF), etc.
Immunoproteins and immunopeptides
topic
Topic concerning immunity-related genes, proteins and their ligands.
Molecules
true
edam
beta12orEarlier
beta12orEarlier
Topic concerning specific molecules, including large molecules built from repeating subunits (macromolecules) and small molecules of biological significance.
topics
topic
CHEBI:23367
bioinformatics
Toxicoinformatics
Computational toxicology
topic
Topic concerning the adverse effects of chemical substances on living organisms.
beta12orEarlier
edam
bioinformatics
topics
High-throughput sequencing
true
topics
beta12orEarlier
beta13
edam
bioinformatics
Topic concerning parallelized sequencing processes that are capable of sequencing many thousands of sequences simultaneously.
topic
Next-generation sequencing
Structural clustering
beta12orEarlier
topics
topic
bioinformatics
Topic concerning the comparison and grouping together of molecular structures on the basis of similarity; generate, process or analyse structural clusters.
Structure classification
edam
Gene regulatory networks
topic
beta12orEarlier
topics
edam
bioinformatics
Topic concerning gene regulatory networks.
Disease (specific)
true
topic
edam
bioinformatics
beta12orEarlier
topics
Informatics resources dedicated to one or more specific diseases (not diseases in general).
beta12orEarlier
Nucleic acid design
bioinformatics
topics
topic
edam
beta12orEarlier
Topic for the design of nucleic acid sequences with specific conformations.
Primer or probe design
beta13
bioinformatics
Topic concerning the design of primers for PCR and DNA amplification or the design of molecular probes.
edam
topics
topic
Structure databases
topic
Topic concerning molecular secondary or tertiary (3D) structural data resources, typically of proteins and nucleic acids.
Structure data resources
bioinformatics
edam
topics
beta13
Nucleic acid structure
edam
topics
bioinformatics
Topic concerning nucleic acid (secondary or tertiary) structure, such as whole structures, structural features and associated annotation.
beta13
topic
Sequence databases
edam
bioinformatics
Sequence data resources
topics
beta13
Topic concerning molecular sequence data resources, including sequence sites, alignments, motifs and profiles.
Sequence data resource
Sequence data
topic
Nucleic acid sequences
topic
topics
bioinformatics
edam
Topic concerning nucleotide sequences and associated concepts such as sequence sites, alignments, motifs and profiles.
beta13
Protein sequences
edam
beta13
topic
topics
BioCatalogue:Protein Sequence Analysis
bioinformatics
Topic concerning protein sequences and associated concepts such as sequence sites, alignments, motifs and profiles.
Protein interaction networks
Topic concerning protein-protein interaction networks.
edam
topic
topics
beta13
bioinformatics
Molecular biology reference
topic
bioinformatics
BioCatalogue: Document Discovery
topics
BioCatalogue: Literature retrieval
Topic concerning general molecular biology information extracted from the literature.
edam
beta13
Mammals
bioinformatics
topics
Topic concerning mammals, e.g. information on a specific mammal genome including molecular sequences, genes and annotation.
beta13
topic
edam
Biodiversity
topics
Biodiversity data resources
Topic concerning the degree of variation of life forms within a given ecosystem, biome or an entire planet.
Biodiversity data resource
topic
bioinformatics
beta13
edam
Sequence clusters and classification
beta13
Sequence clusters
topic
Sequence families
Topic concerning the comparison, grouping together and classification of macromolecules on the basis of sequence similarity.
edam
topics
bioinformatics
This includes the results of sequence clustering, ortholog identification, assignment to families, annotation etc.
Genetics
bioinformatics
beta13
edam
Genetics data resources
topic
topics
Topic concerning the study of genes, genetic variation and heredity in living organisms.
Quantitative genetics
Topic concerning the genes, Mendelian inheritance and mechanisms underlying continuous phenotypic traits (such as height or weight).
topics
edam
bioinformatics
topic
beta13
Population genetics
topics
topic
beta13
Topic concerning the distribution of allele frequencies in a population of organisms and its change subject to evolutionary processes including natural selection, genetic drift, mutation and gene flow.
edam
bioinformatics
Regulatory RNA
beta13
bioinformatics
Topic concerning regulatory RNA sequences including microRNA (miRNA) and small interfering RNA (siRNA).
edam
topic
topics
MicroRNAs are short single-stranded RNA molecules that regulate gene expression.
Documentation and help
bioinformatics
beta13
topics
Topic concerning documentation and getting help.
topic
edam
Genetic organisation
topics
bioinformatics
edam
beta13
Topic concerning the structural and functional organisation of genes and other genetic elements.
topic
Medical informatics resources
Health and disease
beta13
Topic for the application of information technology to health, disease and biomedicine.
Healthcare informatics
Biomedical informatics
edam
Clinical informatics
Health informatics
bioinformatics
topics
topic
Developmental biology resources
beta13
bioinformatics
topic
edam
Topic concerning how organisms grow and develop.
topics
Embryology resources
topic
Topic concerning the development of organisms between the one-cell stage (typically the zygote) and the end of the embryonic stage.
topics
beta13
edam
bioinformatics
Anatomy resources
topics
topic
Topic concerning the structures of living organisms.
beta13
bioinformatics
edam
Literature and reference
bioinformatics
edam
Topic concerning the scientific literature, reference information and documentation.
topics
topic
beta13
Biological science resources
topics
Phenotype resource
bioinformatics
topic
edam
Topic concerning a particular biological science, especially observable traits such as aspects of biochemistry, physiology, morphology, anatomy, development and so on.
beta13
Biological data resources
Biological data resource
topics
bioinformatics
Biological databases
topic
beta13
edam
A topic concerning primarily a specific type of bioinformatics data, typically the broad category of content of a digital archive of biological data, including databanks, databases proper, web portals and other data resources.
Sequence feature detection
edam
beta13
topics
Topic concerning the detection of the positional features, such as functional and other key sites, in molecular sequences.
topic
bioinformatics
Nucleic acid feature detection
edam
beta13
topic
Topic concerning the detection of positional features such as functional sites in nucleotide sequences.
bioinformatics
topics
Protein feature detection
topic
edam
bioinformatics
topics
Topic concerning the detection, identification and analysis of positional protein sequence features, such as functional sites.
beta13
Biological system modelling
Topic for modelling biological systems in mathematical terms.
beta13
topics
BioCatalogue:Model Execution
bioinformatics
topic
BioCatalogue:Model Analysis
edam
Data acquisition and deposition
Topic concerning the acquisition and deposition of biological data.
edam
Database submission
beta13
bioinformatics
topics
topic
Gene and protein resources
topic
topics
Genes and proteins resources
bioinformatics
Topic concerning specific genes and their encoded proteins or a related group of such genes and proteins.
beta13
edam
Sequencing
topics
bioinformatics
1.1
topic
Topic concerning the determination of complete (typically nucleotide) sequences, including those of genomes (full genome sequencing, de novo sequencing and resequencing), amplicons and transcriptomes.
edam
ChIP-seq
Chip-sequencing
1.1
Chip seq
Topic concerning the analysis of protein-DNA interactions where chromatin immunoprecipitation (ChIP) is used in combination with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins.
bioinformatics
topics
edam
Chip sequencing
topic
RNA-Seq
This includes small RNA profiling (small RNA-Seq), for example to find novel small RNAs, characterize mutations and analyze expression of small RNAs.
WTSS
topics
A topic concerning high-throughput sequencing of cDNA to measure the RNA content (transcriptome) of a sample, for example, to investigate how different alleles of a gene are expressed, detect post-transcriptional mutations or identify gene fusions.
Small RNA-Seq
edam
bioinformatics
1.1
topic
Whole transcriptome shotgun sequencing
RNA-seq
Small RNA-seq
DNA methylation
topic
1.1
bioinformatics
topics
edam
Topic concerning DNA methylation including bisulfite sequencing, methylation sites and analysis, for example of patterns and profiles of DNA methylation in a population, tissue etc.
Metabolomics
Topic concerning the study of metabolites and the chemical processes they are involved in, especially the systematic study of the chemical fingerprints of specific cellular processes.
bioinformatics
1.1
topic
edam
topics
Epigenomics
A topic concerning the study of the epigenetic modifications of a whole cell, tissue, organism etc.
bioinformatics
topics
edam
1.1
Epigenetics concerns the heritable changes in gene expression owing to mechanisms other than DNA sequence variation.
topic
Epigenetics
Metagenomics
1.1
edam
Topic concerning the study of genetic material recovered from environmental samples.
bioinformatics
topic
Environmental genomics
Ecogenomics
topics
Community genomics
Structural variation
Genomic structural variation
edam
bioinformatics
Topic concerning variation in chromosome structure including microscopic and submicroscopic types of variation such as deletions, duplications, copy-number variants, insertions, inversions and translocations.
1.1
topic
topics
DNA packaging
topic
Topic concerning DNA-histone complexes (chromatin), organisation of chromatin into nucleosomes and packaging into higher-order structures.
topics
edam
beta12orEarlier
bioinformatics
DNA-Seq
topics
DNA-seq
edam
bioinformatics
A topic concerning high-throughput sequencing of randomly fragmented genomic DNA, for example, to investigate whole-genome sequencing and resequencing, SNP discovery, identification of copy number variations and chromosomal rearrangements.
1.1
topic
RNA-Seq alignment
topics
beta12orEarlier
topic
bioinformatics
RNA-seq alignment
Topic concerning the alignment of (typically millions of) short reads to a reference genome. This is a specialised topic within sequence alignment, especially because of complications arising from RNA splicing.
edam
ChIP-on-chip
ChIP-chip
1.1
Topic concerning experimental techniques that combine chromatin immunoprecipitation ('ChIP') with microarray ('chip'). ChIP-on-chip is used for high-throughput study of protein-DNA interactions.
edam
topics
topic
bioinformatics
Obsolete concept (EDAM)
true
An obsolete concept (redefined in EDAM).
Needed for conversion to the OBO format.
1.2 / http://www.geneontology.org/formats/oboInOwl