GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. It is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI.
Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.
Not sure which gene database to use from the NIH/NLM or can't find the one you're looking for listed here? Click this link to find all the gene- and genomics-related databases.
Tools
May require download of software or use of open-source platforms to run searches or sequences.
An interactive graphical viewer that allows users to explore variant calls, genotype calls, and supporting evidence (such as aligned sequence reads) produced by the 1000 Genomes Project.
BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches.
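To make "regions of similarity" concrete, here is a minimal sketch of local alignment scoring using the Smith-Waterman dynamic program. This is not BLAST's own algorithm (BLAST uses a faster heuristic search); it only illustrates the kind of local match that such tools score. The function name and the match, mismatch, and gap values are assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, (end_i, end_j)) for the best local alignment.

    Smith-Waterman with a linear gap penalty. The scoring values are
    illustrative assumptions, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row
    best = 0
    best_pos = (0, 0)
    for i, ca in enumerate(a, start=1):
        curr = [0]  # column 0 is always 0 in local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A score never drops below 0, so an alignment can start anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            if score > best:
                best = score
                best_pos = (i, j)
        prev = curr
    return best, best_pos


if __name__ == "__main__":
    score, end = local_alignment_score("TTACGTTT", "GGACGGG")
    print(score, end)
```

The two example sequences share the short region "ACG", so the best local score comes from that region alone, regardless of the unrelated flanking bases.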
AlphaFold Database provides open access to protein structure predictions for the human proteome and 20 other key organisms to accelerate scientific research.
The mission of UniProt is to provide the scientific community with a comprehensive, high-quality, and freely accessible resource of protein sequence and functional information. Also available through UniProt are UniRef, UniParc, Proteomes, and UniProtKB (Swiss-Prot and TrEMBL).
Not sure which protein database to use from the NIH/NLM or can't find the one you're looking for listed here? Click this link to find all the protein-related databases.
Tools
May require download of software or use of open-source platforms to run searches or sequences.
BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches.
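As a rough illustration of what "statistical significance" means here, the sketch below computes an expect value (E-value) from the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where S is the raw alignment score, m the query length, and n the database length. The function names are hypothetical, and the lambda and K values are placeholders for illustration only; in practice they depend on the scoring system, and BLAST also applies length corrections that are omitted here.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a normalized bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance hits scoring at least raw_score.

    Uses E = K * m * n * exp(-lambda * S) with no edge-effect
    (effective length) correction, which is a simplifying assumption.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


if __name__ == "__main__":
    # Placeholder statistical parameters, chosen only for illustration.
    lam, k = 0.318, 0.134
    for s in (40, 80, 120):
        e = expect_value(s, query_len=300, db_len=10_000_000, lam=lam, k=k)
        print(s, round(bit_score(s, lam, k), 1), f"{e:.3g}")
```

A smaller E-value means a match of that score is less likely to arise by chance in a database of that size, which is why higher-scoring hits are reported as more significant.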
COBALT is a multiple sequence alignment tool that finds a collection of pairwise constraints derived from the conserved domain database, the protein motif database, and sequence similarity, using RPS-BLAST, BLASTP, and PHI-BLAST.
Pairwise constraints are then incorporated into a progressive multiple alignment.
More details in the paper: Papadopoulos, J.S. and Agarwala, R. (2007). Bioinformatics 23:1073-79. PMID: 17332019