Help for Motif Search


Search with a protein sequence against the Pfam library

OVERVIEW

Pfam [1] is a large collection of protein families and domains. The database contains multiple sequence alignments and hidden Markov models (HMMs) covering many common protein domains and families. The program hmmscan [2, 3] is used to search against the Pfam database.

SCORE

hmmscan reports the match between the query sequence and each domain found in the Pfam library as both a bit score and an E-value. In this service the cut-off threshold must be given as an E-value. Motifs found with an E-value smaller than the threshold are listed. A smaller cut-off E-value makes the search more selective.
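
As a rough illustration of how the cut-off works, the following Python sketch keeps only the hits whose E-value is smaller than a chosen threshold. The Hit record, the filter_hits function, and the example values are hypothetical; they are not output of this service and not part of hmmscan.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One reported domain hit (hypothetical record layout)."""
    name: str
    start: int
    end: int
    bit_score: float
    e_value: float


def filter_hits(hits: List[Hit], cutoff: float) -> List[Hit]:
    """Return hits with an E-value strictly smaller than cutoff, best first."""
    kept = [h for h in hits if h.e_value < cutoff]
    return sorted(kept, key=lambda h: h.e_value)


# Made-up hits for illustration only; not real search results.
example_hits = [
    Hit("domain_A", 12, 98, 85.2, 3e-25),
    Hit("domain_B", 130, 171, 14.1, 0.02),
    Hit("domain_C", 200, 240, 6.3, 4.5),
]

for cutoff in (1.0, 1e-5):
    names = [h.name for h in filter_hits(example_hits, cutoff)]
    print(f"cutoff {cutoff:g}: {names}")
```

With these made-up values, a cut-off of 1.0 keeps two hits and a cut-off of 1e-5 keeps only one, which is the sense in which a smaller cut-off is more selective.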

RESULTS

The motifs found are listed in a table. From the Pfam ID numbers you can jump to DBGET to look at the hits and related information in detail. The Position (E-value) column of the table lists the position (start and end sequence numbers) and the score of each motif found. Click the Detail button to see the actual positions of the motif along the query sequence (shown in red).

USER'S PROFILE LIBRARY

You can search your query sequence against a user-defined profile library that contains one or more profiles in HMMER save file format. Check the "User-defined Profile Library" box in the motif library list and provide the name of the file containing the profiles.
The user-defined profile library may be a subset of the original Pfam database or one generated from multiple sequence alignment data.
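
A minimal sketch of how such a library file might be prepared locally before uploading it, assuming the HMMER 3 command-line tools are installed. hmmbuild and hmmfetch are standard HMMER programs; the helper functions and all file names (my_family.sto, Pfam-A.hmm, wanted_names.txt, and the output files) are hypothetical, and the HMMER version this service expects is not stated here.

```python
import shutil
import subprocess
from typing import List


def hmmbuild_cmd(alignment: str, hmm_out: str) -> List[str]:
    """Command that builds a profile HMM from a multiple sequence alignment."""
    # Usage: hmmbuild <hmmfile_out> <msafile>
    return ["hmmbuild", hmm_out, alignment]


def hmmfetch_subset_cmd(hmm_db: str, names_file: str, hmm_out: str) -> List[str]:
    """Command that copies the profiles named in names_file out of hmm_db."""
    # -f: the last argument is a file of profile names, one per line
    # -o: write the selected profiles to hmm_out
    return ["hmmfetch", "-o", hmm_out, "-f", hmm_db, names_file]


def run_if_available(cmd: List[str]) -> bool:
    """Run cmd if its program is on PATH; return False if it is not."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own files.
    # The commands are only printed here; pass each one to
    # run_if_available() to actually execute it.
    for cmd in (
        hmmbuild_cmd("my_family.sto", "my_family.hmm"),
        hmmfetch_subset_cmd("Pfam-A.hmm", "wanted_names.txt", "subset.hmm"),
    ):
        print(" ".join(cmd))
```

The first command corresponds to a library generated from multiple sequence alignment data, the second to a subset of the original Pfam database.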

REFERENCES

1. Bateman A., Birney E., Cerruti L., Durbin R., Etwiller L., Eddy S.R., Griffiths-Jones S., Howe K.L., Marshall M. and Sonnhammer E.L.
"The Pfam Protein Families Database"
Nucl. Acids Res. 30(1):276-280, 2002.

PubMed: 11752314

2. Eddy S.R.
"Profile hidden Markov models"
Bioinformatics 14:755-763, 1998.

PubMed: 9918945

3. Durbin R., Eddy S., Krogh A. and Mitchison G.
"Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids"
Cambridge University Press, 1998. 350 pages.