Use the top form to enter a single new protein into UniPROBE, or use the bottom form to enter multiple proteins at once through a file.
Publication accession id : You should have received this by email and on the web page after adding your publication.
Protein Name [REQUIRED]: e.g.: Sox4. Note: While most proteins are properly formatted as "Yfg123", C. elegans proteins are properly formatted as "YFG123".
Species: Acanthamoeba castellanii, Acyrthosiphon pisum, Allomyces macrogynus, Arabidopsis lyrata, Arabidopsis thaliana, Ashbya gossypii, Aspergillus nidulans, Burkholderia pseudomallei K962, Burkholderia thailandensis E26, Caenorhabditis elegans, Chimera, Cryptosporidium parvum, Danio rerio, Drosophila melanogaster, Homo sapiens, Kluyveromyces lactis, Monosiga brevicollis, Mus musculus, Mycosphaerella graminicola, N/A, Nematostella vectensis, Patiria miniata, Plasmodium falciparum, Saccharomyces cerevisiae, Strongylocentrotus purpuratus, Toxoplasma gondii, Trichoplax adhaerens, Tuber melanosporum, Vibrio harveyi. Note: If you don't see your species here, contact the admin at uniprobe@genetics.med.harvard.edu and it will be added.
Full protein name: e.g.: SRY (sex determining region Y)-box 4
Synonyms: e.g.: Transcription factor SOX-4, OTTHUMP00000039358, SRY-related HMG-box gene 4, ecotropic viral integration site 16, EVI16
IHOP id: e.g.: 92329
Uniprot ID: e.g.: Q06945
RefSeq ID: For the protein (not gene). If there is more than one NP identifier, use the first one with a valid link. e.g.: NP_003098
JASPAR: If there is an entry for this protein, or an ortholog, in JASPAR, enter its ID tag here.
Description: A brief description of the gene / protein. e.g.: This intronless gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of the cell fate. The encoded protein may act as a transcriptional regulator after forming a protein complex with other proteins, such as syndecan binding protein (syntenin). The protein may function in the apoptosis pathway leading to cell death as well as to tumorigenesis and may mediate downstream effects of parathyroid hormone (PTH) and PTH-related protein (PTHrP) in bone development. The solution structure has been resolved for the HMG-box of a similar mouse protein.
Domain: Abbreviation for the DNA-binding domain type; use the Pfam name. e.g.: HMG_box
Unique Species ID: A unique identification number or string for the protein within a particular species. This id (and the corresponding database name) can be specified by making a custom details page (see details.php in section 1). e.g.: for Yeast, it might be YDR310C
Do you have PBM data for this whole protein? (Answer "No" if you only have data for the protein in complex with other proteins, or if you have data from multiple distinct clones.) Options: Yes / No
The file should contain each protein on a different line, with the fields in each line separated by tabs in the order specified in the top form (with the exception of publication), i.e., protein name, species, full protein name, synonyms, IHOP id, Uniprot id, RefSeq id, JASPAR id, description, domain, unique species id, has_pbm_data ("y" or "n"), has_pbm_data for whole protein ("y" or "n").
For the various definitions and requirements of each field, see the top form.
NOTE: An Excel spreadsheet is helpful for preparing such a file, but we cannot guarantee that submitting an Excel file will result in proper formatting, so you should submit a text file with tab-separated values. You can simply copy and paste your spreadsheet content directly into a text editor to produce one. PLEASE ENSURE THAT YOUR FILES END UP IN UNIX FORMAT.
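A file saved from a Windows or older Mac editor may not be in Unix format. The following is a minimal Python sketch for converting it, assuming that "Unix format" here refers to LF line endings (the page does not spell this out). The function names and file paths are hypothetical and are not part of UniPROBE.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF)."""
    # Replace CRLF first so a lone CR pass does not double up newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write an LF-only copy to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_unix_newlines(data))
```

Working on raw bytes leaves the tabs and the field contents untouched; only the line terminators change. For example, `convert_file("proteins_from_excel.txt", "proteins_unix.txt")` would write a converted copy next to the original.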
Any species listed in your input file must already be listed in the database's species file; see the top form's "Species" pulldown to see which species are currently available. If one or more of your species is absent from this list, email uniprobe@genetics.med.harvard.edu, and we will make it available for you.
If you're leaving out certain optional fields (any but protein name and species), please use NULL for these fields.
Be careful to enter all your data correctly and properly formatted! This means the species must exactly match one of the options listed in the pulldowns in the top form, or your upload may fail.
For example, a line in your file might look like this:
Pdr3 Saccharomyces cerevisiae Pleiotropic Drug Resistance AMY2, TPE2 32649 P33200 NP_009548 MA0353.1 Transcriptional activator of the pleiotropic drug resistance network, regulates expression of ATP-binding cassette (ABC) transporters through binding to cis-acting sites known as PDREs (PDR responsive elements); post-translationally up-regulated in cells lacking a functional mitochondrial genome Zn_clus YBL005W y
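Before uploading, it may help to check each line against the rules above. The sketch below is one possible pre-upload check in Python, not the validation UniPROBE itself performs. It assumes that protein name and species are the first two tab-separated fields, that omitted optional fields must be written as NULL rather than left empty, and that `known_species` is a set you have copied by hand from the "Species" pulldown. The function name `check_line` is hypothetical.

```python
def check_line(line: str, known_species: set) -> list:
    """Return a list of human-readable problems found in one input line."""
    problems = []
    if "\r" in line:
        problems.append("line contains a carriage return (file may not be in Unix format)")
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 2:
        problems.append("expected at least a protein name and a species separated by a tab")
        return problems
    name, species = fields[0], fields[1]
    # Protein name and species are the two fields that cannot be left out.
    if name == "" or name == "NULL":
        problems.append("protein name is required")
    # The species must match a pulldown entry exactly, including case.
    if species not in known_species:
        problems.append(f"species {species!r} does not exactly match a listed species")
    # Optional fields start at position 3; an omitted one should read NULL.
    for position, value in enumerate(fields[2:], start=3):
        if value == "":
            problems.append(f"field {position} is empty; use NULL for omitted optional fields")
    return problems
```

An empty result means the line passed these particular checks only. The sketch deliberately does not check the total number of fields, so compare that against the field order listed above yourself.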
If your dataset is very large, you may want to use our web tool to generate a template input file for you. You will have to input a file with proteins listed on separate lines, with each line containing a protein name and a species separated by a tab. No other fields are necessary. The tool will mine public databases such as NCBI Entrez Gene, IHOP, and UniProtKB/SwissProt for the relevant information and assemble it into a template file. Not all fields will be filled in; those that are left empty will contain a placeholder such as "[insert value here]". The fields that are filled in may also contain errors, so you should manually check the results and make any necessary corrections. Nevertheless, we hope this will make it easier for you to build your file than starting from scratch.
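The two-column input for the template tool can be produced with a few lines of Python. This is a minimal sketch under the format described above (protein name, a tab, species, one protein per line). The function name `template_input` and the output file name are hypothetical.

```python
def template_input(proteins) -> str:
    """Build the two-column, tab-separated text from (name, species) pairs."""
    lines = []
    for name, species in proteins:
        # A tab inside a value would shift the columns, so refuse it outright.
        if "\t" in name or "\t" in species:
            raise ValueError(f"tab character not allowed in {name!r} / {species!r}")
        lines.append(f"{name}\t{species}")
    # Terminate every line with LF so the result is in Unix format.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    text = template_input([("Sox4", "Homo sapiens"),
                           ("Pdr3", "Saccharomyces cerevisiae")])
    # newline="\n" stops Python from translating LF to CRLF on Windows.
    with open("template_input.txt", "w", encoding="utf-8", newline="\n") as handle:
        handle.write(text)
```

The species strings should be spelled exactly as they appear in the "Species" pulldown.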