Download


 

Description

  • Interaction table files are tab-separated tables (TSV) of transcription factor - target gene interactions that contain interactions validated by small-scale experiments, by large-scale experiments, or by both combined. All tables contain the following data: UniProt IDs, NCBI Gene IDs, and gene names of the transcription factors and target genes; names of the detection methods; PubMed IDs of the original publications; name of the organism; source databases; an indication of whether the data were confirmed by small-scale evidence; and the UniProt IDs of orthologous transcription factors and target genes.
  • Interaction MITAB files contain transcription factor - target gene interactions in HUPO-PSI MITAB 2.8 format. The detailed description of the format and a header for MITAB tables are available in the FAQ.
  • Interaction GMT (Gene Matrix Transposed) is a tab-delimited file format that describes one gene set (the target genes of a transcription factor) in each row. The first and second columns contain information about the transcription factor (various IDs and gene names). The first cell in each row is always unique. The target genes of the transcription factor are listed from the third column to the last. The number of target genes varies from transcription factor to transcription factor, so the number of cells can differ from row to row. The user can choose between GMT files with UniProt IDs, NCBI Gene IDs, and gene names.
  • Binding site table files are tab-separated tables (TSV) of binding site annotations that contain the unique TFLink IDs of the binding sites, the UniProt IDs and gene names of the transcription factors, the name of the organism, the genome version, the chromosome name, the start and end coordinates of the binding sites, the coding strand, a link to the particular genomic location at the UCSC genome browser website, the names of the detection methods, the PubMed IDs of the original publications, the source database, an indication of whether the method is small- or large-scale, and the number and TFLink IDs of overlapping binding sites of the same transcription factor.
  • Binding site annotation file contains the essential information about the genomic locations of the binding sites in GFF3 format. The source databases are indicated in the 3rd column, with values starting with “TFLink_from_”. The type of each entry (4th column) is given as “TF_binding_site”. The attributes field (10th column) contains the TFLink IDs (ID), names (Name), and UniProt IDs (Note) of the transcription factors.
  • Binding site sequence files are FASTA files containing the DNA sequences of the transcription factor binding sites. The header of each sequence contains the unique internal TFLink ID of the binding site, the UniProt ID and gene name of the transcription factor, the version of the genome assembly, the name of the chromosome, and the start and end coordinates of the binding site.
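
As an illustration of the GMT and GFF3 descriptions above, the following Python sketch reads GMT rows of varying length and splits a GFF3 attributes field into its key-value pairs. It is only a minimal example under stated assumptions: the function names read_gmt and parse_gff3_attributes are hypothetical, the sample rows and identifiers are invented placeholders rather than real TFLink records, and the attributes parser assumes the usual GFF3 "key=value;key=value" layout without decoding escaped characters.

```python
import io


def read_gmt(handle):
    """Map the first cell of each row to (second cell, list of target genes)."""
    gene_sets = {}
    for line in handle:
        cells = line.rstrip("\r\n").split("\t")
        if len(cells) < 2 or not cells[0]:
            continue  # skip blank or malformed rows
        set_id, description, *targets = cells
        # Rows differ in length, so keep only the non-empty target cells.
        gene_sets[set_id] = (description, [t for t in targets if t])
    return gene_sets


def parse_gff3_attributes(field):
    """Split an attributes field such as 'ID=x;Name=y;Note=z' into a dict."""
    attributes = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attributes[key] = value
    return attributes


# Invented placeholder data, not taken from any TFLink file.
sample_gmt = "TF_A\tinfoA\tGENE1\tGENE2\tGENE3\nTF_B\tinfoB\tGENE4\n"
gene_sets = read_gmt(io.StringIO(sample_gmt))
attributes = parse_gff3_attributes("ID=site1;Name=TF_A;Note=P00000")

for set_id, (description, targets) in gene_sets.items():
    print(set_id, description, len(targets))
print(attributes["ID"], attributes["Name"], attributes["Note"])
```

Keying the dictionary on the first cell relies on the statement above that the first cell in each row is always unique. To read a downloaded file, an open file handle can be passed to read_gmt in place of the in-memory sample.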