Database Download

Note that the file with "WHOLEDB" in its name contains the entire database, while the other files contain individual tables. You can download the entire database at once or the individual tables separately, depending on your network capacity. Each file name consists of three parts: database name _ table name _ PubMed to date.
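The three-part naming convention can be split programmatically. The following is only a sketch: the example file stem and its date format are hypothetical, and the helper name split_dump_name is an assumption made for illustration. Because both database names (semmedVER24_2) and table names (CONCEPT_SEMTYPE) may contain underscores, the sketch matches against the table names listed on this page instead of splitting blindly on "_".

```python
# Table names as listed on this page, plus the WHOLEDB marker for the full dump.
KNOWN_TABLES = [
    "WHOLEDB", "CITATIONS", "METAINFO", "CONCEPT", "CONCEPT_SEMTYPE",
    "PREDICATION", "PREDICATION_ARGUMENT", "PREDICATION_AGGREGATE",
    "SENTENCE", "SENTENCE_PREDICATION",
]


def split_dump_name(stem):
    """Split a 'database_table_todate' stem into its three parts.

    Longer table names are tried first so that, for example,
    SENTENCE_PREDICATION is not mistaken for PREDICATION.
    """
    for table in sorted(KNOWN_TABLES, key=len, reverse=True):
        marker = "_" + table + "_"
        if marker in stem:
            database, to_date = stem.split(marker, 1)
            return database, table, to_date
    raise ValueError("no known table name in " + repr(stem))


# Hypothetical stem; the real files may use a different date format or suffix.
print(split_dump_name("semmedVER24_2_SENTENCE_PREDICATION_06302014"))
```

Matching on known table names keeps the split unambiguous even when the database name itself contains an underscore.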

Databases created from December 2012 onward have a new column, PYear, in the CITATIONS table, which gives the publication year of each citation. For example, the SQL query "select count(*) from CITATIONS where PYear = 2010" returns the number of citations in the database that were published in 2010.
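The PYear query can be tried on a toy table before running it against the full database. The sketch below uses Python's built-in sqlite3 module with an in-memory table purely so that it is self-contained; the rows are made up, the helper name count_citations is hypothetical, and the real dump would be loaded into a MySQL server instead.

```python
import sqlite3

# In-memory stand-in for the CITATIONS table; only the two columns needed here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CITATIONS (PMID TEXT, PYear INTEGER)")
conn.executemany(
    "INSERT INTO CITATIONS VALUES (?, ?)",
    [("1", 2009), ("2", 2010), ("3", 2010)],  # made-up rows for illustration
)


def count_citations(connection, year):
    """Return the number of citations whose publication year equals `year`."""
    row = connection.execute(
        "SELECT COUNT(*) FROM CITATIONS WHERE PYear = ?", (year,)
    ).fetchone()
    return row[0]


print(count_citations(conn, 2010))
```

Passing the year as a bound parameter rather than formatting it into the SQL string keeps the same function reusable for any year.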

Another change as of version 2.3 (semmedVER23) is that the CITATIONS table has a new column, EDAT, which contains the date on which the citation was entered into PubMed. The table keeps the DP column, which holds the date of publication, but the DA and DCOM columns have been removed.

Starting with version 2.3, the type of the PMID column has changed from int to varchar(20) so that it can hold arbitrary string identifiers for documents other than PubMed citations.

Starting with version 24_2, the PREDICATION_AGGREGATE table is no longer obtained by joining the other tables. The join caused an anomaly: the names (s_name, o_name) and CUIs (s_cui, o_cui) were stored as the Cartesian product of the mapped entities appearing in the SemRep output. Instead, names and CUIs are now represented in the same order in which they appear in the SemRep Full Fielded output.

For instance, suppose the SemRep Full Fielded output has a predication whose subject has preferred name s1 and Entrez Gene ID se1, and whose object has preferred name o1 and Entrez Gene ID oe1. In PREDICATION_AGGREGATE, the s_name of this predication is s1|se1 and the o_name is o1|oe1. Previously, the s_name was represented as s1|||s1|||se1|||se1 and the o_name as o1|||oe1|||o1|||oe1. With the join method, a predication with many Entrez Gene IDs produced s_name and o_name values that were unnecessarily long and were truncated to fit the corresponding columns of PREDICATION_AGGREGATE in earlier databases. The SemRep Full Fielded output is explained in detail here.
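The difference between the two representations can be reproduced with a few lines of Python. This is an illustrative sketch only, not the code used to build the database; the function names aggregate_by_join and aggregate_in_order are hypothetical, and the inputs are the placeholder values s1, se1, o1 and oe1 from the example above.

```python
from itertools import product


def aggregate_by_join(subject_names, object_names):
    """Older behaviour: one entry per (subject, object) pair of the Cartesian product."""
    pairs = list(product(subject_names, object_names))
    s_name = "|||".join(s for s, _ in pairs)
    o_name = "|||".join(o for _, o in pairs)
    return s_name, o_name


def aggregate_in_order(subject_names, object_names):
    """Behaviour from version 24_2: entries kept once, in SemRep output order."""
    return "|".join(subject_names), "|".join(object_names)


print(aggregate_by_join(["s1", "se1"], ["o1", "oe1"]))
# ('s1|||s1|||se1|||se1', 'o1|||oe1|||o1|||oe1')
print(aggregate_in_order(["s1", "se1"], ["o1", "oe1"]))
# ('s1|se1', 'o1|oe1')
```

With n subject entries and m object entries, the join form stores n × m entries in each column, whereas the in-order form stores only n and m entries, which is why the older values could overflow the column width.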

Version 24_2 also introduces two other changes. First, the PREDICATION_AGGREGATE table is created as a MyISAM table for efficiency, whereas the other tables are InnoDB. In the previous version 23, all tables were InnoDB. Second, a new METAINFO table stores the SemRep version, the database version, and the PubMed toDate (the EDAT up to which the search is limited). The number of rows in each table of the database semmedVER26, processed up to April 30 2016, is as follows.

table name             # of rows
CITATIONS              25981012
CONCEPT                1339227
CONCEPT_SEMTYPE        1550482
PREDICATION            18377514
PREDICATION_ARGUMENT   39807059
PREDICATION_AGGREGATE  85777203
SENTENCE               158910856
SENTENCE_PREDICATION   85759911



Database name: semmedVER26 (Processed up to April 30 2016)

Semrep version: Regular semrep version 1.7
Number of citations processed: 25979013
Number of predications: 85777203
* This database was produced from SemRep output generated with the anaphora feature turned off.

TABLE NAME             START DATE  END DATE     Size   Download link  sha1sum   md5sum
Entire Database        1865        Apr 30 2016  14.9G  download       download  download
CITATIONS              1865        Apr 30 2016  126M   download       download  download
METAINFO               N/A         N/A          764    download       download  download
CONCEPT                N/A         N/A          20.5M  download       download  download
CONCEPT_SEMTYPE        N/A         N/A          8.5M   download       download  download
PREDICATION_AGGREGATE  1865        Apr 30 2016  2.1G   download       download  download
PREDICATION            1865        Apr 30 2016  67.4M  download       download  download
PREDICATION_ARGUMENT   1865        Apr 30 2016  299M   download       download  download
SENTENCE               1865        Apr 30 2016  9.2G   download       download  download
SENTENCE_PREDICATION   1865        Apr 30 2016  3.1G   download       download  download

The SemRep files for this database are available for download at this link.
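Each download is listed alongside a sha1sum and an md5sum, so a downloaded file can be checked for corruption before it is loaded. The sketch below is one way to do this with Python's standard hashlib module. It assumes the checksum file follows the usual sha1sum/md5sum layout, with the hexadecimal digest as the first whitespace-separated token; the function names and any paths passed to them are hypothetical.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte dumps never sit fully in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, checksum_text, algorithm="sha1"):
    """Compare a file against the text of a checksum file.

    Assumes the digest is the first whitespace-separated token of the text,
    as in the output of the sha1sum and md5sum utilities.
    """
    expected = checksum_text.split()[0].lower()
    return file_digest(path, algorithm) == expected
```

A call such as matches_checksum("dump.sql.gz", open("dump.sha1").read()) would return True only if the download is intact; both file names here are placeholders. Pass algorithm="md5" to check against the md5sum instead.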



Database name: semmedVER25 (Processed up to Dec 31 2015)

Semrep version: Regular semrep version 1.6
Number of citations processed: 25582462
Number of predications: 84624649
* This database was produced from SemRep output generated with the anaphora feature turned off.

TABLE NAME             START DATE  END DATE     Size   Download link  sha1sum   md5sum
Entire Database        1865        Dec 31 2015  13.8G  download       download  download
CITATIONS              1865        Dec 31 2015  113M   download       download  download
METAINFO               N/A         N/A          764    download       download  download
CONCEPT                N/A         N/A          20.5M  download       download  download
CONCEPT_SEMTYPE        N/A         N/A          8.5M   download       download  download
PREDICATION_AGGREGATE  1865        Dec 31 2015  2.1G   download       download  download
PREDICATION            1865        Dec 31 2015  62.5M  download       download  download
PREDICATION_ARGUMENT   1865        Dec 31 2015  284M   download       download  download
SENTENCE               1865        Dec 31 2015  8.6G   download       download  download
SENTENCE_PREDICATION   1865        Dec 31 2015  3.0G   download       download  download




Database name: semmedVER25 (Processed up to June 30 2015)

Semrep version: Regular semrep version 1.6
Number of citations processed: 25027441
Number of predications: 82239652
* This database was produced from SemRep output generated with the anaphora feature turned off.

TABLE NAME             START DATE  END DATE     Size   Download link  sha1sum   md5sum
Entire Database        1865        Jun 30 2015  13.8G  download       download  download
CITATIONS              1865        Jun 30 2015  112M   download       download  download
METAINFO               N/A         N/A          764    download       download  download
CONCEPT                N/A         N/A          20.5M  download       download  download
CONCEPT_SEMTYPE        N/A         N/A          8.5M   download       download  download
PREDICATION_AGGREGATE  1865        Jun 30 2015  2.0G   download       download  download
PREDICATION            1865        Jun 30 2015  62M    download       download  download
PREDICATION_ARGUMENT   1865        Jun 30 2015  280M   download       download  download
SENTENCE               1865        Jun 30 2015  8.4G   download       download  download
SENTENCE_PREDICATION   1865        Jun 30 2015  2.9G   download       download  download




Database name: semmedVER24_2 (Processed up to June 30 2014)

Semrep version: Regular semrep version 1.5
Number of citations processed: 23921088
Number of predications: 70364020

TABLE NAME                  START DATE  END DATE     Size   Download link  sha1sum   md5sum
Entire Database             1865        Jun 30 2014  12.1G  download       download  download
CITATIONS                   1865        Jun 30 2014  106M   download       download  download
METAINFO                    N/A         N/A          767    download       download  download
CONCEPT - VER 2.42          N/A         N/A          20.5M  download       download  download
CONCEPT_SEMTYPE - VER 2.42  N/A         N/A          8.5M   download       download  download
PREDICATION_AGGREGATE       1865        Jun 30 2014  1.19G  download       download  download
PREDICATION                 1865        Jun 30 2014  55M    download       download  download
PREDICATION_ARGUMENT        1865        Jun 30 2014  248M   download       download  download
SENTENCE                    1865        Jun 30 2014  8.2G   download       download  download
SENTENCE_PREDICATION        1865        Jun 30 2014  2.4G   download       download  download

The SemRep files for this database are available for download at this link.



Database name: semmedVER21_MB

Semrep version: Molecular Engineering semrep version

TABLE NAME       START DATE  END DATE     Size   Download link  sha1sum   md5sum
Entire Database  2000        Sep 30 2012  10.3G  download       download  download

The SemRep files for this database are available for download at this link.