Database Download

Databases downloadable here (from version 3.0 on) have been created using the new database schema. The schema and database information is available here. For previous versions of the database, click here. The previous database schema is available here.

Note that the database file named "WHOLEDB" contains the entire database except the ENTITY table. The new database provides an ENTITY table. Individual tables are also provided separately, so you can download the entire database at once or individual tables, depending on your needs. Each file name consists of four parts: database name _ R (or A) _ table name _ PubMed to date. The letter R indicates a regular version, whereas A denotes an anaphora version.
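The four-part naming convention can be split mechanically. The following is a minimal sketch, not an official tool: the example file name, its extension, and the format of the trailing date token are assumptions made for illustration, and the date part is kept as an opaque string. Because table names such as PREDICATION_AUX contain underscores, the sketch takes the first two parts from the left and the date part from the right.

```python
from typing import NamedTuple


class DumpName(NamedTuple):
    database: str  # e.g. "semmedVER30"
    variant: str   # "R" (regular) or "A" (anaphora)
    table: str     # e.g. "PREDICATION_AUX" or "WHOLEDB"
    through: str   # "PubMed to date" token, left uninterpreted


def parse_dump_name(name: str) -> DumpName:
    """Split a file name of the form database_R|A_table_date."""
    stem = name.split(".", 1)[0]  # drop any extension (assumed, if present)
    # Database name and variant come first; the rest may contain underscores.
    database, variant, rest = stem.split("_", 2)
    # The date token is the last underscore-separated part.
    table, through = rest.rsplit("_", 1)
    if variant not in ("R", "A"):
        raise ValueError(f"unexpected variant {variant!r} in {name!r}")
    return DumpName(database, variant, table, through)


# Hypothetical file name, used only to exercise the parser.
print(parse_dump_name("semmedVER30_R_PREDICATION_AUX_to06302017.sql.gz"))
```

Check the actual file names on the download links before relying on this split, since the real date token and extension may differ from the hypothetical one used here.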

The new database schema differs from the previous one (versions 2X) in the following ways:

  1. We simplified the schema significantly by removing CONCEPT, CONCEPT_SEMTYPE, PREDICATION_ARGUMENT, and SENTENCE_PREDICATION tables. The relevant contents of these tables can still be derived from PREDICATION if needed.
  2. A GENERIC_CONCEPT table has been added to the schema. This table contains generic concepts, as indicated by SemRep. The concepts that are not in this table are considered novel.
From version VER30 on, we plan to release annually a database (VER30_A and so on) of predications generated by SemRep using sortal anaphora resolution. This is expected to increase the number of predications slightly, while also making others more specific. For sortal anaphora resolution in SemRep, see this paper.
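The rule in item 2 of the list above (a concept is novel unless it appears in GENERIC_CONCEPT) amounts to a set-membership test. The sketch below assumes that concepts are identified by a CUI string and that the identifiers from GENERIC_CONCEPT have already been loaded into a Python collection; the function name and the placeholder CUIs are hypothetical and are not taken from the schema.

```python
from typing import Dict, Iterable


def novelty_flags(cuis: Iterable[str], generic_cuis: Iterable[str]) -> Dict[str, bool]:
    """Map each concept identifier to True if it is novel (not generic)."""
    generic = set(generic_cuis)  # set lookup keeps each check O(1)
    return {cui: cui not in generic for cui in cuis}


# Placeholder identifiers, for illustration only.
generic_concepts = ["C0000001", "C0000002"]
arguments = ["C0000001", "C0000099"]
print(novelty_flags(arguments, generic_concepts))
```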




Database name: semmedVER30_R (Processed up to June 30 2017)

SemRep version: Regular SemRep version 1.7
Number of citations processed: 27283927
Number of predications: 91567597
* This database was obtained from SemRep output with the anaphora feature turned off.

TABLE NAME       START DATE  END DATE      Size   Download link  sha1sum   md5sum
Entire Database  1865        June 30 2017  16.3G  download       download  download
CITATIONS        1865        June 30 2017  129M   download       download  download
ENTITY           1865        June 30 2017  34.0G  download       download  download
GENERIC_CONCEPT  N/A         N/A           129M   download       download  download
METAINFO         N/A         N/A           764    download       download  download
PREDICATION      1865        June 30 2017  2.34G  download       download  download
PREDICATION_AUX  1865        June 30 2016  2.96G  download       download  download
SENTENCE         1865        June 30 2017  10.8G  download       download  download
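Since each table is published with sha1sum and md5sum links, a downloaded file can be checked against them before loading. This is a minimal sketch using Python's standard hashlib module; it assumes the published value is either a bare hexadecimal digest or a line in the usual "digest  filename" layout, which should be confirmed against the actual checksum files. Files are read in chunks because several of the downloads are tens of gigabytes.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha1", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path: str, published: str, algorithm: str = "sha1") -> bool:
    """Compare a file's digest with a published checksum string."""
    tokens = published.split()
    if not tokens:
        raise ValueError("published checksum is empty")
    # Assumption: the digest is the first whitespace-separated token.
    expected = tokens[0].lower()
    return file_digest(path, algorithm) == expected
```

For example, `matches_published(local_path, checksum_text, "md5")` would check a file against the contents of its md5sum link, where `local_path` and `checksum_text` are whatever names you gave the downloaded file and the checksum text.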




Database name: semmedVER30_R (Processed up to December 31 2016)

SemRep version: Regular SemRep version 1.7
Number of citations processed: 26737750
Number of predications: 89230566
* This database was obtained from SemRep output with the anaphora feature turned off.

TABLE NAME       START DATE  END DATE     Size   Download link  sha1sum   md5sum
Entire Database  1865        Dec 31 2016  15.8G  download       download  download
CITATIONS        1865        Dec 31 2016  129M   download       download  download
ENTITY           1865        Dec 31 2016  30.8G  download       download  download
GENERIC_CONCEPT  N/A         N/A          129M   download       download  download
METAINFO         N/A         N/A          764    download       download  download
PREDICATION      1865        Dec 31 2016  2.24G  download       download  download
PREDICATION_AUX  1865        Dec 31 2016  2.89G  download       download  download
SENTENCE         1865        Dec 31 2016  10.5G  download       download  download




Database name: semmedVER30_A (Processed up to December 31 2016)

SemRep version: Regular SemRep version 1.7
Number of citations processed: 26723252
Number of predications: 89173359
* This database was obtained from SemRep output with the anaphora feature turned on.

TABLE NAME       START DATE  END DATE     Size   Download link  sha1sum   md5sum
Entire Database  1865        Dec 31 2016  16.2G  download       download  download
CITATIONS        1865        Dec 31 2016  129M   download       download  download
COREFERENCE      1865        Dec 31 2016  450M   download       download  download
ENTITY           1865        Dec 31 2016  30.8G  download       download  download
GENERIC_CONCEPT  1865        Dec 31 2016  129M   download       download  download
METAINFO         N/A         N/A          764    download       download  download
PREDICATION      1865        Dec 31 2016  2.29G  download       download  download
PREDICATION_AUX  1865        Dec 31 2016  2.87G  download       download  download
SENTENCE         1865        Dec 31 2016  10.5G  download       download  download