Commit e4873b59 authored by Mateusz Pawlik

Finalized Swissprot. Tested.

parent 4ec3f4f7
@@ -27,7 +27,8 @@ https://www.uniprot.org/downloads
 ## RAM requirements
-To be measured.
+The current way of processing Swissprot dataset requires **60GB** of RAM memory
+(60GB for conversion, 10GB for sorting).
 ## Steps
@@ -40,4 +41,5 @@ Execute the following to download and prepare the dataset.
 ## Estimated time
-To be measured.
+On an Intel Xeon 2.40GHz CPU, it takes around **50min**
+(including downloading).
@@ -25,7 +25,7 @@ from lxml import etree
 import lxml.sax
 from xml.sax.handler import ContentHandler
-# This script converts DBLP from XML to bracket notation.
+# This script converts Swissprot from XML to bracket notation.
 # NOTE: Filenames are hardcoded in this script.
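
For readers unfamiliar with the conversion this script performs, the following is a minimal sketch of turning XML into bracket notation with a SAX `ContentHandler`. It is not the script from this commit: it uses the standard-library `xml.sax` parser instead of `lxml.sax` so that it runs without extra dependencies, and the names `BracketHandler`, `escape`, and `xml_to_bracket` are hypothetical. It also assumes that attributes are ignored, that non-empty text becomes a leaf node, and that braces in labels are escaped with a backslash; the actual script may handle these differently.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def escape(label):
    # Assumption: braces and backslashes inside labels are backslash-escaped.
    return label.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


class BracketHandler(ContentHandler):
    """Collects SAX events into a bracket-notation string (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []  # pieces of the output string
        self.text = []   # character data buffered since the last tag

    def _flush_text(self):
        # Emit buffered character data as a leaf node; skip whitespace-only runs.
        text = "".join(self.text).strip()
        self.text = []
        if text:
            self.parts.append("{" + escape(text) + "}")

    def startElement(self, name, attrs):
        # Attributes are ignored in this sketch.
        self._flush_text()
        self.parts.append("{" + escape(name))

    def endElement(self, name):
        self._flush_text()
        self.parts.append("}")

    def characters(self, content):
        self.text.append(content)


def xml_to_bracket(xml_bytes):
    """Parse an XML document given as bytes and return its bracket notation."""
    handler = BracketHandler()
    xml.sax.parseString(xml_bytes, handler)
    return "".join(handler.parts)
```

Under these assumptions, `xml_to_bracket(b"<entry><name>P1</name><seq/></entry>")` returns `{entry{name{P1}}{seq}}`. Because the handler only appends to a list while streaming, the parser itself does not build a full tree in memory, although the output string still grows with the input.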