Commit e4873b59 authored by Mateusz Pawlik

Finalized Swissprot. Tested.

parent 4ec3f4f7
+4 −2
@@ -27,7 +27,8 @@ https://www.uniprot.org/downloads

## RAM requirements

-To be measured.
+The current way of processing Swissprot dataset requires **60GB** of RAM memory
+(60GB for conversion, 10GB for sorting).

## Steps

@@ -40,4 +41,5 @@ Execute the following to download and prepare the dataset.

## Estimated time

-To be measured.
+On an Intel Xeon 2.40GHz CPU, it takes around **50min**
+(including downloading).
+1 −1
@@ -25,7 +25,7 @@ from lxml import etree
import lxml.sax
from xml.sax.handler import ContentHandler

-# This script converts DBLP from XML to bracket notation.
+# This script converts Swissprot from XML to bracket notation.

# NOTE: Filenames are hardcoded in this script.
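
For readers unfamiliar with the target format, the following is a minimal sketch of an XML-to-bracket-notation conversion driven by a SAX `ContentHandler`. It is not the script from this commit: it uses the standard library's `xml.sax` instead of `lxml.sax`, the names `BracketNotationHandler`, `escape_label`, and `xml_to_bracket` are hypothetical, and the choices to ignore attributes and to emit text as a child leaf are assumptions.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def escape_label(label):
    # Escape backslashes and braces so labels cannot break the nesting.
    return label.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


class BracketNotationHandler(ContentHandler):
    """Collects SAX events into a bracket-notation string such as {a{b}{c}}."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def startElement(self, name, attrs):
        # Open a node and use the element name as its label.
        # Assumption: attributes are ignored in this sketch.
        self.parts.append("{" + escape_label(name))

    def characters(self, content):
        # Assumption: non-whitespace text becomes a child leaf node.
        # A parser may deliver long text in several chunks; this sketch
        # would then emit one leaf per chunk.
        text = content.strip()
        if text:
            self.parts.append("{" + escape_label(text) + "}")

    def endElement(self, name):
        # Close the node opened by the matching startElement.
        self.parts.append("}")


def xml_to_bracket(xml_text):
    handler = BracketNotationHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return "".join(handler.parts)


if __name__ == "__main__":
    print(xml_to_bracket("<a><b>x</b><c/></a>"))
```

Because the handler only appends to a list and never builds a tree, memory use in this sketch grows with the output string rather than with a parsed document object.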