Commit 7ed0745e authored by Mateusz Pawlik

Finalised python and tested.

parent 183b6142
+3 −2
@@ -26,7 +26,8 @@ https://www.sri.inf.ethz.ch/py150

## RAM requirements

-The current way of processing DBLP dataset requires **??GB** of RAM memory.
+The current way of processing DBLP dataset requires **10GB** of RAM memory for
+sorting the dataset.

## Steps

@@ -37,4 +38,4 @@ Execute the following to download and prepare the dataset.

## Estimated time

-On an Intel Xeon 2.40GHz CPU, it takes around **??min**.
\ No newline at end of file
+On an Intel Xeon 2.40GHz CPU, it takes around **15min**.
\ No newline at end of file
+5 −0
@@ -35,3 +35,8 @@ python3 parse_json.py --inputfile python50k_eval.json >> python.bracket

# Sort the trees by size.
./../utilities/sort_dataset.sh python.bracket > python_sorted.bracket

+# Tidy up.
+rm py150.tar.gz
+rm python.bracket
+rm *.json