Commit 0412c8e9 authored by Mateusz Pawlik

Added statistics.

parent 8e732b9b
+10 −0
@@ -17,6 +17,16 @@ Currently we support the following datasest:
The details about each dataset can be found in the README files in the
datasets subdirectories.

## Statistics

Dataset   | Number of trees | Avg. tree size | Min tree size | Max tree size | Number of distinct labels
----------|-----------------|----------------|---------------|---------------|--------------------------
Bolzano   | 299             | 166            | 2             | 2105          | 592
DBLP      | 3934134         | 25             | 8             | 2986          | 14664605
Python    | 150000          | 946            | 1             | 46481         | 3523697
Sentiment | 9645            | 37             | 3             | 103           | 19470
Swissprot | 556196          | 862            | 101           | 48286         | 11439467
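
The columns above could be recomputed with a short script along the following lines. This is only a minimal sketch, not the script used to produce the table: it assumes each tree is stored as one string in bracket notation (for example `{a{b}{c}}`), that labels contain no braces, and that the average is rounded to the nearest integer. The names `node_labels`, `collection_stats` and `Stats` are hypothetical.

```python
from collections import namedtuple

# Hypothetical container mirroring the columns of the statistics table.
Stats = namedtuple(
    "Stats", "num_trees avg_size min_size max_size distinct_labels"
)


def node_labels(tree):
    """Return the labels of all nodes of a tree given in bracket notation.

    Assumption: every node is written as '{label ...children...}' and
    labels themselves contain no '{' or '}' characters.
    """
    labels = []
    i = 0
    while i < len(tree):
        if tree[i] == "{":
            # The label runs from just after '{' to the next brace.
            j = i + 1
            while j < len(tree) and tree[j] not in "{}":
                j += 1
            labels.append(tree[i + 1:j])
            i = j
        else:
            i += 1
    return labels


def collection_stats(trees):
    """Compute the table columns for a non-empty iterable of tree strings."""
    sizes = []
    distinct = set()
    for tree in trees:
        labels = node_labels(tree)
        sizes.append(len(labels))  # tree size = number of nodes
        distinct.update(labels)
    avg = round(sum(sizes) / len(sizes))  # rounding is an assumption
    return Stats(len(sizes), avg, min(sizes), max(sizes), len(distinct))
```

For a real dataset, `collection_stats` would be called with the lines of the dataset file, one tree per line. An empty collection is not handled and raises `ZeroDivisionError`.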

## Repository organisation

Each dataset and its corresponding scripts belong to a separate directory with a name identifying the dataset.