Commit 0412c8e9 authored by Mateusz Pawlik

Added statistics.

parent 8e732b9b
+10 −0
@@ -17,6 +17,16 @@ Currently we support the following datasets:
The details about each dataset can be found in the README files in the
datasets subdirectories.


## Statistics

Dataset   | Number of trees | Avg. tree size | Min tree size | Max tree size | Number of distinct labels
----------|-----------------|----------------|---------------|---------------|-------------------------- 
Bolzano   | 299             | 166            | 2             | 2105          | 592
DBLP      | 3934134         | 25             | 8             | 2986          | 14664605
Python    | 150000          | 946            | 1             | 46481         | 3523697
Sentiment | 9645            | 37             | 3             | 103           | 19470
Swissprot | 556196          | 862            | 101           | 48286         | 11439467
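
The columns above could be recomputed with a short script along the following lines. This is only a sketch and not the script used to produce the table: it assumes each tree is stored as one string in bracket notation (for example `{a{b}{c}}`), that labels contain no escaped braces, and that tree size means the number of nodes. The function names `parse_labels` and `dataset_statistics` are hypothetical.

```python
from statistics import mean


def parse_labels(tree: str) -> list[str]:
    """Return the node labels of one tree given in bracket notation.

    Assumption: labels never contain '{' or '}' (no escaping is handled).
    """
    labels = []
    i, n = 0, len(tree)
    while i < n:
        if tree[i] == "{":
            # A label runs from just after '{' up to the next brace.
            j = i + 1
            while j < n and tree[j] not in "{}":
                j += 1
            labels.append(tree[i + 1:j])
            i = j
        else:
            i += 1  # skip a closing brace
    return labels


def dataset_statistics(trees: list[str]) -> dict[str, int]:
    """Compute the table columns for a non-empty list of trees."""
    sizes = []
    distinct = set()
    for tree in trees:
        labels = parse_labels(tree)
        sizes.append(len(labels))  # one label per node
        distinct.update(labels)
    return {
        "trees": len(sizes),
        "avg_size": round(mean(sizes)),  # rounding mode is an assumption
        "min_size": min(sizes),
        "max_size": max(sizes),
        "distinct_labels": len(distinct),
    }


if __name__ == "__main__":
    sample = ["{a{b}{c}}", "{a}", "{x{y{z}}{a}}"]
    print(dataset_statistics(sample))
```

For the larger datasets, reading the trees lazily line by line rather than holding them all in a list would keep memory usage down; only the set of distinct labels needs to stay in memory.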

## Repository organisation


Each dataset and its corresponding scripts belong to a separate directory with a name identifying the dataset.