README.md +11 −7

# Datasets for tree edit distance experiments

This repository contains all resources to acquire datasets for experimenting on tree edit distance algorithms. **We do not store the datasets**, only the scripts to obtain and prepare them.

## Datasets description

Currently we support the following datasets:

- **Bolzano** - Residential addresses in the city of Bolzano.
- **DBLP** - Bibliographic XML data.
- **Python** - Abstract syntax trees of Python source code in JSON.
- **Sentiment** - Semantic trees of movie reviews in the PennTreeBank format.
- **Swissprot** - Protein sequence data in XML.

The details about each dataset can be found in the README files in the datasets subdirectories.

## Repository organisation