README.md +11 −7

-# Tree Edit Distance similarity join - datasets scripts
+# Datasets for tree edit distance experiments

-This repository contains all resources to download and process the datasets for
-tree similarity join experiments.
+This repository contains all resources to acquire datasets for experimenting
+on tree edit distance algorithms.
 **We do not store the datasets**, only the scripts to obtain and prepare them.

 ## Datasets description

 Currently we support the following datasets:

-- **bozen** - Bozen streets
-- **dblp** - DBLP
+- **Bolzano** - Residential addresses in the city of Bolzano.
+- **DBLP** - Bibliographic XML data.
+- **Python** - Abstract syntax trees of Python source code in JSON.
+- **Sentiment** - Semantic trees of movie reviews in the PennTreeBank format.
+- **Swissprot** - Protein sequence data in XML.

-The details about each dataset can be found in the corresponding README files.
+The details about each dataset can be found in the README files in the datasets subdirectories.

 ## Repository organisation