Mateusz Pawlik / ted-datasets
Commit 4c05cc9d, authored Jun 13, 2018 by Thomas Huetter
minor fixes
Parent: 74ecde05
python_ast/download_prepare.sh
100644 → 100755
@@ -12,14 +12,11 @@ wget http://files.srl.inf.ethz.ch/data/py150.tar.gz
 # extract abstract syntax trees
 tar -xzf py150.tar.gz
 
-# change to extracted directory
-cd py150
-
 # convert ast to bracket notation
-python ../parse_json.py --inputfile python100k_train.json > python_ast.bracket
+python3 parse_json.py --inputfile python100k_train.json > python_ast.bracket
 
 # convert ast to bracket notation
-python ../parse_json.py --inputfile python50k_eval.json >> python_ast.bracket
+python3 parse_json.py --inputfile python50k_eval.json >> python_ast.bracket
 
 # sort the trees ascending by their size
-../sort_dataset.sh python_ast.bracket
\ No newline at end of file
+./sort_dataset.sh python_ast.bracket
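The last step of the script sorts the trees in ascending order of size. The contents of `sort_dataset.sh` are not part of this diff, so the following is only a rough sketch of how such a sort could look, not the repository's implementation. It assumes that `python_ast.bracket` holds one tree per line in bracket notation (for example `{a{b}{c}}`), that tree size means the number of nodes, and that labels contain no literal `{`. The function name `sort_by_size` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the repository's sort_dataset.sh.
# Assumption: one tree per line in bracket notation, e.g. {a{b}{c}},
# and every '{' opens exactly one node (labels contain no literal '{').

sort_by_size() {
  # 1. awk prefixes each line with its node count (the number of '{').
  # 2. sort orders numerically on that count; -s keeps ties in input order.
  # 3. cut removes the count again, leaving only the tree.
  awk '{ line = $0; n = gsub(/[{]/, "{", line); printf "%d\t%s\n", n, $0 }' "$1" \
    | sort -s -n -k1,1 \
    | cut -f2-
}
```

A possible invocation, where the output file name is again only an assumption: `sort_by_size python_ast.bracket > python_ast_sorted.bracket`. Writing to a separate file avoids truncating the input while it is still being read.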