Skip to content

Grammar improvements #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 98 commits into from
May 11, 2017
Merged

Grammar improvements #1

merged 98 commits into from
May 11, 2017

Conversation

rewinfrey
Copy link
Member

Working through ~700 errors in parsing the tensorflow repo. I'll be adjusting the grammar as I work through the errrors, and open this as a WIP PR for visibility and to welcome any suggestions / advice along the way 🙇

/cc @maxbrunsfeld

global_statement: $ => seq(
'global',
commaSep1($.identifier)
),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I based this on https://docs.python.org/3/reference/grammar.html (Python 2's grammar specifies global statements in the same way). To me this reads as a comma separated list of identifiers. Does that seem accurate to you @maxbrunsfeld?

@maxbrunsfeld
Copy link
Contributor

This looks great so far, and your approach (parsing tensorflow) sounds 💯 . Thanks @rewinfrey!

(expression_statement (call (identifier) (identifier) (number)))
(expression_statement (call (call
(identifier) (number) (number) (number))
(conditional_expression (identifier) (identifier)))))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxbrunsfeld when we get a chance to 🍐 , I'd like to look at this with you. The nested calls bother me, but in making adjustments to the grammar I haven't found a way to reduce this to a single call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that there shouldn't be a nested call; this statement should parse as something like

(expression_statement
  (conditional_expression
    (call (identifier) (number) (number) (number))
    (identifier)
    (identifier)))

I think that one problem is that the conditional_expression rule needs an additional expression at the beginning, before the if. From what I understand, conditional expressions have the form

expression1 if condition else expression2

which is similar to the statement form

if condition:
  expression1
else:
  expression2

Currently, you have conditional_expression starting with the keyword if, but that leaves out the expression that should be evaluated if the condition is true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxbrunsfeld thanks for the suggestion! I was trying to model conditional_expression as part of a call expression, but was confused why I was seeing double call in the AST. I've pulled conditional_expression out of call entirely in 1b46c61, so it is modeled in the way you describe. Now the test is structured in the way we'd expect 😄

=====================================

if a
if a else b
Copy link
Contributor

@maxbrunsfeld maxbrunsfeld Apr 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these are valid python. Testing them in the Python 2.7 REPL like so:

>>> a = 1
>>> b = 2
>>> if a

I get this:

  File "<stdin>", line 1
    if a
       ^
SyntaxError: invalid syntax

Same with the if/else version. I think some valid examples would be:

doSomething() if someCondition()
doSomething() if someCondition() else doSomethingElse()

In terms of the grammar, I think the first expression in the conditional_expression rule should not be optional.

@rewinfrey rewinfrey changed the title [WIP] Grammar improvements Grammar improvements May 11, 2017
@rewinfrey
Copy link
Member Author

I think this is ready for review!

Huge 🎩 @maxbrunsfeld for all the advice and 🍐

This parses python2 and python3 grammar rules, and django, hug and tensorflow.

@maxbrunsfeld maxbrunsfeld merged commit 743caba into master May 11, 2017
@maxbrunsfeld maxbrunsfeld deleted the grammar-improvements branch May 11, 2017 20:20
@maxbrunsfeld
Copy link
Contributor

Awesome work, @rewinfrey!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants