abstract syntax tree - Python AST module cannot detect "if" or "for"


I am trying to restrict a user-provided script with the following visitor:

    import ast
    import sys

    # CodeError, allowed_functions, allowed_classes, safe_names and
    # allowed_node_types are not shown in this snippet.

    class SyntaxChecker(ast.NodeVisitor):

        def check(self, syntax):
            tree = ast.parse(syntax)
            print(ast.dump(tree), syntax)
            self.visit(tree)

        def visit_Call(self, node):
            print('Called for Call', ast.dump(node))
            # Note: ast.Call nodes have no .id attribute, so this first test looks suspect.
            if isinstance(node.func, ast.Call) and node.func.id not in allowed_functions:
                raise CodeError("%s is not an allowed function!" % node.func.id)
            elif isinstance(node.func, ast.Attribute) and node.func.value.id not in allowed_classes:
                raise CodeError('{0} is not calling an allowed class'.format(node.func.value.id))
            elif isinstance(node.func, ast.Name) and node.func.id in allowed_classes:
                raise CodeError('You are not allowed to instantiate a class, {0}'.format(node.func.id))
            else:
                ast.NodeVisitor.generic_visit(self, node)

        def visit_Assign(self, node):
            print('Called for Assign', ast.dump(node))
            ast.NodeVisitor.generic_visit(self, node)

        def visit_Attribute(self, node):
            print('Called for Attribute', ast.dump(node))
            if node.value.id not in allowed_classes:
                raise CodeError('"{0}" is not an allowed class'.format(node.value.id))
            elif node.value.id in allowed_classes and isinstance(node.ctx, ast.Store):
                raise CodeError('Trying to change something in a pre-defined class, "{0}" in "{1}"'.format(node.attr, node.value.id))
            else:
                ast.NodeVisitor.generic_visit(self, node)

        def visit_Expr(self, node):
            print('Called for Expr', ast.dump(node))
            ast.NodeVisitor.generic_visit(self, node)

        def visit_Name(self, node):
            print('Called for Name', ast.dump(node))
            if isinstance(node.ctx, ast.Store) and node.id in allowed_classes:
                raise CodeError('Trying to change a pre-defined class, {0}'.format(node.id))
            elif (isinstance(node.ctx, ast.Load) and node.id not in safe_names
                    and node.id not in allowed_functions and node.id not in allowed_classes):
                raise CodeError('"{0}" function is not allowed'.format(node.id))
            else:
                ast.NodeVisitor.generic_visit(self, node)

        def generic_visit(self, node):
            print('Called for generic', ast.dump(node))
            if type(node).__name__ not in allowed_node_types:
                raise CodeError("%s is not allowed!" % type(node).__name__)
            else:
                ast.NodeVisitor.generic_visit(self, node)

    if __name__ == '__main__':
        # Check the whole file
        x = SyntaxChecker()
        code = open(sys.argv[1], 'r').read()
        try:
            x.check(code)
        except CodeError as e:
            print(repr(e))

        # Or check line by line, considering multiline statements
        code = ''
        for line in open(sys.argv[1], 'r'):
            line = line.strip()
            if line:
                code += line
                try:
                    print('[{0}]'.format(code))
                    x.check(code)
                    code = ''
                except CodeError as e:
                    print(repr(e))
                    break
                except SyntaxError as e:
                    print('********Feeding next line', repr(e))

It is doing fine for the time being, and I can tune it more, but the problem is that it throws SyntaxError('unexpected EOF while parsing', ('<unknown>', 1, 15, 'for j in a.b():')) while parsing this:

    for j in a.b():
        print('hey')

And because of this, no for or if statement gets parsed.

Edit: I have added code to check the whole code at once, or to check multi-line statements.

You are parsing the code line by line, but a for loop does not stand alone. A for loop without a suite is a syntax error. Python expected to find a suite and found EOF (end of file) instead.

In other words, your parser can only handle simple statements and standalone expressions on one physical line, and compound statements only if they are directly followed by a simple statement or expression on the same line.
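The point above can be checked directly with ast.parse(). The following is a minimal sketch; the helper name parses() is made up for illustration and is not part of the ast module.

```python
import ast

def parses(source):
    """Return True if ast.parse() accepts source as a complete module."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True

# A loop header on its own has no suite, so the parser hits EOF.
header_only = "for j in a.b():"
# A compound statement followed by a simple statement on the same line is fine.
one_liner = "for j in a.b(): print('hey')"
# The whole block, indentation and newline included, is fine as well.
full_block = "for j in a.b():\n    print('hey')\n"

print(parses(header_only))  # False
print(parses(one_liner))    # True
print(parses(full_block))   # True
```

Note that ast.parse() only builds the tree; it never looks up a or b, so undefined names do not matter here.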

Your code will also fail for:

  • multiline strings

    somestring = """Containing more than one
    line"""

  • line continuations

    if the_line == 'too long' and \
       a_backslash_was_used in (True, 'true'):
        # your code fails here

    somevar = (you_are_allowed_to_use_newlines,
        "inside parentheses and brackets and braces")

Using ast.parse() to check code line by line is not going to work here; it is only suitable for whole suites. On a file-by-file basis I'd only pass in the whole file.
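As a rough sketch of the whole-file approach, the source can be parsed once and every node type compared against a whitelist. The ALLOWED_NODE_TYPES set and the disallowed_nodes() helper below are hypothetical and only for illustration; the node names assume Python 3.8 or later, where literals are ast.Constant.

```python
import ast

# Hypothetical whitelist for illustration only; a real one would be broader.
ALLOWED_NODE_TYPES = {
    "Module", "For", "If", "Expr", "Call", "Name", "Attribute",
    "Load", "Store", "Constant", "Compare", "Eq",
}

def disallowed_nodes(source):
    """Parse the whole source once; return sorted node type names not whitelisted."""
    tree = ast.parse(source)
    found = {type(node).__name__ for node in ast.walk(tree)}
    return sorted(found - ALLOWED_NODE_TYPES)

script = (
    "for j in a.b():\n"
    "    if j == 1:\n"
    "        print('hey')\n"
)

print(disallowed_nodes(script))         # nothing outside the whitelist
print(disallowed_nodes("import os\n"))  # reports the Import node
```

Because the parser sees the complete suite, the For and If nodes show up in the tree like any other node and can be accepted or rejected by the same check.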

To check code line by line you need to tokenize it yourself. You can use the tokenize library; it'll report either a SyntaxError exception or a tokenize.TokenError on syntax errors.
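A minimal sketch of that idea, assuming the goal is only to decide whether more physical lines are needed: tokenize.generate_tokens() raises tokenize.TokenError when the input ends inside brackets or a triple-quoted string. The helper name needs_more_lines() is hypothetical.

```python
import io
import tokenize

def needs_more_lines(source):
    """Return True if the tokenizer reaches EOF inside a multi-line construct."""
    try:
        for _ in tokenize.generate_tokens(io.StringIO(source).readline):
            pass
    except tokenize.TokenError:
        return True
    return False

print(needs_more_lines("somevar = (1,\n"))          # open parenthesis
print(needs_more_lines('somestring = """first\n'))  # open triple-quoted string
print(needs_more_lines("somevar = (1, 2)\n"))       # complete statement
```

The tokenizer knows nothing about the grammar, so a bare loop header such as "for j in a.b():" should still tokenize without complaint; deciding where a suite ends means tracking the INDENT and DEDENT tokens as well.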

If you want to restrict a script, take a look at asteval; either the project itself or its source code. They parse the whole script, then execute it based on the resulting AST nodes (limiting which nodes they'll accept).

