abstract syntax tree - Python AST module cannot detect "if" or "for"
I am trying to restrict a user-provided script, using the following visitor:
    import ast
    import sys

    # CodeError, allowed_functions, allowed_classes, safe_names and
    # allowed_node_types are defined elsewhere in the script.

    class SyntaxChecker(ast.NodeVisitor):

        def check(self, syntax):
            tree = ast.parse(syntax)
            print(ast.dump(tree), syntax)
            self.visit(tree)

        def visit_Call(self, node):
            print('Called for Call', ast.dump(node))
            if isinstance(node.func, ast.Call) and node.func.id not in allowed_functions:
                raise CodeError("%s is not an allowed function!" % node.func.id)
            elif isinstance(node.func, ast.Attribute) and node.func.value.id not in allowed_classes:
                raise CodeError('{0} is not calling an allowed class'.format(node.func.value.id))
            elif isinstance(node.func, ast.Name) and node.func.id in allowed_classes:
                raise CodeError('You are not allowed to instantiate a class, {0}'.format(node.func.id))
            else:
                ast.NodeVisitor.generic_visit(self, node)

        def visit_Assign(self, node):
            print('Called for Assign', ast.dump(node))
            ast.NodeVisitor.generic_visit(self, node)

        def visit_Attribute(self, node):
            print('Called for Attribute', ast.dump(node))
            if node.value.id not in allowed_classes:
                raise CodeError('"{0}" is not an allowed class'.format(node.value.id))
            elif node.value.id in allowed_classes and isinstance(node.ctx, ast.Store):
                raise CodeError('Trying to change something in a pre-defined class, "{0}" in "{1}"'.format(node.attr, node.value.id))
            else:
                ast.NodeVisitor.generic_visit(self, node)

        def visit_Expr(self, node):
            print('Called for Expr', ast.dump(node))
            ast.NodeVisitor.generic_visit(self, node)

        def visit_Name(self, node):
            print('Called for Name', ast.dump(node))
            if isinstance(node.ctx, ast.Store) and node.id in allowed_classes:
                raise CodeError('Trying to change a pre-defined class, {0}'.format(node.id))
            elif isinstance(node.ctx, ast.Load) and node.id not in safe_names and node.id not in allowed_functions and node.id not in allowed_classes:
                raise CodeError('"{0}" function is not allowed'.format(node.id))
            else:
                ast.NodeVisitor.generic_visit(self, node)

        def generic_visit(self, node):
            print('Called for generic', ast.dump(node))
            if type(node).__name__ not in allowed_node_types:
                raise CodeError("%s is not allowed!" % type(node).__name__)
            else:
                ast.NodeVisitor.generic_visit(self, node)

    if __name__ == '__main__':
        # Check the whole file
        x = SyntaxChecker()
        code = open(sys.argv[1], 'r').read()
        try:
            x.check(code)
        except CodeError as e:
            print(repr(e))

        # Or check line by line, considering multiline statements
        code = ''
        for line in open(sys.argv[1], 'r'):
            line = line.strip()
            if line:
                code += line
                try:
                    print('[{0}]'.format(code))
                    x.check(code)
                    code = ''
                except CodeError as e:
                    print(repr(e))
                    break
                except SyntaxError as e:
                    print('********Feeding next line', repr(e))
It is doing fine for the time being, and I will tune it more, but the problem is that it throws SyntaxError('unexpected EOF while parsing', ('<unknown>', 1, 15, 'for j in a.b():'))
while parsing this:

    for j in a.b():
        print('hey')

and because of this, no "for" or "if" gets parsed.
Edit: I have added the code that checks the whole file at once, or checks it line by line for multi-line statements.
You are parsing the code line by line, but a for loop does not stand alone. A for loop without a suite is a syntax error. Python expected to find a suite and found EOF (end of file) instead.
In other words, your parser can only handle simple statements and standalone expressions on one physical line, and compound statements only if they are directly followed by a simple statement or expression on the same line.
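The difference can be checked directly with ast.parse(). The following is a minimal sketch; the helper name parses is made up for illustration and is not part of the ast module.

```python
import ast

def parses(source):
    """Return True if ast.parse() accepts the source, False on SyntaxError."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# A loop header on its own has no suite, so the parser hits EOF.
header_only = "for j in a.b():"
# The same header followed by an indented suite is a complete statement.
with_suite = "for j in a.b():\n    print('hey')"
# A simple statement on the same physical line also counts as the suite.
one_liner = "for j in a.b(): print('hey')"

print(parses(header_only))  # False
print(parses(with_suite))   # True
print(parses(one_liner))    # True
```

The exact wording of the SyntaxError message depends on the Python version, so the sketch only checks whether parsing succeeds.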
Your code will also fail for:

Multiline strings

    somestring = """Containing more than one
    line"""

Line continuations

    if the_line == 'too long' and \
       a_backslash_was_used in (True, 'true'):
        # your code fails

    somevar = (you_are_allowed_to_use_newlines,
        "inside parentheses and brackets and braces")
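As a rough illustration of these failure cases, the sketch below parses every physical line of a small, valid snippet on its own. The snippet and the helper name failing_lines are invented for this example.

```python
import ast

# A valid snippet that uses a multiline string and a backslash continuation.
lines = [
    'somestring = """Containing more than one',
    'line"""',
    "if somestring and \\",
    "   len(somestring) > 3:",
    "    pass",
]
source = "\n".join(lines) + "\n"

def failing_lines(source):
    """Parse each physical line separately; return the lines that fail."""
    failed = []
    for line in source.splitlines():
        try:
            ast.parse(line)
        except SyntaxError:  # IndentationError is a subclass of SyntaxError
            failed.append(line)
    return failed

ast.parse(source)                  # the snippet as a whole parses fine
print(len(failing_lines(source)))  # 5: every single line fails in isolation
```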
Using ast.parse() to check code line by line is not going to work here; it is only suitable for whole suites. On a file-by-file basis I'd only pass in the whole file.
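A short sketch of the whole-source approach: once the complete text is handed to ast.parse(), a NodeVisitor does reach the For and If nodes. The class name CompoundFinder and the sample source are assumptions made for this example.

```python
import ast

class CompoundFinder(ast.NodeVisitor):
    """Record every for loop and if statement with its line number."""

    def __init__(self):
        self.seen = []

    def visit_For(self, node):
        self.seen.append(('for', node.lineno))
        self.generic_visit(node)  # keep descending into the loop body

    def visit_If(self, node):
        self.seen.append(('if', node.lineno))
        self.generic_visit(node)

source = "for j in a.b():\n    if j:\n        print('hey')\n"
finder = CompoundFinder()
finder.visit(ast.parse(source))  # parse the whole source in one call
print(finder.seen)  # [('for', 1), ('if', 2)]
```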
To check code line by line you need to tokenize it yourself. You can use the tokenize library; it'll report either a SyntaxError exception or a tokenize.TokenError on syntax errors.
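One possible way to use tokenize for this is sketched below: it checks whether the text accumulated so far tokenizes cleanly, which shows when an open bracket or an unterminated triple-quoted string still needs more lines. The helper name is_tokenizable is hypothetical.

```python
import io
import tokenize

def is_tokenizable(source):
    """Return True if the tokenizer reaches the end of source without error."""
    try:
        for _ in tokenize.generate_tokens(io.StringIO(source).readline):
            pass
        return True
    except (tokenize.TokenError, SyntaxError):
        # Raised for input such as an unclosed bracket or multiline string.
        return False

print(is_tokenizable("somevar = (1,\n"))          # False: bracket still open
print(is_tokenizable("somevar = (1,\n    2)\n"))  # True: bracket closed
print(is_tokenizable('s = """abc\n'))             # False: string still open
```

Tokenizing only joins physical lines into logical ones. It knows nothing about the grammar, so a loop header without a suite still tokenizes cleanly and has to be caught by the parser.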
If you want to restrict a script, take a look at asteval; either the project itself or its source code. They parse the whole script, then execute it based on the resulting AST nodes (limiting which nodes they'll accept).