/*
 [The 'BSD licence']
 Copyright (c) 2004 Terence Parr and Loring Craymer
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. The name of the author may not be used to endorse or promote products
    derived from this software without specific prior written permission.

 THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

/** Python 2.3.3 Grammar
 *
 * Terence Parr and Loring Craymer
 * February 2004
 *
 * Converted to ANTLR v3 November 2005 by Terence Parr.
 *
 * This grammar was derived automatically from the Python 2.3.3
 * parser grammar to get a syntactically correct ANTLR grammar
 * for Python.  Then Terence hand tweaked it to be semantically
 * correct; i.e., removed lookahead issues etc...  It is LL(1)
 * except for the (sometimes optional) trailing commas and semi-colons.
 * It needs two symbols of lookahead in this case.
 *
 * Starting with Loring's preliminary lexer for Python, I modified it
 * to do my version of the whole nasty INDENT/DEDENT issue just so I
 * could understand the problem better.
 * This grammar requires
 * PythonTokenStream.java to work.  Also I used some rules from the
 * semi-formal grammar on the web for Python (automatically
 * translated to ANTLR format by an ANTLR grammar, naturally <grin>).
 * The lexical rules for python are particularly nasty and it took me
 * a long time to get it 'right'; i.e., think about it in the proper
 * way.  Resist changing the lexer unless you've used ANTLR a lot. ;)
 *
 * I (Terence) tested this by running it on the jython-2.1/Lib
 * directory of 40k lines of Python.
 *
 * REQUIRES ANTLR v3
 */
grammar Python;
options {language=JavaScript;}

tokens {
    INDENT;
    DEDENT;
}

@lexer::members {
/** Handles context-sensitive lexing of implicit line joining such as
 *  the case where newline is ignored in cases like this:
 *  a = [3,
 *       4]
 */
    this.implicitLineJoiningLevel= 0;
    this.startPos = -1;
}

single_input
    : NEWLINE
    | simple_stmt
    | compound_stmt NEWLINE
    ;

file_input
    : (NEWLINE | stmt)*
    ;

eval_input
    : (NEWLINE)* testlist (NEWLINE)*
    ;

funcdef
    : 'def' NAME parameters COLON suite
      {xlog("found method def "+$NAME.text);}
    ;

parameters
    : LPAREN (varargslist)? RPAREN
    ;

varargslist
    : defparameter (options {greedy=true;}:COMMA defparameter)*
      (COMMA
          ( STAR NAME (COMMA DOUBLESTAR NAME)?
          | DOUBLESTAR NAME
          )?
      )?
    | STAR NAME (COMMA DOUBLESTAR NAME)?
    | DOUBLESTAR NAME
    ;

defparameter
    : fpdef (ASSIGN test)?
    ;

fpdef
    : NAME
    | LPAREN fplist RPAREN
    ;

fplist
    : fpdef (options {greedy=true;}:COMMA fpdef)* (COMMA)?
    ;


stmt: simple_stmt
    | compound_stmt
    ;

simple_stmt
    : small_stmt (options {greedy=true;}:SEMI small_stmt)* (SEMI)? NEWLINE
    ;

small_stmt: expr_stmt
    | print_stmt
    | del_stmt
    | pass_stmt
    | flow_stmt
    | import_stmt
    | global_stmt
    | exec_stmt
    | assert_stmt
    ;

expr_stmt
    : testlist
      ( augassign testlist
      | (ASSIGN testlist)+
      )?
    ;

augassign
    : PLUSEQUAL
    | MINUSEQUAL
    | STAREQUAL
    | SLASHEQUAL
    | PERCENTEQUAL
    | AMPEREQUAL
    | VBAREQUAL
    | CIRCUMFLEXEQUAL
    | LEFTSHIFTEQUAL
    | RIGHTSHIFTEQUAL
    | DOUBLESTAREQUAL
    | DOUBLESLASHEQUAL
    ;

print_stmt:
        'print'
        ( testlist
        | RIGHTSHIFT testlist
        )?
    ;

del_stmt: 'del' exprlist
    ;

pass_stmt: 'pass'
    ;

flow_stmt: break_stmt
    | continue_stmt
    | return_stmt
    | raise_stmt
    | yield_stmt
    ;

break_stmt: 'break'
    ;

continue_stmt: 'continue'
    ;

return_stmt: 'return' (testlist)?
    ;

yield_stmt: 'yield' testlist
    ;

raise_stmt: 'raise' (test (COMMA test (COMMA test)?)?)?
    ;

import_stmt
    : 'import' dotted_as_name (COMMA dotted_as_name)*
    | 'from' dotted_name 'import'
      (STAR | import_as_name (COMMA import_as_name)*)
    ;

import_as_name
    : NAME (NAME NAME)?
    ;

dotted_as_name: dotted_name (NAME NAME)?
    ;

dotted_name: NAME (DOT NAME)*
    ;

global_stmt: 'global' NAME (COMMA NAME)*
    ;

exec_stmt: 'exec' expr ('in' test (COMMA test)?)?
    ;

assert_stmt: 'assert' test (COMMA test)?
    ;


compound_stmt: if_stmt
    | while_stmt
    | for_stmt
    | try_stmt
    | funcdef
    | classdef
    ;

if_stmt: 'if' test COLON suite ('elif' test COLON suite)* ('else' COLON suite)?
    ;

while_stmt: 'while' test COLON suite ('else' COLON suite)?
    ;

for_stmt: 'for' exprlist 'in' testlist COLON suite ('else' COLON suite)?
    ;

try_stmt
    : 'try' COLON suite
      ( (except_clause COLON suite)+ ('else' COLON suite)?
      | 'finally' COLON suite
      )
    ;

except_clause: 'except' (test (COMMA test)?)?
    ;

suite: simple_stmt
    | NEWLINE INDENT (stmt)+ DEDENT
    ;


test: and_test ('or' and_test)*
    | lambdef
    ;

and_test
    : not_test ('and' not_test)*
    ;

not_test
    : 'not' not_test
    | comparison
    ;

comparison: expr (comp_op expr)*
    ;

comp_op: LESS
    |GREATER
    |EQUAL
    |GREATEREQUAL
    |LESSEQUAL
    |ALT_NOTEQUAL
    |NOTEQUAL
    |'in'
    |'not' 'in'
    |'is'
    |'is' 'not'
    ;

expr: xor_expr (VBAR xor_expr)*
    ;

xor_expr: and_expr (CIRCUMFLEX and_expr)*
    ;

and_expr: shift_expr (AMPER shift_expr)*
    ;

shift_expr: arith_expr ((LEFTSHIFT|RIGHTSHIFT) arith_expr)*
    ;

arith_expr: term ((PLUS|MINUS) term)*
    ;

term: factor ((STAR | SLASH | PERCENT | DOUBLESLASH ) factor)*
    ;

factor
    : (PLUS|MINUS|TILDE) factor
    | power
    ;

power
    : atom (trailer)* (options {greedy=true;}:DOUBLESTAR factor)?
    ;

atom: LPAREN (testlist)? RPAREN
    | LBRACK (listmaker)? RBRACK
    | LCURLY (dictmaker)?
      RCURLY
    | BACKQUOTE testlist BACKQUOTE
    | NAME
    | INT
    | LONGINT
    | FLOAT
    | COMPLEX
    | (STRING)+
    ;

listmaker: test ( list_for | (options {greedy=true;}:COMMA test)* ) (COMMA)?
    ;

lambdef: 'lambda' (varargslist)? COLON test
    ;

trailer: LPAREN (arglist)? RPAREN
    | LBRACK subscriptlist RBRACK
    | DOT NAME
    ;

subscriptlist
    : subscript (options {greedy=true;}:COMMA subscript)* (COMMA)?
    ;

subscript
    : DOT DOT DOT
    | test (COLON (test)? (sliceop)?)?
    | COLON (test)? (sliceop)?
    ;

sliceop: COLON (test)?
    ;

exprlist
    : expr (options {k=2;}:COMMA expr)* (COMMA)?
    ;

testlist
    : test (options {k=2;}: COMMA test)* (COMMA)?
    ;

dictmaker
    : test COLON test
      (options {k=2;}:COMMA test COLON test)* (COMMA)?
    ;

classdef: 'class' NAME (LPAREN testlist RPAREN)? COLON suite
      {xlog("found class def "+$NAME.text);}
    ;

arglist: argument (COMMA argument)*
      ( COMMA
          ( STAR test (COMMA DOUBLESTAR test)?
          | DOUBLESTAR test
          )?
      )?
    | STAR test (COMMA DOUBLESTAR test)?
    | DOUBLESTAR test
    ;

argument : test (ASSIGN test)?
    ;

list_iter: list_for
    | list_if
    ;

list_for: 'for' exprlist 'in' testlist (list_iter)?
    ;

list_if: 'if' test (list_iter)?
    ;

LPAREN : '(' {this.implicitLineJoiningLevel++;} ;

RPAREN : ')' {this.implicitLineJoiningLevel--;} ;

LBRACK : '[' {this.implicitLineJoiningLevel++;} ;

RBRACK : ']' {this.implicitLineJoiningLevel--;} ;

COLON : ':' ;

COMMA : ',' ;

SEMI : ';' ;

PLUS : '+' ;

MINUS : '-' ;

STAR : '*' ;

SLASH : '/' ;

VBAR : '|' ;

AMPER : '&' ;

LESS : '<' ;

GREATER : '>' ;

ASSIGN : '=' ;

PERCENT : '%' ;

BACKQUOTE : '`' ;

LCURLY : '{' {this.implicitLineJoiningLevel++;} ;

RCURLY : '}' {this.implicitLineJoiningLevel--;} ;

CIRCUMFLEX : '^' ;

TILDE : '~' ;

EQUAL : '==' ;

NOTEQUAL : '!=' ;

ALT_NOTEQUAL: '<>' ;

LESSEQUAL : '<=' ;

LEFTSHIFT : '<<' ;

GREATEREQUAL : '>=' ;

RIGHTSHIFT : '>>' ;

PLUSEQUAL : '+=' ;

MINUSEQUAL : '-=' ;

DOUBLESTAR : '**' ;

STAREQUAL : '*=' ;

DOUBLESLASH : '//' ;

SLASHEQUAL : '/=' ;

VBAREQUAL : '|=' ;

PERCENTEQUAL : '%=' ;

AMPEREQUAL : '&=' ;

CIRCUMFLEXEQUAL : '^=' ;

LEFTSHIFTEQUAL : '<<=' ;

RIGHTSHIFTEQUAL : '>>=' ;

DOUBLESTAREQUAL : '**=' ;

DOUBLESLASHEQUAL : '//=' ;

DOT : '.' ;

FLOAT
    : '.' DIGITS (Exponent)?
    | DIGITS ('.' (DIGITS (Exponent)?)?
             | Exponent)
    ;

LONGINT
    : INT ('l'|'L')
    ;

fragment
Exponent
    : ('e' | 'E') ( '+' | '-' )? DIGITS
    ;

INT : // Hex
      '0' ('x' | 'X') ( '0' .. '9' | 'a' .. 'f' | 'A' .. 'F' )+
      ('l' | 'L')?
    | // Octal
      '0' DIGITS*
    | '1'..'9' DIGITS*
    ;

COMPLEX
    : INT ('j'|'J')
    | FLOAT ('j'|'J')
    ;

fragment
DIGITS : ( '0' .. '9' )+ ;

NAME: ( 'a' .. 'z' | 'A' .. 'Z' | '_')
      ( 'a' .. 'z' | 'A' .. 'Z' | '_' | '0' .. '9' )*
    ;

/** Match various string types.  Note that greedy=false implies '''
 *  should make us exit loop not continue.
 */
STRING
    : ('r'|'u'|'ur')?
      ( '\'\'\'' (options {greedy=false;}:.)* '\'\'\''
      | '"""' (options {greedy=false;}:.)* '"""'
      | '"' (ESC|~('\\'|'\n'|'"'))* '"'
      | '\'' (ESC|~('\\'|'\n'|'\''))* '\''
      )
    ;

fragment
ESC
    : '\\' .
    ;

/** Consume a newline and any whitespace at start of next line */
CONTINUED_LINE
    : '\\' ('\r')? '\n' (' '|'\t')* { $channel=HIDDEN; }
    ;

/** Treat a sequence of blank lines as a single blank line.  If
 *  nested within a (..), {..}, or [..], then ignore newlines.
 *  If the first newline starts in column one, they are to be ignored.
 */
NEWLINE
    : (('\r')? '\n' )+
      {if ( this.startPos==0 || this.implicitLineJoiningLevel>0 )
           $channel=HIDDEN;
      }
    ;

WS  : {this.startPos>0}?=> (' '|'\t')+ {$channel=HIDDEN;}
    ;

/** Grab everything before a real symbol.  Then if newline, kill it
 *  as this is a blank line.  If whitespace followed by comment, kill it
 *  as it's a comment on a line by itself.
 *
 *  Ignore leading whitespace when nested in [..], (..), {..}.
 */
LEADING_WS
@init {
    var spaces = 0;
}
    : {this.startPos==0}?=>
      ( {this.implicitLineJoiningLevel>0}? ( ' ' | '\t' )+ {$channel=HIDDEN;}
      | ( ' '  { spaces++; }
        | '\t' { spaces += 8; spaces -= (spaces \% 8); }
        )+
        {
            // make a string of n spaces where n is column number - 1
            var indentation = new Array(spaces);
            for (var i=0; i<spaces; i++) {
                indentation[i] = ' ';
            }
            var s = indentation.join("");
            this.emit(new org.antlr.runtime.CommonToken(this.LEADING_WS,s));
        }
        // kill trailing newline if present and then ignore
        ( ('\r')? '\n' {if (this.state.token!=null) this.state.token.setChannel(HIDDEN); else $channel=HIDDEN;})*
        // {this.token.setChannel(99); }
      )
    ;

/** Comments not on line by themselves are turned into newlines.

    b = a # end of line comment

    or

    a = [1, # weird
         2]

    This rule is invoked directly by nextToken when the comment is in
    first column or when comment is on end of nonwhitespace line.

    Only match \n here if we didn't start on left edge; let NEWLINE return that.
    Kill newlines if we live on a line by ourselves.

    Consume any leading whitespace if it starts on left edge.
 */
COMMENT
@init {
    $channel=HIDDEN;
}
    : {this.startPos==0}?=> (' '|'\t')* '#' (~'\n')* '\n'+
    | {this.startPos>0}?=> '#' (~'\n')* // let NEWLINE handle \n unless char pos==0 for '#'
    ;
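
/*
 Illustrative note (not part of the original grammar). The header says this
 grammar relies on PythonTokenStream.java, which is not included in this file,
 to produce the INDENT and DEDENT tokens that the suite rule expects. As a
 rough sketch only, and assuming that such a filter compares the width of each
 LEADING_WS token against a stack of open indentation levels, the idea could
 look like the JavaScript below. The names expandTabs and indentEvents are
 hypothetical; they are not part of the ANTLR runtime or of
 PythonTokenStream.java. expandTabs mirrors the tab rule used in the LEADING_WS
 action (a tab advances to the next multiple of 8).

```javascript
// Hypothetical helper: width of a run of leading whitespace, using the same
// arithmetic as the LEADING_WS action (tab jumps to the next multiple of 8).
function expandTabs(ws) {
    var spaces = 0;
    for (var i = 0; i < ws.length; i++) {
        if (ws.charAt(i) === '\t') {
            spaces += 8;
            spaces -= (spaces % 8);
        } else {
            spaces++;
        }
    }
    return spaces;
}

// Hypothetical helper: given the indentation width of each logical line,
// return [lineIndex, "INDENT" | "DEDENT"] pairs using a stack of open widths.
function indentEvents(widths) {
    var stack = [0];   // column 0 is always open
    var events = [];
    for (var line = 0; line < widths.length; line++) {
        var w = widths[line];
        if (w > stack[stack.length - 1]) {
            // deeper than the current block: open exactly one new level
            stack.push(w);
            events.push([line, "INDENT"]);
        } else {
            // shallower: close levels until we reach a matching width
            while (w < stack[stack.length - 1]) {
                stack.pop();
                events.push([line, "DEDENT"]);
            }
            if (w !== stack[stack.length - 1]) {
                throw new Error("inconsistent dedent at line " + line);
            }
        }
    }
    // close any blocks still open at end of input
    while (stack.length > 1) {
        stack.pop();
        events.push([widths.length, "DEDENT"]);
    }
    return events;
}
```

 For example, indentEvents([0, 4, 4, 8, 0]) reports one INDENT at line 1, one
 at line 3, and two DEDENTs at line 4. A real token stream would insert tokens
 into the parser's input instead of returning pairs; that part is left out here.
*/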