php2python, Part 1: Lexical Analysis

阿暖 <[email protected]>
sender-time     Sent at 18:22 (GMT+08:00).
reply-to        [email protected]
to      [email protected]
date    Mon, Apr 12, 2010 at 18:22

subject [CPyUG] php2python, Part 1: Better to Write Your Own Parser After All

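Python is still too niche in China, and many Chinese APIs ship without a Python wrapper. If existing PHP code could be converted directly and used as-is, life would be much easier. That is why I am writing php2python.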

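The first parser generator I thought of was ANTLR: GAE uses it and bundles it among its third-party libraries, the creator of Python has praised it highly, and, more importantly, there is a Java project, php2java, built on ANTLR. Sure enough, it was easy to find an ANTLR grammar file for PHP.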

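But, possibly because of a version mismatch, I could not get it to work. I then turned to the documentation and found that material on ANTLR for Python is pitifully scarce. The official documentation for the Python target even states plainly that Python support is still young and that bugs should come as no surprise.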

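Embarrassed, I switched to PLY (Python Lex-Yacc, http://www.dabeaz.com/ply/), a pure-Python Lex/Yacc. I found an example and it turned out to be very usable and clear at a glance, but the grammar details still required reading the documentation. That is the annoying part of general-purpose parser generators.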

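As a Pythoner, can I write a parser myself?
Yes!
Not only is it possible, it is simple: a single regular expression takes care of the tokenizing.
You can shape the grammar to your own habits and write it however you like. I, for one, only want a text-to-text translator, so much of what PLY offers would go unused.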
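
As a rough illustration of the "one regex" idea, the sketch below joins a few token patterns into a single alternation and walks the input with `re.finditer`. The `tokenize` function and the token names `NUMBER`, `OP`, and `SKIP` are made up for this example and do not come from the original post.

```python
import re

# Hypothetical token table: (name, pattern) pairs, tried in order.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("OP", r"[()+\-*/]"),
    ("SKIP", r"\s+"),
]
# One master pattern: each token becomes a named group in an alternation.
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (kind, text) pairs, dropping whitespace.

    Characters that match no pattern are silently skipped by finditer;
    a stricter lexer would report them as errors.
    """
    tokens = []
    for m in MASTER.finditer(text):
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
    return tokens

print(tokenize("12 + 3*(4-1)"))
```

Adding a new kind of token then only means adding one more pair to the table, which is what makes this approach convenient for a small hand-written translator.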

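A Four-Function Arithmetic Parser

Here is a four-function arithmetic parser I wrote (adapted from an exercise I did back when I was studying data structures).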

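    import re

    def parser(s):
        # Lexical analysis: join the token patterns into one alternation
        token = [r'\d+',             # numbers
                 r'[\(\)\+\-\*\/]',  # operators and parentheses
                 ]
        rule = re.compile("((?:" + ")|(?:".join(token) + "))")

        # Syntax analysis: two stacks
        op = ['(']    # operator stack, seeded with a sentinel '('
        value = []    # operand stack (operands are kept as strings)
        d = {'+': 1,
             '-': 1,
             '*': 2,
             '/': 2,
             '(': 0,
             }        # operator precedence

        def _calc():
            # Apply the top operator to the top two operands.
            # eval only ever sees digits and operators admitted by the regex;
            # note that '/' is true division under Python 3.
            r = value.pop()
            l = value.pop()
            value.append(str(eval(l + op.pop() + r)))

        def _right_par():
            # Reduce back to the matching '(' and discard it.
            while op[-1] != '(':
                _calc()
            op.pop()

        def _parser(m):
            i = m.group(1)
            if i in '+-*/':
                # Reduce while the stacked operator binds at least as tightly;
                # '>=' in a loop keeps '-' and '/' left-associative.
                while d[op[-1]] >= d[i]:
                    _calc()
                op.append(i)
            elif i == '(':
                op.append(i)
            elif i == ')':
                _right_par()
            else:
                value.append(i)

        for m in rule.finditer(s):
            _parser(m)
        _right_par()  # close the sentinel '('
        return value[0]

    while True:
        try:
            s = input('calc > ')
        except EOFError:
            break
        print(parser(s))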
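
The parser above leans on `eval` to do the arithmetic, which is tolerable only because the regex admits nothing but digits and operators. One possible variant, sketched here on the assumption that integer operands are enough, swaps `eval` for a table of functions from the standard `operator` module. The name `calc`, the lookup tables, and the choice of true division for `/` are illustrative and not the author's.

```python
import operator
import re

TOKEN = re.compile(r"\d+|[()+\-*/]")
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "(": 0}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}

def calc(text):
    """Evaluate +, -, *, / and parentheses with two stacks."""
    ops = ["("]      # operator stack with a sentinel "("
    values = []      # operand stack

    def reduce_top():
        # Pop one operator and two operands, push the result.
        right = values.pop()
        left = values.pop()
        values.append(APPLY[ops.pop()](left, right))

    for tok in TOKEN.findall(text + ")"):   # trailing ")" closes the sentinel
        if tok.isdigit():
            values.append(int(tok))
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                reduce_top()
            ops.pop()
        else:
            # Reduce while the stacked operator binds at least as tightly,
            # which keeps - and / left-associative.
            while PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]:
                reduce_top()
            ops.append(tok)
    return values[0]

print(calc("1-2*3+4"))
print(calc("8-3-2"))
print(calc("2*(3+4)"))
```

Keeping operands as numbers rather than strings also avoids converting back and forth on every reduction, at the cost of a slightly longer dispatch table.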

Feedback

Created by -- ZoomQuiet [2010-04-12 10:56:56]

