Parsing non-standard JSON data
From: Leo Jay <[email protected]>
Reply-To: [email protected]
To: [email protected]
Date: 5 November 2010, 17:14 (GMT+08:00)
Subject: Re: [CPyUG] python分析json数据 (parsing JSON data with Python)
A hand-written parser with PLY
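2010/11/3 月色狼影 <[email protected]>:
> Python can only parse standard JSON data, e.g. {"foo" : "bar"}
> But none of the data on the site I am scraping is standard: the keys have no double quotes at all, and the values all use single quotes.
> How should I handle this?
> For example:
> {type: 7, typeId: 495, name: 'Howling Fjord'}
> How can I add quotes around every key and turn the single quotes around the values into double quotes?
>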
There is nothing ready-made for this, so write the lexer and parser yourself:
from ply import lex, yacc
import json

# Token types produced by the lexer.
tokens = ('WORD', 'STRING')

# Single-character tokens that PLY hands to the grammar unchanged.
literals = '{},:'
t_WORD = r'\w+'
t_ignore = ' \t\n\r'

# Extend the standard JSON escape table so that \' decodes to a single quote.
BACKSLASH = dict(json.decoder.BACKSLASH)
BACKSLASH["'"] = u"'"

def t_STRING(t):
    r"'(\\'|[^'])*?'"
    # Drop the surrounding single quotes, append the closing double quote
    # that py_scanstring looks for, and let it decode the escape sequences.
    # Keyword arguments are used because Python 2 takes an extra "encoding"
    # parameter before "strict" and Python 3 does not.
    t.value = json.decoder.py_scanstring('%s"' % t.value[1:-1], 0,
                                         strict=True, _b=BACKSLASH)[0]
    return t

def t_error(t):
    print('Illegal character "%s"' % t.value[0])
    t.lexer.skip(1)  # skip the offending character and keep scanning

def p_dic(p):
    '''dic : '{' key_value_pairs '}'
    '''
    p[0] = p[2]

def p_key_value_pairs(p):
    '''key_value_pairs : key ':' value
                       | key ':' value ',' key_value_pairs
    '''
    if p[0] is None:
        p[0] = {}
    if len(p) == 6:
        # Merge the pairs already parsed to the right of the comma.
        p[0].update(p[5])
    p[0][p[1]] = p[3]

def p_key_and_value(p):
    '''value : dic
             | WORD
             | STRING
       key : WORD
    '''
    p[0] = p[1]

def p_error(p):
    # PLY passes None when the error is at the end of the input.
    if p is None:
        print('Syntax error at end of input')
    else:
        print('Syntax error at "%s"' % p.value)

lexer = lex.lex()
parser = yacc.yacc()
# The input is one logical line; adjacent raw string literals are joined.
result = parser.parse(
    r"{roles:0,joined:'2010/06/25 06:55:27',posts:34,avatar:1,"
    r"avatarmore:'inv_drink_32_disgustingrotgut',"
    r"sig:'Wise man said: just walk this way, to the dawn of the light. \nWind will "
    r"blow into your face, as the years pass you by.\nHear this voice, from "
    r"deep inside. It\'s the call of your heart.'}")
print(json.dumps(result))
Under Python 2.6 the program above prints the following (the key order may differ on other Python versions):
{"roles": "0", "posts": "34", "joined": "2010/06/25 06:55:27", "avatarmore": "inv_drink_32_disgustingrotgut", "sig": "Wise man said: just walk this way, to the dawn of the light. \nWind will blow into your face, as the years pass you by.\nHear this voice, from deep inside. It's the call of your heart.", "avatar": "1"}
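The least obvious part of the program is the t_STRING rule, so here is a small standard-library-only sketch of just that step. It assumes Python 3, where py_scanstring (an internal helper of json.decoder, not a documented public API) takes strict and _b as keyword arguments; decode_single_quoted is a hypothetical name used only for this illustration.

```python
import re
from json import decoder

# Same pattern as t_STRING: a single-quoted string that may contain \' escapes.
STRING_RE = re.compile(r"'(\\'|[^'])*?'")

# Copy the stock JSON escape table and teach it that \' means a plain '.
BACKSLASH = dict(decoder.BACKSLASH)
BACKSLASH["'"] = "'"

def decode_single_quoted(text):
    """Decode one single-quoted literal at the start of text (hypothetical helper)."""
    match = STRING_RE.match(text)
    if match is None:
        raise ValueError("not a single-quoted string: %r" % text)
    body = match.group(0)[1:-1]  # strip the surrounding single quotes
    # py_scanstring reads up to a closing double quote, so append one
    # and start scanning at offset 0.
    value, _end = decoder.py_scanstring(body + '"', 0, strict=True, _b=BACKSLASH)
    return value

print(decode_single_quoted(r"'It\'s the call of your heart.'"))
print(decode_single_quoted(r"'line one\nline two'"))
```

Reusing the JSON string scanner means the usual escapes such as \n and \uXXXX are decoded as well. A literal double quote inside the single-quoted text would still end the scan early, which this sketch does not handle.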
You need to install PLY first.
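If installing PLY is not an option, the same grammar (dic, key_value_pairs, key, value) can be approximated with a small recursive-descent parser using only the standard library. This is a rough sketch and not the code from the post above: the function names are made up for illustration, it assumes Python 3, and it decodes only the \' escape inside strings.

```python
import re

WS_RE = re.compile(r"\s*")
# A single-quoted string (group 1), a bare word (group 2), or a literal (group 3).
TOKEN_RE = re.compile(r"('(?:\\.|[^'\\])*')|(\w+)|([{},:])")

def tokenize(text):
    """Return a list of (kind, value) tokens, ending with an EOF marker."""
    tokens = []
    pos = WS_RE.match(text, 0).end()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError("illegal character %r at offset %d" % (text[pos], pos))
        string, word, literal = match.groups()
        if string is not None:
            # Simplification: only the \' escape is decoded here.
            tokens.append(("STRING", string[1:-1].replace("\\'", "'")))
        elif word is not None:
            tokens.append(("WORD", word))
        else:
            tokens.append((literal, literal))
        pos = WS_RE.match(text, match.end()).end()
    tokens.append(("EOF", None))
    return tokens

def expect(tokens, i, kind):
    """Check that token i has the given kind and return its value."""
    if tokens[i][0] != kind:
        raise ValueError("expected %s, got %r" % (kind, tokens[i]))
    return tokens[i][1]

def parse_value(tokens, i):
    """value : dic | WORD | STRING. Returns (value, next index)."""
    kind, value = tokens[i]
    if kind == "{":
        return parse_dic(tokens, i)
    if kind in ("WORD", "STRING"):
        return value, i + 1
    raise ValueError("unexpected token %r" % (tokens[i],))

def parse_dic(tokens, i):
    """dic : '{' key_value_pairs '}'. Returns (dict, index after '}')."""
    expect(tokens, i, "{")
    i += 1
    result = {}
    while True:
        key = expect(tokens, i, "WORD")  # key : WORD
        expect(tokens, i + 1, ":")
        value, i = parse_value(tokens, i + 2)
        result[key] = value
        if tokens[i][0] != ",":
            break
        i += 1
    expect(tokens, i, "}")
    return result, i + 1

def parse(text):
    """Parse one non-standard JSON object into a dict of strings and dicts."""
    tokens = tokenize(text)
    result, i = parse_dic(tokens, 0)
    expect(tokens, i, "EOF")
    return result

print(parse("{type: 7, typeId: 495, name: 'Howling Fjord'}"))
```

As in the PLY version, bare words such as 7 and 495 come back as strings, and an empty object {} is rejected because the grammar requires at least one key/value pair.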