module xmlhelper.html_parser_json
¶
Classes¶
class |
truncated documentation |
---|---|
Static Methods¶
staticmethod |
truncated documentation |
---|---|
Iterates on every field contains in the JSON structure. |
|
Methods¶
method |
truncated documentation |
---|---|
Cleans a dictionary of value. |
|
What to do with data. |
|
What to do for the end of a tag. |
|
What to do for a new tag. |
Documentation¶
parsing HTML to convert it into JSON
-
class
pyrsslocal.xmlhelper.html_parser_json.
HTMLtoJSONParser
(raise_exception=True)¶ Bases:
html.parser.HTMLParser
Parses HTML and output a JSON structure. Example:
file = ... with open(file,"r",encoding="utf8") as f : content = f.read() parser = HTMLtoJSONParser() parser.feed(content) js = parser.json
Or:
js = HTMLtoJSONParser.to_json(content)
To iterator on path:
all = [ (k,v) for k,v in HTMLtoJSONParser.iterate(js) ]
- Parameters
raise_exception – if True, raises an exception if the HTML is malformed, otherwise does what it can
-
__init__
(raise_exception=True)¶ - Parameters
raise_exception – if True, raises an exception if the HTML is malformed, otherwise does what it can
-
clean
(values)¶ Cleans a dictionary of value.
-
handle_data
(data)¶ What to do with data.
-
handle_endtag
(tag)¶ What to do for the end of a tag.
-
handle_starttag
(tag, attrs)¶ What to do for a new tag.
-
static
iterate
(json_structure, prefix='', keep_dictionaries=False, skip=['__parent__'])¶ Iterates on every field contains in the JSON structure.
- Parameters
json_structure – json structure
prefix – prefix to add
keep_dictionaries – if True, add yield k,v where v is a JSON dictionary
skip – do not enter the following tag
- Returns
iterator of (path, value)
-
pyrsslocal.xmlhelper.html_parser_json.
iterate_on_json
(json_structure, prefix='', keep_dictionaries=False, skip=['__parent__'])¶ Iterates on every field contains in the JSON structure.
- Parameters
json_structure – json structure
prefix – prefix to add
keep_dictionaries – if True, add yield k,v where v is a JSON dictionary
skip – do not enter the following tag
- Returns
iterator of (path, value)