Issue with characters encoded with cp1252 (Windows-1252)
|
|
---|
'value': '<html> (the leading  is the UTF-8 byte order mark, bytes 0xEF 0xBB 0xBF, displayed as cp1252) |
Code Block |
---|
| ### dpage_content : {'title': 'IT', 'type': 'page', 'body': {'storage': {'value': '<html>\n\t\n<head>\n\t<title>2- Add server on XenCenter</title>\n\t<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n <meta name="generator" content="H\ |
|
Using Notepad + HEX-Editor |
Python + bs4 module to remove it |
Code Block |
---|
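| # Clean up the special characters at the beginning of the file.
# `soup` and `cleaned_html` are defined elsewhere in the script.
with open(cleaned_html, 'w') as cleaned_file2:
    # UTF-8 BOM bytes (EF BB BF) read as cp1252 characters, followed by '<'
    bom_mojibake = u'\xef\xbb\xbf\x3c'
    cleaned_file2.write(str(soup).replace(bom_mojibake, '<')) |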
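As an alternative to replacing the characters after parsing, the BOM can be dropped when the file is read. The sketch below is not the method used above; it assumes Python 3, and the sample file and the `strip_bom` helper are hypothetical names made up for illustration.

```python
import os
import tempfile

# Hypothetical sample file: a UTF-8 BOM followed by a tiny HTML document.
path = os.path.join(tempfile.mkdtemp(), "sample.html")
with open(path, "wb") as f:
    f.write(b"\xef\xbb\xbf<html><head></head></html>")

# The 'utf-8-sig' codec decodes UTF-8 and drops a leading BOM if present.
with open(path, encoding="utf-8-sig") as f:
    html = f.read()


def strip_bom(text: str) -> str:
    """Fallback for text that is already decoded (hypothetical helper).

    Removes a leading U+FEFF, or the three characters the BOM bytes
    turn into when they were decoded as cp1252.
    """
    for prefix in ("\ufeff", "\xef\xbb\xbf"):
        if text.startswith(prefix):
            return text[len(prefix):]
    return text


print(html[:6])
print(strip_bom("\xef\xbb\xbf<html>"))
```

Reading with `utf-8-sig` avoids the problem at the source, while `strip_bom` covers strings that already contain the stray characters.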
|
|
|
0xe2809d or †(right double quote: ” ) | ChassisInfoFetcher Using Vagrant |
|
|
0xe2809c or “ (left double quote: “ ) | LEFT DOUBLE QUOTATION MARK |
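The byte sequences listed on this page can be checked with a short Python 3 sketch that decodes the same bytes once as UTF-8 and once as cp1252. The `samples` dictionary and the `as_cp1252` helper are illustrative names, not part of the original script.

```python
# Byte sequences from this page and the characters they encode in UTF-8.
samples = {
    "UTF-8 BOM": b"\xef\xbb\xbf",
    "LEFT DOUBLE QUOTATION MARK": b"\xe2\x80\x9c",
    "RIGHT DOUBLE QUOTATION MARK": b"\xe2\x80\x9d",
}


def as_cp1252(raw: bytes) -> str:
    # Byte 0x9d has no mapping in Python's cp1252 codec, so undecodable
    # bytes are replaced with U+FFFD instead of raising an error.
    return raw.decode("cp1252", errors="replace")


for name, raw in samples.items():
    # Columns: name, hex bytes, correct UTF-8 reading, cp1252 misreading
    print(name, raw.hex(), repr(raw.decode("utf-8")), repr(as_cp1252(raw)))
```

The right double quote only shows two visible characters under cp1252 because its third byte, 0x9d, is unassigned in that code page.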