Regular expressions and stripping HTML tags
<img alt="" src="lib/header1.jpg" style="padding : 1px;"/> = dictionary of attributes = {'alt': '', 'style': 'padding : 1px;', 'src': 'lib/header1.jpg'}
...
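The tag-to-dictionary mapping above can be reproduced with a minimal sketch. It assumes the standard-library `html.parser` module instead of BeautifulSoup, so that it runs without third-party packages; `ImgAttrCollector` is a hypothetical helper name, not part of the original notes.

```python
from html.parser import HTMLParser


class ImgAttrCollector(HTMLParser):
    """Hypothetical helper: collect the attributes of each <img> tag as a dict."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; dict() turns it
        # into the same kind of mapping shown above.
        if tag == 'img':
            self.images.append(dict(attrs))


collector = ImgAttrCollector()
collector.feed('<img alt="" src="lib/header1.jpg" style="padding : 1px;"/>')
collector.close()
print(collector.images[0])
```

The printed dictionary holds the same three keys (`alt`, `src`, `style`) as the one above; only the key order may differ.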
Code Block |
---|
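| # soup is assumed to be a BeautifulSoup object built earlier
for x in soup.find_all('img'):
    print(x)          # the whole <img> tag
    print(x['src'])   # src before the change
    # Drop the leading 'lib/'. str.strip('lib/') would instead strip the
    # characters l, i, b and / from both ends of the string.
    if x['src'].startswith('lib/'):
        x['src'] = x['src'][len('lib/'):]
    print(x['src'])   # src after the change
    print(x.attrs) |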
|
Code Block |
---|
| C:\Users\jkriker\Google Drive\FTUwebsite\O0O000OOO00O\migrate framed to unframed\cleaned
<img alt="" src="lib/header1.jpg" style="padding : 1px;"/>
lib/header1.jpg
header1.jpg
{'alt': '', 'style': 'padding : 1px;', 'src': 'header1.jpg'}
<img alt="" src="lib/NewItem460.png" style="padding : 1px;"/>
lib/NewItem460.png
NewItem460.png
{'alt': '', 'style': 'padding : 1px;', 'src': 'NewItem460.png'}
|
|
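One caveat when removing the `lib/` folder from a `src` value: `str.strip('lib/')` treats its argument as a set of characters to remove from both ends, not as a prefix. It gives the expected result for `header1.jpg` only because that name does not start or end with `l`, `i`, `b` or `/`. The sketch below shows the difference; `remove_prefix` and the file name `icon.gif` are hypothetical, chosen for illustration.

```python
def remove_prefix(path, prefix='lib/'):
    """Return path without a leading prefix; leave it unchanged otherwise."""
    if path.startswith(prefix):
        return path[len(prefix):]
    return path


print('lib/header1.jpg'.strip('lib/'))   # header1.jpg (works by luck)
print('lib/icon.gif'.strip('lib/'))      # con.gif (the leading 'i' is lost)
print(remove_prefix('lib/icon.gif'))     # icon.gif
print(remove_prefix('header1.jpg'))      # header1.jpg (no prefix, unchanged)
```

On Python 3.9 and later, the built-in `str.removeprefix('lib/')` does the same job as the helper.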
---|
Regular expressions for Python |
\/\/f.*uk\\ |
<ri.*\"\s\/> |
<a.*\"\s/> |
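The three patterns listed above can be tried with Python's `re` module. This is a minimal sketch, assuming the goal is to delete each match with `re.sub`; the sample strings, including the host name `fileserver.example.co.uk`, are made up for illustration. It also shows that the greedy `.*` can swallow text between two tags on the same line, and that a non-greedy `.*?` variant (an assumption, not one of the listed patterns) avoids this.

```python
import re

# Patterns copied from the list above, written as raw strings.
HOST_PREFIX = r'\/\/f.*uk\\'
RI_TAG = r'<ri.*\"\s\/>'
A_TAG = r'<a.*\"\s/>'

# Hypothetical sample inputs.
path = r'//fileserver.example.co.uk\docs\page.html'
ri = 'before <ri:attachment ri:filename="header1.jpg" /> after'
two_links = '<a href="one.html" /> keep me <a href="two.html" />'

print(re.sub(HOST_PREFIX, '', path))   # docs\page.html
print(re.sub(RI_TAG, '', ri))          # before  after

# Greedy .* runs from the first '<a' to the last '" />' on the line,
# so the text between the two tags is removed as well.
greedy = re.sub(A_TAG, '', two_links)

# Non-greedy variant (assumption): stops at the first closing '" />'.
lazy = re.sub(r'<a.*?\"\s/>', '', two_links)

print(repr(greedy))   # ''
print(repr(lazy))     # ' keep me '
```

Note that `<a.*` also matches any tag whose name merely starts with `a`, so a stricter pattern may be needed on real pages.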