Scraping real-time web page data (I am trying to get live commodity values from this web page; it has an iframe)
优采云 Published: 2021-10-11 13:27
I am trying to get live commodity values from this web page. It has an iframe whose address is :8000/
Here is what I tried with BeautifulSoup:
from bs4 import BeautifulSoup
import urllib  # Python 2; on Python 3 the equivalent call is urllib.request.urlopen

data = []

# Download the HTML served at the iframe address
url = urllib.urlopen("http://213.136.84.136:8000/")
html = url.read()
url.close()

soup = BeautifulSoup(html, "html.parser")

# Locate the quotes table and collect the text of each row's cells
table = soup.find('table', attrs={'class': 'table2'})
table_body = table.find('tbody')
rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])  # Get rid of empty values

print([data])
Here is the output I get:
[[[], [u'INTERNATIONAL MARKET'], [u'SPOT Gold'], [u'SPOT Silver'], [u'CrudeOil'], [u'Copper'], [u'NaturalGas'], [u'Dow Jones'], [u'Bank Nifty'], [u'INDIAN MARKET'], [u'MCXGold'], [u'MCXSilver'], [u'MCXCrudeOil'], [u'MCXCopper'], [u'MCXLead'], [u'MCXNickel'], [u'MCXZinc'], [u'MCXNaturalGas'], [u'MCXAluminium'], [u'MCXMenthaOil'], [u'USDINR'], [], [u"Disclaimer: We can't assure any guarantee about the accuaracy of the data."]]]
It returns only the symbol names and none of the price quotes. Any idea why?
Here is the HTML code:
SYMBOL    LTP    HIGH    LOW
INTERNATIONAL MARKET
SPOT Gold
SPOT Silver
CrudeOil
Copper
NaturalGas
Dow Jones
Bank Nifty
INDIAN MARKET
MCXGold
MCXSilver
MCXCrudeOil
MCXCopper
MCXLead
MCXNickel
MCXZinc
MCXNaturalGas
MCXAluminium
MCXMenthaOil
USDINR
Disclaimer: We can't assure any guarantee about the accuaracy of the data.
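The listing above contains symbol names but no numbers, which matches the scraped output. One possible explanation, assumed here and not confirmed for this site, is that the server sends the LTP, HIGH and LOW cells empty and a script fills them in after the page loads in a browser. The sketch below shows how that situation looks to a parser. It uses the standard library's html.parser instead of BeautifulSoup so that it has no dependencies, and SAMPLE_HTML is a hypothetical stand-in for the real page, not a copy of it.

```python
from html.parser import HTMLParser

# Hypothetical sample mimicking the table: the value cells are empty in the
# served HTML (assumed to be filled in later by a script in the browser).
SAMPLE_HTML = """
<table class="table2"><tbody>
<tr><th>SYMBOL</th><th>LTP</th><th>HIGH</th><th>LOW</th></tr>
<tr><td>SPOT Gold</td><td></td><td></td><td></td></tr>
<tr><td>SPOT Silver</td><td></td><td></td><td></td></tr>
</tbody></table>
"""


class RowCollector(HTMLParser):
    """Collect the stripped text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_rows(html):
    """Return one list of <td> texts per <tr>, keeping empty cells."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return parser.rows


rows = table_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

Keeping the empty strings instead of filtering them out makes the difference visible: a row with four cells of which three are empty points to values that are missing from the served HTML, whereas a row with a single cell would point to a wrong selector. The header row comes back as an empty list because its cells are th elements, not td.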
Does anyone know another way to get these live quotes?
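One direction that may be worth checking: if the page fills the table from a separate data request (visible in the browser's network tab), that request can sometimes be polled directly. Whether this site exposes one is not known from the post. The sketch below is only an outline under that assumption. The payload shape (a JSON list of objects with "symbol", "ltp", "high" and "low" keys), the function names, and the placeholder numbers are all hypothetical, and the fetch step is passed in as a callable so that no real URL is assumed.

```python
import json
import time


def parse_quotes(payload):
    """Turn a JSON payload into {symbol: (ltp, high, low)}.

    The payload shape is hypothetical: a list of objects with
    "symbol", "ltp", "high" and "low" keys.
    """
    quotes = {}
    for item in json.loads(payload):
        quotes[item["symbol"]] = (item["ltp"], item["high"], item["low"])
    return quotes


def poll(fetch, rounds, interval=1.0, sleep=time.sleep):
    """Call fetch() `rounds` times and yield the parsed quotes each time."""
    for i in range(rounds):
        yield parse_quotes(fetch())
        if i < rounds - 1:
            sleep(interval)  # wait between requests, not after the last one


def fake_fetch():
    """Stand-in for a real HTTP request; the numbers are placeholders."""
    return '[{"symbol": "SPOT Gold", "ltp": 1.0, "high": 2.0, "low": 0.5}]'


snapshots = list(poll(fake_fetch, rounds=2, interval=0))
for snapshot in snapshots:
    print(snapshot)
```

In real use, fake_fetch would be replaced by a function that downloads the data request's body, for example with urllib.request.urlopen(...).read(). If no such request exists, driving a real browser (for instance with Selenium) and reading the rendered table is the usual fallback.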