Install the Selenium library

    # Python 2.x
    pip install selenium

    # Python 3.x
    pip3 install selenium
Install PhantomJS
Download the build that matches your platform from the PhantomJS download page and extract it.
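Before handing the extracted binary to Selenium, it can help to confirm that the file is where you expect and is executable. The following is a minimal sketch under that assumption; `is_usable_binary` is a hypothetical helper name introduced here, and the path is the same placeholder used in the code later in this post.

```python
import os


def is_usable_binary(path):
    """Return True if `path` is an existing file the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    # Placeholder path: replace it with the location where you extracted PhantomJS.
    phantomjs_path = "/path/to/download/phantomjs-2.1.1-macosx/bin/phantomjs"
    print(is_usable_binary(phantomjs_path))
```

If this prints `False`, the path passed as `executable_path` is probably wrong or the file lacks execute permission.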
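Python code

    from selenium import webdriver
    from bs4 import BeautifulSoup
    import time

    # webdriver.PhantomJS is available in Selenium 3.x and earlier;
    # it was removed in Selenium 4.
    driver = webdriver.PhantomJS(executable_path='/path/to/download/phantomjs-2.1.1-macosx/bin/phantomjs')
    driver.get("http://pythonscraping.com/pages/javascript/ajaxDemo.html")

    # Give the page's Ajax request time to finish before reading the DOM.
    time.sleep(3)

    # print(driver.find_element_by_id('content').text)

    # Hand the rendered HTML to BeautifulSoup; naming the parser explicitly
    # avoids the "no parser was explicitly specified" warning.
    pageSource = driver.page_source
    bsObj = BeautifulSoup(pageSource, "html.parser")
    print(bsObj.find(id="content").get_text())

    driver.close()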
Check whether the page has finished loading by waiting for an element

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.PhantomJS(executable_path='/path/to/download/phantomjs-2.1.1-macosx/bin/phantomjs')
    driver.get("http://pythonscraping.com/pages/javascript/ajaxDemo.html")

    try:
        # Wait up to 10 seconds for the element with id "loadedButton",
        # which only appears once the Ajax content has been loaded.
        element = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.ID, "loadedButton")))
    finally:
        # Print the content and close the browser even if the wait timed out.
        print(driver.find_element(By.ID, "content").text)
        driver.close()
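Conceptually, an explicit wait such as `WebDriverWait(...).until(...)` keeps re-checking a condition until it holds or a timeout expires. The sketch below illustrates that polling idea in plain Python. It is not Selenium's implementation; `wait_until`, `WaitTimeout`, and `element_present` are hypothetical names used only for illustration.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition does not become truthy in time."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


if __name__ == "__main__":
    # Stand-in for "the element has appeared": becomes truthy on the third check.
    calls = {"n": 0}

    def element_present():
        calls["n"] += 1
        return "loadedButton" if calls["n"] >= 3 else None

    print(wait_until(element_present, timeout=2.0, poll_interval=0.1))
```

Compared with a fixed `time.sleep(3)`, polling returns as soon as the condition holds and fails loudly when it never does.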
Reference
Web Scraping with Python (Chinese edition: 《Python 网络数据采集》)