A Small Weather Forecast Scraper
Published: 2019-06-27


A small program that scrapes the weather forecast, written for Python 3.5 on macOS:

    import csv
    import http.client
    import random
    import socket
    import time

    import requests
    from bs4 import BeautifulSoup


    def get_content(url, data=None):
        """Fetch the page at `url`, retrying until a request succeeds."""
        # Browser-like headers so the request looks like it comes from Chrome.
        header = {
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
            'Accept-Encoding': 'gzip, deflate, sdch',
            'Accept-Language': 'zh-CN,zh;q=0.8',
            'Connection': 'keep-alive',
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) '
                          'AppleWebKit/537.36 (KHTML, like Gecko) '
                          'Chrome/50.0.2661.102 Safari/537.36',
        }
        timeout = random.choice(range(80, 180))
        while True:
            try:
                rep = requests.get(url, headers=header, timeout=timeout)
                rep.encoding = 'utf-8'
                break
            # On failure, sleep for a random interval and try again.
            except socket.timeout as e:
                print('3:', e)
                time.sleep(random.choice(range(8, 15)))
            # socket.error is an alias of OSError in Python 3, so this branch
            # also catches the exceptions raised by requests itself.
            except socket.error as e:
                print('4:', e)
                time.sleep(random.choice(range(20, 60)))
            except http.client.BadStatusLine as e:
                print('5:', e)
                time.sleep(random.choice(range(30, 80)))
            except http.client.ImproperConnectionState as e:
                print('6:', e)
                time.sleep(random.choice(range(5, 15)))
        return rep.text


    def get_data(html_text):
        """Extract one row per forecast day from the page HTML."""
        finalFile = []
        bs = BeautifulSoup(html_text, 'html.parser')
        body = bs.body
        data = body.find('div', id="15d")   # container of the 15-day forecast
        ul = data.find('ul')
        li = ul.find_all('li')              # one <li> per day
        for day in li:
            temp = []
            inf = day.find_all('span')
            date = inf[0].string
            temp.append(date)
            weather = inf[1].string
            temp.append(weather)
            temperature = inf[2].text       # .text, because this span has nested tags
            temp.append(temperature)
            wind = inf[3].string
            temp.append(wind)
            wind1 = inf[4].string
            temp.append(wind1)
            finalFile.append(temp)
        return finalFile


    def write_data(data, name):
        """Append the rows to the CSV file `name`."""
        file_name = name
        with open(file_name, 'a', errors='ignore', newline='') as f:
            f_csv = csv.writer(f)
            f_csv.writerows(data)


    if __name__ == '__main__':
        url = 'http://www.weather.com.cn/weather15d/101270101.shtml'
        html = get_content(url)
        result = get_data(html)
        write_data(result, 'content.csv')
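
One detail worth noting: `write_data` opens the file in append mode (`'a'`), so every run adds another batch of rows to `content.csv`, and the file has no header row. The sketch below is one possible variation, not part of the original script. It writes a header only when the file is new or empty, then reads the rows back. The helper names `append_rows` and `read_rows`, the header labels (borrowed from the variable names in `get_data`), the explicit `utf-8` encoding, and the sample rows are all assumptions made for illustration.

```python
import csv
import os
import tempfile

# Column names taken from the variable names in get_data(); the header row
# itself is an addition, not something the original script writes.
HEADER = ['date', 'weather', 'temperature', 'wind', 'wind1']


def append_rows(path, rows, header=HEADER):
    """Append rows to a CSV file, writing the header only if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, 'a', encoding='utf-8', newline='') as f:
        writer = csv.writer(f)
        if needs_header:
            writer.writerow(header)
        writer.writerows(rows)


def read_rows(path):
    """Read the file back as a list of dicts keyed by the header row."""
    with open(path, encoding='utf-8', newline='') as f:
        return list(csv.DictReader(f))


# Made-up sample rows shaped like the lists returned by get_data().
sample = [
    ['Mon (8th)', 'Cloudy', '30C/22C', 'North wind', 'Light'],
    ['Tue (9th)', 'Light rain', '28C/21C', 'No sustained wind', 'Light'],
]

demo_path = os.path.join(tempfile.mkdtemp(), 'content.csv')
append_rows(demo_path, sample)   # first run: header plus two rows
append_rows(demo_path, sample)   # second run: two more rows, no second header
records = read_rows(demo_path)
print(len(records), records[0]['weather'])
```

Calling `append_rows` twice mimics running the scraper twice: the data rows accumulate, but the header appears only once, so `csv.DictReader` can still map every row to named columns.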

Reposted from: https://www.cnblogs.com/fredkeke/p/5767216.html
