Overall goal: scrape the product data Taobao returns for the keyword "男用飞机杯", plus every review of the ten best-selling items.
Tools: Python with Scrapy to collect the data; Excel and a word cloud to analyze it.
The crawler
Start by finding a simple way to page through the search results. The data is tucked away in these requests:
https://s.taobao.com/search?data-key=s&data-value=44&ajax=true&_ksTS=1504329067199_977&callback=jsonp978&q=男用飞机杯&imgfile=&commend=all&ssid=s5-e&search_type=item&sourceId=tb.index&spm=a21bo.50862.201856-taobao-item.1&ie=utf8&initiative_id=tbindexz_20170902&bcoffset=4&p4ppushleft=,48
https://s.taobao.com/search?data-key=s&data-value=88&ajax=true&_ksTS=1504329110124_1174&callback=jsonp1175&q=男用飞机杯&imgfile=&commend=all&ssid=s5-e&search_type=item&sourceId=tb.index&spm=a21bo.50862.201856-taobao-item.1&ie=utf8&initiative_id=tbindexz_20170902&bcoffset=4&p4ppushleft=,48&s=44
https://s.taobao.com/search?data-key=s&data-value=132&ajax=true&_ksTS=1504329292131_1421&callback=jsonp1422&q=男用飞机杯&imgfile=&commend=all&ssid=s5-e&search_type=item&sourceId=tb.index&spm=a21bo.50862.201856-taobao-item.1&ie=utf8&initiative_id=tbindexz_20170902&bcoffset=4&p4ppushleft=,48&s=88
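The pattern is easy to spot: from one page to the next, data-value and s each grow by 44, and the first page carries no s parameter at all. Build the requests accordingly.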
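As a rough illustration of that paging rule, here is a minimal sketch that builds the URL for a given page. The function name build_page_url and the constant PAGE_SIZE are made up for this example. It also assumes that only a handful of the captured parameters matter and leaves out _ksTS, callback, spm and the rest; whether the endpoint accepts such a trimmed request (or still behaves this way at all) is not something the captured URLs can confirm.

```python
from urllib.parse import urlencode

BASE_URL = "https://s.taobao.com/search"
PAGE_SIZE = 44  # step between consecutive pages in the captured URLs


def build_page_url(keyword, page):
    """Return the search URL for a zero-based page index.

    Hypothetical helper: the parameter subset kept here is an assumption,
    not a confirmed minimum for the endpoint.
    """
    params = {
        "data-key": "s",
        "data-value": PAGE_SIZE * (page + 1),  # 44, 88, 132, ...
        "ajax": "true",
        "q": keyword,
        "ie": "utf8",
    }
    if page > 0:
        # The first captured URL has no s parameter; later ones use 44, 88, ...
        params["s"] = PAGE_SIZE * page
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    for page in range(3):
        print(build_page_url("男用飞机杯", page))
```

In a Scrapy spider, each of these URLs would typically be handed to scrapy.Request from start_requests.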
What the data shows
After some simple processing of the data, what turns up?
A huge number of products priced at 10-20 yuan. Wow, hard to believe.
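The analysis in this post was done in Excel. For readers who would rather stay in Python, a rough equivalent of the price-range count might look like the sketch below. The helper names price_bucket and count_by_bucket are invented for this example, and the sample prices are made up purely to show the shape of the output; they are not the scraped data.

```python
from collections import Counter


def price_bucket(price, width=10):
    """Label the price range a price falls into, e.g. 12.8 -> "10-20".

    The lower bound is inclusive, so 20 lands in "20-30".
    """
    low = int(price // width) * width
    return f"{low}-{low + width}"


def count_by_bucket(prices, width=10):
    """Count how many prices fall into each range."""
    return Counter(price_bucket(p, width) for p in prices)


if __name__ == "__main__":
    # Made-up sample prices in yuan, for illustration only.
    sample_prices = [9.9, 12.8, 15.0, 19.9, 35.0, 128.0]
    for bucket, n in count_by_bucket(sample_prices).most_common():
        print(bucket, n)
```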