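Table of Contents
- Preface
- I. List Page Analysis
  - 1. Request analysis
  - 2. Request parameter analysis
    - 2.1 Cookie parameters
    - 2.2 Request parameters
  - 3. Reconstructing the request parameters
    - 3.1 Debugging with breakpoints
    - 3.2 Analyzing the signed parameter
    - 3.3 Reconstructing the mw-sign parameter
  - 4. Fetching the data
  - 5. Source code for fetching list page data
- II. Detail Page Analysis
  - 1. Request analysis
  - 2. Request parameter analysis
  - 3. Reconstructing the mw-sign parameter
  - 4. Fetching the data
  - 5. Source code for fetching detail page data
- III. Obtaining the Cookies
- Summary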
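Preface
Target URL: https://list.mogu.com/search/goods?q=%E5%B8%BD%E5%AD%90&ptp=31.nXjSr.0.0.Lu7TXQ23&platform=pc&page=1&ppath={}
Note: this post does not walk through every detail. It only shows the approach and where the signing happens. Work through the full flow yourself; that is how you improve.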
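I. List Page Analysis
1. Request analysis
2. Request parameter analysis
2.1 Cookie parameters
Two cookies, _mwp_h5_token_enc and _mwp_h5_token, must be sent or the request returns no data. How to obtain them is covered in Part III.
2.2 Request parameters
- data holds the search parameters. They are not encrypted and should be self-explanatory.
- mw-ckey, mw-appkey, mw-ttid, mw-uuid, mw-h5-os and callback are fixed values.
- mw-t and _ are 13-digit (millisecond) timestamps.
- mw-sign is the main signed parameter. It is an MD5 hash; most of the work is collecting the input that gets hashed, which has many parts.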
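The unsigned part of the query can be sketched as follows. This is a minimal sketch, not the site's own code: the fixed values are copied from the captured request in this post, `FIXED_PARAMS`, `now_ms` and `base_params` are names invented here, and using one timestamp for both mw-t and _ follows the source code later in this post.

```python
import time

# Fixed values copied from the captured request; they may change over time.
FIXED_PARAMS = {
    "mw-ckey": "pc-search-wall",
    "mw-appkey": "100028",
    "mw-ttid": "NMMain@mgj_pc_1.0",
    "mw-uuid": "8348efe7-c649-4aa6-a2fe-693f612b1808",
    "mw-h5-os": "unknown",
    "callback": "mwpCb2",
}


def now_ms() -> str:
    """Current Unix time in milliseconds, as a 13-digit string."""
    return str(int(time.time() * 1000))


def base_params() -> dict:
    """Fixed parameters plus the two timestamp parameters (mw-sign not included)."""
    timestamp = now_ms()
    params = dict(FIXED_PARAMS)  # copy so the constant is not mutated
    params["mw-t"] = timestamp
    params["_"] = timestamp
    return params
```

The data and mw-sign parameters still have to be added before the request can be sent.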
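3. Reconstructing the request parameters
3.1 Debugging with breakpoints
The main signed parameter is mw-sign, so simply search the JavaScript sources for it.
There is a single hit. Locate the signing code, set a breakpoint, and reload the page. Note that the first time the breakpoint is hit, the data is not for the list page; you need to capture the second hit.
3.2 Analyzing the signed parameter
Dump the value of this.buildQuery(e): it turns out to be several fields joined with &.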
"100028&pc-search-wall&unknown&1605079173471&NMMain@mgj_pc_1.0&8348efe7-c649-4aa6-a2fe-693f612b1808&mwp.pagani.search&19&4223703e712b06a1a2abdf3a33a14b3a&fc7a01e7b579b7fb4324ace80c7a2c5a_1605001442502"
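There is also a z() method that is still unknown. Stepping through it in the debugger shows that it is an MD5 hash, so I will not extract it here; knowing what it does is enough. Extract it yourself if you are curious.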
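To make the captured string easier to read, the sketch below splits it on & and labels each field, and adds a Python stand-in for z(). The labels are names chosen here by matching the values against the request parameters and the source code later in this post; they are not identifiers from the site's JavaScript. `md5_hex` assumes z() is plain MD5 with lowercase hex output, as the debugger suggests.

```python
import hashlib

# The buildQuery(e) value captured above.
SAMPLE = (
    "100028&pc-search-wall&unknown&1605079173471&NMMain@mgj_pc_1.0"
    "&8348efe7-c649-4aa6-a2fe-693f612b1808&mwp.pagani.search&19"
    "&4223703e712b06a1a2abdf3a33a14b3a"
    "&fc7a01e7b579b7fb4324ace80c7a2c5a_1605001442502"
)

# Hypothetical labels, in the order the fields appear in the string.
FIELD_LABELS = [
    "mw-appkey", "mw-ckey", "mw-h5-os", "mw-t", "mw-ttid",
    "mw-uuid", "api", "version", "data_md5", "token",
]


def label_fields(source: str) -> dict:
    """Split an &-joined sign source and pair each part with a label."""
    parts = source.split("&")
    if len(parts) != len(FIELD_LABELS):
        raise ValueError(f"expected {len(FIELD_LABELS)} fields, got {len(parts)}")
    return dict(zip(FIELD_LABELS, parts))


def md5_hex(text: str) -> str:
    """Assumed equivalent of z(): MD5 of the UTF-8 bytes, as lowercase hex."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


for label, value in label_fields(SAMPLE).items():
    print(f"{label:10} {value}")
```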
3.3 Reconstructing the mw-sign parameter
The debugger shows what this.buildQuery(e) is made of. The leading fields are the request parameters already listed above; the remaining ones have to be worked out by hand.
Parameter breakdown:
1. t.version = "19"
2. t.api = "mwp.pagani.search"
3. z(t.getDataString()) = Md5('{"page":"1","pageSize":24,"sort":"pop","ratio":"3:4","cKey":"pc-search-wall","q":"%E5%B8%BD%E5%AD%90","ptp":"31.nXjSr.0.0.Lu7TXQ23","platform":"pc","ppath":"{}"}')
4. O.instance().mState.getToken() = _mwp_h5_token (the cookie value) = "fc7a01e7b579b7fb4324ace80c7a2c5a_1605001442502"
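The four values above slot into the &-joined string roughly as sketched here. This is an illustration under the assumptions of this post (z() is plain MD5, and the field order matches the captured string); the function and constant names are invented for the sketch.

```python
import hashlib

APPKEY = "100028"
CKEY = "pc-search-wall"
H5_OS = "unknown"
TTID = "NMMain@mgj_pc_1.0"
UUID = "8348efe7-c649-4aa6-a2fe-693f612b1808"


def md5_hex(text: str) -> str:
    """MD5 of the UTF-8 bytes as lowercase hex (assumed equivalent of z())."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def list_sign_source(timestamp: str, data_md5: str, token: str,
                     api: str = "mwp.pagani.search", version: str = "19") -> str:
    """Return the &-joined string that is hashed to produce mw-sign."""
    fields = [APPKEY, CKEY, H5_OS, timestamp, TTID, UUID,
              api, version, data_md5, token]
    return "&".join(fields)


def list_sign(timestamp: str, data_string: str, token: str) -> str:
    """Hash the data string first, then hash the whole joined string."""
    return md5_hex(list_sign_source(timestamp, md5_hex(data_string), token))
```

Note the two hashing steps: the data JSON is hashed on its own, and that digest becomes one field of the string that is hashed again.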
The final string to be signed:
"100028&pc-search-wall&unknown&1605079604437&NMMain@mgj_pc_1.0&8348efe7-c649-4aa6-a2fe-693f612b1808&mwp.pagani.search&19&4223703e712b06a1a2abdf3a33a14b3a&fc7a01e7b579b7fb4324ace80c7a2c5a_1605001442502"
4. Fetching the data
5. Source code for fetching list page data
import hashlib
import re
import time

import requests

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36',
    'referer': 'https://shop.mogu.com/',
    'accept-language': 'zh-CN,zh;q=0.9',
}
# Cookie values (copied from a browser session; Part III shows how to fetch them)
_mwp_h5_token_enc = '0d2a82fc181f5243c9256e934a203ca6'
_mwp_h5_token = 'fc7a01e7b579b7fb4324ace80c7a2c5a_1605001442502'
cookies = {'_mwp_h5_token_enc': _mwp_h5_token_enc, '_mwp_h5_token': _mwp_h5_token}
# Paging parameters
page = '1'  # page number
pageSize = '24'  # requested item count; the number of items actually returned may differ
q = '帽子'  # search keyword ("hat"); the data string shown in 3.3 carries it URL-encoded
page_info = '{"page":"'+page+'","pageSize":'+pageSize+',"sort":"pop","ratio":"3:4","cKey":"pc-search-wall","q":"'+q+'","ptp":"31.nXjSr.0.0.Lu7TXQ23","platform":"pc","ppath":"{}"}'
page_info_md5 = hashlib.md5(page_info.encode('utf8')).hexdigest()
uuid = '8348efe7-c649-4aa6-a2fe-693f612b1808'  # uuid; has not changed so far
timestamp = str(int(time.time()*1000))  # 13-digit millisecond timestamp
position = 'mwp.pagani.search'  # API name for the list page (t.api)
unknow_flag = '19'  # API version for the list page (t.version)
# Join the fields with & and take the MD5 of the result to get the sign
data = "100028&pc-search-wall&unknown&"+timestamp+"&NMMain@mgj_pc_1.0&"+uuid+"&"+position+"&"+unknow_flag+"&"+page_info_md5+"&"+_mwp_h5_token
sign = hashlib.md5(data.encode('utf8')).hexdigest()  # MD5 of the full string
# print(data, sign)
params = (
    ('data', page_info),
    ('mw-ckey', 'pc-search-wall'),
    ('mw-appkey', '100028'),
    ('mw-ttid', 'NMMain@mgj_pc_1.0'),
    ('mw-t', timestamp),
    ('mw-uuid', uuid),
    ('mw-h5-os', 'unknown'),
    ('mw-sign', sign),
    ('callback', 'mwpCb2'),
    ('_', timestamp),
)
response = requests.get(
    'https://api.mogu.com/h5/mwp.pagani.search/19/',
    headers=headers, params=params, cookies=cookies,
)
# print(response.text)
# Extract the product list from the response
matches = re.findall(r'"docs":(.*?),"offset"', response.text)
if not matches:
    # Usually means the cookies or mw-sign were rejected
    raise RuntimeError('no "docs" field found in the response')
list_data = matches[0]
print(list_data)
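As an alternative to the regular expression above, the response can be parsed as JSON once the callback wrapper is removed. This is a sketch under two assumptions that the post does not confirm: that the body has the form mwpCb2(...), which the callback parameter suggests, and that "docs" sits somewhere inside the JSON. The sample payload is made up for illustration, and `unwrap_jsonp` and `find_key` are helper names invented here.

```python
import json
import re


def unwrap_jsonp(text: str, callback: str = "mwpCb2"):
    """Strip a callback(...) wrapper and parse the JSON inside it."""
    pattern = r"^\s*" + re.escape(callback) + r"\((.*)\)\s*;?\s*$"
    match = re.search(pattern, text, re.S)
    if match is None:
        raise ValueError("response is not wrapped in " + callback + "(...)")
    return json.loads(match.group(1))


def find_key(node, key):
    """Depth-first search for the first value stored under `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


# Made-up payload; the real nesting may differ.
sample = 'mwpCb2({"data": {"wall": {"docs": [{"tradeItemId": "1mut1ms"}], "offset": 24}}})'
docs = find_key(unwrap_jsonp(sample), "docs")
print(docs)
```

Searching for the key instead of hard-coding a path keeps the sketch independent of the exact nesting, which is not shown in this post.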
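II. Detail Page Analysis
1. Request analysis
Open any product, for example https://shop.mogu.com/detail/1mut1ms (I stripped the query suffix from the URL). The approach is much the same as for the list page.
2. Request parameter analysis
The two parameters to watch are iid and mw-sign; the cookies and the other parameters work much as they do on the list page. iid is the product id taken from the list page, where the field is named tradeItemId. mw-sign is the MD5 hash of all the parameters plus one cookie value.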
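Collecting the iid values from the list page might look like the sketch below. It assumes the list response has already been parsed into a list of dicts; the sample entries are made up, and only the tradeItemId field name comes from this post.

```python
def iids_from_docs(docs: list) -> list:
    """Return the tradeItemId of every entry that has one, in order."""
    return [doc["tradeItemId"] for doc in docs if doc.get("tradeItemId")]


# Made-up entries standing in for parsed list page data.
docs = [
    {"tradeItemId": "1mut1ms", "title": "hat A"},
    {"title": "entry without an id"},
    {"tradeItemId": "1lz1ch6", "title": "hat B"},
]
print(iids_from_docs(docs))
```

Entries without a tradeItemId are skipped rather than raising, since the exact shape of each entry is not shown here.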
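3. Reconstructing the mw-sign parameter
Here is the string behind the detail page's mw-sign before hashing. Compared with the list page, it has no mw-ckey field: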
"100028&unknown&1605082036373&NMMain@mgj_pc_1.0&8348efe7-c649-4aa6-a2fe-693f612b1808&mwp.darwin.multiget&3&b0113484d22f2dbbfb073a607f269942&fc7a01e7b579b7fb4324ace80c7a2c5a_1605001442502"
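The parameters that change are listed below. Note that the sample string above was captured with api mwp.darwin.multiget and version 3, whereas the detail request uses the values listed here, which match the source code in step 5:
1. t.version = "1"
2. t.api = "http.detail.api"
3. z(t.getDataString()) = Md5('{"iid":"1mut1ms","activityId":"","fastbuyId":"","template":"1-1-detail_normal-1.0.0"}')
4. O.instance().mState.getToken() = _mwp_h5_token (the cookie value) = "fc7a01e7b579b7fb4324ace80c7a2c5a_1605001442502"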
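For the detail page, the data string and the string to be hashed can be sketched as follows. The helper names are invented for this sketch, and the field order is taken from the captured sample, which has no mw-ckey field. The api and version arguments are left overridable because the captured sample shows different values from the ones listed above.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 of the UTF-8 bytes as lowercase hex (assumed equivalent of z())."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def detail_data_string(iid: str) -> str:
    """Build the data JSON exactly as it appears in the request (no spaces)."""
    return ('{"iid":"' + iid + '","activityId":"","fastbuyId":"",'
            '"template":"1-1-detail_normal-1.0.0"}')


def detail_sign_source(timestamp: str, data_md5: str, token: str,
                       api: str = "http.detail.api", version: str = "1") -> str:
    """Return the &-joined string hashed to produce the detail page mw-sign."""
    fields = ["100028", "unknown", timestamp, "NMMain@mgj_pc_1.0",
              "8348efe7-c649-4aa6-a2fe-693f612b1808",
              api, version, data_md5, token]
    return "&".join(fields)


def detail_sign(timestamp: str, iid: str, token: str) -> str:
    """Hash the data string, then hash the joined string that contains it."""
    data_md5 = md5_hex(detail_data_string(iid))
    return md5_hex(detail_sign_source(timestamp, data_md5, token))
```

The data string is built by concatenation rather than with a JSON serializer so that the bytes being hashed are identical to the bytes sent in the data parameter.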
4. Fetching the data
5. Source code for fetching detail page data
import hashlib
import time

import requests

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36',
    'referer': 'https://shop.mogu.com/',
    'accept-language': 'zh-CN,zh;q=0.9',
}
# Cookie values
_mwp_h5_token_enc = '823197577e55ea2bc9513a1d5b6b9112'
_mwp_h5_token = '434663ca8b36913268e90d3b67acd4c0_1605057353352'
cookies = {'_mwp_h5_token_enc': _mwp_h5_token_enc, '_mwp_h5_token': _mwp_h5_token}
iid = '1lz1ch6'  # product id (tradeItemId from the list page)
iid_data = '{"iid":"'+iid+'","activityId":"","fastbuyId":"","template":"1-1-detail_normal-1.0.0"}'
iid_data_md5 = hashlib.md5(iid_data.encode('utf8')).hexdigest()  # MD5 of the data string built from the product id
position = 'http.detail.api'  # API name for the detail page (t.api)
uuid = '8348efe7-c649-4aa6-a2fe-693f612b1808'  # has not changed so far
timestamp = str(int(time.time()*1000))  # 13-digit millisecond timestamp
unknow_flag = '1'  # API version for the detail page (t.version)
# Join the fields with & (no mw-ckey here) and take the MD5 to get the sign
data = "100028&unknown&"+timestamp+"&NMMain@mgj_pc_1.0&"+uuid+"&"+position+"&"+unknow_flag+"&"+iid_data_md5+"&"+_mwp_h5_token
sign = hashlib.md5(data.encode('utf8')).hexdigest()  # MD5 of the full string
# print(iid, data, sign)
params = (
    ('data', iid_data),  # must be the same string that was hashed above
    ('mw-appkey', '100028'),
    ('mw-ttid', 'NMMain@mgj_pc_1.0'),
    ('mw-t', timestamp),
    ('mw-uuid', uuid),
    ('mw-h5-os', 'unknown'),
    ('mw-sign', sign),
    ('callback', 'mwpCb2'),
    ('_', timestamp),
)
response = requests.get(
    'https://api.mogu.com/h5/http.detail.api/1/',
    headers=headers, params=params, cookies=cookies,
)
print(response.text)
III. Obtaining the Cookies
import re

import requests


def get_cookies():
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36',
        'referer': 'https://shop.mogu.com/',
        'accept-language': 'zh-CN,zh;q=0.9',
    }
    # Replay a captured request without cookies; the token values are read from the response body
    res = requests.get('https://api.mogu.com/h5/mwp.darwin.multiget/3/?data=%7B%22pids%22%3A%22132244%2C138852%2C138851%22%7D&mw-appkey=100028&mw-ttid=NMMain%40mgj_pc_1.0&mw-t=1605002244481&mw-uuid=8348efe7-c649-4aa6-a2fe-693f612b1808&mw-h5-os=unknown&mw-sign=9616c3228f4d36a57ba96b13527427b1&callback=mwpCb2&_=1605002509140', headers=headers)
    txt = res.text
    cookies = dict()
    cookies['_mwp_h5_token_enc'] = re.compile('"encToken":"(.*?)",').findall(txt)[0]
    cookies['_mwp_h5_token'] = re.compile('"token":"(.*?)",').findall(txt)[0]
    return cookies


print(get_cookies())
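The extraction step can be tried offline and made to fail with a clear message when a token is missing. In this sketch the sample response text is made up: only the "encToken" and "token" field names come from the code above, and the surrounding structure is an assumption. `extract_cookies` is a name invented here.

```python
import re

# Cookie name -> pattern that captures its value from the response text.
TOKEN_PATTERNS = {
    "_mwp_h5_token_enc": re.compile(r'"encToken":"(.*?)"'),
    "_mwp_h5_token": re.compile(r'"token":"(.*?)"'),
}


def extract_cookies(text: str) -> dict:
    """Pull both token values out of a response body."""
    cookies = {}
    for name, pattern in TOKEN_PATTERNS.items():
        match = pattern.search(text)
        if match is None:
            raise ValueError("token for " + name + " not found in response")
        cookies[name] = match.group(1)
    return cookies


# Made-up response body for illustration only.
sample = ('mwpCb2({"encToken":"0d2a82fc181f5243c9256e934a203ca6",'
          '"token":"fc7a01e7b579b7fb4324ace80c7a2c5a_1605001442502","ret":"FAIL"})')
print(extract_cookies(sample))
```

Raising a ValueError with the cookie name makes a changed response format easier to diagnose than an IndexError from indexing an empty list.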
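Summary
Writing these posts takes effort. We are all working people here, so go easy on one of your own: a like, a favorite and a follow would be great, or just a like and a follow.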