


Simulating single sign-on with PHP cURL: implementing a JS encryption function in PHP
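The simulated login described here is the foundation of site scraping: log in programmatically, obtain the logged-in state, then imitate a person's browsing flow, fetch the results, and parse and save them.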
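First, the tools. HttpWatch is excellent for capturing traffic; Firebug in Firefox or the developer tools built into Chrome also work. I had always used Firefox or Chrome, but I heard HttpWatch was good, tried it, and found it very comfortable to use. Next is Snoopy.class.php. I work in PHP, and this class makes scraping very easy. It works well for ordinary HTTP sites, but HTTPS is troublesome: for HTTPS sites Snoopy does not use PHP's own curl extension but calls the native curl binary found on Linux/Unix. On Windows that is awkward, because you have to install a Windows build of curl and configure the environment yourself, and on SAE it simply does not work. On Linux/Unix it is convenient: just specify the path to curl. I also have a script that simulates an HTTPS login and check-in; it uses PHP's curl and not Snoopy.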
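Because Sina uses single sign-on, the usual approach does not work: you cannot simply build a username/password POST and send it to a login action page. Start by capturing the login flow with HttpWatch.
The whole login flow is as follows: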
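1) Enter the username and password and click log in. (When the username field loses focus, an onblur handler automatically checks that the email address is valid; the simulation can ignore this.)
2) Request a page that returns several special values, including servertime, pcid and nonce. Reading the JS shows what they are for: they are used when encrypting the username and password.
3) Submit the encrypted username and password, together with some other fields, to the SSO login endpoint to request a ticket. (The ticket is the credential used in SSO login.)
4) Once authentication succeeds, visit several other sites so that they set their cookies. (This is like showing the ticket to each gatekeeper to prove you are allowed inside.)
5) Return to the iAsk (爱问) home page.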
The main task is encrypting the username and password. The client does this in JS, but a script cannot call that JS, so the encryption has to be reproduced in PHP.
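Sina's JS appears to be packed with Dean Edwards' packer. The exact algorithm does not matter: search Google for a JS unpacker, paste the code in, and you get the restored source.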
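After unpacking, a quick read makes the flow clear. The username and password are processed separately, and this is the most important part. The username is base64-encoded; the password is hashed with hex_sha1 twice, then salted, then hashed again. What remains is to implement these two functions in PHP. (Analysis shows that the username is only base64-encoded, with no salt, so the encoded value is the same every time and there is no need to port the base64 routine.)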
// Username encoding
d["su"] = sinaSSOEncoder.base64.encode(bi(a));
// Password hashing
b = sinaSSOEncoder.hex_sha1("" + sinaSSOEncoder.hex_sha1(sinaSSOEncoder.hex_sha1(b)) + k.servertime + k.nonce)

var sinaSSOEncoder = sinaSSOEncoder || {};
(function() {
    var n = 0;
    var o = 8;
    this.hex_sha1 = function(s) {
        return A(p(z(s), s.length * o))
    };
    var p = function(x, f) {
        x[f >> 5] |= 0x80 << (24 - f % 32);
        x[((f + 64 >> 9) << 4) + 15] = f;
        var w = Array(80);
        var a = 1732584193;
        var b = -271733879;
        var c = -1732584194;
        var d = 271733878;
        var e = -1009589776;
        for (var i = 0; i < x.length; i += 16) {
            var g = a;
            var h = b;
            var k = c;
            var l = d;
            var m = e;
            for (var j = 0; j < 80; j++) {
                if (j < 16) w[j] = x[i + j];
                else w[j] = v(w[j - 3] ^ w[j - 8] ^ w[j - 14] ^ w[j - 16], 1);
                var t = u(u(v(a, 5), q(j, b, c, d)), u(u(e, w[j]), r(j)));
                e = d;
                d = c;
                c = v(b, 30);
                b = a;
                a = t
            }
            a = u(a, g);
            b = u(b, h);
            c = u(c, k);
            d = u(d, l);
            e = u(e, m)
        }
        return Array(a, b, c, d, e)
    };
    var q = function(t, b, c, d) {
        if (t < 20) return (b & c) | ((~b) & d);
        if (t < 40) return b ^ c ^ d;
        if (t < 60) return (b & c) | (b & d) | (c & d);
        return b ^ c ^ d
    };
    var r = function(t) {
        return (t < 20) ? 1518500249 : (t < 40) ? 1859775393 : (t < 60) ? -1894007588 : -899497514
    };
    var u = function(x, y) {
        var a = (x & 0xFFFF) + (y & 0xFFFF);
        var b = (x >> 16) + (y >> 16) + (a >> 16);
        return (b << 16) | (a & 0xFFFF)
    };
    var v = function(a, b) {
        return (a << b) | (a >>> (32 - b))
    };
    var z = function(a) {
        var b = Array();
        var c = (1 << o) - 1;
        for (var i = 0; i < a.length * o; i += o)
            b[i >> 5] |= (a.charCodeAt(i / o) & c) << (24 - i % 32);
        return b
    };
    var A = function(a) {
        var b = n ? "0123456789ABCDEF" : "0123456789abcdef";
        var c = "";
        for (var i = 0; i < a.length * 4; i++) {
            c += b.charAt((a[i >> 2] >> ((3 - i % 4) * 8 + 4)) & 0xF) + b.charAt((a[i >> 2] >> ((3 - i % 4) * 8)) & 0xF)
        }
        return c
    };
    this.base64 = {
        encode: function(a) {
            a = "" + a;
            if (a == "") return "";
            var b = '';
            var c, chr2, chr3 = '';
            var d, enc2, enc3, enc4 = '';
            var i = 0;
            do {
                c = a.charCodeAt(i++);
                chr2 = a.charCodeAt(i++);
                chr3 = a.charCodeAt(i++);
                d = c >> 2;
                enc2 = ((c & 3) << 4) | (chr2 >> 4);
                enc3 = ((chr2 & 15) << 2) | (chr3 >> 6);
                enc4 = chr3 & 63;
                if (isNaN(chr2)) {
                    enc3 = enc4 = 64
                } else if (isNaN(chr3)) {
                    enc4 = 64
                }
                b = b + this._keys.charAt(d) + this._keys.charAt(enc2) + this._keys.charAt(enc3) + this._keys.charAt(enc4);
                c = chr2 = chr3 = '';
                d = enc2 = enc3 = enc4 = ''
            } while (i < a.length);
            return b
        },
        _keys: 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='
    }
}).call(sinaSSOEncoder);
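After some effort I wrapped this JS object into a PHP class. I won't post the full code: I am a Sina person myself, so I'd rather not harm my own company. Work it out yourself; it is quite simple.
The two hardest points are JS's >>> (unsigned right shift) and charCodeAt(i). PHP has no direct equivalent of either, so you have to write them yourself.
I'll post these two pieces for reference. Both are adapted from other people's code; one version I found along the way was wrong and cost me a lot of time.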
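For the password formula itself, a full port may not be needed. The JS hex_sha1 above looks like a standard SHA-1 that returns lowercase hex, and the base64 object looks like standard Base64, so PHP's built-in sha1() and base64_encode() may be enough for ASCII input. The following is a minimal sketch under that assumption, not the class described above. The function names are hypothetical, and because the source does not show what bi() does to the username, that step is left to the caller.

```php
<?php
// Sketch only: assumes the JS hex_sha1 matches standard SHA-1 for ASCII input.

/**
 * Mirrors the JS expression:
 *   hex_sha1("" + hex_sha1(hex_sha1(pwd)) + servertime + nonce)
 */
function encodePassword(string $password, string $servertime, string $nonce): string
{
    $twice = sha1(sha1($password));             // the two inner hashing rounds
    return sha1($twice . $servertime . $nonce); // append servertime and nonce, hash again
}

/**
 * Mirrors base64.encode(bi(user)).
 * $prepared is the username AFTER bi(); bi() is not shown in the source.
 */
function encodeUsername(string $prepared): string
{
    return base64_encode($prepared);
}
```

Both functions are pure, so they can be checked against values captured from a real browser login before being wired into a script.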
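/**
 * Unsigned 32-bit right shift; emulates JS's >>>.
 * How it works: convert to a binary string, shift right, then zero-fill on the left.
 * @param mixed $x    the number to operate on; if it is a string, it must be in decimal form
 * @param int   $bits number of bits to shift right
 * @return mixed the result; a float if it exceeds the integer range
 */
function shr32($x, $bits){
    // Two cases where the shift amount is out of range
    if($bits <= 0){
        return $x;
    }
    if($bits >= 32){
        return 0;
    }
    // Convert to a string of binary digits
    $bin = decbin($x);
    $l = strlen($bin);
    // If longer than 32 digits keep the low 32; if shorter, pad the high bits with 0 up to 32
    if($l > 32){
        $bin = substr($bin, $l - 32, 32);
    }elseif($l < 32){
        $bin = str_pad($bin, 32, '0', STR_PAD_LEFT);
    }
    // Keep the bits that survive the shift and zero-fill on the left
    return bindec(str_pad(substr($bin, 0, 32 - $bits), 32, '0', STR_PAD_LEFT));
}

// Emulates JS's charCodeAt() for a single UTF-8 character
function getUnicodeFromOneUTF8($word) {
    // Works on the character's raw bytes, so this file must be saved as UTF-8!
    if (is_array($word)) {
        $arr = $word;
    } else {
        $arr = str_split($word);
    }
    // $arr now holds the individual bytes, e.g. byte values 228, 189, 160
    // Empty string to collect the bits
    $bin_str = '';
    // Convert each byte to a number, then to a binary string, and concatenate
    foreach ($arr as $value) {
        $bin_str .= decbin(ord($value));
    }
    // $bin_str is now e.g. 111001001011110110100000 for the character "你"
    // Strip the UTF-8 marker bits with a regex (the pattern only matches 3-byte sequences)
    $bin_str = preg_replace('/^.{4}(.{4}).{2}(.{6}).{2}(.{6})$/','$1$2$3', $bin_str);
    // $bin_str is now e.g. 0100111101100000 for the character "你"
    return bindec($bin_str); // e.g. 20320 for "你"
    //return dechex(bindec($bin_str)); // use this line to return hex 4f60 instead
}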
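On a 64-bit PHP build the unsigned shift can also be done with plain integer arithmetic, avoiding the binary-string round trip. This is an alternative sketch, not the routine above: it assumes 64-bit integers, the names ushr32 and toInt32 are hypothetical, and unlike shr32 it follows JS in using only the low 5 bits of the shift count.

```php
<?php
// Sketch only: assumes 64-bit PHP, where 0xFFFFFFFF fits in a native int.

/** Emulates JS `x >>> n`: treat $x as unsigned 32-bit, shift right, zero-fill. */
function ushr32(int $x, int $bits): int
{
    // JS uses only the low 5 bits of the shift count, so 32 behaves like 0
    return ($x & 0xFFFFFFFF) >> ($bits & 31);
}

/** Wraps a value to a signed 32-bit integer, as JS bitwise operators such as | and << do. */
function toInt32(int $x): int
{
    $x &= 0xFFFFFFFF;
    // If bit 31 is set the value is negative in two's complement
    return ($x & 0x80000000) ? $x - 0x100000000 : $x;
}
```

toInt32 is useful after every emulated `|`, `^`, `<<` or addition, because JS silently truncates those results to 32 bits while PHP keeps all 64.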
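Those are the two algorithms. There is also an unsigned left shift, which is not used here, but I'm posting it as well for the record.
Example: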
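<?php
/**
 * Unsigned 32-bit left shift.
 * @param mixed $x    the number to operate on; if it is a string, it must be in decimal form
 * @param int   $bits number of bits to shift left
 * @return mixed the result; a float if it exceeds the integer range
 */
function shl32 ($x, $bits){
    // Two cases where the shift amount is out of range
    if($bits <= 0){
        return $x;
    }
    if($bits >= 32){
        return 0;
    }
    // Convert to a string of binary digits
    $bin = decbin($x);
    $l = strlen($bin);
    // If longer than 32 digits keep the low 32; if shorter, pad the high bits with 0 up to 32
    if($l > 32){
        $bin = substr($bin, $l - 32, 32);
    }elseif($l < 32){
        $bin = str_pad($bin, 32, '0', STR_PAD_LEFT);
    }
    // Drop the bits shifted out on the left and zero-fill on the right
    return bindec(str_pad(substr($bin, $bits), 32, '0', STR_PAD_RIGHT));
}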
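One thing to watch is chained ternary operators, which PHP and JS group differently. JS groups them from right to left, while PHP before 8.0 grouped them from left to right, and PHP 8 rejects unparenthesized nested ternaries outright. So a chain like the one in the JS function r() is better rewritten with if...else.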
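As an illustration, the ternary chain in the JS helper r(t) can be written with plain if statements, so associativity no longer matters. The constants are copied from the JS listing; the function name sha1RoundConstant is hypothetical.

```php
<?php
// Port of the JS helper r(t) using if statements instead of a ternary chain.
function sha1RoundConstant(int $t): int
{
    if ($t < 20) {
        return 1518500249;
    }
    if ($t < 40) {
        return 1859775393;
    }
    if ($t < 60) {
        return -1894007588;
    }
    return -899497514;
}
```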
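Another is the difference between JS arrays and PHP arrays. For example, the JS code ORs values into array elements that do not exist yet, which PHP reports as undefined indexes, so you will need to work around that yourself.
Once this is solved, nothing difficult remains; the rest is ordinary scraping.
Build the required POST data and submit it.
Then take the ticket from the response and simulate visits to the other pages, showing the ticket to each gatekeeper. After that, access the inner pages directly with the cookies.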
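The submit-and-keep-cookies part can be sketched with PHP's curl extension and a cookie jar file, so that cookies set by one request are sent with the next. This is a generic sketch, not the author's script: the function names are hypothetical, and the URLs, field names and cookie file path are all supplied by the caller because the source does not list them.

```php
<?php
// Sketch only: requires the curl extension; URLs and POST fields come from the caller.

/** Builds curl options that store and resend cookies via one cookie file. */
function buildCurlOptions(string $url, string $cookieFile, ?array $postFields = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects between sites
        CURLOPT_COOKIEJAR      => $cookieFile, // write received cookies here
        CURLOPT_COOKIEFILE     => $cookieFile, // send stored cookies from here
    ];
    if ($postFields !== null) {
        $options[CURLOPT_POST] = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields);
    }
    return $options;
}

/** Performs one request and returns the response body. */
function fetchWithCookies(string $url, string $cookieFile, ?array $postFields = null): string
{
    $ch = curl_init();
    curl_setopt_array($ch, buildCurlOptions($url, $cookieFile, $postFields));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}
```

Calling fetchWithCookies once with the POST data and then again, without POST data, for each follow-up URL reuses the same cookie file throughout.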