Java webmagic: fixing the "Received fatal alert: handshake_failure" error when crawling an HTTPS site

皮皮Blog
2022-09-01

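While crawling https://www.similarweb.com/ with webmagic, I hit this error: Received fatal alert: handshake_failure. Here is how I solved it. Most of the fixes posted online repeat one another, and none of them solved my problem.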

My environment: IDEA 2021.3, JDK 1.8.0_45, httpclient 4.5.2, webmagic 0.7.3

After the error appeared I searched online for a fix. Most answers say the cause is a mismatch in SSL protocol versions.

Solution 1:

I found this article: https://www.cnblogs.com/vcmq/p/9484418.html

// createIgnoreVerifySSL() and logger are expected to come from the enclosing class.
private SSLConnectionSocketFactory buildSSLConnectionSocketFactory() {
    try {
        // Bypass certificate verification first and enable these protocol versions explicitly.
        return new SSLConnectionSocketFactory(createIgnoreVerifySSL(),
                new String[]{"SSLv3", "TLSv1", "TLSv1.1", "TLSv1.2"},
                null, // null leaves the socket's default cipher suites untouched
                new DefaultHostnameVerifier());
    } catch (KeyManagementException | NoSuchAlgorithmException e) {
        logger.error("ssl connection fail", e);
    }
    // Fall back to httpclient's default socket factory.
    return SSLConnectionSocketFactory.getSocketFactory();
}

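I changed my code as that article describes, but the problem was not solved. Most of what you find online is this same fix; I spent a whole afternoon on it and got nowhere.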

Then I found another article that gave me a new line of thinking: https://www.jianshu.com/p/f07db27e1507

Solution 2:

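According to that article, the handshake fails because the httpclient version that webmagic depends on does not support some of the SSL cipher algorithms. So I tried swapping in different httpclient versions. The article reports that 4.5.6 works, but I tried 4.5.6, 4.5.2, and 4.5.8, and none of them solved the problem.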

With no other option left, I followed the approach in the article: turn on the debug log and investigate step by step.

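First I tried connecting to the site with a different client and found that the HTTP utilities provided by hutool could fetch the site's data normally. That told me the JDK version was not the problem. The problem was in the cipher configuration on the httpclient side, so I kept working in that direction.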

I started by printing the SSL handshake logs for both httpclient and hutool, and found that the two used different CipherSuites.
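One common way to get such a handshake log is the JDK's `javax.net.debug` system property; whether the author used exactly this switch is an assumption. The sketch below (the class name `HandshakeDebug` is made up for illustration) sets the property and also lists the cipher suites the JDK enables by default, which gives a baseline to compare against the ClientHello in the log.

```java
import javax.net.ssl.SSLContext;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

final class HandshakeDebug {

    /** Turns on JSSE handshake logging; same effect as -Djavax.net.debug=ssl:handshake. */
    static void enableHandshakeLog() {
        System.setProperty("javax.net.debug", "ssl:handshake");
    }

    /** Cipher suites the JDK's default SSLContext enables for new connections. */
    static List<String> defaultCipherSuites() {
        try {
            return Arrays.asList(
                    SSLContext.getDefault().getDefaultSSLParameters().getCipherSuites());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext available", e);
        }
    }

    public static void main(String[] args) {
        // Set the property before the first TLS class is used, otherwise it is ignored.
        enableHandshakeLog();
        defaultCipherSuites().forEach(System.out::println);
    }
}
```

Running the crawler once per client with the property set yields two logs whose ClientHello sections can be compared side by side.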

Looking closer, httpclient offered fewer CipherSuites by default, so I added the CipherSuites that hutool supports to httpclient as well.
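Comparing two long suite lists by eye is error-prone, so the comparison can be sketched as a set difference. The class name `CipherSuiteDiff` and the short sample lists are hypothetical; in practice the inputs would be the suite names copied from each client's handshake log.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

final class CipherSuiteDiff {

    /** Returns the suites in reference that candidate lacks, in reference order. */
    static Set<String> missingFrom(String[] candidate, String[] reference) {
        Set<String> missing = new LinkedHashSet<>(Arrays.asList(reference));
        missing.removeAll(Arrays.asList(candidate));
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical excerpts, not the real logs from this article.
        String[] fromHttpclient = {"TLS_RSA_WITH_AES_128_CBC_SHA"};
        String[] fromHutool = {
            "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
            "TLS_RSA_WITH_AES_128_CBC_SHA"
        };
        System.out.println(missingFrom(fromHttpclient, fromHutool));
    }
}
```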

The modified code:

// createIgnoreVerifySSL() and logger are expected to come from the enclosing class.
private SSLConnectionSocketFactory buildSSLConnectionSocketFactory() {
    try {
        // Cipher suites to enable on the httpclient side.
        String[] cipherSuites = {
            "TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA", "TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA",
            "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA", "SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
            "TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA", "TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA",
            "SSL_RSA_WITH_3DES_EDE_CBC_SHA",
            "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384", "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
            "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384", "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
            "TLS_DHE_RSA_WITH_AES_256_GCM_SHA384", "TLS_DHE_DSS_WITH_AES_256_GCM_SHA384",
            "TLS_DHE_RSA_WITH_AES_128_GCM_SHA256", "TLS_DHE_DSS_WITH_AES_128_GCM_SHA256",
            "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384", "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384",
            "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256", "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256",
            "TLS_DHE_RSA_WITH_AES_256_CBC_SHA256", "TLS_DHE_DSS_WITH_AES_256_CBC_SHA256",
            "TLS_DHE_RSA_WITH_AES_128_CBC_SHA256", "TLS_DHE_DSS_WITH_AES_128_CBC_SHA256",
            "TLS_ECDH_ECDSA_WITH_AES_256_GCM_SHA384", "TLS_ECDH_RSA_WITH_AES_256_GCM_SHA384",
            "TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256", "TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256",
            "TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA384", "TLS_ECDH_RSA_WITH_AES_256_CBC_SHA384",
            "TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256", "TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256",
            "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA", "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
            "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA", "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
            "TLS_DHE_RSA_WITH_AES_256_CBC_SHA", "TLS_DHE_DSS_WITH_AES_256_CBC_SHA",
            "TLS_DHE_RSA_WITH_AES_128_CBC_SHA", "TLS_DHE_DSS_WITH_AES_128_CBC_SHA",
            "TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA", "TLS_ECDH_RSA_WITH_AES_256_CBC_SHA",
            "TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA", "TLS_ECDH_RSA_WITH_AES_128_CBC_SHA",
            "TLS_RSA_WITH_AES_256_GCM_SHA384", "TLS_RSA_WITH_AES_128_GCM_SHA256",
            "TLS_RSA_WITH_AES_256_CBC_SHA256", "TLS_RSA_WITH_AES_128_CBC_SHA256",
            "TLS_RSA_WITH_AES_256_CBC_SHA", "TLS_RSA_WITH_AES_128_CBC_SHA",
            "TLS_EMPTY_RENEGOTIATION_INFO_SCSV"
        };
        // Bypass certificate verification first and pass the explicit cipher suite list.
        return new SSLConnectionSocketFactory(createIgnoreVerifySSL(),
                new String[]{"SSLv3", "TLSv1", "TLSv1.1", "TLSv1.2"},
                cipherSuites,
                new DefaultHostnameVerifier());
    } catch (KeyManagementException | NoSuchAlgorithmException e) {
        logger.error("ssl connection fail", e);
    }
    // Fall back to httpclient's default socket factory.
    return SSLConnectionSocketFactory.getSocketFactory();
}

It ran successfully, and the problem was solved.
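One caveat if a hard-coded list like this is reused on a different JDK: `SSLSocket.setEnabledCipherSuites` throws `IllegalArgumentException` when any requested name is not supported by the runtime, and a newer JDK may not support every suite listed here. A minimal sketch of a guard, assuming the list is filtered before it is handed to `SSLConnectionSocketFactory`; `SupportedSuites` and `retainSupported` are hypothetical names, not part of webmagic or httpclient.

```java
import javax.net.ssl.SSLContext;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SupportedSuites {

    /** Keeps only the requested suites that this JDK supports, preserving the requested order. */
    static String[] retainSupported(String[] requested) {
        Set<String> supported;
        try {
            supported = new HashSet<>(Arrays.asList(
                    SSLContext.getDefault().getSupportedSSLParameters().getCipherSuites()));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext available", e);
        }
        List<String> kept = new ArrayList<>();
        for (String suite : requested) {
            if (supported.contains(suite)) {
                kept.add(suite);
            }
        }
        return kept.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] requested = {
            "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
            "NOT_A_REAL_SUITE" // placeholder showing that unknown names are dropped
        };
        System.out.println(Arrays.toString(retainSupported(requested)));
    }
}
```

The filtered array would then be passed as the cipher suite argument of the `SSLConnectionSocketFactory` constructor in place of the raw list.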
