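Lately I've been stuck with some tedious paperwork, basically endless copy and paste, and it wears you out. I was about to write a crawler to pull down the whole website, then remembered this tool and couldn't be bothered to write one.
I put it on a VPS, ran it behind a proxy, and the crawl finished before long.
So I'm recommending it to everyone here: it frees you from writing crawlers. httrack is cross-platform and open source. The official description reads: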
HTTrack is a free (GPL, libre/free software) and easy-to-use offline browser utility.
It allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site's relative link-structure. Simply open a page of the "mirrored" website in your browser, and you can browse the site from link to link, as if you were viewing it online. HTTrack can also update an existing mirrored site, and resume interrupted downloads. HTTrack is fully configurable, and has an integrated help system.
WinHTTrack is the Windows 2000/XP/Vista/Seven release of HTTrack, and WebHTTrack the Linux/Unix/BSD release. See the download page.
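A medium-sized website can be pulled down in a few minutes; in my experience it felt hundreds of times faster than wget. Just don't push it too hard, or you'll hammer the target server.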
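For anyone who wants to script the "behind a proxy, but not too hard" part, here is a rough sketch in Python that only assembles an httrack command line. The flags used (`-O` for the output directory, `-P` for a proxy, `-A` for a maximum transfer rate in bytes per second, `-c` for the number of simultaneous connections) are how I understand httrack's command-line help, so check `httrack --help` on your version. The function names, the URL, the proxy address, and the numbers are all made up for illustration.

```python
import shlex
import subprocess


def build_httrack_command(url, out_dir, proxy=None, max_rate_bytes=None, connections=None):
    """Build an argument list for the httrack CLI without running anything.

    Flag meanings are assumptions taken from httrack's help text:
      -O <dir>        output directory for the mirror
      -P <host:port>  proxy to route requests through
      -A<n>           maximum transfer rate, in bytes per second
      -c<n>           number of simultaneous connections
    """
    cmd = ["httrack", url, "-O", out_dir]
    if proxy:
        cmd += ["-P", proxy]
    if max_rate_bytes is not None:
        cmd.append(f"-A{max_rate_bytes}")
    if connections is not None:
        cmd.append(f"-c{connections}")
    return cmd


def mirror(url, out_dir, **options):
    """Run httrack with the assembled arguments (requires httrack on PATH)."""
    cmd = build_httrack_command(url, out_dir, **options)
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical values; this only prints the command, it does not run it.
    example = build_httrack_command(
        "https://example.com/",
        "./mirror",
        proxy="127.0.0.1:8080",
        max_rate_bytes=250000,
        connections=4,
    )
    print(shlex.join(example))
```

Calling `mirror(...)` with the same arguments would actually start the download; keeping the rate and connection count low is the easy way to avoid playing too rough with someone else's server.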