Wild Spider

#1 / 1 rate

38 users

2019-03-08

Xuan Wu

Extension Information

Ratings: 5 star 0%, 4 star 0%, 3 star 0%, 2 star 0%, 1 star 100%

Description

Web pages are crawled by loading them in the browser, using multiple tabs in parallel.

WATCH OUT: the more tabs you use, the more computer resources (CPU, memory) are consumed, and each page takes a little disk space to store its content (in IndexedDB, accessible from Extensions -> Inspect views: background page).

The "spider" works as follows:
1) The current URL is used as the starting point, and it is loaded again in a new tab.
2) After this page has loaded, fetch all the links on the page.
3) Collect every link found, including relative URLs.
4) Open the extracted links in parallel across all the tabs in use (3 by default, set in eventPage).
5) Repeat steps 2-4.
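
The steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the extension's actual code: the names `extract_links`, `crawl`, `load_page` and `max_pages` are hypothetical, a small in-memory `SITE` dictionary stands in for real page loads, and each batch of "tabs" is processed one page at a time rather than truly in parallel. Skipping already-seen URLs and dropping `#fragment` parts are also assumptions; the description does not say how the extension handles them.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Steps 2-3: return absolute URLs for all links, relative ones included."""
    parser = LinkParser()
    parser.feed(html)
    # urljoin resolves relative hrefs against the page URL;
    # urldefrag drops any "#fragment" (an assumption of this sketch).
    return [urldefrag(urljoin(base_url, href)).url for href in parser.hrefs]


def crawl(start_url, load_page, tabs=3, max_pages=100):
    """Crawl from start_url, taking up to `tabs` URLs per round.

    load_page(url) -> html is a hypothetical stand-in for loading a page
    in a browser tab. Returns the URLs in the order they were loaded.
    """
    queue = deque([start_url])   # step 1: the current URL is the start
    seen = {start_url}           # assumption: each URL is loaded only once
    visited = []
    while queue and len(visited) < max_pages:
        # Step 4: one URL per available tab.
        size = min(tabs, len(queue), max_pages - len(visited))
        batch = [queue.popleft() for _ in range(size)]
        for url in batch:
            html = load_page(url)
            visited.append(url)
            for link in extract_links(url, html):
                if link not in seen:
                    seen.add(link)
                    queue.append(link)
        # Step 5: the loop repeats with the newly found links.
    return visited


# Tiny fake site used instead of real network access.
SITE = {
    "http://example.test/": '<a href="/a">A</a> <a href="b.html">B</a>',
    "http://example.test/a": '<a href="/">home</a> <a href="/c#top">C</a>',
    "http://example.test/b.html": '<a href="http://example.test/c">C</a>',
    "http://example.test/c": "",
}

visited = crawl("http://example.test/", lambda url: SITE.get(url, ""))
print(visited)
```

Raising `tabs` makes each round larger, which corresponds to the resource warning above: more pages are held open at once.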

All source code is available at: https://github.com/nobodxbodon/ChromeCrawlerWildSpider