Hello,

Oct 22, 2018

As I see it, you simply need to add another HTTP Request sampler before the ‘While’ loop so that the authentication happens first. You would also need to add an ‘HTTP Cookie Manager’ config element to keep your logged-in session across requests.

As for the PDF links part, I’m not sure I understand it, but I see 2 issues here:

1. You cannot crawl **only** the PDF links, since you cannot extract further links from a PDF.
2. This means you still need to crawl **all** links, but while doing so, create a separate ArrayList where you store only the .pdf links (extract them the same way you extract normal links, with a regex like href="([^"]+\.pdf)" )
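
To illustrate the second point, here is a minimal sketch in plain Java, outside of JMeter. The class name `PdfLinkExtractor`, the method `collectLinks`, and the sample HTML are made up for the example; inside a test plan you would more likely do the same thing with a Regular Expression Extractor or a scripting element. It keeps every link for crawling and copies the ones ending in .pdf into a second list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JMeter.
class PdfLinkExtractor {
    // Matches any href attribute; group 1 is the link target.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    // Adds every link to allLinks and, additionally, every .pdf link to pdfLinks.
    static void collectLinks(String html, List<String> allLinks, List<String> pdfLinks) {
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String link = m.group(1);
            allLinks.add(link);              // still needed for crawling
            if (link.toLowerCase().endsWith(".pdf")) {
                pdfLinks.add(link);          // kept separately
            }
        }
    }

    public static void main(String[] args) {
        // Sample page content, assumed for the example.
        String html = "<a href=\"/docs/page.html\">Page</a> "
                    + "<a href=\"/files/report.pdf\">Report</a>";
        List<String> allLinks = new ArrayList<>();
        List<String> pdfLinks = new ArrayList<>();
        collectLinks(html, allLinks, pdfLinks);
        System.out.println("to crawl: " + allLinks);
        System.out.println("pdf only: " + pdfLinks);
    }
}
```

Matching `[^"]+` instead of `.+?` keeps a single match from running across several href attributes on the same line.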

Good luck!

Written by Dragos Campean
