Dragos Campean
1 min read · Oct 22, 2018


Hello,

As I see it, you simply need to add another HTTP Request sampler before the 'While' loop so that the authentication is done first. You would also need to add an 'HTTP Cookie Manager' config element so that the logged-in session is kept for the requests that follow.

As for the PDF links part, I'm not sure I understand, but I see 2 issues here:

1. You cannot crawl **only** the PDF links, since you cannot extract other links from a PDF.
2. This means you still need to crawl **all** links, but while doing so, create a separate ArrayList where you store only the .pdf links (extract them the same way you extract normal links, with a regex like href="(.+?)\.pdf").
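
To illustrate the second point, here is a minimal sketch in plain Python (not JMeter code) of collecting every link for crawling while keeping the .pdf ones in a separate list. The sample HTML, the `split_links` helper, and the slightly stricter `[^"]+` pattern are my assumptions for the example. In a real test plan you would do the equivalent with a regex extractor on the sampler response.

```python
import re

# Hypothetical page body; in JMeter this would be the sampler's response data.
SAMPLE_HTML = """
<a href="/docs/intro.html">Intro</a>
<a href="/files/report.pdf">Report</a>
<a href="/about">About</a>
<a href="/files/manual.pdf">Manual</a>
"""

# [^"]+ stops at the closing quote, so a match cannot run across
# several links that happen to sit on the same line.
LINK_RE = re.compile(r'href="([^"]+)"')


def split_links(html):
    """Return (all_links, pdf_links) found in the given HTML text."""
    all_links = LINK_RE.findall(html)
    # Keep crawling with all_links; store the PDF ones separately.
    pdf_links = [link for link in all_links if link.lower().endswith(".pdf")]
    return all_links, pdf_links


all_links, pdf_links = split_links(SAMPLE_HTML)
print(pdf_links)
```

The crawl itself still follows everything in `all_links`. The second list is only a record of the PDF URLs found along the way.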

Good luck!
