http://www.semanlink.net/tag/pdf_extract ; Scraping AND Howto
Common descendants