GitHub project: NLP sample code and identification of similar documents
Common descendants