This process has an obvious drawback: it focuses only on frequency. Generic words are likely to be very frequent in almost any document, yet they are certainly not representative of the document's domain and subject. We want a way to filter out generic terms. If you're thinking that