You mean... a Semantic Web? Always valid, anything messy rejected? Human artifacts at scale need constant sorting and cleaning, yet they never reach a fully tidy or valid state. While we keep trying, we can't ignore legacy content or the new, raw, and uncurated. Search engines don't. The real web needs scraping.

Thomas O'Brien