How approaches to dealing with fake news could impact search
It seems to me that the way we filter fake news has the potential to create a new factor for ranking quality content on the web: content trust.
Consider that Facebook is partnering with sites like Snopes.com to effectively have them label which sites are considered hoaxes. Following this logic, you could imagine a potential classification for content as parody; sites such as The Onion and The Oatmeal come to mind. Of course, this is not a simple problem to solve: some websites may indeed be found to be propaganda farms, while others trade in parody stories for entertainment. That said, how do you handle news sites that publish some quality stories but also amplify some that are bogus? How do you handle sites that express individual opinions and are not trying to mislead people? This is where the work gets hard, and why it is going to need to be done by humans in the early days. Presumably, once they get good at it, Facebook is going to try to quantify these rankings into a content trust score.
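To make the idea concrete, here is a minimal sketch of what quantifying such labels into a score might look like. The label set and weights below are purely illustrative assumptions on my part, not anything Facebook or the fact-checkers have published:

```python
# Hypothetical sketch: turning human-applied content labels into a
# single "content trust score". Labels and weights are assumptions
# for illustration only.

LABEL_WEIGHTS = {
    "verified": 1.0,  # fact-checked and confirmed accurate
    "opinion": 0.7,   # individual viewpoint, not trying to mislead
    "parody": 0.5,    # satire such as The Onion
    "mixed": 0.4,     # some quality stories, some bogus ones
    "hoax": 0.0,      # flagged by fact-checkers like Snopes.com
}

def content_trust_score(labels):
    """Average the weights of all labels applied to a site's content."""
    if not labels:
        return None  # unlabeled sites get no score rather than a default
    return sum(LABEL_WEIGHTS[label] for label in labels) / len(labels)

print(content_trust_score(["verified", "opinion"]))  # 0.85
print(content_trust_score(["mixed", "hoax"]))        # 0.2
```

Even this toy version surfaces the hard cases from above: a site that mixes quality stories with bogus ones lands in the murky middle, which is exactly where human judgment is needed.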
I know, you are probably saying: “We already know that site reputation is a ranking signal. It’s on all of the popular lists of search ranking factors.” That said, social signals have remained correlated with, but not causal of, the success of winning sites. PageRank has continued to be the driving factor of success.
It is also clear to me that Facebook is going to take some time to get a working system in play. In a recent interview about the fake news debacle, their spokesperson said they would consider shares as indicators of content quality. This flies in the face of studies of social media activity showing that people share what they do not read.
Back to our roots: Directories
Most of our search engines started as directories, if we can remember back that far. We asked sites to classify themselves: entertainment, humor, news, opinion. Of course, these classifications started to become too complex to manage and couldn’t always be trusted. Google’s PageRank stood out because it let actions speak louder than words. Once people actually linked to the sites that mattered to them, that spoke louder than any classification.
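The “links as votes” idea behind PageRank can be sketched in a few lines with the standard power-iteration formulation. The tiny three-page graph and 0.85 damping factor are the usual textbook setup, not data from any real index:

```python
# Minimal PageRank sketch (power iteration): links act as votes,
# so a page linked to by many others accumulates more rank.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # every page keeps a small baseline rank (the "teleport" term)
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # dangling page: spread its rank evenly across all pages
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Page "a" is linked to by both other pages, so it earns the top rank.
ranks = pagerank({"a": ["b"], "b": ["a"], "c": ["a"]})
print(sorted(ranks, key=ranks.get, reverse=True))  # ['a', 'b', 'c']
```

No site had to declare “I am important” here; the structure of the links did the classifying, which is exactly why this beat self-reported directory categories.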
Years later, we have watched Google become more sophisticated about how it ranks sites and interprets queries to meet the context of its users. Yet as good as search has become, it’s still quite easy to inject fake content that looks real into the mix. Even Google News itself has become easy to game for site owners who want to get their blogs into the Google News feed. Presumably that will now be addressed, given the recent climate around news hoaxes.
I do think that the path to the right scheme will include some sort of site classification, whether self-administered or graded by Google, Facebook, and the other search overlords. Ultimately, I feel that these classifications may be useful for power searchers and advanced users, but they’re not a strong enough mechanism on their own.
The Simplest Answer is Probably The Right One
Search for the general population has to be simple, fast, and predictable. When all other things are equal, search engines need to assume that people are seeking true information first and foremost: truth over satire. For this reason, I think a content trust score is something we could see in the not-too-distant future.
Also published on Medium.