Forget Me Not
The recent "right to be forgotten" ruling ([factsheet][1]) has had a lot of attention. I wanted to post about it a while ago but haven't had enough time to read around the various issues. This post pulls together some questions I have about how the ruling is being implemented, but it is very sketchy at the moment. My major concerns are:

1. relevance is defined solely by complainants, not information seekers, meaning a search query specifying a name and some particular thing (e.g. a crime) will, despite its relevance to the query, potentially not return target documents
2. it is unworkable in the long run given disambiguation issues (name overlap, content mirroring/URL changes, etc.)
3. search engines don't exist in a vacuum, and their relationship to e.g. social media content has not been considered properly

Having said that, I find some of the complaints raised dubious. It is quite obviously the case that if I ask "So, tell me about James", any good informant will tell me recent, relevant, and largely factually accurate (rather than rumour mill) information. Given Google's role in processing the ways we access data, they are a relevant informant, and where things seem to be going wrong we might need some recourse (which is not to say we should censor).

# RTBF

[This article][2] seems to explain how Google are dealing with the process (i.e. their submission form, etc.): collecting minimal data, confirming the requester is (or has the ID of) the target, and asking for the URLs of target pages and the reasons why they are irrelevant, outdated or otherwise inappropriate. I like the book analogy: the RTBF doesn't remove pages, it removes index terms. It isn't tantamount to pulping books, but it is to burning index cards.
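
To make the index card analogy concrete, here is a minimal sketch of a toy inverted index in Python. Everything in it is made up for illustration: the page URLs, the name "smith", and the `build_index`, `search` and `deindex` helpers are my own assumptions, and I am not claiming this is how Google (or any real search engine) implements removals.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lower-cased word to the set of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that match every word in the query (plain AND search)."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

def deindex(index, term, url):
    """Burn one index card: drop `url` from the entry for `term` only."""
    index.get(term.lower(), set()).discard(url)

# Hypothetical pages, invented for this example.
pages = {
    "example.org/auction": "smith madrid repossession auction notice",
    "example.org/profile": "smith madrid architect profile",
}
index = build_index(pages)

# A removal request for the name "smith" on the auction page.
deindex(index, "smith", "example.org/auction")

print(search(index, "smith"))                # {'example.org/profile'}
print(search(index, "madrid repossession"))  # {'example.org/auction'}
print(search(index, "smith repossession"))   # set()
print("example.org/auction" in pages)        # True: the page itself is untouched
```

In this toy version the page still exists and can still be found through its other index terms, but the name-plus-topic query comes back empty, which is the situation behind my first concern above.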

My concerns with RTBF are largely pragmatic rather than ethical (etc.):

* It seems to me entirely appropriate that someone who searches for "Mr x, bankruptcy repossession" should in fact find that information (i.e., that combination of keyterms shouldn't be "deindexed" from the target document). This would be true of various cases including victims and perpetrators of crime, corrupt politicians, etc. At the moment the implementation is (I think) removing these index terms too. This seems wrong to me: the relevance of the information in these cases is strong insofar as the information seeker has defined their need in those terms.
* It seems less appropriate that someone who searches for "Mr x" should see old or irrelevant material as a top result where other results exist about "Mr x". This is particularly true of victims of crime (and the accused but not guilty), where links to news articles about them might be ranked higher than other information about them, even if they are not the core content (the perpetrator is) and/or it is outdated. In these cases RTBF may be the only means to "downgrade" that information.
* Someone searching for "Madrid repossession firms" should still be able to find the target document; the index term "Mr x" is peripheral. Such searches should be unaffected by the ruling.

Overall, I don't buy the claim that "it cannot be right to remove verifiable true facts": there are cases where we might remove true information (e.g. for safety reasons), or limit indexing. Search engines aren't just some mirror on the world; their algorithms target "importance", "authority", "diversity", etc., and these techniques prioritise certain information, particularly from news sources. Just telling individuals to get better at personal SEO isn't enough. Google does process information when it "decides" what links to present, and in what order, to which queries.
That's not just a trivial matter; it says something about how we prioritise information and its authoritativeness.

# Unresolved issues

What I am hesitant about here is just supporting the "free speech" line entirely. Something about proportionality, notability, and biographies of living persons (including reference to peripheral characters) is important nuance; this is much in line with the Wikipedia stance. Some unresolved issues:

1. It isn't clear what deindexing is happening: whether target documents are being removed from SERPs entirely, or for which keyterms (and how requesters are being informed of that).
2. How symmetric/asymmetric requests should be dealt with (e.g. interesting cases around the potential for the victim and perpetrator of a crime requesting RTBF on the same/different pages). It isn't outside the realm of possibility that two individuals with the same name might both appear in a target page, one wishing to be deindexed, the other not. This is more likely with a shared surname (so query specificity really matters). We can also imagine cases where individuals request that information relating to a namesake (which would of course be returned on searching for the name) be removed, and while Google would presumably instigate some investigation to avoid such issues, it's not clear they'd always be successful.
3. What the implications are for integration of services in search engines. Many search engines integrate some social-media features, for example alongside searching the "open web" also searching posts in Google+, Facebook, etc. made by my friends. The removal of many links under RTBF parallels the "super injunction" madness, although matters are more complicated by material that is about me (my friend posts a photo of me I don't want available; I can't force Google+ to take it down, but could I stop them indexing that material?).
4. What the implications are for other services including (but not limited to) Wikimedia projects:
   1. e.g. on Wikipedia we can link to target documents where the title of the article might match a query leading to that document; will that mean the Wikipedia article is removed? This would undoubtedly lead to censorship of content on Wikipedia, and/or of access to Wikipedia articles ([recent interesting case][3] of a Greek Wikipedian accused of defamation).
   2. Could the projects be forced (by the ECJ) to remove certain articles or content despite BLP/notability guidance? We've already seen attempts to bring legal action against editors in particular countries (unsuccessfully); does RTBF give another avenue a) for that and b) for legal complaint against the WMF even if it is based in the US?
   3. Could edit histories/revisions be censored to conceal "outdated" revisions to an article?
5. The weighting Google gives to recency and in/out links isn't clear, and it isn't clear whether a rejigging in ranking would address some concerns. In the case in point, a lesson in self-SEO and restrictions around what newspapers can publish, and/or what historic information credit agencies can use (if that was an issue), could both address the concern. Neither of those "solutions" would lead to the kind of RTBF deindexing being done. The ways search engines rank information are important for the processing and presentation of it.
6. There are serious pragmatic issues around:
   1. the capacity of smaller search engines (and potentially other organisations) to comply
   2. leaving it up to each individual company to comply (as per the [Lords discussion][4]). Note it isn't clear at the moment if Google's decisions have all been appropriate, e.g. [these Guardian pieces][5] (apparently since reinstated), and I think there's a very legitimate concern regarding leaving decisions to individual search engines

Footnotes
1. http://ec.europa.eu/justice/data-protection/files/factsheets/factsheet_data_protection_en.pdf
2. http://searchengineland.com/google-right-to-be-forgotten-form-192837
3. http://blog.wikimedia.org/2014/09/23/greek-wikipedia-user-wins-key-hearing-in-defamation-case/
4. http://www.theguardian.com/technology/2014/jul/30/lords-right-to-be-forgotten-ruling-unworkable
5. http://www.theguardian.com/commentisfree/2014/jul/02/eu-right-to-be-forgotten-guardian-google