Did you know that the City of Beverly Hills blocks all search engines from indexing the city’s public website? We noticed recently that a search of the Beverly Hills website brings up no results. Try it yourself. In Google enter: [your search term] site:beverlyhills.org. (Leave off the square brackets but keep the ‘site:’ command, which restricts your search to the city’s website.)
We noticed the problem when looking for our city’s updated Sustainability Plan. The posted draft is from 2009, and it’s all that’s available. No luck on an adopted one. So, digging a bit further, we noticed that the city uses the ‘robots.txt’ file to block search engines from crawling any part of the city site.
Robots.txt is a text document that simply reads:
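```
User-agent: *
Disallow: /
```

(The file itself is reconstructed here from the description that follows: these are the two standard directives that impose a site-wide block on all crawlers.)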
In plain English, ‘Disallow: /’ prohibits all search engines from accessing anything under the root (“/”) of the city’s web domain – which means every file publicly available via the site; it tells search engines to GO AWAY. But it turns away everyone at the city’s online door. Google says of the robots.txt file, “You need a robots.txt file only if your site includes content that you don’t want search engines to index.”
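The effect is easy to demonstrate with Python’s standard-library robots.txt parser. A minimal sketch, feeding in the same two-line rule directly rather than fetching it from the live site (the document URLs below are made up for illustration):

```python
from urllib import robotparser

# Parse the assumed full-block rule directly, per the description above.
rp = robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

# Any well-behaved crawler, asking about any path, is refused.
print(rp.can_fetch("Googlebot", "https://beverlyhills.org/sustainability-plan.pdf"))  # False
print(rp.can_fetch("*", "/council-agendas/"))  # False
```

Well-behaved crawlers consult this file before fetching anything, so a single `Disallow: /` line keeps the entire site out of every major search index.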
Cities today want to make themselves more available to residents, not less, don’t they? So why use robots.txt? Most cities recognize that public documents provide value, and they work to make those documents more accessible to the public, not less.
Not Beverly Hills. The exclusionary practice is right in line with the city’s overall approach to transparency – which is to say non-transparency – and that’s why we give Beverly Hills a #FAIL. Consider:
- Our city is one of only two cities in Los Angeles County that puts a full block on search engines – two out of 88;
- Our city routinely links to official agenda materials that are not machine-readable and thus not searchable because they’re images of the paper documents;
- Our city persists in generating PDFs that are not at all legible on the Mac platform (see below right) without special Adobe-brand browser plugins (out of the box, Safari uses its native PDF viewer and Firefox a non-Adobe plugin).
Most worrisome is the production of these document-image PDFs. In this format, city documents cannot be indexed by search engines, nor can you copy and paste text from them. And many city budget documents today are presented in this fashion – such as the 75-page check register, City Council minutes, and purchase orders paid. All are linked from the latest City Council agenda. All are legible only to the human eye. Planning Commission minutes and some staff reports are not machine-readable either.
Why not? When city clerk’s office officials from other cities in the region – West Hollywood, Santa Monica, Glendale and Burbank, for example – were asked about the practice, they expressed puzzlement. After all, why print paper copies only to scan them? Such documents are more trouble to generate, and the practice wastes paper. Those cities don’t follow it.
For the transparency-minded, that process produces a document less open to scrutiny. Despite complaints stretching back to early 2010, in which accessibility was explicitly flagged as a problem, the practice not only continues; it seems to be on the increase. We’re producing ever more inscrutable documents, not fewer. Public Works agendas and minutes, for example, are images of text documents, though until January they were actual text documents.
Consider the combination of image-only documents and the search engine exclusion a double-locked door: not only are these public documents closed to indexing because they’re images; thanks to robots.txt, even their metadata (information about the file itself, such as title and URL) is inaccessible to search engines. While one-third of LA County cities use a robots.txt file to selectively block certain server directories, only Azusa and Beverly Hills bolt the front door entirely.
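For contrast, a selective robots.txt of the kind those other cities use might look like this (the directory names here are hypothetical, for illustration only):

```
User-agent: *
Disallow: /internal/
Disallow: /tmp/
```

Everything outside the listed directories stays crawlable, so public documents still reach the search indexes while genuinely non-public corners of the server are kept out.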
Why the Fuss?
Search engines are the most popular way to find information of all kinds, but when Google and other search engines are prevented from accessing or indexing documents, they can’t deliver them to you today or in the future. If the Internet never forgets, our city doesn’t even allow it to begin to remember. Want that report or study that you know is public but that the city is not making available today? Forget it. Once off the Beverly Hills web server, it’s gone.
What does this say for City Hall access, so to speak?
Now, if you want a commemorative plaque for your philanthropy, why we’ll oblige you. Want a stand-up photo-op with the Mayor in Chambers for your hard work or event promotion? That’s business as usual in City Hall. But if you want to be able to access public documents via Google you are out of luck. That’s a transparency #FAIL, folks.
[Update (8/15): The City of Beverly Hills has changed its robots.txt file to allow search engine crawling, and agreed to revisit PDF publishing practices for some documents that are not currently machine-readable.]