February 2, 2006 - DIT released Report on Google.cn's Self Censorship
Dynamic Internet Technology Inc. (firstname.lastname@example.org), February 6, 2006
Various searches were performed and compared on google.com and google.cn between 1/25 and 2/3. From the results we infer that google.cn uses two blacklists, one of keywords and one of websites, to remove some results. In addition, both www.google.com when accessed from China and www.google.cn include more web pages from China in their search results. Combined with Google's best search ranking technologies, this makes google.cn a model partner of the Chinese government's propaganda.
1. Impact on user experience
On issues that are highly sensitive to the CCP regime, a search within "all sites" will be silently limited to websites inside China. So the results will closely follow the party line.
As an example, search results for the Chinese word for "Falun Gong" (法轮功) on google.com and google.cn differ dramatically. Half of the search results from google.com are overseas websites that are supportive of Falun Gong. All search results from google.cn are slanderous.
Results from google.cn:
Results from google.com:
On less sensitive issues, propaganda in China applies subtle alterations compared with overseas websites. In such cases, google.cn only removes results from blacklisted websites, which is enough to present the propaganda.
One example is a search for "Iraq+America" (in Chinese characters). The news reports that remain on google.cn emphasize how little support President Bush got from the US, Iraq and the world, and how hypocritical the US army is in the prisoner torture case.
Besides removing results, Google also adds more results from websites in China to www.google.cn and to www.google.com when accessed from China. Those websites in China usually follow government propaganda closely. Because of the ranking algorithm, such a larger pool of well-interconnected web pages will "dilute" other web pages and lower the ranking of overseas web pages sooner or later.
For example, a search for "Iraq+America" reported 1,950,000 results on www.google.com when accessed from the US, 4,960,000 results on www.google.com when accessed from China (through a proxy in China) and 4,850,000 results on google.cn.
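To put these counts side by side, here is a minimal Python sketch that expresses each figure relative to the count seen from the US. The counts are the ones quoted above; the dictionary labels and the helper name `ratio_to_us` are illustrative and not part of the original report.

```python
# Result counts for "Iraq+America" as quoted in the text above.
RESULT_COUNTS = {
    "www.google.com from US": 1_950_000,
    "www.google.com from China (via proxy)": 4_960_000,
    "google.cn": 4_850_000,
}

US_BASELINE = RESULT_COUNTS["www.google.com from US"]


def ratio_to_us(count: int) -> float:
    """Return how many times larger a count is than the US-accessed count."""
    return count / US_BASELINE


if __name__ == "__main__":
    for source, count in RESULT_COUNTS.items():
        print(f"{source}: {count:,} results ({ratio_to_us(count):.2f}x)")
```

Both China-facing figures come out at roughly two and a half times the US figure, which gives a sense of how large the added pool of pages is.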
Google.cn also implements various redirections that silently limit a user's search to web pages in China, even when the user originally chose the default "all pages". This happens when certain "highly sensitive" keywords are searched.
Image and news search work the same way as web search and have similar impact on user experience.
2. Impact on free flow of information
According to the recent "Statistical Reports on the Internet Development in China" by CNNIC, the search engine is one of the three "most used Internet service/function" and is the most important way to learn of a new website.
With the explosion of online information, the search engine has become another form of media. The case of google.cn is the latest example of how new technologies can be used to consolidate the communist regime.
Because content from blacklisted websites is removed on google.cn, most users, who are not technically savvy, will have a harder time becoming aware of the existence of such websites through a search engine, much less seeking ways to access them.
3. Blacklist discovered
When a keyword is blacklisted, google.cn silently limits search results to websites in mainland China only WITHOUT ANY WARNING. Search results for those keywords from overseas websites usually differ dramatically from news from mainland China. Those keywords include "sensitive issues" like Falun Gong, Nine Commentaries, June 4th, etc.
Blacklisted keywords found can be grouped into the following:
1) 法轮功 (Falun Gong), 法轮大法 (Falun Dafa – alternative name for Falun Gong), 讲真相 (Clarify the truth – Falun Gong's effort to expose the persecution in China), 自焚 (self-immolation – one of the largest efforts the Chinese regime made to defame Falun Gong), 李老师, 李洪志, 李大师 (Teacher Li, Li Hongzhi, Master Li – various ways to refer to the founder of Falun Gong)
2) 九评 (Nine Commentaries – nine articles published by theepochtimes.com that comment on the Chinese Communist Party), 退党 (renounce Chinese Communist Party membership)
3) 美国之音, 自由亚洲电台, 博讯, 明慧, 人民报, 大纪元, 新唐人 (names of overseas news websites, including VOA and RFA)
4) tiananmen, 天安门, 六四, 王丹, 魏京生, 刘晓波 (June 4th related keywords)
5) 江泽民 (Jiang Zemin – former president of China), 江戏子, 江主席, 僵贼民, 江贼民, 江猪媳 (alternative names for Jiang Zemin, some highly derogatory), 江绵恒 (son of Jiang Zemin), 江泽慧 (cousin of Jiang Zemin)
6) 太石村, 东洲, 高智晟 (Taishi village, Dongzhou (shooting), Gao Zhisheng – recent "sensitive issues")
7) 达赖 (Dalai), 西藏独立 (Tibet independence), 陈水扁 (Chen Shuibian – president of ROC/Taiwan)
Blacklisted websites never appear in the search results on google.cn. Those sites tend to report more on human rights issues in China, in the Chinese language.
Websites found to be on the blacklist:
rfa.org, minghui.ca, minghui.cc, epochtimes.com, dajiyuan.com, kanzhongguo.com, voa.gov, secretchina.com, renminbao.com, peacehall.com, bbc.co.uk, libertytimes.com.tw, hrichina.org, hrw.org, falundafa.org, chinese.faluninfo.net
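For anyone who wants to work with this list programmatically, the following is a small Python sketch that stores the sites named above and checks whether a hostname falls under one of them. The matching rule (exact match or subdomain of a listed entry) is an assumption made for illustration; the tests described here do not reveal how google.cn itself matches sites. The names `BLACKLISTED_SITES` and `is_listed` are hypothetical.

```python
# Sites reported as blacklisted in the list above.
BLACKLISTED_SITES = frozenset({
    "rfa.org", "minghui.ca", "minghui.cc", "epochtimes.com",
    "dajiyuan.com", "kanzhongguo.com", "voa.gov", "secretchina.com",
    "renminbao.com", "peacehall.com", "bbc.co.uk", "libertytimes.com.tw",
    "hrichina.org", "hrw.org", "falundafa.org", "chinese.faluninfo.net",
})


def is_listed(hostname: str) -> bool:
    """Return True if hostname equals a listed site or is a subdomain of one.

    Assumption: subdomain matching is our own choice for this sketch,
    not a documented behaviour of google.cn.
    """
    host = hostname.strip().lower().rstrip(".")
    return any(host == site or host.endswith("." + site)
               for site in BLACKLISTED_SITES)


if __name__ == "__main__":
    for name in ("www.epochtimes.com", "hrw.org", "example.org"):
        print(name, is_listed(name))
```

Note that only chinese.faluninfo.net appears in the list, so under this rule the bare faluninfo.net is not treated as listed.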
When search results are excluded, users do not see any warning until they reach the bottom of the page. The warning message reads, in Chinese: "According to local law and policy, some search results are not displayed". There is no warning to indicate that google.cn includes additional websites from China compared with google.com.
More data and blacklisted keywords/websites will be posted.
Algorithm that can be used to find out if a keyword or website is blacklisted
To check whether a website is blacklisted, one can search for "China" in Chinese characters (or anything else, such as "test") together with a site: restriction for that website. If the site is on the blacklist, zero results will be returned or displayed, and a warning message will show up. For example, try http://www.google.cn/search?hl=zh-CN&q=%E4%B8%AD%E5%9B%BD+site%3A.epocht...
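The website check can be sketched in Python as follows. The sketch only builds the query URL in the same shape as the example above and encodes the decision rule from the text; it makes no network request, and the result count and warning flag are meant to be filled in by hand after viewing the page. The function names and the placeholder site `example.org` are hypothetical.

```python
from urllib.parse import urlencode

GOOGLE_CN_SEARCH = "http://www.google.cn/search"


def build_site_probe_url(site: str, probe_word: str = "中国") -> str:
    """Build a google.cn query for probe_word restricted to one site."""
    query = f"{probe_word} site:{site}"
    # urlencode percent-encodes the Chinese characters as UTF-8
    # and turns the space into "+".
    return GOOGLE_CN_SEARCH + "?" + urlencode({"hl": "zh-CN", "q": query})


def site_looks_blacklisted(result_count: int, warning_shown: bool) -> bool:
    """Apply the rule from the text: zero results plus the warning message."""
    return result_count == 0 and warning_shown


if __name__ == "__main__":
    # example.org is a placeholder, not a site tested in the report.
    print(build_site_probe_url("example.org"))
    # Observations are entered manually after opening the URL in a browser.
    print(site_looks_blacklisted(result_count=0, warning_shown=True))
```

Requiring both conditions keeps a site that simply has no matching pages from being counted as blacklisted.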
To identify a blacklisted keyword, one can go to www.google.cn and search for the keyword within "all websites". If google.cn resets the search scope to "pages in China", the keyword is blacklisted.
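The keyword check reduces to comparing the scope the user asked for with the scope the results page ends up showing. Below is a minimal Python sketch of that comparison. The `Scope` enum and the function name are illustrative assumptions, and the observed scope is recorded by hand, since the text does not say how the restriction appears in the page or the URL.

```python
from enum import Enum


class Scope(Enum):
    ALL_WEBSITES = "all websites"
    PAGES_IN_CHINA = "pages in China"


def keyword_looks_blacklisted(requested: Scope, shown: Scope) -> bool:
    """Return True if a search made within 'all websites' came back
    with its scope reset to 'pages in China'."""
    return requested is Scope.ALL_WEBSITES and shown is Scope.PAGES_IN_CHINA


if __name__ == "__main__":
    # Hypothetical observation: the scope was silently reset.
    print(keyword_looks_blacklisted(Scope.ALL_WEBSITES, Scope.PAGES_IN_CHINA))
    # Hypothetical observation: the scope was left as requested.
    print(keyword_looks_blacklisted(Scope.ALL_WEBSITES, Scope.ALL_WEBSITES))
```

A search that the user deliberately limited to "pages in China" is not flagged, because no reset took place.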