Japan's largest rail company has announced that it will partner with Hitachi, the Japanese multinational engineering and electronics conglomerate, to gather data from its e-ticketing system, strip it of identifying information such as names and addresses, and then sell it in bulk to third-party companies.
East Japan Railway (JR East) will be using travel history information from its e-ticketing system, called Suica, according to technology news website Ars Technica. JR East counts about 42 million Suica users. The company plans to “sell the information in the form of monthly reports to retailers, eating and drinking establishments, and real estate agencies that operate near the train stations,” according to Japanese business news site Nikkei.
A June 28, 2013 Nikkei post reports that Hitachi “will profile commuter activity at each train station by parameters like gender, age, and times of use, analyzing such things as the customer-drawing power of each station and the potential for business in the area.”
Takashi Yamaguchi, a JR East spokesman, told Computerworld, “There is no way to determine the identity of specific individuals from the data, so we feel there is no privacy issue.”
Selling e-ticket data stripped of personal information is legal, but social media users voiced concern and skepticism:
@hikita suicaビッグデータ。たしかに個人情報保護法に直接的に触れる個人情報は入っていないが乗降履歴は継続的に収集すれば個人を特定しうるということで実質的な個人情報として扱うことになるのではなかったか？現時点では問題ないとはいえかなり疑義 http://t.co/3AUfmqDMHn
Suica big data: it does not contain personal information as directly defined by the Personal Information Protection Act. But wasn't travel history supposed to be treated as de facto personal information, since it can identify an individual if it is collected continuously? Even if this is not a problem at present, I find it highly questionable.
一番の問題は、事前確認なしのデータ活用であるにも関わらずユーザーに対して説明しないことだが、もう一つ個別データを見ることができるのかどうかが気になるかな。再識別化できちゃうだろうし。→Ｓｕｉｃａ情報分析し販売 戸惑いも NHKニュース http://t.co/T8RcBLB8BK
— Yasuaki Madarame (@madarame) July 15, 2013
The biggest problem is that they are using the data without checking with users beforehand and without explaining it to them. I also wonder whether individual records can be viewed, since the data could probably be re-identified.
On social bookmarking site Hatena, user yoko-hirom compared the data to the identifying features people expose in everyday life:
yoko-hirom: 皆，顔も車のナンバーも隠さず交通機関や道路を利用している。しかし，個別に追跡・記録されたらどうか。Web だと閲覧履歴のトラッキング。それが販売される。不快感も嫌悪感も抱かない方がどうかしている。
Everybody uses public transportation and roads without hiding their face or license plate. But what if each person were tracked and recorded individually? On the Web, the equivalent is tracking of browsing history. And that data gets sold. It would be strange not to feel discomfort or disgust.
The use of personal information must comply with the Personal Information Protection Act, but there are no legal restrictions when the data is anonymized and does not include any identifiers. The Japanese government is planning to develop a legal framework for the use of personal data. On June 14, 2013, the Cabinet approved “世界最先端ＩＴ国家創造宣言”, tentatively translated as “a Declaration of Creating the World's Most Advanced IT Nation” [ja], which touches on balancing the use of personal data with privacy.
A recent keynote speech by the Deputy Director-General of the Ministry of Internal Affairs and Communications at an ICT forum in Okinawa [ja] addressed the need to utilize privacy-enhancing technology and develop rules for the use of personal data. His slide [ja] described how personal data should be used in the future, especially when the data is anonymized but leaves room for potential re-identification. The slide stated that personal data should be used only on the condition that it is properly anonymized, that safeguards ensure it will not be re-identified, and that third parties are banned by contract from re-identifying it.
While most users voiced concerns about the use of personal data, some didn't seem to mind.
Twitter user KeiPipe felt the use of the data did not matter to him:
便利になるならそれでいい。 僕個人に対する特別な興味があるわけじゃないだろうし – Reading: Ｓｕｉｃａ情報分析し販売 戸惑いも NHKニュース http://t.co/QpMFTdpdNO
— けぃ not 真人間 (@KeiPipe) July 16, 2013
I don't mind as long as things become more convenient. I doubt they have any special interest in me personally.
Atsushi Fukuda, a creative director of digital media content, shrugged, arguing that all kinds of information are out there anyway:
まぁ、あらゆる情報は分析されちゃうわな。／Ｓｕｉｃａ情報分析し販売 戸惑いも NHKニュース http://t.co/EeRaEOL1MI
— Atsushi Fukuda (@fukudadesuga) July 15, 2013
Oh well, I guess all kinds of information are destined to be analyzed.
For Twitter user maipy1Q70, the issue is less about privacy than about the profits from big data not being shared with the people who generate it:
— H. Imai (@maipy1Q70) July 14, 2013
This is becoming a nasty example. Aside from privacy, the problem is that there is no clear evidence that Hitachi can exploit the data exclusively and as it wishes. There is also no redistribution of profit to the masses, or to the users who are the originators of these data sets.
Another Hatena user emphasized the lack of compensation for customers who are generating the data:
I will forgive them if they give a discount, say 20 to 30 percent off the fare, to customers who contribute data. I don't care much about the information I give off, but it upsets me that I won't be compensated [for contributing data sets].