- Global Voices - https://globalvoices.org -

A “Robot” for Analyzing the Persian Blogosphere

Categories: Middle East & North Africa, Iran, Ideas, Technology

Arash Kamangir [1] is a very active Canada-based Iranian blogger who has contributed to several internet projects such as Balatarin [2], an Iranian version of Digg [3]. He has just launched an innovative project [4] for analyzing the Persian blogosphere that provides valuable new information about Iranian blogs.

Q: Please tell us briefly about yourself, your blogs and your recent project. What is it all about? When did you start it, and what do you mean when you say it is a robot?

I write under the name Arash Kamangir. That's because Kamangir means “Archer” in Persian, and Arash is a mythical Iranian archer who once saved Iran's integrity (the whole story is mentioned in Wikipedia [4]). I have an English blog at kamangir.net [1] and a Persian blog at persian.kamangir.net [5]. There is also a photography blog at photo.kamangir.net [6].

I have given my recent project the Persian title KiBeKi [4], which roughly means “What's up?” in Persian. It is a project I had been thinking about for a long time, but I only started writing the code at the beginning of December. When I started the work I did it to satisfy my personal curiosity, but as time passed it grew into a bigger project than I had expected. In short, I am developing a piece of code which starts from one blog and follows links outward to find the connection pattern in the Persian blogosphere. I call it a robot because that is the name people have given to pieces of code which crawl the web for information (Wikipedia [7]).
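Editor's illustration: the interview does not show KiBeKi's code, so the following is only a minimal sketch of the general idea of a crawler that starts from one blog and records who links to whom. The function `crawl`, the `fetch_links` callback, the `max_pages` limit and the toy `TOY_WEB` data are all hypothetical names introduced here; a real robot would download pages and extract links instead of reading a dictionary.

```python
from collections import deque
from typing import Callable, Iterable, Set, Tuple


def crawl(
    seed: str,
    fetch_links: Callable[[str], Iterable[str]],
    max_pages: int = 1000,
) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Breadth-first walk from `seed`.

    Returns the set of discovered sources and the set of directed
    (from, to) links, i.e. the "connection pattern".
    """
    visited = {seed}
    edges: Set[Tuple[str, str]] = set()
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in fetch_links(page):
            # Every link is recorded, even if the target is never visited.
            edges.add((page, target))
            # Only enqueue new sources while under the (assumed) page limit.
            if target not in visited and len(visited) < max_pages:
                visited.add(target)
                queue.append(target)
    return visited, edges


# Hypothetical toy link graph standing in for real blog pages.
TOY_WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
}

visited, edges = crawl("a.example", lambda url: TOY_WEB.get(url, []))
```

Capping the number of discovered sources is one simple way to keep the in-memory structures from growing without bound during a long run.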

Q: What are the results of your research so far?

I still face a lot of difficulties, some of which are caused by the huge number of Persian blogs the code succeeds in finding. The latest figures indicate that KiBeKi has spotted 130,000 Internet sources, 16,000 of which are Persian blogs that have been thoroughly analyzed. The robot has also determined that 75,000 more of the sources are indeed Persian blogs, but these await further processing. This is the result of a few days of free running, which I had to interrupt because my data structures were blowing up and degrading performance. I have since been working heavily on the technical side of the code.

As a very early result, I looked at the email services Persian bloggers tend to use. Of the more than 13,000 email addresses that KiBeKi has discovered in Persian blogs, more than 75% are Yahoo! addresses and 12% are Gmail addresses.

Statistics on email addresses
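Editor's illustration: a breakdown like this one could be computed roughly as follows. This is a sketch under assumptions, not KiBeKi's method: the regular expression is a deliberately simple approximation of an email address, and `EMAIL_RE`, `email_domain_shares` and the sample pages are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Simplified pattern; real-world address syntax is more permissive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def email_domain_shares(pages: Iterable[str]) -> Dict[str, float]:
    """Return each email domain's share of the unique addresses found."""
    addresses = set()
    for text in pages:
        for match in EMAIL_RE.finditer(text):
            # Lowercase so the same address is not counted twice.
            addresses.add(match.group(0).lower())
    counts = Counter(addr.split("@", 1)[1] for addr in addresses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {domain: n / total for domain, n in counts.items()}


# Made-up sample text, not data from the project.
sample_pages = [
    "Contact: ali@yahoo.com or ALI@yahoo.com",
    "sara@gmail.com",
    "reza@yahoo.com",
    "Write to mina@yahoo.com.",
]
shares = email_domain_shares(sample_pages)
```

Counting unique addresses rather than raw occurrences avoids inflating the share of a domain whose addresses simply appear on many pages.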

Also, looking at Persian blogging services such as Blogfa [8] and Persianblog [9], the statistics for the 16,000 blogs found so far indicate that Blogfa dominates the scene with about three-fourths of the blogosphere. These are indeed early results, and the code needs to run for weeks or months to give us a more realistic picture.

Statistics on blog service providers
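Editor's illustration: one plausible way to attribute a blog to a hosting service is to look at the hostname of its URL. The sketch below assumes the domain suffixes `blogfa.com` and `persianblog.ir`, which are not given in the interview, and the names `PROVIDER_SUFFIXES`, `provider_of` and `provider_shares` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable
from urllib.parse import urlparse

# Assumed mapping from hostname suffix to service name.
PROVIDER_SUFFIXES = {
    "blogfa.com": "Blogfa",
    "persianblog.ir": "Persianblog",
}


def provider_of(url: str) -> str:
    """Name the blogging service hosting `url`, or "other"."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, name in PROVIDER_SUFFIXES.items():
        # Match the bare domain or any subdomain of it.
        if host == suffix or host.endswith("." + suffix):
            return name
    return "other"


def provider_shares(urls: Iterable[str]) -> Dict[str, float]:
    """Return each service's share of the given blog URLs."""
    counts = Counter(provider_of(u) for u in urls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Made-up sample URLs, not data from the project.
sample_urls = [
    "http://a.blogfa.com/",
    "http://b.blogfa.com/post/1",
    "http://c.blogfa.com",
    "http://d.persianblog.ir/",
]
service_shares = provider_shares(sample_urls)
```

Requiring a leading dot before the suffix keeps an unrelated domain that merely ends in the same letters from being counted as part of the service.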

Q: Did the findings/results of your robot surprise you?

Yes, they did. First of all, I always had an idea of how populated the Persian blogosphere is, but I am still surprised by the number of people who are out there writing about their experiences in their blogs. Also, the patterns were very interesting.

Q: What are the objectives of your research and how do you want to develop them?

First, I aim to find the population of the Persian blogosphere and its connection pattern. This will concern all Persian blogs that exist, regardless of their nature and how often they are updated. Then, in the second phase, I'll work on determining the volume of activity of blogs by reading timestamps. This will give me a better understanding of the Persian blogosphere and will help reject dead blogs. Nevertheless, more than anything else, this is a preliminary step in helping other researchers.
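Editor's illustration: the second phase described above, rejecting dead blogs by reading post timestamps, could be sketched as follows. The 180-day silence threshold and the names `is_active` and `active_blogs` are assumptions made for this example; the interview does not say what rule the project will use.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def is_active(
    post_times: List[datetime],
    now: datetime,
    max_silence: timedelta = timedelta(days=180),  # assumed threshold
) -> bool:
    """A blog counts as active if its latest post is recent enough."""
    if not post_times:
        return False
    return now - max(post_times) <= max_silence


def active_blogs(
    timestamps: Dict[str, List[datetime]], now: datetime
) -> List[str]:
    """Keep only the blogs that pass the activity check, sorted by name."""
    return sorted(name for name, times in timestamps.items() if is_active(times, now))


# Made-up sample timestamps, not data from the project.
reference_time = datetime(2008, 1, 1)
sample_timestamps = {
    "fresh.example": [datetime(2007, 6, 1), datetime(2007, 12, 1)],
    "stale.example": [datetime(2006, 1, 1)],
    "empty.example": [],
}
alive = active_blogs(sample_timestamps, reference_time)
```

Passing the reference time in explicitly, rather than reading the system clock inside the function, makes the check reproducible for a crawl that took place on a known date.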

Q: What is the added value of your project for other researchers?

I know that work has been done on patterns in the Persian blogosphere. People have worked on clusters and on the statistics of activity in the Persian blogosphere. But KiBeKi, to my knowledge, is the first all-inclusive investigative tool which attempts to gather information from the blogosphere to this extent. Therefore, I am looking for people who would be able to use the data I gather in their ongoing research. This is provided that we work out a good way of protecting the privacy of bloggers, of course.

Q: Do you know about similar projects?

On smaller scales and based on manual browsing, yes, but not as an automated procedure. I have not done extensive research, though.

Q: How do you evaluate Iranian blogs’ impact in Iranian society?

Big and even bigger. Not that every single Iranian reads blogs. But I am amazed how conveniently we cross the red lines put in place by the state to discuss issues of interest. There is a very long way to go, but this phenomenon is fantastic in that it offers a method to think together and freely discuss issues of mutual interest.

Q: Any ideas to share with the Global Voices audience?

One of my hobbies these days is to go inside the database of KiBeKi and look at random blogs. I am amazed how many fantastic Persian blogs are out there, most of which do not have the readership they deserve. The Persian blogosphere was initially shaped around a few famous blogs. This is gradually turning into a more scattered pattern of readership. I believe that anyone who wants to know about the Persian blogosphere has to refrain from restricting themselves to the “first-generation” blogs and must try to go deeper than that.