Cross-Lingual Linking in the Blogosphere 

This article examines linguistic barriers and cross-lingual interaction in the blogosphere. The study finds that blogs link mostly within their own language, although some cross-lingual linking and human translation does occur.

I. Definition.
Two concepts are addressed in this article.
1.    Linking Patterns:
2.    Modularity:

Modularity is a measure of “the goodness of fit” of a given partitioning of a network and can be used as a measure of language group insularity and an operationalization of the separation between language groups.

II. The study:
This research analyzes linguistic barriers and cross-lingual interaction through link analysis of more than 100,000 blogs discussing the 2010 Haitian earthquake in English, Spanish, and Japanese. In addition, cross-lingual hyperlinks are qualitatively coded. This study finds English-language blogs are significantly less likely to link cross-lingually than Spanish or Japanese blogs. However, bloggers' awareness of foreign language content increases over time. Personal blogs contain most cross-lingual links, and these links point to (primarily English-language) media. Finally, most cross-lingual links in the dataset signal a citation or reference relationship while a smaller number of cross-lingual links signal a translation. Although most bloggers link to other blogs in the same language, the dataset reveals a surprising level of human translation in the blogosphere.

The study presents its findings in four charts.

A. Language Distribution

B. Dataset Hyperlinks

The links between language groups, given in the table, show each language group is highly insular in its linking pattern as predicted by the literature and the pilot study. The diagonal of the table, which represents links within the same language group, contains 94% of the hyperlinks in the dataset, demonstrating the relevance of homophily to language.
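The diagonal share described above can be computed directly from a language-by-language table of link counts. The following is a minimal sketch only: the function name `within_group_share` and the counts in `toy_links` are hypothetical and chosen for illustration, not taken from the study's table.

```python
def within_group_share(link_counts):
    """Return the fraction of links whose source and target share a language.

    link_counts maps a source language to a dict of target language -> count,
    i.e. one row of the table per source language.
    """
    total = sum(sum(row.values()) for row in link_counts.values())
    same_language = sum(link_counts[lang].get(lang, 0) for lang in link_counts)
    return same_language / total


# Hypothetical counts for illustration only (not the study's data).
toy_links = {
    "en": {"en": 900, "es": 20, "ja": 10},
    "es": {"en": 40, "es": 500, "ja": 5},
    "ja": {"en": 30, "es": 5, "ja": 490},
}

share = within_group_share(toy_links)
print(f"Within-language share of links: {share:.1%}")
```

A share close to 1 means almost all links stay inside a language group, which is the pattern the table in the study reports.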

C. Relationship of Cross-Lingual Linking Blogs

Modularity measures how the network deviates from a network of the same number of nodes and community divisions but with random edges and has been justified previously as a measure of polarization. In the present context, the lowest possible modularity score (0.0) represents no separation between language groups (the language groups are linked together as much as in a random network), and the highest score (1.0) represents the most separation between language groups (i.e. no cross-lingual links). For this dataset, the modularity score for the entire network is 0.51, which indicates that language is a strong dividing force (Newman & Girvan, 2004). English is the most insular of the three language groups and accounts for 42% of the modularity score. Spanish accounts for 37% of the score, and Japanese accounts for 21%.

D. Origins and Destinations of Cross-Lingual Hyperlinks

Most blogs (54%) containing cross-lingual links were classified as personal. Personal blogs contain the largest number of cross-lingual links for both Spanish (29% of all cross-lingual links) and Japanese (23% of all cross-lingual links); for English, however, group blogs contain the largest number of cross-lingual links. English-language groups author 6% of all cross-lingual hyperlinks in the dataset, and two-thirds of these links are created by Global Voices, a bridgeblogging community. Overall, Global Voices alone creates 15% of all cross-lingual links in the dataset. It accounts for half of the quotations and one-third of the translations in the dataset.

English-language media are the most central node in the network of cross-lingual links (figure available in the supplemental materials online). The largest single destination of cross-lingual links is to a collection of photos published by the Denver Post. After a link to this page was first posted on one Japanese blog, a wave of additional Japanese bloggers also linked to the same page. In fact, a large number of blogs (13.8%) share a photo or video directly on the page, while even more link to another blog specifically mentioning the multimedia content of the page.

Additional coding revealed several other aspects. Only 11.4% of cross-lingual links are between pages owned by the same organization. In addition, non-news blogs sharing a cross-lingual link often also share a common topic or theme (e.g. technology, automotive, music). Finally, a large number of blogs making cross-lingual links (15.1%) appear to be very similar to another cross-lingually linking blog in the set. These blogs have nearly the same text, images, and/or links as another blog in the dataset.

Human translation occurs in the blogosphere in a decentralized patchwork of mainly individuals and small groups. Communities that encourage translation are a particularly effective means to locate translations, avoid duplication, and provide support and encouragement. Bloggers seem to read one another and occasionally link to blogs referencing foreign content or to the foreign content itself. This demonstrates a growing awareness of foreign content that fluctuates over time, perhaps with the amount of content available. Bloggers writing in English link much less to foreign content than bloggers writing in either Spanish or Japanese. Although there is a substantial amount of content in English, the percentage of all Internet content in English is steadily declining, and human summarization and translation provide one way to communicate information between languages. This is particularly important for languages where machine translation performs poorly.

Class discussion: 
Do you read cross-lingual information? Share your thoughts on reading cross-lingual articles or news.

  • Yes, I do. When I want a brief overview, I read Chinese-language sources; when I want more detail, I look for English websites.
  • If I want to know about an incident that happened in America, I first read about it in my first language to get a summary of the news, then read English websites for more detail.