Archive for the 'News' Category

Facebook’s Newest Machine Translation Tool Falls Flat

Just in case you need more proof that most machine translation tools don’t quite cut the mustard, the latest attempt by social media giant Facebook to incorporate machine translation (MT) into its platform fails miserably with most languages.

In an effort to help pages connect better with their fan base—often found scattered across the globe—Facebook recently introduced machine translation, powered by Bing. With just one click, users can get an automatic translation of status updates and comments. Facebook plans to roll out this feature to all profiles (not just pages) in the near future.

There’s just one problem: most of the translations are unintelligible. Posts on social media sites like Twitter and Facebook are rife with slang, and Bing’s machine translation tool simply isn’t up to the task. An analysis of Bing’s performance by the site Lexiophiles shows that Bing gets it right only about 50% of the time, leaving users confused and, at worst, misled by faulty translations. Interestingly, of the ten languages tested, posts translated from Spanish to English fared worst of all, with less than 10% of the rendered text considered intelligible.

For greater accuracy, Facebook will also be implementing a feature that allows bilingual users to offer an alternative translation. If other users endorse the accuracy of the crowdsourced translation, it will take the place of Bing’s original translation the next time the “Translate” option is clicked. Page administrators will be able to manage crowdsourced translations through a “manage translations” link below the posts on pages they control.

One of the great arguments in favor of MT has always been that it can at least offer users a gist of the conversation when no other means of translation is available. It seems that, at least for now, Bing’s tool doesn’t even offer that to Facebook’s users.

Spelling Errors Cost Companies Millions in Lost Sales

Poor grammar, errant punctuation, and typographical errors all spell disaster for online sales. When it comes to Internet sales and marketing, shoddy websites filled with spelling gaffes rarely get a second chance to impress. In fact, most visitors make up their minds about the quality of a website in just six seconds. According to British Internet entrepreneur Charles Duncombe, just one spelling error on a company’s website can lead to a 50% decrease in online sales. He estimates that Internet retailers lose millions every week due to spelling slip-ups.

Consumers treat attention to detail in spelling and grammar as an important indicator of a website’s credibility. Spelling mistakes and poor grammar sound alarm bells for potential customers concerned about spam or phishing. While there seems to be a more tolerant attitude toward spelling and grammar mistakes on social media sites such as Facebook, websites marketing products and services can’t afford to take a lax approach to spelling blunders.

For more information on this story, visit BBC News.

Bilingualism Delays Onset of Alzheimer’s Disease

Ellen Bialystok, a cognitive neuroscientist and research professor of psychology, has made the study of bilingualism her life’s work. After nearly 40 years of research, she has discovered that regularly speaking two languages offers a number of benefits, including a delay in the onset of Alzheimer’s disease.

Dr. Bialystok found that there’s a significant difference between monolinguals and bilinguals in terms of how they process language. Bilingualism sharpens the mind’s ability to maintain multiple pieces of information in play and switch between them, effectively improving one’s multitasking skills.

A study published by Dr. Bialystok in 2004 showed that normally aging monolingual individuals experience a more pronounced loss of cognitive functioning than their normally aging bilingual counterparts. In later studies, the records of 400 Alzheimer’s patients were examined. The findings revealed that while bilingualism didn’t prevent Alzheimer’s disease, those who spoke two languages manifested symptoms five to six years later than the monolinguals.

For more information, read “The Bilingual Advantage” on NYTimes.com.

Dirae: The Latest Tool to Search for Terms in Spanish

Spanish speakers and students of the Spanish language now have one more handy tool at their disposal. The Real Academia Española (RAE) – the official institution responsible for policing the Spanish language – recently released the online tool known as Dirae, based on the RAE’s Diccionario de la lengua española (Spanish language dictionary). Unlike traditional dictionaries, Dirae functions as a reverse dictionary, enabling users to find words based on a set of general concepts.

Using carefully chosen search terms, Dirae also functions as an associative thesaurus, etymological search tool, and synonym finder. For example, by entering the search terms “‘del quechua’ maíz,” the tool will return Spanish words etymologically based in the Quechua language that are related to corn. Read more about this new tool and view examples of its use here [in Spanish].


Language an Obstacle for Internet Users in European Union

A recent study conducted by Eurobarometer, the European Commission’s survey research program, found that more than 50% of Internet users in the European Union (EU) sometimes access the web using a language other than their mother tongue. In addition, the study revealed that 90% of EU Internet users show a preference for websites featured in their native language.

Nonetheless, 44% of survey respondents sensed that they were “missing something interesting online” since a number of websites display information in a language they don’t comprehend.

Neelie Kroes, the European Commissioner for Digital Agenda, wrote, “If we are serious about making every European digital, we need to make sure that they can understand the web content they want. We are developing new technologies that can help people that cannot understand a foreign language.”

At the present time, the European Commission is funding 67 million euros’ worth of research projects to enhance translation techniques for online content, including the site iTranslate4, which generates machine translations of several European languages.

For more information on this topic, read this article on the news site Deutsche Welle.

Google Strikes Deal to Translate European Patents

Last week Google announced an agreement with the European Patent Office (EPO) to translate approximately 50 million patents using the search giant’s Internet-based translation tool, Google Translate. Google and authorities at the EPO will collaborate to translate patents into 32 different languages.

Patent researchers, scientists and others will be able to conduct searches for patents in German, French and English, the patent authority’s three official languages. The EPO site’s users may then obtain an instant translation of the patent documentation into languages such as Russian, Japanese or Spanish. It’s important to note that these translations are being made available purely for research and information purposes; they are in no way meant to substitute for official patent translations done by professional translators, as mandated by law.

The EPO will grant Google access to all previously translated patents, which amount to some 1.5 million documents in addition to 50,000 new patents per year.

Officials at the patent office expect the project to be finalized by 2014.

For more information, visit EPO.org.


“Refudiate” Chosen as 2010 Word of the Year

The New Oxford American Dictionary mulled over pages’ worth of new candidates for the 2010 Word of the Year. Although the technology sector contributed a considerable number of terms to 2009’s field of contenders, this year seemed more heavily influenced by politics, the economy, and current events with words like “Tea Party,” “bankster,” “double-dip” and “top kill.” Technology did manage to chip in with words like “webisode,” “crowdsourcing” and “retweet.”

So, which new word garnered the top spot? “Refudiate” – a word coined by controversial U.S. politician Sarah Palin – was awarded the title of 2010 Word of the Year by the lexicographers at Oxford. The word, a verb “used loosely to mean ‘reject,’” resulted from a blending of the words “refute” and “repudiate.”

For a complete list of the words considered for the 2010 Word of the Year along with their definitions, have a look at this article from the Oxford University Press blog.

Endangered Languages Open Database Launched Online

An open database of endangered languages has been launched by researchers in the hope of creating a free, online portal that will give people access to the world’s disappearing spoken traditions.

The database has been developed by researchers at the World Oral Literature Project, based at the University of Cambridge, and is now available on the project’s website, http://www.oralliterature.org/.

It includes records for 3,524 world languages, from those deemed “vulnerable”, to those that, like Latin, remain well understood but are effectively moribund or extinct.

Researchers hope that the pilot database will enable them to “crowd-source” information from all over the world about both the languages themselves and the stories, songs, myths, folklore and other traditions that they convey.

Users can search by the number of speakers, level of endangerment, region or country. In the United Kingdom, the site lists 21 disappearing languages, ranging from the relatively well known, like Scots and Welsh, to obscure forms such as Old Kentish Sign Language.

Where possible, the research team has also included links to online resources and recordings so that users can find out more. Their hope is that by making an early version of the database open to all, more people will come forward with information and references to recordings that they have missed.

Dr Mark Turin, Director of the World Oral Literature Project, said: “We want this database to be a dynamic and open resource, taking advantage of online technology to create a collaborative record that people will want to contribute to.”

At present, the world has more than 6,500 living languages, of which up to half will cease to exist as spoken vernaculars by the end of the century. In most cases, their disappearance is a by-product of globalisation, or rapid social and economic change. The World Oral Literature Project aims to document and make accessible these spoken traditions before they are lost without record.

Three existing datasets are raising awareness about the number of languages under threat: the online Ethnologue, the UNESCO Atlas of the World’s Languages in Danger and innovative work by conservation biologist Professor William Sutherland in the Department of Zoology at the University of Cambridge. Each, however, evaluates the risk and the problem differently, with varying results.

“While some severely endangered languages have been well documented, others, which may appear to be less at risk, have few, if any, records,” Dr Turin said. “Here in Cambridge we are interested not only in language endangerment levels but also in what might be called a ‘documentation index’. To this end, we are locating references to and recordings of oral literatures in collections around the world.”

“At the moment if you’re a researcher, a member of an endangered speech community or just an interested member of the public, there is no way to pull all these useful but disparate resources together in one place. We wanted to create a resource that does just that, and also build something that can be developed and expanded further to encourage other people to submit additional information. At present, the database allows us to pose comparative research questions about which languages are closest to extinction and where the records are.”

Of the 3,524 languages listed, about 150 are in an extremely critical condition. In many of these cases, the number of known living speakers has fallen to single figures, or even just one.

Examples include the Southern Pomo language, spoken by Native Americans in parts of California; Gamilaraay, the language of the Kamilaroi of New South Wales; and the language of the Sami communities based in northwestern Russia.

The entries specific to the United Kingdom include Manx, Cornish and Old Kentish Sign Language – a precursor to the generic British version which Samuel Pepys, among others, referred to in his famous diary.

Another disappearing oral tradition in the United Kingdom is Polari, a form of slang once used by the likes of actors and circus or fairground communities, and which was then adopted by gay subcultures as a type of code language.

Elsewhere in Europe, the endangered languages list includes a version of Low Saxon spoken in the north-eastern Netherlands; Mocheno (a Germanic language used in north Italy); and Istriot, which is spoken on the Croat coast and has about 1,000 speakers left.

The database also covers extinct languages about which enough is known through existing records to render them visible. In some cases this may be because the speech form died out very recently, as is the case with Laghu, which was spoken on Santa Isabel in the Solomon Islands and disappeared in 1984. In other scenarios, the language ceased to be spoken long ago but is still well known or used in a specialised setting, as with both Latin and Ancient Greek.

The database is being launched to coincide with a workshop at the Centre for Research in the Arts, Social Sciences and Humanities (CRASSH) at the University of Cambridge on December 10 and 11, which will bring together researchers to discuss some of the key issues surrounding the dissemination of oral literature through traditional and online media.

More information about both the database and the World Oral Literature Project can be found at http://oralliterature.org/. The pilot database was made possible by a Small Research Grant from the British Academy with additional funding from the Chadwyck-Healey Charitable Trust.

Source: University of Cambridge


Some Linguistic Features May Be Genetically Linked

Linguistic researchers came to a consensus some time ago that humans’ capacity for language is genetically hardwired. It would seem, however, that with over 7,000 languages currently spoken worldwide, this hardwiring of language is rather flexible. It’s no wonder, then, that scientists recently began to explore whether some of this tremendous linguistic diversity can be attributed to genetics.

Linguists at the University of Edinburgh investigated whether there exists a genetic difference between speakers of non-tonal and tonal languages (i.e. languages that use pitch in addition to consonants and vowels to impart meaning). The researchers discovered a correlation between genes and tonality, and based on their findings, they further hypothesized that linguistic tone is influenced by a pair of genes that affect brain growth and development. Further scientific study has shown that certain linguistic features such as tone and the presence of front rounded vowels may indeed be “genetically anchored,” and as such, these features are much more likely to persist over time.

For more information on this topic, visit The Economist.

World Cup Attracts Multilingual Audiences

The FIFA World Cup – soccer’s premier sporting event – is coming to African soil for the first time in history. South Africa will play host to 32 national soccer teams and 350,000 foreign visitors during the month-long event, requiring that both linguistic and cultural barriers be bridged for the global cast of players, organizers and fans who will be in attendance.

Demand for website localization, translation, editing and voice-over projects has increased exponentially in advance of the international sporting event. Written translations for the World Cup are mandatory for each of the 11 official languages of South Africa, in addition to the languages of the participating teams. In an effort to reach as many fans as possible, the World Cup website has been translated into Arabic, English, French, German, Spanish, and Portuguese.

Organizers predict that 450 to 500 million viewers from around the globe will tune in to this year’s World Cup.

For more information, see this article at Global Watchtower.
