“I'm turning Japanese, I think I'm turning Japanese, I really think so.”

JeffK [1] mentioned that over the past few days, whenever he views The Boston Diaries [2], his browser asks if he wants to download and install Japanese language support. I found the notion odd, but like some stories I've heard [3], computers can be affected by weird things, so it was remotely possible that, for whatever reason, his browser felt the need to install Japanese language support whenever my page was loaded.

So we head over to his computer and as he's bringing up my blog, it suddenly hits me why his computer is asking to install Japanese language support: my entry on the 14^th [4]! (Don't worry, it's fixed for now.)

When writing English, I was taught that you italicize foreign words. Easy enough to do in HTML (HyperText Markup Language), just slap some <I> tags around the word and be done with it. But semantically that doesn't really mean anything, what with the semantic web [5] being a current hot topic and all. While it's apparent to most readers that garçon is French and über is German, what about slumpmässig? Could be German for all you know (it's not—it's Swedish). By using the features inherent in HTML we can add semantics to foreign words beyond just italicizing them.

And that's what I do, in fact. For a foreign word like slumpmässig I'll encode it up like:

```
<I LANG="sv" TITLE="chance; luck, hazard">slumpm&auml;ssig</I>
```

Certain browsers, like MSIE (Microsoft Internet Explorer) and Mozilla [6], will display a tooltip with the text in the TITLE attribute, which is where I stick the translation of the word (if you happen to be using MSIE or Mozilla, try holding the mouse over a foreign word). An intelligently programmed HTML vocalizer (used, perhaps, by the blind to speak pages) can use the LANG attribute to recognize which language the word is written in and use that to guide the pronunciation.

Semantically much better than just <I>slumpm&auml;ssig</I>.

So, when I wrote that entry on the 14^th [7] I did what I've been doing now for some time and slapped some semantics around the Japanese terms.

The <I LANG="ja" TITLE="fan art">dojinshi</I> market .... <I LANG="ja" TITLE="comic book">Manga</I> publishers ...

They are Japanese terms after all.

Since I already seem to have Japanese language support installed, I didn't notice anything odd when I loaded the page to proofread the entry. But it seems that browsers without Japanese language support saw the language attribute for “Japanese,” realized the support wasn't installed, and decided to ask the user if it was okay to install it. But I'm using an Anglicized spelling of a Japanese word, so there's no real need to download Japanese language support for what I wrote. So how do I get around that?

That, I don't know. I'm fudging it right now by using LANG="x-ja", which is allowed: any language code starting with “x-” is for private use (it's intended for words like Nazgûl, which don't have an officially designated language), so it shouldn't trigger any download message from browsers. That, I suppose, is better than nothing.
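As a rough sketch, and assuming the two terms from that entry keep the same markup as before, the fudged version would look something like this. Only the LANG value changes, and the exact markup in the fixed entry may differ:

```html
<!-- Private-use language tag: "x-ja" instead of "ja" -->
<I LANG="x-ja" TITLE="fan art">dojinshi</I>
<I LANG="x-ja" TITLE="comic book">Manga</I>
```

The TITLE attribute is untouched, so the translation should still show up as a tooltip in browsers that display one.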

Update on Saturday, September 23^rd, 2023

I think it's more semantically correct to use the <I> tag than the <SPAN> tag to mark foreign words, so I'm going back and making that change.

[1] http://www.livejournal.com/users/j3ff

[2] https://boston.conman.org/

[3] /boston/2002/12/26.1

[4] /boston/2003/01/14.2

[5] http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q=semantic+web&btnG=Google+Search

[6] http://www.mozilla.org/

[7] /boston/2003/01/14.2
