💾 Archived View for dioskouroi.xyz › thread › 29441857 captured on 2021-12-04 at 18:04:22. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

The ruby HTML Element

Author: alin23

Score: 109

Comments: 33

Date: 2021-12-04 16:28:23

________________________________________________________________________________

dalke wrote at 2021-12-04 17:54:18:

Interesting!

I noticed "It can also be used for annotating other kinds of text" and wanted to experiment with being able to number specific letters in a string.

More specifically, SMILES is a linear molecular structure notation. "O" is water, "CCO" is ethyl alcohol, "c1ccccc1" is a benzene ring, and much more. See:

https://en.wikipedia.org/wiki/Simplified_molecular-input_lin...

I want to annotate atom positions in the SMILES string. I currently do this by placing text over the string (with smiview, "pip install smiview"), as in this example using phenol:

  1 23456 7
  c1ccccc1O

I wanted to try it with ruby so I used:

  <ruby>
  c<rt>1</rt>1c<rt>2</rt>c<rt>3</rt>c<rt>4</rt>c<rt>5</rt>c<rt>6</rt>1
  </ruby>

The "2" is located over the center of "1c" instead of over the second "c" like I wanted.

How do I get it to center only over the "c"?

I tried changing the CSS too, using this catch-all:

  ruby {
      font-size: 2em;
      ruby-align: center;
      text-align: center;
  }
  * {ruby-align: center;}

No luck. I also tried wrapping things in a span, like:

c<rt>1</rt>1<span>c<rt>2</rt></span>c<rt>3</rt>

but got the 3 as a new ruby line, centered over the "1cc", itself with a ruby "2" between the second and third "c".

I tried other combinations of <span>, to no avail.

vore wrote at 2021-12-04 18:39:27:

Multiple ruby spans, perhaps?

  <ruby>
   c<rt>1</rt>
  </ruby>1<ruby>
   c<rt>2</rt>
  </ruby><ruby>
   c<rt>3</rt>c<rt>4</rt>c<rt>5</rt>c<rt>6</rt>1
  </ruby>

dalke wrote at 2021-12-04 19:54:50:

Yes, that was the solution. Thanks!

gvx wrote at 2021-12-04 19:19:21:

As far as I can tell, the only content inside a ruby tag should be annotated text or its annotation; the 1, which should not be annotated, should not be inside the <ruby> tag.

I get a good result with this:

  <ruby>c<rt>1</rt></ruby>1<ruby>c<rt>2</rt>c<rt>3</rt>c<rt>4</rt>c<rt>5</rt>c<rt>6</rt></ruby>1
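
For longer strings, markup like this could be generated rather than typed by hand. The following is only a sketch: `annotate_smiles` and `ATOM` are hypothetical names, and the regular expression is a naive stand-in for a real SMILES parser (it only recognizes bracket atoms, Cl, Br, and the one-letter organic-subset atoms). Each atom gets its own <ruby> element, and ring-closure digits, bonds and parentheses stay outside.

```python
import re
from html import escape

# Naive atom pattern (an assumption, not a full SMILES grammar):
# bracket atoms first, then two-letter halogens, then one-letter atoms.
ATOM = re.compile(r"\[[^\]]*\]|Br|Cl|[BCNOPSFIbcnops]")


def annotate_smiles(smiles: str) -> str:
    """Wrap each atom in its own <ruby> element, numbered from 1."""
    parts = []
    pos = 0
    for index, match in enumerate(ATOM.finditer(smiles), start=1):
        # Text between atoms (ring digits, bonds, parentheses) is not annotated.
        parts.append(escape(smiles[pos:match.start()]))
        parts.append(f"<ruby>{escape(match.group())}<rt>{index}</rt></ruby>")
        pos = match.end()
    parts.append(escape(smiles[pos:]))
    return "".join(parts)
```

Calling `annotate_smiles("c1ccccc1O")` should give seven <ruby> elements, with both ring-closure "1" characters left unannotated between them.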

alin23 wrote at 2021-12-04 19:26:43:

Indeed, this looks quite good:

https://cln.sh/5MWsIR

dalke wrote at 2021-12-04 19:31:35:

Thanks to you for the working demo, and to gvx for figuring it out!

UPDATE: Here's an example for theobromine -

https://jsfiddle.net/j84z1kyb/

It _looks_ great!

But it's not as useful as I hoped it would be. Copy&paste captures the numeric annotations. I probably should have expected that, but didn't.

And highlighting is wonky. In Safari and Firefox, I seem to get a character and the ruby annotation for the character next to it, more often than I do the one overhead.

derefr wrote at 2021-12-04 20:25:23:

> Copy&paste captures the numeric annotations. I probably should have expected that, but didn't.

That's very likely a browser bug. Think about the accessibility implications (for e.g. screen readers) if ruby text was _supposed_ to be modelled in the DOM as being interpolated into the text it's annotating.

I'm not sure what the standard says about how it should be treated, but my guess is that each annotation should be thought of as an _alternative_ for the text it annotates, à la the image "alt" and "srcset" attributes, or à la videos with multiple audio tracks in the same language, where one of those is Described Video or director's commentary or whatever.

In other words, the "correct" behavior would likely be that your browser knows the user's language prefs, and then chooses to select (or copy, or speak, etc.) _either_ the text _or_ its annotation, depending on which one the user is more likely to be able to read/understand.

yorwba wrote at 2021-12-04 22:48:44:

> Think about the accessibility implications (for e.g. screen readers) if ruby text was _supposed_ to be modelled in the DOM as being interpolated into the text it's annotating.

The <rp> tag exists explicitly for the purpose of interpolating the annotations into the text. So e.g. <ruby>漢字<rp>（</rp><rt>かんじ</rt><rp>）</rp></ruby> will look like 漢字（かんじ） to a client that doesn't support ruby text and a screenreader could read it the same way it would read any other text with parenthetical annotations in it.
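
To see what a client with no ruby support would be left with, one can drop the tags and keep only the text. This is a rough sketch using Python's standard `html.parser`; `TextOnly` and `fallback_text` are hypothetical helper names, and real clients may of course differ in the details.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect every text node and ignore all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def fallback_text(markup: str) -> str:
    """Return the markup's text content with all tags removed."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


markup = "<ruby>漢字<rp>（</rp><rt>かんじ</rt><rp>）</rp></ruby>"
print(fallback_text(markup))  # 漢字（かんじ）
```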

The standard doesn't actually say what screenreaders are supposed to do, so I guess they could also try something fancy. But they don't have to.

alin23 wrote at 2021-12-04 20:17:58:

Whoa that's incredible!

Yes, selection does also highlight the annotations but the good part is that copying ignores them.

I just selected the whole formula, pressed Cmd-C, Cmd-V and got this:

        Cn1cnc2c1c(=O)[nH]c(=O)n2C

dalke wrote at 2021-12-04 20:24:36:

Interesting. My Cmd-C, Cmd-V in Safari gives:

          C1n21c3n4c52c61c7(=O8)[nH]9c10(=O11)n122C13

but in Firefox gives:

          Cn1cnc2c1c(=O)[nH]c(=O)n2C

ahmedfromtunis wrote at 2021-12-04 20:31:16:

Not with Chrome for Android, though. Here's what the clipboard captured instead: C1n21c3n4c52c61c7(=O8)[nH]9c10(=O11)n122C13

kingcharles wrote at 2021-12-04 21:22:42:

My guess is that only the paste is broken, not the copy.

If the box you were pasting into supported annotations then it would paste perfectly. Pasting into a plain text text-area field leaves the browser with a hard choice to make on how to interpret the data in the clipboard when transliterating it into plain text.

dalke wrote at 2021-12-04 22:04:24:

I pasted into an iTerm2 terminal window running cat > /dev/null.

I just now tried pasting into a Jupyter notebook, and into the HTML entry box of the JSFiddle I linked to.

Again, copying in Safari and pasting to those elements includes the annotations.

Firefox does not.

Pasting to the terminal and pasting to a Jupyter notebook are my two primary expected paste destinations.
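
When the page source is at hand, the plain string could also be recovered without relying on the browser's copy behaviour. This is a sketch only: `BaseText` and `strip_annotations` are hypothetical names, and it assumes every <rt> and <rp> has an explicit closing tag.

```python
from html.parser import HTMLParser


class BaseText(HTMLParser):
    """Collect text that is not inside an <rt> or <rp> element."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("rt", "rp"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("rt", "rp") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only base text; annotation text is skipped.
        if not self.skip_depth:
            self.chunks.append(data)


def strip_annotations(markup: str) -> str:
    """Return the base text of ruby markup, dropping annotations."""
    parser = BaseText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)
```

For example, `strip_annotations("<ruby>C<rt>1</rt></ruby><ruby>n<rt>2</rt></ruby>1")` would return the unannotated `Cn1`.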

OJFord wrote at 2021-12-04 18:26:51:

I know nothing about this, but use a fixed width font perhaps (if you weren't)? Sounds like it could be because the '1' is narrower.

robin_reala wrote at 2021-12-04 19:00:29:

Weirdly, I’ve only used this in anger outside of Japanese text, to replicate a semantic layout in the original printing of _Tristram Shandy_ for Standard Ebooks. See book 3, chapter 11, the Latin version:

https://standardebooks.org/ebooks/laurence-sterne/the-life-a...

kingcharles wrote at 2021-12-04 21:18:55:

What does the "ruby" gloss mean in the linked text? Why do some words have it? It's been many a decade since I took Latin...

danschuller wrote at 2021-12-04 18:47:48:

This is cool.

It's something I've toyed with putting into my toy font renderers, but it always seemed to have a lot of edge cases. One is the ruby text overflowing the width of the parent: in some to most cases a little overflow is OK, but that's certainly not guaranteed. Scaling down the ruby text isn't the ideal solution because it quickly becomes unreadable. The other option is to scale the spacing in the parent text, which seems to be done for <ruby>境界面<rt>インターフェース</ruby> in the specification

https://html.spec.whatwg.org/multipage/text-level-semantics....

but then that's going to impact the line wrapping and so on. Kudos to the implementers!

kazinator wrote at 2021-12-04 20:05:16:

I used that for the furigana over the made-up words in a Jabberwocky translation.

http://www.kylheku.com/~kaz/gayabōkin.html

wodenokoto wrote at 2021-12-04 19:42:14:

The `<rp>` tag shown in the examples isn't explained on the page, but it is a fallback: something that should be rendered if the ruby tag is not understood.

Sadly, the rp page doesn't show any examples of what fallback behavior might look like.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/rp

shawnz wrote at 2021-12-04 19:49:51:

Simply remove all the tags to see how the fallback behaviour would look, for example data:text/html;charset=utf-8,%E6%BC%A2%20(kan)%20%E5%AD%97%20(ji)

antonkar wrote at 2021-12-04 20:50:46:

I used it in my old and free iOS web browser to put translation (Spanish, French…) or Pinyin on top of English or Chinese words

https://apps.apple.com/app/id932996489

nyuszika7h wrote at 2021-12-04 19:25:30:

When I read the headline I thought it's some weird new syntax for embedding Ruby code snippets in HTML.

This is definitely not going to be confusing... /s

7c7599bfe5df wrote at 2021-12-04 20:11:30:

It's been around for a decade in browsers[1], and the terminology itself predates the Ruby language.

You haven't been confused for the last 11 years, so this doesn't seem to have been a problem.

[1] The W3C spec is even older.

notreallyserio wrote at 2021-12-04 21:08:52:

> You haven't been confused for the last 11 years

You underestimate me! Or over.

kingcharles wrote at 2021-12-04 21:15:35:

Now I'm even more confused!

thrashh wrote at 2021-12-04 20:36:51:

I think I used it in 2005 so it’s definitely olddd.

alin23 wrote at 2021-12-04 19:28:32:

My initial title was:

  The <ruby> HTML element

But the ruby tag got stripped by HN and I ended up with

  The  HTML element

Izkata wrote at 2021-12-04 20:02:09:

Hum... Does it accept

    &lt;ruby&gt;

?

alin23 wrote at 2021-12-04 20:15:08:

Not really, it renders:

      The andlt;Ruby> HTML element

https://cln.sh/LteC0S

makach wrote at 2021-12-04 19:31:27:

I thought the same. Now that I've read the documentation, I think I will be fine, and confusion is less likely.

jagger27 wrote at 2021-12-04 19:35:33:

Is that support table right? IE5 supports it but it took until Firefox 38? Those came out in 1999 and 2015.

deaddodo wrote at 2021-12-04 19:52:42:

Yes. The Ruby tag was introduced by Microsoft in IE5, and then rolled into HTML5 during the standardization process.

kingcharles wrote at 2021-12-04 21:14:04:

I assume this was added as a way to implement language features such as Furigana, which is a minor, but useful, feature of written Japanese:

https://en.wikipedia.org/wiki/Furigana

Would this make sense for putting romanized text above non-roman languages?

Currently the standard is just to write the native text and then the romanized. See, e.g.:

https://en.wikipedia.org/wiki/Weekly_Sh%C5%8Dnen_Jump

I'm thinking Hepburn above the kana:

https://en.wikipedia.org/wiki/Hepburn_romanization

Anyone with more language knowledge want to cuss me out over this idea?