2010-10-04 Stripping CSS Style Information using XSLT

I run a blog aggregator, the Old School RPG Planet, using Planet Venus. The result page has a sidebar floating off to the right. Some of the included articles come with inline style information such as “clear: right”, “clear: left” and “clear: both”, which then interacts with my sidebar. I can transform the articles using XSLT.

Here’s an example:

  <!-- Feedburner detritus -->
  <xsl:template match="xhtml:div[@class='feedflare']"/>

So now I would like to *remove* style attributes that contain the string “clear:” as a first step. Here are some examples:

  <div class="separator" style="clear: both; text-align: center;">…</div>
  <a href="…" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;">…</a>
  <a href="…" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;">…</a>

I managed to solve the first problem using the following rule:

  <!-- CSS clear -->
  <xsl:template match="xhtml:div[@class='separator']">
    <xsl:apply-templates/>
  </xsl:template>

But what about the second and third? I’m a bit confused regarding the writing of XSLT templates, but the following rule seems to work. It drops all attributes except for the href attribute. Maybe that’s overkill, but the few examples I saw used nothing else.

  <xsl:template match="xhtml:a[contains(@style, 'clear:')]">
    <a>
      <xsl:attribute name="href">
        <xsl:value-of select="@href"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </a>
  </xsl:template>
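
If dropping every attribute turns out to be overkill, a gentler variant might copy the element and all of its attributes except style. This is only a sketch: it assumes the xhtml prefix is bound the same way as in the rules above, and it still throws away the whole style attribute (float and margins included), not just the clear declaration.

  <xsl:template match="xhtml:a[contains(@style, 'clear:')]">
    <xsl:copy>
      <!-- keep href, title, class and so on; drop only the style attribute -->
      <xsl:copy-of select="@*[name() != 'style']"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>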

Any ideas on how to improve this? I’d love to copy the style attribute and just replace “clear ?: ?[a-z]+;?” with the empty string or something like that.

​#Web ​#XSLT ​#Planet

Comments

I’ll try to come up with a working example, perhaps this evening. For this sort of thing, you usually want to start with the XSL identity transform and then special-case what you want. In particular, you probably want to remove clear: from all inline style attributes everywhere, no? In that case, build a template that matches on “@style”, and have it emit a cleansed attribute with xsl:attribute. You can either do it the easy way with an extension function (written in, e.g., Java, if you’re using Xalan-J) or do it the “hard” way with a recursive loop using substring and index.
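
A minimal XSLT 1.0 sketch of the “hard” way described above might look like the following. It is an illustration under stated assumptions, not a tested Planet Venus filter: the template name strip-clear and the parameter name css are made up for this example, the match on “clear” is case-sensitive, and splitting on “;” would break on style values that contain a semicolon inside url(…) or a quoted string.

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Identity transform: copy everything unless a more specific rule matches -->
    <xsl:template match="@*|node()">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:template>

    <!-- Rebuild every style attribute without its clear declaration -->
    <xsl:template match="@style">
      <xsl:variable name="cleaned">
        <xsl:call-template name="strip-clear">
          <xsl:with-param name="css" select="string(.)"/>
        </xsl:call-template>
      </xsl:variable>
      <!-- Emit no style attribute at all if nothing is left -->
      <xsl:if test="normalize-space($cleaned) != ''">
        <xsl:attribute name="style">
          <xsl:value-of select="normalize-space($cleaned)"/>
        </xsl:attribute>
      </xsl:if>
    </xsl:template>

    <!-- Hypothetical helper: handle one declaration, then recurse on the rest -->
    <xsl:template name="strip-clear">
      <xsl:param name="css"/>
      <!-- Text up to the first ';' (or the whole string if there is none) -->
      <xsl:variable name="decl"
          select="normalize-space(substring-before(concat($css, ';'), ';'))"/>
      <xsl:if test="$decl != '' and normalize-space(substring-before($decl, ':')) != 'clear'">
        <xsl:value-of select="$decl"/>
        <xsl:text>; </xsl:text>
      </xsl:if>
      <xsl:if test="contains($css, ';')">
        <xsl:call-template name="strip-clear">
          <xsl:with-param name="css" select="substring-after($css, ';')"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:template>

  </xsl:stylesheet>

Because the @style template has a higher default priority than the identity rule, it takes over for style attributes only. Every other attribute and node is copied unchanged.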

What XSLT processor are you using?

– Brian 2010-10-05 17:46 UTC

---

I’ll have to check. Planet Venus is written in Python, so I’m assuming it uses the XSLT C library with a Python wrapper. And you’re right, eventually I want to remove all clear styles, wherever they may be. I need to check whether that setup allows me to call custom code somewhere... Otherwise I think I’m stuck with using loops, substrings, etc.

Thanks for the IRC conversation (with bpalmer`, 13:55 to 14:02) pointing me at the example on the XSLT page.

And I learn a new word: prolix. 😄

– Alex Schroeder 2010-10-06 07:51 UTC

---

Hm… I think that the XSLT implementation falls back on *xsltproc*, which in turn does not support 2.0 features. I looked at XSLT 2 and Delimited Lists, wrote a little local example, and got the error: *xsl:version: only 1.0 features are supported* 👎 – and when I change the version to 1.0, I get *xmlXPathCompOpEval: function tokenize not found* 👎 🙁 The article mentioning XSLT 2 features was written in 2003. Something’s not right.

  Using libxml 20706, libxslt 10126 and libexslt 815
  xsltproc was compiled against libxml 20706, libxslt 10126 and libexslt 815
  libxslt 10126 was compiled against libxml 20706
  libexslt 815 was compiled against libxml 20706
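
Since the version output lists libexslt, one avenue that may be worth trying is the EXSLT strings module, which offers a str:tokenize function to XSLT 1.0 stylesheets. The following is an untested sketch rather than a known-good filter: it assumes str:tokenize is actually available in this build, that no style value contains a “;” inside url(…) or a quoted string, and that “clear” is written in lower case.

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:str="http://exslt.org/strings"
      extension-element-prefixes="str">

    <!-- Identity transform: copy everything unless a more specific rule matches -->
    <xsl:template match="@*|node()">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:template>

    <!-- Only touch style attributes that mention clear at all -->
    <xsl:template match="@style[contains(., 'clear')]">
      <xsl:variable name="kept">
        <!-- Split on ';', skip blank tokens and the clear declaration -->
        <xsl:for-each select="str:tokenize(., ';')
            [normalize-space(.) != '']
            [normalize-space(substring-before(., ':')) != 'clear']">
          <xsl:value-of select="normalize-space(.)"/>
          <xsl:text>; </xsl:text>
        </xsl:for-each>
      </xsl:variable>
      <!-- Emit no style attribute at all if nothing is left -->
      <xsl:if test="normalize-space($kept) != ''">
        <xsl:attribute name="style">
          <xsl:value-of select="normalize-space($kept)"/>
        </xsl:attribute>
      </xsl:if>
    </xsl:template>

  </xsl:stylesheet>

Note that the function lives in the str namespace, so a bare tokenize(…) call, as in XSLT 2.0, would still fail in a 1.0 processor.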

– Alex Schroeder 2010-10-06 23:07 UTC

---

I think I’m happy enough with things as they are and will look into this again if and only if I see more borkage on my planet.

– Alex Schroeder 2010-10-09 23:07 UTC