Displaying the First N Words of a Long Rich Text Column with XSL

When you want to display blog posts and announcements with DVWPs in your SharePoint Site Collection, you usually don’t want to display the full posts, but just enough to indicate what the item is about and to let the user decide whether to click to see more.  An example might be showing the last 3 blog posts on your Home Page.  There isn’t an easy out-of-the-box way to do this.

For the following examples, let’s say that the @Body column contains the text: “The <em>quick</em> <span style="color: #a52a2a;">brown</span> fox jumped over the lazy dog.”, which renders like this: “The quick brown fox jumped over the lazy dog.”

One option is to use the ddwrt:Limit function.  This allows you to specify a number of characters to show, along with some text to append if the original text is longer than the limit you set.  So, for instance, ddwrt:Limit(string(@Body), 25, '…') would show the first 25 characters, followed by the '…' string if there are more than 25 characters in the @Body column.  However, since the @Body column usually contains some HTML markup, you usually don’t get what you really want, because the tags are all counted as part of the number of characters.  With our example @Body text above, you’ll get “The <em>quick</em> <span …”, which isn’t even valid HTML since the <span> tag isn’t closed.  Depending on the browser you are using, you’ll probably see something like “The quick”.
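
For reference, here is a minimal sketch of how that call might sit inside a DVWP row template.  It assumes the ddwrt prefix is already declared on the xsl:stylesheet element (SharePoint Designer normally adds it), and it uses three plain periods as the trailing text.

```xml
<!-- Assumption: the stylesheet element declares
     xmlns:ddwrt="http://schemas.microsoft.com/WebParts/v2/DataView/runtime" -->
<td>
  <!-- Show at most 25 characters of the raw @Body, followed by '...' when it is longer.
       The markup in @Body is counted too, so the cut may land inside a tag. -->
  <xsl:value-of select="ddwrt:Limit(string(@Body), 25, '...')" disable-output-escaping="yes"/>
</td>
```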

So, the first thing you might want to do is to strip out all of the HTML.  The StripHTML XSL template below will do this for you.

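<xsl:template name="StripHTML">
  <xsl:param name="HTMLText"/>
  <xsl:choose>
    <!-- Recurse only while a complete tag (a '<' with a '>' somewhere after it) remains -->
    <xsl:when test="contains(substring-after($HTMLText, '&lt;'), '&gt;')">
      <xsl:call-template name="StripHTML">
        <!-- Keep the text before the tag plus the text after that tag's closing '>' -->
        <xsl:with-param name="HTMLText" select="concat(substring-before($HTMLText, '&lt;'), substring-after(substring-after($HTMLText, '&lt;'), '&gt;'))"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <!-- No complete tags left: output what remains -->
      <xsl:value-of select="$HTMLText"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>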

Once you have the HTML stripped out, the ddwrt:Limit function will do what you want, but the text will probably be cut off mid-word.  Looking at our example @Body text again, the StripHTML template will return “The quick brown fox jumped over the lazy dog.”, which with the ddwrt:Limit function above will look like “The quick brown fox jumpe…”
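
Putting those two steps together might look something like the sketch below.  The variable name BodyText is just an illustrative choice, and the ddwrt prefix is again assumed to be declared on the stylesheet.

```xml
<!-- Capture the tag-free text produced by the StripHTML template -->
<xsl:variable name="BodyText">
  <xsl:call-template name="StripHTML">
    <xsl:with-param name="HTMLText" select="@Body"/>
  </xsl:call-template>
</xsl:variable>

<!-- Limit the stripped text to 25 characters; this can still cut a word in half -->
<xsl:value-of select="ddwrt:Limit(string($BodyText), 25, '...')"/>
```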

So, an even better solution is to first strip out the HTML and then return a specific word count.  The FirstNWords XSL template below takes care of this for you.

<xsl:template name="FirstNWords">
  <xsl:param name="TextData"/>
  <xsl:param name="WordCount"/>
  <xsl:param name="MoreText"/>
  <xsl:choose>
    <xsl:when test="$WordCount &gt; 1 and
        (string-length(substring-before($TextData, ' ')) &gt; 0 or
        string-length(substring-before($TextData, '  ')) &gt; 0)">
      <xsl:value-of select="concat(substring-before($TextData, ' '), ' ')" disable-output-escaping="yes"/>
      <xsl:call-template name="FirstNWords">
        <xsl:with-param name="TextData" select="substring-after($TextData, ' ')"/>
        <xsl:with-param name="WordCount" select="$WordCount - 1"/>
        <xsl:with-param name="MoreText" select="$MoreText"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:when test="(string-length(substring-before($TextData, ' ')) &gt; 0 or
        string-length(substring-before($TextData, '  ')) &gt; 0)">
      <xsl:value-of select="concat(substring-before($TextData, ' '), $MoreText)" disable-output-escaping="yes"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$TextData" disable-output-escaping="yes"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

With our example, StripHTML returns “The quick brown fox jumped over the lazy dog.” and then a call to FirstNWords with a WordCount of 5 will give you “The quick brown fox jumped…”  Much nicer!

Note that this won’t do a perfect job if there is a lot of odd spacing or punctuation, but most of the time, it’s a much cleaner solution.

NOTE (2009-02-05): I was working with some data today that had lots of double spaces and some escaped characters, so I tweaked my FirstNWords template to work a little better by adding the test for double spaces (though it isn’t foolproof with different types of white space).
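
If odd spacing is a problem in your data, one thing you could try (this is a sketch, not something the templates above do for you) is passing the text through the standard XPath normalize-space() function before counting words.  It trims leading and trailing white space and collapses runs of spaces, tabs, and line breaks into single spaces.  It does not touch non-breaking spaces or escaped entities, so those may still need separate handling.  The $BodyText variable here is assumed to hold the output of StripHTML.

```xml
<xsl:call-template name="FirstNWords">
  <!-- normalize-space() collapses repeated white space so each word is separated by one space -->
  <xsl:with-param name="TextData" select="normalize-space($BodyText)"/>
  <xsl:with-param name="WordCount" select="25"/>
  <xsl:with-param name="MoreText" select="'...'"/>
</xsl:call-template>
```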

UPDATE (2009-02-27): Here’s an example of how I’ve used these templates in the past to display blog posts.  First, I create a variable called BodyText that contains the contents of the @Body column with the HTML stripped out by using the StripHTML template.  Then I output a row with a link to the post and a second row with the first 25 words of the post, followed by ‘…’, using the FirstNWords template.

<xsl:template name="USG_Blog.rowview">
  <xsl:variable name="BodyText">
    <xsl:call-template name="StripHTML">
      <xsl:with-param name="HTMLText" select="@Body"/>
    </xsl:call-template>
  </xsl:variable>
  <tr>
    <td>
      <a href="{$WebURL}Lists/Posts/Post.aspx?ID={@ID}&amp;Source={$URL}" >
        <xsl:value-of select="@Title"/>
      </a>
    </td>
  </tr>
  <tr>
    <td>
      <xsl:call-template name="FirstNWords">
        <xsl:with-param name="TextData" select="$BodyText"/>
        <xsl:with-param name="WordCount" select="25"/>
        <xsl:with-param name="MoreText" select="'...'"/>
      </xsl:call-template>
    </td>
  </tr>
</xsl:template>

As a side note, I always store these “utility” functions in a separate file for reuse and use the xsl:import tag to pull them into the DVWP I’m working on.  The import should go before the xsl:output tag, as below.

<xsl:import href="/Style Library/XSL Style Sheets/Utilities.xsl"/>
<xsl:output method="html" indent="no"/>
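
The imported file needs to be a complete stylesheet in its own right.  A bare-bones skeleton for a Utilities.xsl file might look like the following; the comment marks where the utility templates would go, and the ddwrt declaration is only needed if the utility templates themselves call ddwrt functions.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ddwrt="http://schemas.microsoft.com/WebParts/v2/DataView/runtime">

  <!-- Place the named utility templates here, e.g. StripHTML and FirstNWords -->

</xsl:stylesheet>
```

In the importing stylesheet, XSLT 1.0 requires any xsl:import elements to come before all other children of xsl:stylesheet, which is why the import sits ahead of xsl:output.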

97 Comments

  1. Marc – Firstly, thanks for your efforts and for making them public.

    I’ve been trying to get this working for a while now. While I did manage to strip the HTML out and limit the post to a certain number of characters, I’ve never managed to get the nwords part working.

    If I take your two templates and add them in above the Blog Posts DFWP, and then replace the code that selects values with your own real world example I get an error saying that I do not have a valid xslt stylesheet and that a semi-colon was expected. Any thoughts on where I’m going wrong?

    Thanks again,
    James

    1. James:

      You’re welcome! The two templates need to be inside the stylesheet for the DVWP. I usually put custom templates like this at the bottom of the stylesheet, after dvt_1.rowview (or the equivalent). It sounds like you may just not have the templates in a place where SharePoint Designer can “find” them.

      M.

      1. Thanks Marc – I had them just above the dvt_1.rowview part, tried moving them to after that but I’m getting the same error.

        Sharepoint Designer has them underlined and when I Ctrl Click them it jumps to the two templates – So it would appear that it can see them.

        I’ll have another look – If I return to the original source for the dvt_1.rowview part it all starts working again so at least it’s narrowed down to just that section.

        Thanks again,
        James

            1. It turned out that it was my mistake, not yours. Some of the escaping was missing in my code. I’ve updated the code above, and it ought to work now.

              M.

  2. Hi Mark,

    I am trying to use the strip HTML only with my DVWP. I have made a news section that requires the editor to use the features of Rich text editing. With plain text it works fine, but the rich text editor puts in div and br tags. I tried using your strip code in my .aspx page, but it doesn’t seem to be working. I would be very grateful for any suggestions you may have on this.

    Thank you

    1. Justin:

      It’s hard to say what the problem might be without seeing your code. Why don’t you post it over at the jQuery Library for SharePoint Web Services Solutions forum at EndUserSharePoint’s Stump the Panel and I’ll take a look? (It’s easier to exchange code over there, and there’s also a larger community who will benefit.) I will get an alert when you post there, as I’m the moderator.

      M.

  3. Hello…

    I am trying to do this same thing with an Announcement webpart. Where do I insert this code at? Does it replace the snippet?

    I am fairly new to SharePoint and SPD – but I have been stuck on this issue for several days now – and I need to be able to truncate the body of the announcements before we can go live with our site.

    I would greatly appreciate any help you can give!

    – Rex

    1. Rex:

      You’ll need to add a Data View Web Part (DVWP) to your page using SharePoint Designer. Once you’ve got that in place, you can add the XSL templates I show here into the stylesheet and adapt them to your needs. If you haven’t worked with DVWPs before, there are quite a few steps to all of this. Why don’t you let me know where you are in the process so that I can minimize the details I need to give you.

      M.

  4. Hi Mark,
    Thanks for all the good info on DVWPs. I needed to display the first paragraph of the most recent announcement on the home page of a site. I saw that announcements are stored like this:

    First Paragraph
    Second Paragraph

    So to get the first paragraph from an announcements list, I just needed to :

  5. Hi Marc, I’ve just tried to implement this and I’m
    struggling. The StripHTML template works no problem (i.e. If I
    include this in my code and then display $BodyText in a row, it
    shows the text with no HTML). However, when I then add in the
    FirstNWords template, it errors with “Failed setting processor
    stylesheet: 0x80004005 : Expected token ‘EOF’ found
    ‘.’..–>.<–" Any ideas? Cheers!
    Dave.

    1. Dave:

      From Twitter, it sounds like you solved this. Let me know if it was anything with my code that caused the issue and I’ll fix it!

      M.

  6. Mark, before I spend some time trying to do this in SP 2010, I wonder if you happen to know what changes, if any, I will need to make to get it to work.
    Thanks
    Dean

  7. Mark, thanks for letting me know that it should work in 2010, but I’m confused about how to implement it. The XSLTListViewWebPart that I am seeing on the default.aspx page of a Blog site does not seem to have an xsl:template node. Do you have any examples of how this is done in 2010? I have your DVWP book, but there are some changes in the new XSLT ListView that have me confused. Any tips you could share would be greatly appreciated. If you write another book on this new web part, I’ll buy it also.
