Displaying the First N Words of a Long Rich Text Column with XSL

When you display blog posts and announcements with DVWPs in your SharePoint Site Collection, you usually don’t want to show the full posts, just enough to indicate what each item is about so the user can decide whether to click through for more. An example might be showing the last 3 blog posts on your Home Page. There isn’t an easy out-of-the-box way to do this.

For the following examples, let’s say that the @Body column contains the text: “The <em>quick</em> <span style="color: #a52a2a;">brown</span> fox jumped over the lazy dog.”, which renders as: “The quick brown fox jumped over the lazy dog.”

One option is to use the ddwrt:Limit function. This allows you to specify a number of characters to show, along with some text to append if the original text is longer than the limit you set. So, for instance, ddwrt:Limit(string(@Body), 25, '...') would show the first 25 characters, followed by the '...' string if there are more than 25 characters in the @Body column. However, since the @Body column usually contains some HTML markup, you usually don’t get what you really want, because the tags are all counted as part of the number of characters. With our example @Body text above, you’ll get “The <em>quick</em> <span ...”, which isn’t even valid HTML since the <span> tag is cut off before it is complete. Depending on the browser you are using, you’ll probably see something like “The quick”.
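As a rough sketch, the call might sit in a row template like the one below. The template name is only a placeholder, and the fragment assumes the ddwrt prefix is already declared on the stylesheet, which is normally the case in a DVWP.

```xml
<!-- Assumes xmlns:ddwrt="http://schemas.microsoft.com/WebParts/v2/DataView/runtime"
     is declared on the xsl:stylesheet element. -->
<xsl:template name="Announcement.teaser">
  <tr>
    <td>
      <!-- First 25 characters of the raw body, markup included, then '...' if it is longer -->
      <xsl:value-of select="ddwrt:Limit(string(@Body), 25, '...')" disable-output-escaping="yes"/>
    </td>
  </tr>
</xsl:template>
```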

So, the first thing you might want to do is to strip out all of the HTML.  The StripHTML XSL template below will do this for you.

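<xsl:template name="StripHTML">
  <xsl:param name="HTMLText"/>
  <xsl:choose>
    <!-- Keep going only while there is a '<' with a '>' somewhere after it -->
    <xsl:when test="contains($HTMLText, '&lt;') and contains(substring-after($HTMLText, '&lt;'), '&gt;')">
      <xsl:call-template name="StripHTML">
        <!-- Drop the first tag: keep the text before '<' and the text after the next '>' -->
        <xsl:with-param name="HTMLText" select="concat(substring-before($HTMLText, '&lt;'), substring-after(substring-after($HTMLText, '&lt;'), '&gt;'))"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$HTMLText"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>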

Once you have the HTML stripped out, the ddwrt:Limit function will do what you want, but the text will probably be cut off mid-word. Looking at our example @Body text again, the StripHTML template will return “The quick brown fox jumped over the lazy dog.”, which with the ddwrt:Limit function above will look like “The quick brown fox jumpe...”
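Put together, the two steps might look something like the fragment below. The variable name PlainBody is just a placeholder for illustration, and the ddwrt prefix is again assumed to be declared on the stylesheet.

```xml
<!-- Capture the tag-free text in a variable (PlainBody is a hypothetical name) -->
<xsl:variable name="PlainBody">
  <xsl:call-template name="StripHTML">
    <xsl:with-param name="HTMLText" select="@Body"/>
  </xsl:call-template>
</xsl:variable>
<!-- Limit now counts only visible characters, but can still cut a word in half -->
<xsl:value-of select="ddwrt:Limit(string($PlainBody), 25, '...')"/>
```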

So, an even better solution is to first strip out the HTML and then return a specific word count.  The FirstNWords XSL template below takes care of this for you.

<xsl:template name="FirstNWords">
  <xsl:param name="TextData"/>
  <xsl:param name="WordCount"/>
  <xsl:param name="MoreText"/>
  <xsl:choose>
    <xsl:when test="$WordCount &gt; 1 and
        (string-length(substring-before($TextData, ' ')) &gt; 0 or
        string-length(substring-before($TextData, '  ')) &gt; 0)">
      <xsl:value-of select="concat(substring-before($TextData, ' '), ' ')" disable-output-escaping="yes"/>
      <xsl:call-template name="FirstNWords">
        <xsl:with-param name="TextData" select="substring-after($TextData, ' ')"/>
        <xsl:with-param name="WordCount" select="$WordCount - 1"/>
        <xsl:with-param name="MoreText" select="$MoreText"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:when test="(string-length(substring-before($TextData, ' ')) &gt; 0 or
        string-length(substring-before($TextData, '  ')) &gt; 0)">
      <xsl:value-of select="concat(substring-before($TextData, ' '), $MoreText)" disable-output-escaping="yes"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$TextData" disable-output-escaping="yes"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

With our example, StripHTML returns “The quick brown fox jumped over the lazy dog.” and then a call to FirstNWords with a WordCount of 5 will give you “The quick brown fox jumped…”  Much nicer!

Note that this won’t do a perfect job if there is a lot of odd spacing or punctuation, but most of the time, it’s a much cleaner solution.

NOTE (2009-02-05): I was working with some data today that had lots of double spaces and some escaped characters, so I tweaked my FirstNWords template to work a little better by adding the test for double spaces (though it isn’t foolproof with different types of white space).
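One way to reduce the white space problem, sketched here as a suggestion rather than something the templates above already do, is to clean up the text before it reaches FirstNWords. The standard XPath normalize-space function trims the ends and collapses runs of spaces, tabs, and line breaks into single spaces. It does not treat non-breaking spaces as white space, so this sketch swaps those for plain spaces first. The $BodyText variable is assumed to hold the output of StripHTML.

```xml
<!-- $BodyText is assumed to hold the output of StripHTML -->
<xsl:call-template name="FirstNWords">
  <!-- translate() turns non-breaking spaces into plain spaces;
       normalize-space() then trims the ends and collapses every run of
       spaces, tabs, and line breaks into one space -->
  <xsl:with-param name="TextData" select="normalize-space(translate($BodyText, '&#160;', ' '))"/>
  <xsl:with-param name="WordCount" select="25"/>
  <xsl:with-param name="MoreText" select="'...'"/>
</xsl:call-template>
```

Escaped entities that survive as literal text in the column (for example the characters of an entity reference spelled out) would still need separate handling.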

UPDATE (2009-02-27): Here’s an example of how I’ve used these templates in the past to display blog posts.  First, I create a variable called BodyText that contains the contents of the @Body column with the HTML stripped out by using the StripHTML template.  Then I output a row with a link to the post and a second row with the first 25 words of the post, followed by ‘…’, using the FirstNWords template.

<xsl:template name="USG_Blog.rowview">
  <xsl:variable name="BodyText">
    <xsl:call-template name="StripHTML">
      <xsl:with-param name="HTMLText" select="@Body"/>
    </xsl:call-template>
  </xsl:variable>
  <tr>
    <td>
      <a href="{$WebURL}Lists/Posts/Post.aspx?ID={@ID}&amp;Source={$URL}" >
        <xsl:value-of select="@Title"/>
      </a>
    </td>
  </tr>
  <tr>
    <td>
      <xsl:call-template name="FirstNWords">
        <xsl:with-param name="TextData" select="$BodyText"/>
        <xsl:with-param name="WordCount" select="25"/>
        <xsl:with-param name="MoreText" select="'...'"/>
      </xsl:call-template>
    </td>
  </tr>
</xsl:template>

As a side note, I always store these “utility” templates in a separate file for reuse and use the xsl:import element to pull them into the DVWP I’m working on. In XSLT, xsl:import elements must come before all other top-level elements in the stylesheet, so the import goes before the xsl:output element, as below.

<xsl:import href="/Style Library/XSL Style Sheets/Utilities.xsl"/>
<xsl:output method="html" indent="no"/>
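The imported file has to be a complete stylesheet in its own right, not just a list of bare templates. A minimal skeleton for Utilities.xsl, offered as a sketch of the layout and not the exact file I use, might be:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Paste the StripHTML and FirstNWords templates from this post here.
       If a template calls ddwrt functions, declare that namespace on
       xsl:stylesheet as well. -->

</xsl:stylesheet>
```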

97 Comments

  1. Hi Marc
    I have an issue whereby I’m using crosslist and pulling KPI columns with HTML to display the indicator to the rollup site. The problem I have is the column is showing text (string;# to be exact) in front of the tag. Basically I just want to show indicator, do I need to strip out the html to display the indicator as per your suggestion above? I was hoping to use a simple substring before command to remove the string;# and simply display that within the table.

    Your thoughts?

    Thanks
    Wayne

    1. Wayne:

      Substringing should solve it for you. Something like:

      <xsl:value-of select="substring-after(@ColumnName, 'string;#')"/>
      

      M.

  2. Hi Marc, This is really great article and helped me a lot.
    Is there any way to show first N words in rich html format i.e. first N words without losing original formatting of text?

    Thanks for the article,
    Amol

    1. Amol:

      That’s a lot trickier, as you’d need the template to figure out what a “word” was even wrapped in HTML tags. I’ve never tried to make it work, as it doesn’t seem worth it.

      M.

  3. Hi Marc,

    Thanks for the XSL code! After a few struggles I was able to get it working properly.

    One question. Is there a way to get the “…” part be a link to show the rest of the body text should someone click on it?

    Thanks,
    Robert

      1. Hi Marc,

        Thanks for the info. That sounds easy enough.

        One more question though. How do I dynamically put in the link for the latest item in the announcement list? The announcement list just shows the body text of the latest entered item. I’m just wondering how I would put that in as a link.

        Thanks again for your help,
        Robert

        1. Robert:

          Take a look at line 9 in the third code sample in the post. That’s where the link to the post is emitted.

          M.

          1. Hi Marc,

            One strange thing is something as simple as this isn’t showing up:

            <a href="http://intranet-dev" rel="nofollow"></a>
            

            If I change to something like “Click Me”, then the link shows up.

            I’m not sure why the value of the Title item isn’t showing up with the link.

            Cheers,
            Robert

            1. Robert:

              Unfortunately, WordPress strips out a lot of code. I’ve rescued what I could see above.

              If you want to show the Title as the link, it would be something like this:

              <a href="http://intranet-dev" rel="nofollow"><xsl:value-of select="@Title"/></a>
              

              M.

  4. Marc

    I have been finding your posts really useful, particularly this one, which I have been able to implement on my blog site. I was interested to know whether you ever did come across a way to adapt the template to allow for images to be displayed, as ideally this is what I would like to do.
