Predictive Indexing Comes to Office 365 Lists and Libraries


The 5000-item view threshold in SharePoint lists and libraries has been a bane for many people for a long time. I’ve been very vocal about it for the community (as well as for my own selfish reasons – I have client work to do!). I have a plethora of posts tagged with 5000 item limit, most of which aim to help us work around it, though I admit there is a fair amount of just plain complaining, too.

SharePoint Online is finally coming into the modern age, with Microsoft firmly attacking the 5000 item limit in several new ways. In the Microsoft Ignite session Harness collective knowledge with SharePoint and OneDrive content services (ECM), Chris McNulty (@cmcnulty2000) and Ian Story (@storyid) talked through how predictive indexing will improve our overall SharePoint experience.

We’ve all seen the rhetoric here: for years lists and libraries have been able to contain up to 30 million items, yet in too many cases we see errors when we have 5000 or more items in a view. In the “modern” experience, that may mean a #sadicecreamcone.

Or a #sadflattire.

By using predictive indexing – basically, the platform adding indices as it predicts you will need them, in addition to any you add manually – the real limits of list views become much larger.

Predictive indexing rolled out in advance of Ignite, so you may already have noticed some improvements when you work with lists with over 5000 items – as long as you are in the “modern” experience. As you add grouping to views or slice and dice list content using the column headers, SharePoint may be adding indices behind the scenes for you. You can see this happening if you look at the list settings and go to Indexed columns. There, you may see that some indices have been added by the platform, not by you.

One of the side benefits of predictive indexing – which to many will feel like a main benefit – is that the Filters Pane is now “smarter”, too. As those indices are added, the platform realizes that those columns may be things you’d like to see in the Filters Pane.

So are we all set now? Well, predictive indexing isn’t a total panacea, but should reduce the number of throttling events in many real-world situations. We still have to consider the 5000 item limit in several places, unfortunately. And taking proactive steps with our own indices is never a bad idea.

First, predictive indexing doesn’t change the threshold for REST calls. If we want to get more than 5000 items from a list or library via REST, we still need to page through them or come up with some other fancy caching scheme. I’d love to be able to tell a better story than “the 5000 item limit is still in effect for REST calls”. As has been the case for a long time now, if we use indexed columns as our first filters, we can get past the first 5000 items – but that first filter still needs to bring the count under 5000 items for us to avoid an error, and we can’t retrieve more than 5000 items in a single call.
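The paging idea above can be sketched in plain JavaScript. This is a simplified, hypothetical sketch: the `getBatch` function stands in for a real REST request (something like `.../_api/web/lists/getbytitle('BigList')/items?$top=5000`, following the next-link in each response), so the names here are illustrative, not an actual API.

```javascript
// Sketch of paging through a large list in batches no larger than the
// threshold, the way a REST client follows the next-link in each response.
const THRESHOLD = 5000;

// Stand-in for a REST call; a real implementation would fetch
// .../items?$top=5000 and read the next-link from the response body.
function getBatch(allItems, skip, top) {
  return allItems.slice(skip, skip + top);
}

// Keep requesting batches until one comes back short, which signals
// that there are no more items to page through.
function getAllItems(allItems) {
  const results = [];
  let skip = 0;
  let batch;
  do {
    batch = getBatch(allItems, skip, THRESHOLD);
    results.push(...batch);
    skip += THRESHOLD;
  } while (batch.length === THRESHOLD);
  return results;
}
```

The same shape works whether the batch size is 5000 (REST), 100, or 500 (connector defaults) – only the constant changes.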

If you create a list or library and immediately dump more than 20,000 items into it, you’ll run into a familiar problem. We can create indices on lists with up to 20,000 items, but beyond that we hit the same wall we used to hit at 5000 items: we can’t add a new index without first getting below the 20,000 item mark.

This also means Microsoft Flow or PowerApps may run into the 5000 item limit, depending on what you are doing (making REST calls yourself, for sure), as they use REST endpoints to retrieve data. The average power user building a PowerApp will see batches of 100 (but may never realize it). Someone taking things further (like reading from another list via a connector) might see batches of 500. In other words, if you are working with large lists, you need to understand the concept of paging on some level.

So what’s the upshot here? Well, primarily that Microsoft is working hard to make the constraints on large lists go away, or at least set the limitations to much higher levels. From an end user standpoint, this is Very Good News. From a developer standpoint, I think we should view it as a work in progress. Overall, though, it shows that Microsoft knows this is a weakness in the product and they are working hard to fix it.

I’m VERY happy to see this progress, and based on conversations with people like Chris and others in the Product Group I expect to see continuous improvement. Don’t worry – I’ll keep pushing them on it mercilessly, though!


8 Comments

  1. Thanks for the update Mark!
    You answered my big question – indices are in addition to any you add manually. Do you know if they are increasing the maximum # of indices that can be on a list?
    Russell

    Reply
      • i’m up to 19 on one library. was set up that way by a vendor. doubt i need them all. i think i’ll remove the ones i don’t think are needed and see if sharepoint replaces them with others.

        i’m over 5000 but under 20000

        Reply
      • I’ve had a few cases where I hit the 20-indexes-limit, for customers with many different views (for different user groups) on several thousand items (you can argue if all of those were really necessary of course).

        A related issue I ran into: we rolled out our lists and indexes using PowerShell. However, SharePoint started proactively adding its own clever indexes halfway through that process. So what happened was that I suddenly got an error when I arrived at scripted index no. 18, because the list had unexpectedly already reached 20.

        Also, I found that you cannot manipulate those auto-created indexes afterwards, they seem locked, which is understandable, but annoying nonetheless.
        So ideally you need to leave enough room for any future predictive indexes on your list, but a ‘manual prediction’ on how many indexes you need to leave open is difficult. That also kind of defeats the purpose of the predictive indexing, because it’s not something I want to have to consider upfront.

        But as you rightly said, it’s clearly work in progress. Any scenario where we can now escape the 5000-limit is another one off my list.

        -Pascal

        Reply
  2. This is in addition to my comment via twitter….

    5000 view limit has been a big pain. Predictive indexing eases it but, as you mentioned, there is still lots to be done. It is work-in-progress.
    I have encountered the following issues so far:

    It only works in the modern view, so if you have customizations that require the classic view, you will still run into the 5000 view limit error.
    Quick Edit doesn’t work; it throws the 5000 limit error.
    Export to Excel doesn’t work; only 100 items are exported.

    My scenario:
    List with around 80 columns and 8705 records.

    Thanks for pushing hard on this issue. A client wanted to move away from SP just because of this issue. For now I am using datatables (with paging) to facilitate exporting the whole list to Excel.

    Reply
  3. Hey Mark, Awesome presentation on Collab365! Thanks. I have one question. You mentioned that you copy large lists into a document library as JSON objects to get around the 5000 limit. How on earth do you do that?

    Reply
    • @russell:

      I have some code that basically takes the list content and saves it into a text file as JSON. When I load the page the next time, I check for that cached file, hydrate the JSON if it’s there, figure out the deltas, update the file, and save it again. All that is much faster than reading from the list directly.
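      Roughly, the delta step looks like this (a simplified sketch – a plain array stands in for the cached JSON file, and the real code reads and writes that file in a document library via REST):

      ```javascript
      // Simplified sketch of merging fresh changes into the cached copy.
      // cachedItems: what was hydrated from the JSON file in the library.
      // changedItems: items that changed since the cache was last saved.
      function applyDeltas(cachedItems, changedItems) {
        // Index the cache by item Id for fast lookup.
        const byId = new Map(cachedItems.map((item) => [item.Id, item]));
        // Overwrite stale entries and add any new items.
        for (const item of changedItems) {
          byId.set(item.Id, item);
        }
        // This merged array is what gets serialized back to the file.
        return [...byId.values()];
      }
      ```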

      M.

      Reply
