Getting Around SharePoint’s Threshold Limits for Large Reference Lists

In SharePoint 2007, we could build lists that contained as many items as we wanted and access them in any way we wanted. We may have done stupid things that were bad for server performance, but if we knew what we were doing there were few limits. We had the mythical “2000 item” rule of thumb (which had little basis in reality in many, many cases) but otherwise it was up to us. (See: Plan for software boundaries (Office SharePoint Server))

In SharePoint 2010, Microsoft introduced threshold limits to protect us from ourselves. This seemed counter intuitive to me, since with the upgrade to SharePoint 2010, one also had to go to 64 bit architecture and put a heck of a lot more iron behind the SharePoint farm. While we could potentially store 30,000,000 items in a list (30 million!), we had a list view threshold of 5000 items. (See: SharePoint Server 2010 capacity management: Software boundaries and limits)

SharePoint 2013 maintains similar limits to 2010, with 5000 items the limit for a list view. (See: Software boundaries and limits for SharePoint 2013)

List View ThrottlingSomehow, as technology has moved forward – significantly – we’ve got more limits.

5000 items is way too many to ever show in a list view, but it may not be if you want to do some client side processing. It doesn’t matter if you’re trying to build a truly bloated, Bad Idea list view or request data using REST, you’re stuck with that 5000 item limit.

But what if you know what you are doing and the 5000 item limit doesn’t work for you? I’ve written in the past about why we shouldn’t change some of the threshold limits (e.g.,  The SharePoint 2010 “List View Lookup Threshold” and Why We Don’t Change It).

If we’re working on Office365, we simply can’t change the limits – Microsoft gets to decide what they are. Sometimes those limits change without notice, too, so it’s truly out of our hands.

Well, what does all this add up to? Let’s take a specific example.

I needed to provide auto complete functionality in an InfoPath form. It was the classic scenario: we had a list of [in this case] 19,000+ cities and towns and we needed a good UX for entering selections into the form. We run into this sort of thing with data like cities and towns all the time. There are too many choices for a dropdown to make any sense, and SharePoint doesn’t offer up any better alternatives. jQueryUI’s autocomplete function is a perfect solution for this sort of thing.

The data was in a spreadsheet, and it’s easy to upload data from a spreadsheet to a SharePoint list. Bingo, bango, done, right? Well, no. It’s impossible to upload data from a spreadsheet *and* index the column. We need to index the column so that we can make type of requests requests that we need for autocomplete – basically, give me all of the items that begin with or contain this string – or else we run into the 5000 item threshold error on our requests.

No problem! We’ll just upload the data and *then* index the column. Not so fast, mister. When you try to index the column you run into – you guessed it – the 5000 item limit. No indexee, no laundry.

So seemingly we’re stuck. But wait – we can just create the list in the UI, add the index, and paste all 19,000 items into the Brave New World of SharePoint 2013’s “data sheet view” aka “Quick Edit”. Sadly, when I tried to paste in more than a few hundred items, the browser hung on me. This was on Office365, and I tried several different browsers. SharePoint Online seems to prefer nibbles to big bites of data. Unless I spent days just doing the pasting, this was never going to happen.

In this case, I decided to simply store the data in a text file containing JSON rather than trying to force it into a list. Thanks to @bpmccullough for the suggestion on Twitter.

The data I originally had received for the sities and towns was in a CSV file. I found a nice little site that would convert the CSV format to JSON (bookmark this one; you’ll need it in this JSON-oriented world).

Once I had the JSON file uploaded to a Document Library, the code below is what I used to set up the autocomplete. We wanted the autocomplete to allow multiple choices, so there’s a little monkeying around to enable that. It wasn’t tremendously peppy (19,000 cities, after all), but it worked. Remember, kids: this is an example. Don’t expect to drop it into your page and have it work.

var citiesAndStates = [];

// Get the City and State data from the file with JSON data
$.getJSON("/Shared%20Documents/CityState.txt", function(data) {
  var results = $(data);
  // Push the data into an array which is appropriate for jQueryUI autocomplete
  citiesAndStates = ($.map(results, function(item) {
    return {
      label: item.State + " - " + item.City, // Show "State - City" [e.g., MA - Boston] for selection
      value: item.City + ", " + item.State // Show "City, State" [e.g., Boston, MA] upon selection
    }
  }));
});

// When the field is available and gets focus, set up the autocomplete behavior
//ctl00_ctl42_g_6069846d_1b7f_4890_b767_2bdc15d2b133_FormControl0_V1_I1_S15_I4_T5
$(document).on("focus", "input[id$='FormControl0_V1_I1_S15_I4_T5']", function() {
  if ($(this).data("autocompleteSet") === undefined) {
    $(this).data("autocompleteSet", true);

    $(this)
    // Don't navigate away from the field on tab when selecting an item
    .bind("keydown", function(event) {
      if (event.keyCode === $.ui.keyCode.TAB && $(this).data("ui-autocomplete").menu.active) {
        event.preventDefault();
      }
    }).autocomplete({
      source: function(request, response) {
        // Delegate back to autocomplete, but extract the last term
        response($.ui.autocomplete.filter(citiesAndStates, extractLast(request.term)));
      },
      minLength: 3,
      focus: function() {
        // Prevent value inserted on focus
        return false;
      },
      select: function(event, ui) {
        var terms = split(this.value);
        // Remove the current input
        terms.pop();
        // Add the selected item
        terms.push(ui.item.value);
        // Add placeholder to get the semicolon-and-space at the end
        terms.push("");
        this.value = terms.join("; ");
        return false;
      }
    });
  }
});

function split(val) {
  return val.split(/;\s*/);
}

function extractLast(term) {
  return split(term).pop();
}

One note here: This approach basically violates my “put all data into a list so that end users can maintain it” rule. However, the approach I came up with was the best one for the situation. It’s still possible for a reasonably savvy (meaning “non-developer”) user to manage the data by downloading the file and using a JSON-aware editor.

The main thing was that we were able to set up something that worked, regardless what the Office365 limits may change to.

38 Comments

  1. While importing large amounts of data to a list, my experience is that Access i almost the only thing that doesn’t hang totally…

    Also you should be able to import it to a termset using a csv-file and then use the OOTB autocomplete functionality. Haven’t tried with a list that big, but the limitation is 30.000 items in a term set

    Reply
    • Fredrik:

      Managed Metadata is great, unless you need to store additional data per term. For instance, if I needed the population of each city and town, a term set wouldn’t cut it. In most cases, I prefer lookup lists to Managed Metadata for just this reason.

      Additionally, since we were using an InfoPath form Managed Metadata was out.
      M.

      Reply
  2. I just found another limit, while you can create a view to display over 100 items, you cannot select more than 100 items at a time. This makes cleaning up larges lists TEDIOUS.

    Reply
  3. Nice work-around Marc… I too changed the way I think about implementing client-side solutions on SharePoint after learning (experiencing) these limitations… I even found that when using Webservices against such lists, even if the column was indexed, results were not returned…
    Any client side solution that has the risk of reaching such limits has to think about how it can get around them – which might be for that solution to have a design that can operation on multiple lists… once one starts to reach the threshold, have that solution “switch” over to a new List and possible bridge the two client side when needed. Another approach is to “archive” items by moving them to a new list (there are drawbacks: history is lost)… These approaches add much more complexity to the application.

    Reply
    • Are you aware of any workarounds for using SPServices on SharePoint 2013 to access a list which has exceeded the 5000 item threshold? The query should return far fewer items total, but in the past once we break that limit the services all shut down. I’m faced with either asking IT to increase the threshold (may not happen) or breaking the list into logical sub-sets and re-writing my interfaces and update functions to be cognizant of which list they should be looking at for their data.

      Thanks.

      Reply
            • Yes, it was SP 2010 on premises… In my case, due to the amount of re-factoring I would have to do, we ended up with implementing the change MS suggested – remove the threshold from the list (I had to get IT involved)…
              I was just curious because some queries did work (ex. if I used ID) – but maybe that is considered a “bug” from MS’s view. :)

              Reply
                • Sample B on that post is the “clever” one… look closely a the CAML filter…
                  What I found was that the queries would all fail on filters… but… were successful on filter when ID was used.. so I used this knowledge and created something like:

                  <And>
                  <Neq>
                  <FieldRef Name=’ID’/>
                  <Value Type=’Counter’>0</Value>
                  </Neq>
                  <Or>
                  … Real filters here filters you want here …
                  </Or>
                  </And>

                  The fact that I was using an condition on the Outside with the ID filter (which always evaluates to TRUE, thus including all rows in the table), allowed me to run my “real” filter in the second condition of the aggregate. It would be great to find if this 1) still works and 2) works on O365.
                  Again: this may be considered a bug from MS’s point of view… :)

                  Reply
  4. Other people probably used the function for years, but I just came about the javascript localStorage function. Would probably be great to improve performance caching large lists like this locally

    Reply
    • Fredrick, localStorage is actually fairly recent, especially in the Microsoft world. That’s definitely a tool to consider in such cases, but it wouldn’t address the initial load threshold.

      Reply
  5. Marc, I agree with your last paragraph, and also adhere to the OneList rule. Also, that’s an insane amount of data, even with the clever .txt workaround.

    Two techniques I have used in the past to address such scenarios:
    – standard cascading selection: make the user select a country or state first, then only retrieve the filtered cities.
    – let the user type the fist character (or maybe two), then only retrieve the matching cities. That’s what search boxes usually do (e.g. Google).

    Reply
    • Christophe:

      I tried to be smart about only fetching the data I needed, but I still ran into the threshold limits. Unless you have the column you are filtering on indexed, you hit the threshold. Thus my thrashing around on this.

      M.

      Reply
      • I must be missing something… Are you saying that even after applying a filter to the 19,000 items you are still returning more than 5,000?
        My point here is that it would be useless to return 5,001 items even if technically this was possible. Whatever method you rely on, an autocomplete will only make sense from a UX standpoint if it displays a maximum of 10-20 items.

        Reply
          • Paul,
            To follow up on our conversation, I just did a test to confirm that GetListItems works fine on views (viewName parameter).
            In this scenario, you could create 26 views (cities starting with A, B, etc.), and after the user types the first letter only retrieve the relevant view. 26 views might sound like a lot, but I find it a better practice than pulling the whole list on the client side. If needed, you could make it work with just 5-6 views.
            For the record, another technique I often use is page scraping. Not an option regular developers will consider, as it takes some outside of the box thinking to realize that page scraping can be more efficient than SOAP or REST for SharePoint lists ;-)

            Reply
  6. Would it make sense to chuck the data into a SQL table and then use the BCS to pull it back and potentially for edits. We use something similar (without edits) to pull Clients and Projects from our Practice Management system (~40K Clients and >100K Projects) and then use jQuery against the BCS for autocomplete.

    Reply
  7. Marc,
    For onboarding lots of data to SharePoint Online … MS Access works well for thousands of rows – I’ve used it to populate lists I have already created and indexed. I was looking for lots of rows to populate and test the list view threshold and found US census data by state delivered in Excel. I open my target list with MS Access, ensuring my fields are mapped and the ID column is not in the Access table, then copy the Excel rows directly to Access. Depending on your hardware 5000 rows only takes a couple of minutes.

    Something to try – *pieced together from research – and I still need to test it further*– Queries to large lists (with CAML)… SP does not execute the query all at once – it steps through. So if the first filter does not return less than 5000 items it will error. consistently throws errors … and a single argument doesn’t work. Two ‘s where the first returns less than 5000 and the second gets you closer to a goal of 20 – 30 items. Limit return items by itself will not cure the problem. Oh, and if the filtered column is not indexed … no joy!

    Reply
  8. I have been wracking my brain (and the internet) trying to figure something like this out.

    Our scenario is we need to reference a list of 26,000+ customers all throughout our O365 environment. Managed metadata seems like it will be too manually intensive to keep up to date. SharePoint list and BCS throttling will make it unusable.

    I am currently investigating using the JQuery Data Tables library (https://datatables.net/release-datatables/examples/api/form.html) and using ajax to import a text file that will contain a unique ID, customer name, and some other details. I created a custom Edit form for the document library, added the custom code, and made the Customer input fields uneditable. Users can select the customer from the Data Table and then using JQuery will add the details into the appropriate SharePoint single text fields.

    The Data Table greatly helps with the “quick sorting” ability of the massive list. Performance wise, loading all of that data is only taking a couple of seconds (a reasonable timeframe given the customer requirements, and seemingly no better alternative).

    The only downside of course is there is no “synchronization” of the data, so if a customer name changes, how to make those changes automatically….

    Reply

Have a thought or opinion?