Yes, Virginia, You Can Get More than 5000 SharePoint Items with REST

If you haven’t been paying any attention, you might not know that I loathe the 5000 item limit in SharePoint. I may have mentioned it here and here and here and a bunch of other places, but I’m not sure. But given it’s there, we often need to work around it.

No 5000 item limit

I’ve written this little function before, but alas today I needed it again but couldn’t find any previous incarnations. That’s what this blog is for, though: to remind me of what I’ve already done!

In this case, I’m working with AngularJS 1.x. We have several lists that are nearing the 5000 item level, and I don’t want my code to start failing when we get there. We can get as many items as we want if we make repeated calls to the same REST endpoint, watching for the __next link in the results. That __next link tells us that we haven’t gotten all of the items yet, and provides us with the link to get the next batch, based on how we’ve set top in the request.

Here’s an example. Suppose I want to get all the items from a list which contains all of the ZIP codes in the USA. I just checked, and that’s 27782 items. That’s definitely enough to make SharePoint angry at us, what with that 5000 item limit and all. Let’s not get into an argument about whether I need them all or not. Let’s just agree that I do. It’s an example, after all.

Well, if we set up our requests right, and use my handy-dandy recursive function below, we can get all of the items. First, let’s look at the setup. It should look pretty similar to anything you’ve done in an AngularJS service. I set up the request, and the success and error handlers just like I always do. Note I’m asking for the top 5000 items, using "&$top=5000" in my REST call.

If there are fewer than 5000 items, then we don’t have a problem; the base request would return them all. Line 32 is what would do that “normal” call. Instead, I call my recursive function below, passing in the request only, even though the function can take two more arguments: results and deferred.

The recursive function simply keeps calling itself whenever it sees that the __next attribute of the response is present, signifying there is more data to fetch. It concatenates the data into a single array as it goes. In my example, there would be 6 calls to the REST endpoint because there are 27782 / 5000 = 5.5564 “chunks” of items.

Image from https://s-media-cache-ak0.pinimg.com/564x/16/bb/03/16bb034cb3b6b0bdc66d81f47a95a59f.jpg

Image from https://www.pinterest.com/pin/323062973239383839/

NOW, before a bunch of people get all angry at me about this, the Stan Lee rule applies here. If you have tens of thousands of items and you decide to load them all, don’t blame me if it takes a while. All those request can take a lot of time. This also isn’t just a get of of jail free card. I’m posilutely certain that if we misuse this all over the place, the data police at Microsoft will shut us down.

In actual fact, the multiple calls will be separated by short periods of time to us, which are almost eternities to today’s high-powered servers. In some cases, you might even find that batches of fewer than 5000 items may be *faster* for you.

In any case, don’t just do this willy-nilly. Also understand that my approach here isn’t as great at handling errors as the base case. Feel free to improve on it and post your improvements in the comments!

Retrieving the Content Type in a REST Call in SharePoint 2013 On Premises

Content Type

I’m working in a SharePoint 2013 on premises installation, and I needed to get the ContentType back from a REST call. Content Type is often the most important piece of metadata in a list or library, with the detailed metadata for the Content Type coming in as secondary. It certainly matters which Content Type you have if you’re building custom UIs like I am.

On SharePoint Online, this works:

/_api/web/lists/getbytitle('MyList')/items?
$select=ContentType/Name&$expand=ContentType

But on premises, it doesn’t. Nothing comes back for the ContentType at all.

I found this great little tip from eko buried in a StackExchange comment – with no up votes. Up votes matter, people, so make sure you use them.

Simply add ContentTypeId to the $select and you’re in business:

/_api/web/lists/getbytitle('MyList')/items?
$select=ContentTypeId,ContentType/Name&$expand=ContentType

We shouldn’t need to do this sort of workaround, but it’s great when we can!

Making Your REST Calls Simpler by Changing the Metadata Setting

When you talk to SharePoint using REST, you have some options about what the data you get or send looks like. Early on, when REST first arrived on the scene for SharePoint, we HAD to specify odata=verbose. This meant that we had to be very specific about what our data was going to look like, especially when it came to updates.

If you don’t use odata=verbose, then you’re telling the server to accept something different. You’re basically telling the server what “dialect” of odata you are choosing to speak for the current transaction.

If you look at this post, you’ll see the options:

JSON Light support in REST SharePoint API released

Just to recap, the options are:

accept: application/json; odata=verbose

This is probably what you’re used to, and requires the heaviest payload. Using this option, you’ll get the maximum amount of information about the data coming your way, and you’ll also have to send more as well.

accept: application/json; odata=minimalmetadata

This is the middle ground. You’ll get some metdata, and you’ll also need to send some, but it’s less.

accept: application/json; odata=nometadata

This option means we won’t get any metadata and we also don’t have to send any.

accept: application/json

If you don’t specify the odata setting at all, it will default to odata=minimalmetadata.

You can use these settings both on the inbound and outbound part of your REST calls.

Content-Type: "accept: application/json; odata=minimalmetadata"

means that you are sending data that contains minimal metadata.

Likewise,

Accept: "accept: application/json; odata=verbose"

means that you want to receive data with full metadata information.

In many cases, you’ll want to use odata=verbose as you’re debugging and switch to odata=nometadata once you move into production. The metadata tells you a lot about the data coming and going and can be helpful as you build up your calls. If you don’t make that switch, there will be more data going down the wire, though in many cases that doesn’t matter too much. Since I learned to code back when nibbles were expensive storage, I tend to want to reduce data size as much as I can, though.

Unfortunately, I don’t think many SharePoint developers realize what all this can do for you. I hear people say all the time that REST calls are too “chatty”. In many cases, that’s simply not true; you developers are making things too chatty!

This allows you to switch from something like:

var data = {
  "__metadata": {
    "type": "SP.Data.FC_x0020_RunsListItem"
  },
  "Title": run.RunID,
  "RunDate": run.RunDate,
  "OperatorId": run.Operator.Id,
  "DataIDs": dataIds.join(","),
  "FCData": angular.toJson(run.samples)
};

var request = {
  method: 'POST',
  url: url,
  data: JSON.stringify(data),
  headers: {
    "Accept": "application/json; odata=verbose",
    "content-type": "application/json;odata=verbose",
    "X-RequestDigest": document.getElementById("__REQUESTDIGEST").value,
    "X-HTTP-Method": method,
    "IF-MATCH": "*"
  }
};

to:

var data = {
  "Title": run.RunID,
  "RunDate": run.RunDate,
  "OperatorId": run.Operator.Id,
  "DataIDs": dataIds.join(","),
  "FCData": angular.toJson(run.samples)
};

var request = {
  method: 'POST',
  url: url,
  data: JSON.stringify(data),
  headers: {
    "Accept": "application/json",
    "content-type": "application/json;odata=nometadata",
    "X-RequestDigest": document.getElementById("__REQUESTDIGEST").value,
    "X-HTTP-Method": method,
    "IF-MATCH": "*"
  }
};

This is a real example from some of my code in a client project. It may not look like a huge savings, but if you are passing a lot of data either way, it can make a difference.


UPDATE on 3 May: Tip from Mikael Svenson (@mikaelsvenson) – If you’re running on premises SharePoint 2013, you may need to enable the multiple metadata formats for JSON. See:

Getting the Reply Count for a SharePoint Discussion Using the REST API

The other day I was creating a custom UI for a SharePoint 2013 Discussion list. To me, the out-of-the-box UI in SharePoint 2013 is definitely a step forward from previous versions. It’s still pretty rudimentary, though, and my client wanted something “more like other forums”.

SharePoint’s Discussion lists are sort of like Document Sets, in that the original post is a Discussion Content Type which inherits from Folder and the replies are the Message Content Type, which inherits from Item. So there aren’t any Documents involved, but Discussions are once again glorified Folders.

Since I’m using KnockoutJS on this project, I can make the UI look like pretty much anything. One thing we wanted to display was a reply count per thread. This seemed easy enough, but it could get expensive to retrieve all of the replies just to count them for the UI. Unfortunately, there was no obvious column (like “Replies” maybe?) to give me the answer.

Trolling around SharePoint StackExchange, I ran across a post by my friend Rothrock entitled How to get ItemChildCount of DocumentSet (folder) using REST Api asking a pretty similar question, but in his case it was about Document Sets. There ought to be a field like ItemChildCount or something, but he couldn’t find it, either. We’re used to a field like that using SPServices and SOAP, but there didn’t seem to be anything analogous in REST.

Luckily, someone who goes by ECM4D answered Rothrock with this example. (There were typos, but this was the idea.)

/_api/web/lists/getbytitle('your_list')/items?
  $select=ID,Title,Folder/ItemCount
  &$expand=Folder/ItemCount
  &$filter=FSObjType eq 1

By expanding the Folder, we can get at an ItemCount easily, just as we’d like to expect. This is yet another example where the documentation for the REST services simple doesn’t go deep enough to help us. Because much of the documentation is example-based – and this isn’t in any of the examples – we’re out of luck.

I ended up with something pretty similar, and it works great. Note that the ItemCount includes *all* replies, which means replies to the original post as well as replies to replies. If you wanted just replies to the original post, you’d need some other method.

MyProject.Promises.Discussions = $.ajax({
  url: _spPageContextInfo.webAbsoluteUrl +
    "/_api/web/lists/getbytitle('Discussion')/items?" +
    "$select=ID,Title,FileRef,IsFeatured,Created,Author/Title,Folder/ItemCount" +
    "&$expand=Author,Folder",
  method: "GET",
  headers: {
    "Accept": "application/json; odata=verbose"
  }
});

This call – using jQuery’s $.ajax function – gets me the basic info about the Discussion items I need:

  • ID and FileRef let me provide links for the user to use to go deeper into the threads
  • Title is the Subject of the original post
  • IsFeatured tells me if it’s a featured post so that I can highlight it in some way
  • Created and Author tell me who started the thread and when
  • Folder/ItemCount is that mysterious count of replies I was looking for

All pretty easy, really, once I found that thread on SharePoint StackExchange. One of the best ways to learn the ins and out of the REST endpoints is to troll those public forums. Be forewarned, though: you’re just as likely to find something that doesn’t work as something that does.

Get all SharePoint Document Library Files and Folders at a ServerRelativeUrl in One REST Call

Recently, I was building a directory tree view of a Document Library for a client. Yes, you can say this shouldn’t be necessary. We can just tag the documents with metadata and we won’t need folders at all. Unfortunately, that’s not always the way people want to work.

To build this, I started by calling the _api/Web/Lists/getbytitle('Document Library Name')/items endpoint. I figured I’d just get all the documents in the library and sort out the display from the array. I got this working pretty well and in my test environment with a few hundred documents, it worked great.

Then – boom. There was a little requirement I didn’t know about: quite a few of the Document Libraries where we wanted to use this had more than 5000 documents. I stupidly hadn’t thought of that possibility. In my experience, it’s pretty unusual to see Document Libraries with that many documents, though it definitely happens.

Why does 5000 documents matter? Well, as of SharePoint 2010, we went from being able to request as many items as we wanted to an item limit of 5000. I’ll admit it’s not usually a great idea to request that many items from the client, but sometimes we need to.

I had two choices:

  • Add some paging logic to the call. This would mean that if there were more than 5000 items, I’d simply make Math.ceil(documentCount / 5000) calls to get them. In smaller Document Libraries, it would still be one call; as the number of documents went up, it would take more calls.
  • Be smart about it. Just request the objects (files and folders) in the root of the Document Library, and then on-demand, only request what I needed as the user expanded each folder.

The former would have been a little easier, but with larger libraries those calls could get pretty slow. The latter idea was really the “enterprise” way to do it. The problem was that the _api/Web/Lists/getbytitle('Document Library Name')/items endpoint didn’t really give me the right info to do it well.

So I turned to a different endpoint: _api/Web/GetFolderByServerRelativeUrl('folderRelativeUrl'). This is a newer endpoint and is designed for doing stuff like this. We can pass in the relative URL – maybe something like “/sites/SiteA/SubSiteA/LibraryName/TopFolder/SubFolderA/SubFolderB” – and get back just the files and folders for that relative path.

It takes two calls for this, though:

  • _api/Web/GetFolderByServerRelativeUrl('folderRelativeUrl')/Folders
  • _api/Web/GetFolderByServerRelativeUrl('folderRelativeUrl')/Files

That would work, but it seemed a bit inefficient. Wouldn’t it be better to get the files and folders at the same time?

Off I went to Bingle. Luckily, I found a post on SharePoint StackExchange pretty quickly from jkr asking the same thing: Get all Files and Folders in one call. Vadim Gremyachev replied with the trick.

_api/Web/GetFolderByServerRelativeUrl('folderRelativeUrl')?$expand=Folders,Files

With this one call, we can get the info about the file and the folders together in one complex object.

Figure 1: Complex object returned from _api/Web/GetFileByServerRelativeUrl()/$expand=Files,Folders

Figure 1: Complex object returned from _api/Web/GetFileByServerRelativeUrl()/$expand=Files,Folders

As I said, this endpoint is perfect for building something like a directory tree.

There’s not a lot of good documentation for this endpoint (surprise!). You can find some examples of calls on the MSDN page Files and folders REST API reference, but no examples of the results. If you download the SharePoint 2013 REST Syntax (wall posters) you get some more clues.

The Files result provides results like those shown in Figure 2. As far as I can tell, there’s no way to control what fields you get back, as using $select has no effect.

Figure 2: Complex object returned from _api/Web/GetFileByServerRelativeUrl()/Files

Figure 2: Complex object returned from _api/Web/GetFileByServerRelativeUrl()/Files

The Folders result provides results like those shown in Figure 3. As far as I can tell, there’s no way to control what fields you get back here either, as using $select has no effect.

Figure 3: Complex object returned from _api/Web/GetFileByServerRelativeUrl()/Folders

Figure 3: Complex object returned from _api/Web/GetFileByServerRelativeUrl()/Folders

Note that in the Folders results, there are also Files and Folders objects, so the idea of recursion is there, though the objects are deferred. Because each folder has a ServerRelativeUrl value, you can dig as deep as you need to.

If you know you only need to go a few layers deep, you can also do things like:

_api/Web/GetFolderByServerRelativeUrl('folderRelativeUrl')?

$expand=Folders,Folders/Folders,Folders/Folders/Folders

or

_api/Web/GetFolderByServerRelativeUrl('folderRelativeUrl')?

$expand=Files,Folders/Files,Folders/Folders/Files

Both of these calls will get you three folders deep, which may be enough for some things you might want to do. I could also see using a call like these latter ones to get a bit ahead of your user to reduce the “chatter” on the line. That would make your array processing on the client side a little more complex, but could be worth it.

With some spiffy recursion in your framework of choice, you can build some very nice user interfaces with data like this. But that’s for another post…