Microsoft Flow: Updating File Metadata When You Have More Than 5000 Items

3 minute read

It’s been way too long since I did a blog post. Let’s remedy that, shall we?

Paul Culmsee (@paulculmsee) has been doing some great work with Microsoft Flow and PowerApps. His daughter Ashlee even won a contest put on by the PowerApps team to make a fidget spinner app! When Paul digs into a technology, stand back, because he beats it into submission like few others. If there is a Clever Workaround – and we need those way too often – he’ll find it.

<UPDATE>

Paul’s latest post on this topic shows that we now can get the ID of the newly created item from the Create File step, which we can use with the newer SharePoint – Update File Properties Flow action. Thank goodness my tricks here are no longer necessary!  The (new current) best way to update SharePoint document metadata using Flow

</UPDATE>

Paul did a post back in August I’ve referred to multiple times as I’ve tried to use Microsoft Flow. My approach with Flow here is that I try to use it for anything new I need to automate in Office 365. If I hit a roadblock, then I revert to SharePoint Designer workflows. The good news is that I’m finding I revert less and less, though it still happens maybe a third of the time at this point. (I’m guessing, but that’s about what it feels like. I would love it to be 0%!) Unfortunately, that means rework down the road if we want to be fully “modern”.

Paul’s article The (currently) best way to update SharePoint document metadata using Flow shows us the unfortunately byzantine way we need to do things in order to update metadata on a file we’ve just created. The fact that we don’t get a useful “handle” back to the file we just created is sorta crazy, and the Microsoft Flow team knows this. In the meantime, Paul’s approach works. Well, it works unless you have more than 5000 items in a library.

I had exactly that situation. At one of my clients, we have an automated process which dumps files into a library I’ve inventively called __Temp. When a new file shows up, I run a Flow to copy the file into a place where we want to store it and apply metadata to the copied file. My problem arose when I tried Paul’s second step. Because my library has way over 5000 items, that step failed.

So, until we are able to get the ID or some other useful handle for the newly created file, I needed to come up with a cleaver workaround on top of Paul’s clever workaround.

The trick was to do the following:

  • Add an index to the Modified column in the destination list.
  • Set up a TimeStamp variable in my Flow to capture the startOfDay() when the Flow is running
  • In the GetItems step, use a filter based on the Modified column, and *then* the FileLeafRef to find the file I just created

Here it is in pictures…

Make sure you have an index on the Modified column in the target list. In my case, I have indices on several other columns as well.

In the flow, set up a variable to hold startOfDay(utcnow()) . You set this up in the Expression Builder. I initially tried to set the variable just before I created the new file, setting it to utcnow() . In theory, this should have worked, but I found I had about a 50% failure rate. This doesn’t make a lot of sense, but to be safe, I switched to startOfDay() , which has proved reliable.

Just like Paul’s example, I then create the new file.

In the GetItems step, I use the more complex filter:  Modified gt datetime'TimeStamp​​' and FileLeafRef eq 'Name' Because Modified is an indexed column, and it is the first filter in my expression, the result set is reduced to fewer than 5000 items and the query works.

Now everything works reliably. I can’t wait to reduce the complexity of this Flow when we get a handle to the newly created file in the Create File step!

Aren’t you proud of me? I wrote this whole post without whining about the 5000 item limit.

Advertisements

5 Comments

  1. Flow’s SharePoint actions continues to get improved. There is a new Update File (Metadata) action that lets us update “list fields” on a File.

    But there’s some confusion over what is the Identifier field of these new Update File Actions – they actually use the URL, not the List Item ID.

    So we now have two imperfect ways to update list fields after adding or copying files:

    Add/Copy, Query List Item ID, Update List Item (via list item ID)

    or

    Add/Copy, Update File Metadata (via assumed File URL)

    Neither are perfect. Also, when using these actions such as copying to Drop Library, Document Set or Asset Library (anything that has its own event receivers that would create subfolders and moves files) and such – the copy action can actually return ‘fail’ even though the file is uploaded anyway. Sometimes need to wrap those inside a ‘catch errors – ignore’ scope block.

    Still new, somewhat buggy, but things are moving and changing a lot, so there’s still a lot of fun.

    Reply
  2. Hello Marc, I’m currently testing the 5000 limit with the SharePoint Flow connector. It looks like when I add a document in a Huge doc lib (the source doc lib), the Flow connector doesn’t see my new document. In the workflow history I can see the Flows runs are skipped.The trigger doesn’t really work anymore, so it is not what you had ?

    Reply
    • @Serge:

      My situation here was the reverse: files are dropped into a temporary Document Library with very few – if any – documents in it and I copy them where they need to go with Flow and set some metadata.

      It sounds like you’re seeing a failure on the Flow trigger when you have 5000+ items in a Document Library? Is it the “When an item is created” trigger?

      M.

      Reply

Have a thought or opinion?