Updated 05 October 2021
We're aware of some transient problems that a subset of customers are currently experiencing with Cloud Drive Mapper, where the contents of their drives temporarily become invisible or where they see transient errors when trying to save or create files in their mapped drives.
Here is a summary of the issue:
We are aware of some transient issues with drives mapped to Office 365 that can temporarily result in the contents of the drive appearing invisible. This can be resolved by the user right-clicking in the File Explorer window and hitting 'Refresh', or hitting F5 on their keyboard. The underlying cause of this behaviour is the drive's WebDAV requests receiving a 503 'service unavailable' response from SharePoint Online, which is something that Office 365 responds with when it is under serious load. In the very short term, users can resolve this themselves by refreshing File Explorer, but we know this is far from ideal for anything more than a very brief stop-gap. Another thing you can do is update to the latest version of Cloud Drive Mapper (2.8.x), because we have significantly reduced the amount of traffic it sends to Office 365. This is not only likely to help mitigate the problem a bit in your own tenancy, but it is also a nice neighbourly thing to do because it will help reduce load across the whole SPO farm. In the meantime, while we can't control whether SharePoint Online occasionally returns 503 responses, we are looking into options to better handle the way the drives respond to 503s, to make them less disruptive to users.
The root problem isn't something Microsoft can "solve" precisely, because the problem is that there are passing periods of time where too many people are using SharePoint at the same time. The only thing MS could do would be to build more server farms, which is possible but is certainly not a quick fix. However, from past experience, we strongly believe the prevalence of the problem is linked to universities returning, which has historically had a pretty big impact on cloud services. As such, we hope that these issues will resolve themselves, or at least become significantly less common, within a few days' time.
This problem isn't just caused by people going to SharePoint drives or websites to access files; it's any activity at all in SPO, including permission updates, group membership updates, user provisioning and any kind of automated process. These can be especially high at this time of year, as all the students are being provisioned into new sites, Teams etc. Collectively they cause massive spikes in SPO server CPU load, which is where the 503s come from.
What we're doing about it
Firstly, we recommend upgrading to 2.8 (if you haven't already done so) but there are some considerations around this detailed here - https://www.iamcloudstatus.com/incidents/szmx92f542f0. This may help reduce the problem because it significantly reduces the load to SharePoint. So while it will be good for your organization (albeit not a complete fix necessarily), it is also a good thing to do overall, because the more organizations that reduce their activity with SharePoint, the less likely anyone is to receive 503s. With this problem, every little helps.
However, we're also working on some solutions to stop the 503s from having such a significant impact on the drives in the first place. This is our top priority right now.
Here is the crux of the problem. Native WebDAV's retry logic only allows 1 retry attempt. This cannot be modified.
So if the calls to O365 looked like this: 503, 200, 503, 200, 200, 200, 200, 503, 200 etc, you would be fine.
But if the calls to O365 went like this: 503, 200, 503, 200, 200, 503, 503 (two 503s in a row), this results in a WebDAV failure.
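The two sequences above can be sketched in a few lines of Python. This is only an illustrative model of the single-retry behaviour described here, not the actual WebDAV client code; the function name first_failure and the max_retries parameter are our own for the example.

```python
def first_failure(statuses, max_retries=1):
    """Return the index of the response at which a request fails outright,
    or None if every request eventually succeeds.

    Illustrative model only: each 503 triggers a retry, and a request fails
    once it has used up its retries (max_retries=1 mirrors the single retry
    described above).
    """
    consecutive_503s = 0
    for index, status in enumerate(statuses):
        if status == 503:
            consecutive_503s += 1
            if consecutive_503s > max_retries:
                return index  # the retry also got a 503, so the call fails
        else:
            consecutive_503s = 0  # a success resets the retry budget
    return None


# Isolated 503s are absorbed by the single retry.
fine = [503, 200, 503, 200, 200, 200, 200, 503, 200]
# Two 503s in a row exhaust the single retry.
broken = [503, 200, 503, 200, 200, 503, 503]

print(first_failure(fine))
print(first_failure(broken))
```

In this model the first sequence never fails, while the second fails at the final response, because the retry that follows the sixth response also receives a 503.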
We're working on a solution, which will involve the introduction of a virtual file system to act as middleware between the drive and the WebDAV connection. This is hybridising technology we've already created as part of our V3 release, so we already have most of the technology we need to make this happen.
What the middleware virtual file system allows us to do is intelligently handle retries on 503s. This means that we can prevent the drives from ever failing. The VFS also allows us to add some extra temporary caching in place to further reduce the amount of traffic going to Office 365 in the first place.
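As a rough illustration of what handling retries in a middleware layer could look like, here is a minimal Python sketch. It is not the Cloud Drive Mapper implementation: the function fetch_with_retries, its parameters, and the backoff values are all assumptions made for the example, and the exact architecture is still being decided.

```python
import time


def fetch_with_retries(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns something other than a 503.

    Hypothetical sketch: send is any callable returning (status, body).
    Between attempts we wait with exponential backoff (0.5s, 1s, 2s, ...)
    so that retries do not add to the load on an already busy service.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Every attempt returned 503; hand the last response back to the caller.
    return status, body


# Example: two 503s in a row, then a success. A single-retry client would
# fail here, whereas this loop keeps going.
responses = iter([(503, ""), (503, ""), (200, "file contents")])
print(fetch_with_retries(lambda: next(responses), sleep=lambda seconds: None))
```

The point of the sketch is simply that a layer sitting in front of the WebDAV connection can choose its own retry count and delay, rather than being limited to one retry.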
This will make a massive improvement to the stability and performance of the drives. We still have a bit more testing to do before we take the plunge on the exact architecture of the solution, but we're hopeful of having a better method to handle this problem in the coming weeks.
In the meantime, we do believe the problem will lessen significantly as we move further into October, and as a worst case, users are able to resolve the problem themselves by right-clicking and choosing Refresh inside File Explorer, or by pressing F5. But we appreciate this is not ideal. We'll keep you updated on the progress of our solution, which will make CDM significantly stronger in general and bring it another step closer to our V3 client, which doesn't use WebDAV at all.
If you've got any questions, please contact email@example.com