Find this Podcast “Best Practices for Updating SharePoint Large Farm Environments” on the ThreeWill Soundcloud, Stitcher, and iTunes.


Danny:Hello and welcome to the ThreeWill Podcast. This is your host, Danny Ryan and today, I have Matthew Chestnut here with me. How you doing, Matthew?


Matthew:I’m doing well, Danny. I guess it’s time for our quarterly podcast.


Danny:It’s our quarterly pow wow here, where we spend a little bit of time together. It’s good. I get to catch up on what your recent adventures have been. From catching up with you, it sounds like you’ve been doing a lot of administrative type of work.


Matthew:Yeah, we’ve got some Sustainment customers. This goes back to the conversation we had a few podcasts ago about our Sustainment practice here at ThreeWill. I’m rolling off a particular development project that I was on and I’m about to start another one. In the meantime, there’s always something to be done for our sustainment customers. In this example, we’ve got a couple of customers who are both running SharePoint 2013. They’re completely unrelated; it’s just coincidental that they’re both running SharePoint 2013. These are in-house installations, and among some other things, like some programming and other work they need done, they wanted to get their farms up to date to a more recent patch level of SharePoint.


In the SharePoint world, they come out with patches pretty much monthly. These are known as cumulative updates. There’s a process that you use to apply these updates. Typically on a running SharePoint farm, you can always take the principle that if it isn’t broke, don’t fix it, because anytime you introduce something new, there’s a potential for a problem. If you’re the administrative type of person who reads the release notes and keeps an eye on the comings and goings of SharePoint, you’d know specifically what these various patch levels, CUs, cumulative updates, et cetera, addressed.


In this example, for one of the farms, there’s a test and staging farm and a development farm, all leading up to production. Both of these customers have implemented best practices in regards to SharePoint because they’ve got this variety of farms. In other words, they don’t just have a development machine and production. They go through the entire life cycle and they promote the application through the various stages so it can be tested.


It’s easy for these machines to get out of sync or to fall behind at an older level of SharePoint. We started with one customer. They had three levels after development: staging (test), QA and production. Everything goes great on staging and QA. They had a couple of little quirks, but as it turns out, one of the things you have to watch out for when you’re doing a SharePoint upgrade is how long it takes. Staging and QA took maybe a couple of hours. I was thinking it would only take about thirty minutes.


In my world, on my development machine, we do our own administration. With our machines, it’s easy to upgrade because our content databases are relatively small and we don’t have a whole lot of them. The upgrade works pretty quickly. I get used to that. Even though these production systems are on very good hardware, this one particular production instance had something like fifty content databases, when QA and staging only had about three.


What took thirty to forty-five minutes to complete in those smaller environments took close to three hours in the larger environment, just for this one task. Of course, when you schedule these things after-hours, those after-hours can get quite long as you’re waiting for this process to finish. There’s one thing I do when I’m monitoring this, because all you see is a screen that says, “Processing, please wait.” The challenge is, SharePoint gets to this hundred percent level and then sits there for a long, long time.


A couple of things to look at: as an administrator, always keep an eye on your ULS logs, your logging. In your SharePoint configuration, you specify a folder where the log files go. In that folder, there is the log file for general SharePoint usage, the .log file. There are also the PSCDiagnostics-type log files and the upgrade log files.


I use a little utility that just sits on the screen and monitors those files. Any time some content is written to the log file, it just displays on the screen. I’m monitoring the activity as the SharePoint process continues. That gives me a little bit of peace of mind that something is happening when it’s just sitting there spinning and seems like it’s doing nothing.
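A utility like that can be approximated in a few lines of Python. This is only a sketch, not the tool used here: the function names `read_new_lines` and `follow` are made up for illustration, the log is assumed to be UTF-8 text with newline-terminated entries, and the path you pass in would be whichever log file sits in your own configured log folder.

```python
import os
import time


def read_new_lines(path, offset, encoding="utf-8"):
    """Return (lines, new_offset) for complete lines appended since offset."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        size = handle.tell()
        if size < offset:
            # The file shrank (truncated or replaced), so start from the top.
            offset = 0
        handle.seek(offset)
        data = handle.read()
    # Consume only complete lines; a partial last line waits for the next poll.
    end = data.rfind(b"\n")
    if end == -1:
        return [], offset
    complete = data[: end + 1]
    lines = complete.decode(encoding, errors="replace").splitlines()
    return lines, offset + len(complete)


def follow(path, polls, interval=1.0):
    """Print existing content, then anything appended, for a fixed number of polls."""
    offset = 0
    for _ in range(polls):
        lines, offset = read_new_lines(path, offset)
        for line in lines:
            print(line)
        time.sleep(interval)
```

Calling `follow(path, polls=3600)` would watch one file for roughly an hour at the default one-second interval. Polling by byte offset keeps the script from holding the file open between reads, which seemed the safer choice for a log that another process is actively writing.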


I noticed there was a certain pattern happening that as it went through the various content databases, it would take anywhere from a minute to two minutes to do each one. Then it got stuck on this one, and ended up taking about forty-five minutes to do this one. When you take all these little two minutes, three minutes, five minutes, ten minutes, forty-five minutes, it starts adding up.
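To see how quickly that adds up, here is a rough back-of-the-envelope estimate in Python. The figures come from this story (about fifty content databases, one to two minutes for a typical one, forty-five minutes for the slow one); the function name `estimate_upgrade_minutes` is invented for the example, and real timings will vary from farm to farm.

```python
def estimate_upgrade_minutes(database_count, typical_range, slow_databases):
    """Return (low, high) total minutes for upgrading all content databases.

    typical_range: (min, max) minutes for an ordinary database.
    slow_databases: minutes for each database known to run long.
    """
    ordinary = database_count - len(slow_databases)
    low = ordinary * typical_range[0] + sum(slow_databases)
    high = ordinary * typical_range[1] + sum(slow_databases)
    return low, high


low, high = estimate_upgrade_minutes(50, (1, 2), [45])
print(f"Estimated window: {low} to {high} minutes")
```

With those numbers the estimate comes out at 94 to 143 minutes, and that is before counting the databases that took three, five, or ten minutes each.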


Just as a recommendation, when you’re doing your upgrades, be certain of course to try it on staging first and QA first before you dive into production. Always keep in mind that the log files are your friend. Sometimes the stuff in there is cryptic, but just the fact that something is being written to those log files, gives you peace of mind that something is happening.


Also, don’t be concerned if you see some really weird stuff that looks like error messages because you got to remember, these are developers that are writing these log files and it’s typically for other developers. Or perhaps, even for their use. Some of the messages may look scary, but it’s okay. In other words, don’t get concerned if you see these weird messages because it’s probably progressing. If it fails, it will tell you that it fails and it will give you some error messages there that will help guide you to what the root cause of the problem is.


Danny:Now while you’re doing these upgrades, like on production, is the site down for that period of time? What happens when people try to access the SharePoint site?


Matthew:Now, there’s a commonsense question from a guy who just wants to use SharePoint, doesn’t want to be bothered with all this upgrade stuff. In essence, you’re pretty much down for a certain window. In the upgrade process with SharePoint, there’s basically like three major steps. You install the binaries which is the actual upgrade package itself. This includes the cumulative updates. It’s just the standard exe that you run on the server. It extracts and writes the binaries in the appropriate locations et cetera.


Typically, when that gets done, you have to reboot. Obviously, when you reboot, that’s going to take the farm down. Then you have to do what’s known as the PSConfig step, or run the configuration wizard. In this example, one customer had two servers per farm. They had an application server and a web front end to help spread the load. That meant the binary update had to be done in two different locations, the reboot had to be done twice, and the PSConfig step, the configuration of the upgrade, had to be done twice.


Some of these you can do concurrently. The binaries you can do concurrently. The reboots you can do concurrently. But for the actual configuration step, I think it’s recommended you do it on the application server first, the server that has most of the stuff on it. That’s the one that would take a long time because it’s updating the content databases. It’s updating the schema. It’s updating the registry and all that kind of stuff as it goes through it. Then the web front end is usually pretty quick. By pretty quick, I mean it could be five to ten minutes for those to get done, where it may have taken seven hours for the other one to get completed.
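The ordering described here can be written down as a small checklist generator. This is just a planning sketch in Python: it does not touch SharePoint at all, the server names `APP01` and `WFE01` are hypothetical, and `plan_patch_run` is a name made up for the example.

```python
def plan_patch_run(app_server, web_front_ends):
    """Return the patch phases in order as (step, servers, note) tuples."""
    front_ends = list(web_front_ends)
    everyone = [app_server] + front_ends
    return [
        ("install binaries", everyone, "can run concurrently"),
        ("reboot", everyone, "can run concurrently"),
        ("run configuration", [app_server], "first; updates content databases"),
        ("run configuration", front_ends, "after the application server"),
    ]


for step, servers, note in plan_patch_run("APP01", ["WFE01"]):
    print(f"{step:<18} {', '.join(servers):<14} {note}")
```

Keeping the application server in its own phase makes the long-running step visible when you schedule the outage window.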


There are some techniques for high availability. In this one example, it was production, but it was an intranet site, so they could deal with an outage overnight. That’s why they didn’t mind it being out from 8:00 PM to 3:00 in the morning. In a true production environment where you really, really want to minimize your down time, there is this concept of a high availability type upgrade. Basically what you’re doing is cloning the farm. You’re decommissioning the old farm so that you can update it while the users are using this clone farm. Then once the upgrade is done on the original farm that you really care about, you switch back.


That goes beyond my pay grade. I’m just a developer guy and I know administration because I use common sense and I read the manuals and I follow the steps. There’s definitely guidance from SharePoint in regards to this high availability concept. For customers that depend on SharePoint so much that they can’t have an outage of a few hours, that is a route you may want to consider.


Danny:You’re upgrading obviously to SharePoint 2016?


Matthew:Actually, this was a 2013 in-place upgrade. It was an upgrade in place.


Danny:It was upgrading to …


Matthew:Upgrading in place. In other words, they were running SharePoint 2013 already but SharePoint comes out with these cumulative updates periodically, and they wanted something more recent.


Danny:You’re just patching it with … Okay.


Matthew:It gives people peace of mind if they’re running something more current and it makes sense.


Danny:Got you. So in both of those cases, it was the same thing? You’re not updating to SharePoint 2016?


Matthew:That’s correct. Even though you’re still within the same product family, sometimes you get some new features. SharePoint will add some features and capabilities during their life cycle, but in this example, it was just getting new binaries. Then there’s one other thing I wanted to point out with this other customer whose upgrade I was doing. This was a “simple” upgrade. I was still working in the staging or test environment, and it just happened to be a single farm and it was non-production, so I could have full control.


What I mean by full control is sometimes I have to work with system administrators to schedule a reboot. In other words, I just can’t reboot the machine. I got to tell them it’s going to reboot or they need to reboot it, because they’re monitoring these systems for uptime and stuff. Here I am, I’ve got my own environment, everything’s going good, and the installation fails with the very helpful message of, “Your installation has failed. Period.”


After a lot of head scratching, et cetera, I looked at the log files, because I practice what I preach. I reviewed the log files and they gave me some clues. Like I mentioned, these clues sometimes aren’t very helpful because they’re developer type messages. It turns out that as I was watching the upgrade in progress, I was also watching the disk space. This particular server had five or six disks. Sure enough, the C drive, which of course is where your operating system is installed, and which always gets filled with operating system updates, was getting below one gigabyte of free disk space.


Now, you’d think one gigabyte is a huge amount. These days, it’s not. It turns out I had to clean out some temporary files. I had to move the temporary directory to another drive, basically to get about three to four gigs of free space. Once I did that, then the actual binary upgrade portion worked as expected.
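A check like this is easy to run before kicking off the binaries. The sketch below uses Python’s standard `shutil.disk_usage`; the four gigabyte threshold is only an assumption taken from the “three to four gigs” that turned out to be enough in this case, not a documented SharePoint requirement, and `check_drive` is a made-up helper name.

```python
import shutil

GIB = 1024 ** 3  # bytes in one gibibyte


def free_gib(path):
    """Free space, in GiB, on the drive that holds path."""
    return shutil.disk_usage(path).free / GIB


def check_drive(path, required_gib=4.0):
    """Print a one-line report and return True if enough space is free."""
    free = free_gib(path)
    ok = free >= required_gib
    status = "OK" if ok else "LOW"
    print(f"{path}: {free:.1f} GiB free, {required_gib:.1f} GiB wanted [{status}]")
    return ok
```

On the server in this story, that would mean something like `check_drive("C:\\")` before starting, and again for whichever drive holds the temporary directory.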


Unfortunately, the message I got, installation failed, period, wasn’t very useful. It is a bit frustrating. We, as SharePoint experts, still get frustrated with this product. Why couldn’t they come up with just a little bit of hint that says, “Hey, take a look at this disk space. It looks pretty low to me.”


Danny:Yeah. With some installations, some software, it’ll check it before you …




Danny:Check to make sure there’s enough space before it actually does the installation.


Matthew:That was a little frustrating, and we collaborate amongst ourselves here at ThreeWill. We have our own intranet, so of course I can put that up on the bulletin board, if you would, so that every other person could see it. They commiserate with me, my other coworkers who have done similar things and run across, maybe not the exact same issue, but similar issues. It is interesting. It is good that we do collaborate like that, because kernels of knowledge like that do stick in the back of your mind. You’ll just remember that, “Oh, if I’m doing an upgrade, let me make sure I have enough free space on my C drive so I don’t run into a problem later.”


Danny:Yeah, absolutely. Now are you starting off on a new project now? Or what are you up to recently?


Matthew:Yeah, we’ve got another set of sustainment projects that are coming. In the sustainment world, we have what I consider one-off tickets. They might be a simple task that takes five minutes, ten minutes, an hour, a couple hours, but we also have miniature projects. We’ve got a miniature project coming up where a customer is transitioning from SharePoint 2010 to SharePoint 2013, and we have some applications that we have done for them and it’s time for them to migrate to this new platform, to 2013.


Now, in the SharePoint world, that in and of itself is pretty straightforward. We do that quite often internally just to test and make certain, and we can tell our customer, “Hey, we just took your application that was on SharePoint 2010. We installed it locally and ran it. Everything looks good.” The snafu here is that the IT group that governs this SharePoint installation has decided that certain techniques and technologies that we used from SharePoint are no longer supported by them from this time forward. In other words, we have to re-architect a few of the things that we did.


That in and of itself doesn’t seem like a big thing, but unfortunately, these things were really melded into the application. Things like event receivers and sandbox processes. All things that were best practices three or four years ago, but there’s always a better way to do it, and they want to go with the new, better way. Which is fine, but that’s the project. We’re going to take the applications that are running fine today. We’re going to move them to the new environment and we’re going to change the architecture enough. In essence, it future-proofs them. It takes away some of the dependencies on technology that SharePoint provided, but it does require some additional work.


Danny:That probably gets them a step closer to either moving into the cloud or to SharePoint 2016 as well.


Matthew:Absolutely. Once again, spoken like a true businessman; you’re looking at the big picture. Customers don’t always realize that, so we say, “Well, granted we have to do this work, but the benefit of doing this work is if you want to go to the cloud in the future, you’re that much closer to doing so.” Back in the old days of SharePoint, just a few years ago, farm solutions were the way to go. You’d write code. It would run on the farm. It would do everything you wanted it to do.


If you wanted to take that code and run it in SharePoint Online in the cloud, it’s not going to happen. We start decoupling the application code from SharePoint. We start using some of the more modern technologies and frameworks to help make this code more portable so it can run in-house or in the cloud. That’ll be more cloud-ready than it is now. Whether or not the customer chooses to go to the cloud is certainly up to them, but the application will be ready.


Danny:Great. Thank you so much for these updates and for spending time with me quarterly. Good luck on the upcoming projects that you’ve got going on and thank you so much for all your hard work. Thank you for staying up to 3:00 in the morning here when you did. Thank you for all your hard work Matthew, you are awesome.


Matthew:You’re welcome Danny.


Danny:Take care everybody. Thank you for listening and have a wonderful day. Bye-bye.


