Caroline Sosebee is a Software Engineer at ThreeWill. She comes to us with 20+ years of software development experience and a broad scope of general IT support skills.
Recently, I had to dig in and figure out how SharePoint 2013 Usage Analytics Reporting works, since one of our clients was actively using this feature and started having issues with it after a server decided to whack out. This post details what I did to get it working again. It in no way makes me an expert on the subject, though. I’m only hoping it will be of some help to someone else who suddenly finds themselves struggling to figure out what went wrong with a feature they know nothing about.
Fun Stuff, Part 1 – Usage Reports Showing All Zeros
Our client uses the built-in Usage Analytics reporting in SharePoint 2013 to determine how popular certain reports are with their users. Overall, this has worked well for them.
But recently they had a major problem with their web front end and ended up having to rebuild the box. Once the new front end was up and had been running for a few days, we checked the usage reports and found that they were showing zeros across the board. Apparently it hadn’t accumulated anything since the new box was turned on.
To troubleshoot, I began with the basics: verifying that all the settings and services needed to collect the usage data (a part of Web Analytics) were on and appeared to be working. I then checked the directory used for the .usage log files and found that they were being created and consumed. At this point, I could have checked the WSS_Usage database to be 100% sure that the data from the .usage files was being successfully stored there, but I decided that was way above my pay grade and just assumed the data was getting there ok.
Since all seemed ok at this point, I turned to trusty Mr. Google and found this very helpful blog that explained the problem and solution very clearly. It turns out that the receivers that are usually registered to the usage definitions had been ‘disconnected.’ These usage receivers are what send the data from the WSS_Usage database along to the analytics engine of the Search Service Application (SSA) process which then turns the data into consumable bits for us. I found this link to be another great source of information on how this part of the process works.
After following the instructions to reattach the receivers and waiting the requisite 24-36 hours, there was now data again. Yay! To confirm this, I navigated to Site Settings / Popularity and Search Reports and then opened the ‘Usage’ report which showed data beginning to collect again.
Fun Stuff, Part 2 – Usage Reports No Longer Collecting Data
And then I went to sleep. When I woke up the next day, we had more front end problems which caused the server to basically stop responding to any and all requests. There were a couple of different problems that caused this, but they are an aside to this post and better left un-delved. Needless to say, a little scrambling got the server responding again, but for some reason the usage reports had stopped collecting data. Of course.
So I went down the same path as before, trying to figure out what was wrong. This time, the receivers were still attached but when I went to the logging directory, I saw that the .usage files were no longer being consumed for some of the selected events. The files were being created but were simply piling up in the log directory even though the Import timer job indicated it was running successfully.
You can determine which events are actively having usage data logged by navigating to Central Administration / Monitoring and clicking on Usage and Health Data Collection. Within the Event Selection section is a list of ‘Events to log.’ For each of the items checked, there will be a corresponding log directory created holding both the .usage files and an ImportProgress.ini file.
What I finally found was that some of the ImportProgress.ini files used by the Import Timer job appeared to be corrupted. And here’s where known fact is replaced with what worked for me, as I could find no confirmation that what I’m about to state is in fact true. But my usage data is collecting again!
Usage log files accumulate in their corresponding directories until the timer job runs (the default is every 5 minutes), at which point they are processed and deleted. Each .usage file name is composed of the server name, the date, and the UTC time, followed by the date again and a time with a sequential ‘version’ number appended to the end. Each file seems to have a 1 megabyte size limit before a new .usage file is created. If multiple files are created in a single minute, they are differentiated by the trailing sequential number. (You can see in the illustration above that there were several files created during the 1723 minute.)
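If you would rather not eyeball every event folder to see whether files are piling up, a small script can do the counting. The following is only a minimal Python sketch, not anything SharePoint ships with. It assumes the layout described above (one subdirectory per logged event, each holding .usage files), and the function name usage_backlog and the log root path you pass in are hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def usage_backlog(log_root):
    """Return {event_dir_name: (file_count, oldest_file_time_utc)}.

    Only event directories that currently hold at least one .usage
    file are included. log_root is whatever folder your farm writes
    its usage logs to (an assumption; check your own configuration).
    """
    backlog = {}
    for event_dir in sorted(Path(log_root).iterdir()):
        if not event_dir.is_dir():
            continue
        files = list(event_dir.glob("*.usage"))
        if files:
            # The oldest modification time hints at how long files have waited.
            oldest = min(f.stat().st_mtime for f in files)
            backlog[event_dir.name] = (
                len(files),
                datetime.fromtimestamp(oldest, tz=timezone.utc),
            )
    return backlog
```

Since the import job normally clears these folders every few minutes, a count that keeps growing, or an oldest file that is hours old, suggests that event is no longer being consumed.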
Inside the ImportProgress.ini file there are only two lines. The first line shows the last file that was processed by the timer job (for that event) and the second line seems to indicate the success/failure of the import. A value of -1 indicates success.
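The same check can be scripted. This is a hedged Python sketch that relies only on what was observed here: two lines, with -1 on the second line when things are fine. The function names are hypothetical, and the text encoding (UTF-8, with or without a byte order mark) is an assumption that you should confirm against a real file before trusting the result.

```python
from pathlib import Path


def read_import_progress(ini_path):
    """Return (last_file, status) from a two-line ImportProgress.ini.

    Missing lines come back as None. The encoding is an assumption.
    """
    text = Path(ini_path).read_text(encoding="utf-8-sig", errors="replace")
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    last_file = lines[0] if lines else None
    status = lines[1] if len(lines) > 1 else None
    return last_file, status


def looks_healthy(ini_path):
    """True only when the second line is exactly -1, as seen in working folders."""
    _, status = read_import_progress(ini_path)
    return status == "-1"
```

Running looks_healthy against the ImportProgress.ini in each event directory is a quick way to spot the folders worth a closer look.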
Correct .ini File
But after the server freeze, when I looked at the ImportProgress.ini files, several of them did not have a -1, but instead had some random number.
Incorrect .ini File
As this didn’t look correct to me and the logs weren’t processing anyway, I decided to delete all the .usage files piled up in the directory (I actually saved them elsewhere, just in case). I then updated the .ini file so that line 1 held a file reference with a date and time earlier than the current time and line 2 held a -1. Finally, I recycled the timer service by triggering the Timer Service Recycle job (the correct way to do this) to ensure that the changes were picked up.
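For anyone who wants to script those two file steps, here is a minimal Python sketch under the same caveats: it mirrors what worked in this one case and is not a documented SharePoint procedure. The function names are hypothetical, the file reference for line 1 must be supplied by you (its exact format is not spelled out here), and the output encoding is an assumption. It does not recycle the timer service.

```python
import shutil
from pathlib import Path


def back_up_usage_files(event_dir, backup_dir):
    """Move every .usage file from event_dir into backup_dir.

    Files are moved rather than deleted so they can be restored.
    Returns the names of the files that were moved.
    """
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(Path(event_dir).glob("*.usage")):
        shutil.move(str(f), str(backup / f.name))
        moved.append(f.name)
    return moved


def reset_import_progress(ini_path, file_reference):
    """Rewrite ImportProgress.ini as two lines: file_reference, then -1.

    A .bak copy of the existing file is kept next to it first.
    The UTF-8 encoding is an assumption; match the original file.
    """
    ini = Path(ini_path)
    if ini.exists():
        shutil.copy2(ini, ini.with_name(ini.name + ".bak"))
    ini.write_text(f"{file_reference}\n-1\n", encoding="utf-8")
```

Moving the files and keeping a .bak of the .ini means every step can be undone if the import job does not pick back up.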
After all this, I started watching the log directories again and found that the files were back to being consumed correctly. So the next day, I checked my data and was ecstatic to see real numbers in place of zeros for the day before, telling me that all was well again.
And I’m happy to report that we haven’t had usage problems since. (I’ll continue to cross my fingers on this one, though!)