IRC log for #ldstech on 20120826

00:21.06 *** join/#ldstech TimRiker (~timr@bzflag/projectlead/TimRiker)
00:21.06 *** mode/#ldstech [+o TimRiker] by ChanServ
00:21.32 <TimRiker> scgallafent, any update? should we try the windows solution? ie: rebooting the server?
00:22.21 <TimRiker> hmm. still getting the out-of-service page sometimes, and load above 30.
00:22.26 <TimRiker> going to try a reboot.
00:42.44 <scgallafent> TimRiker, just got back.
00:43.05 <TimRiker> rebooted, no effect. shutting down wiki again. what have you learned?
00:43.15 <scgallafent> Looks like the reboot didn't solve it.
00:43.25 <scgallafent> It looks like the wiki is the problem.
00:43.42 <scgallafent> I found that about half the wiki execute time was in the parser, but wasn't able to pin down exactly where.
00:43.56 <scgallafent> I'm going to tweak the wiki out of service redirect ... hold on.
00:44.13 <TimRiker> it's back in place now...
00:44.44 <scgallafent> How long has the wiki been OOS?
00:45.31 <TimRiker> just moved it...
00:45.49 <TimRiker> see comment above. 3 minutes?
00:46.53 <scgallafent> OK. Load was still pretty high.
00:47.09 <scgallafent> I'm setting the wiki so you and I can hit it but anyone else gets sent to out of service.
00:48.59 <scgallafent> OK. If you're signed in, you should be able to hit https://tech.lds.org/wiki/
00:49.21 <scgallafent> Load is back down below 3.
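
The restriction described at 00:47.09 (two named users can reach the wiki, everyone else is sent to the out-of-service page) can be sketched as below. This is a minimal illustration in Python, not the actual MediaWiki configuration; the user names in the set, the redirect path, and the function name are all assumptions made for the example.

```python
# Hypothetical allow-list: the two operators working on the outage.
ALLOWED_USERS = {"TimRiker", "scgallafent"}

# Hypothetical location of the out-of-service page.
OUT_OF_SERVICE_URL = "/out-of-service.html"


def route_request(username, requested_path):
    """Return the path to serve for this request.

    Allow-listed, signed-in users get the page they asked for;
    anonymous users (username is None) and everyone else are sent
    to the out-of-service page.
    """
    if username in ALLOWED_USERS:
        return requested_path
    return OUT_OF_SERVICE_URL
```

Keeping the check to a single set lookup means the gate itself adds essentially no load, which matters when the point of the gate is to shed traffic.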
00:50.50 <scgallafent> I figure we're down to one of a handful of issues:
00:51.00 <scgallafent> * Change in the VM environment slowing things down
00:51.17 <scgallafent> * Some kind of server issue (disk space / database corruption / etc)
00:51.19 <scgallafent> * Content
00:51.40 <scgallafent> There were some edits earlier today that I'm going to roll back.
00:55.19 <scgallafent> :( No better
00:58.05 * scgallafent runs back down the wiki rabbit hole
01:01.26 <TimRiker> cpitt's text counting patches?
01:04.29 <scgallafent> I tried disabling that earlier and didn't see any effect. That should only affect page updates, not reads.
01:04.47 <scgallafent> Just added my timing code back to the wiki index page.
01:05.43 <scgallafent> 34.69s to process the wiki index page. Woohoo!
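
The timing code mentioned at 01:04.47 is not shown in the log. A rough sketch of the general idea, wrapping a section of work and recording its wall-clock time, might look like the following in Python (the wiki itself is PHP). The `timed` helper and the label names are hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock seconds spent inside the with-block.

    The elapsed time is stored in results[label], even if the
    block raises an exception.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


# Example: time a stand-in for the page-processing work.
timings = {}
with timed("index_page", timings):
    total = sum(range(100_000))  # placeholder workload

print(f"index_page took {timings['index_page']:.4f}s")
```

Nesting several `timed` blocks with different labels is one way to narrow down which stage (for example the parser) accounts for most of the total.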
01:06.10 <scgallafent> Disabling extensions again.
01:11.44 <scgallafent> TimRiker, vastool/vasd were at 90%+ a second ago. Any reason they would be working that hard?
01:12.53 <TimRiker> I think that's normal for them, as in I've seen them do it before.
01:17.45 <TimRiker> switching computers.... brb
01:41.14 *** join/#ldstech TimRiker (~TimRiker@bzflag/projectlead/TimRiker)
01:41.14 *** mode/#ldstech [+o TimRiker] by ChanServ
02:48.08 *** join/#ldstech scgallafent- (scgallafen@conference/ldstech/x-vhjaxdryhlpgrqma)
02:48.08 *** mode/#ldstech [+o scgallafent-] by ChanServ
03:08.36 *** join/#ldstech TimRiker (~timr@bzflag/projectlead/TimRiker)
03:08.37 *** mode/#ldstech [+o TimRiker] by ChanServ
04:59.59 *** join/#ldstech scgallafent (~scgallafe@c-67-160-61-229.hsd1.wa.comcast.net)
04:59.59 *** mode/#ldstech [+o scgallafent] by ChanServ
05:04.07 *** join/#ldstech TimRiker (~timr@bzflag/projectlead/TimRiker)
05:04.08 *** mode/#ldstech [+o TimRiker] by ChanServ
05:05.10 <TimRiker> scgallafent, got a text from smootar saying the wiki is down. :/
05:05.32 <scgallafent> Yes. I've been trading texts with him. Wiki is up now, but slow. Side effect is that the forum is down.
05:05.33 <TimRiker> tried copying the db, renaming the old one, putting in the new one, same result.
05:05.51 <TimRiker> wiki is back down again.
05:05.53 <scgallafent> I spent quite a bit of time profiling the code. It's just generally slow.
05:05.59 <scgallafent> What does load look like?
05:06.03 <TimRiker> I moved your change to LocalSettings and re-enabled it.
05:06.35 <scgallafent> Going to try reconnecting to VPN. I may disappear for a minute.
05:06.53 <scgallafent> Am I still here?
05:06.57 <TimRiker> load was just over 30 and nothing was getting done.
05:07.04 <TimRiker> yep you're still here. :)
05:07.12 <scgallafent> I think we take the wiki down and leave it until someone can look at the VM.
05:07.14 <TimRiker> smootar hopes to drop in later.
05:07.36 <TimRiker> I got a voice mail from a tech. he says the vm looks fine.
05:07.54 <scgallafent> Can't reach the server. Messing with VPN settings again.
05:07.55 <TimRiker> heavy load, but normal.
05:09.31 *** join/#ldstech scgallafent- (scgallafen@conference/ldstech/x-ekscqtvyfenxiqor)
05:09.32 *** mode/#ldstech [+o scgallafent-] by ChanServ
05:09.52 <scgallafent-> Now I think I've got everything connected.
05:09.58 * scgallafent- bemoans VPN again
05:10.46 <scgallafent> I did some testing earlier on the wiki code. At one point I had a section that was pretty "clean" (basically branches and variable updates) that took 0.5s to complete.
05:10.49 <scgallafent> Something isn't right.
05:11.26 <scgallafent> Wiki is showing out of service because the code in LocalSettings isn't commented.
05:11.33 <scgallafent> Do we want it disabled right now?
05:15.43 <TimRiker> <PROTECTED>
05:16.02 <TimRiker> scgallafent, yes. the rest of the site dies if we enable it.
05:16.13 <TimRiker> I'd rather have the wiki down than the whole site.
05:16.14 <scgallafent> OK. Wasn't sure if you had intentionally disabled it.
05:16.19 <scgallafent> Agreed.
05:16.41 <scgallafent> Forums are still showing server too busy.
05:16.54 <scgallafent> I'm not sure what method vBulletin uses to determine "too busy."
05:16.56 * scgallafent looks
05:17.02 <TimRiker> load at 6
05:17.20 <TimRiker> don't know what time period vBulletin looks at.
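
The log leaves open how the forum decides the server is "too busy" and which time window it uses. One common approach, sketched here purely as an assumption and not as vBulletin's actual method, is to compare a system load average against a configured limit. The function names and the limit values are hypothetical; `os.getloadavg()` is a real Python call on Unix-like systems and returns the 1, 5, and 15 minute averages.

```python
import os


def current_load(window=0):
    """Return a system load average on Unix-like systems.

    window selects the averaging period: 0 = 1 minute,
    1 = 5 minutes, 2 = 15 minutes.
    """
    return os.getloadavg()[window]


def too_busy(load, limit):
    """Return True when the load average exceeds the configured limit."""
    return load > limit


# Values taken from the discussion: load near 30 during the incident,
# load 6 with the wiki disabled. The limit of 5.0 is a made-up example.
print(too_busy(30.0, 5.0))
print(too_busy(6.0, 5.0))
print(too_busy(3.0, 5.0))
```

Under a limit like this, a load of 6 would still trip the check even though it is far below the incident peak, which would be consistent with the forum still reporting "too busy" at 05:16.41.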
05:18.00 <scgallafent> dlhace should be in at some point and we can have a party on the server
05:18.40 <TimRiker> yea
05:19.40 <scgallafent> I left the word count extension disabled on the wiki. Everything else is enabled.
05:19.53 <scgallafent> I tried disabling different extensions with no visible improvement.
05:21.00 * TimRiker nods
05:21.32 <scgallafent> Seeing anything in your database file?
05:22.02 <scgallafent> The only thing with the database that raises my eyebrows is that there were some changes earlier this morning with Chinese work.
05:22.18 <scgallafent> The time frame is pretty close to when you started getting errors.
05:22.41 <TimRiker> nope. as the Chinese edit was around the time things started, I'm trying a conversion from latin1 to utf8
05:22.58 <scgallafent> That's not a task for the faint of heart.
05:23.14 <scgallafent> There haven't been many edits since then. Maybe we should try a backup from just before the edits.
05:23.32 <TimRiker> hmm. that's a good idea..
05:23.46 <scgallafent> Give me a minute and I'll grab the content off the page.
05:24.00 <TimRiker> I just made wikidb_orig as a copy of the bad one.
05:24.43 <scgallafent> I've got the one Chinese page on screen. Go ahead and roll back the db.
05:25.42 <TimRiker> rolling back to the 0446 backup
05:26.08 <scgallafent> Looks like the first edit was at 04:59, so just about perfect.
05:26.50 * scgallafent wonders if we really need the Community Project Handbook page in Chinese
05:36.08 <TimRiker> well, if it didn't break things, then sure. :)
05:36.35 <scgallafent> The jury is still out on that.
05:37.04 <scgallafent> Waiting for mysql restore is two steps up from agonizing.
05:42.07 <TimRiker> hehe
05:42.28 * TimRiker fires up wizard101 to pass the time
05:43.55 <scgallafent> Looks like it may be finished
05:44.25 * scgallafent pokes TimRiker to check
05:46.18 *** join/#ldstech TimRiker (~timr@bzflag/projectlead/TimRiker)
05:46.18 *** mode/#ldstech [+o TimRiker] by ChanServ
05:46.34 <TimRiker> hmm. xchat crashed
05:46.51 <scgallafent> Looks like the restore may be finished
05:46.56 <TimRiker> "restore" finished. how are things?
05:47.04 <scgallafent> Checking....
05:48.00 <TimRiker> still a lot more cpu usage than I'd expect. :(
05:48.20 <scgallafent> Page load is still ugly. 44.8 seconds.
05:48.26 <TimRiker> ugh
05:48.42 <scgallafent> I'm going to start bleeding where I'm scratching my head.
05:49.07 <TimRiker> heh
05:49.07 <scgallafent> dlhace asked if caching was running. I see memcached in the process list. Should something else be running?
05:50.06 <TimRiker> restored the apache client limit.
05:50.20 <TimRiker> yes, memcached does appear to be running ok.
05:50.45 <TimRiker> listening ok, entered in LocalSettings, etc.
05:51.28 <TimRiker> we were hanging with timeouts when it was down. that's back when the server was first set up.
05:51.41 <TimRiker> I did try restarting memcached too. no effect.
05:52.22 <scgallafent> I'm baffled.
05:52.30 <scgallafent> Would something have been upgraded automatically?
05:52.44 <TimRiker> nope. it would show up in the logs.
05:53.05 <TimRiker> I manually applied a few updates earlier today to see if they helped. no change. look for yum in the logs.
05:54.02 <scgallafent> brb
05:55.23 <scgallafent> back
05:55.48 <scgallafent> Code doesn't appear to have changed. Database has been rolled back.
05:55.51 <scgallafent> What are we missing?
05:56.06 <scgallafent> Is there an external service we depend on that is running slow?
06:01.37 <TimRiker> could be, but why just the wiki? we use ldap, but not every call, should be in session and just used on login.
06:02.35 <scgallafent> I don't think we even hit LDAP on login. We added a WAM plugin with the upgrade.
06:03.27 <TimRiker> hmm. yep.
06:03.39 <TimRiker> my head hurts
06:05.10 <scgallafent> Forum is happy with wiki disabled. I really don't want to go back down the MediaWiki rabbit hole.
06:06.30 <TimRiker> yeah. I'm still thinking, and may try a utf8 conversion, but I don't really have high hopes.
06:06.54 <scgallafent> I'm still stuck trying to come up with an idea on what could have changed.
06:07.02 <TimRiker> me too
06:07.28 <scgallafent> There was a core network change, but that was overnight Thursday/Friday.
06:12.36 <TimRiker> right
06:15.20 <scgallafent> CPU usage is still higher than I would expect. Four httpd processes above 10% seems high.
06:17.43 * TimRiker nods
06:18.05 <scgallafent> Just tested /ldshelp. Page loads there are marginal (1.5 to 4 seconds).
06:24.08 <scgallafent> Did your UTF-8 rewrite finish on the .sql file?
06:26.21 <TimRiker> not yet
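
The log does not say which tool was used for the latin1 to UTF-8 rewrite of the .sql file. A minimal sketch of one way to re-encode a large dump in Python, streaming it in chunks so the whole file never sits in memory, is shown below. The function name and file paths are hypothetical, and the sketch only converts the bytes: any charset declarations inside the dump (for example `DEFAULT CHARSET=latin1` in table definitions) would still need separate attention.

```python
def reencode_file(src_path, dst_path,
                  src_encoding="latin-1", dst_encoding="utf-8",
                  chunk_size=1 << 20):
    """Copy src_path to dst_path, decoding as src_encoding and
    re-encoding as dst_encoding, about chunk_size characters at a time.

    newline="" disables newline translation so line endings in the
    dump are preserved exactly.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)


# Hypothetical usage:
# reencode_file("wikidb_latin1.sql", "wikidb_utf8.sql")
```

Every byte sequence is valid latin-1, so the decode step never fails. That also means the sketch cannot detect data that was already UTF-8 and would double-encode it, which is one reason this kind of conversion is risky.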
06:52.25 <scgallafent> Next thought:
06:52.48 <scgallafent> The wiki is probably getting the worst of the slowdown because it's arguably the most complex code we've got.
06:53.14 <scgallafent> With the wiki disabled, the httpd processes are still overloaded.
06:53.31 <scgallafent> They're not getting that much traffic right now.
06:53.36 <scgallafent> Why so busy?
08:53.51 *** join/#ldstech mailman0 (~mailman0@ip72-201-159-66.ph.ph.cox.net)
09:02.07 *** join/#ldstech Spennig (~Spennig@pool-108-11-215-19.hrbgpa.fios.verizon.net)
16:25.38 *** join/#ldstech scgallafent (scgallafen@conference/ldstech/x-hogbjncwropaviyz)
16:25.39 *** mode/#ldstech [+o scgallafent] by ChanServ
20:21.52 *** join/#ldstech Spennig (~Spennig@pool-108-11-215-19.hrbgpa.fios.verizon.net)
20:22.21 *** part/#ldstech Spennig (~Spennig@pool-108-11-215-19.hrbgpa.fios.verizon.net)

Generated by irclog2html.pl Modified by Tim Riker to work with infobot.