Archive for March, 2007

Major server problems.

[17:38 GMT] It seems crazed daemons have generated several gigabytes of log files. These have been deleted, so the ‘disk full’ errors are gone. Still working on the other issues though.

[23:23 GMT] The `servers` table has been repaired, and the online status system is operational again.

[23:17 GMT] Hmm. Seems the MySQL server got upset – “Table './avonline/servers' is marked as crashed and should be repaired.” That’d be why the online status system is “down or overloaded.”

[22:48 GMT] It seems that the cron daemon decided not to start at boot. This has screwed up all the stats. Gah!

This server is having serious issues. Random processes have taken to segfaulting, it thinks it’s been up for a negative amount of time, and it’s apparently executing, on average, 0.08 microqueries per second. Or roughly two per year.
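The "two per year" figure is easy to sanity-check. A quick sketch, assuming a 365-day year (the function name is made up for illustration):

```python
# Convert a rate in microqueries per second into queries per year.
MICRO = 1e-6
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def queries_per_year(microqueries_per_second: float) -> float:
    """Scale the per-second rate up to a whole year."""
    return microqueries_per_second * MICRO * SECONDS_PER_YEAR


print(round(queries_per_year(0.08), 2))
```

That comes out at about 2.5 queries a year, so "roughly two" is near enough.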

Daemons refuse to start for no good reason too.

This can be traced back to someone deciding it would be a good idea to screw around with /dev. I have updated the kernel and uninstalled udev. The former made the machine bootable (into something other than single-user mode), and the latter took the load average down significantly. It’s still unreasonably high though.

I have no idea what’s wrong with it, and frankly I’m tired of messing with it. Anyone have any ideas?

TSLE Auctions

I’m working on them. Really. They’re fairly easy to implement. But making the UI is incredibly dull. Make form, validate and sanitise input, repeat. Not to mention a browsing system.

Since I don’t predict much popularity, auctions won’t have categories or tags to start with. When it becomes a pain to look through the auctions, then I’ll add those.

What they will have:

  • Images and descriptions
  • Minimum bid
  • Reserve
  • Minimum increment
  • Set end date
  • Option to set a “buy now” price (possibly)
  • Automated item delivery.
  • Cancellable, but uneditable once set up.
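As a rough sketch of how that list might map onto a data structure (every name here is hypothetical, and this is not the actual TSLE code), a frozen dataclass gives "uneditable once set up" for free, while cancelling produces a new record rather than changing the old one:

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)  # frozen: fields cannot be edited after creation
class Auction:
    description: str
    image_urls: Tuple[str, ...]
    minimum_bid: int
    reserve: int
    minimum_increment: int
    ends_at: datetime
    buy_now_price: Optional[int] = None  # the "possibly" feature
    cancelled: bool = False


def lowest_acceptable_bid(auction: Auction, current_bid: Optional[int]) -> int:
    """First bid must meet the minimum; later bids must add the increment."""
    if current_bid is None:
        return auction.minimum_bid
    return current_bid + auction.minimum_increment


def reserve_met(auction: Auction, current_bid: Optional[int]) -> bool:
    """The item only sells if the highest bid reaches the reserve."""
    return current_bid is not None and current_bid >= auction.reserve


def cancel(auction: Auction) -> Auction:
    """Return a cancelled copy; the original record stays untouched."""
    return replace(auction, cancelled=True)
```

Automated item delivery is left out, since that depends on in-world scripting rather than on the record itself.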

Soo… once I get round to UI work. I hate UIs. They’re the least fun part of making stuff.

Also on the TSLE todo list: make TSLE news posts by me get posted over here under the TSLE category.

Main Grid plus Teen Grid, or just The Grid?

Should the TG and MG really be separate?
If so, why?

Firstly, my opinion, so you know what my bias is (and will presumably remain once I finish writing the post): I like our little grid; it lets me do things completely impossible on the main grid simply due to its sheer size (statistics, anyone?). It’s also nice to be able to meet teens. However, it would also be nice to be able to visit some of these places on the MG, e.g. those covered on New World Notes.


Time

Time is strange.

Looking back over the past term, it seems to have gone incredibly fast – yet last week seems like months ago. Hours are long but days are short. Months are short but weeks are long. Small time spans are stretched, while extensive ones are compressed.

Why is time so strange?

Rolling update.

This is what an update rolling across the Teen Grid looks like. It cuts out for a few minutes (seconds in the animation) where my sim goes down, causing everything to appear online:

[Image: Rolling Update Animation]

Name Dropped

We’d also like to thank Katharine Berry who has created an effective system for tracking performance stats for the ~100 TSL sims. Displaying the comparative measurements in a series of web-based charts has already resulted in our ability to identify one item’s brief autorez that was regularly and noticeably affecting one region’s performance. I know that Katharine’s tools will be of great help going forward.

Yay. That’s from the February edition of the Second Opinion (published in late March for some reason). I doubt they’ll ever actually use the system again, but eh.

Stats site 404s

[17:20 GMT] This is now resolved.

I am aware that stats.katharineberry.co.uk is giving 404 errors when you try to view data for certain sims, mostly private islands. This is happening because the links to those pages didn’t previously exist. As part of the database load reduction yesterday, I removed the check for whether these links should be generated in the first place; this has resulted in a roughly 10,000% speed improvement in generating that page.

No data has been lost where those 404s are; the links are just sending you to pages that never existed and weren’t linked to before.

I have an idea for fixing this without having to hammer the database, and (assuming it works) will implement this tonight.

I’m also aware that nobody actually cares enough for it to be worth me making these posts in the first place.

[UPDATED] Server issues resolved

[21:30 GMT] The issues are resolved, I think. Woo!

[21:15 GMT] Most parts of the stats gathering are up again; still working on the individual sim pages (sim graphs are fine) and the average stats graph. I think it might be dumping data too… >.<

[20:59 GMT] HUD Thingies have been re-enabled. Work on the database to avoid excessive load is ongoing.

[20:31 GMT] The problem is not currently occurring, but the system causing it to happen has been disabled; that is, the one that regenerates the pages on the stats site. I’m working on it.

[19:20 GMT] The problem has been located and is being investigated.

[19:03 GMT] The problem is happening again. Still looking into it.

[18:54 GMT] After several hours of maintenance, the server is back up. “Maintenance” meaning updating things and reconfiguring things.

Performance data during the downtime has been lost, but SL data has been logged successfully. This shows that the performance logger is bugged; this will be looked into. (It’s supposed to keep the data whenever it’s received, regardless of whether it can be processed yet.)
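To illustrate the intended behaviour (a minimal sketch only, not the real logger; the function names and the spool-file format are assumptions), the idea is to write the raw data to disk before trying to process it, so a failed processing step loses nothing:

```python
import json
from pathlib import Path
from typing import Callable


def receive(payload: str, spool: Path, process: Callable[[str], None]) -> bool:
    """Store the raw payload first, then attempt to process it."""
    with spool.open("a", encoding="utf-8") as f:
        # One JSON-encoded string per line, so embedded newlines are safe.
        f.write(json.dumps(payload) + "\n")
    try:
        process(payload)
        return True
    except Exception:
        # Processing failed (e.g. database down); the raw copy is on disk.
        return False


def replay(spool: Path, process: Callable[[str], None]) -> int:
    """Re-run processing over everything in the spool; returns the count."""
    count = 0
    with spool.open(encoding="utf-8") as f:
        for line in f:
            process(json.loads(line))
            count += 1
    return count
```

A real version would also need to track which spooled entries were already processed, so that a replay doesn't insert them twice.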

I’ll keep an eye on it and see if it breaks. Fingers crossed it’s been restored now! >.>

Server Problems

This server is having… issues. MySQL and FTP related issues.

Due to the incessant crashing, I have written a script that runs every ten minutes and checks the MySQL server. If the server has crashed, the script restarts it. As I type this I’m watching an updated MySQL compile in the other window, and will be updating various other services too.
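A watchdog along those lines could look something like the sketch below. This is not the actual script: it assumes `mysqladmin ping` is on the PATH (it exits non-zero when the server is not running), and the restart command is a guess that will differ between systems.

```python
import subprocess

PING_CMD = ["mysqladmin", "ping"]  # exit status 0 means the server answered
RESTART_CMD = ["/etc/init.d/mysql", "restart"]  # assumption: varies by system


def run_command(cmd) -> int:
    """Run a command quietly and return its exit status."""
    return subprocess.call(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )


def check_and_restart(runner=run_command) -> bool:
    """Ping MySQL; restart it if the ping fails. True if a restart was tried."""
    if runner(PING_CMD) == 0:
        return False
    runner(RESTART_CMD)
    return True
```

Calling `check_and_restart()` at the bottom of the script and scheduling it from cron with a `*/10 * * * *` entry would give the ten-minute cycle. The `runner` parameter is only there so the logic can be tried out without touching a real server.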

How to check my server’s health in two steps:

  1. Visit http://blog.katharineberry.co.uk. If you get a WordPress error about databases, the MySQL server crashed.
  2. Visit http://katharineberry.co.uk. Check the “load average” figures for “this server”. If the first number is around 10-11, something’s broken (it shouldn’t really be above 2). If the first number is significantly lower than the other two, the server is just recovering. If it’s significantly higher than the others, it’s just about to break.
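The load average rules in step 2 can be written down as a small function. This is only a sketch: “significantly” is taken to mean a factor of two, “around 10-11” is treated as 10 or more, and the order in which the rules are checked is a guess.

```python
def diagnose(one: float, five: float, fifteen: float, factor: float = 2.0) -> str:
    """Classify the 1-, 5- and 15-minute load averages.

    `factor` is an assumed reading of "significantly" lower or higher.
    """
    if one >= 10:
        return "broken"
    if one * factor < min(five, fifteen):
        return "recovering"  # recent load well below the longer averages
    if one > factor * max(five, fifteen):
        return "about to break"  # recent load well above the longer averages
    if one > 2:
        return "higher than it should be"
    return "ok"


print(diagnose(1.0, 6.0, 8.0))
```

On a Unix machine, `diagnose(*os.getloadavg())` (after `import os`) would apply the same check to the local load averages.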

[07:26 GMT] “ctype-eucjpms.c:8472: internal compiler error: Segmentation fault”

So much for that then.

Statistics still moving along

There are 4,312,447 rows of sim stats on record.

My statistics continue to move along, and (mostly) don’t crash my server. Almost 5,000,000. Woo. Also, we’ve passed the highest ever concurrency (and still rising), at over 39,000 online now. We do this roughly weekly.

So… does anyone have any idea as to what I could actually do with these numbers?