Lloydyboy's Development Blog IV


Post by Lloydyboy on June 3rd, 2010, 10:57 am

Normally I would be lazy and wait a month or so before telling everyone that I have been lazy...but this time it's different. I have actually done something awesome :)

I was investigating the Caf crash when I got seriously annoyed with the zone startup times (something which has been bugging me for a while). When you get an itch, you have to scratch it. Taking 5 mins to load a zone against a freshly installed database is excessive. There were some noticeable pauses during loading, which convinced me there were bottlenecks.

It was time to put on my detective outfit.

I knew the bottleneck wasn't the database as it barely registers on the CPU compared to what the zone is doing...that made things easier as I'm no SQL guru.

The first step was finding out exactly what is loaded, and when. This is harder than you might think, as the load process isn't a simple list. To improve performance we load multiple things at the same time where they don't interfere with each other (badges, schematics, command lists etc. all load in parallel), and we only load things sequentially when they rely on others.

Because we don't know when a particular load starts or stops...I needed to put in lots of debugging messages so I could trace the output and time how long each load was taking.
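The kind of tracing described above can be sketched in a few lines. This is Python purely for illustration, not the server's own code; the `timed` helper, the `timings` list and the "skills" label are all made-up names, and the sleep stands in for real load work.

```python
import time
from contextlib import contextmanager

# Hypothetical store of (label, seconds) pairs, one per timed load step.
timings = []

@contextmanager
def timed(label):
    """Print a start and a finish message around a block and record how long it took."""
    print(f"[load] {label}: started")
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings.append((label, elapsed))
        print(f"[load] {label}: finished in {elapsed:.3f}s")

with timed("skills"):
    time.sleep(0.05)  # stand-in for the real load work
```

Printing both a start and a finish message matters when loads overlap, because the finish lines alone don't show which loads were running side by side.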

I noticed that the first big pause was when we loaded up skills. We have 1087 skills in the database - it shouldn't take all that long, should it? After some investigation I discovered that for every skill we loaded we were sending 7 small SQL queries to get other data from 7 related tables, such as what the skill prerequisites are, what schematics are granted on skill training etc.

To put that another way, we were using 7000+ SQL statements to load up a relatively small amount of data. The SQL itself was fast as the queries were simple...but the server then has to transform each result through bindings into objects for use in the code, which is a fairly costly process. I swapped out the 7000 SQL statements for 7 SQL statements. The result: no more 40sec delay. The skills now get loaded in less than 1sec.
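To make the difference concrete, here is a minimal sketch of the two loading patterns. It is Python with an in-memory SQLite database, not the server's real code or schema: the `skills` and `skill_prereqs` tables, their columns and the 100 sample rows are assumptions for illustration, and only one related table is shown where the real loader had seven.

```python
import sqlite3

# Hypothetical schema: a skills table plus ONE related table.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE skills (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE skill_prereqs (skill_id INTEGER, prereq_id INTEGER);
""")
db.executemany("INSERT INTO skills VALUES (?, ?)",
               [(i, f"skill_{i}") for i in range(1, 101)])
# Sample data: every skill after the first requires the previous one.
db.executemany("INSERT INTO skill_prereqs VALUES (?, ?)",
               [(i, i - 1) for i in range(2, 101)])

def load_per_skill(db):
    """Slow pattern: one extra query for every skill."""
    queries = 1  # the initial SELECT on skills
    prereqs = {}
    for (skill_id,) in db.execute("SELECT id FROM skills").fetchall():
        rows = db.execute(
            "SELECT prereq_id FROM skill_prereqs"
            " WHERE skill_id = ? ORDER BY prereq_id",
            (skill_id,),
        ).fetchall()
        queries += 1
        prereqs[skill_id] = [row[0] for row in rows]
    return prereqs, queries

def load_batched(db):
    """Fast pattern: one query per table, grouped by skill in memory."""
    queries = 2  # one SELECT on skills, one on skill_prereqs
    prereqs = {skill_id: [] for (skill_id,) in db.execute("SELECT id FROM skills")}
    for skill_id, prereq_id in db.execute(
        "SELECT skill_id, prereq_id FROM skill_prereqs"
        " ORDER BY skill_id, prereq_id"
    ):
        prereqs[skill_id].append(prereq_id)
    return prereqs, queries

slow, slow_queries = load_per_skill(db)
fast, fast_queries = load_batched(db)
print(f"per-skill: {slow_queries} queries, batched: {fast_queries} queries")
```

Both functions build the same dictionary. The batched version simply pays the per-query overhead once per table instead of once per skill.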

OK, off to a good start...but there's an even bigger black hole further along. After some more research, I narrowed it down to schematics. It looked like we were making the same mistake as with skills: requesting lots of little snippets of data. Only this time it was worse. Much worse. We weren't requesting data in a 2 level loop, we were requesting data in a 3 level loop...and there are far more schematics than there are skills (about 6x as many). After a few evenings of refactoring the schematic loading, I got it finished last night. The result? Literally minutes reduced to a few seconds.

Finally, the two longest loads are resource map generation and height map creation/caching. We can't (for now)* get rid of these altogether, but I noticed that it takes around 20 secs to load everything except the heightmap (40 secs) and the resource maps (25 secs). The only problem was that we were telling the server to start loading heightmaps virtually last.

After swapping it round to load heightmaps as soon as possible rather than at the end, the heightmap and the rest of the stuff now load concurrently.
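A rough sketch of why the ordering matters, using the 40 sec and 20 sec figures above scaled down by 100 so it runs in about a second. This is Python for illustration only: the function names are invented, the sleeps stand in for real work, and the "heightmap last" case is simplified to a fully sequential load, which is an assumption rather than how the server actually behaved.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SCALE = 0.01  # 40 s and 20 s become 0.4 s and 0.2 s

def load_heightmap():
    time.sleep(40 * SCALE)  # stand-in for heightmap creation/caching

def load_everything_else():
    time.sleep(20 * SCALE)  # stand-in for skills, schematics, badges etc.

def heightmap_last():
    """Simplified 'before': the heightmap only starts once the rest is done."""
    start = time.perf_counter()
    load_everything_else()
    load_heightmap()
    return time.perf_counter() - start

def heightmap_first():
    """'After': kick off the longest job first and run the rest alongside it."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=2) as pool:
        heightmap = pool.submit(load_heightmap)
        rest = pool.submit(load_everything_else)
        heightmap.result()
        rest.result()
    return time.perf_counter() - start

late = heightmap_last()
early = heightmap_first()
print(f"heightmap last: {late:.2f}s, heightmap first: {early:.2f}s")
```

With the heightmap started first, the total is bounded by the longest single job rather than by the sum of the jobs.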

As I was doing all this, I also took the opportunity to preallocate memory in containers where possible. This means we allocate memory in 1 big block rather than thousands of smaller blocks.

End result? Tatooine on my dev laptop has gone from >5 mins load time to <60 secs, and Tutorial now takes around 10secs to load.

But not only do we load much much faster, I am reliably informed that zoneserver now uses about 100 MB less memory!

Moral of the story? Just because something has been working reliably without being touched for 2-3 years doesn't mean it can't be improved.

Now...to fix that crash bug.

* Powerking and others are working on reverse engineering the height map fractal formula from the client, which will remove the need to load a heightmap in the first place. This should reduce the load time of Tatooine to around 30-40secs.
Euro-Chimaera Pre & Post-CU
Lloyd Pickering - Jedi (Post 9 Village)
Zoxara BE, 12pt Chef, 14pt Artisan, Manager of EMP Mall, Commerce City
Lloydyboy
SWGANH Developer
 
Posts: 122
Joined: September 7th, 2008, 4:33 am
SWG Official Server: Chimaera

Re: Lloydyboy's Development Blog IV

Post by Kronos on June 3rd, 2010, 2:13 pm

Such a great job Lloyd! This really helps out with development, I can't tell you how nice it is to compile/build and have the server and zones running a minute later...
Kronos
SWGANH Developer
 
Posts: 69
Joined: May 21st, 2010, 11:53 pm
Location: North Idaho
SWG Official Server: Wanderhome

