Coder Diary #12 -- Automated Crash Testing, Backup Saves

berto
Posts: 21461
Joined: Wed Mar 13, 2002 1:15 am
Location: metro Chicago, Illinois, USA

Post by berto »


Coder Diary #12 -- Automated Crash Testing, Backup Saves


Coding the Campaign Series -- it's not all fun and games.

Something not so fun: game crashes.

First, a review. Since last February, we have:

[*]A new coder (me), one fairly new to C++ and quite new to Windows programming. (I'm an old hand at C programming on Unix/Linux.)
[*]A code hand-off and new code installation.
[*]A new compiler version and development environment (Visual C++ 6.0, Visual Studio 2008).
[*]Many and significant changes to the UI.
[*]Many new features.
[*]Data updates and fixes.

And last but not least:

[*]The codebase merge.

Any one of those items has the potential to introduce bugs. Adding them all together, it's a wonder that the game runs at all, or runs so well.

A game runs not so well if it is prone to crashing. Are the latest EXEs prone to crashing? Let's find out...

You can only be so careful, you can only eyeball the code so much. You can also employ special QA tools (lint, bounds checkers, profilers, debuggers, disassemblers, etc.) to vet the code. But there is no substitute for actual game play.

The dev team's ranks have swelled in recent months, with both new and returning members joining the old-timers.

There's lots of playtesting now. But there are only so many playtesters, so many hours in the day, so many days in the week.

Which is where automated testing comes into play.

I have coded a new Test Trial Play mode. As with Automated, Empirical A/I Testing, Test Trial Play mode and its A/I vs. A/I play give me the means to test all scenarios, all games, entirely hands-off and unattended, one after another, on multiple computers, round the clock.

If game crashing bugs lurk in the code and/or data, exercising the code and data is a very good way to find them.

In Test Trial Play mode, at the Cygwin command line, I can test a single scenario via, for example (in East Front):

Robert@roberto /cygdrive/c/Games/Matrix Games/John Tiller's Campaign Series/East Front
$ ./ef.exe -W -T Gotha.scn

Better still, I might test the 20 smallest East Front scenarios with:

Robert@roberto /cygdrive/c/Games/Matrix Games/John Tiller's Campaign Series/East Front
$ ls -1S *.scn | tail -n 20 | while IFS= read -r scn; do ./ef.exe -W -T "$scn" < /dev/null; done

With just a single command, I could launch a massive test of all 205 East Front scenarios via:

Robert@roberto /cygdrive/c/Games/Matrix Games/John Tiller's Campaign Series/East Front
$ for scn in *.scn; do ./ef.exe -W -T "$scn"; done
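
The bulk run above can be wrapped in a small helper. What follows is a minimal sketch, not the actual tooling used here: the function name run_trials and the log file are made up for illustration, and it assumes the EXE prints the finished scenario's name to standard output. Only the -W -T flags come from the commands above. Quoting "$scn" keeps names with spaces, such as Tractor factory.scn, in one piece.

```shell
# run_trials EXE LOGFILE SCENARIO...
# Hypothetical helper: runs each scenario in turn and collects whatever the
# EXE prints on stdout (assumed to be the scenario name on completion).
run_trials() (
    exe=$1
    log=$2
    shift 2
    : > "$log"                      # start with an empty log
    for scn in "$@"; do
        # A failing or crashing run must not stop the rest of the batch.
        "$exe" -W -T "$scn" >> "$log" < /dev/null || true
    done
)
```

A possible invocation would be `run_trials ./ef.exe finished.txt *.scn`, after which finished.txt holds one line per scenario that reached the end.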

If game crashing bugs lurk in the East Front EXEs and data, by means of such testing, I will surely come to know about them!

The proof is not just in the playing; the proof is in the testing too.

After a bulk test run, how do I identify the scenarios that crashed and prematurely aborted? I have coded a new Perl script, btlchk.pl (not shown), to check the save files to determine if the last saved turn matches the max scenario turns. If the last saved turn is less than the max, I know that the scenario crashed. (Auto Save is activated.) If btlchk.pl reports problematic scenarios, I know where next to direct my bug hunt.
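
btlchk.pl itself is not shown, and the .btl save format is not described here, so the following is only a sketch of the comparison step. It assumes a hypothetical, pre-extracted summary file with one line per scenario: name, last saved turn, and maximum turns, separated by tabs. The function name flag_incomplete is likewise invented.

```shell
# flag_incomplete SUMMARY_FILE
# SUMMARY_FILE is a hypothetical tab-separated file:
#   scenario name <TAB> last saved turn <TAB> max scenario turns
# Prints the name of every scenario whose last saved turn is below the max.
flag_incomplete() {
    awk -F '\t' '$2 + 0 < $3 + 0 { print $1 }' "$1"
}
```

Any scenario this prints stopped short of its final turn and is a candidate for the bug hunt.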

Another way to determine crashes: As each test trial ends, at the point of Victory display, the game engine EXE outputs the scenario name, as in:

Robert@roberto /cygdrive/c/Games/Matrix Games/John Tiller's Campaign Series/East Front
$ ls -1S *.scn | tail -n 20 | while IFS= read -r scn; do ./ef.exe -W -T "$scn" < /dev/null; done
Borisov.scn
Pogorelo.scn
Kharkov43.scn
Prokhorovka.scn
01odessa.scn
Tractor factory.scn
Rappards.scn
Szekesfehervar.scn
Venskyula.scn
Moscow.scn
...

I can compare the `ls -1S *.scn | tail -n 20` list against the test trials output list. If any scenario is missing from the latter, I know that scenario crashed.
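
That comparison can be done mechanically. Here is a minimal sketch, assuming the expected list and the test trials output have each been saved to a text file, one scenario name per line. The function name missing_scenarios and both file names are hypothetical.

```shell
# missing_scenarios EXPECTED_FILE FINISHED_FILE
# Prints every name that is in EXPECTED_FILE but not in FINISHED_FILE.
missing_scenarios() (
    tmp=$(mktemp -d) || exit 1
    # comm needs both inputs sorted the same way, so sort under one locale.
    LC_ALL=C sort "$1" > "$tmp/expected"
    LC_ALL=C sort "$2" > "$tmp/finished"
    # -23 suppresses "only in finished" and "in both", leaving the missing ones.
    LC_ALL=C comm -23 "$tmp/expected" "$tmp/finished"
    rm -rf "$tmp"
)
```

For example, `ls -1S *.scn | tail -n 20 > expected.txt` would produce the first file, and the captured test output the second. If the Windows EXE writes CRLF line endings (an assumption worth checking), the output would first need to pass through `tr -d '\r'`.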

If I run hundreds of scenario tests and too many of them crash, that strongly suggests problems in the EXEs, since the legacy scenarios are unchanged and through the years there have been no reports of widespread failure. But if those same hundreds of scenario tests produce just a few game crashes, that tends to suggest bad scenario data, or bad data in the referenced OOBs.

Of course the happiest result is no crashes. Is that what we have? Let's see...

For the past couple of weeks, on two test systems (Windows 7, Windows XP), I have been running in parallel a series of test trial games of nearly all West Front scenarios, from the smallest to the largest (in terms of SCN file size) -- ~175 in total.

Why parallel series? It's because I want to compare the two series results to see if a scenario crashes in one series but not the other, or if the A/I freezes in one series but not the other. (A/I freezes? More about that in a bit.) Especially if a scenario crashes or freezes in both test series, I know there is a real problem, worth investigating.
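
The cross-check between the two series can be sketched with sort and comm. Assume each test system yields a plain text file listing the scenarios that crashed or froze there, one name per line. The function name compare_series and the labels it prints are hypothetical, chosen only for this illustration.

```shell
# compare_series FIRST_FAILURES SECOND_FAILURES
# Labels each failed scenario as seen on both systems or on only one.
compare_series() (
    tmp=$(mktemp -d) || exit 1
    LC_ALL=C sort -u "$1" > "$tmp/first"
    LC_ALL=C sort -u "$2" > "$tmp/second"
    # comm prints three tab-indented columns:
    # only in first, only in second, in both.
    LC_ALL=C comm "$tmp/first" "$tmp/second" |
        awk -F '\t' '{
            if ($1 != "")      print "only-first\t" $1
            else if ($2 != "") print "only-second\t" $2
            else               print "both\t" $3
        }'
    rm -rf "$tmp"
)
```

Lines tagged both are the ones most likely to point at a real problem; the one-system-only lines may deserve a rerun before any deeper investigation.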

~10 days after launch, on the faster Windows 7 test system, the test trials had reached scenario 156 out of 173. (I would have been further along in the test sequence, were it not for three A/I freezes. Overall, I had lost about a day of run time while the system waited, idle, for me to abort the frozen test and relaunch the tests at the next scenario in sequence.) At the 10 day mark, I decided to abort the test trials on that system, as I wanted to proceed with a Windows 8.1 update. (That system is dual-boot Windows 7/Windows 8.)

I am happy to report that (apart from three A/I freezes) none of the 156 test trial games aborted prematurely, which would have indicated a crash. The game engine appears to be quite robust!

...

15 days after launch, on the slower Windows XP test system, the test trials have gone as far as scenario 158 out of 173. (The second system is slower mainly because the graphics are so much slower.) I am letting this series of test trials run all the way to the bitter end.

Since I am testing the scenarios from smallest to largest, each individual trial takes longer and longer as the test passes from one scenario to the (larger) next. The pace is slowing down. Now that testing has reached the very large Complexity 9 scenarios, progress is glacial on this slower XP system.

On this second system, I am again happy to report: So far, no crashes!

Of course, automated, hands-off test trial game plays don't:

[*]Exercise the UI. There may still be UI-related bugs, and game crashes.
[*]Exercise PBEM, HotSeat, or other modes of play.
[*]Reveal logic errors, graphics glitches, etc.
[*]Deal with unexpectedly weird player actions and other serendipity.

And although I have tested over 150 different WF scenarios, there are still many hundreds more scenarios in the other games to test. (I selected WF, because it has the most varied scenarios overall.)

But at least it's good to know that the engine and routines for movement, combat, etc. appear to be basically sound.

Automated Test Trials -- from now on, a standard tool in my QA toolkit. I will run them from time to time to test new code, and especially before any public updates and new releases.

Now, what about those three A/I freezes?

Take Brolo.scn (Sicily, August 1943). I observed this test scenario freezing. The Allied landing craft approach the beach, pile up in a heap at the shoreline, offload a few units, then ... no further activity, nothing. The A/I stops cold, goes dead, cannot be unfrozen or revived.

This freezing: Is it an artifact of the A/I self-play? Is it due to something I've done?

To help answer these questions, I tried a conventional human vs. A/I game of Brolo.scn using an archived West Front installation dating back to 20130201, before I signed onto this project, and using the CS 1.04 EXEs and data.

With me playing the Axis and the A/I playing the Allies, the A/I-controlled Allies run through most of their turn one, then ... the A/I freezes. The turn does not auto-advance. I can't force a Next Turn. Deactivating then reactivating the A/I accomplishes nothing. There is nothing I can do. It's game over. On turn one.

At least for Allied A/I play, Brolo.scn is a broken scenario. Can I debug this?

One way I might try to debug this is real-time, interactively, with the VS debugger. Of course, I myself can't play too many test games in interactive debug mode, stopping at preselected break points, and single stepping through portions of the code, inspecting dozens of variables each step of the way. Impossibly time consuming. Talk about wild goose chases!

I could write into the code all manner of logging to record game progress points and to dump key data values. Not only would that slow the game down terribly, it would generate huge amounts of log data. Talk about finding the needle in the haystack!

Debugging these A/I freezes would require days and perhaps weeks of study. But I've got better and more important things to do in the near term.

What should we do? Ignore this? Yank Brolo.scn from the official installation? Leave it in, but in the Scenario Information add the comment: "Best played human-to-human, or human Allies vs. A/I Axis. Do not play human Axis vs. A/I Allies, as the Allied A/I is broken!" This is probably the solution we will adopt. In a second test game, I confirmed that the defending Axis A/I is fine, does not freeze. Then of course there is the PBEM alternative. No need to toss an otherwise perfectly good scenario because one mode of play freezes. We'll just give fair warning.

Are there other scenarios with a similarly broken A/I? Yes, so far there have been two others, also amphibious assault scenarios. Jason has confirmed that amphibious assault scenarios with a freeze-prone attacker A/I are a known problem of the Campaign Series.

This emphasizes the need, I think, for a new Backup Saves option. If the game crashes, or if the A/I ever goes dormant and cannot be revived, in all cases we want to give the player the option to revert to an earlier save. Given how the game's random number generation works, there is a good chance that a turn replayed from the earlier save will not arrive at the very same dormant A/I or game crash. It could still happen, and I must do my best to debug it and make it impossible. (Remember: The problem might lie in the game data, not the game code.) But throwing the player a Backup Saves lifeline may be the best thing we can practically do.

Backup Saves? Done!

For this, I've decided to KISS it. Each save file has up to three versions:

[*]foobar.btl
[*]foobar OLDER.btl (previously foobar.btl)
[*]foobar OLDEST.btl (previously foobar OLDER.btl)
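
The rotation implied by that list can be sketched in a few lines of shell. This is an illustration of the naming scheme only, not the game's actual C++ code; the function name rotate_saves is hypothetical, and it assumes the rotation happens just before a new save is written.

```shell
# rotate_saves DIR NAME   (NAME is the save name without the .btl extension)
# Shifts NAME.btl -> "NAME OLDER.btl" -> "NAME OLDEST.btl" inside DIR.
rotate_saves() (
    cd "$1" || exit 1
    name=$2
    # Move the older copy first so it is not overwritten by the newer one.
    if [ -f "$name OLDER.btl" ]; then
        mv -f "$name OLDER.btl" "$name OLDEST.btl"
    fi
    if [ -f "$name.btl" ]; then
        mv -f "$name.btl" "$name OLDER.btl"
    fi
)
```

After `rotate_saves . foobar`, a fresh foobar.btl can be written without losing the previous two versions; the former OLDEST copy is the one that gets discarded.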

Here it is in action:

Image

And another screenshot, from one of the crash test trial systems:

Image

More justification for Backup Saves:

[*]Occasionally, players report corrupted saves. I myself have encountered them with JT games. Backup Saves provide fallbacks.
[*]It also caters to players who want to go back and play earlier turns, for whatever reason -- to undo a stupid move, to try a different strategy, to try their luck.

...

I've been very busy. But so too the Dev Team. Other team members are too shy or too tongue-tied or too busy to report their activities, but I can report: Much work is being done on

[*]OOB development and fixes
[*]new scenario creation
[*]new maps
[*]graphics fixes (including Warhorse's splendid new 2D unit symbols)
[*]new sounds, and sound fixes
[*]etc.

Men at Work.

It takes tons of effort to put together, and to test, games of this complexity and scope. We understand that the player community is impatient for the next Campaign Series update(s), and for Modern Wars. If it's any consolation:

[*]We're impatient too!
[*]We're working as hard and fast on this as we can!

"Good things come to those who wait."

Until the next time...
Attachments
BackupSaves3.jpg (198.39 KiB)
Campaign Series Legion https://cslegion.com/
Campaign Series Lead Coder https://www.matrixgames.com/forums/view ... hp?f=10167
Panzer Campaigns, Panzer Battles Lead Coder https://wargameds.com
tide1530
Posts: 103
Joined: Thu Apr 14, 2011 10:32 pm

RE: Coder Diary #12 -- Automated Crash Testing, Backup Saves

Post by tide1530 »

WOW!! [;)]
wings7
Posts: 4586
Joined: Mon Aug 11, 2003 4:59 am
Location: Phoenix, Arizona

RE: Coder Diary #12 -- Automated Crash Testing, Backup Saves

Post by wings7 »

[:D] [&o]
junk2drive
Posts: 12856
Joined: Thu Jun 27, 2002 7:27 am
Location: Arizona West Coast

RE: Coder Diary #12 -- Automated Crash Testing, Backup Saves

Post by junk2drive »

bump 12
berto
Posts: 21461
Joined: Wed Mar 13, 2002 1:15 am
Location: metro Chicago, Illinois, USA

RE: Coder Diary #12 -- Automated Crash Testing, Backup Saves

Post by berto »


11+ days after launch:

[*]My current WF crash test trials are on the 167th game. No (white) crashes!
[*]My current EF crash test trials are on the 183rd game. 182 successes, one possible crash.

That is, the numbers represent X test trial games of X different scenarios. Possibly one EF scenario test crashed. I will have to investigate.

Automated crash tests, on two different test systems, will continue round the clock throughout this pre-release month. The automated test trials are now a standard part of our operation. (Automated, Empirical A/I Testing will resume in future.)

All in all, we actually have a very stable game here. To be sure, the game is not perfect, particularly not the A/I. (Improving the A/I is on the to-do list.) But white crashes? Virtually non-existent! We are trying very hard to give you the best, most robust game possible. If occasionally, rarely, there are slip-ups, it won't be for our lack of trying.
scottintacoma
Posts: 192
Joined: Fri Jan 25, 2008 1:15 am

RE: Coder Diary #12 -- Automated Crash Testing, Backup Saves

Post by scottintacoma »

Thanks for the update.

And to all the team, thanks for the hard work!!!