Core/SoarKernel/src/kernel.h contains a few relevant preprocessor symbols:
NO_TIMING_STUFF: Defining this symbol removes most timer code from the kernel.
The stats command output will be much shorter as well, as it will not include
timing statistics.
DETAILED_TIMING_STATS: Only valid when NO_TIMING_STUFF
is not defined. Defining this turns on more timers for more detailed stats for
things like chunking and GDS. Compiling out timer code can result in much
faster runs. Timers are compiled-in by default, but not detailed timing stats.
There is a timers command to enable/disable timers at run-time. This should
result in a performance improvement but not as much as compiling out the timers
completely by defining NO_TIMING_STUFF in kernel.h.
The timers are currently implemented using the STLSoft library. The STLSoft code
is wrapped in another layer in Core/shared/misc.h which define
soar_wallclock_timer and soar_process_timer. Here are the comments from that
file:
// Instantiate soar_wallclock_timer or soar_process_timer objects to measure// "wall" (real) time or process (cpu) time. Call reset if you are unsure of// the timer's state and want to make sure it is stopped. Call start and then// stop to deliniate a timed period, and then call get_usec to get the value// of that period's length, in microseconds. It is OK to call get_usec after// calling start again. Resolution varies by platform, and may be in the// millisecond range on some.//// Use soar_timer_accumulator to rack up multiple timer periods. Instead of// calling get_usec on the timer, simply pass the timer to a// soar_timer_accumulator instance with it's update call. Use reset to clear// the accumulated time.// Platform-specific inclusions and typedefs//// The STLSoft timers used in the kernel have platform-specific namespaces// even though they share very similar interfaces. The typedefs here// simplify the classes below by removing those namespaces.//// We are using two different types of timers from STLSoft,// performance_counter and processtimes_counter. The performance timer is// a high-performance wall-clock timer. The processtimes_counter is a cpu-// time timer. Keep in mind that as of 11/2010 the resolution of process-time// counters on windows is 16 milliseconds.
soar_process_timertimer;soar_timer_accumulatorstat;...timer.reset();// initialize/resetstat.reset();// initialize/reset...timer.start();// do stuff that takes timetimer.stop();stat.update(timer);// read the timer, add elapsed to stat...stat.get_sec();// total time accumulated with update();
By default, thesoar_process_timer uses an STLSoft interface called
processtimes_counter. This is fine on most systems, but has a somewhat
unacceptable 16 millisecond resolution on Windows platforms. The other kind of
timer available is performance_counter which is used by the soar_wallclock_timer. This counter has higher resolution but the timer itself takes
more time to execute (on most systems). To increase resolution on Windows
system, uncomment the symbol USE_PERFORMANCE_FOR_BOTH in
Core/shared/misc.h so thatsoar_process_timer uses the performance_counter.
Note that this higher-resolution timer is measuring things in different way than
the process times_counter and the results should probably not be compared.
/*For Soar 7, the timing code has been completely revamped. When the compileflag NO_TIMING_STUFF is not set, statistics will be now be collected on thetotal cpu time, total kernel time, time spent in the individual phases of adecision cycle, time spent executing the input and output functions, and timespent executing callbacks (or monitors). When the DETAILED_TIMING_STATS flagis set, additional statistics will be collected for ownership, match, andchunking computations according to the phase in which they occur. (Noticethat DETAILED_TIMING_STATS can only be collected when NO_TIMING_STUFF is nottrue.)The total_cpu_time is turned on when one of the run_<x> functions isinitiated. This timer is not turned off while the do_one_top_level_phase()function is executing. The total_kernel_time timer is turned on just afterthe total_cpu_time timer and turned off just before the other is turned off.This guarantees that the total kernel time -- including the time it takes toturn on and off the kernel timer -- is a part of the total cpu time. Thetotal_kernel_time is also turned off whenever a callback is initiated or whenthe input and output functions are executing.The decision_cycle_phase_timers measure the kernel time for each phase of thedecision cycle (ie, INPUT_PHASE, PREFERENCE_PHASE, WM_PHASE, OUTPUT_PHASE,and DECISION_PHASE). Each is turned on at the beginning of its correspondingphase in do_one_top_level_phase and turned off at the end of that phase.These timers are also turned off for callbacks and during the execution ofthe input and output functions.The monitors_cpu_time timers are also indexed by the current phase. Whenevera callback is initiated, both the total_kernel_time anddecision_cycle_phase_timer for the current phase are turned off and themonitors_cpu_time turned on. After the callback has terminated, the kerneltimers are turned back on. Notice that the same relationship holds here asit did between the total_cpu_time and total_kernel_time timers. Thetotal_kernel_time is always turned off last and turned on first, incomparison to the decision_cycle_phase_timer. This means that turning thedecision_cycle_phase_timers on and off is included as part of the kernel timeand helps ensure that the total_kernel_time is always greater than the sum ofthe decision_cycle_timers.The input_function_cpu_time and output_function_cpu_time timers measure thetime it takes to execute the input and output functions respectively. Boththe total_kernel_time and decision_cycle_phase_timers are turned off whenthese timers are turned on (with the same ordering as discussed previously).The input function is a little tricky. Because add-wme can be called by theinput routine, which then calls do_buffered_wm_and_ownership_changes, wecan't just turn off the kernel timers for input and expect to get numbers forboth match_time (see next para) and kernel time. The solution implemented inthe 28.07.96 changes is to not turn off the kernel timers until the actualINPUT_PHASE_CALLBACK is initiated. This takes care of all but directadditions and removals of WMEs. Since these are done through the add-wme andremove-wme commands, the input_timer is turned off there was well, and thekernel timers turned back on (for the buffered wm changes). However, this isa hack and may introduce problems when add-wme and remove-wme are used at thecommand line or someplace in the decision cycle other than input (probablyrare but possible).The DETAILED_TIMING_STATS flag enables collection of statistics on match,ownership and chunking calculations performed in each part of the decisioncycle. An 'other' value is reported which is simply the difference betweenthe sum of the detailed timers and the kernel timer for some phase. The othervalue should always be greater than or equal to zero.The "stats" command (in soarCommandUtils) has been updated to report thesenew timing values. The output is provided in a spreadsheet-style format todisplay the information in a succinct form. There are also some derivedtotals in that report. The derived totals in the right column are simply thesum of the all the other columns in a particular row; for example, thederived total for the first row, kernel time, is just the sum of all thedecision_cycle_phase_timers. The derived totals in the bottom row are thesum of all the basic timers in that row (i.e., no DETAILED statistics areincluded in the sum). For example, the derived total under input is equal tothe sum of decision_cycle_phase_timer and the monitors_time for theINPUT_PHASE, and the input_function_cpu_time and represents the total timespent in the input phase for the current run. The number in the lowerright-hand corner is the sum of the derived totals above it in that rightcolumn (and should always be equal to the numbers to the left of it in thatrow).Also reported with the stats command are the values of total_cpu_time andtotal_kernel_time. If the ordering discussed above is strictly enforced,total_kernel_time should always be slightly greater than the derived totalkernel time and total_cpu_time greater than the derived total CPU time. REW */