working title:

Development monitoring through automated and continuous tests

worked on by: Daniela Benze

Outline

Currently the Saros project is monitored through two test suites; however, the suites are not performing as desired. While tests have been written, few resources have been available to maintain them or to improve their performance. This thesis focuses on improving consistency and performance. This includes determining the causes of unhelpful test results such as false positives, speeding up execution, and possibly improving error messages and warnings.

To accomplish this the following tasks are planned:
  • an analysis of false-positive test results, resulting in an understanding of their most frequent causes and of which changes will yield the most noticeable improvements.
  • rewriting tests to be independent of the GUI, calling interfaces directly and logging platform warnings and error messages instead of simply recording GUI states.
  • improving the performance of individual tests to increase the number of tests that can run in a timely fashion.

Thesis Requirements

  • Identifying causes of false positives in contrast to real faults. Formulating solutions to differentiate significant faults within test execution, so that test results only highlight issues that necessitate review.
  • Evaluating and improving performance, without compromising correctness and reliability of the test suites.
  • Possibly formulating guidelines to facilitate further test development.

Milestones and Planning

Milestone  CW  Goal
1          41  implementation of automated test repetition to rule out failures due to glitches
2              log expansion to include essential information in main test logs
3              start-up and tear-down optimisation of individual tests

Weekly Status

Week 3 (CW 38)

Activities

  • gathering basic statistics on builds
  • setting up an environment for local tests

Results

  • basic structure for test results
  • platform to run additional tests

Next Steps

  • catalogue information within tests; compare false positives against "stable" builds and deliberately generated true positives

Week 4 (CW 39)

Activities

  • some test log generation, mostly analysis of collected results/logs

Results

  • very little, sadly; as a result this analysis is put on hold

Next Steps

  • implement automated retrying of failed tests. This should include logic to recognize failed tests, re-queue them (up to twice), and merge all results into a Jenkins-readable log file. A sketch of the failure-detection step follows this list.
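
A minimal sketch of the "recognize failed tests" step, assuming the STF run writes standard JUnit XML reports (TEST-<class>.xml) into a results/ directory; the directory name and report layout are assumptions, not confirmed details of the Saros job:

    #!/bin/sh
    # List every test class whose JUnit XML report records failures or
    # errors; these are the candidates for re-queueing.
    for report in results/TEST-*.xml; do
        if grep -Eq 'failures="[1-9]|errors="[1-9]' "$report"; then
            basename "$report" .xml | sed 's/^TEST-//'
        fi
    done > failed_tests.txt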

Week 5 (CW 40)

Activities

  • (sick)(again)
  • determining formal requirements of Jenkins log files and STF tests

Results

  • sketching script requirements to expand test execution.

Next Steps

  • adding a second block to the test queue, generated during/after the execution of the first (original, full) test set and consisting of all failed tests. A sketch of this two-block structure follows this list.
  • this should include merging the logs for each repeated test.
  • this should be a .sh script; only the call to the script should appear in the Jenkins GUI.
  • possibly expanding log messages
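
A minimal sketch of the planned two-block queue, assuming a hypothetical helper run_block.sh that runs a list of tests and prints the names of failing tests one per line, with per-test logs landing in logs/block1/ and logs/block2/; none of these names come from the actual job configuration:

    #!/bin/sh
    # Block 1: the original, full test set.
    ./run_block.sh $(cat all_tests.txt) > failed_block1.txt

    # Block 2: only the tests that failed in block 1.
    if [ -s failed_block1.txt ]; then
        ./run_block.sh $(cat failed_block1.txt) > failed_block2.txt
    fi

    # Merge the logs of both runs for each repeated test, so each test
    # ends up with a single combined log.
    mkdir -p logs/merged
    while read -r test; do
        cat "logs/block1/$test.log" "logs/block2/$test.log" \
            > "logs/merged/$test.log" 2>/dev/null
    done < failed_block1.txt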

Week 6 (CW 43)

Activities

  • copying the Jenkins job STF_Test to a new job STF_Test_debug for testing modifications; the copy will be triggered manually
  • adding a second block of tests to the test execution. This block should contain a list of failed tests from the first block.
  • test results/logs from repeated tests should include output of both runs.

Results

  • Some incongruities have caught my eye: even the fresh copy of the Jenkins job STF_Test does not behave like its original. Most importantly, the workspace is not reliably cleared at the beginning of a new run, resulting in all subsequent runs failing immediately. For now a forceful clear is included to allow testing of the script changes (see above); a sketch of that clear step follows. But eventually the cause should be found!
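
A minimal sketch of the forceful clear, relying only on the standard WORKSPACE variable that Jenkins exports to its build steps:

    # Empty the workspace at the start of the run; the :? expansion makes
    # the shell abort instead of deleting from / if WORKSPACE is unset.
    # (Hidden dotfiles would need an additional pattern.)
    rm -rf "${WORKSPACE:?}"/*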

Next Steps

  • make the debug job copy runnable, in order to develop the shell script & the changed job configuration for the test queue.

Week 7 (CW 44)

Activities

  • Jenkins job mod - see above - continue and finish this week

Results

  • At least one test fails due to an unhandled window pop-up. This is clear only from the user-specific log files; this information should appear in the main log! Logging should be expanded to include the error lines from the user logs in the main log. This would expand the main log by at most four lines for every test error, so the overhead should be negligible. A sketch of this expansion follows this list.
  • The failure above stems from a removed, otherwise redundant, component. A substitute SVN should be found so that the window no longer appears.
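
A minimal sketch of the log expansion, assuming per-user logs under logs/<user>/ and a main log file main.log; both paths are placeholders for the actual STF log locations:

    # Copy every user-log line containing ERROR into the main log,
    # prefixed (-H) with the originating file so failures stay traceable.
    grep -H 'ERROR' logs/*/*.log >> main.log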

Next Steps

  • include the error lines from the JUnit user logs in the main log.
  • find an alternative Saros SVN.

Week 8 (CW 45)

Activities

  • theoretical research and setting up the written thesis.

Results

  • found a test evaluation approach to use; getting up to speed on its taxonomy and methodology.
  • sketch of the thesis structure: what should/could be included

Next Steps

  • back to implementation. resume log expansion.

Week 9 (CW 46)

Activities

  • log expansion and clean-up. (slow progress)
  • reevaluated my overview of the project (it had caused some mistakes); made a detailed sketch of the structure from my perspective.

Next Steps

  • finish log expansion.

Week 10 (CW 47)

Activities

  • log expansion. should be finished this week.
  • add test repetition to the Saros-Eclipse-STF-Nightly job.
  • possibly finding the cause of the job copy problems; see results of week 6 (nope, no time)

Results

Next Steps

  • next task! speed-up of test start-up & tear-down.
  • possibly: clean-up of test start-up & tear-down, if it yields an immediate improvement.
  • possibly: finding the cause of the job copy problems (see results of week 6) or, better, finding a reliable way to standardise the job config.

Week 11 (CW 48)

Activities

Results

Next Steps


Week 12 (CW 49)

Activities

Results

Next Steps
