V+Ü Empirical Evaluation in Informatics (Empirische Bewertung in der Informatik) SS 2011

This is the homepage of the lecture (Vorlesung) "Empirische Bewertung in der Informatik" (Empirical Evaluation in Informatics) and its corresponding tutorial (Übung).

Description

As an engineering discipline, Informatics is constantly developing new artifacts such as methods, languages/notations or concrete software systems. In most cases, the functional efficiency and effectiveness of these solutions for the intended purpose is not obvious -- especially not in comparison to other already existing solutions for the same or similar purpose.

For this reason, methods for evaluating the efficacy of these solutions must be a routine part of Informatics -- a fact that, unfortunately, has been recognized only slowly. Evaluation is needed by those who create new solutions (that is, in research and development), but also by users, who need to evaluate the expected efficacy for their specific situation. These evaluations need to be empirical (that is, based on observation), because the problems are nearly always too complicated for an analytical (that is, purely thought-based) approach.

This lecture presents the most important empirical evaluation methods and explains, with examples, where they have been used and where they should be used, how to use them, and what to consider when doing so.


Administration

Lecturers

Requirements/target group, classification, credit points etc.

See the entry in the KVV (course syllabus).

Registration

Mailing list: All participants need to be members of the mailing list SE_V_EMPIR. (Please enter both your given name and family name.) Important information and announcements will be sent via this list. Please sign up individually.

KVV (course syllabus): All participants need to have registered in the KVV.

Dates

  • The lecture is held Mondays from 10:15 to 11:45 in room 049, Takustr. 9
  • The tutorial takes place Mondays from 12:15 to 13:45 in room 049, Takustr. 9
  • Written exam: Monday, 2011-07-18, 11:59, Hörsaal, Takustr. 9 (first Monday after the end of term)
  • Post-exam review (Klausureinsicht): Wednesday, 2011-10-19, 15:59 until at least 16:30, room 051, Takustr. 9

Examination modalities

Necessary criteria for obtaining the credit points:
  • Completion of at least (n-1) practice sheets
  • active participation in the tutorial
  • passing of the written examination


Content

Note concerning links

Some of the linked documents can only be accessed from the FU net (externally you receive a 403/Forbidden: "You don't have permission to access ...").

Attention: The practice sheets can now be found in the separate section "Practice sheets".

Lecture topics

The lecture divides into three sections:
  • Introduction (3 weeks): Introduces the basic ideas of empiricism and discusses quality characteristics for empirical studies (lectures 1 to 3).
  • Methods (7 weeks): Presents basic aspects of and approaches to various empirical methods and illustrates them with concrete examples from the scientific literature.
  • Data analysis (2 weeks): Empirical studies always generate raw data first, which may be partly qualitative and partly quantitative. The research results arise only from the analysis and interpretation of that data. The analysis of quantitative data is such a comprehensive topic that an entire degree can be dedicated to it (statistics).
    This section gives a first introduction to the analysis of quantitative data. (The completely different analysis of qualitative data is beyond the scope of this lecture.)

The individual lectures:
  1. (11.4.2011) Introduction - The role of empiricism:
    • Term "empirical evaluation"; theory, construction, empiricism; status of empiricism in Informatics
    • Hypothetical examples of use
    • quality criteria: reliability, relevance
    • Note: scale types
  2. (18.4.2011) The scientific method:
    • Science and methods for gaining insights; classification of Informatics
    • The scientific method; variables, hypotheses, control; internal and external validity; validity, reliability, relevance
  3. (2.5.2011) How to lie with statistics:
    • When looking at somebody else's conclusions from data: What is actually meant? What specifically? How can they know it? What is not said?
    • Does the measurement distort the meaning? Is the sample biased?, etc.
    • Material: book on the topic; Study on alternative ink; article with arguments against hypothesis testing: "The earth is round (p < 0.05)".

  4. (9.5.2011) Empirical approach:
    • steps: formulate aim and question; select method and design study; create study situation; collect data; evaluate findings; interpret results.
    • example: N-version programming (article, reply to the criticisms against it)
  5. (16.5.2011) Survey:
    • example: relevance of different topics in Informatics education (article)
    • method: selection of aims; selection of group to be interviewed; design and validation of the questionnaire; execution of the survey; evaluation; interpretation
  6. (23.5.2011) Controlled experiment:
    • example 1: flow charts versus pseudo-code (article, criticized prior work)
    • method: control and constancy; problems with reaching constancy; techniques for reaching constancy
    • example 2: use of design pattern documentation (article)
  7. (30.5.2011) Quasi experiment:
    • example 1: comparison of 7 programming languages (article, detailed technical report)
    • method: like controlled experiment, but with incomplete control (mostly: no randomization)
    • example 2: influence of work place conditions on productivity (article)
  8. (6.6.2011) Benchmarking:
    • example 1: SPEC CPU2000 (article)
    • Benchmark = measurement + task + comparison; problems (costs, task selection, overfitting); quality characteristics (accessibility, effort, clarity, portability, scalability, relevance) (article)
    • example 2: TREC (article)

  9. (20.6.2011) Data analysis - basic terminology:
  10. (27.6.2011) Data analysis - techniques:
    • Samples and populations; average value; variability; comparison of samples: significance test, confidence interval; bootstrap; relations between variables: plots, linear models, correlations, local models (loess)
    • Article: "A tour through the visualization zoo"

  11. (4.7.2011) Case study:
    • example 1: Familiarization with a software team (article)
    • method: characteristics of case studies; what is the 'case'?; use of many data types; triangulation; validity dimensions
    • example 2: An unconventional method for requirements inspections (article)
  12. (11.7.2011) Other methods:
    • The method landscape; simulation; software archeology (studies on the basis of existing data); literature study
    • example simulation: scaling of P2P file sharing (article)
    • example software archeology: code decline (article)
    • example literature study: a model of the effectiveness of reviews (article)

  13. (oops, term is already over!) Summary and advice:
    • Role of empiricism; quality criteria; generic method; advantages and disadvantages of the methods; practical advice (for data analysis; for conclusion-drawing; for final presentation); outlook
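
Lecture 10 names the bootstrap and confidence intervals among the techniques for comparing samples. As an illustration only (it is not taken from the lecture material), the following Python sketch computes a percentile-bootstrap confidence interval for the difference between two sample means. The function name bootstrap_mean_diff_ci, its parameters, and the task-completion times are hypothetical; the tutorials themselves use R.

```python
import random
import statistics


def bootstrap_mean_diff_ci(sample_a, sample_b, n_resamples=10_000,
                           confidence=0.95, seed=0):
    """Percentile-bootstrap confidence interval for mean(a) - mean(b).

    Hypothetical helper for illustration; not part of the course material.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping the group sizes.
        resample_a = rng.choices(sample_a, k=len(sample_a))
        resample_b = rng.choices(sample_b, k=len(sample_b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    # Cut off alpha/2 of the resampled differences at each end.
    alpha = 1.0 - confidence
    lower_index = max(0, int((alpha / 2) * n_resamples))
    upper_index = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return diffs[lower_index], diffs[upper_index]


# Made-up task completion times in minutes for two groups of participants.
group_a = [12.1, 9.8, 11.4, 10.9, 13.2, 10.1, 12.7, 11.8]
group_b = [14.3, 12.9, 15.1, 13.6, 12.2, 14.8, 13.9, 15.4]

low, high = bootstrap_mean_diff_ci(group_a, group_b)
print(f"95% bootstrap CI for the mean difference: [{low:.2f}, {high:.2f}]")
```

If the resulting interval excludes zero, the made-up data would suggest a difference between the groups; with samples this small, such a result should be read with the caution that lecture 3 recommends.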

Aims of the tutorials

  • Tutorials 1 to 3 (concerning R)
    • To get to know the possibilities of a free, comprehensive and modern statistics software and gain basic skills with it.
    • To get to know a new way of thinking for programming ("programming with data") and practice it.
    • To realize how enlightening a data analysis may be in some cases and how useless in others.
  • Tutorials 4 to 9 (project: empirical study)
    • To have gone through the design process of an empirical study oneself and to realize how many aspects must be considered.
    • To experience how many good ideas you may have and how many others possibly are still missing.
    • To realize how important it is to work accurately (because a correction of mistakes is often impossible and usually causes a huge amount of extra work).
    • To have had the gee-whiz experience of analysing data which nobody else in this world has seen so far.

Practice sheets

(These links will be added continuously as the course proceeds.)

Groups

Group number | Team members | Research question | Target group
1 | Dohse, Eickelberg, G*** v** M******, Schlott | How do you organize your e-mail volume? | The survey is designed so that anyone who uses e-mail regularly and has basic computer skills qualifies as a participant. These low requirements are meant to attract as many participants as possible.
2 | Dames, Dittwald, Busse, Prüm, Otte | Does the use of version control systems (VCS) increase as students' education progresses? | Informatics students (HU, FH Trier)
3 | Kühl, Eckstein, Kresse, Bischoff | Which factors influence the expected duration of studies? | Bachelor's and Master's students of Informatics at various universities; mailing lists of the Informatics institutes
4 | Klick, Mitlmeier, Öchsner | Protection of personal data of Informatics students | Participants of the lecture Rechnersicherheit (computer security), participants of the software project Übersetzerbau (compiler construction), Informatics mailing lists in Karlsruhe and Würzburg
5 | Lashchyk, Olivera Viscarra, Siripanya, Kölbel | Which aspects of their studies give the Informatics students of the FU the greatest satisfaction? | The Informatics students of the FU
6 | Dohnert, Bitterling, Kostulski, Lengsfeld | What effect does participation in an open-source project have on Informatics studies? | Informatics students
9 | Koeckeritz, Poehle, Schroeter, Niemeier | How pronounced is the security awareness of different user groups regarding the use of Facebook apps? | All Facebook users

Changes over the years

  • 2004: Lecture first held.
  • 2005: Lecture: only minor changes. Tutorial: broader choice of topics for the surveys.
  • 2010: Lecture and tutorial both held in English.

Literature


(Comments)

Should you have comments or suggestions concerning this page, you may add them here (possibly with date and name):

"Wer zuerst kommt, mahlt zuerst" (first come, first served) is of course a brilliant procedure for allocating the mailing lists! The first person signs up with all@fu-berlin.de (I know, it doesn't exist), and since all other mailing lists are merely subsets of that one, the others can twiddle their thumbs, or what? -- DennisHartrampf - 14 Jun 2009

No. If several groups have a legitimate interest in the same mailing lists, we resolve the conflict verbally in the tutorial session (as has already happened). Nobody will have to twiddle their thumbs! -- MartinGruhn - 18 Jun 2009

It is spelled "malt", not "mahlt"; you are not a mill, after all.

-- Main.nvm - 17 Jul 2011

"Mahlt" is correct, at least according to Fontane, Goethe and Kafka. But maybe those gentlemen were mistaken as well ;-) Research first, then comment.

-- MaximilianSchmidt - 25 Jul 2011
 
Topic revision: r37 - 21 Mar 2012, JuliaSchenk
 