A program fails.
Under which circumstances does the failure occur?
Our \textsc{Alhazen}\xspace approach starts with a run that exhibits a particular behavior and automatically determines input features associated with the behavior in question:
(1) We use a \emph{grammar} to parse the input into individual elements.
(2) We use a decision tree learner to \emph{observe} and \emph{learn} which input elements are associated with the behavior in question.
(3) We use the grammar to \emph{generate additional inputs} that further strengthen or refute the learned associations, which we treat as hypotheses.
(4) By repeating steps 2~and~3, we obtain
a \emph{theory} that explains and predicts the given behavior.
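
The four steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the \textsc{Alhazen} implementation: the toy grammar (\texttt{OPTIONS}, \texttt{PATTERNS}), the stand-in program \texttt{program\_fails}, the hand-written learner \texttt{learn\_tree}, and the driver \texttt{explain} are all hypothetical names introduced here, and new inputs are drawn uniformly at random from the grammar rather than being targeted at specific hypotheses.

```python
import random

# Hypothetical toy grammar: an input is a set of options plus a search pattern.
OPTIONS = ["--fixed-strings", "--ignore-case", "--count"]
PATTERNS = ["", "a", "foo"]

def generate(rng):
    """Step 3: produce a random input from the toy grammar."""
    opts = frozenset(o for o in OPTIONS if rng.random() < 0.5)
    return opts, rng.choice(PATTERNS)

def features(inp):
    """Step 1: 'parse' an input into boolean features over its elements."""
    opts, pattern = inp
    feats = {o: o in opts for o in OPTIONS}
    feats["empty-pattern"] = pattern == ""
    return feats

def program_fails(inp):
    """Stand-in for the program under test (an assumption for illustration)."""
    opts, pattern = inp
    return "--fixed-strings" in opts and pattern == ""

def gini(labels):
    """Gini impurity of a list of boolean labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def learn_tree(rows, depth=3):
    """Step 2: a minimal decision tree learner over boolean features.

    Returns either a leaf (True/False) or a tuple (feature, if_true, if_false).
    """
    labels = [y for _, y in rows]
    majority = sum(labels) * 2 > len(labels)
    if depth == 0 or len(set(labels)) <= 1:
        return majority

    def split_cost(f):
        yes = [y for x, y in rows if x[f]]
        no = [y for x, y in rows if not x[f]]
        return len(yes) * gini(yes) + len(no) * gini(no)

    best = min(rows[0][0], key=split_cost)
    yes = [(x, y) for x, y in rows if x[best]]
    no = [(x, y) for x, y in rows if not x[best]]
    if not yes or not no:
        return majority
    return (best, learn_tree(yes, depth - 1), learn_tree(no, depth - 1))

def predict(tree, feats):
    """Follow the tree down to a leaf and return its label."""
    while isinstance(tree, tuple):
        feature, if_true, if_false = tree
        tree = if_true if feats[feature] else if_false
    return tree

def observe(inputs):
    """Run the program on each input and pair its features with the outcome."""
    return [(features(i), program_fails(i)) for i in inputs]

def explain(seed_input, rounds=4, per_round=25, seed=0):
    """Step 4: alternate between learning and generating new inputs."""
    rng = random.Random(seed)
    inputs = [seed_input]
    tree = learn_tree(observe(inputs))
    for _ in range(rounds):
        inputs += [generate(rng) for _ in range(per_round)]
        tree = learn_tree(observe(inputs))
    return tree

if __name__ == "__main__":
    failing_run = (frozenset({"--fixed-strings"}), "")
    print(explain(failing_run))
```

Starting from a single failing run, the first tree is a trivial leaf that predicts failure for every input. Only the additional generated inputs allow the learner to narrow the theory down to the features that actually matter.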
In our evaluation using inputs for \texttt{find}, \texttt{grep}, \texttt{NetHack}, and a JavaScript transpiler, the theories produced by \textsc{Alhazen} \emph{predict} and \emph{produce} failures with high accuracy and allow developers to \emph{focus} on a small set of input features:
``\texttt{grep} fails whenever the \texttt{-{}-fixed-strings} option is used in conjunction with an empty search string.''