I am an Associate Professor of Computer Science at the University of Göttingen, where I lead the Computer Security Group. Prior to taking this position, I worked at Technische Universität Berlin and Fraunhofer Institute FIRST. I am a recipient of the CAST/GI Dissertation Award for Computer Security and a Google Faculty Research Award.
My research interests revolve around computer security and machine learning. This includes the detection of computer attacks, the analysis of malicious software, and the discovery of vulnerabilities. I am also interested in efficient algorithms for analyzing structured data, such as sequences, trees, and graphs.
Harry: A Tool for Measuring String Similarity.
Journal of Machine Learning Research (JMLR), to appear October 2015.
Comparing strings and assessing their similarity is a basic operation in many application domains of machine learning, such as in information retrieval, natural language processing and bioinformatics. The practitioner can choose from a large variety of available similarity measures for this task, each emphasizing different aspects of the string data. In this article, we present Harry, a small tool specifically designed for measuring the similarity of strings. Harry implements over 20 similarity measures, including common string distances and string kernels, such as the Levenshtein distance and the Subsequence kernel. The tool has been designed with efficiency in mind and allows for multi-threaded as well as distributed computing, enabling the analysis of large data sets of strings. Harry supports common data formats and thus can interface with analysis environments, such as Matlab, Pylab and Weka.
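As an illustration of one of the measures named above, the Levenshtein distance can be sketched in a few lines of Python. This is a minimal textbook version with unit costs, not Harry's implementation; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b with unit-cost insert, delete, substitute."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short

    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca by cb (or match)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Only two rows of the dynamic-programming table are kept at a time, so memory grows with the shorter string rather than with the product of both lengths.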
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assist Code Audits.
Despite the security community's best effort, the number of serious vulnerabilities discovered in software is increasing rapidly. In theory, security audits should find and remove the vulnerabilities before the code ever gets deployed. However, due to the enormous amount of code being produced, as well as the lack of manpower and expertise, not all code is sufficiently audited. Thus, many vulnerabilities slip into production systems. A best-practice approach is to use a code metric analysis tool, such as Flawfinder, to flag potentially dangerous code so that it can receive special attention. However, because these tools have a very high false-positive rate, the manual effort needed to find vulnerabilities remains overwhelming. In this paper, we present a new method of finding potentially dangerous code in code repositories with a significantly lower false-positive rate than comparable systems. We combine code-metric analysis with metadata gathered from code repositories to help code review teams prioritize their work. The paper makes three contributions. First, we conducted the first large-scale mapping of CVEs to GitHub commits in order to create a vulnerable commit database. Second, based on this database, we trained an SVM classifier to flag suspicious commits. Compared to Flawfinder, our approach reduces the number of false alarms by over 99% at the same level of recall. Finally, we present a thorough quantitative and qualitative analysis of our approach and discuss lessons learned from the results. We will share the database as a benchmark for future research and will also provide our analysis tool as a web service.
Automatic Inference of Search Patterns for Taint-Style Vulnerabilities.
Taint-style vulnerabilities are a persistent problem in software development, as the recently discovered “Heartbleed” vulnerability strikingly illustrates. In this class of vulnerabilities, attacker-controlled data is passed unsanitized from an input source to a sensitive sink. While simple instances of this vulnerability class can be detected automatically, more subtle defects involving data flow across several functions or project-specific APIs are mainly discovered by manual auditing. Different techniques have been proposed to accelerate this process by searching for typical patterns of vulnerable code. However, all of these approaches require a security expert to manually model and specify appropriate patterns in practice. In this paper, we propose a method for automatically inferring search patterns for taint-style vulnerabilities in C code. Given a security-sensitive sink, such as a memory function, our method automatically identifies corresponding source-sink systems and constructs patterns that model the data flow and sanitization in these systems. The inferred patterns are expressed as traversals in a code property graph and enable efficiently searching for unsanitized data flows—across several functions as well as with project-specific APIs. We demonstrate the efficacy of this approach in different experiments with 5 open-source projects. The inferred search patterns reduce the amount of code to inspect for finding known vulnerabilities by 94.4% and also enable us to uncover 8 previously unknown vulnerabilities.
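The notion of an unsanitized flow from a source to a sink can be illustrated with a toy reachability check. This is only a sketch on a hand-written data-flow graph, not the code property graph traversals described in the paper; the function `has_unsanitized_flow` and the node names (`recv`, `length`, `check_bounds`, `memcpy`) are hypothetical.

```python
from collections import deque


def has_unsanitized_flow(flows, source, sink, sanitizers):
    """Return True if data can reach sink from source without passing a sanitizer.

    flows maps each node to the nodes its data flows into (an assumed,
    hand-built graph; real analyses derive this from the code).
    """
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            return True
        for successor in flows.get(node, ()):
            # Paths through a sanitizer are considered safe and are not followed.
            if successor in seen or successor in sanitizers:
                continue
            seen.add(successor)
            queue.append(successor)
    return False


if __name__ == "__main__":
    # The length read from the network reaches memcpy both directly and via a check.
    flows = {
        "recv": ["length"],
        "length": ["check_bounds", "memcpy"],
        "check_bounds": ["memcpy"],
    }
    print(has_unsanitized_flow(flows, "recv", "memcpy", {"check_bounds"}))
```

In this example the direct edge from `length` to `memcpy` bypasses the bounds check, so the flow is reported; removing that edge leaves only the sanitized path.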
Torben: A Practical Side-Channel Attack for Deanonymizing Tor Communication.
The Tor network has established itself as the de facto standard for anonymous communication on the Internet, providing an increased level of privacy to over a million users worldwide. As a result, interest in the security of Tor is steadily growing, attracting researchers from academia as well as industry and even nation-state actors. While various attacks based on traffic analysis have been proposed, low accuracy and high false-positive rates in real-world settings still prohibit their application on a large scale. Instead, the few known cases of deanonymization have been reported to rely on vulnerabilities in browser implementations and cannot be considered weaknesses in Tor itself. In this paper, we present Torben, a novel deanonymization attack against Tor. Our approach is considerably more reliable than existing traffic analysis attacks and, at the same time, far less intrusive than browser exploits. The attack is based on an unfortunate interplay of technologies: (a) web pages can be easily manipulated to load content from untrusted origins and (b) despite encryption, low-latency anonymization networks cannot effectively hide the size of request-response pairs. We demonstrate that an attacker can abuse this interplay to design a side channel in the communication of Tor, allowing short web page markers to be transmitted and exposing the web page a user visits over Tor. In an empirical evaluation with 60,000 web pages, our attack enables detecting these markers with an accuracy of over 91% and no false positives.
See all publications.
Editorial board of the Journal of Machine Learning Research (JMLR)
Guest editor of the special issue "Threat Detection, Analysis and Defense" in JISA
Steering committee of the GI SIG Intrusion Detection and Response (SIDAR)
Steering committee of the Conference on Detection of Intrusions and Malware (DIMVA)
Associate Member of the EU Network of Excellence SYSSEC
Forum InformatikerInnen für Frieden und gesellschaftliche Verantwortung e.V. (FIfF)
Conference and Workshop Organization
Program chair of the 10th Conference on Detection of Intrusions and Malware (DIMVA 2013)
General chair of the 6th European Conference on Computer Network Defense (EC2ND 2010)
Local organization of GI Graduate Workshop on Reactive Security (SPRING 2006)
Recent PC Memberships
2016: EUROS&P; SEC;
2015: CCS; ACSAC; RAID; DIMVA; WWW; ESSOS; SEC; EUC; AISEC; EUROSEC; MLOSS;
2014: CCS; RAID; DIMVA; EUC; ARES; AISEC; EUROSEC; ECRIME; LSP;
2013: DIMVA; ARES; PST; AISEC; MLOSS;
2012: DIMVA; ARES; SSS; AISEC;
I am a member of "Verband der krawattenlosen Wissensträger" (VDKW)
See all community services.