Below we list topics of interest for projects and theses. Please contact the corresponding person for more details about a specific topic.
Differences and similarities between modern web automation frameworks
Several frameworks exist that automate conventional browsers. The best known are Puppeteer (Chrome), Playwright (Chromium, Firefox, WebKit), and Selenium with WebDriver. The goal of this work is to describe and contrast the mechanics of these frameworks. Based on this knowledge, the fingerprint surface of each framework shall be determined.
Attacking automation frameworks
Automation frameworks enable search engines, business intelligence, and large-scale research. However, few attempts have been made to explore the threat model for these frameworks and their applications. This project investigates this largely unexplored area.
Effectiveness of stealth modifications
Recent research has shown how to detect web bots by their fingerprints. In response, the first implementations that hide a bot's identifiable properties have emerged. This work shall evaluate the effectiveness of these stealth modifications with respect to detectability and website compatibility.
Effects of bot detection on measurements
Recent work has shown that using so-called headless browsers significantly affects web measurements. However, the general question of how a specific bot setup influences measurements remains open. This project aims to investigate the effect of different bot setups on web measurements.
How to compare scraping runs?
Web bots are commonly used to study the web. However, the reliability of some bots has recently been questioned. Comparing measurements made with bots is one way to assess the reliability of these tools. The project aims to identify the parameters that need to be considered in such a comparison and to find suitable settings for them.