HERSS Summer School 2024 on Innovative Methods in Higher Education Research and Science Studies from September 9th to 13th, 2024 at DZHW in Hannover, Germany
The research area of higher education and science studies is at a crossroads. The all-encompassing expansion of the web, coupled with new methodological toolkits, opens completely new research avenues that go far beyond traditional (web) survey techniques. Researchers continuously strive for new, innovative research methods to, for example, recruit participants (e.g., through social media and apps), use mixed-method approaches (e.g., through narrative voice answers in smartphone surveys), investigate digital behavior (e.g., through web tracking), and explore activities on social platforms (e.g., through digital data donation). This methodological expansion is accompanied by a massive increase in new, powerful analysis techniques that fall under the umbrella of social data science, including text mining and machine learning. These analysis techniques are key to coping with the inherent heterogeneity and flood of data.
Here you will find the call for participation, which is outlined in more detail below. The call is open until 17th of May 2024 (extended until 31st of May 2024).
Further Information
Topics of the Summer School
This year's HERSS Summer School aims to bring forward recent advances in innovative methodology in the field of higher education and science studies. It includes keynotes and hands-on workshops teaching and discussing cutting-edge methods and data analysis techniques, including research on advances related to the whole data life cycle. Along this cycle, we structure the summer school into three overarching tracks: (1) data collection, (2) data sharing and enrichment and (3) data analysis. In the data collection track, modern app-based or multi-modal survey techniques are discussed. For the data sharing and enrichment track, participants will get insights into novel data sources and how to collect them. The data analysis track focuses on using modern machine-learning and natural language processing (NLP) frameworks in R or Python.
Audience and Content
HERSS Summer School invites contributions from undergraduates, doctoral students, and postdocs in the field of higher education research and science studies from all over the world. It offers a fruitful exchange format for both qualitative and quantitative researchers, as well as for those who want to gain deeper insights into the most recent developments. Although most of the workshops (three per track) are designed to accommodate beginners as well, basic knowledge of methodology and data analysis is a prerequisite. To foster fruitful discussions and collaboration opportunities, we require participants to submit a motivation letter for participation in the HERSS Summer School. Furthermore, slots for the presentation of research projects are scheduled in which participants can present their advances. During these sessions, peers will give valuable feedback. Finally, HERSS Summer School includes several keynotes from international experts speaking about cutting-edge research topics, such as digital data donation.
The summer school is structured into 9 workshops, 4 keynotes, 2 poster sessions, and a plenary debate.
To date, the following speakers have accepted our invitation:
Data Collection Track:
- Nonresponse and Data Integration: Camilla Salvatore – Utrecht University (https://www.uu.nl/staff/CSalvatore)
Abstract: Given the declining response rates and increasing costs associated with traditional probability-based sample surveys, researchers and survey organisations are increasingly investigating the use of alternative data sources, such as nonprobability sample surveys and digital trace data.
While initially considered as potential replacements, it's now clear that the most promising role for these alternative data sources is supplementing probability-based sample surveys. Indeed, the use of auxiliary data is a considerable opportunity as it often allows for timeliness, data with detailed time intervals, and geographical granularity, among others.
Therefore, the following research question arises: how can we integrate traditional surveys and nontraditional data? This is where this workshop comes in.
In this workshop, we consider different types of data (e.g., nonprobability samples and digital trace data). We will review existing data integration techniques, and participants will engage in discussions covering the following topics:
- Opportunities and challenges of survey data integration
- Strategies for evaluating and managing data quality and errors
- Comparing different data integration techniques
The session will include discussion of case studies from existing literature, supplemented by practical exercises (in R) to demonstrate the application of data integration techniques. Participants should have basic pre-existing knowledge about survey methodology and statistics.
At the end of the workshop, participants will be able to:
- Identify challenges related to data quality and errors
- Design research projects centred on survey data integration
- Select appropriate methodologies for survey data integration
- Acknowledge the limitations of such methodologies
Participants will have the opportunity to share and discuss challenges encountered in their own research together with the instructor and fellow participants.
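One simple way to picture the integration idea described above is post-stratification weighting: a nonprobability sample is reweighted so that its group composition matches benchmark shares taken from a probability-based source. The sketch below is only an illustration and not the workshop's material (the exercises there are in R). The function name, the groups, the answers, and the benchmark shares are all hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per respondent: benchmark share / sample share of their group."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    weights = []
    for group in sample_groups:
        sample_share = counts[group] / n
        weights.append(population_shares[group] / sample_share)
    return weights


# Hypothetical nonprobability sample in which students are over-represented.
sample_groups = ["student"] * 6 + ["staff"] * 4
# Hypothetical yes/no answers (1 = yes), in the same order as sample_groups.
answers = [1, 1, 1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical benchmark shares, e.g. from a probability-based survey.
population_shares = {"student": 0.5, "staff": 0.5}

weights = poststratification_weights(sample_groups, population_shares)

# Compare the raw sample mean with the weighted estimate.
unweighted = sum(answers) / len(answers)
weighted = sum(w * a for w, a in zip(weights, answers)) / sum(weights)

print(f"unweighted estimate: {unweighted:.3f}")
print(f"weighted estimate:   {weighted:.3f}")
```

The weighted estimate moves toward the group that is under-represented in the sample. This kind of adjustment only corrects for the variables used to form the groups, which is one of the limitations the workshop sets out to discuss.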
- Multi-modal Data Collection: Jan Karem Höhne – DZHW Hannover, Leibniz University Hannover (https://jkhoehne.eu)
Abstract: Inexpensive and time-efficient web surveys have increasingly replaced survey interviews, especially those conducted in person. Even well-known social surveys, such as the European Social Survey (ESS), follow this trend. However, web surveys suffer from low response rates and frequently struggle to assure that the data are of high quality. New advances in communication technology and artificial intelligence (AI) make it possible to introduce new approaches to web survey data collection. Building on these advances, I introduce participants to web surveys in which questions are read by life-like virtual interviewers and in which respondents answer by selecting options from rating scales or providing open narrative answers. To this end, I incorporate features of in-person interviews in self-administered web surveys. This has great potential to improve data quality through the creation of rapport and engagement, while providing respondents with all benefits of the self-administered web survey mode. In this course, I initially give a thorough and empirically driven overview of potential effects of virtual interviewers on respondents’ answer behavior by relying on existing survey literature and my own research. In addition, participants obtain comprehensive insights into the creation of life-like virtual interviewers for surveying respondents and how to implement them in contemporary web surveys. Importantly, the course includes applied data collection exercises in which participants
- work with a video generation platform,
- learn to implement videos of virtual interviewers in survey software,
- manage large-scale web surveys with virtual interviewers,
- and get novel insights into data handling.
As a basis for the data-driven showcases, I will use the platform HeyGen (www.heygen.com) and the survey software solution Unipark (www.unipark.com). Previous knowledge of web surveys and programming skills are not required. Participants are encouraged to prepare questions concerning current and future studies and to bring a laptop PC for the exercises.
Abstract Lightning Talk: Tuesday, 10th, 1:30 - 2:45pm
Lightning Talk: Innovating mixed methods: Merits and limits of open voice answers from smartphone surveys
The increase of smartphone usage in web surveys, coupled with developments in communication technology, provides novel opportunities for measuring respondents’ attitudes and opinions. Specifically, smartphones allow the collection of voice instead of text answers by using the built-in microphone. This facilitates answering open questions, potentially resulting in richer information and higher data quality. However, existing research also points to some methodological shortcomings. For example, voice answers add an additional layer to data processing because they require transcription and result in relatively high item non-response. In this talk, I provide empirical evidence on the merits and limits of voice answers collected in smartphone surveys. To this end, I start with new insights into the performance of Automatic Speech Recognition (ASR) systems when it comes to the transcription of voice answers. I then look at item non-response, identifying characteristics of “non-responders.” Finally, I evaluate the linguistic and content characteristics of voice answers in comparison to text answers.
Keywords: Automatic Speech Recognition (ASR), Data quality, Missing data, Smartphone sensors, Text-as-Data methods
- App-based Surveying: Joshua Claassen – DZHW Hannover, Leibniz University Hannover (https://www.dzhw.eu/gmbh/mitarbeiter?m_id=987)
Abstract: Academia is increasingly shifting from the analogue to the digital world with digital platforms, such as Google Scholar, Harvard Dataverse, and Open Science Framework, shaping the behavior of both students and researchers. However, established data collection methods in higher education research and science studies, such as contemporary web surveys, frequently fall short of accurately measuring digital behavior because they are prone to recall error (i.e., erroneous recalling and reporting of past behavior) and social desirability bias (i.e., misreporting of behavior to comply with social norms). New advances in the collection of digital trace (or metered) data make it possible to directly measure digital behavior in the form of browser logs (e.g., visited URLs and search terms) and apps (e.g., duration and frequency of use). Building on these advances, I introduce participants to web surveys augmented with metered data. In this course, I initially give a thorough overview of the manifold new measurement opportunities provided by metered data. In addition, participants obtain comprehensive insights into the collection, analysis, and error sources of metered data as well as its application to higher education research and science studies. Importantly, the course includes applied data collection and analysis exercises in which participants
- learn how to plan and conceptualize a metered data collection,
- apply metered data to their personal research interests,
- work with metered datasets,
- and get novel insights into the proper handling of metered data.
Previous knowledge of metered data and programming skills are not required. Participants are encouraged to prepare questions concerning possible applications to current and future studies and to bring a laptop PC for the exercises.
Data Sharing and Enrichment Track:
- Social Media Sampling: Zaza Sophie Louise Zindel - University Bielefeld (https://ekvv.uni-bielefeld.de/pers_publ/publ/PersonDetail.jsp?personId=84399219)
Abstract: The rise of social media engagement among the general population is opening up new and expanded opportunities for survey researchers. A growing number of studies have leveraged these platforms to recruit survey respondents, particularly taking advantage of their extensive targeting capabilities. This session will provide an introduction to the Meta Ads Manager, the predominant tool currently used for participant recruitment. It will discuss various targeting options and compare the estimated reach across different social media platforms. In addition, the session will provide insights into research findings from several studies conducted between 2019 and 2023. Attendees will leave with a clear understanding of how to leverage social media platforms for their own survey-based research projects.
- Data collection using digital traces: Bella Struminskaya - Utrecht University (https://www.uu.nl/staff/BStruminskaya)
Abstract: Smartphone sensors (e.g., GPS, camera, accelerometer) and apps allow researchers to collect rich behavioral data, potentially with less measurement error and lower respondent burden than self-reports through surveys. Passive mobile data collection (e.g., location tracking, call logs, browsing history), donation of digital traces, and respondents performing additional tasks on smartphones (e.g., taking pictures, scanning receipts) can augment or replace self-reports. However, there are multiple challenges to collecting these data: participant selectivity, (non)willingness to provide sensor data or perform additional tasks, ethical issues, privacy concerns, usefulness of these data, and practical issues of in-browser measurement and app development. This course will address these challenges by reviewing state-of-the-art practices of smartphone sensor data collection, ranging from small-scale studies of hard-to-reach populations to large-scale studies to produce official statistics, and discuss design best-practices for sensor measurement. Recommendations provided will include:
- What research questions can be answered using smartphone sensors and apps?
- What are participants' concerns and how to address them?
- How to ask for consent for sensor measurements and ensure participation?
This course will discuss methods of assessing data quality and touch upon the analysis of passively collected data. The course will not provide analytic methods for 'found' data nor demonstrate how to program smartphone sensor apps.
Who should attend: The course is intended for survey practitioners, researchers, or students who want a practical introduction to smartphone sensor-based research. No prior knowledge of smartphone sensors is required, but a basic understanding of survey practice and survey errors is helpful.
By the end of the course participants will:
- know what smartphone sensors are available and what they can measure to facilitate and enhance surveys
- be able to identify potential applications of smartphone sensor measurement for their own data collection
- be able to anticipate practical issues when implementing smartphone sensor data collection.
Abstract Keynote: Thursday 12th, 1:30 - 2:45pm
Keynote: Augmenting Surveys with Smartphone Sensor, App, and Digital Trace Data: Design and Data Integration
Traditional surveys are not well-equipped to measure certain concepts of interest such as expenditures, time use or travel behavior due to the high burden placed on participants. Facts or behaviors that are difficult to measure through self-report can be measured using new technologies: smartphone apps, sensors, and wearables. For example, accelerometers in smartphones and fitness bracelets can objectively measure physical activity, and screen time apps can measure (social) media use. Another possibility is to augment surveys with administrative data or data from digital platforms such as Google, YouTube, and Instagram that participants can provide to researchers through data donation or consent to data linkage. However, to ensure representation, participants have to be willing and able to use their devices to perform such tasks. If participants differ from nonparticipants in key outcomes, research results can be biased. In this talk, I will present the results of several randomized experiments on the mechanisms of willingness and consent to collect data using smartphone sensors, apps, and wearables in general population surveys, and the extent of nonparticipation bias assessed by linkage of survey data to sensor and administrative data. I will further focus on how these mechanisms translate to data donation of digital trace data and on what opportunities and challenges such novel data collection methods hold for the social sciences and official statistics. How to integrate data from different sources (e.g., self-report and digital trace data; traditional diary studies and app-based measurement) in a way that ensures high data quality and leverages the advantages of the novel data collection is not a trivial question.
In this talk, I approach the question of data integration from a perspective of (survey) design and use examples of integrating probability and nonprobability online panels, survey and digital trace data, as well as smart surveys in the context of national statistical institutes.
- Web Scraping: David Broneske (https://www.dzhw.eu/gmbh/mitarbeiter?m_id=889), Saijal Shahania (https://www.dzhw.eu/gmbh/mitarbeiter?m_id=955) - DZHW Hannover
Abstract: Web scraping, a useful tool for data enrichment, offers the advantage of automated data collection from a wide range of online sources. In the digital age, where information is abundant, traditional data collection methods often struggle to keep up. Web scraping steps in, empowering researchers to extract data from websites systematically. This not only overcomes the limitations of traditional methods, but also provides valuable insights into social trends, public opinion, and behavior. Moreover, the real-time nature of data and the need to observe evolving trends pose additional hurdles. Web scraping addresses these challenges by automating data collection, enabling researchers to efficiently and accurately gather vast amounts of information.
In this 3.5-hour workshop, we will cover the following topics:
- Introduction to Web Scraping:
  - Overview of web scraping and its relevance in social science research.
  - Ethical considerations and legal implications of web scraping.
- Static vs. Dynamic Websites:
  - Techniques for scraping static websites.
  - Handling dynamic websites using Selenium.
- Python Tools for Web Scraping:
  - Installing and using key libraries such as BeautifulSoup, Requests, and Selenium.
  - Parsing different types of data (HTML, XML, etc.).
- Use Case: Scraping News Articles:
  - Collecting data from news websites to analyze trends and public opinion.
  - Extracting article titles, publication dates, and content.
- Building a Web Scraping Script:
  - A step-by-step guide to developing a web scraping script.
  - Automating data extraction from online news sources.
- Data Storage and Management:
  - Saving scraped data in various formats (e.g., CSV, JSON).
  - Organizing and managing large datasets for analysis.
- Data Cleaning and Preprocessing:
  - Techniques for cleaning and preprocessing scraped data.
  - Handling missing data and duplicates.
- Practical Exercises:
  - Hands-on exercises to reinforce learning.
  - Applying web scraping techniques to participants' research interests.
  - Best practices for maintaining and updating web scraping scripts.
No prior knowledge of web scraping is required. Participants must bring their laptops to engage in practical exercises and prepare questions about potential applications in their current and future research projects.
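To give a rough idea of the workflow outlined above (extracting titles, dates, and content, removing duplicates, and saving to CSV), here is a minimal sketch. It is not the workshop's script. To stay self-contained and runnable offline, it parses a made-up HTML snippet with Python's standard-library `html.parser` instead of the BeautifulSoup, Requests, and Selenium libraries named in the programme. The markup, class names, and function names are assumptions for illustration only.

```python
import csv
import io
from html.parser import HTMLParser

# Made-up static page; a real script would first download the HTML
# and respect the site's terms of use and robots.txt.
SAMPLE_HTML = """
<html><body>
<article><h2 class="title">Universities expand open access</h2>
<time datetime="2024-05-02">2 May 2024</time>
<p class="content">Several institutions signed new agreements.</p></article>
<article><h2 class="title">Funding round announced</h2>
<time datetime="2024-05-03">3 May 2024</time>
<p class="content">Applications open in June.</p></article>
<article><h2 class="title">Universities expand open access</h2>
<time datetime="2024-05-02">2 May 2024</time>
<p class="content">Several institutions signed new agreements.</p></article>
</body></html>
"""


class ArticleParser(HTMLParser):
    """Collect title, date, and content for every <article> element."""

    def __init__(self):
        super().__init__()
        self.articles = []
        self._current = None  # article currently being read
        self._field = None    # text field currently being filled

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "article":
            self._current = {"title": "", "date": "", "content": ""}
        elif self._current is not None:
            if tag == "h2" and attrs.get("class") == "title":
                self._field = "title"
            elif tag == "p" and attrs.get("class") == "content":
                self._field = "content"
            elif tag == "time":
                # Use the machine-readable date attribute, not the display text.
                self._current["date"] = attrs.get("datetime") or ""

    def handle_endtag(self, tag):
        if tag == "article" and self._current is not None:
            self.articles.append(self._current)
            self._current = None
        elif tag in ("h2", "p"):
            self._field = None

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()


def scrape(html):
    """Parse the page and drop articles with a missing title or a repeated (title, date)."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    seen, unique = set(), []
    for article in parser.articles:
        key = (article["title"], article["date"])
        if article["title"] and key not in seen:
            seen.add(key)
            unique.append(article)
    return unique


def to_csv(rows):
    """Serialize the cleaned rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "date", "content"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


articles = scrape(SAMPLE_HTML)
csv_text = to_csv(articles)
print(csv_text)
```

Dynamic websites that render their content with JavaScript would not work with this approach, because the article elements are not present in the downloaded HTML. That is the case the programme addresses with Selenium.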
Data Analysis Track:
- Natural Language Processing in R: Thomas Hills - University of Warwick (https://warwick.ac.uk/fac/sci/psych/people/thills/)
Abstract: Investigating how people understand conspiracy, risks, immigration, politics, mental health, wellbeing, and many other problems often requires that we look through the lens of language. Methods for collecting and analyzing these data are changing daily. Moreover, many top researchers and businesses use these methods routinely to analyze billions of words of text to produce insights into human behavior. My presentation will outline the four fundamental text analysis methods, illustrating their application and the insights they offer. The workshop session will include a practical, hands-on demonstration using R, with all necessary data and code accessible via GitHub. Participants will gain the skills to immediately apply these methods to their own data. Additionally, I will discuss crucial aspects of data collection and quality, providing guidance on identifying high-quality data and determining the optimal use of these analytical techniques.
- Machine Learning for the Social Sciences: Marco Steenbergen - University of Zurich (https://www.ipz.uzh.ch/en/people/employees/msteen.html)
Abstract: Machine learning and the related field of AI are rapidly changing the nature of research and education around the globe. This 3.5-hour workshop introduces machine learning with the goal of describing its logic and workflow, as well as potential applications. A key aspect is to provide hands-on experience with the implementation of machine learning methods. To make this manageable, we focus on one example: random forests. The detailed program is as follows:
9.00-9.15: Defining machine learning and contrasting it with conventional quantitative methods.
9.15-9.45: Training and test sets; machine learning errors and their prevention.
9.45-10.30: Theory of classification and regression trees and random forests.
10.30-10.45: Break and assistance with setting up R.
10.45-11.45: Hands-on exercise fitting, evaluating, and reporting a random forest.
11.45-12.30: Beyond basic predictive modeling: deep neural networks and AI.
- Measurement Error and Correction: Melanie Revilla - Barcelona Institute of International Studies (https://www.upf.edu/web/webdataopp/melanie-revilla)
Abstract: The expansion of the Internet and the development of a range of new active and passive measurement tools, particularly on mobile devices, present exciting opportunities for researchers. Compared to conventional surveys, using these new measurement opportunities (e.g., visual data) could reduce respondent burden, improve data quality, and extend measurement into new domains, making it possible to answer questions that could not be answered so far and to improve the decisions of key actors, such as governments. Despite these promising prospects, there remains a noticeable dearth of research that has effectively harnessed such possibilities, with even fewer studies evaluating the associated data quality. Efforts in this direction are essential for unlocking the full potential of these novel measurement approaches. This presentation will go through the main potential benefits offered by these data, while also presenting the challenges and risks researchers encounter when using them. Moreover, it will showcase examples of ongoing research aimed at deepening our understanding of these new data types to help researchers leverage these resources effectively, maximizing their utility and minimizing potential pitfalls.
Abstract Keynote: Wednesday 11th, 4:00 - 5:15pm
Keynote: Going beyond conventional web surveys: Opportunities and challenges of using new types of data within the frame of web surveys
The expansion of the Internet and the development of a range of new active and passive measurement tools, particularly on mobile devices, present exciting opportunities for researchers. Compared to conventional surveys, using these new measurement opportunities (e.g., visual data) could reduce respondent burden, improve data quality, and extend measurement into new domains, making it possible to answer questions that could not be answered so far and to improve the decisions of key actors, such as governments. Despite these promising prospects, there remains a noticeable dearth of research that has effectively harnessed such possibilities, with even fewer studies evaluating the associated data quality. Efforts in this direction are essential for unlocking the full potential of these novel measurement approaches. This presentation will go through the main potential benefits offered by these data, while also presenting the challenges and risks researchers encounter when using them. Moreover, it will showcase examples of ongoing research aimed at deepening our understanding of these new data types to help researchers leverage these resources effectively, maximizing their utility and minimizing potential pitfalls.
Panel Discussion:
- Social Media — Blessing or Curse for Higher Education Research and Science Studies
Abstract:
Speakers:
- Julia Lenz - DZHW (https://www.dzhw.eu/en/gmbh/mitarbeiter?m_id=999)
- Christoph Hönnige - Leibniz University Hannover (https://www.ipw.uni-hannover.de/de/institut/personenverzeichnis/christoph-hoennige)
Social media is omnipresent in everyday life. We read and create social media posts, comment on those by others, and consume a massive amount of information. According to Statista, people spend more than two hours a day on social media platforms. As such an influential communication technology, social media platforms are an enormous source of personal information and opinions. Hence, they are an invaluable source for the social sciences and many adjacent research fields, including higher education research and science studies. However, there is a diverse set of challenges that researchers face when working with social media data. For example, the terms and conditions related to data access depend heavily on the respective platform, and they can change on a daily basis, introducing a high level of dependency. Another problem is the lack of transparency on the data sampling strategies employed by many social media platforms, introducing representation concerns (e.g., for what populations or groups can we draw robust conclusions?). This is accompanied by issues related to measurement quality. The operationalization of measurement concepts and the validity of social media data are still up for discussion. In this panel, we therefore invite renowned researchers to share their experience with social media data. Specifically, they showcase pitfalls and best practices of how and when to use social media data for qualitative and quantitative research purposes. We also provide an open floor for discussion and exchange between our panel members and the audience to weigh the merits and limits of social media data.
Project Presentations:
- Stefan Gunzelmann (University of Education Karlsruhe (PH Karlsruhe)): Can Gendered Language in Job Ads Predict Hiring Discrimination? Preliminary Insights into an Automated Text Analysis
- Yasemin Tutar (RTE University): International Student Mobility, Cultural Adaptation, Multicultural Competence, and Career Development
- Kajal (University of Delhi): Exploring Socially Responsible Science Education in Indian Rural Context
- Keanen McKinley (William & Mary): Increasing Survey Response Rates: Framing International Students as Potentially Vulnerable and Hard-to-Reach
- Hanyu Qin (The University of Hong Kong): The Formation and Persistence of Hong Kong's International Research Collaboration in Higher Education: Based on the Panel Data of International Co-authored Papers from 1966-2022
- Tatiana Akuneeva (HSE University): Untangling the red tape at Russian universities: administrative profiles
- Deepak (University of Delhi): Higher Education and Socio Economic Development: Valmiki Community in India
- Rashmi Pal (University of Delhi): Media Pedagogy for Gender Sensitization in Teacher Education Programmes in Indian Context
- Anastasia Byvaltseva-Stankevich (HSE University): Relationship Between Faculty Age Structure and University Publishing Activity
- Amina Irfan (University of Lodz): Exploring Green Marketing Strategies for Sustainable Food Products in Poland: An Analysis of Consumer Behavior and Influencing Factors
- Daniel Remai (Ludovika University of Public Service): The potential of data visualisation in public higher education
- Victor Rudakov (Freie Universität Berlin): Student employment in Europe and subsequent labour market outcomes
- Alberto Márquez Carrasca (Institute of Public Goods and Policies (IPP-CSIC)): The Effects of Decentralization on Higher Education Policy Instruments: Evidence from Spain
- Siqi Sun (The University of Manchester): Using Online Data to Understand Teacher Feedback on Student written Assignments and Student Reactions
- Bettina MJ Kern (TU Wien): Improving Dropout Prediction for Informatics Bachelor Students
- Mei Lai (The University of Hong Kong): Students' attitudes towards post-graduation life and career: Insights from Chinese undergraduates in transnational higher education
- Nirupama Malavalli Prasad (BML Munjal University, Gurugram, India): Assessing the Influence of Decentralization on Policy Instruments for Quality Assurance in Indian Higher Education
- Ewa Zegler-Poleska (University of Warsaw / Indiana University Bloomington): Exploring preprint retractions: A case study of arXiv
- Mazlum Karataş (GESIS Leibniz Institute for the Social Sciences): Reactive vs. Non-Reactive Methods: Comparing Web Survey and Web Scraping in the INSPIRE-Project to Assess Gender Equality Plans in Research Organisations
- Asen V. Dimitrov (Institute of Philosophy and Sociology of the Bulgarian Academy of Sciences): Anxiety in undergraduate students at Bulgarian universities: An exploration through Axel Honneth’s recognition theory
- Xiuli Huang (University of Göttingen): Comparative Insights into Teacher Competencies and Well-being in Higher Education from Germany, Finland, the US, and China
Application Procedure
To participate, send a motivation letter (max. 500 words) including the title of your research topic by 17th of May 2024 to . Acceptance notifications will be sent out by 7th of June 2024. The participation fee for the whole week of the summer school, including course materials and catering, is 300 Euro per person, on a self-funding basis.
We offer a small number of stipends to participants with limited funding opportunities. To apply for a stipend, please submit a letter (max. 1 page) explaining your financial situation. The organizers are strongly committed to creating equal opportunities. In case of equal qualification, preference will be given to underrepresented groups.
We are looking forward to your applications!
Organization and Venue
Location: Conti Hochhaus
Königsworther Platz 1
30167 Hannover
Organizers: Prof. Dr. Jan Karem Höhne and Dr. David Broneske
Leibniz University Hannover and German Center for Higher Education Research and Science Studies (DZHW)
Accommodation
We can recommend the following hotels near the venue. Hotels are to be booked on a self-funding basis.
Design Hotel Wiegand
Email: hannover@hotel-wiegand.de
Lange Laube 20
30159 Hannover
https://wiegand-hotel.de/#/booking/search
City Hotel Hannover
Tel.: +49 (0)511 36070
Limburgstraße 3
30159 Hannover
http://www.cityhotelhannover.de/
Hotel City Panorama
Tel.: +49 (0)511 897 060 15
Münzstr. 5
30159 Hannover
https://www.hannover-city-panorama.de/