Department of Statistics

Open Theses

This page lists open thesis topics supervised by the Chair of Economic and Social Statistics.

Modelling real-valued (univariate or multivariate) time series with autoregressive (AR) models is a classical approach that has been used for decades. AR models remain popular in practice because they are flexible, easy to interpret, and their coefficients can be estimated easily and in closed form. For serially dependent integer-valued data, however, autoregressive modelling is not straightforward, because classical AR models do not respect the integer-valued range. For non-negative integer-valued data, i.e. count data, so-called integer-valued AR (INAR) models are therefore often used; they retain the autoregressive structure but respect the integer-valued range by incorporating a thinning operation. These models are still easy to estimate and interpret, but their parameter range is more restrictive than that of the AR model. Whereas the literature on each of the two model classes is vast, a unified approach for jointly modelling real-valued and integer-valued time series in an autoregressive way is lacking.

The goal of this thesis is to investigate how real-valued and integer-valued time series can be jointly modelled in an autoregressive way such that

  1. The corresponding ranges, i.e. the real line and the integers, respectively, are respected by the modelling approach.
  2. The famous Yule-Walker equations hold, which leads to simple and explicit estimators for the autoregressive coefficients.
  3. The range of the AR model parameters is not restricted and ideally coincides with that of a classical vector autoregressive model.
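As a point of reference for the count-data side, the following is a minimal sketch of an INAR(1) process with binomial thinning, together with the lag-1 Yule-Walker estimator. The Poisson innovations, the parameter values and the function names `simulate_inar1` and `yule_walker_ar1` are assumptions made for illustration; they are not taken from the cited literature. The sketch also shows the restriction mentioned above: binomial thinning needs a coefficient in [0, 1).

```python
import numpy as np

def simulate_inar1(alpha, lam, n, rng, burn_in=200):
    """Simulate X_t = (alpha thinned X_{t-1}) + eps_t with Poisson(lam) innovations.

    Binomial thinning keeps each of the X_{t-1} counts independently with
    probability alpha, so alpha must lie in [0, 1).
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("binomial thinning requires 0 <= alpha < 1")
    x = np.empty(n + burn_in, dtype=np.int64)
    x[0] = rng.poisson(lam / (1.0 - alpha))  # draw from the stationary mean level
    for t in range(1, n + burn_in):
        survivors = rng.binomial(x[t - 1], alpha)  # thinning step
        x[t] = survivors + rng.poisson(lam)        # add new arrivals
    return x[burn_in:]

def yule_walker_ar1(x):
    """Lag-1 sample autocorrelation, the Yule-Walker estimator of the coefficient."""
    xc = x - x.mean()
    return float(np.dot(xc[1:], xc[:-1]) / np.dot(xc, xc))

rng = np.random.default_rng(0)
counts = simulate_inar1(alpha=0.5, lam=2.0, n=5000, rng=rng)
alpha_hat = yule_walker_ar1(counts)
# The stationary mean is lam / (1 - alpha), which gives a moment estimator of lam.
lam_hat = counts.mean() * (1.0 - alpha_hat)
print(f"alpha_hat={alpha_hat:.3f}, lam_hat={lam_hat:.3f}")
```

The same estimator applies unchanged to a real-valued AR(1) series, which is what makes the Yule-Walker equations a natural common ground for a joint model.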

Literature:

  • Brockwell, P.J. and Davis, R.A. (1991). Time Series: Theory and Methods. Springer, New York.
  • Jentsch, C. and Weiß, C.H. (2019). Bootstrapping INAR models. Bernoulli 25(3), 2359-2408.
  • Kim, H.Y. and Park, Y. (2008). A non-stationary integer-valued autoregressive model. Statistical Papers 49(3), 485-502.
  • Weiß, C.H. (2018). An Introduction to Discrete-Valued Time Series, 1st edn. Wiley.

If you are interested in writing this thesis, please contact Prof. Dr. Carsten Jentsch.

The goal of this thesis is to transfer the idea of randomization-based inference for testing the equality of spectral densities of (multivariate) time series in Jentsch and Pauly (2015, Bernoulli) to random fields, that is, processes indexed not only by the integers Z but, more generally, by Z^d. Such processes make it possible to model spatial data in the two-dimensional plane Z^2 or in three-dimensional space Z^3, or spatial data on Z^2 observed over time Z. In such scenarios, the covariance function becomes more complex and depends on more parameters, which are difficult to estimate in practice. To tackle this, simplifying assumptions on the covariance function, such as symmetry or separability, are often imposed. In this thesis, a frequency-domain test statistic based on non-parametric spectral density estimators is proposed. Following Jentsch and Pauly (2015, Bernoulli), the null distribution of the test is estimated by a randomization approach, whose main advantage over, e.g., commonly applied bootstrap methods is that it requires no tuning parameters beyond the bandwidth.

The asymptotic distribution of the test statistic under the corresponding null hypothesis has to be derived, and it has to be checked whether the randomization approach yields the correct limiting distribution. The finite-sample performance is to be investigated using Monte Carlo simulations.
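To make the randomization idea concrete, here is a heavily simplified sketch for the ordinary time series case (d = 1) with two independent univariate series. It is only loosely modelled on the approach described above and is not the procedure of Jentsch and Pauly (2015): the L2-type statistic, the moving-average smoother, the random swapping of periodogram ordinates between the two series at each frequency, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def periodogram(x):
    """Periodogram at the Fourier frequencies 2*pi*k/n, k = 1, ..., floor((n-1)/2)."""
    n = len(x)
    dft = np.fft.fft(x - np.mean(x))
    k = np.arange(1, (n - 1) // 2 + 1)
    return np.abs(dft[k]) ** 2 / (2.0 * np.pi * n)

def smooth(pgram, m):
    """Moving-average smoother with half-width m; m plays the role of the bandwidth."""
    kernel = np.ones(2 * m + 1) / (2 * m + 1)
    padded = np.pad(pgram, m, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")

def l2_statistic(pg_x, pg_y, m):
    """Mean squared difference of the two smoothed periodograms."""
    return float(np.mean((smooth(pg_x, m) - smooth(pg_y, m)) ** 2))

def randomization_test(x, y, m=5, n_rand=499, seed=0):
    """Return the observed statistic and a randomization p-value."""
    rng = np.random.default_rng(seed)
    pg_x, pg_y = periodogram(x), periodogram(y)
    observed = l2_statistic(pg_x, pg_y, m)
    exceed = 0
    for _ in range(n_rand):
        # Under equal spectral densities the two ordinates at a given frequency
        # are treated as exchangeable, so swap them with probability 1/2.
        swap = rng.random(pg_x.size) < 0.5
        rand_x = np.where(swap, pg_y, pg_x)
        rand_y = np.where(swap, pg_x, pg_y)
        exceed += l2_statistic(rand_x, rand_y, m) >= observed
    return observed, (1 + exceed) / (n_rand + 1)

# Toy Monte Carlo draw: white noise against an AR(1) series with coefficient 0.8.
rng = np.random.default_rng(42)
n = 1024
white = rng.normal(size=n)
innov = rng.normal(size=n)
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = 0.8 * ar1[t - 1] + innov[t]
stat, p_value = randomization_test(white, ar1)
print(f"statistic={stat:.4f}, p-value={p_value:.3f}")
```

Apart from the bandwidth `m`, the only further input is the number of randomizations. For random fields, the one-dimensional FFT and smoother would have to be replaced by their d-dimensional counterparts, which is part of what the thesis is meant to work out.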

If you are interested in writing this thesis, please contact Prof. Dr. Carsten Jentsch.

Nowcasting and forecasting are important components in answering econometric questions. At DoCMA (Dortmund Center for Data-based Media Analysis), two indices, the UPI (uncertainty perception indicator) and the IPI (inflation perception indicator), were developed to measure the presence of reporting on uncertainty and inflation in German newspapers, partitioned by topic.

The aim of this thesis is to assess to what extent the (sub-)indices improve the predictive power of established econometric models.
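One common way to frame such an assessment is an out-of-sample comparison of a baseline model with and without the indicator as an additional predictor. The sketch below uses purely synthetic data: the variable `indicator` merely stands in for a perception (sub-)index, and the AR(1) baseline, the expanding estimation window and the coefficient values are assumptions for illustration, not the models or data used at DoCMA.

```python
import numpy as np

def expanding_window_rmse(y, X, start):
    """One-step-ahead RMSE: refit OLS on observations before t, then predict y[t]."""
    errors = []
    for t in range(start, len(y)):
        beta, *_ = np.linalg.lstsq(X[:t], y[:t], rcond=None)
        errors.append(y[t] - X[t] @ beta)
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic data in which the lagged indicator truly helps to predict the target.
rng = np.random.default_rng(1)
n = 300
indicator = rng.normal(size=n)  # hypothetical stand-in for a (sub-)index
target = np.zeros(n)
for t in range(1, n):
    target[t] = 0.5 * target[t - 1] + 0.8 * indicator[t - 1] + rng.normal()

# Row i of each design matrix holds information available before y[i] is observed.
y = target[1:]
const = np.ones(n - 1)
X_base = np.column_stack([const, target[:-1]])                  # AR(1) baseline
X_aug = np.column_stack([const, target[:-1], indicator[:-1]])   # plus lagged indicator

rmse_base = expanding_window_rmse(y, X_base, start=100)
rmse_aug = expanding_window_rmse(y, X_aug, start=100)
print(f"baseline RMSE={rmse_base:.3f}, augmented RMSE={rmse_aug:.3f}")
```

With real data, the gain would not be known in advance, and the difference in forecast errors would typically be checked with a formal test of equal predictive accuracy rather than by comparing two numbers.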

Literature (as well as references in the following papers):

  • Rieger, J., Hornig, N., Schmidt, T. and Müller, H. (2023). Early Warning Systems? Building Time Consistent Perception Indicators for Economic Uncertainty and Inflation Using Efficient Dynamic Modeling. Accepted for MUFin'23. Link. GitHub.
  • Müller, H., Rieger, J., Schmidt, T. and Hornig, N. (2022). An Increasing Sense of Urgency: The Inflation Perception Indicator (IPI) to 30 June 2022 - a Research Note. DoCMA Working Paper #12. DOI. GitHub.
    Previous editions: Pressure is high (04/30/2022), A German Inflation Narrative (02/28/2022).
  • Müller, H., Rieger, J. and Hornig, N. (2022). Vladimir vs. the Virus - a Tale of two Shocks. An Update on our Uncertainty Perception Indicator (UPI) to April 2022 - a Research Note. DoCMA Working Paper #11. DOI. GitHub.
    Previous editions: "Riders on the Storm" (Q1 2021), "We’re rolling" (Q4 2020), "For the times they are a-changin'" (Q3 2020).
  • Shrub, Y., Rieger, J., Müller, H. and Jentsch, C. (2022). Text data rule - don't they? A study on the (additional) information of Handelsblatt data for nowcasting German GDP in comparison to established economic indicators. Ruhr Economic Papers #964. Link.

Additional information on the indicators:

If you are interested in writing this thesis, please contact Jonas Rieger.

Latent Dirichlet allocation (LDA) is still a widely used topic model for exploring textual data in application-oriented research areas. In our work on RollingLDA, we describe an extension of the model that is updated at discrete time steps. For this, three parameters have to be chosen: the initialization period, the size of the update interval, and the size of the corresponding memory for each update interval.
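How the three parameters carve up a corpus over time can be sketched as follows. This is only an illustration of the windowing logic and not the API of the RollingLDA implementation: the function `rolling_windows`, the day-based units and the dictionary layout are hypothetical, and the actual software may define the windows differently (for example in calendar units such as quarters).

```python
from datetime import date, timedelta

def rolling_windows(first_day, last_day, init_days, update_days, memory_days):
    """List the initialization window and, per update, its memory and update windows.

    All windows are half-open date intervals [start, end).
    """
    init_end = first_day + timedelta(days=init_days)
    schedule = [{"step": 0, "fit_on": (first_day, init_end), "memory": None}]
    corpus_end = last_day + timedelta(days=1)
    start, step = init_end, 1
    while start < corpus_end:
        end = min(start + timedelta(days=update_days), corpus_end)
        # The memory reaches back from the start of the update interval,
        # but never before the first document date.
        memory_start = max(first_day, start - timedelta(days=memory_days))
        schedule.append(
            {"step": step, "fit_on": (start, end), "memory": (memory_start, start)}
        )
        start, step = end, step + 1
    return schedule

schedule = rolling_windows(
    first_day=date(2020, 1, 1),
    last_day=date(2020, 1, 31),
    init_days=10,
    update_days=7,
    memory_days=14,
)
for entry in schedule:
    print(entry["step"], entry["fit_on"], entry["memory"])
```

Varying `init_days`, `update_days` and `memory_days` in such a schedule changes how many documents each update sees and how far back the model looks, which is the trade-off the thesis is meant to study.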

The goal of this thesis is to investigate the effect of the three parameters on the resulting model and to give suggestions for parameter choices in specific settings, as well as to propose adjustments for model estimation, if useful.

Literature:

  • Rieger, J., Jentsch, C. and Rahnenführer, J. (2021). RollingLDA: An Update Algorithm of Latent Dirichlet Allocation to Construct Consistent Time Series from Textual Data. Findings of the Association for Computational Linguistics: EMNLP 2021, 2337-2347. DOI. GitHub.

Example applications of RollingLDA:

  • Rieger, J., Hornig, N., Schmidt, T. and Müller, H. (2023). Early Warning Systems? Building Time Consistent Perception Indicators for Economic Uncertainty and Inflation Using Efficient Dynamic Modeling. Accepted for MUFin'23. Link. GitHub.
  • Bittermann, A. and Rieger, J. (2022). Finding scientific topics in continuously growing text corpora. Proceedings of the 3rd Workshop on Scholarly Document Processing. Link. GitHub. PsychTopics App.
  • Lange, K.-R., Rieger, J., Benner, N. and Jentsch, C. (2022). Zeitenwenden: Detecting changes in the German political discourse. Proceedings of the 2nd Workshop on Computational Linguistics for Political Text Analysis. pdf. GitHub.
  • Rieger, J., Lange, K.-R., Flossdorf, J. and Jentsch, C. (2022). Dynamic change detection in topics based on rolling LDAs. Proceedings of the Text2Story'22 Workshop. CEUR-WS 3117, 5-13. pdf. GitHub.

If you are interested in writing this thesis, please contact Jonas Rieger.
