Sonntagsfrage

The “Sonntagsfrage” (“Sunday question”) is a German survey in which people answer one crucial question:

If you had to elect the German government this Sunday, whom would you choose?

https://www.infratest-dimap.de/umfragen-analysen/bundesweit/sonntagsfrage/
(translated freely)

It is conducted by the infratest dimap institute, and generally around 1000 to 1500 people take part. The results are often used as a barometer of the political situation in the country and are regularly presented in one of Germany's main news programmes, the “Tagesschau”.

In this post, the historical results of the Sonntagsfrage are used as input for machine learning models that produce a forecast for the upcoming Sunday. The result is an AI that predicts the answers to the next Sonntagsfrage and thus gives an indication of the current political climate in Germany.

The predictions for next Sunday are shown in the following dashboard. They can be viewed per model, or as the average over all models.

Check out my Medium posts, where I explain how I used Google and Azure cloud services to set this whole thing up.

Solution architecture

To implement this project, including automation and data consumption, I combined services from the Azure cloud with Google Sheets and Tableau Public. The architecture is shown in the image below.

Historical data is pulled from
https://www.wahlrecht.de/umfragen/dimap.htm
via a web crawler written in Python. For automation, the Microsoft cloud platform Azure was chosen: crawling and cleaning are each implemented as Azure Functions, while orchestration is handled by Azure Durable Functions. An Azure SQL database serves as the single source of truth.
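As a rough illustration of what the cleaning step might involve, here is a minimal sketch. It assumes that the crawler hands over each table row as a dictionary of raw strings, with German-style dates such as "12.09.2019" and shares such as "24,5 %". The function names, the row layout and the cell formats are my assumptions for this example, not the code that runs in the Azure Function.

```python
from datetime import date, datetime
from typing import Optional


def parse_share(cell: str) -> Optional[float]:
    """Convert a cell such as '24,5 %' to 24.5; return None for placeholders."""
    text = cell.replace("%", "").replace(",", ".").strip()
    try:
        return float(text)
    except ValueError:
        # Placeholder cells (for example a dash) carry no value.
        return None


def parse_survey_date(cell: str) -> date:
    """Convert a German-style date such as '12.09.2019' to a date object."""
    return datetime.strptime(cell.strip(), "%d.%m.%Y").date()


def clean_row(raw: dict) -> dict:
    """Clean one crawled row: parse the date, convert all other columns to floats."""
    cleaned = {"date": parse_survey_date(raw["date"])}
    for party, cell in raw.items():
        if party != "date":
            cleaned[party] = parse_share(cell)
    return cleaned
```

Keeping the parsing in small pure functions like these makes the cleaning step easy to test locally before it is deployed as a function in the cloud.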

For more information on this topic, check out my Medium articles about Azure Functions and Durable Functions. The corresponding code can be found on my GitHub account.

Model training and prediction are performed in Python using the Azure Machine Learning service. Pipelines and compute clusters make this an automated and reproducible data science product.

Data consumption happens via dashboards on Tableau Public, with Google Sheets as the technical backbone.

Model evaluation

The models I use for machine learning are as follows:

  • DecisionTreeRegressor (sklearn)
  • SGDRegressor (sklearn)
  • GradientBoostingRegressor (sklearn)
  • XGBoost Regressor

As input, only temporal features are used right now. The cyclical passing of the year is encoded in radial coordinates (a sine and a cosine component), and features such as “number of days since the last survey” are added.
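The two kinds of temporal features described above can be sketched as follows. This is a minimal illustration using only the standard library; the function names and the choice of the day of the year as the cyclical quantity are assumptions for this example and may differ from the features in the actual pipeline.

```python
import calendar
import math
from datetime import date


def cyclical_day_of_year(day: date) -> tuple:
    """Map the position within the year onto the unit circle as (sin, cos)."""
    days_in_year = 366 if calendar.isleap(day.year) else 365
    position = (day.timetuple().tm_yday - 1) / days_in_year
    angle = 2 * math.pi * position
    return math.sin(angle), math.cos(angle)


def days_since_last_survey(survey_dates: list) -> list:
    """Return the gap in days to the previous survey (None for the first one)."""
    ordered = sorted(survey_dates)
    gaps = [None]
    for previous, current in zip(ordered, ordered[1:]):
        gaps.append((current - previous).days)
    return gaps


def build_features(survey_dates: list) -> list:
    """Combine both feature types into one dictionary per survey date."""
    ordered = sorted(survey_dates)
    gaps = days_since_last_survey(ordered)
    rows = []
    for day, gap in zip(ordered, gaps):
        sin_doy, cos_doy = cyclical_day_of_year(day)
        rows.append({"date": day, "sin_doy": sin_doy, "cos_doy": cos_doy,
                     "days_since_last": gap})
    return rows
```

The point of the sine and cosine pair is that 31 December and 1 January end up close together in feature space, whereas a plain day-of-year number would put them at opposite ends of the scale.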

To compare the models, I use the following common performance metrics, each computed over the most recent 12 weeks:

  • MAE
  • MSE
  • RMSE
  • r2
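For reference, the four metrics can be computed with a few lines of plain Python. The sketch below assumes one survey value per week, so that "the most recent 12 weeks" corresponds to the last 12 entries of each list; the function name and the `window` parameter are hypothetical and not taken from the project code.

```python
import math


def regression_metrics(actual, predicted, window=12):
    """Compute MAE, MSE, RMSE and R² over the last `window` observations."""
    actual, predicted = list(actual), list(predicted)
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")

    # Keep only the most recent observations.
    y_true = actual[-window:]
    y_pred = predicted[-window:]
    n = len(y_true)

    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # R² compares the model against always predicting the mean of the actuals.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "r2": r2}
```

Note that r2 becomes negative whenever a model does worse than simply predicting the mean of the observed values, which makes it a handy sanity check for weak models.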

The following dashboard shows these metrics per party and model as a heat map.

As can be seen, all models perform similarly poorly. The likely reasons are the lack of seasonal time series modelling (such as ARIMA or Prophet) and the lack of useful features.

Data consumption

Gathering data and feeding an algorithm is only two thirds of the work. To have an impact, the results of all this backend magic have to be consumed by people. Hence the last step of this project: a visualisation that is easy to read and easy to access. This is the part of a data science project that is actually visible to other people, and the part by which its quality is often judged.

The historical answers to the Sonntagsfrage and the prediction for the next survey are both visualised with Tableau Public. Google Sheets is used as the data source for the dashboard. New calculations are automatically uploaded to a Google Sheets document, and the dashboard refreshes its data each time the document is updated. Check out my article on Medium for more information on that topic.

The dashboard with the current predictions can be found at the top of this article, and a dashboard with the historical values at the bottom.

Conclusion

The Sonntagsfrage is now predicted weekly, and the prediction is visualised continuously without any further manual input: we get a glimpse into the possible future political climate of Germany. Look out for my posts on Medium, where I go into detail about the essential steps. Also follow me on Twitter and take a look at the well-documented Git repository of this Sonntagsfrage project.

Now enjoy the dashboards!
