H.M. the Queen's New Year's Address Analyzer

Home

You are hereby invited to take a look at the New Year's Eve speeches of Her Majesty the Queen.

This dashboard is a tool for analyzing topics in speeches.

Topics can be hard to document correctly. By using this tool, you can find relevant topics and the words within on the topics tab. From there you can then choose to explore the sentiment, countries, or statistics to get a better understanding of the topic. You could also do your own thing and look at pretty graphs and statistics. We won't judge.

This is an interactive dashboard. You can select tabs in the sidebar on the left to navigate to different sections. When the sidebar is expanded (see the toggle button in the top left corner), you can apply filters and settings to your liking.

GUD BEVARE DANMARK.

Statistics included

Statistics included displays the aggregated statistics for various covered topics.

Total values can give an idea of what is about to be covered.

How to use

Interpretation: Values are summed for different categories.

Year filter: Setting the year filter will filter for the words used in those years.

Featured words: Setting featured words will filter for the words used in that selection.

Topics

Topic model parameters

Options to set your preference for updating featured words from the topic model.

You can more easily inspect a topic in other tabs by quickly obtaining its words.

Notice: "Update on topic selection" will clear previous selections.

Number of terms to display sets the number of terms used in the topic model, by size of topic.

Limiting the number of terms makes the topic model easier to interpret. Adding more gives more information.

How to use

Setting featured words update preference: Click the setting that best fits your needs.

Choose terms to display: Use the slider to select a number by dragging the dot to any number.

Topic model

The topic model is a tool that maps topics. It does so by analyzing word relations across all speeches and mapping the relations as word matrices. The model then displays the topics on the left, and the words within them on the right. Frequency of use is depicted by size for topics, and by bar length for words. Topics are named 1:n, where 1 is the largest topic.

The topic model can be used to get an understanding of the themes present in the speeches throughout the years.

How to use

Topics (left): The sizes of the circles in the visualization represent frequency. To inspect a topic, select it in the selector in the header or hover over its circle. Clicking a topic will keep it highlighted.

Topic content (right body): Hovering over a word will show which topics it is included in. Clicking a word will add an underline, but nothing else.

Relevance metric slider (right header): The slider adjusts relevance for words to be part of a topic. To see it work, first highlight a topic (by clicking or using the top left filter).
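
The dashboard does not state how the relevance value is calculated. As a rough sketch, assuming it follows the definition commonly used in LDAvis-style topic visualizations, relevance blends how probable a word is within a topic with how specific the word is to that topic. The function name and the probabilities below are hypothetical and only for illustration.

```python
import math


def relevance(p_word_in_topic, p_word_overall, lam):
    """Relevance of a word to a topic for a slider value lam in [0, 1].

    Assumed formula (LDAvis-style, not confirmed by the dashboard):
    lam * log(p(word | topic)) + (1 - lam) * log(p(word | topic) / p(word))
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    lift = p_word_in_topic / p_word_overall
    return lam * math.log(p_word_in_topic) + (1.0 - lam) * math.log(lift)


# Hypothetical probabilities: a common word and a rare but topic-specific word.
common = {"in_topic": 0.05, "overall": 0.04}
specific = {"in_topic": 0.01, "overall": 0.001}

for lam in (1.0, 0.0):
    r_common = relevance(common["in_topic"], common["overall"], lam)
    r_specific = relevance(specific["in_topic"], specific["overall"], lam)
    print(lam, r_common > r_specific)
```

Under this assumption, a slider value near 1 favors words that are frequent within the topic, while a value near 0 favors words that are mostly found in that topic alone.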

Sentence options

A slider to set the maximum number of sample sentences.

To adjust how many sentence samples you want to see.

How to use

Drag the circle on the slider to any number.

Samples of sentences belonging to the chosen topic.

To see examples of sentences in the chosen topic.

How to use

Click on a topic in the topic map above.

This boxplot shows the average sentiment of sentences belonging to a specific topic, compared to the rest of the topics.

To compare a specific topic's sentiment to the rest of the topics.

How to use

Click on a topic in the topic map above to make it appear.

Hover over a box in the boxplot to see descriptive statistics.
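
If you are curious what the descriptive statistics behind such a boxplot could look like, here is a minimal sketch. It assumes each sentence comes as a (topic number, sentiment) pair, which is a made-up data layout for illustration; the function names are hypothetical and the dashboard's own calculation may differ.

```python
from statistics import mean, median, quantiles


def box_stats(values):
    """Descriptive statistics typically shown for one box in a boxplot."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median(values),
        "q3": q3,
        "max": max(values),
        "mean": mean(values),
    }


def topic_vs_rest(sentences, topic):
    """Split (topic, sentiment) pairs into the chosen topic and all others."""
    chosen = [s for t, s in sentences if t == topic]
    rest = [s for t, s in sentences if t != topic]
    return box_stats(chosen), box_stats(rest)


# Hypothetical sentence sentiments, for illustration only.
sentences = [(1, 1.0), (1, 2.0), (1, 3.0), (2, -1.0), (2, 0.0), (3, 1.0)]
chosen, rest = topic_vs_rest(sentences, topic=1)
print(chosen["median"], rest["median"])
```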

Sentiment

Sentiment of words

Sentiment of words displays words by sentiment.

Sentiment of words can help you identify what words influenced the sentiment.

How to use

Interpretation: the most positive words have a polarity of 3. The most negative words have a polarity of -3.

Tooltip: hovering a word will display the polarity of the word, and the sentiment category.

Series filter: Click on a series name to disable it. Click again or set a new filter that affects the visualization to enable it again.

Year filter: Setting the year filter will filter for the words used in those years.

Featured words: Featured words that have sentiment are included before other words (otherwise words are included by frequency). If at least one featured word with sentiment is present, the series are named after both whether they are positive/negative and whether they are featured or not.

Number of words: Number of words filters words included, from most to least frequent.

Notice: Words with a polarity of 0 are filtered out.
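
The filters above can be sketched as follows: drop words with a polarity of 0, put featured words first, and fill the remaining places by frequency. This is an illustration under assumptions, not the dashboard's actual code; the record layout (`word`, `polarity`, `count`) and the function name are hypothetical.

```python
def select_words(words, featured=(), limit=10):
    """Pick the words to display.

    words: list of dicts with the keys 'word', 'polarity' and 'count'.
    """
    # Words with a polarity of 0 carry no sentiment and are filtered out.
    with_sentiment = [w for w in words if w["polarity"] != 0]
    featured_set = set(featured)
    # Featured words sort first (False < True), then by descending frequency.
    ranked = sorted(
        with_sentiment,
        key=lambda w: (w["word"] not in featured_set, -w["count"]),
    )
    return [w["word"] for w in ranked[:limit]]


# Hypothetical words, for illustration only.
words = [
    {"word": "happy", "polarity": 3, "count": 5},
    {"word": "war", "polarity": -3, "count": 2},
    {"word": "year", "polarity": 0, "count": 50},
    {"word": "thanks", "polarity": 2, "count": 9},
]
print(select_words(words, limit=2))
print(select_words(words, featured=["war"], limit=2))
```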

Frequency used (n uses in total)

Frequency used displays words by frequency.

Frequency used can help you identify what words influenced the sentiment.

How to use

Interpretation: Larger bars have higher frequency.

Series filter: Click on a series name to disable it. Click again or set a new filter that affects the visualization to enable it again.

Year filter: Setting the year filter will filter for the words used in those years.

Featured words: Featured words that have sentiment are included before other words (otherwise words are included by frequency). If at least one featured word with sentiment is present, the series are named after both whether they are positive/negative and whether they are featured or not.

Number of words: Number of words filters words included, from most to least frequent.

Notice: Words with a polarity of 0 are filtered out.

Number of words

Number of words is a slider to set the number of featured words.

Limiting the number of featured words will reduce clutter and make Sentiment of words and Frequency used easier to interpret.

How to use

Set a number by using the slider.

Countries

A world map, with countries mentioned in speeches.

To know where countries are, and how often they are mentioned.

How to use

Interpretation: Highlighted countries are mentioned. Darker colors represent more mentions.

Tooltip: Hovering or clicking brings up a tooltip showing the total number of mentions and the number of mentions in each relevant year.

Year filter: Setting the year filter will filter for the countries mentioned in those years.

Countries mentioned

A bar chart showing the number of times a country is mentioned in each year.

A boxplot showing the average sentiment of sentences in which a country is mentioned.

To compare the mentions of a specific country to mentions of countries in general.

How to use

Tooltip bar chart: Hovering over a bar shows how many times a specific country is mentioned in that year, compared to mentions of countries in general.

Tooltip boxplot: Hovering over a box shows distribution statistics for the average sentiment of sentences in which a specific country is mentioned, compared to the distribution of the average sentiment of sentences in which any country is mentioned.
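
The bar chart comparison can be illustrated with a short sketch that counts, per year, the mentions of one country next to the mentions of all countries. The input layout (one `(year, country)` pair per mention) and the function name are assumptions made for this example.

```python
from collections import Counter


def mentions_per_year(mentions, country):
    """Return {year: (mentions of country, mentions of any country)}.

    mentions: list of (year, country) pairs, one pair per mention.
    """
    all_by_year = Counter(year for year, _ in mentions)
    country_by_year = Counter(year for year, c in mentions if c == country)
    return {
        year: (country_by_year[year], total)
        for year, total in sorted(all_by_year.items())
    }


# Hypothetical mentions, for illustration only.
mentions = [
    (2001, "Greenland"),
    (2001, "Greenland"),
    (2001, "Norway"),
    (2002, "Norway"),
]
print(mentions_per_year(mentions, "Greenland"))
```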

Samples of mentions.

To see examples of mentions.

How to use

Click on a country in the map above.

Sentence options

A slider to set the maximum number of sample sentences.

To adjust how many sentence samples you want to see.

How to use

Drag the circle on the slider to any number.

Word statistics

Speech statistics sliders

Top frequent numbers is a slider that filters words by frequency in the word-themed visualizations.

Many words can be mentioned, and most of these visualizations become too cluttered to interpret when too many are shown.

How to use

Use the slider by dragging the circle to any number.

Tip: Don't choose too high a number at first. Start low. Models can become too cluttered to interpret, and visualizations will take a long time to load.

Data source and handling

Data preparation

Stopwords are filtered out. This is done to prevent the most common words (like "the") from dominating the statistics. The filter uses stopwords defined by Bertel Torp, stopwords defined by Snowball, and custom stopwords found while interacting with the data.

Words have been stemmed to get better data for topics. This does remove information about word forms. Stemming is done with SnowballC for R.

Words have been lemmatized (replacing words with identical meaning with a headword) to improve topic analysis, using udpipe for R.
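
To illustrate why stopword filtering matters, here is a minimal sketch in Python (the dashboard itself uses R packages, so this is not its actual code). The stopword list below is a tiny hypothetical stand-in for the real lists, and stemming and lemmatization are left out.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword list; the real lists are far longer.
STOPWORDS = {"the", "and", "of", "a", "to", "in"}


def count_content_words(text, stopwords=STOPWORDS):
    """Lowercase the text, split it into words, and count non-stopwords."""
    # [^\W\d_]+ matches runs of letters, including letters like æ, ø and å.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


counts = count_content_words(
    "The people of Denmark and the people of Greenland"
)
print(counts.most_common(3))
```

Without the filter, "the" and "of" would top the list instead of "people".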

Topics

Topics have been derived with a Structural Topic Model (STM) that is characterized by estimating topic distributions using a covariate matrix as well as a document-term matrix.

The covariates used for estimating topic distributions in the Queen's New Year's addresses are year of speech and average sentiment at sentence level.

Sentiment analysis

For Danish: Sentiment analysis is done by getting sentiment values from "Det Danske Sprog- og Litteraturselskab (DSL, Society for Danish Language and Literature) and Center for Sprogteknologi, Københavns Universitet (CST, Centre for Language Technology, University of Copenhagen)", and adding the values to the words we collected at word, stem, and lemma level.

For English: Sentiment analysis is done by getting sentiment values from the AFINN-111 sentiment dataset and adding them to the words we collected at word, stem, and lemma level.
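
As a rough sketch of how sentiment values could be attached at word, stem, and lemma level, the example below looks up each form in turn and averages the scored words in a sentence. The lookup order, the averaging rule, the function names, and the tiny lexicon are all assumptions for illustration; the text above does not specify them.

```python
def word_polarity(word, stem, lemma, lexicon):
    """Return the polarity of the first form found in the lexicon, else 0."""
    for form in (word, stem, lemma):
        if form in lexicon:
            return lexicon[form]
    return 0


def sentence_sentiment(tokens, lexicon):
    """Average polarity over the words in a sentence that have sentiment.

    tokens: list of (word, stem, lemma) triples.
    """
    scores = [word_polarity(w, s, l, lexicon) for w, s, l in tokens]
    scored = [x for x in scores if x != 0]
    return sum(scored) / len(scored) if scored else 0.0


# Hypothetical lexicon and tokens, for illustration only.
lexicon = {"happy": 3, "worri": -2}
tokens = [
    ("happiest", "happiest", "happy"),  # matched at lemma level
    ("worries", "worri", "worry"),      # matched at stem level
    ("year", "year", "year"),           # no sentiment
]
print(sentence_sentiment(tokens, lexicon))
```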

Speeches covered

These are the speeches covered in the dashboard with your current filter.

You can use this to see the source text behind what you have filtered.

How to use

You can use language and year filters. To see all text in the selected language, select all possible years.

Notice: The English versions of the speeches are translations. Not all Danish speeches have been translated, so some years are not available.

This is the info box from Wikipedia about the queen.

You can use this to familiarize yourself with the subject, or just for a quick read-up.

How to use

You can read the table, or visit the source, above the table.

How to operate dashboard

How to interpret visualizations and information

If you wish to have some guidance in your data journey, we have prepared assisting labels for each visualization.

These labels will be located directly beneath the point of interest.


Click the icon in the top right corner to hide/expand a section.

What marking: A blue dot with bold dark-blue text indicates what kind of visualization/information is depicted.

Why marking: A yellow circle with dark-yellow bold text indicates why and when you could use the visualization/information you observe.

How to use

How marking: A light-brown triangle with brown bold text indicates how you can interpret information in the visualization.

Grey italic/slanted text indicates other kinds of help text. This could be descriptions of how to operate inputs, or information about limitations and bugs.

Black text is other kinds of information. It would usually be practical information.

Filters

This dashboard utilizes filtering systems. The main filters are in the sidebar when expanded. Some visualizations have custom filters. They are documented when needed.

The filters can be used to limit the data you see. You can do this to inspect a certain subset of the data, e.g. to look at the data relevant to you.

To expand the sidebar, press the expand button in the top left corner.

Language filter

The language filter sets which language/translation of the speeches you inspect, and which language the Wikipedia info box is displayed in.

For this dashboard, Danish would be optimal. It is the original language of the speeches. If you don't understand Danish, you can still use a limited set of speeches in English.

Push the text or dot next to the language you want to use.

Year filters

Year filter refers to the group of filters affecting year. These are: the year input, the year slider and the year selector.

Filtering years can help you limit the data relevant to the topic you are researching.

Year input: Push the text or dot next to the year input method you need or prefer.

Year range: Select the start year and end year by dragging and dropping the round dots.

Year selector - Remove year(s): Click on a year and press the Backspace or Delete key to remove it. Hold down Ctrl to select multiple years. Alternatively, select the input field (e.g. by clicking or using the Tab key), navigate the cursor with the arrow keys, and use the Backspace or Delete key to remove years.

Year selector - Add year: Select the input field (e.g. by clicking or using the Tab key) and type a year. Press Enter to add it, or select a suggestion by clicking on it.

Notice: Some visualizations will use the whole dataset as a reference regardless of your set filter.

Featured words

Featured words is a selection of words from the speeches that is used to filter or feature words in different ways.

When you are inspecting a topic, some words might seem important. You can then filter for these words using this filter.

Select word: Type a word and press Enter when done, or click a suggestion in the pop-up.

Remove word(s): Click on a word and press the Backspace or Delete key to remove it. Hold down Ctrl to select multiple words. Alternatively, select the input field (e.g. by clicking or using the Tab key), navigate the cursor with the arrow keys, and use the Backspace or Delete key to remove words.

Remove words by filter: Click the button labeled "Clear words".

Regret removal by clear words: Click the button labeled "Regret clear". It only works for the most recent clearing.
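
The "Clear words" and "Regret clear" behavior can be pictured as a one-step undo. The sketch below is only a model of what is described above; the class and method names are hypothetical and do not come from the dashboard.

```python
class FeaturedWords:
    """A word selection with a single-level undo for clearing."""

    def __init__(self):
        self.words = []
        self._last_cleared = None  # only the most recent clearing is kept

    def add(self, word):
        if word not in self.words:
            self.words.append(word)

    def clear(self):
        # Remember what was removed, overwriting any earlier clearing.
        self._last_cleared = self.words
        self.words = []

    def regret_clear(self):
        # Restore the last clearing; does nothing if there is none.
        if self._last_cleared is not None:
            self.words = self._last_cleared
            self._last_cleared = None


selection = FeaturedWords()
selection.add("peace")
selection.add("family")
selection.clear()
selection.regret_clear()
print(selection.words)
```

Because only one earlier selection is stored, clearing twice in a row means the first selection can no longer be restored.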