H.M. the Queen's New Year's Address Analyzer

Home

You are hereby invited to take a look at the New Year's Eve speeches of Her Royal Majesty.

This is a dashboard. The dashboard is a tool to analyze topics in speeches.

Topics can be hard to document correctly. By using this tool, you can find relevant topics and the words within on the topics tab. From there you can then choose to explore the sentiment, countries, or statistics to get a better understanding of the topic. You could also do your own thing and look at pretty graphs and statistics. We won't judge.

This is an interactive dashboard. Select tabs in the sidebar on the left to navigate to different sections. When the sidebar is expanded (see the toggle button in the top left corner), you can apply filters and settings to your liking.

GUD BEVARE DANMARK (God preserve Denmark).

Statistics included

● Statistics included displays the aggregated statistics for various covered topics.

■ Total values can give an idea of what is about to be covered.

How to use

▲ Interpretation: Values are summed for different categories.

▲ Year filter: Setting the year filter will filter for the words used in those years.

▲ Featured words: Setting featured words will filter for the words used in that selection.

Topics

Topic model parameters

● Options to set your preference for updating featured words from the topic model.

■ You can more easily inspect topics in other tabs by quickly obtaining their values.

Update featured words with topics content

Update on topic selection

Update on word selection

Do not update

Notice: "Update on topic selection" will clear previous selections.

● Number of terms to display sets the number of terms used in the topic model, by size of topic.

■ Limiting the number of terms makes the topic model easier to interpret. Adding more gives more information.

Number of terms to display

How to use

▲ Setting featured words update preference: Click the setting that best fits your needs.

▲ Choose terms to display: Use the slider to select a number by dragging the dot to any number.

Topic model

● Topic model is a tool that maps topics. It analyzes word relations across all speeches and maps those relations as word matrices. The model then displays the topics on the left and the words within them on the right. Frequency of use is depicted by circle size for topics and by bar length for words. Topics are numbered 1 to n, where 1 is the largest topic.

■ The topic model can be used to get an understanding of themes present in the speeches throughout the years.

How to use

▲ Topics (left): The sizes in the visualization represent frequency and can be interpreted as such. To inspect a topic, select it in the selector in the header or hover over its circle. Clicking a topic will keep it highlighted.

▲ Topic content (right body): Hovering over a word will show which topics it is included in. Clicking a word will add an underline, but nothing else.

▲ Relevance metric slider (right header): The slider adjusts relevance for words to be part of a topic. To see it work, first highlight a topic (by clicking or using the top left filter).
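The behavior of the relevance slider resembles the relevance metric used in LDAvis-style topic visualizations. The following is a minimal sketch under that assumption; the dashboard does not state which formula it uses. The function names and the probabilities are made up for illustration. With the slider value (lam) at 1, words are ranked purely by how frequent they are within the topic; at 0, they are ranked by how much more frequent they are in the topic than in the speeches overall.

```python
import math

def relevance(p_word_given_topic, p_word, lam):
    """Assumed LDAvis-style relevance; not confirmed by the dashboard."""
    lift = p_word_given_topic / p_word
    return lam * math.log(p_word_given_topic) + (1 - lam) * math.log(lift)

def rank_terms(topic_probs, corpus_probs, lam, n_terms):
    """Return the n_terms words with the highest relevance for one topic."""
    scored = {w: relevance(p, corpus_probs[w], lam) for w, p in topic_probs.items()}
    return sorted(scored, key=scored.get, reverse=True)[:n_terms]

# Illustrative values only: probability of each word within a topic and overall.
topic_probs = {"denmark": 0.20, "greenland": 0.05, "year": 0.25}
corpus_probs = {"denmark": 0.10, "greenland": 0.01, "year": 0.30}

by_frequency = rank_terms(topic_probs, corpus_probs, lam=1.0, n_terms=3)
by_lift = rank_terms(topic_probs, corpus_probs, lam=0.0, n_terms=3)
```

Lowering the slider value therefore tends to surface words that are distinctive for the highlighted topic rather than words that are common everywhere.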

Sentence options

● A slider to set the maximum number of sample sentences.

■ To adjust how many sentence samples you want to see.

Sentences allowed

How to use

▲ Drag the circle on the slider to any number.

● Samples of sentences belonging to chosen topic.

■ To see examples of sentences in chosen topic.

How to use

▲ Click on a topic in the topic map above.

● This boxplot shows the average sentiment of sentences belonging to a specific topic, compared to the rest of the topics.

■ To compare a specific topic's sentiment to that of the rest of the topics.

How to use

▲ Click on a topic in the topic map above to make it appear.

▲ Hover over a box in the boxplot to see descriptive statistics.
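The descriptive statistics behind such a boxplot can be sketched as follows. This is a minimal illustration, assuming each sentence carries a topic number and an average sentiment value; the data, the function names and the quartile method are assumptions, not the dashboard's actual implementation.

```python
import statistics

def box_stats(values):
    """Five-number summary plus mean, as typically shown in a boxplot tooltip."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
    }

def compare_topic(sentences, topic):
    """sentences: list of (topic_id, sentiment) pairs (hypothetical layout)."""
    chosen = [s for t, s in sentences if t == topic]
    rest = [s for t, s in sentences if t != topic]
    return box_stats(chosen), box_stats(rest)

# Made-up sentence sentiments for two topics.
sentences = [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, -1), (2, 0), (2, 1)]
chosen_stats, rest_stats = compare_topic(sentences, topic=1)
```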

Sentiment

● Sentiment by year shows the positive-, negative- and summed sentiment by year. This model does so in ranges.

■ Displaying sentiment in connected ranges makes it easy to interpret the relation of sentiment between years.

How to use

▲ Interpretation: Positive sentiment, summed sentiment and negative sentiment are displayed. The range between positive and summed sentiment is highlighted. The difference between positive and summed sentiment equals the size of the negative sentiment. Changes in the size of the range can be used to observe changes in sentiment by year.

▲ Tooltip: Hovering a year displays a tooltip that shows the sentiment values of the year.

▲ Year filter: Using the year filter will limit the years featured.

▲ Featured words: Using featured words will show the sentiment that subset had in each year.
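The relation between the three values can be sketched as follows. This is a minimal illustration, assuming yearly sentiment is the sum of word polarities weighted by how often each word is used; the weighting, the row layout and the numbers are assumptions made for the example.

```python
from collections import defaultdict

def sentiment_by_year(rows):
    """rows: iterable of (year, polarity, count) tuples (hypothetical layout)."""
    totals = defaultdict(lambda: {"positive": 0, "negative": 0})
    for year, polarity, count in rows:
        if polarity > 0:
            totals[year]["positive"] += polarity * count
        elif polarity < 0:
            totals[year]["negative"] += polarity * count
    # Summed sentiment is positive plus negative (negative is stored as <= 0).
    return {
        year: {**t, "summed": t["positive"] + t["negative"]}
        for year, t in sorted(totals.items())
    }

# Made-up data: (year, word polarity, times used).
rows = [(2000, 3, 2), (2000, -2, 1), (2001, 1, 4), (2001, -3, 2)]
yearly = sentiment_by_year(rows)
```

For the year 2000 in this example, positive is 6, negative is -2 and summed is 4, so the highlighted range between positive and summed has the same size as the negative sentiment.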

● Sentiment relationship shows the relationship between positive sentiment, negative sentiment and the size of speeches.

■ The sentiment of speeches could indicate influences in topics. Influences from a year need to be researched independently.

How to use

▲ Interpretation: The center position of a year-circle indicates its sentiment. The X-axis indicates positive sentiment from low (left) to high (right). The Y-axis indicates negative sentiment from high (bottom) to low (top).

▲ Tooltip: Hovering over a year-circle shows a tooltip with its positive sentiment, negative sentiment, summed sentiment, sentiment label and the words with polarity in that year.

▲ Year filter: Using the year filter will limit the years featured.

▲ Featured words: Using the featured words will group the years into years that include words in the filter and years that do not. This will be displayed in the tooltip.

● Sentiment by year shows the positive-, negative- and summed sentiment by year. This model does so in columns.

■ Displaying sentiment in columns makes it easy to compare the sizes of positive, negative and summed sentiment, and to see the differences between years.

How to use

▲ Interpretation: Negatives are displayed left and positives right. The Sum shows their aggregated value. Observe the size difference.

▲ Series filter: Click on a series name to disable it. It stays disabled until it is clicked again or a filter updates the visualization.

▲ Tooltip: Hovering the values of a year will show the sentiment of enabled series.

▲ Year filter: Using the year filter will limit the years featured.

▲ Featured words: Using featured words will show the sentiment that subset had in each year.

Sentiment of words

● Sentiment of words displays words by sentiment.

■ Sentiment of words can help you identify what words influenced the sentiment.

How to use

▲ Interpretation: The most positive words have a polarity of 3. The most negative words have a polarity of -3.

▲ Tooltip: Hovering over a word will display the polarity of the word and its sentiment category.

▲ Series filter: Click on a series name to disable it. Click again or set a new filter that affects the visualization to enable it again.

▲ Year filter: Setting the year filter will filter for the words used in those years.

▲ Featured words: Featured words that have sentiment will be included before other words (otherwise words are included by frequency). If at least one featured word with sentiment is present, the series will be named after both whether the words are positive or negative and whether they are featured or not.

▲ Number of words: Number of words filters words included, from most to least frequent.

Notice: Words with a polarity of 0 are filtered out.
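The selection rules described above can be sketched as follows. This is a minimal illustration under stated assumptions: the data layout, the function name and the example words are hypothetical, and the dashboard's own implementation may differ.

```python
def select_words(words, featured, n_words):
    """words: dict of word -> (polarity, frequency) (hypothetical layout).

    Drops words with polarity 0, puts featured words first, then orders
    the remaining words from most to least frequent, and keeps n_words.
    """
    with_sentiment = {w: v for w, v in words.items() if v[0] != 0}
    ordered = sorted(
        with_sentiment,
        # False sorts before True, so featured words come first.
        key=lambda w: (w not in featured, -with_sentiment[w][1]),
    )
    return ordered[:n_words]

# Made-up polarities and frequencies.
words = {"happy": (3, 5), "war": (-3, 2), "thanks": (2, 9), "year": (0, 40)}

without_featured = select_words(words, featured=set(), n_words=3)
with_featured = select_words(words, featured={"war"}, n_words=2)
```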

Frequency used (n uses in total)

● Frequency used displays words by frequency.

■ Frequency used can help you identify what words influenced the sentiment.

How to use

▲ Interpretation: Larger bars have higher frequency.

▲ Series filter: Click on a series name to disable it. Click again or set a new filter that affects the visualization to enable it again.

▲ Year filter: Setting the year filter will filter for the words used in those years.

▲ Number of words: Number of words filters words included, from most to least frequent.

Notice: Words with a polarity of 0 are filtered out.

Number of words

● Number of words is a slider to set the number of featured words.

■ Limiting the number of words featured will reduce clutter and make Sentiment of words and Frequency used easier to interpret.

Number of words (by frequency)

How to use

▲ Set a number by using the slider.

Countries

● A world map, with countries mentioned in speeches.

■ To know where countries are, and how often they are mentioned.

How to use

▲ Interpretation: Highlighted countries are mentioned. Darker colors represent more mentions.

▲ Tooltip: Hovering or clicking brings up a tooltip showing total mentions and the number of mentions in each relevant year.

▲ Year filter: Setting the year filter will filter for the countries mentioned in those years.

Countries mentioned

● A bar chart showing the number of times a country is mentioned in each year.

● A boxplot showing the average sentiment of sentences in which a country is mentioned.

■ To compare the mentions of a specific country to mentions of countries in general.

How to use

▲ Tooltip bar chart: Hovering over a bar shows how often a specific country is mentioned in that year, compared to mentions of countries in general.

▲ Tooltip boxplot: Hovering over a box shows distribution statistics for the average sentiment of sentences in which a specific country is mentioned, compared to the distribution for sentences in which any country is mentioned.

● Samples of mentions.

■ To see examples of mentions.

How to use

▲ Click on a country in the map above.

Sentence options

● A slider to set the maximum number of sample sentences.

■ To adjust how many sentence samples you want to see.

Sentences allowed

How to use

▲ Drag the circle on the slider to any number.

Word statistics

Speech statistics sliders

● Top frequent numbers is a slider to filter for frequency of words in word-themed visualizations.

■ Many words can be mentioned. Most of these visualizations can become too cluttered to interpret.

Top frequent numbers

How to use

▲ Use the slider by dragging the circle to any number.

Tip: Don't choose too high a number at first. Start low. Models can become too cluttered to interpret and visualizations will take a long time to load.

● Aggregated word frequency is a pie-chart displaying aggregated frequency of words.

■ Frequency is a major factor for determining topics.

How to use

▲ Interpretation: Frequency can be read from the labels.

▲ Tooltip: Hovering a word will display a tooltip showing the frequency of the word, and frequency in percentage in relation to selection. Clicking a word will highlight it.

▲ Top frequent numbers: Top frequent numbers will filter for the most frequent words by the specified amount.

▲ Year filter: Setting the year filter will filter for the words used in those years.

▲ Featured words: Setting featured words will filter for the words set in selection.

Notice: Due to performance issues, the pie chart is limited to a maximum of 40 top frequent words. Any number set above 40 will result in 40 being featured.
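The combination of the Top frequent numbers slider, the year filter and the cap from the notice above can be sketched as follows. This is a minimal illustration; the data layout, the function name and the word counts are assumptions, and only the cap of 40 is taken from the notice.

```python
from collections import Counter

MAX_WORDS = 40  # cap taken from the notice above

def top_words(year_word_counts, years, requested):
    """year_word_counts: dict of year -> Counter of word frequencies (hypothetical).

    Aggregates frequencies over the selected years and returns the most
    frequent words, never more than MAX_WORDS.
    """
    limit = min(requested, MAX_WORDS)
    total = Counter()
    for year in years:
        total.update(year_word_counts.get(year, {}))
    return total.most_common(limit)

# Made-up word counts for two years.
counts = {
    2000: Counter({"denmark": 3, "year": 5}),
    2001: Counter({"denmark": 4, "hope": 1}),
}

both_years = top_words(counts, years=[2000, 2001], requested=2)
one_year = top_words(counts, years=[2000], requested=100)
```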

● This shows the frequency of words by year with a stream graph.

■ The model is useful for determining the frequency of words and their relation to other words, especially for an overview of how frequency changes through the years.

How to use

▲ Interpretation: Words are represented by streams, and the width represents the frequency in a year of a selection of words.

▲ Tooltip: Hovering above a year displays a tooltip for the year. It will display the frequency of each word in that year.

▲ Top frequent numbers: Top frequent numbers will filter for the most frequent words by the specified amount.

▲ Year filter: Setting the year filter will filter for the words used in those years.

▲ Featured words: Setting featured words will filter for the words set in selection.

Notice: Due to performance issues, the stream graph is limited to a maximum of 20 top frequent words. Any number set above 20 will result in 20 being featured.

● This shows the frequency of words by year with stacked columns.

■ The model is useful for determining the frequency of the words in a selection and their relation to each other.

How to use

▲ Interpretation: Each year has a bar. The height of a bar is the total frequency of the word selection. Each part of a bar represents a word, and its height represents the frequency of that word.

▲ Tooltip: Hovering over a year will display a tooltip. The tooltip will list the year, the total frequency, and the frequency of each word.

▲ Top frequent numbers: Top frequent numbers will filter for the most frequent words by the specified amount.

▲ Year filter: Setting the year filter will filter for the words used in those years.

▲ Featured words: Setting featured words will filter for the words set in selection.

Notice: Due to performance issues, the stacked column chart is limited to a maximum of 30 top frequent words. Any number set above 30 will result in 30 being featured.

● This model displays words as data points by year and frequency.

■ The model is useful for determining the frequency of words in relation to other words in the same year.

■ The model is useful for detecting large values, big influencers, and potential outliers.

How to use

▲ Interpretation: Each year, each word is represented by a datapoint in relation to its frequency.

▲ Tooltip: Hovering over a datapoint will display a tooltip for the point. It will display the word, the year, the frequency in that year, and the frequency of the word across all years, even those outside the set filter.

▲ Top frequent numbers: Top frequent numbers will filter for the most frequent words by the specified amount.

▲ Year filter: Setting the year filter will filter for the words used in those years.

▲ Featured words: Setting featured words will filter for the words set in selection.

Notice: Due to performance issues, the scatterplot is limited to a maximum of 20 top frequent words. Any number set above 20 will result in 20 being featured.

● Words in the wordcloud are randomly selected from the available words.

■ The wordcloud can give an impression of what a selection of words contains. It is not very useful beyond that.

How to use

▲ Interpretation: There are words. Size represents frequency.

▲ Tooltip: Hovering the filling of a word displays a tooltip. The tooltip displays the frequency of the word.

▲ Top frequent numbers: Top frequent numbers will filter for the most frequent words by the specified amount.

▲ Year filter: Setting the year filter will filter for the words used in those years.

▲ Featured words: Setting featured words will filter for the words set in selection.

Data source and handling

Data preparation

Stopwords are filtered out. This keeps the most common words (like "the") from dominating the statistics. The filter uses stopwords defined by Bertel Torp, stopwords defined by Snowball, and custom stopwords found while working with the data.

Words have been stemmed to get better data for topics. Stemming does remove information about word forms. This is done with SnowballC for R.

Words have been lemmatized (replacing inflected forms of a word with its headword) to improve topic analysis, using udpipe for R.
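The stopword step can be sketched as follows. This is a minimal illustration in Python rather than the R packages named above; the stopword list is a short hypothetical stand-in for the real lists, and stemming and lemmatization are left out because they depend on external language models.

```python
import re
from collections import Counter

# Hypothetical stand-in for the real stopword lists.
STOPWORDS = {"the", "and", "of", "to", "a", "in", "we"}

def tokenize(text):
    """Lowercase the text and split it into words made of letters only."""
    return re.findall(r"[^\W\d_]+", text.lower())

def word_frequencies(text, stopwords=STOPWORDS):
    """Count words after removing stopwords."""
    return Counter(t for t in tokenize(text) if t not in stopwords)

freqs = word_frequencies("We thank the people of Denmark and the people of Greenland.")
```

Without the filter, "the" and "of" would tie with "people" as the most frequent words in this sentence; with it, only content words remain.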

Topics

Topics have been derived with a Structural Topic Model (STM) that is characterized by estimating topic distributions using a covariate matrix as well as a document-term matrix.

The covariates used for estimating topic distributions in the Queen's New Year's addresses are the year of the speech and the average sentiment at sentence level.

Sentiment analysis

For Danish: Sentiment analysis is done by getting sentiment values from Det Danske Sprog- og Litteraturselskab (DSL, Society for Danish Language and Literature) and Center for Sprogteknologi, Københavns Universitet (CST, Centre for Language Technology, University of Copenhagen), and adding the values to the words we collected at word level, stem level and lemma level.

For English: Sentiment analysis is done by getting sentiment values from the AFINN-111 sentiment dataset and adding them to the words we collected at word level, stem level and lemma level.
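Attaching sentiment values at the three levels can be sketched as a lookup that tries each form of a word in turn. This is a minimal illustration: the lookup order (word, then stem, then lemma), the default of 0 for unknown words and the lexicon values are all assumptions made for the example, not values from the lexicons named above.

```python
def word_polarity(word, stem, lemma, lexicon):
    """Return the polarity of the first form found in the lexicon, else 0."""
    for form in (word, stem, lemma):
        if form in lexicon:
            return lexicon[form]
    return 0  # assumed default for words without a sentiment value

# Made-up lexicon entries for illustration.
lexicon = {"happy": 3, "war": -2}

matched_by_lemma = word_polarity("happiest", "happiest", "happy", lexicon)
matched_by_stem = word_polarity("wars", "war", "war", lexicon)
unmatched = word_polarity("table", "tabl", "table", lexicon)
```

Looking up stems and lemmas as well as the raw word lets inflected forms pick up a sentiment value even when only the base form is listed in the lexicon.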

Speeches covered

● These are the speeches covered in the dashboard with your current filter.

■ You can use this to see the source text behind what you filtered.

How to use

▲ You can use the language and year filters. To see all text in the selected language, select all possible years.

Notice: The English versions of the speeches are translations. Not all Danish speeches are translated. Some years are therefore not available.

● This is the info box from Wikipedia about the queen.

■ You can use this to familiarize yourself with the subject, or for a quick read-up.

How to use

▲ You can read the table, or visit the source linked above the table.

How to operate dashboard

How to interpret visualizations and information

If you wish to have some guidance in your data journey, we have prepared assisting labels for each visualization.

These labels will be located directly beneath the point of interest.

Click the icon in the top right corner to hide or expand a section.

● What marking: A blue dot with bold dark-blue text indicates what kind of visualization/information is depicted.

■ Why marking: A yellow circle with dark-yellow bold text indicates why and when you could use the visualization/information you can observe.

How to use

▲ How marking: A light-brown triangle with brown bold text indicates how you can interpret information in the visualization.

Grey italic/slanted text indicates other kinds of help text. This could be descriptions of how to operate inputs, or information about limitations and bugs.

Black text is other kinds of information. It would usually be practical information.

Filters

● This dashboard utilizes filtering systems. The main filters are in the sidebar when expanded. Some visualizations have custom filters. They are documented when needed.

■ The filters can be used to filter the data you see. You can do this to inspect a certain subset of data, for example to look at data relevant to you.

▲ To expand the sidebar, press the expand button in the top left corner.

Language filter

● The language filter sets which language/translation of the speeches you inspect, and which language the Wikipedia info box is displayed in.

■ For this dashboard, Danish would be optimal. It is the original language of the speeches. If you don't understand Danish, you can still use a limited set of speeches in English.

▲ Push the text or dot next to the language you want to use.

Year filters

● Year filter refers to the group of filters affecting year. These are: the year input, the year slider and the year selector.

■ Filtering years can help you limit the data relevant to the topic you are researching.

▲ Year input: Push the text or dot next to the year input method you need or prefer.

▲ Year range: Select the start year and end year by dragging and dropping the round dots.

▲ Year selector - remove year(s): Click on a year and press the Backspace key or Delete key to remove it. Hold down Ctrl to select multiple years. Alternatively, select the input field (e.g. by clicking or using the Tab key), navigate the cursor with the arrow keys, and use the Backspace key or Delete key to remove years.

▲ Year selector - add year: Select the input field (e.g. by clicking or using the Tab key) and type a year. Press Enter to add it, or select a suggestion by clicking on it.

Notice: Some visualizations will use the whole dataset as a reference regardless of your set filter.

Featured words

● Featured words is a selection of words from the speeches that is used to filter or highlight words in different ways.

■ When you are inspecting a topic, some words might seem important. So, you can filter for these words using the filter.

▲ Select word: Type a word and press Enter when done, or click a suggestion in the pop-up.

▲ Remove word(s): Click on a word and press the Backspace key or Delete key to remove it. Hold down Ctrl to select multiple words. Alternatively, select the input field (e.g. by clicking or using the Tab key), navigate the cursor with the arrow keys, and use the Backspace key or Delete key to remove words.

▲ Remove words by filter: Click the button labeled "Clear words".

▲ Regret removal by clear words: Click the button labeled "Regret clear". It only works for the last clearing.