# Sensometrics tutorials

Organised by The Sensometrics Society

The following tutorials, organised by the Sensometrics Society, will take place on 13 September 2022, 10am to 2pm. The cost is €70 + VAT; please book via the conference registration system.

Tutorial 1: Multivariate Data Analysis in Sensory Evaluation (or, why PCA should be your desert island airplane)

Instructor(s): John Castura, *Compusense Inc., Canada*, and Ingunn Berget, *Nofima AS, Norway*

Suppose you are going to live on a desert island. As is usually the case on desert islands, you also need to analyse consumer and sensory test data while you are there. What multivariate statistical method will you bring with you? I am sure I’m not alone in making principal component analysis (PCA) a top choice. Why? PCA moves the observed results to a different coordinate system, handles multicollinear sensory attributes, and compresses and reduces the data along the way. The results are readily visualized and reveal patterns that might otherwise be missed. It is also key to unlocking other multivariate methods. If you don’t already agree that PCA belongs on your desert island, by the end of this tutorial, you might.

The tutorial begins with a crash course in PCA. We describe the PCA airplane and fuel it with a data matrix, X. In flight, we describe how we get score and loading matrices from X using singular value decomposition. Then we show how the PCA results can be visualized and interpreted. Then, mid-flight, the instruments go haywire and the engines sputter! Oh no! Everyone aboard the plane parachutes to safety, and the pilot, against all odds, lands the plane safely in an emergency water landing. Everyone is okay! But we don’t want to ever have to do that manoeuvre again.
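For readers who want the algebra behind the score and loading matrices mentioned above, one common formulation is sketched here. It assumes that X (n rows by p columns) has been column-centred; the symbols U, D, V, T and P are notation chosen for this sketch and may differ from the notation used in the tutorial.

$$
X = U D V^{\top}, \qquad T = U D = X V, \qquad P = V
$$

Here U and V have orthonormal columns and D is a diagonal matrix of singular values $d_1 \ge d_2 \ge \dots \ge 0$. The scores T give the coordinates of the rows of X in the new coordinate system, and the loadings P give the directions of that system in terms of the original variables. Under the same assumptions, the proportion of variance carried by component k is

$$
\frac{d_k^{2}}{\sum_{j} d_j^{2}}
$$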

We retrieve the black box. What went wrong with PCA Flight #1 is that the original variables were measured with uncertainty but, after PCA, were treated as if they were not. To improve the safety of future PCA flights, we describe a bootstrap procedure for investigating uncertainty. We present research, some of it very new, for using these uncertainty estimates to investigate how well products are discriminated.
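The general idea of a bootstrap can be illustrated with a much simpler statistic than a PCA solution. The sketch below is not the procedure presented in the tutorial: it resamples assessors with replacement and builds a percentile interval for the difference between two product means. The tutorial itself works in R; Python is used here only for illustration. The function name, the products A and B, and the scores are all hypothetical.

```python
import random
import statistics


def bootstrap_mean_difference(scores_a, scores_b, n_boot=2000, seed=1):
    """Percentile bootstrap interval (95%) for mean(A) - mean(B).

    scores_a[i] and scores_b[i] are the scores that assessor i gave to
    hypothetical products A and B, so assessors are resampled as pairs.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("each assessor must score both products")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(scores_a)
    diffs = []
    for _ in range(n_boot):
        # Draw n assessors with replacement and recompute the statistic.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(statistics.fmean([scores_a[i] - scores_b[i] for i in idx]))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper


# Made-up scores from ten assessors, for illustration only.
scores_a = [7, 8, 6, 7, 9, 8, 7, 6, 8, 7]
scores_b = [5, 6, 6, 5, 7, 6, 5, 6, 6, 5]

lower, upper = bootstrap_mean_difference(scores_a, scores_b)
print(f"95% bootstrap interval for mean(A) - mean(B): [{lower}, {upper}]")
```

An interval that excludes zero suggests that the two products are discriminated on this attribute. Applying the same resampling idea to PCA results requires additional steps that this sketch does not attempt.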

PCA Flight #2 takes flight, again fuelled by data matrix X. We can see clouds of uncertainty around the PCA results with our new instruments. These visualizations are connected to numerical results that reveal product discrimination. Interpretation focuses on the connection between the visual and numerical approaches. This is not a coding tutorial, but at cruising altitude, there will be mid-flight statistical analyses in R showing how it’s done. When PCA Flight #2 touches down for a perfect landing, the time in Turku will be about 3 hours after the tutorial departure time. Tutorial attendees will deplane with a new understanding and tools they can bring to their work, or to their desert island.

__Duration__: 3 hours

__Audience__: Sensory scientists and statisticians interested in analysis of consumer test data

__Background__: Basic understanding of statistics is advantageous but not required

__Laptop__: This is not a coding workshop, so a laptop is not required. Attendees will receive a list of resources related to the topics discussed. Some R code related to the examples shown will be provided to attendees after the tutorial.

__Contingency__: If it is necessary, the workshop will pivot to an online format. Tutorial registrants will receive notice of any changes in plans.

Tutorial 2: An introduction to text analysis with R for sensory and consumer scientists

Instructor(s): Jake Lahne, Leah Hamilton and Martha Calvert, *Virginia Tech, USA*

Analysis of unstructured text data (for example, free responses on surveys or comments on websites that allow customer reviews of products) is increasingly of interest to sensory and consumer scientists. However, most sensory and consumer scientists are trained primarily in the analysis of purely quantitative data, so the prospect of dealing with these inherently qualitative data can be daunting. In addition, many of the most powerful and flexible tools for analysing these data are in programming languages, like R and Python, that can also be intimidating for new users. Fortunately, useful, basic text analysis approaches are more accessible than they might appear, and this tutorial will present one set of methods to open analysis of text data to a larger audience. All resources used in the tutorial are open-source and will remain available to attendees.

In this tutorial, we will introduce the audience to the R statistical programming environment and the RStudio Integrated Development Environment (IDE) with the aim of developing sufficient basic skills to conduct a text analysis on sensory-relevant text data. We will provide a learning dataset of text data for the analysis: a set of food-product free-comment reviews that are associated with overall liking scores. This will allow us to demonstrate connections between text analysis methods and basic sensory and consumer science approaches. We will also provide an R script that walks through all steps of importing, manipulating, and analysing this dataset.

The tutorial will have two sections. In the first section, we will introduce R and RStudio, cover the basic commands of R, and cover key, user-friendly conventions of “tidy” R programming for importing, manipulating, and plotting data using the “tidyverse” packages. In the second section, we will use the “tidytext” package to conduct basic text analysis, including text tokenization, text modeling using TF-IDF, and basic lexicon-based sentiment analysis.
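To give a feel for the three text analysis steps named above, here is a minimal, language-neutral sketch. The tutorial itself uses R with the “tidytext” package; this example uses only the Python standard library and is not the tutorial's script. The reviews, the three-word lexicon, and the function names are made up for illustration, and the TF-IDF shown is one common definition (term count divided by document length, multiplied by the natural log of the number of documents divided by the number of documents containing the term).

```python
import math
import re
from collections import Counter

# Made-up free-comment reviews, for illustration only.
reviews = {
    "r1": "Crunchy and sweet, a great snack",
    "r2": "Too sweet and a bit stale",
    "r3": "Great crunchy texture, great flavour",
}

# Tiny hypothetical sentiment lexicon; real lexicons are far larger.
lexicon = {"great": 1, "sweet": 1, "stale": -1}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(docs):
    """Return {doc_id: {term: weight}} using tf = count / document length
    and idf = ln(number of docs / number of docs containing the term)."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter(term for c in counts.values() for term in c)
    n_docs = len(docs)
    result = {}
    for doc_id, c in counts.items():
        total = sum(c.values())
        result[doc_id] = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in c.items()
        }
    return result


def sentiment(text, lexicon):
    """Sum the lexicon scores of the tokens; unknown words count as 0."""
    return sum(lexicon.get(tok, 0) for tok in tokenize(text))


weights = tf_idf(reviews)
scores = {doc_id: sentiment(text, lexicon) for doc_id, text in reviews.items()}

for doc_id in reviews:
    top_term = max(weights[doc_id], key=weights[doc_id].get)
    print(doc_id, "top term:", top_term, "sentiment:", scores[doc_id])
```

In this toy lexicon, “sweet” counts as positive, so the review containing “Too sweet” is not scored as negative on that word. This is a typical limitation of word-level lexicon methods, which ignore context.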

At the end of the tutorial, attendees will have learned the basics of working with data in R and have familiarity with basic methodologies for text analysis.

Duration: 3 hours

Audience: Sensory and consumer scientists who are interested in learning basic text analysis using freely available/open-source software (R/RStudio).

Background: Basic understanding of statistics is helpful but not required. Basic experience with R is similarly helpful but not required. However, to get through all of the material, we will not be able to cover the full range of how to use R.

We will send out an email to registered participants detailing some basic setup requirements (R/RStudio software installation).

Laptop: This is a coding workshop, and so we ask all participants to bring a laptop. We will ask for minimal pre-work (installation of R/RStudio).

Contingency: If it is necessary, the workshop could pivot to an online format. Tutorial registrants will receive notice of any changes in plans.