Tag Archives: cafe hounds

Data Science: Exploring CoffeeReview.com Top Coffees (Cont’d)

In the last post we began exploring the relationship between the language describing coffee (“cupping notes”) and price/brand/roaster. Our objective is to give coffee consumers a general understanding of the groupings of coffee they can choose from, based on flavor profiles and mouthfeel characteristics. The flavor wheel from Counter Culture, below, illustrates the type of properties coffee professionals use to describe their craft:

[Figure: Counter Culture flavor wheel]

After evaluating the segments that our initial k-means clustering (with a k of 5) produced, I was unsatisfied with the results. My decision to haphazardly throw the price variable (unscaled) into the model was wrong-headed and drove the algorithm to essentially classify segment membership solely based upon that. In some cases such an exercise may be useful, but for our objective of discerning whether specific language could be used to segment particular specialty coffees, this segmentation wasn’t going to do it for us.
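To make the scaling problem concrete, here is a minimal sketch in Python (not the SPSS workflow used for this post). The four rows are hypothetical, and `standardize` is simply a z-score helper written for this example. It shows how an unscaled price column swamps a 0/1 flavor indicator in Euclidean distance, which is the distance k-means relies on.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical rows: (price per pound in USD, 1 if "floral" is in the review else 0).
coffees = [(21.0, 1), (24.0, 0), (121.0, 1), (243.0, 0)]

def euclidean(a, b):
    """Straight-line distance between two equal-length rows."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1 (z-scores)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stdevs))
        for row in rows
    ]

# Unscaled: a flavor difference barely registers next to a price difference.
raw_flavor_gap = euclidean(coffees[0], coffees[1])  # similar price, different flavor
raw_price_gap = euclidean(coffees[0], coffees[2])   # same flavor, different price

# Scaled: both columns now contribute on a comparable footing.
scaled = standardize(coffees)
scaled_flavor_gap = euclidean(scaled[0], scaled[1])
scaled_price_gap = euclidean(scaled[0], scaled[2])

print(raw_flavor_gap, raw_price_gap)
print(scaled_flavor_gap, scaled_price_gap)
```

With the raw values, the two coffees that share a flavor note sit far apart simply because of price. After standardizing, the flavor difference carries at least as much weight as the price difference.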

Also, this initial segmentation helped me narrow my “business objective”. Now I wanted to segment by flavor profile, something that might actually help inform a potential consumer’s purchasing decisions.

In order to develop the cupping note variables that would inform our segmentation, I explored the text data from Kenneth Davids’ site and selected the most common and/or most distinguishing words to test. The list of words is below.

[Figure: list of candidate cupping-note words]

A quick look at these led me to believe that certain words might not yield significant information gain in the algorithm due to lack of variance. Mouthfeel, sweet and acidity were present in 96%, 80% and 90% of reviews respectively. Their power as differentiating variables would be constrained by their existence in nearly all observations (with the possible exception of acidity).
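The lack-of-variance point can be checked with a little arithmetic. A word that is either present or absent in a review is a 0/1 variable, and its variance is p(1 − p), where p is the share of reviews containing it. The sketch below uses only the three percentages quoted above; the function name is just for illustration.

```python
# Share of reviews containing each word, as quoted in the paragraph above.
prevalence = {"mouthfeel": 0.96, "sweet": 0.80, "acidity": 0.90}

def indicator_variance(p):
    """Variance of a 0/1 variable that equals 1 with probability p."""
    return p * (1 - p)

variances = {word: round(indicator_variance(p), 4) for word, p in prevalence.items()}

# A word found in exactly half of the reviews has the largest possible variance.
max_possible = indicator_variance(0.5)

for word, variance in variances.items():
    print(f"{word}: {variance} (maximum possible is {max_possible})")
```

The closer a word gets to appearing in every review, the closer its variance gets to zero, and the less it can help separate one cluster from another.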

However, in my initial quick cluster using SPSS, I included the three variables mentioned above and I still liked the results enough to move forward.

Segment 1: 16.9% of reviews

This segment was the most expensive (average $42.31 USD per pound) and highest rated (94.6). It indexed highest on floral, honey, complex, silk, delicate, intense, and peach cupping notes, and also indexed highly on nib, lemon and acidity. The most common origins in this mix were Geishas from Panama and Colombia, along with Ethiopian, Kenyan and El Salvadoran coffees.
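For readers unfamiliar with “indexing”, here is a rough sketch of one common convention, which is an assumption on my part rather than a description of the SPSS output: divide the share of a segment’s reviews that mention a word by the share of all reviews that mention it, then multiply by 100. A score of 100 means the segment looks like the average; 200 means the word shows up twice as often. The shares below are hypothetical and are not taken from the review data.

```python
def index_scores(segment_rates, overall_rates):
    """Return 100 * (segment rate / overall rate) for each word."""
    return {
        word: round(100 * segment_rates[word] / overall_rates[word])
        for word in segment_rates
        if overall_rates.get(word)  # skip words missing or never used overall
    }

# Hypothetical shares of reviews mentioning each word (not the post's data).
overall = {"floral": 0.40, "honey": 0.25, "rich": 0.50}
segment = {"floral": 0.80, "honey": 0.50, "rich": 0.25}

scores = index_scores(segment, overall)
print(scores)
```

In this made-up example the segment over-indexes on floral and honey and under-indexes on rich.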

[Figure: list of Segment One coffees]

Segment 2: 27.8% of reviews

This segment was the least expensive (average $26.72 USD per pound) and moderately rated (94.45) while coming from the most diverse sampling of producer countries. It indexed highest on rich, deep, resonant and pungent cupping notes. Whereas the other segments did not include any coffees from Bolivia, Brazil, Mexico or Papua New Guinea, this segment did.

[Figure: list of Segment Two coffees]

Segment 3: 13.9% of reviews

This segment was middle of the road in terms of cost and ratings (average $37.09 USD per pound and rated 94.52 on average). It indexed highest on juicy, tart, acidity, nib, bright, and sweet, and was also well above average in complexity and floral notes. The producing countries varied quite a bit in this segment: several Bourbon varietals from Guatemala, Costa Rica and Hawaii; other Geishas from Panama, Colombia and Guatemala; several Ethiopian Yirgacheffe coffees; and a few honey-processed coffees from El Salvador (Pacamara) and Hawaii (Maragogype, $75/lb).

[Figure: list of Segment Three coffees]

Segment 4: 20.3% of reviews

This segment was the second least expensive ($28.46 USD per pound) and lowest rated (94.33), all relative to a very highly rated group of coffees. It indexed highest for fruit, sweet, lemon and light while also scoring strongly on tart. This segment is composed of a mixture of coffees from Ethiopia, Kenya, Burundi, Indonesia and Honduras. A few peaberry coffees are included, along with the red caturra from Rusty’s Hawaiian, a few stray Geisha coffees, and a decently heavy sampling of Sumatra, Yirgacheffe, Sidamo, and various Kenyan single-origins. For the value, this is a very attractive and diverse segment of coffees. See our site visit to Rusty’s in Hawai’i in 2011.

[Figure: list of Segment Four coffees]

Cupping With Miguel At Lorie’s Home

Segment 5: 21.1% of reviews

Segment five is highly rated (94.58) and quite expensive ($37.73 USD per pound on average). This segment indexes the highest for tart, rich, acidity, syrup, pungent, and mouthfeel, while also scoring highly for honey and bright notes. Panama, Colombia, Hawaii and Ethiopia are the most heavily represented producer countries in this grouping. This segment is probably the most populated by Geishas followed by exotic Ethiopian and Kenyan coffees.

[Figure: list of Segment Five coffees]
For more information on the roasters evaluated in this data from the coffeereview.com website, see the links and data below:

[Figures: roaster links and data]

And I’ll leave you with a bit of a refresher on the Cup of Excellence Scoring Categories for thinking about and communicating coffee quality/taste.

Cup of Excellence® Scoring Categories

DEFECTS

Phenolic, rio, riado: automatic disqualification
Ferment
Oniony, sweaty

CLEAN CUP
+ purity | free from measurable faults | clarity
– dirty | earthy | moldy | off-fruity

SWEETNESS (prevalence of…)
+ ripeness | sweet
– green | undeveloped | closed | tart

ACIDITY
+ lively | refined | firm | soft | having spine | crisp | structure | racy
– sharp | hard | thin | dull | acetic | sour | flabby | biting

MOUTHFEEL (texture, viscosity, sediment, weight, astringency)
+ buttery | creamy | round | smooth | cradling | rich | velvety | tightly knit
– astringent | rough | watery | thin | light | gritty

FLAVOR (nose + taste)
+ character | intensity | distinctiveness | pleasure | simple-complex | depth
(possible notations: nutty, chocolate, berry, fruit, caramel, floral, beefy, spicy, honey, smokey…)
– insipid | potato | peas | grassy | woody | bitter-salty-sour | gamey | baggy

AFTERTASTE
+ sweet | cleanly disappearing | pleasantly lingering
– bitter | harsh | astringent | cloying | dirty | unpleasant | metallic

BALANCE
+ harmony | equilibrium | stable-consistent (from hot to cold) | structure | tuning | acidity-body
– hollow | excessive | aggressive | inconsistent change in character

OVERALL (not a correction!)
+ complexity | dimension | uniformity | richness | (transformation from hot to cold…)
– simplistic | boring | do not like!

Data Science: Exploring CoffeeReview.com Top Coffees

Over the past few years, I’ve transitioned my career from government-oriented management consulting to the field of advanced analytics and data science.

In general terms, this has required me to climb a significant learning curve in the related areas of computer programming languages and advanced statistical methods. While it has been challenging, the rewards of being able to more effectively and efficiently extract insights from various types of information/data are encouraging.

With the objective of exploring my love of specialty coffee, I chose to practice a few basic data science methods on a relatively well-known specialty coffee review website: coffeereview.com.

The goal was to apply web scraping, text analytics, segmentation, and some visualization techniques to coffee review data in order to explore correlations between price, producer country, roaster, and quality over time.

My colleague and I discussed the objective over Memorial Day weekend and set out on parallel paths to scrape review data from the website. He used a Python script to scrape the website, and I used an R script to do the same. In the end, his Python script achieved a more efficient scrape, producing a comma-separated values (.csv) file that could be imported into a statistical computing software package like SPSS or R.

The website we targeted in this scrape was the 21 pages of: http://www.coffeereview.com/highest-rated-coffees/

From there, I cleaned up the file (using R packages such as “dplyr”, “stringr” and “sqldf”) to get things to a point where we could calculate the price per pound and identify the country of origin for most of the coffees reviewed. I was also able to pull down city/state location data for each of the roasters and their websites.
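The price-per-pound step is simple unit arithmetic once the price text has been parsed. The sketch below is in Python rather than the R used for the actual cleanup, and it assumes the price field looks like “$18.00/12 ounces” or “$20.00/250 grams”. That format, the pattern, and the function name are illustrative assumptions, not a record of what the scrape returned.

```python
import re

OUNCES_PER_POUND = 16
GRAMS_PER_POUND = 453.592

# Assumed format: "$18.00/12 ounces" or "$20.00/250 grams".
PRICE_PATTERN = re.compile(
    r"\$(?P<price>\d+(?:\.\d+)?)\s*/\s*(?P<amount>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>ounces?|oz|grams?|g)\b",
    re.IGNORECASE,
)

def price_per_pound(text):
    """Return the USD price per pound, or None if the text does not match."""
    match = PRICE_PATTERN.search(text)
    if match is None:
        return None
    price = float(match.group("price"))
    amount = float(match.group("amount"))
    if amount == 0:
        return None  # avoid dividing by zero on malformed entries
    unit = match.group("unit").lower()
    units_per_pound = GRAMS_PER_POUND if unit.startswith("g") else OUNCES_PER_POUND
    return round(price * units_per_pound / amount, 2)

for sample in ["$18.00/12 ounces", "$21.00/8 oz", "$20.00/250 grams", "price unavailable"]:
    print(sample, "->", price_per_pound(sample))
```

Returning `None` for anything that does not match keeps unparseable prices out of the averages, which is presumably why only “most” of the coffees ended up with a usable price.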

One of my first business questions involved the type of descriptive language used to review the website’s top-rated coffees. Were there any particular words that we could associate with the best rated coffee out there, according to coffeereview.com?

A relatively straightforward way to investigate that question is to use a Word Cloud to illustrate the words with the highest frequency of mention in individual review comments.

Most frequent words describing top rated coffees.
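A word cloud is just a picture of a word-frequency table, so the counting underneath it is easy to sketch. The Python below is illustrative only: the three review snippets are made up for the example, and the stop-word list is deliberately tiny.

```python
import re
from collections import Counter

# A short, illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "with"}

def word_frequencies(reviews):
    """Count how often each non-stop word appears across all reviews."""
    counts = Counter()
    for review in reviews:
        words = re.findall(r"[a-z']+", review.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return counts

# Hypothetical review snippets, written for this example.
reviews = [
    "Intense aroma of flowers and chocolate with a syrupy mouthfeel.",
    "Rich, complex flavor with tart acidity and a honey finish.",
    "Complex and rich in the cup; sweet acidity, silky mouthfeel.",
]

counts = word_frequencies(reviews)
top_words = counts.most_common(4)
print(top_words)
```

The words that appear most often in the table are the ones a word cloud draws largest.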

Clearly, if you want to appear to know the jargon for communicating your delight about a quality cup of java, you should say something like, “This coffee’s intense aroma of flowers, baker’s chocolate and fruit is only bested by its complex, rich flavor with tart tinges of acidity and a balanced, silky, syrupy, honey finish…”. Okay…so that sounds ridiculous…but you get the point.

Exploring the data

What is the range of ratings found on the top rated page?

The maximum rating any single coffee receives on this page (of highest rated coffees) is 97, while the minimum is 94. There isn’t a lot of variance. Most of the top rated coffees are rated 94, a third are 95, and the remaining 15 percent are either 96 or 97. We will revisit this data later.

Distribution of Top Rated Coffees from CoffeeReview.com

What years of ratings do we have the most robust data for in order to do more specific analysis on our variables?

We decided to drop all years prior to 2010 (which had 29 coffees reviewed that year).

Year    Count
2014    70
2013    58
2012    40
2011    39
2015    24
2010    20

Which coffee roasters were the most frequently reviewed and top rated by coffeereview.com between 2010 and roughly six months into 2015?

JBC Coffee Roasters from Madison, Wisconsin was the favorite by far, with 26 reviews on the website in the time span specified, followed by Temple Coffee and Tea in Sacramento, CA (20) and PT’s Coffee Roasting Company in Topeka, Kansas (13). This was a surprise to me, as I have never sampled ANY coffee from these roasters and feel like I have been missing out. In order to show the table of roasters, I used the combination of R packages “RGraphics” and “gridExtra” to save some nice incremental graphics (in sets of 15).

[Figures: tables of roasters by number of reviews, in sets of 15]

A quick visualization of the top rated coffees by year, price per pound and origin country shows some semi-distinct segments within the data based on price alone. This led me to ponder whether we could use a clustering algorithm (such as k-means using dummy variables for each country, price per pound, and rating) to group the coffees more clearly. Instead of using R for this exploration, I exported the data into a .csv and imported it into SPSS to run the analysis there.

Price per pound by origin country and year ($US). United States = Hawai’i.

A five-way cluster solution seemed the most suitable for segmenting the data in a way that illustrated differences across price and producer country.
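For anyone who wants to try this kind of clustering outside SPSS, here is a minimal Python sketch of the idea: dummy (one-hot) variables for country, plus price per pound and rating, fed to a plain k-means (Lloyd’s algorithm). It is not the SPSS procedure used for this post. The six rows are hypothetical, the helper names are mine, and it uses two clusters instead of five to keep the toy example readable.

```python
import random
from math import dist

def one_hot(value, categories):
    """Encode a category as a list of 0/1 dummy variables."""
    return [1.0 if value == category else 0.0 for category in categories]

def k_means(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical rows: (origin country, price per pound in USD, rating).
rows = [
    ("Ethiopia", 20.0, 94), ("Kenya", 22.0, 95), ("Panama", 120.0, 96),
    ("Colombia", 125.0, 95), ("Ethiopia", 24.0, 94), ("Panama", 118.0, 97),
]
countries = sorted({country for country, _, _ in rows})
features = [
    one_hot(country, countries) + [price, float(rating)]
    for country, price, rating in rows
]

labels = k_means(features, k=2)
print(labels)
```

Because price is left unscaled here, the two clusters split cleanly into cheap and expensive coffees, whatever the country or rating.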

Price unreasonably drove the segmentation, as seen in this graphic.

The segments broke out into groupings containing the following number of coffee reviews each:

Segment    Count    $US/lb
1          174      $21
2          8        $121
3          35       $44
4          1        $243
5          20       $84

Segment 1: No Geisha or Hawaiian Coffees, Espresso Blends
Segment 2: Panama and Colombian Geishas
Segment 3: Mix of Geishas, Ethiopian, and Hawaiian
Segment 4: Semeon Abay Ethiopia
Segment 5: Mid-priced Geisha, Hawaiian and Ethiopian

Interestingly, a few roasters exhibited a bit of dispersion across the segments due to the variety of awesome tasting coffees they had reviewed. Those roasters included:

PT’s Coffee Roasting Co.

5 were (Seg 1)
3 were (Seg 2)
3 were (Seg 3)
2 were (Seg 5)

Barrington Coffee Roasting Co.

3 were (Seg 1)
4 were (Seg 3)
1 was (Seg 4)
3 were (Seg 5)

Bird Rock Coffee Roasters

6 were (Seg 1)
1 was (Seg 2)
3 were (Seg 3)
1 was (Seg 5)

Paradise Roasters

6 were (Seg 1)
1 was (Seg 2)
1 was (Seg 3)
2 were (Seg 5)
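The roaster-by-segment breakdown above is a small cross-tabulation. As a sketch in Python, using only the counts listed above (the function name is mine), it can be built and totaled like this:

```python
from collections import defaultdict

# (roaster, segment, number of reviews), copied from the lists above.
assignments = [
    ("PT's Coffee Roasting Co.", 1, 5), ("PT's Coffee Roasting Co.", 2, 3),
    ("PT's Coffee Roasting Co.", 3, 3), ("PT's Coffee Roasting Co.", 5, 2),
    ("Barrington Coffee Roasting Co.", 1, 3), ("Barrington Coffee Roasting Co.", 3, 4),
    ("Barrington Coffee Roasting Co.", 4, 1), ("Barrington Coffee Roasting Co.", 5, 3),
    ("Bird Rock Coffee Roasters", 1, 6), ("Bird Rock Coffee Roasters", 2, 1),
    ("Bird Rock Coffee Roasters", 3, 3), ("Bird Rock Coffee Roasters", 5, 1),
    ("Paradise Roasters", 1, 6), ("Paradise Roasters", 2, 1),
    ("Paradise Roasters", 3, 1), ("Paradise Roasters", 5, 2),
]

def cross_tab(rows):
    """Build a {roaster: {segment: count}} table from (roaster, segment, count) rows."""
    table = defaultdict(dict)
    for roaster, segment, count in rows:
        table[roaster][segment] = table[roaster].get(segment, 0) + count
    return table

table = cross_tab(assignments)
totals = {roaster: sum(segments.values()) for roaster, segments in table.items()}

for roaster, segments in table.items():
    print(roaster, dict(sorted(segments.items())), "total:", totals[roaster])
```

The PT’s total of 13 lines up with the 13 reviews noted for that roaster earlier in the post.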

After exploring the data in this way, I wondered 1) whether my approach to segmentation was appropriate and 2) how the comments from these segments compared. To answer the first question: no, but that will be the topic of my next blog post. To answer the second, let’s explore some word clouds below.

Word Cloud: Segment 1

Word Cloud: Segment 2

Word Cloud: Segment 3

Word Cloud: Segment 4

Word Cloud: Segment 5

Perhaps clustering by cupping notes is a better way to segment groups…stay tuned.

Interview: Chuck Patton – Bird Rock Coffee Roasters

Name: Chuck Patton
Title: Owner, Bird Rock Coffee Roasters, La Jolla, CA
Birthplace: San Diego, CA
Hometown: Pacific Beach community. Went to elementary, junior high and high school within a few miles of present day Bird Rock Coffee Roasters retail location.

La Jolla Light festivities.

Background

Cafe Hound: Where does your passion for specialty coffee come from? When was that?
Chuck: I started drinking a lot of coffee in high school just for the buzz.  Several years ago, my wife got me a home roaster and I spent a lot of time experimenting with different beans from Sweet Maria’s until the hobby grew into a business.
CH: Tell me about your entry into the coffee industry.
Chuck: I bought a one pound fluid air roaster for about US$3,500 and began a home delivery service. I also sold my coffee at the La Jolla farmers market.
CH: How many years of experience do you have in the coffee industry?
Chuck: I started the business in 2002.
CH: Did you work for other coffee establishments before starting your coffee business?
Chuck: No. I was self taught.
CH: What was the first location of your business?
Chuck: I did not have a location at first. I roasted out of the VFW on Turquoise because they had a health permit. Then, I rented space in a restaurant that is now out of business on La Jolla Blvd. I converted his wine bar into a coffee bar for morning business but it did not do well. I chalked it up to a learning experience. Then, I rented a coffee kiosk on Turquoise behind Albertson’s and operated out of there as the smallest licensed coffee wholesaler in California. Then, I bought the business of a guy who was burnt out. It included a list of wholesale accounts and a Probat L12, but operated out of Miramar. So, we did that for about a year until we moved here.
CH: Who were your initial clients?
Chuck: Most of our clients are from Bird Rock, La Jolla, and Pacific Beach.


Pony spotting at BRC

Bean & Drink Talk

CH: Where do you buy your beans from?
Chuck: Different brokers. If we are buying directly from farmers, we still need to work with an importer and exporter.
CH: Do you roast your own beans?
Chuck: We roast our own.
CH: Do you sell wholesale or online?
Chuck: Yes, we do wholesale and also sell online.
CH: How often do you order beans? How often do you roast?
Chuck: We roast 5 days a week and order coffee at least twice a month.
CH: How do you name your blends?
Chuck: We only have two blends. We focus on single-origin coffee.
CH: What are your top 3 favorite roasts of the recent past?
Chuck: First, Ethiopia Amaro Gayo city roast. Second, Panama La Esmeralda city roast. And third, Costa Rica Micro-lot full city roast.
CH: What is your favorite drink?
Chuck: Coffee.
CH: What drink is the most sold at Bird Rock?
Chuck: Lattes.
CH: Are there any interesting stories behind your drink names?
Chuck: Trophy wife and sugar daddy are self-explanatory considering the area our café is located in.

Looking Ahead

CH: Do you have any plans for expansion?
Chuck: Secret.  No comment.
CH: So…what’s next? Beyond the business, what else would you like to do through your work?
Chuck: We are currently working on a water filtration project for some of the farmers we are working with in Huila. I will return to Colombia next month to install the second generation of prototypes in a few of the farmers’ homes. I believe we have a responsibility to the farmers we buy coffee from that goes beyond simply purchasing “Fair Trade” coffee so we will focus on projects like this in the future.
CH: What else do you want to tell our reader?
Chuck: We are continuing to seek out and purchase high quality coffee directly from farmers so we are increasing our travel time as we begin to develop relationships with farming groups.
CH: Thank you for sharing your interesting story with us.

Business Information

Bird Rock Coffee Roaster 5627 La Jolla Blvd., La Jolla, CA 92037 Tel. 858 551 1707 www.birdrockcoffeeroasters.com

Special Thanks

We would like to thank Chuck and the good folks at Bird Rock Coffee for roasting the beans used in the first release of Kris/Maher Blend. Maher also wants to thank all of the employees for keeping him caffeinated and happy over his last year of residence in Pacific Beach, especially Hector, Jocylynn, and Tony. Maher knows he’s forgetting the two dudes that used to make sure he got his morning espresso as he rushed to school – unfortunately, the key word was “rush”.

Photo credits: cafehound.com and http://www.lajollalight.com/life/258652-taste-of-bird-rock

Bird Rock Coffee Roasters

http://www.birdrockcoffeeroasters.com

5627 La Jolla Blvd.

La Jolla, CA 92037

858 551 1707

HRS: Mon-Fri 6am-6pm; Sat-Sun 6:30am-6pm

Background

Name: Chuck Patton
Title: Owner
Birthplace: San Diego, CA
Hometown: Pacific Beach community

· Went to elementary, junior high and high school within a few miles of present day Bird Rock Coffee Roasters retail location.

Cafehound.com: Where does your passion for specialty coffee come from? / When was that?

Chuck: Started drinking a lot of coffee in high school just for the buzz.  Several years ago, my wife got me a home roaster and I spent a lot of time experimenting with different beans from Sweet Maria’s until the hobby grew into a business.

Cafehound.com: Tell me about your entry into the coffee industry.

Chuck: I bought a one pound fluid air roaster (~US$3,500) and began a home delivery service; selling my coffee at the La Jolla farmers market.

CH: How many years of experience do you have in the coffee industry?

Chuck: Started the business in 2002.

CH: Did you work for other coffee establishments before starting your coffee business?

Chuck: No. Self taught.

CH: What was the first location of your business?

Chuck: Did not have a location at first. I roasted out of the VFW on Turquoise because they had a health permit. Then, I rented space in a restaurant that is now out of business on La Jolla Blvd. I converted his wine bar into a coffee bar for morning business but it did not do well. I chalked it up to a learning experience. Then I rented a coffee kiosk on Turquoise behind Albertson’s and operated out of there as the smallest licensed coffee wholesaler in California. Then, I bought the business of a guy who was burnt out. It included a list of wholesale accounts and a Probat L12, but operated out of Miramar. So, we did that for about a year until we moved here.

CH: Who were your initial clients / client profile?

Chuck: Most of our clients are from Bird Rock, La Jolla, and PB.

Bean talk

CH: Where do you buy your beans from?

Chuck: Different brokers. If we are buying directly from farmers we still need to work with an importer and exporter.

CH: Do you roast your own or purchase from a wholesaler?

Chuck: We roast our own.

CH: Do you sell wholesale? Online?

Chuck: Yes/Yes

CH: How often do you order beans? How often do you roast?

Chuck: We roast 5 days a week and order coffee at least twice a month.

CH: How do you name your blends?

Chuck: We only have two blends. We focus on single-origin coffee.

CH: What are your top 3 favorite roasts (country, degree of roast, specific origin/farm if possible) of the recent past?

Chuck:

Country       Most specific origin    Degree of roast
Ethiopia      Amaro Gayo              City Roast
Panama        La Esmeralda            City Roast
Costa Rica    Micro-Lot               Full City Roast

Drink talk

Favorite Drink: Coffee

Most sold at Bird Rock: Lattes

CH: Are there any interesting stories behind your drink names?

Chuck: Trophy wife and sugar daddy are self-explanatory considering the area [our café is located in]…

Looking Ahead

CH: Any plans for expansion?

Chuck: Secret.  No comment.

CH: So…what’s next? Beyond the business, what else would you like to do through your work? (Training initiatives, Farm visits, Educational programs, Environmental programs, etc…)

Chuck: We are currently working on a water filtration project for some of the farmers we are working with in Huila.  I will return to Colombia next month to install the second generation of prototypes in a few of the farmers’ homes.   I believe we have a responsibility to the farmers we buy coffee from that goes beyond simply purchasing “Fair Trade” coffee so we will focus on projects like this in the future.

CH: What else do you want to tell our reader?

Chuck: We are continuing to seek out and purchase high quality coffee directly from farmers so we are increasing our travel time as we begin to develop relationships with farming groups.

End of interview.

Special thanks to Chuck and the good folks at Bird Rock Coffee for roasting the beans used in the 1st release of Kris/Maher Blend. Maher also wants to thank some of the employees for keeping him caffeinated and happy over his last year of residence in Pacific Beach: Hector, Jocylynn and Tony. Maher knows he’s forgetting the two dudes that used to make sure he got his morning espresso as he rushed to school – unfortunately, the key word was “rush”.

Photo credits: Cafehound.com and http://www.lajollalight.com/life/258652-taste-of-bird-rock