Category Archives: Beans

Find Your Coffee

At cafehound.com, we endeavor to locate the best coffee in the world. Over the last eight years, we’ve happily watched the options available to the public multiply around the globe and the public’s general awareness of specialty coffee deepen. Although we still believe that tracking down the best coffee in the world is central to our mission, we recently decided to dip our toes into recommending specific coffees to coffee lovers based on a mixture of qualitative and empirical analysis.


In two posts (1 and 2) from 2015, we took verbal reviews of specialty coffees from the site coffeereview.com and employed various clustering algorithms to discover groupings of coffee (based on the words used to describe them and other factors). This served as our initial foray into using data science on expert coffee reviews to improve our understanding of specialty coffee.

Over the past month, we’ve set out to improve upon that original work in order to empower java lovers to discover the perfect brew. Our years of cupping coffee and talking with experts have shown that – after a certain point – what constitutes a “good cup of coffee” is subjective and specific to the palate of the beholder.

With that in mind, cafehound.com chose to use a large, multiyear list of coffee reviews from Kenneth Davids’ coffeereview.com site to explore the relationship between the descriptions used to rate coffee aroma, flavor, aftertaste, body, acidity and finish. We hypothesized that there are distinct groupings of coffee based on their roast profile, body, and flavors that are relevant to informing consumer preferences in the overall marketplace. To clarify, a market segmentation based on a representative sample of surveyed consumer preferences may be more useful to marketing professionals, but that is outside the scope of this post. Instead, we’re using the structure inferred from math and reviews of specific coffees to estimate categories of the potential “coffee experience.” These categories may provide coffee consumers with guideposts for exploring new specialty coffees.

Our results led to six broad categories of coffee that we’ve ordered from lightest to darkest roast (based on average Agtron ratings). Agtron ratings are a numerical representation of roast color: lower numbers (below 45) indicate a darker roast, and higher numbers (50 and above) indicate a lighter roast. Roast is not the only thing that determines the flavor profile and overall body of a coffee, which is why some of these segments may appear similar.
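As a rough illustration of those Agtron cutoffs, here is a minimal Python sketch. The below-45 and 50-and-above thresholds come from the paragraph above; the function name and the "medium" label for readings in between are assumptions made for the example, and the readings are made up.

```python
def roast_band(agtron: float) -> str:
    """Map an Agtron reading to a rough roast band.

    The <45 and 50+ cutoffs come from the text; calling the gap
    between them "medium" is an assumption made for this sketch.
    """
    if agtron < 45:
        return "darker"
    if agtron >= 50:
        return "lighter"
    return "medium"  # assumed label for readings from 45 up to 50


# Made-up readings, ordered lightest to darkest like the six categories.
readings = [62, 55, 48, 40]
bands = [roast_band(r) for r in readings]
```

Averaging the readings within each segment and sorting from high to low would give the lightest-to-darkest ordering described above.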

Initially, we will bring this content to you via occasionally updated web pages. Depending on demand, we may scale our service to provide daily or weekly recommendation updates.

For now, follow the link below to Find Your Coffee.


For code share:

Shiny Segmentation and Prediction

Data Science: Exploring CoffeeReview.com Top Coffees

Over the past few years, I’ve transitioned my career from government-oriented management consulting to the field of advanced analytics and data science.

In general terms, this has required me to climb a significant learning curve in the related areas of computer programming languages and advanced statistical methods. While it has been challenging, the rewards of being able to extract insights more effectively and efficiently from various types of data are encouraging.

With the objective of exploring my love of specialty coffee, I chose to practice a few basic data science methods on a relatively well-known specialty coffee review website: coffeereview.com.

The goal was to apply web scraping, text analytics, segmentation, and some visualization techniques to coffee review data in order to explore correlations between price, producer country, roaster, and quality over time.

My colleague and I discussed the objective over Memorial Day weekend and set out on parallel paths to scrape review data from the website. He used a Python script, and I used an R script. In the end, his Python script achieved a more efficient scrape, producing a comma-separated values (.csv) file that could be imported into a statistical computing package like SPSS or R.
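To give a sense of that last step, here is a minimal sketch of writing scraped reviews out as .csv text with Python's standard library. It is not the script we used: the field names and the sample record are hypothetical placeholders, and the scraping itself is left out.

```python
import csv
import io

# Hypothetical field names; the original scraper's columns are not listed.
FIELDS = ["roaster", "coffee", "rating", "review_date", "price"]


def reviews_to_csv(reviews):
    """Serialize scraped review dicts to CSV text that R or SPSS can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(reviews)
    return buffer.getvalue()


# Placeholder record for illustration, not a real review.
sample = [{
    "roaster": "Example Roasters",
    "coffee": "Example Lot",
    "rating": 94,
    "review_date": "May 2015",
    "price": "$18.00/12 ounces",
}]
csv_text = reviews_to_csv(sample)
```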

The website we targeted in this scrape was the 21 pages of: http://www.coffeereview.com/highest-rated-coffees/

From there, I cleaned up the file (using R packages such as “dplyr”, “stringr” and “sqldf”) to get things to a point where we could calculate price per pound amounts and country of origin for most of the coffees reviewed. I was also able to pull down city/state location data for each of the roasters and their websites.
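The price per pound step can be sketched roughly as follows. The cleanup itself was done in R; this Python version is only an illustration, and it assumes prices appear as text like "$18.00/12 ounces". That format, the function name, and the decision to return None for anything else (other currencies, weights in grams) are all assumptions.

```python
import re

OUNCES_PER_POUND = 16

# Assumed price format: "$<amount>/<n> ounces". Other formats return None.
PRICE_PATTERN = re.compile(
    r"\$(\d+(?:\.\d+)?)\s*/\s*(\d+(?:\.\d+)?)\s*ounces?", re.IGNORECASE
)


def price_per_pound(price_text):
    """Convert a price string to US dollars per pound, or None if unparseable."""
    match = PRICE_PATTERN.search(price_text)
    if match is None:
        return None
    amount, ounces = float(match.group(1)), float(match.group(2))
    if ounces == 0:
        return None
    return amount * OUNCES_PER_POUND / ounces


example = price_per_pound("$18.00/12 ounces")  # 18 dollars for 12 oz
```

Rows that come back as None would need to be handled by hand, which is one reason only "most" of the coffees ended up with a price per pound.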

One of my first business questions involved the type of descriptive language used to review the website’s top-rated coffees. Were there any particular words that we could associate with the best-rated coffees out there, according to coffeereview.com?

A relatively straightforward way to investigate that question is to use a Word Cloud to illustrate the words with the highest frequency of mention in individual review comments.
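The counting behind a word cloud is simple enough to sketch. This is a minimal Python illustration, not the code behind the graphic: the stop list is a tiny assumed one, and the review snippets are made up rather than quoted from coffeereview.com.

```python
import re
from collections import Counter

# A tiny illustrative stop list; a real run would use a fuller one.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "with", "is", "its"}


def word_frequencies(reviews, top_n=5):
    """Count non-stopword tokens across review texts, most frequent first."""
    counts = Counter()
    for text in reviews:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)


# Made-up review snippets, not quotes from coffeereview.com.
reviews = [
    "Intense aroma of flowers and chocolate with a rich finish",
    "Rich, complex flavor with tart acidity and a syrupy finish",
    "Balanced acidity, rich chocolate, silky finish",
]
top_words = word_frequencies(reviews, top_n=3)
```

A word cloud then just draws each word at a size proportional to its count.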

Most frequent words describing top rated coffees.

Clearly, if you want to appear to know the jargon for communicating your delight about a quality cup of java, you should say something like, “This coffee’s intense aroma of flowers, baker’s chocolate and fruit is only bested by its complex, rich flavor with tart tinges of acidity and a balanced, silky, syrupy, honey finish…”. Okay…so that sounds ridiculous…but you get the point.

Exploring the data

What is the range of ratings found on the top rated page?

The maximum rating any single coffee receives on this page (of highest rated coffees) is 97, while the minimum is 94. There isn’t a lot of variance. Just over half of the top rated coffees are rated 94, a third are 95, and the remaining 15 percent are either 96 or 97. We will revisit this data later.

Distribution of Top Rated Coffees from CoffeeReview.com

What years of ratings do we have the most robust data for in order to do more specific analysis on our variables?

We decided to drop all years prior to 2010 (which had 29 coffees reviewed that year).

year count
2014 70
2013 58
2012 40
2011 39
2015 24
2010 20

Which coffee roasters were the most frequently reviewed and top rated by coffeereview.com between 2010 and roughly six months into 2015?

JBC Coffee Roasters from Madison, Wisconsin was the favorite by far, with 26 reviews on the website in the time span specified, followed by Temple Coffee and Tea in Sacramento, CA (20) and PT’s Coffee Roasting Company in Topeka, Kansas (13). This was a surprise to me, as I have never sampled ANY coffee from these roasters and feel like I have been missing out. To show the table of roasters, I used the R packages “RGraphics” and “gridExtra” to save the table as a series of graphics in sets of 15.

[Roaster tables: roasters 1 to 15, 16 to 30, 31 to 45, 46 to 60, and 61 to 73]
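Splitting a long table into sets of 15 comes down to computing row ranges. The tables themselves were produced in R; this Python sketch only shows the slicing. The total of 73 roasters is inferred from the last table's range (61 to 73) rather than stated anywhere in the post, and the function name is made up for the example.

```python
def chunk_ranges(total, size):
    """Return 1-based (start, end) row ranges covering `total` rows in sets of `size`."""
    return [
        (start, min(start + size - 1, total))
        for start in range(1, total + 1, size)
    ]


# 73 is inferred from the last table's range (61 to 73), not stated in the text.
ranges = chunk_ranges(73, 15)
names = [f"roasters_{a}_{b}" for a, b in ranges]
```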

A quick visualization of the top rated coffees by year, price per pound and origin country shows some semi-distinct segments within the data based on price alone. This led me to ponder whether we could use a clustering algorithm (such as k-means, using dummy variables for each country along with price per pound and rating) to group the coffees into clearer segments. Instead of using R for this exploration, I exported the data to a .csv file and imported it into SPSS to run the analysis there.
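For readers who want to try the same idea without SPSS, here is a minimal Python sketch of k-means on dummy-coded countries plus price per pound and rating. It is a stand-in, not the SPSS procedure: the rows are made up, the first k rows are used as starting centroids purely so the result is repeatable, and the variables are left unscaled, which is an assumption since the post does not record whether the SPSS run standardized them.

```python
from math import dist


def encode(rows, countries):
    """One-hot encode country, then append price per pound and rating."""
    return [
        [1.0 if r["country"] == c else 0.0 for c in countries]
        + [r["price_per_lb"], r["rating"]]
        for r in rows
    ]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm, seeded with the first k points for repeatability."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up rows for illustration only, not the scraped data.
rows = [
    {"country": "Ethiopia", "price_per_lb": 20.0, "rating": 94},
    {"country": "Panama", "price_per_lb": 120.0, "rating": 96},
    {"country": "Kenya", "price_per_lb": 22.0, "rating": 95},
    {"country": "Panama", "price_per_lb": 125.0, "rating": 97},
    {"country": "Ethiopia", "price_per_lb": 24.0, "rating": 94},
]
countries = sorted({r["country"] for r in rows})
labels = kmeans(encode(rows, countries), k=2)
```

With raw dollars sitting next to 0/1 dummies and a narrow rating range, the distances here are dominated by price, so the two clusters in this toy example are simply the cheap and the expensive coffees.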

Price per pound by origin country and year ($US). United States = Hawai’i.

A five-way cluster solution seemed the most suitable for segmenting the data in a way that illustrated differences across price and producer country.

Price unreasonably drove the segmentation, as seen in this graphic.

The segments broke out into groupings containing the following number of coffee reviews each:

Segment   Count   $US/lb
1         174     $21
2         8       $121
3         35      $44
4         1       $243
5         20      $84

Segment 1: No Geisha or Hawaiian Coffees, Espresso Blends
Segment 2: Panama and Colombian Geishas
Segment 3: Mix of Geishas, Ethiopian, and Hawaiian
Segment 4: Semeon Abay Ethiopia
Segment 5: Mid-priced Geisha, Hawaiian and Ethiopian
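A quick bit of arithmetic on the table above shows how lopsided the solution is. This Python sketch copies the counts and prices from the table; treating the $US/lb column as each segment's mean price is an assumption.

```python
# Segment sizes and $US/lb figures copied from the table above.
segments = {1: (174, 21), 2: (8, 121), 3: (35, 44), 4: (1, 243), 5: (20, 84)}

total = sum(count for count, _ in segments.values())
shares = {seg: count / total for seg, (count, _) in segments.items()}

# Assumes the $US/lb column is each segment's mean price.
overall_price = sum(count * price for count, price in segments.values()) / total
```

Segment 1 holds 174 of the 238 clustered reviews, or roughly 73 percent, while Segment 4 is a single coffee.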

Interestingly, a few roasters exhibited a bit of dispersion across the segments due to the variety of awesome tasting coffees they had reviewed. Those roasters included:

PT’s Coffee Roasting Co.

Seg 1: 5
Seg 2: 3
Seg 3: 3
Seg 5: 2

Barrington Coffee Roasting Co.

Seg 1: 3
Seg 3: 4
Seg 4: 1
Seg 5: 3

Bird Rock Coffee Roasters

Seg 1: 6
Seg 2: 1
Seg 3: 3
Seg 5: 1

Paradise Roasters

Seg 1: 6
Seg 2: 1
Seg 3: 1
Seg 5: 2

After exploring the data in this way, I wondered 1) whether my approach to segmentation was appropriate and 2) how the comments from these segments compared. To answer the first question: no, but that will be the topic of my next blog post. To answer the second, let’s explore some word clouds below.

Word Cloud: Segment 1

Word Cloud: Segment 2

Word Cloud: Segment 3

Word Cloud: Segment 4

Word Cloud: Segment 5

Perhaps clustering by cupping notes is a better way to segment groups…stay tuned.

Cafe Hounding: Verve Coffee

Verve Coffee Roasters
816 41st Avenue
Santa Cruz, CA 95062
Phone: (831) 475-7776

http://www.vervecoffeeroasters.com/

Awesome service, attention to coffee, and people.

Kris checked this place out while stopping through Santa Cruz, gallivanting through California per usual. The life of a professor… The atmosphere at Verve is relaxing in true Californian style, with plenty of natural light and a wood-and-metal decor reminiscent of Urban Outfitters in its layout. The shop is located only a couple of blocks from the Pacific Ocean, which undoubtedly adds to the tranquil feel.

The beans are roasted next door in their own facility – it appears they also sell wholesale across the country. Check out their website to make sure. Verve is among the most raved-about roasters in the continental US, and their baristas are talented enough to consistently make a splash at the annual SCAA barista competitions. Unfortunately for Kris, during his visit the top baristas were in Houston competing on the national stage at the SCAA annual competition. The staff are super fun and friendly and the wifi is very, very free.

Two paws up for this place. We look forward to continuing to sample their coffee throughout 2012!

– Cafe Hound

Bean Counting: Idido Natural Sun-dried – Counter Culture

Roaster: Counter Culture
Place of Purchase: Peregrine Espresso (14th St. NW Location)
Preferred Brew Method: Paper Filter Drip (pour over)
Excerpt From Counter Culture Describing Coffee:

Yirgacheffe, Ethiopia
Organic • Shade Grown
The community of Idido, just outside the town of Yirgacheffe, has once again produced the quintessential Ethiopian Natural Sundried coffee. One of the cleanest and most refined naturals we have tasted in years, Idido offers notes of strawberry, blueberry, and orange zest with a balanced, chocolate-like sweetness.

Cafe Hound Review: Generally, Counter Culture got this one right. This coffee cups clean, though it has enough of what I call “berry funk” to entertain your palate. I cycled through several of the Counter Culture coffees this year, with the Central Americans admittedly disappointing after a VERY strong showing in 2009 and a decent showing in 2010. In 2010 my favorite growing region of the world ended up being Kenya, though the Cafe Hound annual blend at the end of 2009 included a fair amount of sun-dried Ethiopian coffee from the Amaro region (a washed version is for sale at Novo now). Right now, this Yirgacheffe is rocking my world. I admit that as the weather cools here in the Nation’s Capital, I’m leaning towards bolder and fruitier coffees – though I enjoy a clean cup so much that I rarely venture to the extremes of many Indonesian-grown coffees (Sumatra). Still, it is all a matter of taste, and I encourage you to post your comments letting us know what your preferences are this year! Happy Hounding!

info@cafehound.com

Rusty’s Hawaiian Site Visit: Pahala, Hawaii

In early April 2011, both Cafe Hounds took a journey to Hawaii in search of the storied Kona coffee, in addition to some sunshine and snorkeling – oh, and Kris had a conference of the Association for Asian Studies (AAS), at which he presented. During our visit we had the pleasure of sampling some wonderfully crafted coffee drinks on the island of Oahu before we jumped on a short flight destined for the beautiful Big Island, where we landed in Kona. Once there, we decided to casually sample a few plantations in the immediate area near our Bed & Breakfast in south Kona (Ka’awa Loa). In short – they stunk.

Ka'awa Loa B&B

So the ONE big coffee-related adventure on Big Island was our visit to the wonderful farm of Lorie Obra and her family in the Ka’u District (in Pahala, Hawai’i). It was amazing. The Obra family house and farm is located in the small and relatively impoverished village of Pahala – with fewer than 1,350 inhabitants – just east of the southern tip (South Point – Ka Lae). According to 2010 Census data, more than 80% of the population is Asian/Pacific Islander or a mix of the two. Many of the inhabitants trace their roots to the Philippines – a country that Maher Hound used to live in until a volcanic eruption destroyed his home in 1991. This fact made visiting the volcanic island of Hawai’i that much more special.

Maher, Miguel and Lorie On The Farm in Pahala, HI

Lorie agreed to meet with Kris and Maher on relatively short notice and coordinated the meeting with her coffee consultant – and friend – Miguel Meza. Lorie’s daughter Joan Obra and her husband Ralph Gaston joined the group as well, having arrived in Hawai’i somewhat recently to join the family business after spending most of their lives on the mainland.

I drove us all up the road a mile or so to their farm, where we were able to walk around and experience the relatively young and VERY well planned out coffee farm of Rusty’s Hawaiian. The brand and the farm were started in 1999 with the seedling of a dream by Rusty Obra, a retired chemist who sadly passed away in 2006 – leaving his wife, Lorie, with the tough decision of whether to continue his dream…or cut her losses and move on. She bought into his dream and kept forging forward in a naturally advantageous habitat for superb coffee – planted on the slopes of the Mauna Loa volcano, which makes up the majority of Big Island, the largest of the Hawaiian islands. Standing on the farm you can see the ocean off in the distance looking south towards South Point, where one can find Green Sand Beach – where the sand takes its green color from olivine crystals eroded out of volcanic rock.

On the farm, Miguel and Lorie have experimented with several varietals – but the five that we had the pleasure of cupping that day were the:

  1. RH Lot 24 Tipica
  2. Bourbon (red)
  3. Yellow Caturra Natural Dried FRUKO
  4. Red Caturra
  5. Yellow Caturra Natural Dried

The farm currently has no certifications at all – though they stated that they plan to certify organic eventually. They explained that achieving certification is not viable and – probably more importantly – it is not viewed as an important aspect of quality in their sales strategy (in other words, their buyers care less about certification than about the unique flavor profile and superb quality control). Thus far, they have not experienced broca (the coffee berry borer, an insect pest).

Cupping With Miguel At Lorie's Home

One of the most important variables in their annual yields is rainfall, because 1) they don’t have an irrigation system and 2) volcanic soil doesn’t retain water very well. To that end, Miguel shared with me that the average commercial farm in the Kona district (where he also does coffee consulting for other farms) yields about 1,000 pounds per acre (due to higher rainfall) compared to averages ranging from 400 to 600 pounds per acre at Rusty’s farm. This relatively limited annual yield capacity for Rusty’s creates a situation where demand outstrips supply by far. For this reason, the $80 per pound for some of the coffees we sampled was understandable.

The hospitality shown to Kris and me at Rusty’s Hawaiian farm and home was exceptional. It encapsulates the Hawaiian way and also reminds me (Maher) fondly of my time in the Philippines. Hopefully, there will be more to come on Rusty’s Hawaiian and Miguel Meza – and the change that they are catalyzing in Hawaii’s specialty coffee industry.