Author Archives: Maher

Data Science: Exploring CoffeeReview.com Top Coffees

Over the past few years, I’ve transitioned my career from government-oriented management consulting to the field of advanced analytics and data science.

 

In general terms, this has required me to climb a significant learning curve in the related areas of programming languages and advanced statistical methods. While it has been challenging, the reward of being able to extract insights from various types of data more effectively and efficiently is encouraging.

With the objective of exploring my love of specialty coffee, I chose to practice a few basic data science methods on a relatively well-known specialty coffee review website: coffeereview.com .

The goal was to apply web scraping, text analytics, segmentation, and some visualization techniques to coffee review data in order to explore correlations between price, producer country, roaster, and quality over time.

My colleague and I discussed the objective over Memorial Day weekend and set out on parallel paths to scrape review data from the website. He used a Python script, and I used an R script to do the same. In the end, his Python script achieved a more efficient scrape, producing a comma-separated values (.csv) file that could be imported into a statistical computing package like SPSS or R.

The website we targeted in this scrape was the 21 pages of: http://www.coffeereview.com/highest-rated-coffees/

 

From there, I cleaned up the file (using R packages such as “dplyr”, “stringr” and “sqldf”) to get things to a point where we could calculate price per pound and identify the country of origin for most of the coffees reviewed. I was also able to pull down city/state location data for each of the roasters and their websites.
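The price-per-pound step can be sketched roughly as follows. This is a Python illustration, not the R code used for the cleanup, and it assumes a hypothetical price string such as "$18.00/12 ounces"; the format actually scraped from the site may differ, and the function name is made up for the example.

```python
import re

OUNCES_PER_POUND = 16.0

# Hypothetical price format, e.g. "$18.00/12 ounces" (an assumption for this sketch).
PRICE_PATTERN = re.compile(
    r"\$\s*([\d,]+(?:\.\d+)?)\s*/\s*(\d+(?:\.\d+)?)\s*(ounces?|oz\.?|pounds?|lbs?\.?)",
    re.IGNORECASE,
)

def price_per_pound(raw):
    """Return the US$ price per pound, or None if the string cannot be parsed."""
    match = PRICE_PATTERN.search(raw)
    if match is None:
        return None
    price = float(match.group(1).replace(",", ""))
    quantity = float(match.group(2))
    if quantity == 0:
        return None
    unit = match.group(3).lower()
    # Convert the package size to pounds before dividing.
    if unit.startswith(("ounce", "oz")):
        pounds = quantity / OUNCES_PER_POUND
    else:
        pounds = quantity
    return round(price / pounds, 2)
```

Returning None for anything that does not match keeps unparseable prices visible as missing values instead of silently turning them into zeros.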

One of my first business questions involved the type of descriptive language used to review the website’s top-rated coffees. Were there any particular words that we could associate with the best-rated coffees out there, according to coffeereview.com?

A relatively straightforward way to investigate that question is to use a Word Cloud to illustrate the words with the highest frequency of mention in individual review comments.

Most frequent words describing top rated coffees.
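The counting behind a word cloud like this can be sketched in a few lines. The Python below is only an illustration: the sample comments and the short stopword list are made up for the example, and a real analysis would use the scraped review text and a fuller stopword list.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption, not the one used for the cloud).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "with", "is", "its", "by", "to"}

def word_frequencies(comments, top_n=10):
    """Count words across review comments, skipping stopwords and very short tokens."""
    counts = Counter()
    for comment in comments:
        words = re.findall(r"[a-z']+", comment.lower())
        counts.update(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(top_n)

# Made-up comments standing in for the scraped reviews.
sample_comments = [
    "Intense floral aroma with rich chocolate and a silky finish.",
    "Complex and balanced; tart acidity, syrupy mouthfeel, honey finish.",
    "Rich, floral, and sweetly tart with a long chocolate finish.",
]

top_words = word_frequencies(sample_comments, top_n=5)
```

A word cloud then simply sizes each word by its count.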


Clearly, if you want to appear to know the jargon for communicating your delight about a quality cup of java, you should say something like, “This coffee’s intense aroma of flowers, baker’s chocolate and fruit is only bested by its complex, rich flavor with tart tinges of acidity and a balanced, silky, syrupy, honey finish…”. Okay…so that sounds ridiculous…but you get the point.

Exploring the data

What is the range of ratings found on the top rated page?

The maximum rating any single coffee receives on this page (of highest rated coffees) is 97, while the minimum is 94. There isn’t a lot of variance. Just over half of the top rated coffees are rated 94, a third are 95, and the remaining 15 percent are either 96 or 97. We will revisit this data later.

Distribution of Top Rated Coffees from CoffeeReview.com


What years of ratings do we have the most robust data for in order to do more specific analysis on our variables?

We decided to drop all years prior to 2010 (which had 29 coffees reviewed that year).

Year    Count
2014    70
2013    58
2012    40
2011    39
2015    24
2010    20

Which coffee roasters were the most frequently reviewed and top rated by coffeereview.com between 2010 and roughly six months into 2015?

JBC Coffee Roasters from Madison, Wisconsin was the favorite by far, with 26 reviews on the website in the time span specified, followed by Temple Coffee and Tea in Sacramento, CA (20) and PT’s Coffee Roasting Company in Topeka, Kansas (13). This was a surprise to me, as I have never sampled ANY coffee from these roasters and feel like I have been missing out. To show the table of roasters, I used a combination of the R packages “RGraphics” and “gridExtra” to save some nice incremental graphics (in sets of 15).

Roaster tables: 1-15, 16-30, 31-45, 46-60, 61-73.

A quick visualization of the top rated coffees by year, price per pound and origin country shows some semi-distinct segments within the data based on price alone. This led me to wonder whether a clustering algorithm (such as k-means on price per pound, rating, and a dummy variable for each country) could group the coffees more clearly. Instead of using R for this exploration, I exported the data to a .csv file and imported it into SPSS to run the analysis there.
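For readers who want to try the idea outside SPSS, here is a minimal sketch of k-means on price, rating, and country dummy variables. It is plain Python written for illustration, not the SPSS procedure used for this post; the rows are made up, and the helper names (build_features, standardize, kmeans) are hypothetical.

```python
import math
import random

def build_features(rows, countries):
    """Turn (price_per_lb, rating, country) rows into numeric vectors with dummies."""
    return [[price, float(rating)] + [1.0 if country == c else 0.0 for c in countries]
            for price, rating, country in rows]

def standardize(vectors):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*vectors))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(v, means, stds)] for v in vectors]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per vector."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: squared_distance(v, centers[j]))
                  for v in vectors]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up rows: (price per pound in US$, rating, origin country).
rows = [
    (21.0, 94, "Ethiopia"), (19.0, 94, "Kenya"), (23.0, 95, "Colombia"),
    (44.0, 95, "Ethiopia"), (48.0, 94, "United States"),
    (120.0, 96, "Panama"), (125.0, 97, "Panama"), (84.0, 95, "Panama"),
]
countries = sorted({country for _, _, country in rows})

raw = build_features(rows, countries)
raw_labels = kmeans(raw, k=2)                  # price on its original dollar scale
scaled_labels = kmeans(standardize(raw), k=2)  # every column on a comparable scale
```

With unscaled inputs, dollar differences are far larger than the 0/1 dummies, so the distance calculation is driven almost entirely by price. Standardizing the columns first is one common way to give rating and country more influence.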


Price per pound by origin country and year ($US). United States = Hawai’i.

A five-way cluster solution seemed the most suitable for segmenting the data in a way that illustrated differences across price and producer country.

Price unreasonably drove the segmentation, as seen in this graphic.


The segments broke out into groupings containing the following number of coffee reviews each:

Segment    Count    $US/lb

1          174      $21
2          8        $121
3          35       $44
4          1        $243
5          20       $84

Segment 1: No Geisha or Hawaiian Coffees, Espresso Blends
Segment 2: Panama and Colombian Geishas
Segment 3: Mix of Geishas, Ethiopian, and Hawaiian
Segment 4: Semeon Abay Ethiopia
Segment 5: Mid-priced Geisha, Hawaiian and Ethiopian

Interestingly, a few roasters exhibited a bit of dispersion across the segments due to the variety of awesome tasting coffees they had reviewed. Those roasters included:

PT’s Coffee Roasting Co.

5 were (Seg 1)
3 were (Seg 2)
3 were (Seg 3)
2 were (Seg 5)

Barrington Coffee Roasting Co.

3 were (Seg 1)
4 were (Seg 3)
1 was (Seg 4)
3 were (Seg 5)

Bird Rock Coffee Roasters

6 were (Seg 1)
1 was (Seg 2)
3 were (Seg 3)
1 was (Seg 5)

Paradise Roasters

6 were (Seg 1)
1 was (Seg 2)
1 was (Seg 3)
2 were (Seg 5)

After exploring the data in this way, I wondered 1) whether my approach to segmentation was appropriate and 2) how the comments from these segments compared. To answer the first question: no, but that will be the topic of my next blog post. To answer the second, let’s explore some word clouds below.

Word Cloud: Segment 1

Word Cloud: Segment 2

Word Cloud: Segment 3

Word Cloud: Segment 4

Word Cloud: Segment 5

 

Perhaps clustering by cupping notes is a better way to segment groups…stay tuned.


Coffee Inspiration: TEDtalk

As I gear up for a well-deserved break, and a return to the origin of my coffee inspiration, I leave you with a couple of inspirational, coffee-themed independent TED talks.

Cafe Hounding: Caffe / Illy – Washington, D.C.

Caffe: Marriott Renaissance M Street Hotel
1143 New Hampshire Ave NW
Washington, DC 20037
(202) 775-0800
http://www.yelp.com/map/illy-cafe-washington
http://www.marriottmodules.com/restaurant/hotels/hotel-information/travel/wasrw-renaissance-m-street-hotel/caffe_an_italian_coffee_house/

Caffe is the name of the boutique concept coffee shop located within the Marriott Renaissance Hotel in the West End of NW Washington, D.C.  This was the first of several shops opened in the last three years that exclusively sell Illy coffee and their designer products (namely their fancy hand-painted espresso cups/plates and pods). Although Illy is not my first choice for espresso in most cases, every time I’ve had a cup at this M Street location, I have been thoroughly pleased. The dark, complex and caramel-like finish of the typical Illy espresso is a proven winner.  The true-to-form syrupy crema that commonly accompanies a well-made Italian espresso consistently shines through here and, based on third-hand accounts, their cappuccinos are also well made.

This is definitely not a place to sit down and work, eat a meal or chat for too long with friends.  In keeping with the typical Italian espresso bar tradition, there is only a standing counter along the windows of this petite shop where one is able to down a drink and continue on.  Not too linger-friendly here.  Not to worry, it is just a quick walk through into the adjoining restaurant (also part of the Marriott Renaissance Hotel), where you can begin an entirely separate dining experience.

In short, although this is not a place for much more than a quick coffee on the go, it is a quality coffee-drinking experience and worth a stop if you’re in the area and want a well-made coffee drink.  The iced latte I had here in Summer 2010 was probably the best I’ve ever had.  Try getting a similar experience across the street at Starbucks: simply unheard of.

I like the cup (seen above) so much that I asked to purchase it.  I was pleased to find out that they happily sell the cup/plate/spoon sets used for a little under $10.

Here are some additional links that discuss the place:

WaPo
Yelp
Examiner
UrbanSpoon


Cafe Hounding: Azi’s Cafe – Washington, D.C.

1336 Ninth St. NW
Washington, D.C.
20001-4208
http://aziscafe.com/index.html

http://maps.google.com/maps/place?client=safari&rls=en&oe=UTF-8&um=1&ie=UTF-8&q=washington+dc+nw+1336+9+st&fb=1&gl=us&hnear=Washington+D.C.,+DC&cid=12196182154941226661

Azi’s Café is a wonderful place to grab a coffee and a meal in one of DC’s most diverse and dynamic neighborhoods – albeit not very commercial.  The charming owner, Azeb Desta (nicknamed Azi), hails from coffee’s disputed birthplace in the Horn of Africa.  Before opening Azi’s in 2005 she worked for eleven years in food and beverage with Ritz-Carlton hotels.

Her location at the corner of 9th and O streets is smack in the middle of a rapidly changing part of DC’s Shaw neighborhood, where an improving standard of living and an aversion to the usual “Starbucks” options appear to partially drive traffic to Azi’s Cafe. Perhaps more importantly, Azeb and her staff are some of the warmest and most dedicated people in the business, and their service clearly helps with customer loyalty. Furthermore, for the time being, there is very little direct competition in the immediate area.

The menu of light fare boasts decent pastry, soup, salad and panini options (the roasted turkey breast, tomato, cheddar, and garlic spread panini goes for $6.50).  Personally, I often find myself succumbing to the flavorful biscotti displayed in large glass containers in front of the cashier – it perfectly complements a warm frothy cappuccino on a cold day.

Generally, the coffee is above average for Washington and I’ve grown fond of their cappuccinos.  They use Illy coffee and have a stand of retail Illy for sale proudly exhibited in their front window.

Having sampled an Illy espresso across town at the Illy shop at the Renaissance M Street Hotel, I was excited to see how Azi’s compared.  The coffee itself was definitely up to par, bold and complex from start to finish.  The cup they used in my case was a designer Illy cup – of my choosing – that was plenty warm from sitting atop the French-made UNIC machine. The quantity of crema was less than sufficient, though, and I would wager a guess that the machine could be the problem. I’ll undoubtedly try another espresso here before making a final judgment on the quality of their coffee and their ability to make drinks.  It also appears that they keep a pretty steady line of customers asking for both specialty drinks and regular cups of coffee during this time of year.

I’ve never visited this locale without a pleasant and eclectic mix of music weaving through the small shop.  The southern wall is dotted with a few electric sockets for those who tote laptops and want to use the free wi-fi. Others may choose between a few tables in the middle of the shop and a couple of two-seater tables squeezed in between columns with plenty of natural light on the north side (sorry, no electric plugs on this side of the shop).

Whether for a hot bowl of soup, a freshly made salad, a steamy latte or a shot of espresso – Azi’s is quickly becoming an institution in the Shaw neighborhood and – with over five years of business in this locale – Azeb Desta seems satisfied that things are going in the right direction.  She does think, though, that the last five years have gone by quickly, and that both the neighborhood and the clientele have changed just as quickly.  Azi’s Café is one of very few businesses thriving in this section of NW and it will be interesting to see how much or how little she changes in the next five years in order to maintain a successful enterprise.

Café Hound will undoubtedly continue to frequent her shop and wishes her the best in growing her business.

Angolan Coffee: Cafe Ginga Lobito

AngoNabeiro / Cafe Delta / Cafe Ginga
Estrada do Cacuaco Km 5
PO Box 5727, Luanda
Email: anabeiro@snet.co.ao
Tel: +244 222 840161 / 62

How is the coffee?  How well is it delivered?

My expectations for any coffee that is roasted in a hot and muggy coffee-producing country and transported to the United States in luggage are generally pretty low.  Opportunities for the coffee to be damaged by heat, humidity, and poor packaging are far too great. Upon receiving this kilogram of roasted whole bean coffee I politely thanked the gift bearer and placed any hope of this coffee stimulating my palate far from the reach of reality.  A couple of days later, I used my 480-watt Baratza Virtuoso burr grinder to grind up a fine espresso sample of the beans for use in a modified Gaggia Classic machine with a Rancilio Silvia wand.  About 23 seconds later, a full Illy cup of syrupy espresso was ready to be slurped.  My first surprise was that the machine pulled the shot so well on a first try.

After sipping the shot I was surprised again by the freshness and fruitiness of the drink.  The aroma of the beans was not nearly as satisfying as the drink itself.  The quality of the beans themselves did leave a little to be desired.  The roast was not consistent enough to be considered specialty quality – with some beans barely brown and others burnt to a crisp. Also, some were very small and damaged while others were huge.  Furthermore, I found a piece of metal wire resting in between a few beans when I was pouring the bag into a storage container – reflecting less than ideal quality control standards by the processing company. The packaging for the beans is metalized with an additional layer of multicolored labeling and a valve that allows gases to escape after sealing – high-quality packaging meant for beans that a company would expect to export and/or sell retail.

Again, the taste was exotic and I was encouraged enough to make an entire pot of drip coffee with the same beans.  The end result was a bit less to my specific liking – I like a brighter coffee with a lighter roast and a milder finish.  On colder days, though, I like a drip coffee with a bit more character in the body than my usual Central American and Colombian varieties.  I’ve begun mixing some beans from Cundinamarca, Colombia with my Angolan coffee, which apparently originates on an estate (fazenda) called Lobito (not to be confused with the port city of the same name), and am pleased to drink this blend in both espresso and drip coffee form.

What’s in a name? Ginga’s backstory

The Ginga (Njinga) name is distinctly Angolan, as it refers to a queen dating back to the times of the Ndongo Kingdom in Africa.  The Ndongo Kingdom was originally a tributary kingdom of the Kingdom of Congo – existing before the Portuguese colonizers arrived in 1482.  The Ndongo Kingdom was governed by Ginga’s father, Ngola Kiluange (Kiluanji), when the Portuguese arrived. He fiercely resisted the Portuguese as well as all other foreigners until his eventual decapitation. The Portuguese attributed the name Angola to the lands now known as Angola, not knowing or caring that Ngola referred to the ruler, not the lands.

Queen Ginga is a legendary figure in African history and an object of pride in Angola, as she is viewed as one of Angola’s most shrewd diplomats, rulers, military minds and intelligent leaders.  So much is written on her that her entire history appears to be in dispute and includes elements of near-mythology – certainly originating from the 16th century equivalent of smear campaigns and propaganda.  She is rumored to, at times, have adopted cannibalism, a very pious Catholic lifestyle, and – according to the Marquis de Sade’s “Philosophy in the Bedroom” – she sacrificed members of her all-male harem of lovers immediately after lovemaking. In other words, there is much mystery and intrigue surrounding her life but she is most certainly a key historical figure in the Angolan national identity.

Throughout her political career, Queen Ginga both resisted and compromised with her Portuguese occupiers.  There seems to have been a relative interdependency between Ginga and Portugal.  She converted to Christianity, adopted tribal customs, and went to war with the crown and neighboring tribes – whatever ensured her survival.  Perhaps this is why the brand name Ginga is appropriate for a coffee company that claims to be 100% Angolan, yet is very much entangled in a past connected to Portugal. Ginga is one of two coffee brands connected to a holding company called AngoNabeiro, the other being Delta Café (a widely known Portuguese brand).  AngoNabeiro is part of a Portuguese conglomerate known as Nabeirogest, or more informally, Grupo Nabeiro.  One of the strongest performing companies in this group is Café Delta.  Café Delta dominates the coffee market in Portugal, is expanding rapidly in Angola and Brazil, and has long been active in segments of the East Asian market for roasted coffee (see Macau).

But the Portuguese connection dates back to before Angolan independence. AngoNabeiro was setting up coffee production operations in 1973, right before Portugal experienced a coup d’état in 1974. As part of a larger Portuguese agreement, Angola was liberated from colonization through the Alvor Agreement (Acordo do Alvor) in 1975.  Between 1975 and 2002, Angola endured a violent civil war that ravaged the countryside and made sustaining its agricultural economy very unpredictable. As in nearly all civil conflicts, land/property rights were constantly challenged, creating terrible instability for coffee farm owners.

During the earlier part of the difficult times in Angola, Rui Patricio oversaw daily operations and ownership of AngoNabeiro inside of Angola.  Production continued, although in very small quantities, until 1983, when the company closed due to a lack of technical assistance and know-how.  The physical infrastructure where AngoNabeiro’s main facility was located was loosely protected, unproductively, until 1998, when Delta Café proposed a revitalization of its coffee production in Angola.  By 2000, the Café Ginga brand emerged and by 2002 the civil war in Angola finally ended. Café Ginga and AngoNabeiro have grown steadily since, with an estimated US$1.2 million of annual revenues in 2005 according to Director General Rui Melo. Part of their growth has been thanks to a business structure where the mixed-capital Angolan company, AngoNabeiro, benefits from Grupo Nabeiro’s know-how and financial largesse (capital and cash-on-hand). Café Delta is one of many companies housed within Grupo Nabeiro and it has been tremendously successful over the past decade.  As Ginga changes outside perceptions of high quality coffee within the Angolan market, their ambitions are set on carving out market share in nearby South Africa and other countries in their immediate vicinity.

Rui Melo interview on history of AngoNabeiro (Portuguese): http://www.winne.com/dninterview.php?intervid=1686

Mr. Rui Melo
Manager / Director General of AngoNabeiro