Today we will discuss the case of the Yahoo Groups shutdown.
Yahoo Groups was an internet forum hosting open discussion groups. It opened in 2001 and shut down in December 2019, with all content deleted.
Bear in mind that Yahoo was one of the best-known and most powerful internet companies of the 2000s, similar to the Google of today. It steadily declined and was bought for scrap by Verizon in 2017.
How does software come to life and go extinct in an enterprise?
I use “enterprise” here to designate any large organization, whether it sells a commercial product or not.
New software and products usually get written by small teams, of their own will or on a wider company initiative.
In this specific case, an internet forum, this could have been done by a small team.
The rise of the internet in the 2000s gave rise to HTTP discussion boards. There was a large variety of boards (general or topic-specific), hosting offerings to run your own board (GoDaddy and others), and frameworks to create your own board (phpBB and co). Yahoo Groups surfed the wave at the right time.
The product is launched and henceforth begins its lifecycle.
Users are signing up (or not).
The original team is working on it, keeping it running, adding features and ironing out bugs.
5 years later
Slowly over time, software rots and company goals shift. 5 years later, it’s time for the first bump.
The original team is long gone by now. They left for new companies or transferred internally to new roles.
There are two things that come into play to decide whether the product will live on or be shut down: upkeep and revenue.
- Upkeep is how much work it takes to keep the product running.
- Revenue is how much money the product brings in.
Revenue is the primary reason a company delivers a product (I should say “sells” instead of “delivers”): companies exist to make money. If a product brings in significant revenue or customers, it will continue to exist no matter what. The development team will still be in place (albeit with different members) and will have grown bigger than it was initially.
Revenue is next to nothing in the case of Yahoo Groups. It’s a free product that doesn’t generate much revenue (I don’t think it has much in the way of ads or business plans). It means that internally the Yahoo Groups team would have little budget for headcount and little political power.
Upkeep is an additional factor on top of that: how many resources it takes to keep the product running, operations folks to watch it, developers to improve and maintain it, servers and the AWS bill, etc. A forum can have minimal upkeep, running for years in the background with nobody taking care of it.
If a product has high upkeep, it can work out as long as it has high revenue too (this could actually be a good candidate for empire-building and promotions in an enterprise). The opposite (high upkeep with no revenue) would have the project crumble under its own weight, once the initial building stage and the goodwill granted to a new initiative are over.
The 5-year mark is an important milestone. Everything is outdated and there is suddenly a mountain of work to do (assuming there was little maintenance along the way). Think PHP 5.0 is outdated and has security vulnerabilities, MySQL X.Y can’t be downloaded anymore, Debian Pumpkin is one or two major releases behind the current one, and all the physical servers are about to be retired as they reach their 3-5 year replacement cycle. Hiring and retaining are getting more difficult: working with PrehistoricalHP to dig up a dinosaur product is not a selling point.
Note: Some folks call that technical debt, but it’s not an appropriate term; it’s really just regular maintenance. When you replace the brake pads on your car after 80 000 kilometers, you don’t say the previous mechanic left indebted pads all over the place.
These issues can and do kill software (and cars) abruptly overnight, in a “stops and doesn’t restart” kind of way (webserver shut down, database gone, source code lost, remote hack and delete). Assuming the product is used and of any importance, it’s extremely bad from a PR and practical standpoint, so the company will immediately put some folks on fixing it (PR and production incident = budget+headcount).
That’s why, if there is no intention or resources to maintain a product, the product should be discontinued ASAP, meaning giving notice to customers now and shutting down later.
In conclusion, upkeep and revenue drive whether the product has a chance to survive. Yahoo Groups survived the first time.
Yahoo Groups has low revenue but low upkeep. Bear in mind that a 2000s internet giant may have been counting users instead of revenue. It might also have been part of a bigger plan around the Yahoo portal and controlling discussion boards at the time.
10 years later
Round two. More challenges, mightier challenges.
You’re either facing a main or a background product. The main product is well established and sustained by dedicated team(s); it’s the bread and butter of the company. The background product is chugging along with minimal periodic maintenance, typically done by people pulled arbitrarily off their own projects.
First things first, the 5-year cycle is repeating. The whole stack and runtime are obsolete and in dire need of maintenance, same as before. It’s already handled or in progress if there is a dedicated team; otherwise it can be challenging.
Next. The world changes a lot in a decade. Is the product still relevant? Is it the right product?
The iPhone happens in 2007, shortly followed by the mobile revolution. Pretty much every web product has to be adjusted to be mobile friendly: small-screen-with-touch in short, but there’s more to it. Does the product as-is make sense on mobile? Can it take advantage of the new platform?
The landscape, the competitors and the needs have changed. A company doing paper or fax management is facing paper usage going down a few percent each year (and so are the customer base and revenue), a tough business to be in. SourceForge/SVN is being replaced by GitHub/Git. Google Groups, Facebook groups and Reddit are substitutes for Yahoo Groups.
A product may only go on if it still makes sense and makes money. A shrinking market or a shift in usage can be the end of it.
Bear in mind that when closing a product or a business, it doesn’t have to be -already- losing money or losing customers, the mere indication of a future downturn can be enough to shut it down.
Yahoo is not doing great, but Yahoo Groups has nearly one million unique daily visitors in 2011. With the advances in web advertising, it could make money if monetized appropriately, though I don’t know if it would be enough to cover operating costs.
Repeat the last two events.
Yahoo the company is a shadow of its former self, steadily on the decline for years, desiccated by Google and Facebook. It is acquired and absorbed by Verizon in 2017.
Yahoo and Yahoo Groups die here. An abandoned product, with little revenue and little market share, outside of Verizon’s expertise and goodwill (turns out they’re not into running free internet discussion boards).
Remember the first part. Software requires people, budget and time to keep running (the upkeep). Verizon has none to offer.
Why not archive it?
Remember the last line? Software requires people, budget and time to run. Well, same thing to archive/migrate it. It’s tough, hard work.
Therefore it simply can’t be done. Dying projects don’t get the luxury of having (tens of) people assigned to them.
Think corporate: the poor suckers in this story are not the millions of users and messages who will be dropped, but the person who had to write and make the announcement. Best case scenario, he is borrowed from another department to take care of this stuff that doesn’t contribute to his personal advancement; worst case scenario, he is the last man standing in Yahoo Groups and about to be laid off.
Given the timeline, announcement in October to delete in December, Verizon was very much in a hurry to kill it.
How much work is archiving?
It depends on whether an archive means a dump of a database or some other form.
First, no, do not dump any database and hand it over to archive.org. Think for a minute: how many usernames, passwords, emails and other pieces of private information are there in full database dumps? That’s not something to give away, nor to keep around.
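To make the point concrete, here is a minimal sketch of the opposite approach: exporting through an allow-list of public fields instead of dumping whole tables. The column names are hypothetical, not the actual Yahoo Groups schema, and a real export would also have to deal with private data embedded in message bodies.

```python
# Hypothetical column names; a real forum schema would differ.
PUBLIC_FIELDS = ("message_id", "thread_id", "posted_at", "subject", "body")


def public_view(row):
    """Return a copy of a message row restricted to allow-listed fields."""
    return {key: row[key] for key in PUBLIC_FIELDS if key in row}


# Example row as it might come out of a database cursor (made-up values).
row = {
    "message_id": 1,
    "thread_id": 7,
    "posted_at": "2005-03-01T12:00:00Z",
    "subject": "Welcome",
    "body": "hello",
    "author_email": "someone@example.com",
    "password_hash": "not-for-export",
    "ip_address": "203.0.113.5",
}
safe = public_view(row)
```

An allow-list fails safe: a column nobody knew about stays out of the export, whereas a deny-list would leak it.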
Things to consider when moving data:
- Where does it run?
- How to access it?
- What to convert it to? (a board can’t be a simple CSV for example)
- What’s the volume?
- How many threads and messages?
- Attachments? (image hosting included or something?)
- Unicode handling
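As an illustration of the conversion and Unicode questions, here is a minimal sketch that writes one JSON document per thread, one thread per line. The thread structure and field names are assumptions made up for the example. Nested messages and attachment references fit naturally in JSON, which is why a flat CSV falls short.

```python
import json


def dump_threads(threads):
    """Serialize threads as JSON Lines: one JSON object per line.

    ensure_ascii=False keeps non-ASCII text readable instead of \\uXXXX escapes.
    Newlines inside message bodies are escaped by json.dumps, so each record
    stays on a single line.
    """
    return "".join(json.dumps(t, ensure_ascii=False) + "\n" for t in threads)


def load_threads(text):
    """Parse JSON Lines back into a list of threads.

    Split on "\\n" only: str.splitlines() would also split on Unicode line
    separators that may legitimately appear inside a message.
    """
    return [json.loads(line) for line in text.split("\n") if line]


# Hypothetical thread layout, for illustration only.
threads = [
    {
        "thread_id": 7,
        "subject": "Café meetup",
        "messages": [
            {"author": "alice", "body": "héllo\n世界", "attachments": ["photo1.jpg"]},
            {"author": "bob", "body": "see you there", "attachments": []},
        ],
    }
]
archive = dump_threads(threads)
restored = load_threads(archive)
```

The output must be written and read as UTF-8 for the round trip to hold on disk.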
Lowest estimate: for the fastest man on the planet, it would take more than a week full-time to be able to dump something like that.
More moderate estimate, it would be a few months of work (many days spread here and there in-between other projects).
I’m cheating here: I’ve done work like that before, so I know very well how long it takes and what the challenges are along the way. The timeline is de facto months just because there are different steps: figure out the setup, estimate the options, make announcements, make it read-only, remove public access, decommission the software, decommission the hardware.
Decommissioning ancient software is hard work. Figuring out the setup and getting access (SSH, database passwords and equivalent) is daunting; it might be close to impossible if credentials are lost or access is blocked by enterprise security.
The sheer scale of data is a massive pain to deal with. Wikipedia says 113 million accounts, 9 million groups, countless messages, and 1 million unique daily visitors in 2010. We’re talking hundreds of gigabytes of data even if messages are only a “hello” placeholder and user profiles are empty.
It takes on the order of day(s) to transfer terabytes of data between datacenters, assuming a stable 100 Mbps connection. I work with databases from time to time in my current role, with typical sizes of 500 GB or 20 TB; any task is easily a full day because of the time to transfer, and we’re lucky the company has dedicated 10 Gbps links across the Atlantic (that cost a fortune, by the way).
Size and transfer time will significantly slow down the project (relevant xkcd). Expect full dumps to not fit on local disks, by the way. While one can buy an 8 TB external disk for a few hundred dollars, an employee doesn’t choose what storage is in their computer or servers.
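The arithmetic is easy to check. A rough sketch, assuming decimal units (1 TB = 10^12 bytes) and a link that stays fully saturated, which real transfers rarely manage; the `efficiency` parameter is an assumption added to model that:

```python
TB = 10**12  # decimal terabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes


def transfer_days(size_bytes, link_bits_per_s, efficiency=1.0):
    """Days needed to move size_bytes over a link of the given speed.

    efficiency is the fraction of the nominal bandwidth actually achieved
    (1.0 means a perfectly saturated link).
    """
    seconds = size_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / 86_400  # seconds in a day


one_tb_slow = transfer_days(1 * TB, 100e6)    # 1 TB at 100 Mbps: just under a day
twenty_tb_slow = transfer_days(20 * TB, 100e6)  # 20 TB at 100 Mbps: about 18.5 days
twenty_tb_fast = transfer_days(20 * TB, 10e9)   # 20 TB at 10 Gbps: about 4.4 hours
```

Even on a 10 Gbps link, a 20 TB copy eats half a working day before any actual work on the data begins, and one failed transfer doubles it.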
Long story short, avoid the pain and stick to a decommission. It’s the only reasonable thing to do.