How Yahoo Japan serves more than 40 million users with help from Pivotal, Spring, and SRE

December 4, 2019 Madison Schlegel

With over 93 million unique browsers per day, and one-third of Japan’s population actively using at least one of its services, Yahoo Japan understands scale. However, with technology and user demands constantly changing, the company’s more than 3,000 software engineers need the right tools to keep delivering value and speed to its customers. Pivotal Platform plays a big part in helping them accomplish that goal.

In their talk at SpringOne Platform 2019, Cloud Platform Manager Yusuke Kondo and Software Engineer Akinori Nitta elaborated on some of the distinct challenges Yahoo Japan faces around operating at this scale, as well as some of the solutions it has devised for solving them. Among the primary challenges are:

  • Role management across multiple clusters.

  • Routine configuration for multi-foundation environments.

  • Developer implementation errors.

  • Increasing log traffic.

  • An “explosion” of pipelines to maintain.

  • 24/7 multi-cluster monitoring.

You can watch the video for specifics on how Yahoo Japan resolved each of these challenges, both technologically and procedurally.

If you’re wondering what Yahoo Japan’s scale translates to in terms of footprint, Kondo summed it up: “We now have more than 10,000 apps in production using more than 40,000 [application instances], supporting more than 180,000 requests per second.” 

From an infrastructure perspective, that translates to 16 (soon to be 24) PaaS clusters, managed by 25 people spread across an SRE team and a CRE team. Notably, while Yahoo Japan’s application and application-instance counts grew by about 10x from October 2018 to October 2019, it added only four SREs during that timeframe.

In addition to some of the technical and procedural improvements Kondo and Nitta explain in detail during the session, Yahoo Japan has also benefited from embracing Buildpacks and turning to Java and Node.js as its preferred programming languages. In 2016, most development was done in PHP, but today approximately 75% of its Buildpacks support Java or Node.js, compared with only 6% for PHP.

For a little extra flavor, here’s an excerpt from the session, in which Kondo explains the unique responsibilities of Yahoo Japan’s SRE and CRE teams:

"The PaaS team is composed of two teams: CRE and SRE. CRE has a mission to provide value to developers, while SRE has a mission to improve overall system reliability. Each team works together to maximize the productivity and value of the Pivotal Platform.

 

"The CRE team's mission is to focus on the platform users [and] the engineer's productivity improvement. There are three main responsibilities here. Firstly, to make the platform easy to use and provide developers with more productivity by developing service broker APIs or some useful tools, libraries, and CLI plugins that help engineers to ship their apps more frequently. Secondly, to be the contact point of developers to provide feedback and raise issues. The team is also responsible for proactively detecting problems with apps before they become problems, and reaching out to developers to improve them. Lastly, the CRE team is responsible for education, providing workshops and documenting best practices, as well as providing architectural guidance.

 

"The mission of the SRE team is to maximize system reliability. There are three big responsibilities. Firstly, defining SLOs and setting up a monitoring scheme to achieve those targets. . . . Secondly, supporting requests from the CRE team. The CRE team gathers feedback from engineers and works closely with the SRE team to deliver improvement. Lastly, the SRE team is also responsible for platform updates and enabling new platform features, and also logging metrics for developers. They always try to automate their operations as much as possible to eliminate toil."
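The first SRE responsibility Kondo describes, defining SLOs and monitoring against them, is often expressed as an error budget: the number of failures a service can absorb before it misses its target. The sketch below is not from the talk. The 99.9% target, the request counts, and the function names are all assumptions chosen for illustration.

```python
def error_budget(slo_target: float, total_requests: int) -> int:
    """Return how many requests may fail in a window without breaching the SLO.

    slo_target is a success ratio such as 0.999 (an assumed example value,
    not a figure reported by Yahoo Japan).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests < 0:
        raise ValueError("total_requests must be non-negative")
    return round((1.0 - slo_target) * total_requests)


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent.

    A negative result means the SLO has already been breached for the window.
    """
    budget = error_budget(slo_target, total_requests)
    if budget == 0:
        raise ValueError("window is too small to carry an error budget")
    return 1.0 - failed_requests / budget


if __name__ == "__main__":
    # Hypothetical window: one million requests measured against a 99.9% SLO.
    print(error_budget(0.999, 1_000_000))
    print(budget_remaining(0.999, 1_000_000, 250))
```

To give a sense of magnitude: if the 180,000 requests per second Kondo cites were sustained for 30 days, a 99.9% target (again, an assumed figure) would leave a budget of roughly 467 million failed requests across the whole platform. Monitoring how fast that budget is being spent is one common way to decide when reliability work should take priority over new features.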

Learn more about Yahoo Japan, SRE, and scaling PAS

Automation and Culture Changes for 40M Subscriber Platform Operation

SRE and the value of treating operations as a software problem

Thinking in Error Budgets: How Pivotal’s Cloud Ops Team Used Service Level Objectives and Other Modern SRE Practices to Improve Outcomes

Pivotal Platform at T-Mobile

 

About the Author

Madison Schlegel

Madison is a Customer Analytics Manager for the Pivotal Vanguards, a super user community centered around customer advocacy. Prior to Pivotal, Madison worked in a research department at the University of California, Berkeley, where she earned a B.A. in English and Communications.
