Key Takeaways: There are many decisions and tradeoffs that must be made when moving from batch ETL to stream data processing; engineers should not "stream all the things" just because stream-processing technology is popular. The Netflix case study presented here migrated to Apache Flink.
Arora, a senior data engineer at Netflix, began by explaining that the key goal of the presentation was to help the audience decide whether a stream-processing data pipeline would help resolve problems they may be experiencing with a traditional extract-transform-load (ETL) batch-processing job.
In addition, she discussed the core decisions and tradeoffs that must be made when moving from batch to streaming. The Netflix system uses the microservice architectural style, and services communicate via remote procedure calls (RPC) and messaging.
At a high level, microservice application instances emit user- and system-driven data events that are collected within the Netflix Keystone data pipeline — a petabyte-scale real-time event stream-processing system for business and product analytics.
Batch-processed data is stored within tables or indexers like Elasticsearch for consumption by the research team, downstream systems, or dashboard applications. There are clear business wins for using stream processing, including the opportunity to train machine-learning algorithms with the latest data, provide innovation in the marketing of new launches, and create opportunities for new kinds of machine-learning algorithms.
There are also technical wins, such as the ability to save on storage costs (raw data does not need to be stored in its original form), faster turnaround time on error correction (long-running batch jobs can incur significant delays when they fail), real-time auditing on key personalization metrics, and integration with other real-time systems.
A core challenge when implementing stream processing is picking an appropriate engine. The first key question to ask is whether the data will be processed as an event-based stream or in micro-batches. If results are simply required sooner than currently provided, and the organization has already invested heavily in batch, then migrating to micro-batching could be the most appropriate and cost-effective solution.
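The distinction between the two models can be sketched framework-agnostically: a per-event processor handles each record as it arrives, while a micro-batch processor buffers records into small groups before applying the same logic. This is a minimal illustrative sketch; the function and variable names are invented for this example, not taken from the talk.

```python
from typing import Callable, Iterable, Iterator, List

def process_per_event(events: Iterable[dict], handle: Callable[[dict], dict]) -> Iterator[dict]:
    """Event-at-a-time: a result is available as soon as each event arrives."""
    for event in events:
        yield handle(event)

def process_micro_batches(events: Iterable[dict], handle: Callable[[dict], dict],
                          batch_size: int) -> Iterator[List[dict]]:
    """Micro-batching: events are buffered and processed in small groups,
    trading a little latency for batch-style amortized overhead."""
    batch: List[dict] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield [handle(e) for e in batch]
            batch = []
    if batch:  # flush the final partial batch
        yield [handle(e) for e in batch]

events = [{"id": i} for i in range(5)]
tag = lambda e: {**e, "seen": True}

streamed = list(process_per_event(events, tag))        # five individual results
batched = list(process_micro_batches(events, tag, 2))  # batches of 2, 2, 1
```

The same handler runs in both modes; only the delivery granularity changes, which is why a team that mainly needs fresher results can often adopt micro-batching without rewriting its core logic.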
The next challenge in picking a stream-processing engine is to ask what features will be most important in order to solve the problem being tackled.
This will most likely not be an issue that is solved in an initial brainstorming session; often, a deep understanding of the problem and data only emerges after an in-depth investigation. Each engine supports these features to varying degrees and with varying mechanisms.
Another question to ask is whether the implementation requires the lambda architecture. This architecture is not to be confused with AWS Lambda or serverless technology in general — in the data-processing domain, the lambda architecture is designed to handle massive quantities of data by taking advantage of both batch-processing and stream-processing methods.
It may be the case that an existing batch job simply needs to be augmented with a speed layer, and if this is the case then choosing a data-processing engine that supports both layers of the lambda architecture may facilitate code reuse.
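The code-reuse argument can be made concrete with a small sketch: in a lambda architecture, the batch and speed layers run the same core transform over different slices of the data, and a serving layer merges their views. All names and the metric below (a tally of playbacks per homepage source) are hypothetical, chosen only to illustrate the structure.

```python
from typing import Dict, Iterable

def count_plays_by_source(events: Iterable[dict]) -> Dict[str, int]:
    """Core transform shared by both layers: tally playbacks per homepage source."""
    counts: Dict[str, int] = {}
    for event in events:
        source = event["source"]
        counts[source] = counts.get(source, 0) + 1
    return counts

def batch_layer(historical_events: Iterable[dict]) -> Dict[str, int]:
    # Batch layer: periodically recomputes over the full history.
    return count_plays_by_source(historical_events)

def speed_layer(recent_events: Iterable[dict]) -> Dict[str, int]:
    # Speed layer: applies the SAME transform to events since the last batch run.
    return count_plays_by_source(recent_events)

def serve(batch_view: Dict[str, int], speed_view: Dict[str, int]) -> Dict[str, int]:
    # Serving layer: merges the two views into one up-to-date answer.
    merged = dict(batch_view)
    for source, n in speed_view.items():
        merged[source] = merged.get(source, 0) + n
    return merged

history = [{"source": "billboard"} for _ in range(3)]   # already batch-processed
recent = [{"source": "search"}, {"source": "billboard"}]  # not yet in a batch run
view = serve(batch_layer(history), speed_layer(recent))
```

Because both layers call `count_plays_by_source`, an engine that supports both batch and streaming lets the team maintain that transform once, which is the reuse benefit described above.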
Several additional questions to ask when choosing a stream-processing engine include: What are other teams using within your organization?
If there is a significant investment in a specific technology, then existing implementation and operational knowledge can often be leveraged. What is the landscape of the existing ETL systems within your organization? Will a new technology easily fit in with existing sources and sinks?
What learning curve is acceptable? What engines do you use for batch processing, and what are your organization's most widely adopted programming languages? The Netflix DEA team previously analyzed sources of play and sources of discovery within the Netflix application using a batch-style ETL job that could take longer than eight hours to complete.
Sources of play are the locations on the Netflix application homepage from which users initiate playback. Sources of discovery are the locations on the homepage where users discover new content to watch. The ultimate goal of the DEA team was to learn how to optimize the homepage to maximize discovery of content and playback for users, and to improve the overly long latency between events occurring and their analysis.
Real-time processing could shorten this gap between action and analysis. Ultimately, Arora and her team chose Apache Flink with an ensemble cast of supporting technology: Apache Kafka acting as a message bus; Apache Hive providing data summarization, query, and analysis using an SQL-like interface (particularly for metadata in this case); Amazon S3 for storing data within HDFS; the Netflix OSS stack for integration into the wider Netflix ecosystem; Apache Mesos for job scheduling and execution; and Spinnaker for continuous delivery.
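The logical shape of such a pipeline (Kafka source, deserialization, metadata enrichment, sink) can be sketched as composed stages in plain Python. This is a stand-in for Flink's operator chaining, not Flink code; the event fields, titles, and function names are invented for illustration.

```python
import json

def kafka_source(raw_messages):
    """Stand-in for the Kafka consumer: yields raw JSON strings."""
    yield from raw_messages

def parse(stream):
    """Deserialize each raw message into an event dict."""
    for msg in stream:
        yield json.loads(msg)

def enrich(stream, metadata):
    """Attach title metadata to each event (here, from a local dict)."""
    for event in stream:
        yield {**event, "title_name": metadata.get(event["title_id"], "unknown")}

def sink(stream):
    """Stand-in for the table/Elasticsearch sink: collect results."""
    return list(stream)

metadata = {42: "example-title"}
raw = ['{"title_id": 42, "action": "play"}',
       '{"title_id": 7, "action": "play"}']
results = sink(enrich(parse(kafka_source(raw)), metadata))
```

Each stage consumes and yields a stream of events, mirroring how a streaming engine moves records operator to operator without materializing the whole dataset between steps.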
An overview of the complete source-of-discovery pipeline can be seen below. Arora outlined the implementation challenges that the DEA team faced with the migration process: Getting data from live sources: The job being migrated required access to the complete viewing history of the user for every playback-initiation event.
This was conceptually easy to implement with stream processing, as the integration with the Netflix stack and real-time nature of the data processing meant that a simple RPC-like call was required for each event as it was processed.
Each item of metadata required within the stream-processing job could have been obtained by making a call in the same fashion as getting data from live sources.
However, this would require many network calls, and ultimately be a very inefficient use of resources. Instead, the metadata was cached in memory within each stream-processing instance, and the data refreshed every 15 minutes.
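The caching approach can be sketched as a small per-instance cache that fetches the full metadata set in bulk and refreshes it once it is older than the 15-minute interval. The class, `fetch_fn`, and key names are hypothetical stand-ins for the real metadata service; the injectable clock exists only to make the sketch testable.

```python
import time

REFRESH_INTERVAL_SECONDS = 15 * 60  # the 15-minute refresh cadence described above

class MetadataCache:
    """In-memory metadata cache held by each stream-processing instance.

    Rather than one network call per event, the full metadata set is fetched
    in bulk and reused; it is re-fetched only once it is older than the
    refresh interval. `fetch_fn` stands in for the real metadata service call.
    """

    def __init__(self, fetch_fn, clock=time.monotonic):
        self._fetch_fn = fetch_fn
        self._clock = clock          # injectable clock for testing
        self._data = None
        self._fetched_at = 0.0

    def get(self, key):
        now = self._clock()
        if self._data is None or now - self._fetched_at >= REFRESH_INTERVAL_SECONDS:
            self._data = self._fetch_fn()  # one bulk fetch, not one call per event
            self._fetched_at = now
        return self._data.get(key)

fetch_calls = []
def fetch_metadata():
    fetch_calls.append(1)
    return {"title-42": "example-title"}

fake_now = [0.0]
cache = MetadataCache(fetch_metadata, clock=lambda: fake_now[0])
cache.get("title-42")   # first access triggers a bulk fetch
cache.get("title-42")   # served from cache; no new fetch
fake_now[0] = 16 * 60   # 16 minutes later: past the refresh interval
cache.get("title-42")   # triggers a refresh
```

The tradeoff is staleness: any metadata change can take up to one refresh interval to be visible, which is acceptable for slowly changing reference data but not for the per-event viewing history discussed above.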
When a batch job fails due to an infrastructure issue, it is easy to rerun the job, as the data is still stored within the underlying object store. This is not necessarily the case with stream processing, as the original events can be discarded as they are processed. Within the Netflix ecosystem, the TTLs of the message bus (Kafka) that stores the original events can be relatively aggressive; due to the volume, retention can be as little as four to six hours.
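The operational consequence can be sketched with simple arithmetic: an outage is recoverable by replaying from Kafka only if it is detected before the oldest affected events expire from the topic. The function name and the choice of the four-hour (most aggressive) end of the retention range are assumptions made for this illustration.

```python
from datetime import datetime, timedelta

# Assumed: the aggressive four-hour end of the four-to-six-hour retention range.
RETENTION = timedelta(hours=4)

def is_replayable(failure_start: datetime, detected_at: datetime,
                  retention: timedelta = RETENTION) -> bool:
    """True if every event produced since the failure began is still retained
    on the topic, i.e. the outage can be recovered by replaying from Kafka."""
    return detected_at - failure_start <= retention

# An outage detected two hours in can be replayed...
quick = is_replayable(datetime(2018, 3, 1, 1, 0), datetime(2018, 3, 1, 3, 0))
# ...but one detected five hours in has already lost its earliest events.
slow = is_replayable(datetime(2018, 3, 1, 1, 0), datetime(2018, 3, 1, 6, 0))
```

This is why aggressive TTLs put a hard bound on failure-detection latency for a streaming job, in a way that batch jobs backed by an object store do not face.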