Last month, the SNIA Cloud Storage Technologies Initiative was fortunate to have artificial intelligence (AI) expert, Parviz Peiravi, explore the topic of AI Operations (AIOps) at our live webcast, “IT Modernization with AIOps: The Journey.” Parviz explained why the journey to cloud native and microservices, and the complexity that comes along with that, requires a rethinking of enterprise architecture. If you missed the live presentation, it’s now available on demand together with the webcast slides.
We had some interesting questions from our live audience. As promised, here are answers to them all:
Q. Can you please define the Data Lake and how different it is from other data storage models?
A. A data lake is another form of data repository with specific capability that allows data ingestion from different sources with different data types (structured, unstructured and semi-structured), data as is and not transformed. The data transformation process Extract, Load, Transform (ELT) follow schema on read vs. schema on write Extract, Transform and Load (ETL) that has been used in traditional database management systems. See the definition of data lake in the SNIA Dictionary here.
In 2005 Roger Mougalas coined the term Big Data, it refers to large volume high velocity data generated by the Internet and billions of connected intelligent devices that was impossible to store, manage, process and analyze by traditional database management and business intelligent systems. The need for a high-performance data management systems and advanced analytics that can deal with a new generation of applications such as Internet of things (IoT), real-time applications and, streaming apps led to development of data lake technologies. Initially, the term “data lake” was referring to Hadoop Framework and its distributed computing and file system that bring storage and compute together and allow faster data ingestion, processing and analysis. In today’s environment, “data lake” could refer to both physical and logical forms: a logical data lake could include Hadoop, data warehouse (SQL/No-SQL) and object-based storage, for instance.
Q. One of the aspects of replacing and enhancing a brownfield environment is that there are different teams in the midst of different budget cycles. This makes greenfield very appealing. On the other hand, greenfield requires a massive capital outlay. How do you see the percentages of either scenario working out in the short term?
A. I do not have an exact percentage, but the majority of enterprises using a brownfield implementation strategy have been in place for a long time. In order to develop and deliver new capabilities with velocity, greenfield approaches are gaining significant traction. Most of the new application development based on microservices/cloud native is being implemented in greenfield to reduce the risk and cost using cloud resources available today in smaller scale at first and adding more resources later.
Q. There is a heavy reliance upon mainframes in banking environments. There’s quite a bit of error that has been eliminated through decades of best practices. How do we ensure that we don’t build in error because these models are so new?
A. The compelling reasons behind mainframe migration – beside the cost – is ability to develop and deliver new application capabilities, business services and making data available to all other applications.
There are four methods for mainframe migration:
- Data migration only
- Re-platforming
- Re-architecting
- Re-factoring
Leave a Reply