Search engines use a structure called an “inverted index” for fast information retrieval over huge corpora of documents. The compute work consists of “indexing” and “querying”. Because of low-latency requirements, storage and compute are tightly coupled in these systems, which prevents them from scaling independently and hurts price/performance. This talk will cover traditional search engine architectures and how Amazon is innovating in this area with a “remote storage” concept in the OpenSearch engine.
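As a rough illustration of the inverted index idea mentioned in the abstract, the following minimal Python sketch maps each term to the set of documents that contain it and answers a query by intersecting those sets. The function names, the whitespace tokenizer, and the sample documents are assumptions made for illustration; this is not how OpenSearch implements indexing or querying.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Indexing step: map each lower-cased term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def query(index, terms):
    """Query step: return the IDs of documents that contain every query term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


# Hypothetical sample corpus, for illustration only.
docs = {
    1: "remote storage for search",
    2: "search engine index",
    3: "remote index storage",
}
index = build_inverted_index(docs)
print(query(index, ["remote", "storage"]))  # documents 1 and 3
```

Because a query only touches the posting sets of its own terms, lookups stay fast even as the corpus grows, which is why the index is usually kept close to the compute that serves queries.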
Education on the product/engine
Innovation Awareness
Brand Awareness
Industry trends on open standard architectures
Past trends and future possibilities
How various standards are coming together
Identification of file system performance metrics
Understanding the influence of various system components on file system performance
Discussion on modelling benchmark performance & using microbenchmarks to investigate components of file system performance
Understanding basic concepts of AIOps
Understanding the relationship of AIOps with storage management and support
Architecture building blocks required to implement AIOps
Learn the internal architecture of Apache Ozone suitable for on-prem workloads
Understand the design of a distributed storage system with both hierarchical file system and KV capabilities
Learn about scalability and performance characteristics of Apache Ozone for analytics workloads
Containerising Distributed Applications
Migrate a distributed storage system onto the Kubernetes platform
Deploy distributed applications on the Kubernetes platform
Get acquainted with Multi-cloud Security Challenges
Identify threat-causing configurations or behaviors
Preventive Actions
Filesystems and protocols
Cloud NAS topologies
Performance engineering challenges
An SSD built on NAND flash with advanced multi-level cell technology provides high storage density but suffers significant performance degradation due to frequent read-retry operations. Although the read-retry mechanism is essential to ensuring the reliability of the flash memory, it can significantly increase SSD read latency by introducing multiple retry steps that read the target page again with adjusted read-reference voltages. To reduce read-retry latency, we exploit two advanced features that are widely adopted in NAND flash-based SSDs: 1) the CACHE READ command and 2) a strong ECC engine. First, the read-retry latency can be reduced with the CACHE READ command, which allows a NAND flash chip to perform consecutive reads in a pipelined manner. Second, a large ECC-capability margin exists in the final retry step and can be used to reduce the chip-level read latency. Based on these findings, we discuss two new techniques that effectively reduce read-retry latency: 1) Pipelined Read-Retry (PR²) and 2) Adaptive Read-Retry (AR²). PR² reduces the latency of a read-retry operation by pipelining consecutive retry steps using the CACHE READ command. AR² shortens the latency of each retry step by dynamically reducing the chip-level read latency depending on the current operating conditions, which determine the ECC-capability margin. These techniques improve SSD response time by up to 31.5% (17% on average) over a state-of-the-art baseline with only small changes to the SSD controller.
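To make the pipelining argument concrete, the sketch below uses a deliberately simplified two-stage latency model: each read attempt consists of a sensing phase on the chip followed by a transfer-plus-ECC-decode phase. The function names, the two-stage split, and all timing values are hypothetical assumptions for illustration; they are not taken from the abstract or from any measured device.

```python
def sequential_retry_latency(steps, t_sense, t_transfer_ecc):
    """Every read attempt runs sensing, then transfer + ECC decode, strictly one after another."""
    return steps * (t_sense + t_transfer_ecc)


def pipelined_retry_latency(steps, t_sense, t_transfer_ecc):
    """Sensing for the next attempt overlaps with transfer + decode of the current one.

    Modelled as a two-stage pipeline: the first sensing phase and the last
    transfer + decode phase cannot be hidden; the slower stage paces the rest.
    """
    if steps <= 0:
        return 0
    return t_sense + (steps - 1) * max(t_sense, t_transfer_ecc) + t_transfer_ecc


def adaptive_pipelined_latency(steps, t_sense, t_transfer_ecc, sense_scale):
    """Same pipeline, with sensing shortened by a hypothetical factor (0 < sense_scale <= 1)."""
    return pipelined_retry_latency(steps, t_sense * sense_scale, t_transfer_ecc)


# Hypothetical numbers: 4 read attempts, 60 us sensing, 30 us transfer + decode.
print(sequential_retry_latency(4, 60, 30))          # 360
print(pipelined_retry_latency(4, 60, 30))           # 270
print(adaptive_pipelined_latency(4, 60, 30, 0.75))  # 210.0
```

In this toy model the pipelined variant hides the transfer and decode time of all but the last attempt, and shortening the sensing phase then reduces the remaining dominant term. How far sensing can safely be shortened depends on the ECC-capability margin, which the model does not attempt to capture.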
Challenges while deploying a solution for device cooling
Machine learning implementation for device thermal management
Recommendations for effective device thermal management based on a priori predictions
Methodology and principle of using DNA for long-term data storage
Process of encoding and decoding the binary data
Pros and cons of each drive type - SSD, HDD, Tape Drive
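The encoding and decoding of binary data for DNA storage listed above can be illustrated with a minimal Python sketch that maps every two bits to one nucleotide. The specific mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for illustration; practical schemes typically add error correction and constraints on the resulting sequence, which this sketch omits.

```python
# Hypothetical 2-bits-per-base mapping, for illustration only.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a nucleotide string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Invert encode(): map each base back to two bits and regroup into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"data")
print(strand)
print(decode(strand))  # b'data'
```

Each byte becomes four bases under this mapping, so the strand length is four times the number of input bytes.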
Understanding of Monolithic and Microservices Architecture
Memory model and sizing considerations for Java-based microservices
Potential tools to monitor and measure microservice memory usage
Terahertz Magnon Laser
RKKY interaction
Ultra-fast magnetic recording without energy dissipation
Understand SPDK block layer
Understand xNVMe and xNVMe BDEV
Understand io_uring passthrough for NVMe
Understand Computational Storage
Offloading of query planner to hardware
Understanding of advantages of the offload mechanism
Explains how s3select improves the performance and reduces the cost of applications that need to access data in S3
Explaining the official S3 operation (introduced by AWS a few years ago) and how it is implemented in Ceph/s3select
Helps in understanding how s3select can be used and where it is beneficial
We examine the benefits of using computational storage devices like Xilinx SmartSSD to offload the compression
Peer-to-peer data flow and its advantages
Advantages of offloading at the file system
The rapid development of the Fintech sector in recent years has brought with it a number of data-related challenges, such as managing data growth, data protection and security against hacking, compliance with anti-money-laundering and data privacy legislation, and demand for new applications in a competitive space. Many Fintech companies must overcome these challenges so they can continue to develop sustainably and build trust among their customer base.
This talk will outline the needs of the market sector and act as a motivation to storage developers, architects, and engineers to produce innovative storage products and solutions to overcome the challenges in the Fintech industry.
Modernization of applications is a major trend in the IT industry. Modernization implies creating containerized applications using microservices and DevOps approaches, and includes transitioning existing applications to containers. As more enterprise workloads move to containers, storage and storage services become a critical part of running containerized applications in a production environment. This talk will describe the current state of storage technology for containerized applications, and the challenges and required solutions for providing enterprise-level storage services for critical applications running in a containerized environment.
Understand the limitations of the existing paths (block I/O and sync passthrough)
Understand how to use the new path (async passthrough)
Understand io_uring based async programming
NVMe-oF-based storage systems
Test library design for complex storage system testing
Introduction to Poseidon OS
Challenges while performing analytics over historical data
Introduction to Streaming World
Pravega's contribution to stream processing and storage system
Driving forces behind CXL and computational storage
Commonalities and differences between the two standards
Use cases and fit of each standard
How automated machine learning can be used for applying dynamic machine learning at scale in production systems
Fundamentals of hyperparameter tuning for a resource-constrained environment
Enhanced AutoML approach for dynamic modelling to get required speed and accuracy
Understanding the cloud-native operating model at the edge
Understanding cloud-native storage services at the far edge, enterprise edge, and network edge
Understanding different data sources (block, object, streaming, etc.) at the edge and their associated storage services
Poseidon Project
Significance of Object Storage in Cloud
How to build Object based STaaS with Kubernetes
Scale Out Cloud Storage
Cloud Storage Testing challenges
Storage Testing Tools and Techniques
What is the SMB multichannel feature?
How to get the best performance out of Azure Files?
How does Linux do SMB multichannel?
Illustrate the different congestion scenarios in a SAN
Describe the working principles of the DIRL solution
Present the benefits of DIRL in contrast to alternative approaches
Know the existing challenges for long-term data archival and current cybersecurity trends
Learn the fundamental needs for data archive protection that led to Zero Trust security measures
Understand the intrusion defence and data manipulation protection that can be achieved through the logical air gapping offered by Zero Trust
Checking validity of configurations
AI-driven methods that can classify systems with issues
Auto remediation of issues
API Security issues
Recent attacks against APIs
Learning the OWASP API Security Top 10
TLS-equipped TCP interfaces
Comparative study between gRPC and secure TCP interfaces
Performance benefits of secure TCP over gRPC interfaces
Learning a state-of-the-art secure peer-to-peer file sharing method
Learning state-of-the-art distributed cloud storage architecture and algorithms
Feedback from storage developers developing distributed cloud storage
Lessons learned on container data protection
Containerised storage usage patterns
Containerised application deployment view
Understand a real-life containerised application
Understand the current state of backup and restore in a container environment
Understand consistent backup and restore of containerised applications
Performance of applications running in a containerized environment under unpredictable conditions
Evaluate the stability and reliability of storage resources used by applications
Evaluate fault tolerance, resiliency and recovery of applications
Introduce Storage Monitoring and Observability
What is available in Kubernetes for storage monitoring
Heterogeneous container storage monitoring
Understand how native services offered by cloud providers fall short for unified data lake environments.
Learn how one can start architecting their storage solutions for multi-cloud (best practices, tools, tips).
Evolution of metrics to measure how multi-cloud storage solutions react to control and data path operations.