The Perfect AI Storage: Trino From Facebook And Iceberg From Netflix?
The Next Platform, Tuesday, April 30th, 2024
When it comes to solving data analytics problems at scale, it is tough to beat the hyperscalers. And that is why a combination of technologies that were originally developed at Facebook (now Meta Platforms) and Netflix could end up being the perfect pairing to create a 'lakehouse' underpinning AI training and other applications.
Not surprisingly, everyone who builds a high-performance all-flash storage array, or a parallel file system of the kind commonly used for HPC simulation and modeling applications, will try to convince you that their iron is the best one to use for storing the massive amounts of data that are required to train AI neural networks.
The big clouds - notably Amazon Web Services, Microsoft Azure, and Google Cloud - all have object storage and file systems that they want you to use for storing raw data for AI training. Snowflake, the darling of cloudy data warehousing with a SQL interface, has also won its share of business as the storage layer underneath AI training runs.