Understanding The Generative AI Development Process
InfoWorld, Monday, May 13th, 2024
Developing generative AI applications is very different from developing traditional machine learning applications. These are the steps.
Back in the ancient days of machine learning, before you could use large language models (LLMs) as foundations for tuned models, you essentially had to train every possible machine learning model on all of your data to find the best (or least bad) fit. By ancient, I mean prior to the seminal paper on the transformer neural network architecture, 'Attention is all you need,' in 2017.
Yes, most of us continued to blindly train every possible machine learning model for years after that. It was because only hyper-scalers and venture-funded AI companies had access to enough GPUs or TPUs or FPGAs and vast tracts of text to train LLMs, and it took a while before the hyper-scalers started sharing their LLMs with the rest of us (for a 'small' fee).