As machine learning has matured from research curiosity into industrial-grade technology, the methods and infrastructure needed to support it at scale have evolved alongside it.
Taking advantage of these advances presents both opportunities and risks for startups, almost all of which now lean on machine learning in some form as they compete for a share of their respective markets.
Neural Networks
- Neural networks have changed every aspect of machine intelligence in the software systems we use daily, from recognizing our speech to recommending what appears in our news feeds
- Today’s state-of-the-art systems continue to rely on extremely large, powerful neural networks
- Recent systems for understanding and generating human language, such as OpenAI’s GPT-3, were trained on supercomputer-scale resources
- These new systems can require tens of millions of dollars’ worth of computation to train (a back-of-envelope estimate follows this list)
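To make that figure concrete, here is a hedged back-of-envelope sketch in Python. The total-compute number (3.14e23 FLOPs) is the figure reported for GPT-3 by OpenAI; the sustained-throughput and GPU-hour-price values are illustrative assumptions, not measurements, and the result covers a single training run, ignoring the many experimental runs that typically precede it.

```python
# Back-of-envelope training-cost estimate for a GPT-3-scale model.
# The total-compute figure is the one reported for GPT-3; the
# throughput and price numbers below are assumptions for illustration.

TOTAL_TRAINING_FLOPS = 3.14e23   # reported compute for GPT-3 (175B params)
SUSTAINED_FLOPS_PER_GPU = 1e13   # assumed ~10 TFLOPS sustained per GPU
PRICE_PER_GPU_HOUR = 2.50        # assumed cloud price in USD, varies widely

gpu_seconds = TOTAL_TRAINING_FLOPS / SUSTAINED_FLOPS_PER_GPU
gpu_hours = gpu_seconds / 3600
cost = gpu_hours * PRICE_PER_GPU_HOUR

print(f"GPU-hours: {gpu_hours:,.0f}")    # roughly 8.7 million GPU-hours
print(f"Estimated cost: ${cost:,.0f}")   # roughly $22 million for one run
```

Under these assumptions a single run lands in the tens of millions of dollars; with better hardware utilization or discounted compute the number shrinks, but it stays far outside a typical startup budget.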
Cloud APIs are easier, but outsourcing isn’t free
- The cost of API calls, data storage, and cloud instances will scale along with usage.
- Many companies that use cloud APIs for machine learning today may eventually move to self-hosted or self-trained models to gain more control over their machine learning pipelines; the sketch after this list contrasts the two approaches.
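A minimal sketch of what that transition looks like in code, with the same classification task served two ways. The vendor endpoint and API key are hypothetical placeholders; the self-hosted path uses the real Hugging Face `transformers` pipeline API.

```python
# Hosted API vs. self-hosted model for the same task.
import requests
from transformers import pipeline

def classify_via_cloud_api(text: str) -> dict:
    # Hypothetical vendor endpoint: per-call billing, and no control
    # over model versions or deprecation schedules.
    resp = requests.post(
        "https://api.example-ml-vendor.com/v1/classify",  # placeholder URL
        headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder key
        json={"text": text},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

# Self-hosted alternative: the weights run on instances you control,
# so the marginal cost per call is compute you already pay for.
local_classifier = pipeline("sentiment-analysis")

def classify_self_hosted(text: str) -> dict:
    return local_classifier(text)[0]

print(classify_self_hosted("The onboarding flow was painless."))
```

The trade-off is the one described above: the API path is simpler to ship but its cost scales with every call, while the self-hosted path shifts spending to infrastructure you own.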
Be strategic and keep an eye on the big cats
- Cloud APIs bring their own problems long-term. It’s important to have a strategic exit plan so these APIs do not control your product destiny.
- Keep tabs on what is coming out of the big corporate AI labs.
Pre-trained neural networks give smaller teams a leg up
- A neural network is first trained on a large general-purpose dataset using significant computational resources, and then fine-tuned for the task at hand using far less data and compute (see the fine-tuning sketch after this list).
- The use of pre-trained networks has exploded in recent years as machine learning has been industrialized across many fields and the volume of available data has grown.
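A minimal fine-tuning sketch using the Hugging Face `transformers` library. The checkpoint name (distilbert-base-uncased) is a real public pre-trained model; the two-example dataset is a placeholder standing in for your task-specific labeled data.

```python
# Fine-tune a pre-trained model on a small task-specific dataset.
import torch
from transformers import (
    AutoTokenizer, AutoModelForSequenceClassification,
    Trainer, TrainingArguments,
)

train_texts = ["great product", "terrible support"]  # placeholder data
train_labels = [1, 0]

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

class TinyDataset(torch.utils.data.Dataset):
    """Wraps tokenized texts and labels for the Trainer."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3),
    train_dataset=TinyDataset(train_texts, train_labels),
)
trainer.train()  # adaptation runs quickly on a single GPU
```

The point of the pattern is the asymmetry: pre-training the base model took industrial-scale resources, while this adaptation step runs in minutes on a single GPU, which is exactly the leg up smaller teams get.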
The risks of foundation models: size, cost, and outsourced innovation
- One of the risks associated with foundation models is their ever-increasing scale: when only the largest corporate labs can afford to train them, everyone else is effectively outsourcing that layer of innovation.
- Pre-training on a large general-purpose dataset is no guarantee that the network will perform a new task well on proprietary data; this is worth measuring before committing (see the sketch after this list).
- Dataset alignment and the recency of training data can matter immensely depending on the use case; a model whose pre-training corpus was collected years ago, for instance, knows nothing about products or events that came after.
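One way to make the "no guarantee" point actionable is to score a candidate pre-trained model on a labeled sample of your own data before building on it. The sketch below uses the real `transformers` zero-shot-classification pipeline; `load_proprietary_sample` is a hypothetical stand-in for your own data access.

```python
# Sanity-check a general-purpose pre-trained model on proprietary data.
from transformers import pipeline

def load_proprietary_sample():
    # Placeholder: return (texts, gold_labels) drawn from your domain.
    return (
        ["refund took three weeks", "checkout was instant"],
        ["negative", "positive"],
    )

texts, gold = load_proprietary_sample()
clf = pipeline("zero-shot-classification")  # general-purpose model

correct = 0
for text, label in zip(texts, gold):
    # Top-ranked candidate label is the model's prediction.
    pred = clf(text, candidate_labels=["positive", "negative"])["labels"][0]
    correct += int(pred == label)

print(f"Accuracy on proprietary sample: {correct / len(texts):.0%}")
```

A weak score here is the alignment risk made concrete: whether the pre-training corpus matches your domain is something to measure, not assume.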