OpenAI is quietly launching a developer platform called Foundry that will let customers run its newest machine learning models, such as GPT-3.5, on dedicated capacity. Screenshots of the Foundry documentation, posted to Twitter by users with early access, say the platform is designed for “cutting-edge customers running larger workloads.”
According to the documentation, Foundry allows inference at scale with full control over the model's configuration and performance profile.
Foundry, which appears set for release soon, will provide a fixed allocation of compute capacity dedicated to a single customer. Customers will be able to monitor that capacity with the same tools and dashboards OpenAI uses to build and optimize its own models. Foundry will also let customers decide whether or not to upgrade to newer model versions, and it will offer fine-tuning for the latest OpenAI models.
Foundry will also come with service-level commitments, such as guaranteed performance and engineering support available on scheduled dates. Capacity will be rented as dedicated compute units, with a commitment of either three months or one year, and running an individual model instance will require a specific number of compute units (as indicated in the table below).
Running GPT-3.5 will cost $78,000 for a three-month commitment or $264,000 for a one-year commitment. To put that in context, a single DGX Station, one of Nvidia's recent-generation supercomputers, costs around $149,000.
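To make the two commitment prices easier to compare, the short Python sketch below converts each to a monthly rate. It uses only the two figures quoted above; the function and variable names are illustrative, and it assumes the quoted prices cover the full commitment period.

```python
# Compare the two quoted Foundry commitments on a per-month basis.
# Assumption: each quoted price covers the entire commitment period.

def monthly_cost(total_usd: float, months: int) -> float:
    """Return the average cost per month for a commitment."""
    return total_usd / months

three_month_rate = monthly_cost(78_000, 3)   # three-month commitment
one_year_rate = monthly_cost(264_000, 12)    # one-year commitment

# Relative saving per month from choosing the longer commitment.
saving = 1 - one_year_rate / three_month_rate

print(f"Three-month commitment: ${three_month_rate:,.0f} per month")
print(f"One-year commitment:    ${one_year_rate:,.0f} per month")
print(f"Monthly saving on the one-year term: {saving:.1%}")
```

On these figures, the one-year commitment costs more in total but works out cheaper per month than the three-month commitment.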
Observant Twitter and Reddit users noticed that one of the entries in the pricing table has a 32k maximum context window. (The context window is the text the model takes into account before generating more text; a longer context window lets the model “keep in mind” much more text.) OpenAI's most recent text-generating model, GPT-3.5, has a 4k maximum context window, which suggests that this unnamed model could be the anticipated GPT-4 or a stepping stone toward it.
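As a rough illustration of what a larger context window means in practice, the Python sketch below keeps only the most recent tokens that fit in a window. It rests on several assumptions that are not from the documentation: that “4k” and “32k” mean 4,096 and 32,768 tokens, that older text is simply dropped when the input is too long, and that tokens can be represented as plain list items (real models use subword tokenizers). The function name `visible_context` is hypothetical.

```python
# Illustrative sketch of a context window; not OpenAI's implementation.
# Assumptions: "4k" = 4,096 tokens, "32k" = 32,768 tokens, and the
# oldest tokens are discarded when the input exceeds the window.

def visible_context(tokens: list, max_context: int) -> list:
    """Return the most recent tokens that fit in the context window."""
    if max_context <= 0:
        return []
    return tokens[-max_context:]

# Stand-in for a long conversation of 40,000 tokens.
conversation = list(range(40_000))

small_window = visible_context(conversation, 4_096)
large_window = visible_context(conversation, 32_768)

print(len(small_window))  # tokens visible with the smaller window
print(len(large_window))  # tokens visible with the larger window
```

Under these assumptions, the larger window lets the model see eight times as much of the same conversation before anything is cut off.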
OpenAI is under growing pressure to turn a profit following a large investment from Microsoft. The company is estimated to generate $200 million in revenue in 2023, a small sum compared with the more than $1 billion invested in the business so far.
Compute costs are largely to blame. Training state-of-the-art AI models can cost millions of dollars, and running them is not necessarily much cheaper. According to Sam Altman, co-founder and CEO of OpenAI, it costs a few cents per chat to run ChatGPT, the company's viral chatbot, which had more than a million users by last December. That cost adds up quickly.
In its push toward profitability, OpenAI has launched ChatGPT Plus, which starts at $20 per month, and has teamed up with Microsoft to build Bing Chat, which has attracted a fair amount of attention. OpenAI also reportedly plans to launch a ChatGPT mobile app at some point and to bring its AI language technology to Microsoft applications such as Word, PowerPoint and Outlook.
Separately, OpenAI offers an enterprise-oriented model-serving platform through Microsoft's Azure OpenAI Service. It also maintains Copilot, a premium code-generating service developed in partnership with GitHub.