When Pinecone launched last year, the company’s message centered on building a serverless vector database designed specifically for the needs of data scientists. While that database remains at the core of what the company does, Pinecone is now focusing on a more refined use case for it: AI-driven search that helps those data scientists find the proverbial needle in the haystack.
When we spoke to Pinecone founder and CEO Edo Liberty last year at the time of his $10 million seed round, his company was just feeling its way, building out the database. He came from Amazon, where he helped build the SageMaker machine learning service. He says that the company has come a long way since then.
“A lot has changed since our seed announcement, so first and foremost we launched our proper production paid service in October, and it’s been growing rapidly both in adoption and revenue since, and so things are going really well,” Liberty said.
He described the reason for a purpose-built database for data scientists at the time of the seed funding this way:
“The data that a machine learning model expects isn’t a JSON record, it’s a high dimensional vector that is either a list of features or what’s called an embedding that’s a numerical representation of the items or the objects in the world. This [format] is much more semantically rich and actionable for machine learning,” he explained.
He says that today that semantically rich approach is driving customers to use Pinecone. “The predominant use of the vector databases is for search, and search in the broad sense of the word. It’s searching through documents, but you can think about search as information retrieval in general, discovery, recommendation, anomaly detection and so on,” he said.
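To make the idea of searching over vectors concrete, here is a minimal sketch of a brute-force similarity search in plain Python. It is an illustration of the general technique only, not Pinecone's API or implementation; the function names, the toy three-dimensional "embeddings" and the choice of cosine similarity are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, vectors, k=2):
    """Return the names of the k stored vectors most similar to the query.

    `vectors` maps an item name to its embedding. This scans every item,
    which is fine for a toy example but not how a production vector
    database would scale to billions of objects.
    """
    scored = [(cosine_similarity(query, vec), name) for name, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical embeddings; real ones come from a machine learning model
# and typically have hundreds of dimensions.
embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

results = nearest([1.0, 0.05, 0.0], embeddings, k=2)
print(results)
```

The same ranking step underlies the broader uses Liberty lists: recommendation looks for the items nearest to something a user liked, while anomaly detection looks for items with no close neighbors.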
The system is organized into pods, which are sets of resources designed to process the data in the Pinecone database. The company offers a single pod for free to help customers get comfortable with the product and perform a simple proof of concept. After that, they start paying based on the number of pods.
He is confident that the company has architected the system in such a way that it can scale to billions of objects. “You’re able to scale to as much as your software is able to actually withstand and you can actually orchestrate. We’ve designed the system such that there really isn’t any well-defined limit to how much data you can index and use,” he said.
Because the database is serverless, customers don’t have to worry about provisioning at all, but they do have to tell Pinecone how much they are willing to spend each month, based on the amount of data they need to process.
“They kind of do the back of the envelope to figure out that x pods is going to be plenty for what we’re using in terms of the data that it can hold and the performance it would give me and that’s it,” Liberty said. After that, the customer simply signs up, and with a few clicks in the console and an API call to create the index, the database is up and running.
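The back-of-the-envelope calculation Liberty describes could look something like the sketch below. The article does not say how much data a pod holds or what a pod costs, so `vectors_per_pod` and `price_per_pod` are hypothetical inputs the caller must supply, and the numbers in the usage example are made up purely for illustration.

```python
def estimate_pods(num_vectors, vectors_per_pod):
    """Smallest whole number of pods that can hold num_vectors.

    vectors_per_pod is a hypothetical capacity figure; the source does
    not state a real one.
    """
    if num_vectors < 0 or vectors_per_pod <= 0:
        raise ValueError("num_vectors must be >= 0 and vectors_per_pod > 0")
    # Ceiling division with integers, and always at least one pod.
    return max(1, -(-num_vectors // vectors_per_pod))


def estimate_monthly_cost(pods, price_per_pod):
    """Monthly spend, assuming a flat (hypothetical) price per pod."""
    return pods * price_per_pod


# Illustrative numbers only, not Pinecone's actual capacity or pricing.
pods = estimate_pods(num_vectors=5_000_000, vectors_per_pod=1_000_000)
cost = estimate_monthly_cost(pods, price_per_pod=100)
print(pods, cost)
```

Rounding up matters here: a dataset that slightly exceeds the capacity of five pods needs a sixth, which is why the sketch uses ceiling division rather than ordinary rounding.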
Liberty didn’t want to share growth numbers or employee numbers, but he says he expects to double the staff (whatever that means) in the next year. It’s worth noting that the startup had 10 employees at the time of the seed announcement.
In terms of diversity, he said last year, “We have instructed our recruiters to be proactive [in finding more diverse applicants], making sure they don’t miss out on great candidates, and that they bring us a diverse set of candidates.” In practice, he says, that has meant 50% of new technical hires this year (as opposed to the total number of employees) have been female.
The company announced a $28 million Series A today led by Menlo Ventures with participation from new investor Tiger Global along with previous investors including Wing Venture Capital, which led the company’s seed funding. The company has now raised $38 million in total.