In AI Startup of the Week, the editorial staff of ai.nl features promising AI startups, their innovations, solutions and challenges. In this seventh episode, we take a look at California-based Segmed, a startup that is building the biggest database of labelled medical data in the world, starting with radiology images.
Artificial intelligence has revitalised the healthcare industry by giving healthcare professionals a way to offer their services to more people. From early detection of diseases and improved decision making to connected care and health monitoring through wearables, AI is making healthcare more accessible, inclusive, and innovative. However, some gaps still need to be addressed, and Segmed is trying to close a very important one.
Plugging the gap in medical datasets
Menlo Park, California-based Segmed is building a high-quality labelled dataset for medical research that is representative of more people. It is also trying to build a dataset that helps healthcare research without exposing personal information. The idea grew out of a personal experience of its Dutch co-founder, Martin Willemink.
While trying to put together a radiology dataset for a publication, Martin realised that, despite the vast academic resources at his disposal, the healthcare industry does not have a centralised labelled dataset that is easily accessible to everyone. So, in 2019, Martin teamed up with co-founder and CEO Cailin Hardell, co-founder and CTO Adam Koszek, and co-founder and Chief Data Officer Jie Wu to build Segmed.

The life science and health tech startup is an alumnus of the prestigious Y Combinator and is now building the biggest database of labelled medical data with a team of 14 members. It is starting with radiology images, an area where artificial intelligence (AI) has already shown some early success.
Segmed: how does it work?
According to Crunchbase, Segmed has raised a total of $2.2M from Blumberg Capital and Nina Capital to build its medical dataset. The ingenuity of Segmed lies in the unconventional way it is building this database. The health tech startup first forms revenue-sharing partnerships with medical imaging clinics, hospitals, and teleradiology companies.
Once these deals are in place, it starts ingesting huge batches of data from these partners and even sets up services in their facilities. The platform built on top of this ingested data is called Insight, which Segmed describes as democratised healthcare data. Segmed doesn’t stop at the data platform: it also ensures that AI teams work only with anonymised medical data from its healthcare partners.
The startup also cleverly positions its platform differently for healthcare partners and AI teams. For healthcare partners, Segmed is a tool to generate revenue while reducing bias in the system. For AI researchers and developers, the startup securely manages medical data and provides it through a double-blind platform. The phrase “data is the new oil” fits really well in this case.
For healthcare systems, hospitals, and imaging centres, Segmed is not only a way to generate revenue, but also brings other benefits. It offers easy integration where Segmed handles 100 per cent of the IT work and offers a data analytics tool for these healthcare partners to get information or insights on their own data.
For AI teams, Segmed is a one-stop shop for diverse, standardised data, the lack of which has been a stumbling block for faster AI development. Segmed Insight acts as a curated source of medical data for AI teams, and it does all this in an ethical, fast and easy way. The biggest benefit for AI teams here is the massive reduction in data sourcing time.
This should speed up research for “AI radiology, medical devices, pharma, surgical robotics & academic research.” The Insight platform also brings together datasets from multiple international locations and delivers a systematic, organised report as well.
Focus on patient privacy and security
A lot of industry watchers are worried about data privacy with AI tools and services, and Segmed is not only aware of these concerns but is addressing them. “At Segmed, our philosophy is that patient privacy and data security come first,” the startup says on its website.
Segmed says it applies a de-identification process while the data ingested from its partners is still being transferred to its servers. This means that no PHI (protected health information) ever lands on Segmed's servers. It removes PHI from DICOM headers and reports using a combination of traditional and machine learning methods.
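To make the idea of stripping PHI from headers and reports more concrete, here is a minimal rule-based sketch in Python. It is not Segmed's actual pipeline: the list of header fields, the regular expressions and the function names are all assumptions chosen for illustration, the header is modelled as a plain dictionary rather than a real DICOM file, and the machine learning part mentioned above is not shown.

```python
import re

# Hypothetical subset of header fields treated as PHI. Real DICOM
# de-identification profiles cover many more attributes than this.
PHI_HEADER_FIELDS = {
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "ReferringPhysicianName",
}

# Illustrative patterns for free-text reports (assumed formats only).
DATE_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")
MRN_PATTERN = re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE)


def deidentify_header(header: dict) -> dict:
    """Return a copy of the header without the fields listed as PHI."""
    return {key: value for key, value in header.items()
            if key not in PHI_HEADER_FIELDS}


def deidentify_report(text: str) -> str:
    """Replace record numbers and dates in a report with placeholders."""
    text = MRN_PATTERN.sub("[ID]", text)
    return DATE_PATTERN.sub("[DATE]", text)
```

A production system would rely on a full de-identification profile and far more robust text handling; the sketch only shows the shape of the "traditional" rule-based step.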
Once the automated step is done, Segmed routinely checks random samples for proper de-identification. As mentioned earlier, it also supports two-way anonymity with its double-blind architecture, which means neither the healthcare partner nor the AI team can identify the other. Lastly, the Segmed Insight platform is HIPAA compliant, with all of its team members participating in HIPAA training.
Segmed is currently in the process of getting SOC 2 and ISO 27001 certifications. It uses Amazon Web Services (AWS) to secure its data in the cloud and deploys cybersecurity scanners and detection systems to embed security in the platform. To reduce the number of people interacting with sensitive data, Segmed has also automated most of its data pipeline. All this makes for a truly private and secure network for medical data.
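The random spot check described above can be pictured roughly as follows. This is a hedged sketch, not Segmed's published procedure: the function names, the sample size and the leftover-PHI pattern are hypothetical, and in practice the sampled records would go to a human reviewer rather than to a regular expression.

```python
import random
import re

# Assumed pattern for text that still looks identifying after de-identification.
LEFTOVER_PATTERN = re.compile(
    r"\bMRN[:\s]*\d+\b|\b\d{1,2}/\d{1,2}/\d{4}\b", re.IGNORECASE
)


def sample_for_review(records, sample_size, seed=None):
    """Draw a random sample of records (without replacement) for auditing."""
    rng = random.Random(seed)
    k = min(sample_size, len(records))
    return rng.sample(records, k)


def flag_suspect(records):
    """Return the records that still match the leftover-PHI pattern."""
    return [record for record in records if LEFTOVER_PATTERN.search(record)]
```

Calling `flag_suspect(sample_for_review(reports, 100))` would then surface sampled reports that need a second look, where `reports` is a list of already de-identified report strings.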
Segmed and its work with Aidence on a lung nodule management tool
Segmed is arguably doing mission-critical work in the field of healthcare. It is building a medical dataset that is not only large but also easily accessible and diverse, and that can pave the way for transformative AI work. This is evident from the collaboration between Segmed and Aidence, a pioneering AI-based healthcare startup whose areas of interest include lung nodules.
Aidence is a team of over 50 members representing 15 nationalities and has become one of the leaders in lung cancer screening across Europe. It has even become the leader in the UK, but the startup wants to move beyond diagnosis and follow patients through their entire lung cancer pathway by providing guidance and support at each step.
Aidence’s focus on the chest and its access to reliable data from the American National Lung Screening Trial (NLST) helped with its initial training. However, Aidence found itself in need of better and newer data to further innovate and improve its product. For that data, Aidence turned to Segmed, which eased the data crunch by drawing on its existing data partnerships.
Initially, Aidence tried to engage a contract research organisation (CRO) for its data needs but ran into compliance and regulatory problems. Segmed, however, was able to provide Aidence with data within a much shorter time frame of just a few weeks. For Aidence, the work with Segmed marked its first attempt at a hands-off approach, and the startup also became a beta user of the Segmed Insight platform, which helps with querying anonymised reports.
There is a vast amount of data available out there, but companies often struggle to collect it, clean it, and make it usable. Segmed is essentially doing that heavy lifting for AI-based healthcare companies and other AI teams looking to disrupt the healthcare industry.