As touched on in earlier articles in this series on working with regulatory device data, medical device datasets come with three large problems: usability, accessibility, and relevance. In the last article, we talked about how to make the data more usable and accessible to end users. Unfortunately, that solution still left us with arguably the largest problem: relevance.
Because the datasets in question are massive and because it takes significant amounts of time to locate specific reports that have bearing on a given problem or inquiry, a more refined approach to tracking down relevant information is indispensable to driving efficiency.
Currently, end users are required to read through each report to determine whether it is relevant to their purpose, and even after narrowing the data down, this could still mean over 10,000 reports and substantial human time. By the hundredth report, the average user starts to think: there has to be a better way to do this.
And there is: we can use Cortex within Snowflake to classify the data into categories that are relevant to specific users and to generate new categories that surface issues with the medical devices based on the reports.
This can cut the time spent digging through reports by a factor of up to 300 and allows end users to find the data that is relevant to their work: finding the gold in the dataset.

What is Cortex?
Snowflake Cortex provides robust LLM (Large Language Model) capabilities directly within the Snowflake platform, enabling seamless integration of AI features with data workflows.
Users can access pre-built LLM functions for tasks like text summarization, sentiment analysis, language translation, and content generation through simple SQL or Python commands. These serverless functions eliminate the need for external AI infrastructure, ensuring data governance and security.
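For instance, here is a minimal sketch of what calling these pre-built functions looks like in SQL; the table and column names are illustrative, not part of any specific dataset:

```sql
-- Summarize adverse event narratives (hypothetical table/column names)
SELECT SNOWFLAKE.CORTEX.SUMMARIZE(event_description) AS summary
FROM adverse_event_reports
LIMIT 10;

-- Score sentiment and translate in the same query
SELECT
    SNOWFLAKE.CORTEX.SENTIMENT(event_description)             AS sentiment_score,
    SNOWFLAKE.CORTEX.TRANSLATE(event_description, 'en', 'es') AS spanish_text
FROM adverse_event_reports
LIMIT 10;
```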
Cortex also supports fine-tuning, allowing customization of pre-trained LLMs with proprietary data for improved task-specific performance. Additionally, Cortex Search enables advanced vector and fuzzy searches, ideal for enterprise search and retrieval-augmented generation (RAG) applications.
These features make it easy for technical and non-technical users to incorporate generative AI into their analytics and applications.
By integrating LLM functionality natively, Snowflake Cortex simplifies AI adoption, empowering organizations to extract actionable insights and deliver AI-driven innovations without leaving the Snowflake ecosystem.
This allows us to build custom solutions that work with not only external data sources but also your internal data assets around these adverse events.
Classifying the Data: Why LLMs?
Now that we’ve looked at how Cortex can help power a Snowflake-native solution, it’s time to make the data relevant to the users who need it.
Often, the people who access the adverse events data are looking for specific types of adverse events to address in their device.
This could be a use error for a Human Factors Engineer or a distribution or manufacturing error for a Quality Engineer, but in either case they are forced to sift through all the data and read each report to find out whether it is relevant to them.
But what if you could use LLMs to cut down on this large human workload?
The reason for using an LLM, as opposed to traditional methods like fuzzy matching or other natural language processing approaches, is that there are no standardized processes for writing a report, so it is hard to find search terms that fit every scenario that might be considered a given type of error.
For example, you may want to search for the word “user” when trying to find use errors, but that exact terminology may not appear in every report, especially in reports written outside the User Experience field.
By using LLMs you can capture the context of the report and compare it to your definitions instead of relying on specific words to correctly capture the category.
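To make the contrast concrete, here is a rough sketch in SQL. The table, column, and category names are hypothetical: the first query shows the brittle keyword approach, while the second lets the model read the full context of the report:

```sql
-- Keyword matching: misses reports that describe use errors without the word "user"
SELECT report_id, event_description
FROM adverse_event_reports
WHERE event_description ILIKE '%user error%';

-- LLM classification: the model reads the whole narrative and maps it to a category
SELECT
    report_id,
    SNOWFLAKE.CORTEX.COMPLETE(
        'mistral-large',
        'Classify this adverse event report as USE_ERROR, MANUFACTURING_ERROR, ' ||
        'DEVICE_MALFUNCTION, or OTHER. Respond with the category only. Report: ' ||
        event_description
    ) AS category
FROM adverse_event_reports
LIMIT 100;
```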
Classifying for Relevance via Prompt Engineering
To do this with the LLM capabilities, we have to craft a prompt to send to the model, a practice known as “prompt engineering.” We break our prompt into a few parts.
The first is the context of what is going on (for example: “you are an agent that is tasked with classifying these descriptions into categories based on medical device verticals”).
Second, we give it examples, or training data, usually piped in from a function that pulls them from a table. These examples can draw on your internal data, helping the model perform the way your engineers would and work in a way they understand.
Third, we give it the classifications and the definitions we have, again piped in from a function that pulls them from a table. These can be written by your engineers or generated by the solution itself.
Finally, there is the output: we have to tell the model how we expect it to return the categorized data, for example by appending another column to the data and putting the classification there. This can be tuned to whatever is needed in your circumstances. A sketch of how these parts come together is shown below.
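This sketch assembles the four prompt parts and sends them to Cortex. The helper tables (category_definitions, classification_examples) and every name here are illustrative assumptions, not the actual solution’s schema:

```sql
WITH defs AS (
    -- 3. Categories and their definitions, maintained in a table
    SELECT LISTAGG(category || ': ' || definition, '\n') AS definition_text
    FROM category_definitions
),
examples AS (
    -- 2. Labeled examples pulled from internal data
    SELECT LISTAGG('Report: ' || example_text || ' -> ' || category, '\n') AS example_text
    FROM classification_examples
)
SELECT
    r.report_id,
    SNOWFLAKE.CORTEX.COMPLETE(
        'mistral-large',
        -- 1. Context
        'You are an agent that classifies medical device adverse event reports ' ||
        'into categories based on medical device verticals.\n' ||
        -- 2. Examples / training data
        'Here are labeled examples:\n' || e.example_text || '\n' ||
        -- 3. Categories and definitions
        'Here are the categories and their definitions:\n' || d.definition_text || '\n' ||
        -- 4. Output instructions
        'Return only the category name for the following report.\n' ||
        'Report: ' || r.event_description
    ) AS classification
FROM adverse_event_reports r
CROSS JOIN defs d
CROSS JOIN examples e;
```

The classification can then be written back as an extra column on the reports, which is the output shape described above.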
This is a good way of approaching a “first pass” at categorizing our data and can get us useful responses quickly, but since we are using Cortex we can also fine-tune the model by giving the LLM more context and by building a training dataset for Cortex.
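If you do go the fine-tuning route, a hedged sketch of kicking off a Cortex fine-tuning job might look like the following; the model name, base model, and training tables are assumptions, and the training queries are expected to return PROMPT and COMPLETION columns built from your engineers’ labeled examples:

```sql
-- Start a Cortex fine-tuning job (names are hypothetical)
SELECT SNOWFLAKE.CORTEX.FINETUNE(
    'CREATE',
    'adverse_event_classifier',   -- name for the tuned model
    'mistral-7b',                 -- base model to tune
    'SELECT prompt, completion FROM labeled_adverse_events',
    'SELECT prompt, completion FROM labeled_adverse_events_validation'
);
```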
The other useful aspect of this prompt engineering framework is that we can modify our category definitions and quickly get an updated response. This allows users to adjust their categories and definitions to make the data even more relevant to them.
For example, you may start with a definition of “Device breaks due to an internal component breaking, or a step in the process not working” for device malfunction, but if you only want to look for battery issues you could narrow the definition to “Device fails to turn on, sustain activity, or breaks, possibly due to battery issues.”
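Because the definitions can live in a table rather than in the prompt text itself, narrowing a category can be a one-line change; this sketch reuses the hypothetical category_definitions table from the earlier example:

```sql
-- Narrow the device-malfunction definition to focus on battery issues
UPDATE category_definitions
SET definition = 'Device fails to turn on, sustain activity, or breaks, possibly due to battery issues'
WHERE category = 'DEVICE_MALFUNCTION';
```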
You can then use these categories either to find specific adverse events or to get an overview of how your organization is doing as it develops its devices and whether more focus needs to be put on one area of the company, such as Research and Development, Manufacturing, or Human Factors Engineering.
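Once the classifications are written back alongside the reports, that overview can be a simple aggregation; the table and column names below are again illustrative:

```sql
-- Count classified adverse events per category to see where focus is needed
SELECT classification, COUNT(*) AS report_count
FROM classified_adverse_events
GROUP BY classification
ORDER BY report_count DESC;
```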

Generating Classifications & Next Steps
The second big cornerstone of this solution is being able to generate classifications for the user. By generating error classification categories automatically, the solution lets users who haven’t worked with the data before quickly come to grips with it.
That said, the biggest thing this enables is letting users see when a new problem they haven’t encountered before starts popping up in the data. This, in turn, allows them to address the error before it becomes a larger problem.
For example, you may be aware that the buttons on the device are difficult to press, but perhaps a change in the newly released devices made the flow of the process more confusing to users and wasn’t caught during testing. With this solution, we have a way to sift through the data, catch those new potential errors as they appear with the medical devices, and send alerts to the company so they can be addressed.
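One way to sketch that discovery step in SQL is to feed a sample of recent narratives to the model and ask it to propose categories that are not already in the known list. The table names, date column, and prompt wording here are assumptions:

```sql
-- Ask the model to surface recurring problems that fall outside the known categories
WITH recent_sample AS (
    SELECT event_description
    FROM adverse_event_reports
    WHERE received_date >= DATEADD('day', -30, CURRENT_DATE())
    LIMIT 200
)
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-large',
    'These are the error categories we already track: ' ||
    (SELECT LISTAGG(category, ', ') FROM category_definitions) || '.\n' ||
    'Read the following adverse event reports and list any recurring problems that ' ||
    'do not fit those categories, with a short definition for each.\n' ||
    (SELECT LISTAGG(event_description, '\n---\n') FROM recent_sample)
) AS proposed_categories;
```

The proposed categories can then feed an alerting step or be reviewed by engineers before being added to the definitions table.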
Since this solution already leverages LLMs, you can also add a Chat Agent to the tool to ask specific questions about device manufacturers or specific device families.
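A minimal sketch of what that chat layer could look like, assuming the application keeps the conversation history and passes it to Cortex’s COMPLETE function in its chat-style form (the roles and question are illustrative):

```sql
-- Pass a conversation history to COMPLETE; the application appends each new turn
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-large',
    [
        {'role': 'system', 'content': 'You answer questions about classified medical device adverse event reports.'},
        {'role': 'user',   'content': 'Which device families had the most battery-related malfunctions this quarter?'}
    ],
    {}
) AS response;
```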
A New Era for Relevant and Accessible Medical Device Data
Solutions like the one above are exactly the kind of innovation that excites us at Hakkoda, allowing us to help end users do more with their data to better serve patients and clients alike.
Instead of spending valuable time reading through thousands of reports just to identify relevant information, LLM-powered solutions like the one above can now do the “panning for gold” on your behalf, providing you with the information you need while freeing up your teams for more meaningful and impactful tasks.
Interested in learning more? Watch a full demo of the tool here, or talk to one of our experts today.