
LLMs As ‘Crystal Ball’?

Using large language models to predict upcoming events in a person’s life.


Just before Christmas 2023, a team of researchers from several universities in and around Copenhagen, Denmark, and Northeastern University in Boston, published a study titled “Using Sequences of Life-Events to Predict Human Lives” in Nature Computational Science. Its lead author, Sune Lehmann, thought it was “a cool idea.”

The idea—that one could use the principles of large language models (LLMs) to predict upcoming events in a person’s life in much the same way those models predict upcoming words in a sentence—turned out to be much hotter than cool. The model demonstrated an ability to predict a person’s likelihood of dying within a four-year span with 78.8% accuracy; on a cohort of Danes aged 35 to 65, it was 11% more accurate than existing baseline methods, including recurrent neural networks, logistic regression, and actuarial tables. It garnered worldwide attention from both technical and general audiences, and spawned more than one misconception about how it works.

“What’s interesting about the model is that it’s a foundation model, but for life events rather than for words,” said Lehmann, associate professor of informatics at the Technical University of Denmark. “So we could predict anything in the entire universe. I did choose the prediction time for death because I wanted to do something that would capture people’s attention. That said, if I had known it would go that viral, maybe I would have chosen something like income.”

Lehmann said his planned winter holiday turned into a period in which he had to counter misconceptions about what his team’s model, called life2vec, was actually doing. A misleading story in a general interest newspaper, which said the model could predict an individual’s death with “eerily exact accuracy,” led to what Lehmann called “a loop of sensationalist garbage,” and a lot of time explaining exactly what life2vec is trying to demonstrate.

“I kind of felt when I looked at all that that I needed to start talking to people and explaining what we were doing and what’s exciting about it. I have spent a lot of time speaking to people, and I think now we are getting to a place where people realize what is actually going on.

“In some sense, I didn’t do anything all that brilliant,” he contended. “I just said, ‘Wait a minute, with large language models, we have a tool that can handle sequences in a new way. And there are so many important things in the health and social sciences that are sequences and we haven’t been able to process them.’ That’s what’s really most exciting; if your income takes a dip, for example, we can capture that in a completely different and much stronger way than ever before.”
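To make that framing concrete, here is a minimal, illustrative sketch (not the authors’ code) of treating life events as tokens and posing the task as next-event prediction. The event names and sequences are invented, and the toy bigram counter is a deliberately crude stand-in for life2vec’s transformer:

```python
# Toy illustration: life events encoded as tokens, with "predict the next
# event" framed exactly like next-word prediction in a language model.
# Events, sequences, and probabilities here are synthetic assumptions.
from collections import Counter, defaultdict

# Hypothetical event vocabulary, analogous to words in a sentence.
sequences = [
    ["GRADUATE", "JOB_START", "MOVE_CITY", "INCOME_RISE", "MARRIAGE"],
    ["GRADUATE", "JOB_START", "INCOME_RISE", "HOSPITAL_VISIT"],
    ["JOB_START", "MOVE_CITY", "JOB_CHANGE", "INCOME_RISE"],
]

# Count event bigrams: the simplest possible stand-in for a transformer's
# learned conditional distribution P(next event | history).
bigrams = defaultdict(Counter)
for seq in sequences:
    for prev, nxt in zip(seq, seq[1:]):
        bigrams[prev][nxt] += 1

def predict_next(event):
    """Return a probability distribution over the next life event."""
    counts = bigrams[event]
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()}

print(predict_next("JOB_START"))
# e.g. {'MOVE_CITY': 0.67, 'INCOME_RISE': 0.33} on this toy data
```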

While Lehmann contends life2vec is mainly a new vehicle for a well-accepted premise in modeling (forecasting based on large amounts of data), its introduction places it in the vanguard of an AI ecosystem in which both traditional machine learning and large generative models are emerging to help people, in some sense, divine their futures.

The life2vec researchers had the advantage of using data from the Danish national registry, which included what they called “a host of indicators” (health, professional occupation and affiliation, income level, residency, working hours, education) on Denmark’s 6 million people over a 10-year span; such comprehensive datasets are rare elsewhere. However, other predictive AI pioneers are emerging in other nations, using a variety of machine learning methods and datasets. Some are commercial startups cautiously testing the waters of predictive computation; others are public sector entities laying the groundwork for what they hope will be less-uncertain forecasts, especially in areas like population health.

Back to Asimov’s Idea of the Future

One such public sector effort is All of Us, a comprehensive effort by the U.S. government’s National Institutes of Health to allow doctors to deliver personalized precision medicine, using widely heterogeneous data from 1 million Americans. So far, All of Us has signed up more than 780,000 people toward that goal.

All of Us Chief Technology Officer Chris Lunt read the life2vec paper and said it stimulated thoughts of the possibilities of using AI to predict medical events more accurately, but also of visionary material, once clearly labeled “science fiction,” that now appears possible.

“It touched on many of the things that inspired me to come into this space,” Lunt said. “For example, Isaac Asimov’s Foundation series, and trying to imagine if you could really get to the point where you could predict the direction a society is taking. For All of Us, that means it’s about health. How can I use my capacity to guess what’s going to happen to you next as a way to steer towards better outcomes?”

Lunt told Communications there is already a lot of theoretical connection between transformer models and genetics research; he noted imputation servers, which allow geneticists to accurately evaluate the evidence for association at genetic markers that are not directly genotyped.

“As soon as I understood how the transformer models worked, it seemed to me to be a great way to build an imputation server,” he said. “The imputation server principles of filling in missing data are exactly what a transformer model does.”
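A minimal sketch of that fill-in-the-blank behavior, using the masked-language-model pipeline from the open-source Hugging Face transformers library on ordinary text. The model choice is purely illustrative; it is not what All of Us or any imputation server actually runs:

```python
# Masked-token filling: a transformer predicts a hidden token from its
# context, the same principle an imputation server applies to markers
# that were never directly genotyped. Model choice here is an assumption.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# Ask the model to impute the masked token from surrounding context.
for candidate in fill("The capital of Denmark is [MASK].", top_k=3):
    print(candidate["token_str"], round(candidate["score"], 3))
```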

Lunt said transformer models also could be useful on what he termed “dark data,” which in the health realm means data that is “complex and that we know is ultimately related to your health, but that we don’t understand how those factors impact you.” He noted datasets such as activity data (a recent study of All of Us data showed taking 10,700 steps a day reduced the likelihood of developing Type 2 diabetes) and unstructured physician notes, though he said data control and integrity concerns need to be addressed before free text can be modeled by All of Us-affiliated researchers.

Startups Are Being Careful

Two U.S.-based commercial startups are balancing their use of AI against their ethical obligations to protect users’ interests. One, San Francisco-based Waterlily Planning, combines data from public sources and user questionnaires to help users plan for long-term care in the fragmented U.S. market, but does not use generative AI. The other, investment advisory tool iFi AI, does use generative AI, through a partnership with IBM’s watsonx, to ingest and analyze 1 million bits of data per day, but has no plan to combine market data with investors’ personal information such as income history, amount of assets owned, or risk tolerance.

“We haven’t had that conversation yet,” iFi AI CEO Ron Insana said. “One, we’re not a fiduciary. We don’t want to be in investment advisory per se, making recommendations as opposed to assisting in decision making. We would rather be on the assist side for a wide variety of reasons, and it’s not just for legal and compliance. It’s also for autonomy’s sake. People should make their own decisions. If we can give them a tool that enhances that process and makes their outcome more likely to be beneficial, we’re more than happy to do that.”

iFi AI, which is marketed to the individual investor, is a corporate division of QuantumStreet AI, which serves institutional investors. The company’s AI platform ingests 1 million bits of data per day, including news articles and macro, fundamental and technical financial signals, and other data sources to provide forecasted returns for more than 1,000 stocks and exchange traded funds. Each forecasted return is associated with a predicted high and low price based on the platform’s confidence scores. A higher confidence score results in a smaller spread between the low and high forecasts. AI generative summaries are available to help investors better understand the signals and news that drove the forecasted returns and confidence scores.
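The spread-versus-confidence relationship can be sketched in a few lines. The scaling rule, parameter names, and numbers below are hypothetical assumptions for illustration, not iFi AI’s actual method:

```python
# Hypothetical sketch of the relationship described above: a higher
# confidence score yields a tighter band around the forecasted return.
# The linear scaling rule and all numbers are assumptions.
def price_band(last_price: float, forecast_return: float,
               confidence: float, max_spread: float = 0.10):
    """Return (low, high) prices; confidence is in [0, 1]."""
    midpoint = last_price * (1.0 + forecast_return)
    spread = max_spread * (1.0 - confidence)  # tighter band when confident
    return midpoint * (1.0 - spread), midpoint * (1.0 + spread)

# A 5% forecasted return at 0.8 confidence vs. 0.4 confidence:
print(price_band(100.0, 0.05, 0.8))  # narrow: (102.9, 107.1)
print(price_band(100.0, 0.05, 0.4))  # wide:   (98.7, 111.3)
```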

Though many investment firms claim to be using AI, specifics are sparse (two of the largest in the U.S., Fidelity and Vanguard, did not wish to comment for this story). But Insana believes the QuantumStreet/iFi AI approach is unique.

“We hope to be both inside and outside product in every capacity—institutional, advisory, individual investor—we don’t think anybody’s in that space,” Insana said. “We are not trying to gamify it. We’re trying to give them a leg up, with complete transparency on how the models work.”

In contrast to iFi AI’s decision not to include personal data in its models, Waterlily founder Lily Vittayarukskul said her company depends on it. Waterlily, which she said is in a “mature beta” phase used by several financial planning firms, including Prudential and Financial Independence Group, draws on questionnaire material generated by nearly 50,000 respondents and 500 million public and private data points. The platform then predicts whether a person will need long-term care, splits the care into three phases—early, moderate, and full—and predicts the cost of that care, including unpaid care delivered by family members. Vittayarukskul said the “soft costs” of such care, such as lost earnings for family caregivers or expenses incurred by having to move back nearer to an aging parent, are often neglected.

However, generative AI is not in the mix. Vittayarukskul said she opted out of using generative AI for now to minimize the chance of hallucinations and because other forms of AI are just a better fit for the data her platform uses.

In a detailed disclaimer, Waterlily outlines the machine learning methods it uses: time-to-event modeling, classification modeling, and regression modeling. The use of ML, Vittayarukskul said, helps to pinpoint individuals’ estimates that are often calculated inaccurately using Monte Carlo projections or projections based on national averages. She also said a user could go back into the calculator and change a data point—for instance, if a person were to suffer a minor heart attack, that may change the projected trajectory of if or when long-term care might be needed. And, she said, seeing individualized projected costs based on personal questions could inspire users to improve lifestyle factors.
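As a rough illustration of the time-to-event approach, the sketch below fits a Cox proportional-hazards model with the open-source lifelines library on made-up data, then changes a single data point and re-projects, as Vittayarukskul describes. Waterlily’s actual features and models are not public, so everything here is an assumption:

```python
# Toy time-to-event sketch (lifelines is a real open-source library; the
# records, features, and penalizer setting here are all invented).
import pandas as pd
from lifelines import CoxPHFitter

# Synthetic records: covariates, years until long-term care was needed,
# and whether that need was actually observed (the "event").
df = pd.DataFrame({
    "age": [65, 72, 58, 80, 69, 75],
    "had_cardiac_event": [0, 1, 0, 1, 0, 1],
    "years_to_ltc": [20.0, 6.0, 25.0, 3.0, 15.0, 5.0],
    "needed_ltc": [0, 1, 0, 1, 1, 1],
})

cph = CoxPHFitter(penalizer=0.1)  # small penalty keeps the tiny fit stable
cph.fit(df, duration_col="years_to_ltc", event_col="needed_ltc")

# Project for one person, then change a single data point and re-project.
person = pd.DataFrame({"age": [70], "had_cardiac_event": [0]})
print("median years to care, before:", cph.predict_median(person).iloc[0])

person["had_cardiac_event"] = 1  # the person suffers a minor heart attack
print("median years to care, after: ", cph.predict_median(person).iloc[0])
```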

“We actually think there is a good chance with what we are building that we could improve behavior in a way no other health player has been able to do because we have incorporated the financial piece—it could lower your costs or improve your longevity.”

A Persistent Digital Key for AI

In addition to the concept of computing life events sequentially as one would language in a transformer model, life2vec also introduces an element that could someday be useful to anyone using any AI infrastructure: the person-summary, which Lehmann said could someday be considered a persistent, personalized digital key. The paper defines the person-summary as “a single vector that encapsulates the essential aspects of an individual’s entire sequence of life-events, and the model determines which aspects are relevant to the task at hand.”

“Let’s say someone took a photo of your retina to predict risk of Alzheimer’s Disease,” Lehmann said. “What if you just add the information into the information of your life trajectory as encapsulated in the person-summary? You can think of it as something you can use as an adjunct to some analysis you are running on a completely different piece of data. We haven’t tried that, but I bet we would get an increase in accuracy if we combined data from a test to a person-summary.”
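A hypothetical sketch of that adjunct use: concatenate a person-summary vector with features from an unrelated test and train a downstream classifier on the combined input. The dimensions, synthetic data, and classifier below are all assumptions; life2vec’s embeddings are not public:

```python
# Hypothetical sketch: a person-summary embedding "rides along" with
# features from a completely different test. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200

person_summary = rng.normal(size=(n, 64))   # stand-in life2vec embeddings
retina_features = rng.normal(size=(n, 16))  # stand-in retina-scan features
labels = rng.integers(0, 2, size=n)         # synthetic outcome labels

# Adjunct use: concatenate the summary vector with the new test's features.
combined = np.concatenate([retina_features, person_summary], axis=1)

clf = LogisticRegression(max_iter=1000).fit(combined, labels)
print(clf.predict_proba(combined[:1]))      # risk estimate for one person
```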

And, though the paper introduced some intriguing advances in “humanizing” AI predictions, Lehmann emphasized that it is just one step, and that a lot of rigorous science must follow.

“Keep in mind with all of this, including our model, there is no magic there,” he said. “It really is about finding correlations in what’s there, and even though it’s very convincing sometimes, they can’t find stuff that’s not in the data. That’s a crucial thing to remember. A lot of people unfamiliar with the science might very well think there is some magic involved. That’s part of the alchemy that made the paper explode, is that we think these models can do anything.”

Gregory Goth is an Oakville, CT-based writer who specializes in science and technology.
