The widening web of effective altruism in AI security | The AI Beat

Hayo News
December 13th, 2023

A couple of days ago, a US AI policy expert told me the following: “At this point, I regret to say that if you’re not looking for the EA [effective altruism] influence, you are missing the story.”

Well, I regret to say that, at least in part, I missed the story last week.

Ironically, I considered an article I published on Friday a slam dunk. A story on why top AI labs and respected think tanks are super-worried about securing LLM model weights? Timely and straightforward, I thought. After all, the recently released White House AI Executive Order includes a requirement that foundation model companies provide the federal government with documentation about “the ownership and possession of the model weights of any dual-use foundation models, and the physical and cybersecurity measures taken to protect those model weights.”

I interviewed Jason Clinton, Anthropic’s chief information security officer, for my piece: We discussed why he considers securing the model weights for Claude, Anthropic’s LLM, to be his number one priority. The threat of opportunistic criminals, terrorist groups or highly-resourced nation-state operations accessing the weights of the most sophisticated and powerful LLMs is alarming, he explained, because “if an attacker got access to the entire file, that’s the entire neural network.” Other ‘frontier’ model companies are similarly concerned — just yesterday OpenAI’s new “Preparedness Framework” addressed the issue of “restricting access to critical know-how such as algorithmic secrets or model weights.”

I also spoke with Sella Nevo and Dan Lahav, two of five co-authors of a new report from influential policy think tank RAND Corporation on the same topic, called Securing Artificial Intelligence Model Weights. Nevo, whose bio describes him as director of RAND’s Meselson Center, which is “dedicated to reducing risks from biological threats and emerging technologies,” told me it is plausible that within two years AI models will have significant national security importance, for example because malicious actors could misuse them for biological weapon development.

The web of effective altruism connections in AI security

As it turns out, my story did not highlight some important context: That is, the widening web of connections from the effective altruism (EA) community within the fast-evolving field of AI security and in AI security policy circles.

That’s because I didn’t notice the finely woven thread of connections. Which is ironic, because like other reporters covering the AI landscape, I have spent much of the past year trying to understand how effective altruism — an “intellectual project using evidence and reason to figure out how to benefit others as much as possible” — turned into what many call a cult-like group of highly influential and wealthy adherents (made famous by FTX founder and jailbird Sam Bankman-Fried) whose paramount concern revolves around preventing a future AI catastrophe from destroying humanity. Critics of the EA focus on this existential risk, or ‘x-risk,’ say it is happening to the detriment of a necessary focus on current, measurable AI risks — including bias, misinformation, high-risk applications and traditional cybersecurity.

EA made worldwide headlines most recently in connection with the firing of OpenAI CEO Sam Altman, as its non-employee nonprofit board members all had EA connections.

But for some reason it didn’t occur to me to go down the EA rabbit hole for this piece, even though I knew about Anthropic’s connections to the movement (for one thing, Bankman-Fried’s FTX had a $500 million stake in the startup). An important missing link, however, became clear when I read an article published by Politico the day after mine. It maintains that RAND Corporation researchers were key policy influencers behind the White House’s requirements in the Executive Order, and that RAND received more than $15 million this year from Open Philanthropy, an EA group financed by Facebook co-founder Dustin Moskovitz. (Fun fact from the EA nexus: Open Philanthropy CEO Holden Karnofsky is married to Daniela Amodei, president and co-founder of Anthropic, and was on the OpenAI nonprofit board of directors until stepping down in 2021.)

The Politico article also pointed out that RAND CEO Jason Matheny and senior information scientist Jeff Alstott are “well-known effective altruists, and both men have Biden administration ties: They worked together at both the White House Office of Science and Technology Policy and the National Security Council before joining RAND last year.”

After reading the Politico article — and after a long sigh — I immediately did an in-depth Google search and dove into the Effective Altruism Forum. Here are a few things I didn’t realize (or had forgotten) that put my own story into context:

Matheny, RAND’s CEO, is also a member of Anthropic’s Long-Term Benefit Trust, “an independent body of five financially disinterested members with an authority to select and remove a portion of our Board that will grow over time (ultimately, a majority of our Board).” His term ends on December 31 of this year.

Sella Nevo, Dan Lahav and the other three researchers who wrote the RAND LLM model weights report I cited (RAND CEO Jason Matheny, Ajay Karpur and Jeff Alstott) are strongly connected to the EA community. (Nevo’s EA Hub profile says, “I’m excited about almost anything EA-related, and am happy to connect, especially if there’s a way I can help with your EA-related plans.”)

Nevo’s Meselson Center, as well as the LLM model weights report, was funded by philanthropic gifts to RAND, including from Open Philanthropy.

Open Philanthropy has also given $100 million to another big security-focused think tank, the Georgetown Center for Security and Emerging Technology (where former OpenAI board member Helen Toner is director of strategy and foundational research grants).

Anthropic CISO Jason Clinton spoke at the recent EA-funded “Existential InfoSec Forum” in August, “a half-day event aimed at strengthening the infosec community pursuing important ways to reduce the risk of an existential catastrophe.”

Clinton runs an EA Infosec book club with fellow Anthropic staffer Wim van der Schoot that is “directed to anyone who considers themselves EA-aligned” because “EA needs more skilled infosec folk.”

Effective altruism wants people to consider information security as a career: According to 80,000 Hours, a project started by EA leader William MacAskill, “securing the most advanced AI systems may be among the highest-impact work you could do.”

No surprise that EA and AI security are connected

When I followed up with Nevo for additional comment about EA connections to RAND and his Meselson Center, he suggested that I shouldn’t be surprised that there are so many EA connections in the AI security community.

Until recently, he said, the effective altruism community was one of the primary groups of people discussing, working on, and advocating for AI safety and security. “As a result, if someone has been working in this space for a significant amount of time, there’s a decent chance they’ve interacted with this community in some way,” he said.

He added that he thought the Politico article was frustrating because it is “written with a conspiratorial tone that implies RAND is doing something inappropriate, when in fact, RAND has provided research and analysis to policy makers and shapers for many decades. It is literally what we do.”

Nevo stated that neither he nor the Meselson Center “were directly involved nor were we aware of the EO.” Their work didn’t affect the security rules in the EO, “although we believe it may have indirectly influenced other non-security parts of the EO.” He added that the EO’s provisions on securing model weights were already part of the White House Voluntary Commitments “that had been made months before our report.”

While there is little information online about the Meselson Center, Nevo pointed out that RAND has dozens of small and large research centers. “Mine is not only the youngest center at RAND, but also one of the smallest, at least for now,” he said. “Work so far has focused on pathogen agnostic bio surveillance, DNA synthesis screening, dual-use research of concern, and the intersection of AI in biology.” The center currently engages a handful of researchers, he said, but “has funding to ramp up its capacity…we have been sharing more and more about our center internally and hope to stand up the external web site very soon.”

Do we need effective altruism on that wall?

Does any of this EA brouhaha really matter? I think of Jack Nicholson’s famous speech in the movie “A Few Good Men” that included “You want me on that wall…you need me on the wall!” If we really need people on the AI security wall, and a majority of organizations are affected by a long-term cybersecurity skills shortage, does knowing their belief system really matter?

To me and many others seeking transparency from Big Tech companies and policy leaders, it does. As Politico’s Brendan Bordelon makes clear in another recent piece on the sprawling network of EA influence in DC policy circles (yep, I missed it), these are issues that will shape policy, regulation and AI development for decades to come.

The US AI policy expert I chatted with a couple of days ago mused that policy people don’t tend to think of AI as an area where there are ideological agendas. Unfortunately, he added, “they are wrong.”

Reprinted from Sharon Goldman
