Beneficial Artificial Intelligence

AI systems are now, by many measures, outperforming humans on tasks ranging from drug development to gaming to visual art. AI researchers and top technology investors expect this progress to continue. A 2022 survey of over 4,000 researchers publishing in NeurIPS and ICML (top AI conferences) found that respondents, in aggregate, consider it 50% likely that unaided machines will be able to accomplish every single task better and more cheaply than human workers by 2059.

Improvements in AI have enormous potential to better our world: increasing the speed and accuracy of medical diagnoses, reducing traffic accidents by making autonomous vehicles possible, facilitating personalised education, accelerating the development of sustainable energy, and more. But the risks of transformative AI are high. On average, respondents to the above-mentioned survey of top AI researchers said that when unaided machines can accomplish every task better and more cheaply than human workers, the effect is 14% likely to be “extremely bad (e.g. human extinction)”.

To mitigate these risks, we need to determine how to build AI systems that reliably pursue their users’ goals. Moreover, we need to achieve sufficient coordination between companies and governments to ensure that these solutions are broadly implemented and that the long-term trajectory of our world is not determined by a malicious or reckless actor.

Are you a major philanthropist seeking to learn more about these areas? Get in touch with our Founder & President Natalie Cargill at natalie.enquiries@longview.org.

Featured Grants

Center for Human-Compatible Artificial Intelligence

Training at the first academic centre dedicated to AI safety

As the first centre of its kind, the Center for Human-Compatible AI at UC Berkeley is a landmark in AI safety field-building efforts. Led by Stuart Russell, co-author of one of the most widely used textbooks on AI, the centre has played a key role in establishing AI alignment as a problem worthy of substantial attention. The centre also trains PhD students to work on AI safety. Since its founding in 2016, it has had an excellent track record of placing PhD students in the very top academic departments and AI companies.

Model Evaluation & Threat Research

Testing for dangerous capabilities before release

How would we know if the most powerful AI systems pose catastrophic dangers to the public before they are released on the internet? We currently have only very weak methods for finding out, which makes it difficult for companies and governments to understand when AI systems are safe and makes every new release an increasing public risk. METR, based at the Alignment Research Center, is developing tools to better evaluate the safety of powerful AI systems. They were granted early access by OpenAI and Anthropic to evaluate the safety of GPT-4 (made famous by ChatGPT) and Claude before they were released, and will continue to collaborate with frontier laboratories to ensure their systems do not pose society-scale risks.

Interpretability at Harvard University

Looking inside modern AI systems

Frontier AI models are a black box. Today we have almost no ability to understand why the most powerful AI systems make the decisions that they do, which makes these systems unpredictable and difficult to align with human values. A new lab at Harvard University is trying to change that through its work on “mechanistic interpretability”, a field of research that aims to understand how neural networks function internally. With our recommended grant, Harvard researchers will advance their work on locating specific concepts and facts within a neural network.