If an AI company is prominent enough to be included in a safety assessment report for its AI model, that model is already, in one respect, safer than most. This does not mean the report’s index is inaccurate, but the risks of AI extend beyond those benchmarks.
Simply put, too many misuses of AI did not originate directly from those models, or cannot be traced back to them, and several non-central models can do some of the same harm for which the central models are considered risky, which makes a focus on the major companies limited. The question is: in what ways has AI been misused, and what are the sources of those misuses beyond the well-known models of the major companies?
If the major models are safer, say they earn an A or a B, what happens with the rest of the pack, where the risks remain?
AI is not oil, and AI companies are not pharmaceutical companies. The closest comparison for AI risk outside the digital world is air, for its ability to diffuse and the difficulty of locating its origin. Assessing AI models as if they were solid or liquid products misses the many other ways in which vulnerabilities persist.
The AI model that has caught the most flak in recent weeks is Character.AI, over dangerous outputs to young users and the ensuing lawsuits. Still, the company continues to update its safety terms [How Character.AI Prioritizes Teen Safety], alongside several disclaimers on the use of its chatbot. While it might be characterized as unsafe, at least for certain users, the volume of misuses and errors from other AI models in the news in recent months has exceeded that of this one alone.
There are social media platforms where people share fake videos, audio, and images of others, which can be deeply problematic. There are texts carrying false information that get shared widely, and several other capabilities of AI with real consequences, originating from unidentified models.
Might there be a way to trace back most AI outputs, whether text, video, image, or audio, especially when AI misrepresents the physical world or causes harm? Tracing could at least be a way to measure the safety of the platforms where AI outputs are displayed, such as social media, search engines, and app or play stores. An index would then cover not only AI models but also the platforms on which their outputs appear.
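As a minimal sketch of what such tracing could look like, assuming a shared provenance registry that generating models write to and platforms query before display. Every name here (ProvenanceRegistry, register_output, trace_output, the model identifier) is hypothetical, used only to illustrate the idea, not any existing standard or API.

```python
# Hypothetical sketch: a shared registry keyed by content hash, so a platform
# can ask "which model produced this?" before displaying an output.
import hashlib
from typing import Optional


class ProvenanceRegistry:
    """Toy stand-in for a shared provenance store keyed by content hash."""

    def __init__(self) -> None:
        self._records: dict[str, str] = {}

    def register_output(self, content: bytes, model_id: str) -> str:
        """Called by the generating model: record which model produced the content."""
        digest = hashlib.sha256(content).hexdigest()
        self._records[digest] = model_id
        return digest

    def trace_output(self, content: bytes) -> Optional[str]:
        """Called by a platform before display: return the source model, if known."""
        return self._records.get(hashlib.sha256(content).hexdigest())


if __name__ == "__main__":
    registry = ProvenanceRegistry()
    output = b"synthetic image bytes"
    registry.register_output(output, model_id="example-model-v1")
    print(registry.trace_output(output))      # "example-model-v1"
    print(registry.trace_output(b"unknown"))  # None: an untraceable output
```

An exact-hash lookup like this breaks as soon as content is re-encoded, cropped, or paraphrased; a workable version would need robust watermarks or perceptual fingerprints, which is precisely the hard, unsolved part.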
Privacy would be a concern, but the harm, and the open-ended risk, could spiral if there is no possibility of a technical trace-back of some sort, or at least a rough technical reconstruction of how the misuse occurred.
There is also a need to penalize AI models with something they can register, whether language, usage, or compute capability, especially after they output something dangerous. This would parallel the way human intelligence is kept safe by human affect.
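A toy sketch of such a penalty loop, assuming a harm classifier already exists. The names here (UsageBudget, flag_harmful, the token limits and severity threshold) are illustrative assumptions, not a description of how any current model is governed.

```python
# Hypothetical sketch: a capability budget that shrinks whenever an output is
# flagged, so the model "registers" a dangerous output as reduced usage.


class UsageBudget:
    """Toy capability budget that contracts after flagged outputs."""

    def __init__(self, tokens_per_request: int = 1024) -> None:
        self.tokens_per_request = tokens_per_request

    def penalize(self, severity: float) -> None:
        # Cut the allowed output length in proportion to severity (0.0 to 1.0),
        # never dropping below a small floor.
        self.tokens_per_request = max(64, int(self.tokens_per_request * (1.0 - severity)))


def flag_harmful(output_text: str) -> float:
    """Placeholder harm score; a real system would use a trained classifier."""
    return 0.9 if "dangerous" in output_text.lower() else 0.0


if __name__ == "__main__":
    budget = UsageBudget()
    for text in ["a normal answer", "a dangerous instruction"]:
        severity = flag_harmful(text)
        if severity > 0.5:
            budget.penalize(severity)
        print(text, "->", budget.tokens_per_request, "tokens allowed next")
```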
AI may also need to have moments, where certain outputs become remarkable enough to be remembered as special, in a way that could inform something like trauma if it later outputs something harmful.
These could become major technical research directions for AI companies, which could then be used to measure which models are compliant and which are not, especially regarding their vulnerability to present and future risks.
There are often calls for independent assessment of AI models, but the goal should be to look beyond the visible, central models to what is currently possible with AI from any source, and then to find ways it can be mitigated in general.
FLI AI Safety Index 2024
The index’s focus on major players would be excellent if the problem were limited to major players. How are the major players solving the overall risks of AI from any source? How can misuses of AI be traced? How might technical frameworks be developed to align AI the way human intelligence is aligned by human affect?
These remain questions for the future, even as the safety industry continues to focus on major models without considering that risks and capabilities from other sources are possible.
There is a recent report, the FLI AI Safety Index 2024, which lists among its key findings:
“• Large risk management disparities: While some companies have established initial safety frameworks or conducted some serious risk assessment efforts, others have yet to take even the most basic precautions.
• Jailbreaks: All the flagship models were found to be vulnerable to adversarial attacks.
• Control-Problem: Despite their explicit ambitions to develop artificial general intelligence (AGI), capable of rivaling or exceeding human intelligence, the review panel deemed the current strategies of all companies inadequate for ensuring that these systems remain safe and under human control.
• External oversight: Reviewers consistently highlighted how companies were unable to resist profit-driven incentives to cut corners on safety in the absence of independent oversight. While Anthropic’s current and OpenAI’s initial governance structures were highlighted as promising, experts called for third-party validation of risk assessment and safety framework compliance across all companies.”
There is another recent report, in Reuters, AI safety is hard to steer with science in flux, US official says, stating that, “Policymakers aiming to recommend safeguards for artificial intelligence are facing a formidable challenge: science that is still evolving. AI developers themselves are grappling with how to prevent abuse of novel systems, offering no easy fix for government authorities to embrace, Elizabeth Kelly, director of the U.S. Artificial Intelligence Safety Institute, said on Tuesday. Cybersecurity is an area of concern according to Kelly, speaking at the Reuters NEXT conference in New York. Ways to bypass guard rails that AI labs established for security and other topics, called “jailbreaks,” can be easy, she said. Technology experts are hashing out how to vet and protect AI across different dimensions. Another area regards synthetic content. Tampering with digital watermarks, which flag to consumers when images are AI-generated, remains too easy for authorities to devise guidance for industry, she said.”