What do AI companies mean when they claim “openness”?
It should come as no surprise that nearly all AI companies claiming to have “open ecosystems” use the term broadly and not entirely in earnest. What exactly does “openness” in AI mean, and can we trust these (historically not-entirely-scrupulous) tech giants when they claim to participate in open ecosystems?
Many open-source models are built on top of LLaMA, Meta AI's open-source LLM. It’s become a massively popular foundation for researchers, hobbyists, and companies building and exploring ChatGPT-like tools. The release of the more powerful “free and open” Llama 2 in July 2023 caused some confusion; “Why is Meta releasing free open source stuff?” asks one Redditor. “Is Zuck the good guy now?” (Ha ha, not in this lifetime! In fact, back in June, lawmakers questioned whether the "leak" of the original LLaMA was truly accidental, accusing Meta of "doing little" to restrict access to it.)
Though Llama 2 is free to download, modify, and deploy, the license agreement sets significant restrictions on commercial use, prohibiting its use to train other language models and requiring a special license if a developer deploys it in an app or service with more than 700 million monthly active users.
Due to these restrictions, the Open Source Initiative (OSI) determined that Llama 2 does not meet its definition of open source. But moves are already being made to rebrand the “open” of “open source” as “open innovation,” with the nonprofit OpenUK supporting the release because it “demonstrates clearly the potential value of openness, collaboration and democratising AI,” all because Meta’s Acceptable Use Policy (AUP) guides users to “do no harm.” Which…ok, sure, whatever, it’s not as though LLMs weren’t immediately used for immense harm: fuelling a booming nonconsensual deepfake porn market, targeting children with misinformation, and being transformed into malevolent systems optimised for cybercrime and targeted harassment.
The emergence of “open-washing”
This ill-defined and seemingly ever-changing definition of “open” is problematic for several reasons. If we can’t agree on what “open” AI is, then we certainly can’t identify which parts of it to limit or “close” (that is, regulate) in any substantial way. That responsibility falls, de facto, to the handful of big tech companies already in charge. A frightening thought!
This is exactly the point a group of researchers from Carnegie Mellon University, the AI Now Institute, and the Signal Foundation made back in August 2023, when they released a paper observing that the term “openness” is more aspirational than technical and examining how truly “open” Llama 2 and similarly described AI models really are (not very!).
They ultimately warned that “companies have moved to embrace ‘open’ AI as a mechanism to entrench dominance, using the rhetoric of ‘open’ AI to expand market power while investing in ‘open’ AI efforts in ways that allow them to set standards of development while benefiting from the free labour of open source contributors.” These companies rarely reveal what data their models were trained on, making it impossible to reconstruct or fully assess those data sets. And by enticing outside developers to innovate within their own limited ecosystems, they consolidate their power and ultimately reduce competition.
But the problem doesn’t end there; “Marketing around openness and investing in open artificial intelligence systems," write the researchers, "is being leveraged by powerful companies to establish their positions against the growing interest in regulation in the field." This “open-washing” is particularly relevant as companies lobby the EU to relax rules for open-source AI development; back in June, OpenAI successfully got some of its proposed changes into EU AI legislation.
This includes shamelessly exploiting the nuances of, and confusion between, “open” and “open source.” Of course, “openness” doesn’t just refer to open source software, and Meta isn’t the only tech company describing its AI models as, to some extent, “open.”
The bait-and-switch approach to open ecosystems
Microsoft claims an “open-source ecosystem,” but there’s confusion around what data sets its Azure AI is built on (companies generally keep their methods and sources of data collection secret, raising significant privacy questions and accountability concerns when things inevitably go south). Last year, IBM declared it was doing “open ecosystems” but used intentionally murky language, referring instead to an open ecosystem of partners who will probably sell you vendor-locked-in, enterprise-priced tools and platforms. (Since then, IBM has joined the AI Alliance launched at the end of 2023, with partners including Meta. Observers have noted the overwhelmingly male-dominated launch events, “despite the AI Alliance pointing out that diversity is important to ensure AI evolves in sync with societal concerns.”)
These declarations of openness echo previous promises by tech giants to offer open ecosystems: the Google Maps API gained a huge user base and then introduced an exorbitant pricing model, a playbook Google may well follow with its Vertex AI offerings.
OpenAI transitioned from a non-profit research org spewing lofty ideas about ethical AI development to a for-profit Microsoft partner, swiftly reversing its stance on “openness” and hiding all its data. How’s that for “Open” AI? It’s an issue OpenAI now has to face, with multiple copyright lawsuits underway as 2024 begins. (It seems worth mentioning here that CEO Sam Altman’s other company, Worldcoin, exploited and bribed people in countries across Asia, Africa, and South America for digital scans of their eyeballs, prompting probes by the Kenyan government and (valid!) accusations of AI colonialism.)
The importance of open experiments and shared learning in AI
We’re at a stage when everyone should be collaborating and sharing what they’re learning about AI, but this is laughably far from the reality. There’s too much happening behind the scenes as companies fight to come out on top, and not just AI companies: Amazon, Spotify, TikTok, YouTube, all the big players are jumping on the bandwagon. Because AI is a velocity and scale multiplier, its negative impacts will be nearly impossible to pull back unless there’s transparency and accountability via ongoing processes of review and adjustment.
Words are important. They are foundational. As children, we use words to communicate and make sense of the world around us. If we can’t even agree on the most basic definition of “open” in the context of AI, how will we ever figure out how it actually works? What are its biases, intentions, and limitations? Most importantly, who is running it, and what do they hope to achieve? At the risk of sounding too doomerpilled, I pray the answer to this last question isn’t “make rich people richer,” but it’s not looking good.
As Ted Chiang posited in the New Yorker, the AI doomsday scenario isn’t something radical like a Terminator 2-style machine takeover; it’s just more of the same garbage (wealth inequality, environmental destruction), but worse. “It’s AI-supercharged corporations destroying the environment and the working class in their pursuit of shareholder value,” writes Chiang. “Capitalism is the machine that will do whatever it takes to prevent us from turning it off, and the most successful weapon in its arsenal has been its campaign to prevent us from considering any alternatives.”
Mariam Sharia
TECH POLICY WRITER