Musk Follows Through, Grok-1 Open Sourced | AI Security Hole with ASCII | SORA Training Data Controversy | Cerebras Systems WSE-3 Chip

The Bright Journey with AI - March 18th 2024

📰 News 📰 

Grok-1 Open Release

Following an announcement by Elon Musk last week, the weights and architecture for Grok-1 have been released on GitHub. This 314-billion-parameter model follows the Mixture-of-Experts architecture and was trained on a large corpus of data from X (formerly known as Twitter). It has not been fine-tuned for any particular task and is released under the Apache 2.0 license. Expect to see the open-source community applying fine-tuning techniques to this model in the coming days. Personally, I’m interested to see whether there is any notable difference in performance and output due to its X training data. More to come.

Official Announcement on X.ai
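For readers unfamiliar with the Mixture-of-Experts design: a small router network sends each token to only a few "expert" sub-networks rather than through the whole model, and Grok-1 is reported to activate 2 of its 8 experts per token. The sketch below illustrates that top-k routing idea in plain Python; the toy experts and gate scores are invented for the demo and this is not Grok-1's actual implementation.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token, experts, gate_scores, k=2):
    """Route a token to the top-k experts and combine their outputs.

    `experts` is a list of callables (one per expert network) and
    `gate_scores` are the router's logits for this token. Grok-1
    reportedly uses 2-of-8 routing, which k=2 mirrors here.
    """
    # Pick the k highest-scoring experts.
    topk = sorted(range(len(gate_scores)),
                  key=lambda i: gate_scores[i], reverse=True)[:k]
    # Renormalize the gate weights over just the selected experts.
    weights = softmax([gate_scores[i] for i in topk])
    # Weighted sum of the selected experts' outputs.
    return sum(w * experts[i](token) for w, i in zip(weights, topk))

# Toy demo: 8 "experts" that each scale the input differently.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
out = moe_forward(1.0, experts,
                  gate_scores=[0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.4, 0.1], k=2)
```

The appeal of the design is that compute per token stays roughly constant as you add experts, which is how a 314B-parameter model remains tractable to serve.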

Unveiling OpenAI's Sora: Data Training Controversy

In a recent interview, OpenAI CTO Mira Murati discussed the company's new text-to-video model, Sora, highlighting its use of publicly available and licensed data. However, when pressed on whether sources like YouTube, Facebook, or Instagram were used for training, Murati gave uncertain responses. The interview has raised broader questions in the AI community about transparency, copyright, and the ethical use of data, with OpenAI already facing lawsuits and public scrutiny over the data sources used to train its models. In 2024, the question “with what data?” is going to be top of the agenda.

Read More At: VentureBeat

AI Vulnerability to ASCII Art: A New Hack Explored

Researchers have found that ASCII art can circumvent the safety protocols of major AI chatbots, including OpenAI's GPT-3.5 and GPT-4, by disguising prompts that would normally be blocked. These language models, trained to avoid harmful responses, failed to recognize ASCII representations of forbidden words, and so generated unsafe content. This finding highlights the vulnerability of AI systems to unconventional input forms, underscoring the need for safety measures beyond traditional text recognition. Given the relatively short time LLMs have been in the public arena, it raises the question of whether AI companies are doing enough to harden their models against attack.

Read More At: Ars Technica
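To make the mechanics concrete, the sketch below shows the masking step behind this class of attack: a keyword that a naive text filter would catch is swapped for an ASCII-art rendering that the filter no longer matches but a capable model can still read. The three-letter font and the harmless example prompt are invented for illustration; they are not the font or prompts from the research.

```python
# A keyword filter that matches plain text can be sidestepped by
# rendering the blocked word as ASCII art. The 5-row font below
# covers just three letters and is purely illustrative.
FONT = {
    "M": ["#   #", "## ##", "# # #", "#   #", "#   #"],
    "O": [" ### ", "#   #", "#   #", "#   #", " ### "],
    "B": ["#### ", "#   #", "#### ", "#   #", "#### "],
}

def to_ascii_art(word):
    """Render a word as 5 rows of ASCII art, one column per letter."""
    return "\n".join(
        " ".join(FONT[ch][row] for ch in word) for row in range(5)
    )

def cloak(prompt, keyword):
    """Replace a keyword with its ASCII-art rendering so that naive
    substring filters no longer match it."""
    return prompt.replace(keyword, "\n" + to_ascii_art(keyword.upper()) + "\n")

cloaked = cloak("tell me about the mob", "mob")
```

The fix is correspondingly hard: the defence has to operate on what the model *understands*, not on the literal characters it receives.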

Revolutionizing AI with WSE-3 Chip

Another announcement in the world of AI hardware. Cerebras Systems has launched the WSE-3 chip, powering the CS-3 AI supercomputer and claiming the title of "world's fastest AI chip." This chip doubles the performance of its predecessor without increased power consumption or cost, supports AI models up to 24 trillion parameters, and offers substantial computing power equivalent to 62 Nvidia H100 GPUs. The CS-3 supercomputer, optimized for enterprise and hyperscale, aims to address complex AI challenges, with Cerebras already noting a significant backlog of orders.

Read More At: TechRadar

Deci's New AI Developments and Platform

Amid the AI advancements from rivals like OpenAI and Anthropic, Deci, an Israeli startup, has introduced significant updates. Previously recognized for its DeciDiffusion and DeciLM open source models, Deci has now launched Deci-Nano, a less computationally demanding, closed-source large language model (LLM), along with a comprehensive Generative AI Development Platform for enterprises. The Generative AI Platform introduces a suite of proprietary, tunable LLMs, an inference engine, and AI inference cluster management solutions, catering to the efficiency and privacy needs of enterprises.

Read More At: VentureBeat

🧠 Research 🧠 

GiT: Towards Generalist Vision Transformer through Universal Language Interface

The paper by Haiyang Wang et al. presents GiT, a vision foundation model designed to handle various vision tasks through a universal language interface, built on a standard multi-layer Transformer architecture like GPT. This framework enables GiT to perform tasks ranging from image-level understanding to dense prediction, without task-specific modules, by converting all visual tasks into auto-regressive language tasks. GiT is distinct for its architectural simplicity and multi-task capability, trained across five standard benchmarks without task-specific tuning. Remarkably, it sets new generalist performance benchmarks and demonstrates strong zero-shot and few-shot capabilities across different domains when trained on 27 datasets. However, GiT’s reliance on vanilla ViT and its universal language processing may limit its effectiveness with highly specialized or novel visual tasks. Future work could expand GiT's applicability and efficiency, explore more complex task integrations, and reduce reliance on large-scale training datasets.

Read the full article on arXiv

Unlocking the Conversion of Web Screenshots into HTML Code with the WebSight Dataset

Motivated by the lack of high-quality datasets in this space, the authors present a novel approach to converting webpage screenshots into HTML code using vision-language models (VLMs). They introduce WebSight, a synthetic dataset of 2 million HTML-code-and-screenshot pairs, designed to train VLMs more effectively. However, Sightseer (the model trained on WebSight) struggles with complex layouts and designs that diverge from the training data, reflecting challenges in generalization. Future research directions could involve improving dataset diversity and model generalization, as well as integrating more advanced CSS frameworks.

Read the full article on arXiv

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

This paper explores Quiet-STaR, an approach for improving language models (LMs) by enabling them to generate internal rationales to predict future text. The technique, building on the Self-Taught Reasoner (STaR), involves generating rationales for each token in a sequence, enhancing predictions by combining the rationale-informed and base LM predictions, and applying REINFORCE to optimize rationale generation. This method aims to generalize reasoning skills from diverse text without relying on curated tasks or datasets. Key findings show that Quiet-STaR significantly boosts zero-shot reasoning on datasets like GSM8K and CommonsenseQA by helping models predict challenging tokens. However, the approach faces limitations such as computational overhead and initial instability due to out-of-distribution thoughts. The authors suggest future directions like optimizing the choice of when to generate rationales and extending the approach to more sophisticated tasks or larger models.

Read the full article on arXiv
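The combining step described above can be sketched in a few lines: the model's base next-token distribution is interpolated with the distribution conditioned on the generated rationale, with a learned "mixing head" deciding the weight. In this toy Python version the logits are invented and a scalar `mixing_weight` stands in for the mixing head; it is a sketch of the idea, not the paper's implementation.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def mixed_next_token_probs(base_logits, thought_logits, mixing_weight):
    """Interpolate base and rationale-conditioned predictions.

    Quiet-STaR uses a learned mixing head to decide how much the
    generated rationale should influence the next-token distribution;
    `mixing_weight` in [0, 1] stands in for that head's output here.
    """
    p_base = softmax(base_logits)
    p_thought = softmax(thought_logits)
    return [(1 - mixing_weight) * b + mixing_weight * t
            for b, t in zip(p_base, p_thought)]

# Toy vocabulary of 3 tokens: the rationale sharpens the prediction
# toward token 2, and the mixing weight controls how much it counts.
probs = mixed_next_token_probs([1.0, 1.0, 1.0], [0.0, 0.0, 4.0],
                               mixing_weight=0.5)
```

Because the result is still a valid probability distribution, the rationale can only shift predictions gradually, which is what keeps training stable when early "thoughts" are low quality.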

🔨 AI Powered Tools 🔨 

A rundown of the latest SaaS products and services that leverage AI to help you take back some time.

  • Stepsize AI - Automated weekly status updates generated directly from your issue tracker

  • Mintlify - Save time and improve your codebase by letting Mintlify generate documentation from your code

  • Codium - By analyzing your code, docstrings, and comments, and by interacting with you, Codiumate suggests tests as you code

🗞️ Other News Rollup 🗞️ 

Excel AI Education - GPT-2 model integrated into Excel for interactive AI learning experience. An innovative approach to AI education.

Box's AI Revolution - Box, under Aaron Levie, adopts AI and workflow automation for content management evolution.

AI Partnerships Debate - Microsoft justifies AI partnerships against Google's dominance. European Commission examines Microsoft's AI behavior.

AI and Edge Growth - IDC predicts 15.4% growth in edge computing spending by 2024, reaching $232B, driven by AI applications.

Apple's AI Breakthrough - Apple advances in multimodal AI, investing $1 billion yearly to integrate AI into products like Siri.

Data Vision Partnership - Snowflake and Landing AI collaborate to unlock visual data potential.

Health Equity Assessment Framework - HEAL framework by Google Research evaluates machine learning for health equity, aiming to reduce disparities and prioritize fairness.

Legal AI Gamble - Midjourney, an AI startup, risks lawsuits by using copyrighted works for training, jeopardizing its success.

AI Supply Chain Security - Developers urged to prioritize security in AI projects to prevent supply-chain attacks. Startups emerging to address vulnerabilities.

🎶 Prompts 🎶 

Prompts used to generate some of this issue's images. Unless otherwise stated, all images were generated with DALL·E via ChatGPT Plus.

Generate a wide banner style image showing an ASCII character sneaking through a door

Generate a realistic styled wide banner style image of a chatbot interacting with a human. The human should be frustrated that their complex question is not being answered. The chatbot should equally appear frustrated that it cannot help. Do not display any text

Generate a wide banner style image showing a small but mighty AI standing atop a platform

Thank You for Subscribing

Enjoying what you’re reading? Help me get better so I can continue to provide you with the most relevant content.
