Anthropic’s Claude 3.5 Sonnet beats GPT-4o in most benchmarks

Anthropic has launched Claude 3.5 Sonnet, its mid-tier model that outperforms competitors and even surpasses Anthropic's current top-tier Claude 3 Opus in various evaluations.

Claude 3.5 Sonnet is now accessible for free on Claude.ai and the Claude iOS app, with higher rate limits for Claude Pro and Team plan subscribers. It's also available through the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI. The model is priced at $3 per million input tokens and $15 per...

Hugging Face launches Idefics2 vision-language model

Hugging Face has announced the release of Idefics2, a versatile model capable of understanding and generating text responses based on both images and text. The model sets a new benchmark for answering visual questions, describing visual content, creating stories from images, extracting information from documents, and even performing arithmetic operations based on visual input.

Idefics2 leapfrogs its predecessor, Idefics1, with just eight billion parameters and the versatility afforded by...

Anthropic’s latest AI model beats rivals and achieves industry first

Anthropic’s latest cutting-edge language model, Claude 3, has surged ahead of competitors like ChatGPT and Google's Gemini to set new industry standards in performance and capability.

According to Anthropic, Claude 3 has not only surpassed its predecessors but has also achieved "near-human" proficiency in various tasks. The company attributes this success to rigorous testing and development, culminating in three distinct chatbot variants: Haiku, Sonnet, and...

DeepMind framework offers breakthrough in LLMs’ reasoning

Researchers from Google DeepMind and the University of Southern California have unveiled a breakthrough approach to enhancing the reasoning abilities of large language models (LLMs).

Their new 'SELF-DISCOVER' prompting framework – published this week on arXiv and Hugging Face – represents a significant leap beyond existing techniques, potentially revolutionising the performance of leading models such as OpenAI’s GPT-4 and Google’s PaLM 2.

The framework...

OpenAI releases new models and lowers API pricing

OpenAI has announced several updates that will benefit developers using its AI services, including new embedding models, a lower price for GPT-3.5 Turbo, an updated GPT-4 Turbo preview, and more robust content moderation capabilities.

The San Francisco-based AI lab said its new text-embedding-3-small and text-embedding-3-large models offer upgraded performance over previous generations. For example, text-embedding-3-large achieves average scores of 54.9 percent on the MIRACL...

Microsoft unveils 2.7B parameter language model Phi-2

Microsoft’s 2.7 billion-parameter model Phi-2 showcases outstanding reasoning and language understanding capabilities, setting a new standard for performance among base language models with fewer than 13 billion parameters.

Phi-2 builds upon the success of its predecessors, Phi-1 and Phi-1.5, by matching or surpassing models up to 25 times larger—thanks to innovations in model scaling and training data curation.

The compact size of Phi-2 makes it an ideal playground...

MLPerf Inference v3.1 introduces new LLM and recommendation benchmarks

The latest release of MLPerf Inference introduces new LLM and recommendation benchmarks, marking a leap forward in the realm of AI testing.

The v3.1 iteration of the benchmark suite has seen record participation, boasting over 13,500 performance results and delivering up to a 40 percent improvement in performance.

What sets this achievement apart is the diverse pool of 26 different submitters and over 2,000 power results, demonstrating the broad spectrum of...

Gcore partners with UbiOps and Graphcore to empower AI teams

Gcore has joined forces with UbiOps and Graphcore to introduce a groundbreaking service catering to the escalating demands of modern AI tasks.

This strategic partnership aims to empower AI teams with powerful computing resources on demand, enhancing their capabilities and streamlining their operations.

The collaboration combines the strengths of three industry leaders: Graphcore, renowned for its Intelligence Processing Units (IPUs) hardware; UbiOps, a powerful machine...

Baidu to launch powerful ChatGPT rival

Chinese web giant Baidu is preparing to launch a powerful ChatGPT rival in March.

Baidu is often called the “Google of China” because it offers similar services, including search, maps, email, ads, cloud storage, and more. Baidu, like Google, also invests heavily in AI and machine learning.

Earlier this month, AI News reported that Google was changing its AI review processes to speed up the release of new solutions. One of the first products to be released under...

MLCommons releases latest MLPerf Training benchmark results

Open engineering consortium MLCommons has released its latest MLPerf Training community benchmark results.

MLPerf Training is a full-system benchmark that tests machine learning models, software, and hardware.

The results are split into two divisions: closed and open. Closed submissions are better for comparing like-for-like performance as they use the same reference model to ensure a level playing field. Open submissions, meanwhile, allow participants to submit a...