What is this Hybrid Transformer-Mamba Architecture by Tencent?

PLUS: Microsoft's AI Agents for Beginners Course for Free

Daily AI Skills
March 24, 2025

Welcome back to Daily AI Skills.

Here’s what we are covering today:
1. Tencent Hunyuan T1 Reasoning Model
2. OpenAI's New Audio Models
3. NVIDIA GTC 2025 - AI’s Super Bowl

+ JP Morgan’s Python Training Program

What is the first Mamba-Powered Ultra Large Model by Tencent?

Tencent has unveiled Hunyuan T1, a new reasoning model that rivals DeepSeek's R1 in both performance and pricing. It is built on what Tencent describes as the industry's first hybrid Transformer-Mamba architecture for an ultra-large model, a design aimed at greater efficiency.
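
To get a feel for why mixing the two layer types can help with long inputs, here is a toy Python sketch. It is not Tencent's implementation: the scalar "layers", the constants a and b, and the ("ssm", "ssm", "attn") layer pattern are all assumptions made up for illustration, and real Mamba layers use learned, input-dependent parameters. The point is only that a state-space scan carries one fixed-size state forward (work grows linearly with length), while attention looks back at every earlier token (work grows quadratically).

```python
import math


def ssm_scan(xs, a=0.9, b=0.1):
    """Toy state-space layer: h_t = a * h_{t-1} + b * x_t.

    Keeps a single running state, so cost is linear in sequence length.
    The constants a and b are arbitrary illustration values.
    """
    h = 0.0
    out = []
    for x in xs:
        h = a * h + b * x
        out.append(h)
    return out


def causal_attention(xs):
    """Toy scalar self-attention: each position attends to itself and all
    earlier positions, so cost is quadratic in sequence length."""
    out = []
    for t in range(len(xs)):
        scores = [xs[t] * xs[s] for s in range(t + 1)]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(score - top) for score in scores]
        total = sum(weights)
        out.append(sum(w * xs[s] for s, w in enumerate(weights)) / total)
    return out


def hybrid_stack(xs, pattern=("ssm", "ssm", "attn")):
    """Apply a hypothetical interleaving of the two layer types,
    adding each layer's output back to its input (residual connection)."""
    for kind in pattern:
        layer_out = ssm_scan(xs) if kind == "ssm" else causal_attention(xs)
        xs = [x + y for x, y in zip(xs, layer_out)]
    return xs


if __name__ == "__main__":
    print(hybrid_stack([1.0, 2.0, 3.0, 4.0]))
```

In a stack like this, most layers can be the cheap scan, with only a few attention layers paying the full look-back cost.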

Key Highlights:

T1 competes with top models like DeepSeek R1, OpenAI’s o1, and GPT-4.5, particularly excelling in math and Chinese language benchmarks.
Tencent claims it is the first to combine the Transformer architecture, which originated at Google, with Mamba, a state space model architecture developed by researchers at Carnegie Mellon and Princeton.
This hybrid design reportedly doubles processing speed while reducing computational costs, especially for long-text reasoning tasks.
Pricing aligns with DeepSeek’s competitive structure: 1 yuan ($0.14) per million input tokens and 4 yuan ($0.55) per million output tokens.
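
If you want to see what those rates mean for a real workload, here is a quick Python sketch. The two rates come from the pricing above; the function name and the example token counts are made up for illustration, and actual billing may include details not covered here.

```python
# Rates quoted above, in yuan per million tokens.
INPUT_YUAN_PER_MILLION = 1.0
OUTPUT_YUAN_PER_MILLION = 4.0


def estimate_cost_yuan(input_tokens, output_tokens):
    """Rough cost estimate in yuan for one request or batch of requests."""
    input_cost = input_tokens / 1_000_000 * INPUT_YUAN_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_YUAN_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
    cost = estimate_cost_yuan(200_000, 50_000)
    print(f"{cost:.2f} yuan")  # prints "0.40 yuan"
```

Output tokens cost four times as much as input tokens, so long reasoning chains will usually dominate the bill.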

Read the full article by Tencent here

Try out the model here: https://llm.hunyuan.tencent.com/#/chat/hy-t1

OpenAI's New Realistic Audio Models

OpenAI has introduced its next-generation API-based audio models for text-to-speech and speech-to-text, enabling developers to customize AI voices through text prompts while enhancing multilingual speech recognition.

Key Highlights:

The gpt-4o-mini-tts model can adjust its speaking style based on simple text cues, such as "talk like a pirate" or "use a soothing bedtime voice."
The gpt-4o-transcribe and gpt-4o-mini-transcribe models achieve state-of-the-art accuracy and reliability in speech-to-text tasks, surpassing OpenAI’s previous Whisper models.
OpenAI launched openai.fm, a public demo platform where users can experiment with different AI-generated voice styles.
The models are accessible via OpenAI’s API, with integration support through the Agents SDK for developers building voice-driven AI applications.
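
Here is a minimal Python sketch of what a styled text-to-speech request might look like, using only the standard library. It assumes OpenAI's /v1/audio/speech endpoint with the model, input, voice, and instructions fields, an API key in the OPENAI_API_KEY environment variable, and the "coral" voice; the helper names and output file name are hypothetical. Check OpenAI's current API reference before relying on any of this.

```python
import json
import os
import urllib.request

SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, instructions, voice="coral", api_key=None):
    """Build (but do not send) a text-to-speech request.

    The speaking style is steered by the free-text `instructions` field.
    """
    payload = {
        "model": "gpt-4o-mini-tts",
        "input": text,
        "voice": voice,
        "instructions": instructions,
    }
    key = api_key or os.environ.get("OPENAI_API_KEY", "")
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, instructions, out_path="speech.mp3"):
    """Send the request and save the returned audio bytes to a file.

    Needs network access and a valid API key.
    """
    request = build_speech_request(text, instructions)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())


if __name__ == "__main__":
    synthesize("Welcome aboard, matey!", "Talk like a pirate.")
```

Changing only the instructions string (for example, "Use a soothing bedtime voice.") is enough to try a different delivery with the same text.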

😱 Scary realistic emotions with the new OpenAI audio models!
It even adds more words automatically to the transcript based on the settings.
— Mick.net - Saas, ✈️ Aviation, Nerdy topics 🤓 (@mick__net)
5:40 PM • Mar 20, 2025

Read the full article by OpenAI here

Can This AI Model Outperform Doctors in Cancer Detection?

Scientists have introduced ECgMLP, an AI model that detects endometrial cancer with 99.26% accuracy from microscopic tissue images—far surpassing human specialists and existing automated methods.

Key Highlights:

ECgMLP leverages advanced attention mechanisms to identify cancer cells in microscopic tissue samples, catching details that doctors might overlook.
Traditional human diagnostic accuracy for endometrial cancer ranges from 78% to 81%, significantly lower than the AI’s 99%+ performance.
The model also demonstrated high accuracy in detecting other cancers, including colorectal (98.57%), breast (98.20%), and oral (97.34%).

Microsoft’s AI Agents for Beginners Course

microsoft/ai-agents-for-beginners: 10 Lessons to Get Started Building AI Agents

github.com/microsoft/ai-agents-for-beginners

📩Forward it to people you know who are keeping pace with the changing AI world, and stay tuned for the next edition to stay ahead of the curve!
