Saturday, November 8, 2025

The Week AI Changed Science Forever -- Launch of AI Researcher and AI Data Scientist




In one breathtaking week, four announcements signaled a new era for artificial intelligence, and for humanity itself.

Edison Scientific unveiled Kosmos, an autonomous AI scientist that works 12-hour research shifts and makes real scientific discoveries.
At the same time, Microsoft’s AI chief Mustafa Suleyman revealed plans for a Humanist Super Intelligence, designed not to replace humans, but to serve them.
Google quietly dropped DS Star, an autonomous data scientist that writes, tests, and fixes its own Python code.
And from across the globe, China’s Moonshot AI launched Kimi K2 Thinking, an open source reasoning model that can plan and think across hundreds of steps.

All of this — in just a few days.
Let’s unpack what it means.


🌌 Kosmos: The AI That Actually Does Science

Meet Kosmos, the project shaking up research labs everywhere.

Built by Edison Scientific (not to be confused with Microsoft Research’s Kosmos family of multimodal models), Kosmos is billed as an AI scientist that conducts research from start to finish, autonomously.
You give it a dataset and a goal (say, analyzing brain scans or studying new materials), and it goes into a 12-hour deep dive.

In that time, it:

  • Reads 1,500+ research papers

  • Writes ~40,000 lines of Python code

  • Runs analyses and tests hypotheses

  • Produces a full research report with citations and executable code

No human steering it midway — just pure autonomous science.

And the results? Stunning.
Kosmos has already made new discoveries in biology, neuroscience, and clean energy:

  • It revealed how cooling protects the brain by triggering an energy-saving mode in neurons.

  • It discovered that high humidity destroys perovskite solar cells during manufacturing — later confirmed by human scientists.

  • It even found a shared wiring rule across species — from flies to humans — suggesting all brains might follow the same mathematical pattern.

That’s not all. Kosmos identified a heart-protecting protein (SOD2), found a diabetes-resisting DNA variant, and mapped the exact moment neurons collapse in Alzheimer’s disease.

How Kosmos Works

Kosmos runs on a swarm of AI agents, each with a specific role — paper reading, data analysis, coding, and hypothesis testing — all linked by a shared World Model, a collective memory that tracks context and progress.

Think of it as a brain made of sub-brains, coordinating long, multi-step scientific investigations.
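
To make the "shared World Model" idea concrete, here is a minimal sketch of role-specific agents reading from and writing to one shared memory. It is not Kosmos’ actual code: the `WorldModel` class, the two agent functions, and `run_cycle` are hypothetical names invented for this illustration, and the agents are trivial stand-ins for real paper-reading and analysis steps.

```python
from dataclasses import dataclass, field


@dataclass
class WorldModel:
    """Shared memory that every agent reads and writes (hypothetical design)."""
    goal: str
    findings: list[str] = field(default_factory=list)

    def record(self, role: str, note: str) -> None:
        # Tag each note with the role that produced it so progress is traceable.
        self.findings.append(f"{role}: {note}")


def literature_agent(world: WorldModel) -> None:
    # Stand-in for paper reading; a real agent would search and summarize papers.
    world.record("literature", f"collected background on {world.goal}")


def analysis_agent(world: WorldModel) -> None:
    # Stand-in for data analysis; it can see what earlier agents recorded.
    world.record("analysis", f"ran analysis using {len(world.findings)} prior notes")


def run_cycle(world: WorldModel) -> WorldModel:
    # Each agent takes a turn and leaves its results in the shared memory.
    for agent in (literature_agent, analysis_agent):
        agent(world)
    return world


world = run_cycle(WorldModel(goal="neuron cooling"))
print(world.findings)
```

The point of the design is that agents never talk to each other directly. Everything goes through the shared record, which is what lets a long run keep its context.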

In independent reviews, roughly 80% of Kosmos’ findings were judged scientifically accurate, a high rate for a fully autonomous system.
One 12-hour Kosmos run reportedly produced the equivalent of six months of human research output.

Still, Kosmos isn’t perfect. It struggles with messy datasets and can’t yet process files larger than 5GB. And it can’t change course mid-run — once it starts, it commits.
But the biggest challenge? Judgment. Teaching an AI to know which discoveries matter.

Even so, this marks a historic moment: AI is now conducting real, verifiable research.


🤝 Microsoft’s Humanist Super Intelligence

While Kosmos pushes the boundaries of AI research, Microsoft’s Mustafa Suleyman is charting a different path — toward Humanist Super Intelligence (HSI).

This isn’t about building an AGI that replaces humans.
It’s about creating a super-intelligent system that serves them.

Suleyman describes it as a bounded, values-driven AI, designed to stay contextual, controllable, and subordinate.
A kind of deeply integrated AI companion — one that helps people learn, create, and think more clearly, while remaining ethically constrained.

Microsoft’s approach contrasts sharply with OpenAI and Anthropic’s open-ended AGI ambitions.
In Suleyman’s words: “Humans matter more than AI.”

With Microsoft now legally able to develop AGI independently using OpenAI’s IP, this philosophical divide could soon define the next great AI rivalry.


🧠 Moonshot AI’s Kimi K2: The Reasoning Machine

Meanwhile, in China, Moonshot AI is taking open source reasoning to a new level.

Their new model, Kimi K2 Thinking, doesn’t just generate text — it thinks, plans, and executes code across hundreds of reasoning steps without human help.

It scored:

  • 40.9% on Humanity’s Last Exam (expert-level interdisciplinary benchmark)

  • 60.2% on BrowseComp (research and browsing tasks) — double the human average

  • 71.3% on SWE-bench Verified (software engineering benchmark)

That’s not just incremental progress — it’s a leap.

In one demo, K2 solved a PhD-level hyperbolic geometry problem, performing 23 nested reasoning loops, running code, and verifying results until it derived the correct formula.

In another, it identified an actor from a vague description — parsing 20+ web sources, combining biographical clues, and assembling the answer.

This ability to reason across long horizons — chaining 300+ tool calls — represents a new frontier in AI.
Moonshot’s bet is that open source reasoning can rival (or even surpass) proprietary Western models.
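
For readers wondering what "chaining tool calls" looks like in practice, here is a rough sketch of the loop. It is an illustration under stated assumptions, not Moonshot’s implementation: the `TOOLS` registry, `scripted_policy`, and `run_agent` are hypothetical names, and a hard-coded policy stands in for the model so the example runs on its own.

```python
from typing import Callable

# Hypothetical tool registry; real systems expose search, code execution, etc.
TOOLS: dict[str, Callable[[str], str]] = {
    "add_one": lambda x: str(int(x) + 1),
    "double": lambda x: str(int(x) * 2),
}


def scripted_policy(history: list[tuple[str, str]]) -> tuple[str, str]:
    """Stand-in for the model: pick the next action from the history so far."""
    if not history:
        return ("add_one", "3")
    last_result = history[-1][1]
    if len(history) == 1:
        return ("double", last_result)
    return ("final", last_result)


def run_agent(policy, max_steps: int = 300) -> tuple[str, int]:
    """Call tools until the policy declares a final answer or the budget runs out."""
    history: list[tuple[str, str]] = []
    for _ in range(max_steps):
        action, arg = policy(history)
        if action == "final":
            return arg, len(history)
        # Run the chosen tool and feed its result back into the history.
        history.append((action, TOOLS[action](arg)))
    raise RuntimeError("step budget exhausted before a final answer")


answer, calls = run_agent(scripted_policy)
print(answer, calls)
```

The step budget (`max_steps`) is the interesting knob. A long-horizon model is one that can keep making useful choices as that history grows into the hundreds of entries.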


🧩 Google’s DS Star: The Autonomous Data Scientist

Then there’s Google.

Their new system, DS Star, might quietly revolutionize enterprise analytics.
If Kosmos is an AI researcher, DS Star is an AI data scientist that turns messy real-world data into clean Python insights — all by itself.

Most AI tools require clean SQL databases. DS Star? It thrives in chaos:
CSVs, JSON logs, random spreadsheets, unstructured reports — bring it on.

You can ask it a question like:

“Which products performed best in Q3 based on sales and reviews?”

And DS Star will:

  1. Find the relevant files

  2. Write and test the Python code

  3. Debug its own errors

  4. Return the correct analysis

It uses a six-agent loop — one reads data, another plans, another codes, a verifier checks, a router fixes issues, and a finalizer packages the output.

If the code fails, it repairs itself automatically by studying the logs.
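
The write, run, and repair cycle described above can be sketched in a few lines. This is a simplified illustration, not Google’s code: it collapses the six agents into three hypothetical functions (`run_candidate`, `repair`, `analyze`), and the "repair" step is a hard-coded string fix standing in for a model that reads the error log.

```python
import traceback


def run_candidate(source: str, data: dict) -> tuple[bool, object, str]:
    """Execute generated code; return (ok, result, error log)."""
    scope = {"data": data}
    try:
        exec(source, scope)
        return True, scope["result"], ""
    except Exception:
        return False, None, traceback.format_exc()


def repair(source: str, log: str) -> str:
    """Stand-in for the coder agent rewriting code after reading the log."""
    if "KeyError" in log:
        # Hypothetical fix: the column is named 'sales', not 'revenue'.
        return source.replace("'revenue'", "'sales'")
    return source


def analyze(data: dict, source: str, max_rounds: int = 3) -> tuple[object, int]:
    """Run the code, and on failure repair it and try again."""
    for round_no in range(1, max_rounds + 1):
        ok, result, log = run_candidate(source, data)
        if ok:
            return result, round_no
        source = repair(source, log)
    raise RuntimeError("could not produce working code")


data = {"sales": {"widget": 120, "gadget": 340}}
first_draft = "result = max(data['revenue'], key=data['revenue'].get)"
best, rounds = analyze(data, first_draft)
print(best, rounds)
```

Here the first draft fails on a wrong column name, the log drives the fix, and the second attempt succeeds. A production system would run generated code in a sandbox rather than with a bare `exec`.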

Powered by Gemini 2.5 Pro, DS Star reportedly outperforms other data reasoning systems on major benchmarks, including a 30-point leap on DABStep, a benchmark for real-world data analysis.

Even more impressive, it’s model-agnostic — meaning the same architecture could work with GPT-5 or Claude 4.5.

In essence, AI no longer just assists the analyst — it is the analyst.


⚙️ The New AI Frontier: Long-Horizon Thinking

The thread connecting Kosmos, K2, and DS Star is clear:
AI systems are evolving from reactive assistants into autonomous thinkers.

They plan, code, reason, verify, and self-correct — traits once thought uniquely human.

The next frontier won’t be about larger models.
It’ll be about how long and coherently an AI can think before it loses focus. Researchers call the broader idea test-time scaling: spending more compute while answering a question rather than training a larger model.

That’s the new battleground for AI supremacy.


🚀 The Takeaway

In just one week, we’ve seen:

  • Kosmos show that AI can do real science

  • Google show that AI can analyze messy data autonomously

  • China demonstrate that open-source reasoning can rival the world’s best

This isn’t hype anymore — it’s happening.
AI isn’t just assisting human intelligence; it’s beginning to extend it.

We’re entering the era where AI doesn’t just help the process — it is the process.

Wild times, indeed.


What do you think — should AI be trusted to conduct science independently?
Drop your thoughts in the comments below.

If you enjoyed this deep dive, share it — and follow for more explorations at the edge of AI and human creativity.

Addendum

What is Microsoft Kosmos?

Microsoft Kosmos (Knowledge-based Operating System for Modeling Scientific knowledge) refers to a series of multimodal large language models (MLLMs) developed by Microsoft Research and, in a related but distinct effort, an AI system developed by Edison Scientific designed for scientific research.

Microsoft Research Kosmos Series

These models are designed to understand and process information from multiple modalities, including language, images, and potentially audio, enabling capabilities beyond traditional text-only models.

  • Kosmos-1: The foundational model, introduced by Microsoft Research, can perceive images and language, perform in-context learning, reason, and generate content. It handles tasks like visual question answering (VQA), image captioning, and Optical Character Recognition (OCR)-free text processing.

  • Kosmos-2: Building on Kosmos-1, this model introduced the ability of multimodal grounding and referring. It can link specific text spans (like noun phrases) in a caption directly to corresponding regions (using bounding boxes) within an image, essentially creating "invisible hyperlinks" between text and pixels. This allows for more precise human-AI interaction and visual responses.

  • Kosmos-2.5: This version is a "multimodal literate model" specifically designed for machine reading and understanding of text-intensive images such as academic papers, receipts, and web pages. It excels at generating spatially-aware text blocks (with coordinates) and structured text in markdown format, performing on par with larger models like GPT-4o on document understanding benchmarks.

Edison Scientific Kosmos AI System

A separate, recent development, this system is described as an "AI scientist" designed for deep scientific research workloads, not general chat. It operates using "structured world models" and runs hundreds of smaller AI agents in sync. It can ingest thousands of papers and data sets to perform complex analyses, generate hypotheses, and produce traceable reports with citations and code references with high accuracy.

Source: Gemini