Showing posts with label Generative AI. Show all posts
Showing posts with label Generative AI. Show all posts

Friday, May 29, 2026

How does Google Gemini Client accesses the API Key?

See All on GenAI    « Previously

Gemini said

When you instantiate genai.Client() without passing any arguments, the Google GenAI SDK automatically looks for your API key in your system's environment variables.

Specifically, it checks for these two variables:

  1. GEMINI_API_KEY (Recommended)

  2. GOOGLE_API_KEY (Supported as a fallback or for legacy compatibility)

If it finds either of these variables, it automatically injects the key into the client. If both happen to be set, GOOGLE_API_KEY takes precedence.

Why do it this way?

This is a standard security best practice. By relying on environment variables rather than explicitly passing the key like genai.Client(api_key="YOUR_KEY"), you avoid hardcoding sensitive credentials directly into your source code, preventing accidental leaks if you push your code to a public repository like GitHub.

How to set the environment variable

To make this code work on your machine, you need to set the variable before running your script.

On Mac/Linux (Terminal):

Bash
export GEMINI_API_KEY="your_api_key_here"
python your_script.py

On Windows (Command Prompt):

DOS
set GEMINI_API_KEY=your_api_key_here
python your_script.py

Alternatively, in Python (using python-dotenv): If you are using a .env file to manage your secrets, you can load it before calling the client:

Python
import os
from dotenv import load_dotenv
from google import genai

# This loads the GEMINI_API_KEY from your .env file into the environment
load_dotenv() 

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3.5-flash",
    contents="Explain how AI works in a few words"
)
print(response.text)

See All on GenAI    « Previously

DeepSeek Premium -- What Paying Users Get

See All on GenAI    « Previously    Next »

DeepSeek Premium: What Paying Users Get (May 2026)

May 19, 2026 — 4 min read

DeepSeek's free tier is generous, but serious development and enterprise use cases require premium access. Here's exactly what you get when you pay.

The Short Answer

DeepSeek does not offer a monthly "Pro" subscription for individual users [citation:2]. Instead, premium access comes through API pay-per-token pricing. Paying gives you programmatic access, more powerful models (V4-Pro), higher reliability, and enterprise-grade features that free users cannot access at all [citation:9].

Feature Free Tier (Web/App) Premium (API Access)
API Access No API key available [citation:9] Full REST API with key management [citation:2][citation:9]
Available Models V4-Flash only (with optional "expert mode" toggle for Pro) [citation:2] V4-Flash + V4-Pro (1.6T total parameters, 49B activated) [citation:1][citation:3]
Context Window 1M tokens [citation:2] 1M tokens (same) [citation:3]
Max Output Tokens ~8K tokens (MoE-16B limitation) [citation:9] 384K tokens [citation:4]
Usage Limits 30 questions/day + Token-based throttling [citation:9] Pay-as-you-go, no hard caps [citation:7]
Image Upload Can upload images for OCR/text extraction [citation:2] Native vision understanding (V4-Pro) [citation:9]
SLA / Reliability Best-effort, IP rate limiting [citation:2] Dedicated channels, higher throughput [citation:9]
Enterprise Features None Private deployment, audit logs, compliance [citation:8]

> What Premium Unlocks (The Details)

> 1. API Access & Automation

Free users cannot generate an API key at all [citation:9]. The "API Management" menu remains grayed out, and any direct curl request returns HTTP 401 Unauthorized [citation:9]. Premium API access enables integration with CI/CD pipelines, IDE plugins, local CLI tools, and any automated workflow [citation:2]. This is not a "nice to have" — it's an absolute requirement for programmatic use.

> 2. V4-Pro Model (1.6T Parameters)

Free tier defaults to the smaller DeepSeek-MoE-16B model, which caps responses at ~8K tokens and struggles with complex reasoning [citation:9]. Premium API users can call deepseek-v4-pro, a 1.6 trillion parameter MoE model with 49 billion activated parameters per forward pass [citation:1][citation:3]. In benchmarks, V4-Pro matches or exceeds GPT-5.4 and Claude Opus 4.6 on coding and STEM tasks, while staying vastly cheaper [citation:1][citation:3].

> 3. 384K Output Tokens (vs ~8K on Free)

This is one of the most dramatic differences. Free tier models cut off around 8,000 output tokens — fine for short replies, but useless for generating long-form content [citation:9]. Premium V4 models support up to 384,000 output tokens per request [citation:4]. That's enough to generate an entire novella in a single API call.

> 4. Native Vision Understanding

Free tier allows image uploads, but only for OCR/text extraction — the model reads text from images but cannot "see" what's in them [citation:2]. Premium V4-Pro includes native multimodal understanding, meaning you can upload screenshots, charts, or contract photos and ask contextual questions about the visual content itself [citation:9].

> 5. Higher Reliability & No Throttling

Free users face IP-based rate limiting, fair-use throttling during peak hours, and opaque Token-based usage caps [citation:2][citation:9]. One user reported getting cut off after 650K tokens even though they had 25 remaining "question slots" [citation:9]. Premium API access removes these guesswork limits — you pay per token, and DeepSeek serves every request on a best-effort basis with no hard concurrency caps [citation:7]. Enterprise customers can negotiate dedicated throughput and SLA guarantees [citation:8].

> 6. Enterprise Add-Ons (Private Deployment, Compliance)

For organizations, DeepSeek's premium enterprise tier adds private deployment (on-prem or VPC), SM4 encryption (China's national cipher standard), model audit trails, and compliance with GDPR, HIPAA, and China's Class 2.0 security standards [citation:8]. This is available through custom enterprise agreements, not standard API pricing.

Premium Pricing (May 2026)

Model Input (per 1M tokens) Output (per 1M tokens) Note
V4-Flash (API) $0.14 $0.28 Low-cost, fast responses [citation:4]
V4-Pro (promo) $0.435 $0.87 75% off through May 31, 2026 [citation:4]
V4-Pro (regular) $1.74 $3.48 After promo ends [citation:3][citation:4]

Every new developer account also receives 5 million free tokens for the first 30 days — no credit card required [citation:2][citation:4]. That's roughly $8.40 worth of V4-Flash usage, enough to run 2,500-5,000 test calls [citation:4]. Off-peak discounts (16:30 to 00:30 UTC) can further reduce costs by 50-75% [citation:7].

Who Should Pay?

  • > Individual developers building apps, scripts, or IDE plugins → API required (no free tier access) [citation:9]
  • > Teams needing long-form generation (reports, codebases, book-length output) → 384K output tokens are only available via API [citation:4]
  • > Anyone needing vision understanding (not just OCR) → V4-Pro API required [citation:9]
  • > Businesses with compliance requirements → Enterprise tier with private deployment [citation:8]
  • > Casual users just chatting or summarizing docs → Free web/app tier is completely sufficient [citation:2]

Bottom Line

DeepSeek's free tier is genuinely useful — unlimited chats, 1M context, file uploads [citation:2]. But premium access is not about "more features." It's about a different category of use: automation, long outputs, stronger reasoning, and enterprise-grade reliability. At $0.14-$1.74 per million input tokens (with a 5M free trial), the barrier to entry is remarkably low compared to Western alternatives that cost 35-100x more per token [citation:4].

If you need programmatic control or V4-Pro's reasoning power, the API is the only path. For everything else, the free web chat remains one of the best deals in AI.

Pricing data as of May 19, 2026. Promotional rates valid through May 31, 2026. All prices in USD. Source: DeepSeek official API documentation and verified third-party trackers.

See All on GenAI    « Previously    Next »

Million-Token Milestone -- Comparing GPT, Gemini, Claude, and DeepSeek

See All on GenAI    « Previously    Next »

The Million-Token Milestone: Comparing GPT-5.5, Gemini 3.1, Claude Opus 4.6, and DeepSeek-V4

May 19, 2026 — 8 min read

All major AI models now support 1M+ token context windows, but pricing and output limits tell a very different story. Here is how OpenAI, Google, Anthropic, and DeepSeek stack up.

The Context Window War Is Over (For Now)

For the past two years, AI labs have been racing to expand how much text a model can "remember" at once. In 2026, that race reached a new equilibrium: all four frontier models now offer a 1 million token context window as a standard feature. That is roughly the length of all three The Lord of the Rings books combined, or about 750,000 English words in a single conversation.

But while the headline number looks the same, the real differences hide in three places: pricing, output length, and multimodality. The table below breaks down exactly what each provider gives you for your dollar (or for free).

Provider Model (Latest) Context Window Max Output Tokens Modalities Pricing (Input / Output per 1M tokens)
OpenAI GPT-5.5 1M+ tokens 272K tokens Text + Images $2.50 / $10.00
Google Gemini 3.1 Pro Up to 1M tokens 64K tokens Text, Images, Audio, Video $1.75 / $5.25
Anthropic Claude Opus 4.6 1M tokens (standard) 128K tokens Text, Images, PDFs $3.00 / $15.00
DeepSeek DeepSeek-V4-Pro 1M tokens (standard) 384K tokens Text-only $0.27 / $1.10

What The Table Does Not Show (But Matters More)

> DeepSeek's 384K output advantage

Most models cut you off after 64K–128K generated tokens. DeepSeek-V4-Pro lets you generate up to 384K tokens in a single response — almost four times more than GPT-5.5. For use cases like translating entire book chapters, generating long-form reports, or writing full codebases, this is a game-changer.

> Claude's pricing reset

Until early 2026, Anthropic charged a premium multiplier once you exceeded 200K tokens. With Opus 4.6, the 1M window is now available at standard pricing — no surprise fees. At $3/$15 per million tokens, Claude remains the most expensive of the group, but you no longer pay extra for long conversations.

> Gemini's native video understanding

OpenAI and Claude can see images. Google's Gemini 3.1 Pro goes further: it processes audio and video natively within the 1M context window. You can upload a 45-minute lecture video and ask for timestamps, summaries, or specific quotes. No other model on this list offers that.

> DeepSeek's disruptive pricing

At $0.27 per million input tokens, DeepSeek is roughly 10x cheaper than GPT-5.5 and 11x cheaper than Claude Opus 4.6. For high-volume applications — think log analysis, document processing pipelines, or RAG over large codebases — the cost difference becomes massive. The tradeoff: no image recognition and a less mature ecosystem.

Which Model Should You Choose?

Use this quick decision matrix based on your primary constraint:

  • > Cheapest for massive volume → DeepSeek-V4-Pro (text-only, huge output limit)
  • > Best multimodal (video + audio) → Gemini 3.1 Pro (smaller output limit but unmatched input variety)
  • > Highest quality reasoning + long output → GPT-5.5 (272K output, strong agentic performance)
  • > Long conversations with predictable pricing → Claude Opus 4.6 (standard 1M, best safety fine-tuning)
  • > Generating very long content (books, reports) → DeepSeek-V4-Pro (384K output tokens)

One note on "free tiers": all four models offer limited free access via their web interfaces or API credits, but the 1M context window is generally fully available on paid tiers only. Free versions typically cap at 8K–32K tokens to manage compute costs.

Bottom Line

The 1 million token context window is no longer a differentiator — it is table stakes. The real differentiators in 2026 are output length limits, per-token pricing, and what types of data (video, audio, PDFs, images) a model can see. If you are building for scale, DeepSeek wins on cost. If you need video understanding, Gemini is the only choice. And if you need the most balanced all-rounder with strong output capacity, GPT-5.5 or Claude Opus 4.6 are your picks.

Test with a representative sample of your own data before committing — context window size matters less than how well a model uses that context at the 500K–1M range.

Data compiled from official API documentation and public announcements as of May 19, 2026. Pricing shown for standard pay-as-you-go API tier (USD).

See All on GenAI    « Previously    Next »

Wednesday, May 13, 2026

Generative AI For Everyone (Course @ DeepLearning.AI)

View Course on DeepLearning.AI    View Other Courses Audited By Us    « Previously




Quiz

Week 1A - What is GenAI

Week 1B - GenAI Applications

Week 2A - GenAI Projects - Software Applications

Week 2B - GenAI Projects - Advanced technologies (Beyond prompting)

Week 3A - Generative AI in Business and Society - Generative AI and business

Week 3B - Generative AI in Business and Society - Generative AI and society


View Course on DeepLearning.AI    View Other Courses Audited By Us    « Previously Tags: Agentic AI,Generative AI,