CH 59 · Phase 9 · Research & Career Roadmap

AI Engineer Mathematical Roadmap

15–25 minutes · Math · Python
📖 A Short Story

An AI Engineer (industry) and an AI Researcher (academia) need different math. In industry you need to know how to implement a method; in research you have to prove why it works. This chapter lays out a math roadmap for each of the two paths, the areas interviews focus on, and a strategy for continuous learning.

AI Engineer Path (Industry)

Goal: build, optimize, and deploy production models while balancing correctness and efficiency.

Core Math (Must-Know)

  • Linear Algebra: matrix ops, broadcasting, shape reasoning (used daily when debugging).
  • Calculus: gradient intuition, the chain rule, being able to trace backprop.
  • Probability: distributions, expectation, Bayesian thinking (uncertainty quantification).
  • Optimization: Adam, learning rate, batch size, diagnosing convergence.
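
As a rough illustration of the shape reasoning and backprop tracing listed above, here is a minimal NumPy sketch. The sizes `N`, `D`, `H` and the toy loss are assumptions chosen for the example, not part of the roadmap itself.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, H = 32, 8, 4             # hypothetical batch size, input features, hidden units

X = rng.normal(size=(N, D))    # batch of inputs
W = rng.normal(size=(D, H))    # weight matrix
b = np.zeros(H)                # bias, shape (H,)

# Forward pass: (N, D) @ (D, H) -> (N, H); b broadcasts across the batch axis.
Z = X @ W + b

# Toy loss L = (1 / (2N)) * sum(Z**2), so dL/dZ = Z / N.
dZ = Z / N

# Chain rule: each gradient must have the same shape as its parameter.
dW = X.T @ dZ                  # (D, N) @ (N, H) -> (D, H)
db = dZ.sum(axis=0)            # sum over the batch axis -> (H,)

print(Z.shape, dW.shape, db.shape)
```

Checking that `dW.shape == W.shape` and `db.shape == b.shape` is a cheap habit that catches many transposition and broadcasting mistakes early.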

Nice-to-Have (Deeper Impact)

  • Information Theory: cross-entropy, KL divergence (loss design, distillation).
  • Numerical Analysis: stability, mixed precision, gradient scaling (production debugging).
  • Graph Theory: GNN applications (recommendation, fraud detection).
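
The first two items meet in practice: a distillation-style loss needs cross-entropy and KL divergence, and computing them naively can overflow. The sketch below uses a numerically stable log-softmax; the logit values and the names `teacher_logits` and `student_logits` are made up for illustration.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

# Large logits: a naive exp(1000.0) would overflow to inf.
teacher_logits = np.array([1000.0, 999.0, 995.0])
student_logits = np.array([2.0, 1.0, 0.5])

log_p = log_softmax(teacher_logits)   # teacher distribution p
log_q = log_softmax(student_logits)   # student distribution q
p = np.exp(log_p)

entropy = -(p * log_p).sum()          # H(p)
cross_entropy = -(p * log_q).sum()    # H(p, q)
kl = (p * (log_p - log_q)).sum()      # KL(p || q)

print(f"H(p)={entropy:.4f}  H(p,q)={cross_entropy:.4f}  KL={kl:.4f}")
```

The identity H(p, q) = H(p) + KL(p || q) explains why minimizing cross-entropy against a fixed teacher is the same as minimizing the KL divergence.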

AI Researcher Path (Academia/Lab)

Goal: invent new methods, provide theoretical guarantees, publish.

Core Math (Must-Know Deeply)

  • Matrix Calculus: vector-Jacobian products, trace identities, Hessian analysis (CH 48).
  • Probability & Measure Theory: convergence, PAC bounds, concentration inequalities.
  • Optimization Theory: convex analysis, convergence rates, second-order methods (CH 49).
  • Statistical Learning Theory: Rademacher complexity, VC dimension, generalization bounds.

Specialized Depth

  • Variational Inference: ELBO, reparameterization, normalizing flows (generative models).
  • Differential Geometry / Topology: manifold learning, latent space structure.
  • Stochastic Processes: SDEs (diffusion models), Markov chain mixing time.

Interview Math Focus by Role

MLE / AI Engineer (Applied)

  • Shape debugging: given X \in \mathbb{R}^{N \times D}, what are the dimensions of batch norm's mean and variance?
  • Complexity analysis: self-attention O(n^2 d) vs linear attention O(nd^2). When is each one better?
  • Gradient behavior: how would you diagnose vanishing gradients? (Plot the gradient norms.)
  • Loss landscape: the training loss is oscillating. Would you raise or lower the LR?
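
For the complexity question, one way to reason is to compare the leading-order terms directly. The sketch below drops constant factors and ignores memory and hardware effects; the function names and the head dimension `d = 64` are illustrative assumptions.

```python
def attention_costs(n, d):
    """Leading-order operation counts with constant factors dropped."""
    return {"softmax": n * n * d, "linear": n * d * d}

def cheaper_variant(n, d):
    """Name of the variant with the smaller leading-order cost, or 'tie'."""
    costs = attention_costs(n, d)
    if costs["softmax"] == costs["linear"]:
        return "tie"
    return min(costs, key=costs.get)

d = 64  # hypothetical head dimension
for n in (16, 64, 4096):
    print(n, attention_costs(n, d), cheaper_variant(n, d))
```

Under these assumptions the crossover sits at n = d: below it the quadratic term n^2 d is smaller, and above it the nd^2 term wins.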

Research Scientist / PhD

  • Prove that SGD converges for convex Lipschitz functions. What is the rate?
  • The expressive power of the attention mechanism: how would you prove it?
  • Generalization bound: how does it relate model complexity to dataset size?
  • Novel architecture design: what is the inductive bias? Why should it work theoretically?
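
For the first question, a standard form of the answer can be sketched as follows. The setting is an assumption made for this sketch: f is convex, the stochastic gradients g_t are unbiased with E\|g_t\|^2 \le G^2, the starting point satisfies \|x_1 - x^*\| \le R, and the step size \eta is constant.

```latex
% SGD update: x_{t+1} = x_t - \eta g_t, with E[g_t \mid x_t] \in \partial f(x_t).
\begin{align*}
\|x_{t+1} - x^*\|^2
  &= \|x_t - x^*\|^2 - 2\eta \langle g_t, x_t - x^* \rangle + \eta^2 \|g_t\|^2 \\
\mathbb{E}\big[f(x_t) - f(x^*)\big]
  &\le \mathbb{E}\langle g_t, x_t - x^* \rangle
   \le \frac{\mathbb{E}\|x_t - x^*\|^2 - \mathbb{E}\|x_{t+1} - x^*\|^2}{2\eta}
     + \frac{\eta G^2}{2} \\
\mathbb{E}\big[f(\bar{x}_T)\big] - f(x^*)
  &\le \frac{R^2}{2\eta T} + \frac{\eta G^2}{2}
   = \frac{R G}{\sqrt{T}}
   \quad \text{for } \eta = \frac{R}{G\sqrt{T}},
   \qquad \bar{x}_T = \frac{1}{T}\sum_{t=1}^{T} x_t .
\end{align*}
```

The second line uses convexity, the third sums over t so that the distance terms telescope and then applies Jensen's inequality to the averaged iterate. The resulting rate is O(1/\sqrt{T}).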

Structured Learning Plan

  1. Month 1-2: Phase 0-2 (Foundation + Linear Algebra + Calculus). Build a solid base.
  2. Month 3-4: Phase 3-4 (Probability + Optimization). Understand ML theory.
  3. Month 5-6: Phase 5 (Deep Learning Math). Understand the math behind the architectures.
  4. Month 7-8: Phase 6-7 (Advanced). Start reading papers and tracing citations.
  5. Ongoing: Phase 8-9 (Applied + Career). Apply the math to real systems and practice for interviews.

Recommended Resources

  • Books: "Mathematics for Machine Learning" (Deisenroth et al.): free, comprehensive.
  • Books: "Deep Learning" (Goodfellow et al.): Ch 2-5 = math foundation, Ch 6-12 = DL math.
  • Books: "Probabilistic Machine Learning" (Murphy): Bayesian, PGM, inference.
  • Courses: Stanford CS229 (ML theory), CS231n (CNNs), CS224n (NLP), CS234 (RL).
  • YouTube: 3Blue1Brown (Linear Algebra, Calculus), Mathematical Monk (Probability, ML).
  • Blogs: Distill.pub (interactive explanations), Lilian Weng (technical deep dives).

Python: Skill Assessment Checklist

# Self-assessment: can you derive/implement these from scratch?

checklist = {
    "Linear Algebra": [
        "Matrix multiplication (no library)",
        "Eigenvalue decomposition",
        "SVD for dimensionality reduction",
        "Batch matrix ops shape reasoning"
    ],
    "Calculus": [
        "Chain rule for neural network",
        "Gradient of matrix loss w.r.t. weights",
        "Hessian-vector product",
        "Numerical gradient check"
    ],
    "Probability": [
        "MLE for Gaussian parameters",
        "Bayesian posterior update",
        "KL divergence between two Gaussians",
        "Monte Carlo integration"
    ],
    "Optimization": [
        "SGD with momentum",
        "Adam update rule",
        "Learning rate schedule (cosine)",
        "Convergence rate analysis (simple case)"
    ],
    "Deep Learning": [
        "Manual backprop for MLP",
        "CNN forward/backward pass",
        "Attention mechanism from scratch",
        "Loss function design with math"
    ]
}

for category, items in checklist.items():
    print(f"\n{category}:")
    for item in items:
        print(f"  [ ] {item}")
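
To show what "from scratch" can look like for one checklist item, here is a minimal sketch of a numerical gradient check. The linear model, the mean-squared-error loss, and the helper names `loss_and_grad` and `numerical_grad` are assumptions chosen for illustration.

```python
import numpy as np

def loss_and_grad(W, X, Y):
    """Mean squared error of a linear model and its analytic gradient w.r.t. W."""
    R = X @ W - Y                        # residuals, shape (N, K)
    loss = 0.5 * np.sum(R ** 2) / X.shape[0]
    grad = X.T @ R / X.shape[0]          # shape (D, K), same as W
    return loss, grad

def numerical_grad(f, W, eps=1e-6):
    """Central-difference estimate of df/dW, one entry at a time."""
    G = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + eps
        f_plus = f(W)
        W[idx] = old - eps
        f_minus = f(W)
        W[idx] = old                     # restore the original entry
        G[idx] = (f_plus - f_minus) / (2 * eps)
    return G

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Y = rng.normal(size=(5, 2))
W = rng.normal(size=(3, 2))

_, analytic = loss_and_grad(W, X, Y)
numeric = numerical_grad(lambda W_: loss_and_grad(W_, X, Y)[0], W)

# Relative error: values far below 1e-6 suggest the analytic gradient is right.
rel_error = np.linalg.norm(analytic - numeric) / (
    np.linalg.norm(analytic) + np.linalg.norm(numeric)
)
print(f"relative error: {rel_error:.2e}")
```

The same check works for any differentiable loss: swap in your own forward pass and compare against your hand-derived gradient before trusting it in training.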

Practice Tasks

  1. Make a list of the math requirements for your current role. Which chapters in which phases are most relevant?
  2. Read five job descriptions from your target companies. Which math skills do they have in common?
  3. Try one CS229 problem set, covering both the theoretical proofs and the programming.
  4. Watch 3Blue1Brown's Linear Algebra series to refresh your geometric intuition.

Interview Questions

  1. Industry vs research role: how should your math depth differ?
  2. You are asked to read a new architecture paper and implement it within one week. What is your approach?
  3. Once you have identified a weak area in your math skills, how would you improve it as quickly as possible?
  4. Production bug: the model diverges. What are your math-based debugging steps?
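
For the divergence question, one math-based first step is to ask whether the learning rate exceeds the stability limit set by the curvature. The toy sketch below runs gradient descent on f(x) = 0.5 * a * x^2, where each update multiplies x by (1 - lr * a), so it converges only if lr < 2 / a. The function `run_gd` and the curvature value 4.0 are assumptions for illustration; a real model has many curvatures, and the largest one sets the limit.

```python
def run_gd(lr, curvature=4.0, x0=1.0, steps=20):
    """Gradient descent on f(x) = 0.5 * curvature * x**2; returns the loss history."""
    x = x0
    losses = []
    for _ in range(steps):
        losses.append(0.5 * curvature * x * x)
        x -= lr * curvature * x          # the gradient of f is curvature * x
    return losses

limit = 2.0 / 4.0                        # stability limit 2 / curvature = 0.5

stable = run_gd(lr=0.4)                  # factor 1 - 1.6 = -0.6: sign flips, magnitude shrinks
unstable = run_gd(lr=0.6)                # factor 1 - 2.4 = -1.4: magnitude grows every step

print(f"lr=0.4: first loss {stable[0]:.3f}, last loss {stable[-1]:.3e}")
print(f"lr=0.6: first loss {unstable[0]:.3f}, last loss {unstable[-1]:.3e}")
```

In this toy setting, a loss that oscillates or grows points toward lowering the learning rate, which is why logging the loss and gradient norm per step is usually the first diagnostic.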

Summary

  • AI Engineer = implement + debug + optimize: depth in Linear Algebra, Calculus, Probability, Optimization.
  • AI Researcher = invent + prove + publish: Matrix Calculus, Measure Theory, Learning Theory, convergence analysis.
  • Interview: applied = shape/complexity/gradient diagnosis; research = proofs/bounds/novelty justification.
  • Structured plan: 8 months base + ongoing paper reading + real system application.
  • Books + courses + blogs + self-assessment = continuous mathematical growth.